You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/08/05 03:17:33 UTC

[GitHub] [incubator-seatunnel] eyys opened a new issue, #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

eyys opened a new issue, #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371

   ### Search before asking
   
   - [X] I had searched in the [feature](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.
   
   
   ### Description
   
   English:
   The original Kafka connector supports custom schema. The current V2 connector does not support custom schema
   
   
   中文:
   原始的Kafka连接器支持自定义模式。当前的V2连接器不支持自定义schema
   
   ### Usage Scenario
   
   Kafka Source user-defined data structures
   
   ### Related issues
   
   none
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] ashulin commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
ashulin commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207579977

   This is a common feature, for example, kafka, HTTP, File, Pulsar, etc;


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] ashulin commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
ashulin commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207577449

   https://github.com/apache/incubator-seatunnel/issues/2299


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] TyrantLucifer commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
TyrantLucifer commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1214970654

   This feature is in development, please waiting pr.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] TyrantLucifer commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
TyrantLucifer commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207612744

   > > This is a common feature, for example, kafka, HTTP, File, Pulsar, etc; My opinion is to conform to the habit of SQL;
   > > ```
   > > # conf file
   > > Source {
   > >     schema {
   > >         field = "STRING"
   > >         field2 = "INT"
   > >         field3 = "DECIMAL(30, 3)"
   > >     }
   > > }
   > > ```
   > 
   > I think it's a good idea that add a common feature in connector to support user-defined schema. In my option, I think we can add a new config option `schema` in all source connector source configs and for each connector they can parse their own schema in `getProducedType` method. The implement function of parsing `schema` we can add it in module `seatunnel-common`. And I agree with your advice @ashulin to conform schema type to the habit of SQL the same as defined in code.
   > 
   > ![image](https://user-images.githubusercontent.com/51053924/183329465-58f5c04d-bffc-43ff-81df-ed15837e823d.png)
   
   By the way, the row data format information also should defined in source connector. The final source config as the following:
   
   ```hcon
   Source {
      Kafka {
        shema {
          fields {
             field = "STRING"
             field2 = "INT"
             field3 = "DECIMAL(30, 3)"
          }
           format = "json" // format = "text"
        }
     }
   }
   
   ```
   In most use cases, I think we can support `json` or `text`, if we assign `text`, we also assign `delimiter` in config file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] TyrantLucifer commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
TyrantLucifer commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1217502402

   > This feature is in development, please waiting pr.
   
   @eyys Please refer to #2439 #2436, look forward to your contribution.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] Hisoka-X commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
Hisoka-X commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207566573

   Yes, it is a important thing that decide how define schema. Please send a discuss mail in dev@seatunnel.apache.org. And attach your idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] hailin0 commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
hailin0 commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207435224

   I want to discuss how to define fields schema configuration for connector in seatunnel
   
   I see use case from flink:
   `create table x (
   field1 bigint,
   field2 string,
   ......
   ) with (
   format: 'xxx'
   )`
   
   Connector selectd `xxx` format factory to convert message data format to  `row[fleid1<bigint>,field2<string>]`.
   
   
   How should it be defined in seatunnel?
   e.g:
   `
   source {
     KafkaSource {
        format = xxx
        format_schema = [{"name": "field1", "data_type": "bigint"}  , {"name": "field2", "data_type": "string"}]
      }
   }
   `


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] eyys commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
eyys commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1214932183

   > 
   Where should the TXT type column delimiter be defined here
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] eyys closed issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
eyys closed issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema
URL: https://github.com/apache/incubator-seatunnel/issues/2371


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] TyrantLucifer commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
TyrantLucifer commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207596678

   > This is a common feature, for example, kafka, HTTP, File, Pulsar, etc; My opinion is to conform to the habit of SQL;
   > 
   > ```
   > # conf file
   > Source {
   >     schema {
   >         field = "STRING"
   >         field2 = "INT"
   >         field3 = "DECIMAL(30, 3)"
   >     }
   > }
   > ```
   
   I think it's a good idea that add a common feature in connector to support user-defined schema. In my option, I think we can add a new config option `schema` in all source connector source configs and for each connector they can parse their own schema in `getProducedType` method. The implement function of parsing `schema` we can add it in module `seatunnel-common`. And I agree with your advice @ashulin to conform schema type to the habit of SQL the same as defined in code.
   
   ![image](https://user-images.githubusercontent.com/51053924/183329465-58f5c04d-bffc-43ff-81df-ed15837e823d.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] EricJoy2048 commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
EricJoy2048 commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207579856

   cc @TyrantLucifer  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] Hisoka-X commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
Hisoka-X commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207566774

   > Yes, this is very import function we should implement. If you can do this will be very helpful
   
   cc @ashulin 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] Hisoka-X commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
Hisoka-X commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1206155138

   Yes, this is very import function we should implement. If you can do this will be very helpful


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-seatunnel] hailin0 commented on issue #2371: [Feature][seatunnel-connectors-v2][connector-kafka] Kafka supports custom schema

Posted by GitBox <gi...@apache.org>.
hailin0 commented on issue #2371:
URL: https://github.com/apache/incubator-seatunnel/issues/2371#issuecomment-1207604962

   Another problem is missing metadata of mq connector rowdata.
   e.g:  partition、timestamp、key、headers
   
   `
   public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value, Iterable<Header> headers)
   `
   
   I see use case from flink: 
   https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/table/DynamicKafkaRecordSerializationSchema.java#L119


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org