Posted to issues@flink.apache.org by "Kevin Kwon (Jira)" <ji...@apache.org> on 2020/10/07 02:11:00 UTC

[jira] [Updated] (FLINK-19517) Support for Confluent Kafka on table creation

     [ https://issues.apache.org/jira/browse/FLINK-19517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Kwon updated FLINK-19517:
-------------------------------
    Description: 
Currently, creating a table from the SQL client with a statement such as the following works well:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
However, I would like table creation to support Confluent Kafka configuration as well, for example something like:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka-confluent',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
If this were supported, it would be much more convenient to test, on the fly, the queries that business analysts want to run. For example, business analysts could hand a standard DDL to data engineers, who could fill in the WITH clause and immediately start executing queries against these tables.

Additionally, it would be helpful if we could
 - specify 'parallelism' within the WITH clause to support parallel partition processing
 - specify, within the WITH clause, the custom consumer properties listed in [https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]
 - have remote access to the SQL client in the cluster from a local environment
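
A rough sketch of how the first two wishes might look together in one DDL. The 'kafka-confluent' connector and the 'schema-registry' key are the hypothetical options proposed above, 'parallelism' is likewise a hypothetical key that does not exist today, and the two extra 'properties.*' consumer settings are only illustrative picks from the consumer configuration list, assuming they could be passed the same way as 'properties.group.id':
{code:sql}
-- Sketch only: 'kafka-confluent', 'schema-registry' and 'parallelism'
-- are proposed options, not existing ones.
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka-confluent',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  -- illustrative consumer properties, assumed to be forwarded via the 'properties.' prefix
  'properties.max.poll.records' = '500',
  'properties.fetch.min.bytes' = '1024',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset',
  -- hypothetical: number of parallel source tasks reading the topic partitions
  'parallelism' = '4'
)
{code}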

  was:
Currently, creating a table from the SQL client with a statement such as the following works well:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'avro',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
However, I would like table creation to support Confluent Kafka configuration as well, for example something like:
{code:sql}
CREATE TABLE kafkaTable (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka-confluent',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'schema-registry' = 'http://schema-registry.com',
  'scan.startup.mode' = 'earliest-offset'
)
{code}
Additionally, it would be helpful if we could
 - specify 'parallelism' within the WITH clause to support parallel partition processing
 - specify, within the WITH clause, the custom consumer properties listed in [https://docs.confluent.io/5.4.2/installation/configuration/consumer-configs.html]


> Support for Confluent Kafka on table creation
> ---------------------------------------------
>
>                 Key: FLINK-19517
>                 URL: https://issues.apache.org/jira/browse/FLINK-19517
>             Project: Flink
>          Issue Type: Wish
>    Affects Versions: 1.12.0
>            Reporter: Kevin Kwon
>            Priority: Critical


