Posted to issues@flink.apache.org by "Nazar Volynets (Jira)" <ji...@apache.org> on 2020/12/22 13:09:00 UTC

[jira] [Issue Comment Deleted] (FLINK-17691) FlinkKafkaProducer transactional.id too long when using Semantic.EXACTLY_ONCE

     [ https://issues.apache.org/jira/browse/FLINK-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nazar Volynets updated FLINK-17691:
-----------------------------------
    Comment: was deleted

(was: [~freezhan] & [~jmathews3773],

*Issue #3*

And finally, why not let the user specify the `transaction-id`? What is the main reason for not allowing it?

+Details+

The current auto-generation option has the following drawbacks:
 # It is useless, or causes a lot of issues, if I use transactional consumers to drain data from the `sink` topic. If I update my Flink program, I am then FORCED to reconfigure all my transactional consumers (because a new transaction-id is auto-generated).
 # It complicates deployment. I am FORCED to run the Flink application just to determine the transaction-id, so until then I am blocked from deploying and finalizing the configuration of my transactional consumers.

+Summary+

Should I create a new, separate Jira issue for this use case, or is this the expected behaviour?)

> FlinkKafkaProducer transactional.id too long when using Semantic.EXACTLY_ONCE
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-17691
>                 URL: https://issues.apache.org/jira/browse/FLINK-17691
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.10.0, 1.11.0
>            Reporter: freezhan
>            Assignee: John Mathews
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>         Attachments: image-2020-05-14-20-43-57-414.png, image-2020-05-14-20-45-24-030.png, image-2020-05-14-20-45-59-878.png, image-2020-05-14-21-09-01-906.png, image-2020-05-14-21-16-43-810.png, image-2020-05-14-21-17-09-784.png
>
>
> When sinking to Kafka using the {color:#FF0000}Semantic.EXACTLY_ONCE{color} mode,
> the Flink Kafka connector producer automatically sets the {color:#FF0000}transactional.id{color}, and any user-defined value is ignored.
>  
> When the job's operator name is too long, sending fails because
> the transactional.id exceeds the Kafka {color:#FF0000}coordinator_key{color} limit:
> !image-2020-05-14-21-09-01-906.png!
>  
> *The Flink Kafka connector's policy for automatically generating the transactional.id is as follows:*
>  
> 1. Use {color:#FF0000}taskName + "-" + operatorUniqueID{color} as the transactional.id prefix (this may be too long):
>   getRuntimeContext().getTaskName() + "-" + ((StreamingRuntimeContext) getRuntimeContext()).getOperatorUniqueID()
> 2. The range of available transactional ids is
> [nextFreeTransactionalId, nextFreeTransactionalId + parallelism * kafkaProducersPoolSize)
> !image-2020-05-14-20-43-57-414.png!
>   !image-2020-05-14-20-45-24-030.png!
> !image-2020-05-14-20-45-59-878.png!
>  
> *The Kafka transactional.id check policy is as follows:*
>  
> {color:#FF0000}The string's byte length cannot be larger than Short.MAX_VALUE (32767){color}
> !image-2020-05-14-21-16-43-810.png!
> !image-2020-05-14-21-17-09-784.png!
>  
> *To reproduce this bug, the following conditions must be met:*
>  
>  # Messages are sent to Kafka in exactly-once mode.
>  # The taskName's length + the operatorUniqueID's length is larger than 32767 (this can happen with a very long SQL statement or window statement).
> *I suggest one of the following solutions:*
>  
>      1. Allow users to customize the transactional.id prefix
> or
>      2. Apply MD5 to the prefix before returning the real transactional.id
>  
>  
>  
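
Suggestion 2 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector's actual code: the class TransactionalIdPrefix and its methods rawPrefix, exceedsLimit and md5Hex are hypothetical names, and the limit used is the Short.MAX_VALUE (32767) byte limit quoted in the description. The sketch builds the taskName + "-" + operatorUniqueID prefix, checks its UTF-8 byte length against the limit, and replaces it with a fixed-length MD5 hex digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Flink or Kafka.
final class TransactionalIdPrefix {

    // Limit quoted in the issue description: byte length must not exceed Short.MAX_VALUE.
    static final int MAX_ID_BYTES = Short.MAX_VALUE;

    // Prefix as described in the issue: taskName + "-" + operatorUniqueID.
    static String rawPrefix(String taskName, String operatorUniqueId) {
        return taskName + "-" + operatorUniqueId;
    }

    // True if the UTF-8 encoding of the id is longer than the limit.
    static boolean exceedsLimit(String id) {
        return id.getBytes(StandardCharsets.UTF_8).length > MAX_ID_BYTES;
    }

    // Returns the MD5 digest of the input as a 32-character lowercase hex string.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name on the Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Simulate a very long task name, e.g. from a long SQL statement.
        StringBuilder longName = new StringBuilder();
        for (int i = 0; i < 40000; i++) {
            longName.append('x');
        }
        String raw = rawPrefix(longName.toString(), "operator-1");
        String hashed = md5Hex(raw);

        System.out.println("raw prefix exceeds limit:    " + exceedsLimit(raw));
        System.out.println("hashed prefix exceeds limit: " + exceedsLimit(hashed));
        System.out.println("hashed prefix length:        " + hashed.length());
    }
}
```

The hashed prefix always has the same length regardless of the operator name, at the cost of no longer being human-readable. MD5 is used here only to shorten the string, not for any security property.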


