You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Fabian Paul <fp...@apache.org> on 2022/02/15 15:49:36 UTC

Re: Flink 1.14.2/3 - KafkaSink vs deprecated FlinkKafkaProducer

Hi Danny,

I am ccing the user ML again so that more people may help. Looking through
the logs I did not spot anything suspicious. Can you also share the JM logs
with us? I am also wondering since you have said that other jobs are also
running on the Flink cluster if the problem still persists if you run the
job on its own cluster.

Best,
Fabian

On Tue, Feb 15, 2022 at 12:33 PM Daniel Peled <da...@gmail.com>
wrote:

> Hi Fabian,
>
> Sorry for the late reply !
>
> Attached is an example of one of our jobs that hangs.
> The name of the job is *liteViewerJobOrchestrator *and is is part of the
> transactional id prefix
> We use the same *transactional id prefix between restarts*
> The transactional id prefix is <name of the job>-<output topic
> name>-<additional id that is optional>
>
> The Kafka client/server version is 2.4.1
> The job is unbounded and the job parallelism is 1
>
> As you can see all sinks are in initializing status
>
> I am also attaching the TM log file.
> You will see many log messages of other jobs
> Please look for the messages with job name *liteViewerJobOrchestrator*
>
> Please let me know if you need any more information
> BR,
> Danny
>
> [image: image.png]
>
> ‫בתאריך יום ב׳, 31 בינו׳ 2022 ב-19:12 מאת ‪Fabian Paul‬‏ <‪
> fpaul@apache.org‬‏>:‬
>
>> Hi Daniel,
>>
>> Thanks for reaching out, we are constantly trying to improve the
>> reliability of our connectors. I assume you are running the KafkaSink
>> with exactly-once delivery guarantee. On startup, the KafkaSink tries
>> to abort lingering transactions from previous executions.
>> Unfortunately, nothing comes to my mind immediately why your job
>> hangs.
>>
>> Can you maybe share the logs with us from such a run? It would be also
>> great to know more information about your environment e.g. bounded or
>> unbounded jobs, parallelism, Kafka client/server version, potential
>> topic acls, transactional id prefix.
>>
>> Best,
>> Fabian
>>
>> On Mon, Jan 31, 2022 at 3:27 PM Daniel Peled
>> <da...@gmail.com> wrote:
>> >
>> > Hi everyone,
>> >
>> > Has anyone encountered any problem with the new KafkaSink that is used
>> in Flink 1.14 ?
>> >
>> > When running our jobs, the sinks of some of our jobs are stuck in
>> initializing for more than an hour.
>> > The only thing that helps is deleting the topic __transaction_state.
>> > After deleting this topic, all sinks are immediately released and are
>> in running status.
>> > The problem is quite random each time in a different job.
>> > There are times that all jobs start running without any problems.
>> >
>> > Unfortunately we had to go back to the deprecated FlinkKafkaProducer.
>> >
>> > We didn't have these problems with Flink 1.13 and FlinkKafkaProducer
>> >
>> > Any ideas on what to do ?
>> > What are we doing wrong?
>> >
>> > BR,
>> > Daniel
>> >
>>
>