Posted to issues@spark.apache.org by "Alexis Seigneurin (JIRA)" <ji...@apache.org> on 2018/08/13 19:44:00 UTC

[jira] [Created] (SPARK-25106) A new Kafka consumer gets created for every batch

Alexis Seigneurin created SPARK-25106:
-----------------------------------------

             Summary: A new Kafka consumer gets created for every batch
                 Key: SPARK-25106
                 URL: https://issues.apache.org/jira/browse/SPARK-25106
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.3.1
            Reporter: Alexis Seigneurin
         Attachments: console.txt

I have a fairly simple piece of code that reads from Kafka, applies some transformations (including a UDF), and writes the result to the console. Every time a micro-batch runs, a new Kafka consumer is created (and never closed), eventually leading to a "too many open files" error.

I created a test case, with the code available here: [https://github.com/aseigneurin/spark-kafka-issue]
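For illustration, a minimal pipeline of this shape looks roughly as follows. This is a sketch, not the exact code from the repository; the broker address and the UDF body are assumptions:

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object Consumer {
  def main(args: Array[String]): Unit = {
    // Requires the spark-sql-kafka-0-10 package on the classpath
    val spark = SparkSession.builder()
      .appName("spark-kafka-issue")
      .master("local[*]")
      .getOrCreate()

    // Placeholder UDF standing in for the transformation mentioned above
    val toUpper = udf((s: String) => if (s == null) null else s.toUpperCase)

    val persons = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumption: local broker
      .option("subscribe", "persons")
      .load()
      .select(col("value").cast("string").as("value"))
      .withColumn("value", toUpper(col("value")))

    // Console sink: on 2.3.1, the attached log shows a brand new Kafka
    // consumer being initialized for every micro-batch, never closed
    persons.writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}
{code}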

To reproduce:
 1. Start Kafka and create a topic called "persons"
 2. Run "Producer" to generate data (a minimal stand-in is sketched after this list)
 3. Run "Consumer"

I am attaching the log where you can see a new consumer being initialized between every batch.

Please note this issue does *not* appear with Spark 2.2.2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org