Posted to user@spark.apache.org by "hsy541@gmail.com" <hs...@gmail.com> on 2018/01/24 07:02:16 UTC

Questions about using pyspark 2.1.1 pushing data to kafka

I have questions about using pyspark 2.1.1 pushing data to kafka.

I don't see any PySpark Streaming API to write data directly to Kafka. If
one exists, or there is an example, please point me to the right page.

I implemented my own approach, which uses a global Kafka producer and pushes
the data picked up in foreach. The problem is that foreach is a single-threaded
model, so I can only do synchronous pushes; otherwise there is no way to know
whether the data was pushed successfully.
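For reference, the pattern above ("global producer + synchronous push") can be sketched roughly as follows. This is a hedged sketch, not the original code: a stub producer stands in for kafka-python's KafkaProducer (whose send() returns a future with a blocking get()), so it runs without a Spark cluster or broker; the topic name "events" and helper names are made up for illustration.

```python
# Sketch of the "global Kafka producer + synchronous push" pattern.
# A stub mimics the subset of kafka-python's KafkaProducer API used
# here: send() returns a future whose get() blocks until the broker
# acknowledges the record (or raises on failure).

class _StubFuture:
    def __init__(self, record):
        self._record = record

    def get(self, timeout=None):
        # Blocking wait for acknowledgement; the stub succeeds
        # immediately and returns fake record metadata.
        return {"topic": self._record[0], "value": self._record[1]}


class StubProducer:
    """Stands in for kafka_python.KafkaProducer in this sketch."""

    def __init__(self):
        self.sent = []

    def send(self, topic, value):
        self.sent.append((topic, value))
        return _StubFuture((topic, value))


# One producer per worker process (the "global kafka producer" from
# the post); creating it lazily avoids trying to serialize it into
# Spark closures, since producers are not picklable.
_producer = None


def get_producer():
    global _producer
    if _producer is None:
        # Real code would use: KafkaProducer(bootstrap_servers=...)
        _producer = StubProducer()
    return _producer


def push_partition(records):
    """Meant for rdd.foreachPartition: push each record synchronously."""
    producer = get_producer()
    for record in records:
        # Synchronous push: blocking on get() means a delivery failure
        # surfaces as an exception instead of being silently dropped.
        producer.send("events", record.encode("utf-8")).get(timeout=10)


# In a streaming job this would be wired up as:
#   dstream.foreachRDD(lambda rdd: rdd.foreachPartition(push_partition))
push_partition(iter(["a", "b", "c"]))
```

Using foreachPartition rather than per-record foreach at least amortizes producer lookup over a whole partition, though each get() still serializes the sends within that partition.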

Has anyone tried to do the same thing? Any suggestions?

Thanks!

Regards,
Siyuan