Posted to user@spark.apache.org by ravidspark <ra...@gmail.com> on 2018/05/07 23:53:05 UTC

Best place to persist offsets into Zookeeper

Hi All,

I have run into the problem below with Spark Kafka streaming.

Environment:
Spark-2.2.0

Problem:
We have written our own offset-management logic that persists offsets to
ZooKeeper when streaming data with Spark + Kafka. Everything works fine:
during an app failure we are able to prevent the offset from being committed
to the ZooKeeper node, thus achieving zero message loss when restarting the
app. But sometimes, when unexpected exceptions occur, we see that offsets
still get committed to ZooKeeper for at least the next 3 batches, and we
cannot figure out how to control these situations. Right now we commit the
offsets to ZooKeeper at the end of every batch.
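For reference, the commit-only-after-success pattern we are aiming for can be
sketched in plain Python. This is a simplified illustration, not Spark or
Kafka code: `OffsetStore` is a hypothetical in-memory stand-in for a
ZooKeeper-backed store (in production it would be a znode updated via a
client such as Curator or kazoo), and `run_batch` stands in for per-batch
processing.

```python
class OffsetStore:
    """In-memory stand-in for a ZooKeeper znode holding the last committed offset."""

    def __init__(self):
        self.committed = 0

    def commit(self, offset):
        self.committed = offset

    def read(self):
        return self.committed


def run_batch(store, batch_offsets, process):
    """Process one micro-batch; commit its end offset only if processing succeeds."""
    start, end = batch_offsets
    try:
        process(start, end)  # user processing logic for this batch
    except Exception:
        # On any failure, skip the commit so that a restart re-reads from the
        # last successfully committed offset (zero message loss).
        return False
    store.commit(end)  # commit only after the batch fully succeeds
    return True


if __name__ == "__main__":
    store = OffsetStore()

    run_batch(store, (0, 100), lambda s, e: None)  # succeeds: offset 100 committed

    def boom(s, e):
        raise RuntimeError("unexpected exception mid-batch")

    run_batch(store, (100, 200), boom)  # fails: committed offset stays at 100
    print(store.read())
```

The intent is that the commit call is the very last step of the batch, so an
exception anywhere earlier leaves the stored offset untouched; the question
is why, in our Spark app, later batches still manage to commit.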

I am happy to share the code.

Can you help me in solving this problem?


Thanks,
Ravi



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
