Posted to issues@spark.apache.org by "Dmitry Goldenberg (JIRA)" <ji...@apache.org> on 2019/04/20 14:52:00 UTC

[jira] [Created] (SPARK-27529) Spark Streaming consumer dies with kafka.common.OffsetOutOfRangeException

Dmitry Goldenberg created SPARK-27529:
-----------------------------------------

             Summary: Spark Streaming consumer dies with kafka.common.OffsetOutOfRangeException
                 Key: SPARK-27529
                 URL: https://issues.apache.org/jira/browse/SPARK-27529
             Project: Spark
          Issue Type: Bug
          Components: DStreams
    Affects Versions: 1.5.0
            Reporter: Dmitry Goldenberg


We have a Spark Streaming consumer which at a certain point started consistently failing on every restart with kafka.common.OffsetOutOfRangeException.

Some details (a rough sketch of the consumer setup follows this list):
* Spark version is 1.5.0.
* Kafka version is 0.8.2.1 (2.10-0.8.2.1).
* The topic is configured with: retention.ms=1471228928, max.message.bytes=100000000.
* The consumer runs with auto.offset.reset=smallest.
* No checkpointing is currently enabled.
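
For context, the consumer is set up roughly along these lines (a simplified sketch of a Spark 1.5.0 / Kafka 0.8 direct-stream setup; the broker addresses, topic name, app name, and batch interval below are placeholders, not our actual values):
{code:scala}
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ConsumerSketch {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("consumer-sketch")
    val ssc = new StreamingContext(sparkConf, Seconds(10)) // batch interval is a placeholder

    // auto.offset.reset=smallest should make a consumer with no known offsets
    // start from the earliest offset Kafka still retains.
    val kafkaParams = Map(
      "metadata.broker.list" -> "broker1:9092,broker2:9092", // hypothetical brokers
      "auto.offset.reset"    -> "smallest"
    )
    val topics = Set("our-topic") // hypothetical topic name

    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    messages.foreachRDD { rdd =>
      // Real processing elided; just count the records in each batch here.
      println(s"Got ${rdd.count()} records in this batch")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
{code}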

I don't see anything in the Spark or Kafka documentation that explains why this is happening. The closest I've found from googling around is this:
{noformat}
https://blog.cloudera.com/blog/2015/03/exactly-once-spark-streaming-from-apache-kafka/

Finally, I’ll repeat that any semantics beyond at-most-once require that you have sufficient log retention in Kafka. If you’re seeing things like OffsetOutOfRangeException, it’s probably because you underprovisioned Kafka storage, not because something’s wrong with Spark or Kafka.{noformat}
I've also looked at SPARK-12693 and SPARK-11693, but I still don't understand the possible causes beyond this:
{noformat}
You've under-provisioned Kafka storage and / or Spark compute capacity.
The result is that data is being deleted before it has been processed.{noformat}
All we're trying to do is start the consumer and read the topic from the earliest available offset. Why would we not be able to do that? How can the offsets be out of range if we're explicitly asking to read from the earliest available offset?
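
As a sanity check, something along these lines (a rough sketch against the Kafka 0.8 SimpleConsumer API; the broker host, topic, and partition are placeholders) can be used to ask a broker for the earliest offset it still retains for a partition, to compare against what the job is requesting:
{code:scala}
import kafka.api.PartitionOffsetRequestInfo
import kafka.common.TopicAndPartition
import kafka.javaapi.OffsetRequest
import kafka.javaapi.consumer.SimpleConsumer

import scala.collection.JavaConverters._

object EarliestOffsetCheck {
  def main(args: Array[String]): Unit = {
    // Placeholders for the real deployment:
    val broker    = "broker1"   // hypothetical broker host
    val topic     = "our-topic" // hypothetical topic name
    val partition = 0

    val consumer = new SimpleConsumer(broker, 9092, 100000, 64 * 1024, "offset-check")
    try {
      val tap = TopicAndPartition(topic, partition)
      // EarliestTime (-2) asks the broker for the oldest offset still on disk.
      val requestInfo = Map(tap -> PartitionOffsetRequestInfo(kafka.api.OffsetRequest.EarliestTime, 1))
      val request = new OffsetRequest(
        requestInfo.asJava, kafka.api.OffsetRequest.CurrentVersion, "offset-check")

      val response = consumer.getOffsetsBefore(request)
      val earliest = response.offsets(topic, partition).headOption
      println(s"Earliest retained offset for $topic/$partition: $earliest")
    } finally {
      consumer.close()
    }
  }
}
{code}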

Since we have retention.ms set to 1 year and we created the topic just a few weeks ago, I wouldn't expect Kafka to be deleting anything while we're consuming.

The behavior we're seeing on the consumer side doesn't feel intuitive or consistent to me. If it is expected behavior, I'd like to know how to work around it.
