Posted to user@spark.apache.org by Dan Dutrow <da...@gmail.com> on 2016/01/21 18:11:15 UTC

Recovery for Spark Streaming Kafka Direct with OffsetOutOfRangeException

Hey Cody, I would have responded on the mailing list, but it looks like this
thread got aged off. I have a problem where one of my topics dumps more
data than my Spark job can keep up with. We limit the input rate with
maxRatePerPartition. Eventually, when the data is aged off, I get the
OffsetOutOfRangeException from Kafka, as we would expect. As we work
towards more efficient processing of that topic, or get more resources, I'd
like to be able to log the error and continue the application without
failing. Is there a place where I can catch that error before it gets to
org.apache.spark.util.Utils$.logUncaughtExceptions? Maybe somewhere in
DirectKafkaInputDStream::compute?
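
For context, here is roughly how we set the rate limit and create the stream.
This is a minimal sketch assuming the Spark 1.x / Kafka 0.8 direct API; the
broker list, topic name, batch interval, and rate value below are placeholders,
not our real settings:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object RateLimitedDirectStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("rate-limited-direct-stream")
      // Cap the number of records per second read from each Kafka partition,
      // so a slow job does not fall arbitrarily far behind before failing.
      .set("spark.streaming.kafka.maxRatePerPartition", "1000")

    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("my-topic")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.foreachRDD { rdd =>
      // Real processing goes here; when it is consistently slower than the
      // incoming rate, Kafka eventually ages off unread data and the next
      // fetch fails with OffsetOutOfRangeException.
      println(s"batch size: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}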

https://mail-archives.apache.org/mod_mbox/spark-user/201512.mbox/%3CCAKWX9VUoNd4ATGF+0TkNJ+9=b8r2nr9Pt7SbgR-Bv4NntTk-xw@mail.gmail.com%3E
-- 
Dan ✆

Re: Recovery for Spark Streaming Kafka Direct with OffsetOutOfRangeException

Posted by Cody Koeninger <co...@koeninger.org>.
Looks like this response did go to the list.

As far as OffsetOutOfRange goes, right now that's an unrecoverable error,
because it breaks the underlying invariants (e.g. that the number of
messages in a partition is deterministic once the RDD is defined).

If you want to do some hacking for your own purposes, the place to start
looking would be in KafkaRDD.scala, in fetchBatch.  Just be aware that's a
situation where data has been lost, so you can't get the "right" answer,
you just have to decide what variety of wrong answer you want to get :)
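
To make that concrete, here is a rough sketch of the shape such a hack could
take. This is not the actual fetchBatch code, and earliestAvailableOffset is a
made-up helper (in the 0.8 API you'd get that value from an offset request for
the earliest time); it only illustrates one possible "wrong answer": skip
forward over the aged-off range and log the loss.

/** Hypothetical helper: earliest offset the broker still retains for this
  * partition. Not real Spark or Kafka client code; stubbed out here. */
def earliestAvailableOffset(topic: String, partition: Int): Long = ???

/** If the requested offset has been aged off, jump to the earliest offset
  * still available instead of failing the task, and record the gap. */
def resolveRequestOffset(topic: String, partition: Int, requested: Long): Long = {
  val earliest = earliestAvailableOffset(topic, partition)
  if (requested >= earliest) {
    requested
  } else {
    System.err.println(
      s"OffsetOutOfRange on $topic-$partition: skipping " +
      s"${earliest - requested} aged-off messages ($requested -> $earliest)")
    earliest
  }
}

However you resolve it, the number of messages actually read from that
partition no longer matches what the RDD was defined with, which is exactly
the invariant mentioned above.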

