Posted to user@spark.apache.org by kant kodali <ka...@gmail.com> on 2017/05/04 07:43:19 UTC

unable to find how to integrate SparkSession with a Custom Receiver.

Hi All,

I have a custom receiver that implements the onStart() and onStop() methods of
the Receiver class, and I am trying to figure out how to integrate it with
SparkSession, since I want to do stateful analytics using Structured
Streaming. I couldn't find this in the docs. Any idea? When I was doing
stateless streaming I called
javaStreamingContext.receiverStream(CustomReceiver)
and it worked fine, but now I am trying to do Structured Streaming using the
same custom receiver and I can't find how to integrate it with
SparkSession.

Thanks!
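[For context, the stateless setup described above looks roughly like the
sketch below. This is a minimal illustration, not the poster's actual code:
the receiver body, app name, and master are placeholders, and it assumes the
Spark Streaming (DStream) dependencies are on the classpath.]

```java
import org.apache.spark.SparkConf;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.receiver.Receiver;

// Hypothetical custom receiver: onStart() and onStop() are the two
// methods the Receiver contract requires you to implement.
class CustomReceiver extends Receiver<String> {
    CustomReceiver() {
        super(StorageLevel.MEMORY_AND_DISK_2());
    }

    @Override
    public void onStart() {
        // Start a thread that pulls data from the external system
        // and hands each record to Spark via store().
        new Thread(() -> {
            while (!isStopped()) {
                store("record"); // placeholder for real data
            }
        }).start();
    }

    @Override
    public void onStop() {
        // The thread started in onStart() checks isStopped() and exits,
        // so there is nothing extra to clean up here.
    }
}

public class StatelessExample {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setAppName("custom-receiver").setMaster("local[2]");
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(1));

        // The pattern the question refers to: plug the receiver into a DStream.
        JavaDStream<String> lines = jssc.receiverStream(new CustomReceiver());
        lines.print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```

This DStream pattern is exactly what has no direct equivalent in Structured
Streaming, which is the subject of the replies below.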

Re: unable to find how to integrate SparkSession with a Custom Receiver.

Posted by kant kodali <ka...@gmail.com>.
got it! Thank you!


Re: unable to find how to integrate SparkSession with a Custom Receiver.

Posted by Tathagata Das <ta...@gmail.com>.
Structured Streaming is not designed to integrate with receivers. The
sources in Structured Streaming are designed to provide stronger
fault-tolerance guarantees by precisely tracking records by their offsets
(e.g. Kafka offsets). This is different from the Receiver API, which did
not require records to have trackable offsets, but didn't provide
exactly-once guarantees either.

If you really want to use Structured Streaming, you could try writing a
simple application that blindly forwards data from your custom source to
Kafka, and then run Structured Streaming on Kafka. We have added very good
support for reading from and writing to Kafka in Spark.

https://databricks.com/blog/2017/04/26/processing-data-in-apache-kafka-with-structured-streaming-in-apache-spark-2-2.html
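[The consuming side of the suggested pattern (forward into Kafka, then read
from Kafka with Structured Streaming) looks roughly like this. A minimal
sketch, not part of the original reply: the broker address and topic name
are placeholders, and it assumes the spark-sql-kafka connector package is
on the classpath.]

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaStructuredStreamingExample {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-structured-streaming")
                .master("local[2]")
                .getOrCreate();

        // Subscribe to the topic the forwarding application writes to.
        // Broker address and topic are placeholders for illustration.
        Dataset<Row> df = spark
                .readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "my-topic")
                .load();

        // Kafka rows carry key and value as binary; cast value to a string.
        Dataset<Row> values = df.selectExpr("CAST(value AS STRING)");

        // From here, stateful operations (aggregations, watermarks, etc.)
        // are available, which is what the original question was after.
        StreamingQuery query = values.writeStream()
                .format("console")
                .outputMode("append")
                .start();

        query.awaitTermination();
    }
}
```

Because the Kafka source tracks offsets, this path gives the exactly-once
guarantees that a receiver-based source cannot.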

