Posted to user@spark.apache.org by Gaurav1809 <ga...@gmail.com> on 2017/03/13 18:21:39 UTC

Structured Streaming - Can I start using it?

I read in the Spark documentation that Structured Streaming is still ALPHA
in Spark 2.1 and that the APIs are still experimental. Should I use it to
rewrite my existing Spark Streaming code? It looks like it is not yet
production-ready. What happens if the Structured Streaming project gets
withdrawn?





Re: Structured Streaming - Can I start using it?

Posted by Gaurav Pandya <ga...@gmail.com>.
Thanks a lot, Michael & Ofir, for your insights.

To Ofir - I have not yet finalized my Spark Streaming code; it is still a
work in progress. Now that Structured Streaming is available, I thought of
rewriting it to gain the maximum benefit in the future. As of now, there are
no specific functional or performance issues, nor do I need to leverage any
new API. This is purely with the future in mind.

Thanks
Gaurav

On Tue, Mar 14, 2017 at 1:05 PM, Ofir Manor <of...@equalum.io> wrote:

> To add to what Michael said, my experience was that Structured Streaming
> in 2.0 was half-baked / alpha, but in 2.1 it is significantly more robust.
> Also, a lot of its "missing functionality" was not available in Spark
> Streaming either.
> HOWEVER, you mentioned that you are thinking about rewriting your existing
> Spark Streaming code... May I ask why you need a rewrite? Do you have
> specific functional or performance issues? Some specific new use case or a
> specific new API you want to leverage?
> Changing an existing, working solution has its costs, both in dev time and
> ops time (changes to monitoring, troubleshooting, etc.), so I think you
> should know what you want to achieve here and ask / prototype whether the
> current release fits it.

Re: Structured Streaming - Can I start using it?

Posted by Ofir Manor <of...@equalum.io>.
To add to what Michael said, my experience was that Structured Streaming in
2.0 was half-baked / alpha, but in 2.1 it is significantly more robust.
Also, a lot of its "missing functionality" was not available in Spark
Streaming either.
HOWEVER, you mentioned that you are thinking about rewriting your existing
Spark Streaming code... May I ask why you need a rewrite? Do you have
specific functional or performance issues? Some specific new use case or a
specific new API you want to leverage?
Changing an existing, working solution has its costs, both in dev time and
ops time (changes to monitoring, troubleshooting, etc.), so I think you
should know what you want to achieve here and ask / prototype whether the
current release fits it.

Ofir Manor


On Mon, Mar 13, 2017 at 9:45 PM, Michael Armbrust <mi...@databricks.com>
wrote:

> I think it's very, very unlikely that it will get withdrawn.  The primary
> reason that the APIs are still marked experimental is that we like to have
> several releases before committing to interface stability (in particular,
> the interfaces to write custom sources and sinks are likely to evolve).
> Also, there are currently quite a few limitations in the types of queries
> that we can run (e.g. multiple aggregations are disallowed, and we don't
> support stream-stream joins yet).  In these cases, though, we explicitly
> say it's not supported when you try to start your stream.
>
> For the use cases that are supported in 2.1, though (streaming ETL, event
> time aggregation, etc.), I'll say that we have been using it in production
> for several months and we have customers doing the same.

Re: Structured Streaming - Can I start using it?

Posted by Michael Armbrust <mi...@databricks.com>.
I think it's very, very unlikely that it will get withdrawn.  The primary
reason that the APIs are still marked experimental is that we like to have
several releases before committing to interface stability (in particular,
the interfaces to write custom sources and sinks are likely to evolve).
Also, there are currently quite a few limitations in the types of queries
that we can run (e.g. multiple aggregations are disallowed, and we don't
support stream-stream joins yet).  In these cases, though, we explicitly say
it's not supported when you try to start your stream.

For the use cases that are supported in 2.1, though (streaming ETL, event
time aggregation, etc.), I'll say that we have been using it in production
for several months and we have customers doing the same.
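
For readers unfamiliar with the term, here is a minimal sketch of what
"event time aggregation" means. It is plain Python, not Spark code: events
are grouped by the timestamp carried in each event rather than by the time
they arrive. The 10-minute window, the sample events, and the names
window_start and count_by_event_window are all assumptions made for
illustration; Structured Streaming expresses the same idea through its own
DataFrame API.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # hypothetical window size, chosen for illustration
EPOCH = datetime(1970, 1, 1)


def window_start(event_time: datetime) -> datetime:
    """Return the start of the tumbling window that contains event_time."""
    elapsed = event_time - EPOCH
    # Floor the elapsed time to a whole number of windows.
    return EPOCH + (elapsed // WINDOW) * WINDOW


def count_by_event_window(events):
    """Count events per (window start, key) using each event's own timestamp.

    `events` is an iterable of (event_time, key) pairs in arrival order,
    which may differ from event-time order.
    """
    counts = defaultdict(int)
    for event_time, key in events:
        counts[(window_start(event_time), key)] += 1
    return dict(counts)


# Hypothetical sample: the 12:04 event arrives after the 12:11 event,
# but is still counted in the 12:00 window it belongs to.
events = [
    (datetime(2017, 3, 13, 12, 1), "sensor-a"),
    (datetime(2017, 3, 13, 12, 11), "sensor-a"),
    (datetime(2017, 3, 13, 12, 4), "sensor-a"),
    (datetime(2017, 3, 13, 12, 7), "sensor-b"),
]

for (start, key), n in sorted(count_by_event_window(events).items()):
    print(start.isoformat(), key, n)
```

The point of the sketch is only that the late-arriving event lands in the
window its timestamp belongs to; a real streaming engine additionally has to
keep this state across batches and decide how long to wait for late data.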
