Posted to user@spark.apache.org by craigjar <cr...@gmail.com> on 2016/07/07 03:19:24 UTC

Structured Streaming Comparison to AMPS

I have been doing several Spark PoC projects recently, and the latest one
involved Structured Streaming, the new experimental feature in 2.0.  My PoC
ended up being a non-starter, as I quickly realized that stream-to-stream
joins are not implemented yet.  I believe this feature will be immensely
powerful and will allow applications to migrate from batch to streaming with
ease, which is exactly what I would like to accomplish.

I came across an interesting article (link below) from another streaming
platform called AMPS, which I believe describes a common use case for which
Spark's Structured Streaming would be perfect. I hope the Spark dev team is
looking at the AMPS platform for some inspiration.

http://www.crankuptheamps.com/blog/posts/2014/04/07/real-time-streaming-joins-reinvented/

I am posting this for two reasons:
 - To share AMPS's approach to this problem and get the community's thoughts
on whether Spark's Structured Streaming approach will provide similar semantics
 - To inquire about the availability of stream-to-stream joins and where I can
follow along (is there a JIRA or dev mailing list topic I can follow?).

Thank you in advance for any replies.

Re: Structured Streaming Comparison to AMPS

Posted by Arnaud Bailly <ar...@gmail.com>.
Hi,

I am also interested in having JOIN support streams on both sides. I
understand Spark SQL's JOIN currently supports a stream on one side
only. Could you please provide some more details on why this is the
case? What are the technical limitations that make it harder to implement
stream-to-stream JOINs in Spark's code?
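
To make the question concrete, here is a minimal sketch in plain Python (not
Spark code, and not how Spark does or will implement this) of a symmetric hash
join, one well-known way to join two streams. The class and method names are
hypothetical and the sample rows are made up. The point it illustrates is that
each side has to remember every row it has seen so that a later arrival on the
other side can still match, whereas with a stream on only one side each
arriving row is simply looked up in the static data. Whether this state is the
actual obstacle in Spark is not something I know.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Toy inner join of two streams on a key (illustrative only, not Spark)."""

    def __init__(self):
        # Every row seen so far on each side, grouped by join key.
        self.left_state = defaultdict(list)
        self.right_state = defaultdict(list)

    def on_left(self, key, row):
        """Buffer a left-stream row and return the pairs it newly produces."""
        self.left_state[key].append(row)
        return [(row, r) for r in self.right_state.get(key, [])]

    def on_right(self, key, row):
        """Buffer a right-stream row and return the pairs it newly produces."""
        self.right_state[key].append(row)
        return [(l, row) for l in self.left_state.get(key, [])]


# Hypothetical events arriving interleaved on the two streams.
join = SymmetricHashJoin()
out = []
out += join.on_left("order-1", {"item": "widget"})
out += join.on_right("order-1", {"status": "shipped"})
out += join.on_right("order-2", {"status": "pending"})
out += join.on_left("order-2", {"item": "gadget"})
```

Neither buffer is ever trimmed in this sketch, so its state grows without
bound; a practical system would need some policy for expiring old rows.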

Thanks,

-- 
Arnaud Bailly

twitter: abailly
skype: arnaud-bailly
linkedin: http://fr.linkedin.com/in/arnaudbailly/

On Thu, Jul 7, 2016 at 9:17 AM, Tathagata Das <ta...@gmail.com>
wrote:

> We will look into streaming-streaming joins in a future release of Spark,
> though no promises on any timeline yet. We are currently fighting to get
> Spark 2.0 out the door.
> There isn't a JIRA for this right now. However, you can follow the
> Structured Streaming Epic JIRA to track what's going on. I try to post major
> features as sub-JIRAs of this master JIRA.
> https://issues.apache.org/jira/browse/SPARK-8360

Re: Structured Streaming Comparison to AMPS

Posted by Tathagata Das <ta...@gmail.com>.
We will look into streaming-streaming joins in a future release of Spark,
though no promises on any timeline yet. We are currently fighting to get
Spark 2.0 out the door.
There isn't a JIRA for this right now. However, you can follow the Structured
Streaming Epic JIRA to track what's going on. I try to post major features
as sub-JIRAs of this master JIRA.
https://issues.apache.org/jira/browse/SPARK-8360
