Posted to user@spark.apache.org by Nicholas Chammas <ni...@gmail.com> on 2014/04/21 22:00:13 UTC

Spark Streaming source from Amazon Kinesis

I'm looking to start experimenting with Spark Streaming, and I'd like to
use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
Looking at the list of supported Spark Streaming
sources <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
I don't see any mention of Kinesis.

Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
there plans to add such support in the future?

Nick




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-source-from-Amazon-Kinesis-tp4550.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark Streaming source from Amazon Kinesis

Posted by Parviz Deyhim <pd...@gmail.com>.
It is possible, Nick. Please take a look here:
https://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923

The source code is available as a pull request:
https://github.com/apache/spark/pull/223

Let me know if you have any questions.


On Mon, Apr 21, 2014 at 1:00 PM, Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> I'm looking to start experimenting with Spark Streaming, and I'd like to
> use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
> Looking at the list of supported Spark Streaming sources <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
> I don't see any mention of Kinesis.
>
> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
> there plans to add such support in the future?
>
> Nick
>
>
>

Re: Spark Streaming source from Amazon Kinesis

Posted by Nicholas Chammas <ni...@gmail.com>.
Thanks for the link! I'll follow the discussion there.


On Mon, Apr 21, 2014 at 4:10 PM, Matei Zaharia <ma...@gmail.com> wrote:

> There was a patch posted a few weeks ago (
> https://github.com/apache/spark/pull/223), but it needs a few changes in
> packaging because it uses a license that isn’t fully compatible with
> Apache. I’d like to get this merged when the changes are made though — it
> would be a good input source to support.
>
> Matei
>
>
> On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <ni...@gmail.com>
> wrote:
>
> I'm looking to start experimenting with Spark Streaming, and I'd like to
> use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
> Looking at the list of supported Spark Streaming sources <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
> I don't see any mention of Kinesis.
>
> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
> there plans to add such support in the future?
>
> Nick
>
>
>
>
>

Re: Spark Streaming source from Amazon Kinesis

Posted by Tathagata Das <ta...@gmail.com>.
I will take a look at it tomorrow!

TD


On Tue, Jul 22, 2014 at 9:30 AM, Chris Fregly <ch...@fregly.com> wrote:

> i took this over from parviz.
>
> i recently submitted a new PR for Kinesis Spark Streaming support:
> https://github.com/apache/spark/pull/1434
>
> others have tested it with good success, so give it a whirl!
>
> waiting for it to be reviewed/merged.  please put any feedback into the PR
> directly.
>
> thanks!
>
> -chris
>
>
> On Mon, Apr 21, 2014 at 2:39 PM, Matei Zaharia <ma...@gmail.com>
> wrote:
>
>> No worries, looking forward to it!
>>
>> Matei
>>
>> On Apr 21, 2014, at 1:59 PM, Parviz Deyhim <pd...@gmail.com> wrote:
>>
>> sorry Matei. Will definitely start working on making the changes soon :)
>>
>>
>> On Mon, Apr 21, 2014 at 1:10 PM, Matei Zaharia <ma...@gmail.com>
>> wrote:
>>
>>> There was a patch posted a few weeks ago (
>>> https://github.com/apache/spark/pull/223), but it needs a few changes
>>> in packaging because it uses a license that isn’t fully compatible with
>>> Apache. I’d like to get this merged when the changes are made though — it
>>> would be a good input source to support.
>>>
>>> Matei
>>>
>>>
>>> On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <
>>> nicholas.chammas@gmail.com> wrote:
>>>
>>> I'm looking to start experimenting with Spark Streaming, and I'd like to
>>> use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
>>> Looking at the list of supported Spark Streaming sources
>>> <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
>>> I don't see any mention of Kinesis.
>>>
>>> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
>>> there plans to add such support in the future?
>>>
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>
>>
>

Re: Spark Streaming source from Amazon Kinesis

Posted by Chris Fregly <ch...@fregly.com>.
i took this over from parviz.

i recently submitted a new PR for Kinesis Spark Streaming support:
https://github.com/apache/spark/pull/1434

others have tested it with good success, so give it a whirl!

waiting for it to be reviewed/merged.  please put any feedback into the PR
directly.

thanks!

-chris


On Mon, Apr 21, 2014 at 2:39 PM, Matei Zaharia <ma...@gmail.com>
wrote:

> No worries, looking forward to it!
>
> Matei
>
> On Apr 21, 2014, at 1:59 PM, Parviz Deyhim <pd...@gmail.com> wrote:
>
> sorry Matei. Will definitely start working on making the changes soon :)
>
>
> On Mon, Apr 21, 2014 at 1:10 PM, Matei Zaharia <ma...@gmail.com>
> wrote:
>
>> There was a patch posted a few weeks ago (
>> https://github.com/apache/spark/pull/223), but it needs a few changes in
>> packaging because it uses a license that isn’t fully compatible with
>> Apache. I’d like to get this merged when the changes are made though — it
>> would be a good input source to support.
>>
>> Matei
>>
>>
>> On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <ni...@gmail.com>
>> wrote:
>>
>> I'm looking to start experimenting with Spark Streaming, and I'd like to
>> use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
>> Looking at the list of supported Spark Streaming sources
>> <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
>> I don't see any mention of Kinesis.
>>
>> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
>> there plans to add such support in the future?
>>
>> Nick
>>
>>
>>
>>
>>
>
>

Re: Spark Streaming source from Amazon Kinesis

Posted by Matei Zaharia <ma...@gmail.com>.
No worries, looking forward to it!

Matei

On Apr 21, 2014, at 1:59 PM, Parviz Deyhim <pd...@gmail.com> wrote:

> sorry Matei. Will definitely start working on making the changes soon :)
> 
> 
> On Mon, Apr 21, 2014 at 1:10 PM, Matei Zaharia <ma...@gmail.com> wrote:
> There was a patch posted a few weeks ago (https://github.com/apache/spark/pull/223), but it needs a few changes in packaging because it uses a license that isn’t fully compatible with Apache. I’d like to get this merged when the changes are made though — it would be a good input source to support.
> 
> Matei
> 
> 
> On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <ni...@gmail.com> wrote:
> 
>> I'm looking to start experimenting with Spark Streaming, and I'd like to use Amazon Kinesis as my data source. Looking at the list of supported Spark Streaming sources, I don't see any mention of Kinesis.
>> 
>> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are there plans to add such support in the future?
>> 
>> Nick
>> 
>> 
> 
> 


Re: Spark Streaming source from Amazon Kinesis

Posted by Parviz Deyhim <pd...@gmail.com>.
sorry Matei. Will definitely start working on making the changes soon :)


On Mon, Apr 21, 2014 at 1:10 PM, Matei Zaharia <ma...@gmail.com> wrote:

> There was a patch posted a few weeks ago (
> https://github.com/apache/spark/pull/223), but it needs a few changes in
> packaging because it uses a license that isn’t fully compatible with
> Apache. I’d like to get this merged when the changes are made though — it
> would be a good input source to support.
>
> Matei
>
>
> On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <ni...@gmail.com>
> wrote:
>
> I'm looking to start experimenting with Spark Streaming, and I'd like to
> use Amazon Kinesis <https://aws.amazon.com/kinesis/> as my data source.
> Looking at the list of supported Spark Streaming sources <http://spark.apache.org/docs/latest/streaming-programming-guide.html#linking>,
> I don't see any mention of Kinesis.
>
> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are
> there plans to add such support in the future?
>
> Nick
>
>
>
>
>

Re: Spark Streaming source from Amazon Kinesis

Posted by Matei Zaharia <ma...@gmail.com>.
There was a patch posted a few weeks ago (https://github.com/apache/spark/pull/223), but it needs a few changes in packaging because it uses a license that isn’t fully compatible with Apache. I’d like to get this merged when the changes are made though — it would be a good input source to support.

Matei


On Apr 21, 2014, at 1:00 PM, Nicholas Chammas <ni...@gmail.com> wrote:

> I'm looking to start experimenting with Spark Streaming, and I'd like to use Amazon Kinesis as my data source. Looking at the list of supported Spark Streaming sources, I don't see any mention of Kinesis.
> 
> Is it possible to use Spark Streaming with Amazon Kinesis? If not, are there plans to add such support in the future?
> 
> Nick
> 
> 