Posted to user@spark.apache.org by Ravi Sharma <ra...@gmail.com> on 2014/09/10 15:55:12 UTC

Global Variables in Spark Streaming

Hi Friends,

I'm using Spark Streaming as a Kafka consumer. I want to do CEP (complex
event processing) with Spark, so I need to store my sequence of events so
that I can detect patterns.

My question is: how can I temporarily save my events in a Java collection,
so that I can detect patterns across *processed (temporarily stored) and
upcoming events*?


Cheers,
Ravi Sharma

Re: Global Variables in Spark Streaming

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Yes, your understanding is correct. In that case, the easiest option would be
to serialize the object and dump it somewhere in HDFS, so that you can
recreate/update the object from the file.
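
The serialize-and-restore idea can be sketched roughly as follows. This is only
a minimal illustration, not the code we run: it assumes a local file standing
in for an HDFS location, and the names save_events, load_events and
snapshot_path are hypothetical.

```python
import os
import pickle
import tempfile


def save_events(events, path):
    """Serialize the event list to `path`, replacing any previous snapshot."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(events, f)
    # Rename only after the write finished, so readers never see a partial file.
    os.replace(tmp_path, path)


def load_events(path):
    """Recreate the event list from `path`, or start empty if no snapshot exists."""
    if not os.path.exists(path):
        return []
    with open(path, "rb") as f:
        return pickle.load(f)


# Assumption: a local temp directory stands in for an HDFS path.
snapshot_path = os.path.join(tempfile.mkdtemp(), "events.pkl")

events = load_events(snapshot_path)  # nothing stored yet, so this is []
events.append({"type": "login", "user": "u1"})
save_events(events, snapshot_path)

# Later (for example in the next batch): restore, update, and store again.
restored = load_events(snapshot_path)
restored.append({"type": "purchase", "user": "u1"})
save_events(restored, snapshot_path)
```

Note that every update rewrites the whole snapshot, so this probably only suits
small state that changes at batch granularity.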

We have something similar, which you can find in BroadCastServer
<https://github.com/sigmoidanalytics/spork/blob/spork-0.9/src/org/apache/pig/backend/hadoop/executionengine/spark/BroadCastServer.java>
and BroadCastClient
<https://github.com/sigmoidanalytics/spork/blob/spork-0.9/src/org/apache/pig/backend/hadoop/executionengine/spark/BroadCastClient.java>;
we use these internally to pass/update objects between master and worker
nodes.

Thanks
Best Regards

Re: Global Variables in Spark Streaming

Posted by Ravi Sharma <ra...@gmail.com>.
Akhil, by using a broadcast variable, will I be able to change its value?
As per my understanding, it creates a final variable whose value can be
accessed across the cluster.

Please correct me if I'm wrong.

Thanks,

Cheers,
Ravi Sharma

Re: Global Variables in Spark Streaming

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Have a look at broadcast variables:
http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables


Thanks
Best Regards

Re: Global Variables in Spark Streaming

Posted by Santiago Mola <sm...@stratio.com>.
Hi Ravi,

2014-09-10 15:55 GMT+02:00 Ravi Sharma <ra...@gmail.com>:
>
>
> I'm using Spark Streaming as a Kafka consumer. I want to do CEP with
> Spark, so I need to store my sequence of events so that I can detect
> patterns.
>

Depending on what you're trying to accomplish, you might implement this
using Spark Streaming only, by using the updateStateByKey transformation.
[1]

This will allow you to maintain global states that you can combine with
other streaming operations. We have successfully used this approach to
detect patterns in log sequences with Spark Streaming.
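
As a rough illustration of the idea (a sketch under stated assumptions, not our
production code), the per-key update function below has the shape that
PySpark's updateStateByKey expects: it receives the values seen for a key in
the current batch plus the previous state (None the first time) and returns the
new state. The event names, the "three failed logins in a row" pattern, and the
run_batches helper are all hypothetical; run_batches is only a local stand-in
for the streaming engine so the logic can be tried without a cluster.

```python
def update_failure_streak(new_events, state):
    """Track consecutive failed logins for one key.

    State is (current_streak, alerted); `state` is None for a key's first batch.
    """
    streak, alerted = state if state is not None else (0, False)
    for event in new_events:
        streak = streak + 1 if event == "login_failed" else 0
        if streak >= 3:
            alerted = True  # pattern seen: three failures in a row
    return (streak, alerted)


def run_batches(batches, update_func):
    """Local stand-in for the engine: apply update_func per key, batch by batch."""
    state = {}
    for batch in batches:
        grouped = {}
        for key, value in batch:
            grouped.setdefault(key, []).append(value)
        # Keys with existing state are updated even when the batch has no new
        # values for them (they get an empty list), as updateStateByKey does.
        for key in set(state) | set(grouped):
            new_state = update_func(grouped.get(key, []), state.get(key))
            if new_state is None:
                state.pop(key, None)  # returning None drops the key's state
            else:
                state[key] = new_state
    return state


# Hypothetical micro-batches of (user, event) pairs.
batches = [
    [("alice", "login_failed"), ("bob", "login_ok")],
    [("alice", "login_failed")],
    [("alice", "login_failed"), ("bob", "login_failed")],
]
final_state = run_batches(batches, update_failure_streak)
```

In an actual PySpark job the same function would be passed as
pairs.updateStateByKey(update_failure_streak) on a DStream of (key, event)
pairs, with a checkpoint directory set on the streaming context beforehand,
since stateful transformations require checkpointing.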

[1]
http://spark.apache.org/docs/latest/streaming-programming-guide.html#transformations

Best,
-- 
Santiago M. Mola
smola@stratio.com