
Posted to user@spark.apache.org by yh18190 <yh...@gmail.com> on 2014/04/03 01:11:32 UTC

Regarding Sparkcontext object

Hi

Does the SparkContext object always have to be created in the main method
of a class? Or can we create the "sc" object in another class and use it by
passing it to other functions?

Please clarify.




Re: Regarding Sparkcontext object

Posted by Daniel Siegmann <da...@velos.io>.

On Wed, Apr 2, 2014 at 7:11 PM, yh18190 <yh...@gmail.com> wrote:

> Does the SparkContext object always have to be created in the main method
> of a class? Or can we create the "sc" object in another class and use it by
> passing it to other functions?
>

The Spark context can be initialized wherever you like and passed around
just as any other object. Just don't try to create multiple contexts
against "local" (without stopping the previous one first), or you may get
ArrayStoreExceptions (I learned that one the hard way).

-- 
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegmann@velos.io W: www.velos.io
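
A minimal sketch of the point above, assuming PySpark. The names
build_context, WordStats and run_job are hypothetical and only illustrate
creating the context outside a main method, handing it to another class, and
stopping it before any other context is created.

```python
def build_context(app_name="context-demo", master="local[2]"):
    """Create a SparkContext; this can live in any module or class."""
    # Imported lazily so the helpers below can be loaded without Spark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name).setMaster(master)
    return SparkContext(conf=conf)


class WordStats:
    """Hypothetical helper that receives an existing context instead of creating one."""

    def __init__(self, sc):
        self.sc = sc  # the context is passed around like any other object

    def total_length(self, words):
        # Distribute the words, map each to its length, and add the lengths up.
        return self.sc.parallelize(words).map(len).sum()


def run_job(words, master="local[2]"):
    """Create a context, use it, and always stop it afterwards."""
    sc = build_context(master=master)
    try:
        return WordStats(sc).total_length(words)
    finally:
        # Stop the context so that a later build_context() call does not
        # end up with two live contexts against "local".
        sc.stop()
```

Calling run_job(["spark", "context"]) needs a working Spark installation;
WordStats itself only relies on the object it is given having a parallelize
method.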

Re: Avro serialization

Posted by Ron Gonzalez <zl...@yahoo.com>.

Thanks will take a look...



Re: Avro serialization

Posted by FRANK AUSTIN NOTHAFT <fn...@berkeley.edu>.

We use Avro objects in our project and have a Kryo serializer for generic
Avro SpecificRecords. Take a look at:

https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/edu/berkeley/cs/amplab/adam/serialization/ADAMKryoRegistrator.scala

Also, Matt Massie has a good blog post about this at
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.

Frank Austin Nothaft
fnothaft@berkeley.edu
fnothaft@eecs.berkeley.edu
202-340-0466
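
A rough sketch of how a Kryo registrator such as the one linked above might
be enabled, assuming PySpark and the standard Spark properties
spark.serializer and spark.kryo.registrator. The helper names kryo_settings
and apply_settings are hypothetical, and the registrator class name is
inferred from the file path in the link rather than verified; the class
would also have to be on the JVM classpath.

```python
KRYO_SERIALIZER = "org.apache.spark.serializer.KryoSerializer"

# Assumption: fully qualified name guessed from the path of the linked file.
ADAM_REGISTRATOR = "edu.berkeley.cs.amplab.adam.serialization.ADAMKryoRegistrator"


def kryo_settings(registrator_class):
    """Return the Spark properties that switch serialization to Kryo."""
    return {
        "spark.serializer": KRYO_SERIALIZER,
        "spark.kryo.registrator": registrator_class,
    }


def apply_settings(conf, settings):
    """Copy each property onto a SparkConf-like object and return it."""
    for key, value in settings.items():
        conf.set(key, value)
    return conf
```

With Spark available, apply_settings(SparkConf(), kryo_settings(ADAM_REGISTRATOR))
would give a configuration that can be passed to SparkContext(conf=...).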



Re: Avro serialization

Posted by Ian O'Connell <ia...@ianoconnell.com>.

Objects being transformed need to be one of these (Java-serializable or
Kryo-serializable) while in flight. Source data can just use the MapReduce
input formats, so anything you can do with mapred works. For an Avro one,
you probably want one of:
https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*

Or whatever you're using at the moment to open them in an MR job could
probably be repurposed.
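
As an illustration of reusing a MapReduce input format from Spark, here is a
sketch assuming PySpark, the avro-mapred classes, and the Avro converter
that ships in the Spark examples jar (all of which would need to be on the
classpath). It is not the elephant-bird approach linked above. read_avro and
add_name_length are hypothetical names, the "name" field is made up, and
writing Avro back out is left aside.

```python
# Class names handed to SparkContext.newAPIHadoopFile.
AVRO_INPUT = {
    "inputFormatClass": "org.apache.avro.mapreduce.AvroKeyInputFormat",
    "keyClass": "org.apache.avro.mapred.AvroKey",
    "valueClass": "org.apache.hadoop.io.NullWritable",
    # Assumption: converter from the Spark examples jar that turns Avro records into dicts.
    "keyConverter": "org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter",
}


def read_avro(sc, path):
    """Load an Avro file as an RDD of records using a Hadoop input format."""
    pairs = sc.newAPIHadoopFile(path, **AVRO_INPUT)
    # Each element is a (record, None) pair; keep only the record.
    return pairs.map(lambda kv: kv[0])


def add_name_length(record):
    """Example transformation on one record (assumes a string field called "name")."""
    out = dict(record)
    out["name_length"] = len(out.get("name", ""))
    return out
```

A job would then look like read_avro(sc, path).map(add_name_length), with sc
being an existing SparkContext and path pointing at an Avro file.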


Avro serialization

Posted by Ron Gonzalez <zl...@yahoo.com>.

Hi,
  I know that sources need to either be Java serializable or use Kryo serialization.
  Does anyone have sample code that reads, transforms, and writes Avro files in Spark?

Thanks,
Ron