Posted to user@spark.apache.org by Umar Javed <um...@gmail.com> on 2013/10/25 21:27:44 UTC

understanding spark internals

Hi,

I want to build an extension to Spark, which requires me to understand the
existing Spark components. Does anybody have an idea where to start? Should
I use a debugger, or are print statements the way to go? I'm using PySpark,
by the way.

cheers,
Umar
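(A minimal sketch of the print-statement route: rather than scattering bare prints while tracing driver-side code, a small logging decorator keeps the trace output toggleable by log level. The names `traced` and `add` below are illustrative, and the snippet uses only the Python standard library, so it runs without Spark installed.)

```python
import logging

# Show DEBUG-level trace output; raise to INFO to silence it without code changes.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")

def traced(fn):
    """Log entry and exit of fn -- a toggleable alternative to print statements."""
    log = logging.getLogger(fn.__module__)
    def wrapper(*args, **kwargs):
        log.debug("enter %s args=%r kwargs=%r", fn.__name__, args, kwargs)
        result = fn(*args, **kwargs)
        log.debug("exit  %s -> %r", fn.__name__, result)
        return result
    return wrapper

@traced
def add(a, b):
    return a + b

add(2, 3)  # emits an enter/exit pair at DEBUG level
```

The same decorator can be applied to any driver-side function you are trying to follow, which is often less disruptive than attaching a debugger to a distributed job.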

Re: understanding spark internals

Posted by Mark Hamstra <ma...@clearstorydata.com>.
You can start by moving this discussion to the dev list.



On Fri, Oct 25, 2013 at 12:27 PM, Umar Javed <um...@gmail.com> wrote:

> Hi,
>
> I want to build an extension to Spark, which requires me to understand the
> existing Spark components. Does anybody have an idea where to start? Should
> I use a debugger, or are print statements the way to go? I'm using PySpark,
> by the way.
>
> cheers,
> Umar
>

Re: understanding spark internals

Posted by Nan Zhu <zh...@gmail.com>.
dev-subscribe@spark.incubator.apache.org


On Fri, Oct 25, 2013 at 4:44 PM, dachuan <hd...@gmail.com> wrote:

> Hi all,
>
> Sorry to ask a simple question, but does anyone know how to join the dev
> mailing list? I sent an empty email to dev@spark.incubator.apache.org, but
> it was rejected.
>
> thanks.
>
>
> On Fri, Oct 25, 2013 at 4:06 PM, dachuan <hd...@gmail.com> wrote:
>
>> Hi, I started reading the Spark code two days ago. My eventual goal is to
>> reach the fault-tolerance code; I'd appreciate it if somebody could point
>> me directly to that place, as I am somewhat lost in the ocean of Akka
>> messages.
>>
>> A quick note on what I have done in these two days:
>> I followed the SparkPageRank.scala example and traced its workflow; the
>> first class involved is SparkContext, and by the end two actors are alive
>> (DriverActor and Client).
>>
>> I am happy to share my notes (in OneNote format) if anyone needs them.
>>
>> thanks,
>> dachuan.
>>
>
>
>
> --
> Dachuan Huang
> Cellphone: 614-390-7234
> 2015 Neil Avenue
> Ohio State University
> Columbus, Ohio
> U.S.A.
> 43210
>

Re: understanding spark internals

Posted by dachuan <hd...@gmail.com>.
Hi all,

Sorry to ask a simple question, but does anyone know how to join the dev
mailing list? I sent an empty email to dev@spark.incubator.apache.org, but
it was rejected.

thanks.


On Fri, Oct 25, 2013 at 4:06 PM, dachuan <hd...@gmail.com> wrote:

> Hi, I started reading the Spark code two days ago. My eventual goal is to
> reach the fault-tolerance code; I'd appreciate it if somebody could point
> me directly to that place, as I am somewhat lost in the ocean of Akka
> messages.
>
> A quick note on what I have done in these two days:
> I followed the SparkPageRank.scala example and traced its workflow; the
> first class involved is SparkContext, and by the end two actors are alive
> (DriverActor and Client).
>
> I am happy to share my notes (in OneNote format) if anyone needs them.
>
> thanks,
> dachuan.
>



-- 
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210

Re: understanding spark internals

Posted by dachuan <hd...@gmail.com>.
Hi, I started reading the Spark code two days ago. My eventual goal is to
reach the fault-tolerance code; I'd appreciate it if somebody could point
me directly to that place, as I am somewhat lost in the ocean of Akka
messages.

A quick note on what I have done in these two days:
I followed the SparkPageRank.scala example and traced its workflow; the
first class involved is SparkContext, and by the end two actors are alive
(DriverActor and Client).

I am happy to share my notes (in OneNote format) if anyone needs them.

thanks,
dachuan.

Re: understanding spark internals

Posted by Matei Zaharia <ma...@gmail.com>.
Hi Umar,

The Spark wiki at https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage has a few pages on Spark internals (specifically the Python and Java APIs) and on how to build and contribute to Spark (https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark). Hopefully this gives you a good starting point for learning the codebase.

Matei

On Oct 25, 2013, at 12:27 PM, Umar Javed <um...@gmail.com> wrote:

> Hi,
> 
> I want to build an extension to Spark, which requires me to understand the existing Spark components. Does anybody have an idea where to start? Should I use a debugger, or are print statements the way to go? I'm using PySpark, by the way.
> 
> cheers,
> Umar