Posted to user@spark.apache.org by Bahubali Jain <ba...@gmail.com> on 2015/08/20 15:15:10 UTC

DAG related query

Hi,
What would the DAG look like for the code below?

JavaRDD<String> rdd1 = context.textFile(<SOMEPATH>);
JavaRDD<String> rdd2 = rdd1.map(<DO something>);
rdd1 =  rdd2.map(<Do SOMETHING>);

Does this lead to any kind of cycle?

Thanks,
Baahu

Re: DAG related query

Posted by Andrew Or <an...@databricks.com>.
Hi Bahubali,

Once RDDs are created, they are immutable (in most cases). In your case you
end up with 3 RDDs:

(1) the original rdd1 that reads from the text file
(2) rdd2, that applies a map function on (1), and
(3) the new rdd1 that applies a map function on (2)

There's no cycle because you have 3 distinct RDDs. All you're doing is
reassigning a reference `rdd1`, but the underlying RDD doesn't change.
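
To make the point about references concrete, here is a minimal sketch in
plain Java. It does not use Spark: `Node`, `map(String)` and `lineage` are
hypothetical stand-ins invented for illustration, and the labels are made
up. Each node is immutable and only points back at the parent it was
derived from, which is a rough analogy for how an RDD records its lineage.

```java
import java.util.ArrayList;
import java.util.List;

public class LineageSketch {
    /** Hypothetical stand-in for an RDD: an immutable node that remembers its parent. */
    static final class Node {
        final String label;
        final Node parent; // null for the source node

        Node(String label, Node parent) {
            this.label = label;
            this.parent = parent;
        }

        /** Like a transformation: returns a NEW node and never modifies this one. */
        Node map(String label) {
            return new Node(label, this);
        }
    }

    /** Walks the parent links from the given node back to the source. */
    static List<String> lineage(Node node) {
        List<String> labels = new ArrayList<>();
        for (Node n = node; n != null; n = n.parent) {
            labels.add(n.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        Node rdd1 = new Node("textFile", null);
        Node rdd2 = rdd1.map("map #1");
        Node original = rdd1;        // keep a handle on the first node
        rdd1 = rdd2.map("map #2");   // rebinds the variable only

        System.out.println(lineage(rdd1));     // [map #2, map #1, textFile]
        System.out.println(original != rdd1);  // true: two distinct objects
        System.out.println(rdd2.parent == original); // true: rdd2 still points at the first node
    }
}
```

Because a node can only point at a parent that already existed when it was
created, the walk in `lineage` always terminates, so no cycle can form. In
Spark itself, calling `toDebugString()` on the final RDD prints the real
lineage.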

-Andrew

2015-08-20 6:21 GMT-07:00 Sean Owen <so...@cloudera.com>:

> No. The third line creates a third RDD whose reference simply replaces
> the reference to the first RDD in your local driver program. The first
> RDD still exists.

Re: DAG related query

Posted by Sean Owen <so...@cloudera.com>.
No. The third line creates a third RDD whose reference simply replaces
the reference to the first RDD in your local driver program. The first
RDD still exists.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org