Posted to user@spark.apache.org by Bahubali Jain <ba...@gmail.com> on 2015/08/20 15:15:10 UTC
DAG related query
Hi,
What would the DAG look like for the code below?
JavaRDD<String> rdd1 = context.textFile(<SOMEPATH>);
JavaRDD<String> rdd2 = rdd1.map(<DO something>);
rdd1 = rdd2.map(<Do SOMETHING>);
Does this lead to any kind of cycle?
Thanks,
Baahu
Re: DAG related query
Posted by Andrew Or <an...@databricks.com>.
Hi Bahubali,
Once RDDs are created, they are immutable (in most cases). In your case you
end up with 3 RDDs:
(1) the original rdd1, which reads from the text file,
(2) rdd2, which applies a map function to (1), and
(3) the new rdd1, which applies a map function to (2).
There's no cycle because you have 3 distinct RDDs. All you're doing is
reassigning a reference `rdd1`, but the underlying RDD doesn't change.
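
The point about reassignment can be sketched in plain Java without Spark. The `Node` class and the `lineage` helper below are hypothetical stand-ins, not part of Spark's API: each node is immutable and only remembers the node it was derived from, which is roughly how a transformation such as map relates to its parent. Under that assumption, rebinding the variable `rdd1` adds a third node to the chain and leaves the first one untouched, so walking back from the new `rdd1` always ends at the source.

```java
import java.util.ArrayList;
import java.util.List;

public class LineageSketch {

    // Hypothetical stand-in for an RDD: an immutable node that remembers
    // only the node it was derived from. This is NOT Spark's API.
    static final class Node {
        final String name;
        final Node parent; // null for a source such as textFile

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        // Like a transformation: returns a new node, never modifies this one.
        Node map(String name) {
            return new Node(name, this);
        }
    }

    // Walk the parent links from a node back to its source.
    static List<String> lineage(Node node) {
        List<String> names = new ArrayList<>();
        for (Node n = node; n != null; n = n.parent) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Node rdd1 = new Node("textFile", null);
        Node rdd2 = rdd1.map("first map");
        rdd1 = rdd2.map("second map"); // rebinds the variable only

        // prints [second map, first map, textFile]
        System.out.println(lineage(rdd1));
    }
}
```

The walk terminates because every node points only at an older node. The original source node is still reachable through `rdd2.parent` even though the variable `rdd1` no longer refers to it.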
-Andrew
2015-08-20 6:21 GMT-07:00 Sean Owen <so...@cloudera.com>:
> No. The third line creates a third RDD whose reference simply replaces
> the reference to the first RDD in your local driver program. The first
> RDD still exists.
Re: DAG related query
Posted by Sean Owen <so...@cloudera.com>.
No. The third line creates a third RDD whose reference simply replaces
the reference to the first RDD in your local driver program. The first
RDD still exists.
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org