Posted to dev@spark.apache.org by vinodkc <vi...@gmail.com> on 2015/04/16 12:36:38 UTC

spark shell paste mode is not consistent

Hi All,

I ran into the issue below while working with Spark. It seems spark-shell paste
mode is not consistent.

Example code 
---------------
val textFile = sc.textFile("README.md")
textFile.count() 
textFile.first() 
val linesWithSpark = textFile.filter(line => line.contains("Spark"))
textFile.filter(line => line.contains("Spark")).count() 

Step 1 : Run above code in spark-shell
--------

scala> val textFile = sc.textFile("README.md")
textFile: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1]
at textFile at <console>:21

scala> textFile.count() 
res0: Long = 98

scala> textFile.first() 
res1: String = # Apache Spark

scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at
filter at <console>:23

scala> textFile.filter(line => line.contains("Spark")).count() 
res2: Long = 19

Result 1: The following actions are all evaluated:
textFile.count(), textFile.first(), textFile.filter(line =>
line.contains("Spark")).count()
res0: Long = 98,res1: String = # Apache Spark,res2: Long = 19

Step 2 : Run above code in spark-shell paste mode
scala> :p
// Entering paste mode (ctrl-D to finish)

val textFile = sc.textFile("README.md")
textFile.count() 
textFile.first() 
val linesWithSpark = textFile.filter(line => line.contains("Spark"))
textFile.filter(line => line.contains("Spark")).count()

// Exiting paste mode, now interpreting.

textFile: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1]
at textFile at <console>:21
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at
filter at <console>:24
res0: Long = 19

scala>

Result 2: Only one action is executed 
textFile.filter(line => line.contains("Spark")).count()  
res0: Long = 19

Expected result : Result 1 and Result 2 should be same

I feel this is an issue with spark-shell. I fixed and verified it locally.
If the community also thinks it needs to be handled, I can submit a PR.

Thanks
Vinod KC



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/spark-shell-paste-mode-is-not-consistent-tp11621.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: spark shell paste mode is not consistent

Posted by vinodkc <vi...@gmail.com>.
Yes, I could see all the actions in the Spark UI.

The paste command returns only the last action's result to the console, which
is why I got confused. Thank you for the help.

Vinod
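
For anyone hitting the same confusion: in :paste mode only the last
expression's value is echoed, so one workaround (a sketch using plain Scala
collections as a stand-in for the Spark example above) is to bind each
intermediate result to a val and print what you need explicitly:

```scala
// Plain-Scala stand-in for the Spark example: bind each result to a val
// so nothing is silently swallowed by the REPL's "last value only" echo.
val lines = List("# Apache Spark", "Spark is fast", "hello world")

val total      = lines.size                       // stands in for textFile.count()
val first      = lines.head                       // stands in for textFile.first()
val sparkLines = lines.count(_.contains("Spark")) // stands in for the filtered count

println(s"total=$total, first=$first, sparkLines=$sparkLines")
```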
On Apr 16, 2015 5:22 PM, "Sean Owen [via Apache Spark Developers List]" <
ml-node+s1001551n11624h17@n3.nabble.com> wrote:

> No, look at the Spark UI. You can see all three were executed.





Re: spark shell paste mode is not consistent

Posted by Sean Owen <so...@cloudera.com>.
No, look at the Spark UI. You can see all three were executed.

On Thu, Apr 16, 2015 at 12:05 PM, Vinod KC <vi...@gmail.com> wrote:
> Hi Sean,
>
> In paste mode, the shell is evaluating only the last action. It ignores the
> previous actions, i.e., it is not executing the actions
> textFile.count() and textFile.first().
>
> Thanks
> Vinod
>
> I'm not sure I understand what you are suggesting is wrong. It prints the
> result of the last command. In the second case that is the whole pasted
> block so you see 19.
> On Apr 16, 2015 11:37 AM, "vinodkc" <[hidden email]> wrote:
>
>> Hi All,
>>
>> I faced below issue while working with spark. It seems spark shell paste
>> mode is not consistent
>>
>> Example code
>> ---------------
>> val textFile = sc.textFile("README.md")
>> textFile.count()
>> textFile.first()
>> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
>> textFile.filter(line => line.contains("Spark")).count()
>>
>> Step 1 : Run above code in spark-shell
>> --------
>>
>> scala> val textFile = sc.textFile("README.md")
>> textFile: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1]
>> at textFile at <console>:21
>>
>> scala> textFile.count()
>> res0: Long = 98
>>
>> scala> textFile.first()
>> res1: String = # Apache Spark
>>
>> scala> val linesWithSpark = textFile.filter(line =>
>> line.contains("Spark"))
>> linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at
>> filter at <console>:23
>>
>> scala> textFile.filter(line => line.contains("Spark")).count()
>> res2: Long = 19
>>
>> Result 1: Following actions are evaluated properly
>> textFile.count() ,textFile.first() ,textFile.filter(line =>
>> line.contains("Spark")).count()
>> res0: Long = 98,res1: String = # Apache Spark,res2: Long = 19
>>
>> Step 2 : Run above code in spark-shell paste mode
>> scala> :p
>> // Entering paste mode (ctrl-D to finish)
>>
>> val textFile = sc.textFile("README.md")
>> textFile.count()
>> textFile.first()
>> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
>> textFile.filter(line => line.contains("Spark")).count()
>>
>> // Exiting paste mode, now interpreting.
>>
>> textFile: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1]
>> at textFile at <console>:21
>> linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at
>> filter at <console>:24
>> res0: Long = 19
>>
>> scala>
>>
>> Result 2: Only one action is executed
>> textFile.filter(line => line.contains("Spark")).count()
>> res0: Long = 19
>>
>> Expected result : Result 1 and Result 2 should be same
>>
>> I feel this is an issue with spark shell . I fixed and verified it
>> locally.If community also think that it need to be handled, I can make a
>> PR.
>>
>> Thanks
>> Vinod KC
>>
>>
>>
>> --
>> View this message in context:
>>
>> http://apache-spark-developers-list.1001551.n3.nabble.com/spark-shell-paste-mode-is-not-consistent-tp11621.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [hidden email]
>> For additional commands, e-mail: [hidden email]
>>
>>
>
>
> ________________________________
> If you reply to this email, your message will be added to the discussion
> below:
> http://apache-spark-developers-list.1001551.n3.nabble.com/spark-shell-paste-mode-is-not-consistent-tp11621p11622.html
> To start a new topic under Apache Spark Developers List, email
> ml-node+s1001551n1h4@n3.nabble.com
> To unsubscribe from Apache Spark Developers List, click here.
> NAML



Re: spark shell paste mode is not consistent

Posted by Vinod KC <vi...@gmail.com>.
Hi Sean,

In paste mode, the shell is evaluating only the last action. It ignores the
previous actions, i.e., it is not executing the actions
textFile.count() and textFile.first().

Thanks
Vinod
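
A side-effect counter makes it easy to check whether the earlier statements in
a pasted block actually run (a hypothetical plain-Scala sketch, not part of
the original mail):

```scala
// Count how many times tracked() is invoked inside a single block.
var evaluations = 0
def tracked(n: Int): Int = { evaluations += 1; n * 10 }

val result = {
  tracked(1)  // runs, value discarded
  tracked(2)  // runs, value discarded
  tracked(3)  // runs, and its value becomes the block's value
}

println(evaluations) // prints 3: every statement executed
println(result)      // prints 30: only the last value is returned
```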

Re: spark shell paste mode is not consistent

Posted by Sean Owen <so...@cloudera.com>.
I'm not sure I understand what you are suggesting is wrong. It prints the
result of the last command. In the second case that is the whole pasted
block so you see 19.
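
This mirrors ordinary Scala block semantics: every statement in a block is
evaluated, but only the final expression becomes the block's value (a minimal
illustration, independent of Spark):

```scala
val xs = List(1, 2, 3)

val result = {
  xs.size          // evaluated, result discarded
  xs.head          // evaluated, result discarded
  xs.count(_ > 1)  // final expression: becomes the block's value
}

println(result)  // prints 2
```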