Posted to dev@spark.apache.org by Zahid Rahman <za...@gmail.com> on 2020/03/27 05:24:06 UTC
BUG: take with SparkSession.master[url]
With the following SparkSession configuration:

val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();

this line works:

flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
However, if I change the master URL to the standalone cluster's IP address, like so:

val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();

then the following error is produced, depending on the position of .take(5):
20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
instance of java.lang.invoke.SerializedLambda to field
org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
of org.apache.spark.rdd.MapPartitionsRDD
But if I remove take(5), change the position of take(5), or insert an extra take(5), as illustrated in the code below, then it works. I don't see why the position of take(5) should cause such an error, or why changing the master URL should trigger it:
flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)

flights.take(5)

flights
  .take(5)
  .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
  .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))

flights.show(5)
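One plausible reading of why the position matters (my interpretation, not stated in the thread): Dataset.take(5) is an action that returns a plain Array[flight] on the driver, so any .filter/.map chained after it are ordinary Scala collection methods and never leave the driver JVM; only when the lambdas are applied to the Dataset itself must they be serialized and loaded on the executors. A sketch of the distinction, assuming `flights: Dataset[flight]` as in the code below:

```scala
// take(5) collects results to the driver; what follows is local Scala.
val local: Array[flight] = flights.take(5)
val onDriver = local.filter(_.ORIGIN_COUNTRY_NAME != "Canada") // plain Array.filter, no serialization

// By contrast, filtering the Dataset first is a distributed transformation:
val distributed = flights.filter(_.ORIGIN_COUNTRY_NAME != "Canada")
distributed.take(5) // triggers a job; the lambda's enclosing class must be on the executor classpath
```

This would explain why local[*] works (driver and executors share one JVM and classpath) while the spark:// master fails only when a lambda has to travel to a remote executor.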
Complete code, if you wish to replicate it:
import org.apache.spark.sql.SparkSession

object sessiontest {

  // Define a specific data type class, then manipulate it using the
  // filter and map functions; this is also known as an Encoder.
  case class flight(DEST_COUNTRY_NAME: String,
                    ORIGIN_COUNTRY_NAME: String,
                    count: BigInt)

  def main(args: Array[String]): Unit = {

    val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();

    import spark.implicits._
    val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
    val flights = flightDf.as[flight]

    flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)

    flights.take(5)

    flights
      .take(5)
      .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
      .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))

    flights.show(5)
  } // main
}
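For context on the error itself (my interpretation, not from the thread): this SerializedLambda ClassCastException is commonly seen when an application is launched from an IDE against a standalone master without shipping the application jar, so the executors cannot load the class that defines the lambda. A hedged sketch of one common workaround, assuming the project has been packaged as a jar first; the jar path below is a hypothetical example:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: tell Spark to ship the packaged application jar to the cluster,
// so executors can deserialize lambdas defined in it.
// "target/scala-2.12/sessiontest.jar" is an assumed sbt-package output path.
val spark = SparkSession.builder()
  .master("spark://192.168.0.38:7077")
  .appName("Spark Session take")
  .config("spark.jars", "target/scala-2.12/sessiontest.jar")
  .getOrCreate()
```

Submitting the jar with spark-submit instead of running it directly from the IDE achieves the same effect.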
Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
~/spark-3.0.0-preview2-bin-hadoop2.7$ sbin/start-slave.sh spark://192.168.0.38:7077
~/spark-3.0.0-preview2-bin-hadoop2.7$ sbin/start-master.sh
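For reference, the standalone daemons are normally brought up master-first, so the worker has a master to register with (a sketch assuming the default distribution layout):

```shell
cd ~/spark-3.0.0-preview2-bin-hadoop2.7
sbin/start-master.sh                              # master listens on spark://<host>:7077
sbin/start-slave.sh spark://192.168.0.38:7077     # worker registers with the master
```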
On Fri, 27 Mar 2020 at 06:12, Zahid Rahman <za...@gmail.com> wrote:
> sbin/start-master.sh
> sbin/start-slave.sh spark://192.168.0.38:7077
>
> Backbutton.co.uk
> ¯\_(ツ)_/¯
> ♡۶Java♡۶RMI ♡۶
> Make Use Method {MUM}
> makeuse.org
> <http://www.backbutton.co.uk>
>
>
> On Fri, 27 Mar 2020 at 05:59, Wenchen Fan <cl...@gmail.com> wrote:
>
>> Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
>> just include Spark dependency in IntelliJ?
>>
>> On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com>
>> wrote:
>>
>>> I have configured in IntelliJ as external jars
>>> spark-3.0.0-preview2-bin-hadoop2.7/jar
>>>
>>> not pulling anything from maven.
>>>
>>> Backbutton.co.uk
>>> ¯\_(ツ)_/¯
>>> ♡۶Java♡۶RMI ♡۶
>>> Make Use Method {MUM}
>>> makeuse.org
>>> <http://www.backbutton.co.uk>
>>>
>>>
>>> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>>>
>>>> Which Spark/Scala version do you use?
>>>>
>>>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> with the following sparksession configuration
>>>>>
>>>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>>>
>>>>> this line works
>>>>>
>>>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>>
>>>>> however if change the master url like so, with the ip address then the
>>>>> following error is produced by the position of .take(5)
>>>>>
>>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>>
>>>>>
>>>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>>>> instance of java.lang.invoke.SerializedLambda to field
>>>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>>>
>>>>> BUT if I remove take(5) or change the position of take(5) or insert
>>>>> an extra take(5) as illustrated in code then it works. I don't see why the
>>>>> position of take(5) should cause such an error or be caused by changing the
>>>>> master url
>>>>>
>>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>> flights.take(5)
>>>>>
>>>>> flights
>>>>> .take(5)
>>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>>> flights.show(5)
>>>>>
>>>>>
>>>>> complete code if you wish to replicate it.
>>>>>
>>>>> import org.apache.spark.sql.SparkSession
>>>>>
>>>>> object sessiontest {
>>>>>
>>>>> // define specific data type class then manipulate it using the filter and map functions
>>>>> // this is also known as an Encoder
>>>>> case class flight (DEST_COUNTRY_NAME: String,
>>>>> ORIGIN_COUNTRY_NAME:String,
>>>>> count: BigInt)
>>>>>
>>>>>
>>>>> def main(args:Array[String]): Unit ={
>>>>>
>>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>>
>>>>> import spark.implicits._
>>>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>>>> val flights = flightDf.as[flight]
>>>>>
>>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>> flights.take(5)
>>>>>
>>>>> flights
>>>>> .take(5)
>>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>>> flights.show(5)
>>>>>
>>>>> } // main
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Backbutton.co.uk
>>>>> ¯\_(ツ)_/¯
>>>>> ♡۶Java♡۶RMI ♡۶
>>>>> Make Use Method {MUM}
>>>>> makeuse.org
>>>>> <http://www.backbutton.co.uk>
>>>>>
>>>>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
sbin/start-master.sh
sbin/start-slave.sh spark://192.168.0.38:7077
On Fri, 27 Mar 2020 at 05:59, Wenchen Fan <cl...@gmail.com> wrote:
> Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
> just include Spark dependency in IntelliJ?
>
> On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com> wrote:
>
>> I have configured in IntelliJ as external jars
>> spark-3.0.0-preview2-bin-hadoop2.7/jar
>>
>> not pulling anything from maven.
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>>
>> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>>
>>> Which Spark/Scala version do you use?
>>>
>>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> with the following sparksession configuration
>>>>
>>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>>
>>>> this line works
>>>>
>>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>>
>>>> however if change the master url like so, with the ip address then the
>>>> following error is produced by the position of .take(5)
>>>>
>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>
>>>>
>>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>>> instance of java.lang.invoke.SerializedLambda to field
>>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>>
>>>> BUT if I remove take(5) or change the position of take(5) or insert an
>>>> extra take(5) as illustrated in code then it works. I don't see why the
>>>> position of take(5) should cause such an error or be caused by changing the
>>>> master url
>>>>
>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>> flights.take(5)
>>>>
>>>> flights
>>>> .take(5)
>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>> flights.show(5)
>>>>
>>>>
>>>> complete code if you wish to replicate it.
>>>>
>>>> import org.apache.spark.sql.SparkSession
>>>>
>>>> object sessiontest {
>>>>
>>>> // define specific data type class then manipulate it using the filter and map functions
>>>> // this is also known as an Encoder
>>>> case class flight (DEST_COUNTRY_NAME: String,
>>>> ORIGIN_COUNTRY_NAME:String,
>>>> count: BigInt)
>>>>
>>>>
>>>> def main(args:Array[String]): Unit ={
>>>>
>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>
>>>> import spark.implicits._
>>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>>> val flights = flightDf.as[flight]
>>>>
>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>> flights.take(5)
>>>>
>>>> flights
>>>> .take(5)
>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>> flights.show(5)
>>>>
>>>> } // main
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Backbutton.co.uk
>>>> ¯\_(ツ)_/¯
>>>> ♡۶Java♡۶RMI ♡۶
>>>> Make Use Method {MUM}
>>>> makeuse.org
>>>> <http://www.backbutton.co.uk>
>>>>
>>>
Re: BUG: take with SparkSession.master[url]
Posted by Wenchen Fan <cl...@gmail.com>.
Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
just include Spark dependency in IntelliJ?
On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com> wrote:
> I have configured in IntelliJ as external jars
> spark-3.0.0-preview2-bin-hadoop2.7/jar
>
> not pulling anything from maven.
>
> Backbutton.co.uk
> ¯\_(ツ)_/¯
> ♡۶Java♡۶RMI ♡۶
> Make Use Method {MUM}
> makeuse.org
> <http://www.backbutton.co.uk>
>
>
> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>
>> Which Spark/Scala version do you use?
>>
>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>> wrote:
>>
>>>
>>> with the following sparksession configuration
>>>
>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>
>>> this line works
>>>
>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>>
>>> however if change the master url like so, with the ip address then the
>>> following error is produced by the position of .take(5)
>>>
>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>
>>>
>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>> instance of java.lang.invoke.SerializedLambda to field
>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>
>>> BUT if I remove take(5) or change the position of take(5) or insert an
>>> extra take(5) as illustrated in code then it works. I don't see why the
>>> position of take(5) should cause such an error or be caused by changing the
>>> master url
>>>
>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>> flights.take(5)
>>>
>>> flights
>>> .take(5)
>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>> flights.show(5)
>>>
>>>
>>> complete code if you wish to replicate it.
>>>
>>> import org.apache.spark.sql.SparkSession
>>>
>>> object sessiontest {
>>>
>>> // define specific data type class then manipulate it using the filter and map functions
>>> // this is also known as an Encoder
>>> case class flight (DEST_COUNTRY_NAME: String,
>>> ORIGIN_COUNTRY_NAME:String,
>>> count: BigInt)
>>>
>>>
>>> def main(args:Array[String]): Unit ={
>>>
>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>
>>> import spark.implicits._
>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>> val flights = flightDf.as[flight]
>>>
>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>> flights.take(5)
>>>
>>> flights
>>> .take(5)
>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>> flights.show(5)
>>>
>>> } // main
>>> }
>>>
>>>
>>>
>>>
>>>
>>> Backbutton.co.uk
>>> ¯\_(ツ)_/¯
>>> ♡۶Java♡۶RMI ♡۶
>>> Make Use Method {MUM}
>>> makeuse.org
>>> <http://www.backbutton.co.uk>
>>>
>>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
I have configured in IntelliJ as external jars
spark-3.0.0-preview2-bin-hadoop2.7/jar
not pulling anything from maven.
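As an alternative to registering external jars by hand, the same dependency could be pulled from Maven Central via sbt (a sketch; the Scala version and the version string matching the spark-3.0.0-preview2 binary distribution named above are assumptions):

```scala
// build.sbt sketch (assumption: scalaVersion := "2.12.10", matching the preview2 build)
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.0-preview2"
```

Either way, the jars on the driver's compile classpath are separate from what the executors can see, which is why the deployment question matters here.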
On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
> Which Spark/Scala version do you use?
>
> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
>>
>> with the following sparksession configuration
>>
>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>
>> this line works
>>
>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>>
>> however if change the master url like so, with the ip address then the
>> following error is produced by the position of .take(5)
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>>
>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
>> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>> instance of java.lang.invoke.SerializedLambda to field
>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>> of org.apache.spark.rdd.MapPartitionsRDD
>>
>> BUT if I remove take(5) or change the position of take(5) or insert an
>> extra take(5) as illustrated in code then it works. I don't see why the
>> position of take(5) should cause such an error or be caused by changing the
>> master url
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>>
>> complete code if you wish to replicate it.
>>
>> import org.apache.spark.sql.SparkSession
>>
>> object sessiontest {
>>
>> // define specific data type class then manipulate it using the filter and map functions
>> // this is also known as an Encoder
>> case class flight (DEST_COUNTRY_NAME: String,
>> ORIGIN_COUNTRY_NAME:String,
>> count: BigInt)
>>
>>
>> def main(args:Array[String]): Unit ={
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>> import spark.implicits._
>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>> val flights = flightDf.as[flight]
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>> } // main
>> }
>>
>>
>>
>>
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
I have configured in IntelliJ as external jars
spark-3.0.0-preview2-bin-hadoop2.7/jar
not pulling anything from maven.
Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>
On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
> Which Spark/Scala version do you use?
>
> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
>>
>> with the following sparksession configuration
>>
>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>
>> this line works
>>
>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>>
>> however if change the master url like so, with the ip address then the
>> following error is produced by the position of .take(5)
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>>
>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
>> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>> instance of java.lang.invoke.SerializedLambda to field
>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>> of org.apache.spark.rdd.MapPartitionsRDD
>>
>> BUT if I remove take(5) or change the position of take(5) or insert an
>> extra take(5) as illustrated in code then it works. I don't see why the
>> position of take(5) should cause such an error or be caused by changing the
>> master url
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>>
>> complete code if you wish to replicate it.
>>
>> import org.apache.spark.sql.SparkSession
>>
>> object sessiontest {
>>
>> // define specific data type class then manipulate it using the filter and map functions
>> // this is also known as an Encoder
>> case class flight (DEST_COUNTRY_NAME: String,
>> ORIGIN_COUNTRY_NAME:String,
>> count: BigInt)
>>
>>
>> def main(args:Array[String]): Unit ={
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>> import spark.implicits._
>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>> val flights = flightDf.as[flight]
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>> } // main
>> }
>>
>>
>>
>>
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>
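A note on the stack trace quoted above: a `java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f` is a common symptom of the application classes not being on the executors' classpath. With `master("local[*]")` everything runs in one JVM, so the lambdas always resolve; with `master("spark://192.168.0.38:7077")` the closures passed to `.filter`/`.map` are serialized and shipped to remote executors, which cannot deserialize them without the application jar. That would also explain why `flights.take(5)` on its own succeeds: it ships no user-defined lambda. A configuration sketch (the jar path below is a placeholder for illustration, not taken from the original report):

```scala
import org.apache.spark.sql.SparkSession

object sessiontest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Spark Session take")
      // Ship the packaged application jar to the executors so the
      // filter/map lambdas (and the flight case class) can be
      // deserialized there. Path is a placeholder.
      .config("spark.jars", "target/scala-2.12/sessiontest_2.12-0.1.jar")
      .getOrCreate()
    // ... rest of the job unchanged ...
    spark.stop()
  }
}
```

Submitting the packaged jar with spark-submit achieves the same thing, since spark-submit distributes the primary jar automatically.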
Re: BUG: take with SparkSession.master[url]
Posted by Wenchen Fan <cl...@gmail.com>.
Which Spark/Scala version do you use?
On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
> with the following SparkSession configuration
>
> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>
> this line works
>
> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
>
> however, if I change the master URL like so, to the cluster's IP address,
> then the following error is produced depending on the position of .take(5)
>
> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>
>
> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
> instance of java.lang.invoke.SerializedLambda to field
> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
> of org.apache.spark.rdd.MapPartitionsRDD
>
> BUT if I remove take(5), change the position of take(5), or insert an
> extra take(5) as illustrated in the code below, then it works. I don't see
> why the position of take(5) should cause such an error, or why the error
> should depend on the master URL.
>
> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
> flights.take(5)
>
> flights
> .take(5)
> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
> flights.show(5)
>
>
> complete code if you wish to replicate it.
>
> import org.apache.spark.sql.SparkSession
>
> object sessiontest {
>
> // define a specific data type (case class), then manipulate it using the
> // filter and map functions; Spark derives an Encoder for this class
> case class flight (DEST_COUNTRY_NAME: String,
> ORIGIN_COUNTRY_NAME:String,
> count: BigInt)
>
>
> def main(args:Array[String]): Unit ={
>
> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>
> import spark.implicits._
> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
> val flights = flightDf.as[flight]
>
> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
> flights.take(5)
>
> flights
> .take(5)
> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
> flights.show(5)
>
> } // main
> }
>
>
>
>
>
>
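For completeness, the same reproduction run through spark-submit would ship the application classes to the executors automatically. This is a sketch only; the jar path and object name mirror the code above but the packaging layout is assumed:

```shell
# Package the application (e.g. with sbt package), then submit it to the
# standalone master. Paths are placeholders.
spark-submit \
  --master spark://192.168.0.38:7077 \
  --class sessiontest \
  target/scala-2.12/sessiontest_2.12-0.1.jar
```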