Posted to dev@spark.apache.org by Zahid Rahman <za...@gmail.com> on 2020/03/27 05:24:06 UTC
BUG: take with SparkSession.master[url]
With the following SparkSession configuration:

val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();

this line works:

flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
However, if I change the master URL to the standalone cluster's IP address, like so:

val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();

then the following error is produced, depending on the position of .take(5):
20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
instance of java.lang.invoke.SerializedLambda to field
org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
of org.apache.spark.rdd.MapPartitionsRDD
But if I remove take(5), change the position of take(5), or insert an extra take(5), as illustrated in the code below, then it works. I don't see why the position of take(5) should cause such an error, or why changing the master URL should trigger it:
flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)

flights.take(5)

flights
  .take(5)
  .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
  .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))

flights.show(5)
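One plausible reading of why the position matters (my interpretation, not stated in the thread): Dataset.take(5) is an action that returns a plain Array[flight] on the driver, so any .filter/.map chained after it are ordinary Scala collection methods and never leave the driver JVM; only when the lambdas are applied to the Dataset itself must they be serialized and loaded on the executors. A sketch of the distinction, assuming `flights: Dataset[flight]` as in the code below:

```scala
// take(5) collects results to the driver; what follows is local Scala.
val local: Array[flight] = flights.take(5)
val onDriver = local.filter(_.ORIGIN_COUNTRY_NAME != "Canada") // plain Array.filter, no serialization

// By contrast, filtering the Dataset first is a distributed transformation:
val distributed = flights.filter(_.ORIGIN_COUNTRY_NAME != "Canada")
distributed.take(5) // triggers a job; the lambda's enclosing class must be on the executor classpath
```

This would explain why local[*] works (driver and executors share one JVM and classpath) while the spark:// master fails only when a lambda has to travel to a remote executor.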
Complete code, if you wish to replicate it:
import org.apache.spark.sql.SparkSession

object sessiontest {

  // Define a specific data type class, then manipulate it using the
  // filter and map functions; this is also known as an Encoder.
  case class flight(DEST_COUNTRY_NAME: String,
                    ORIGIN_COUNTRY_NAME: String,
                    count: BigInt)

  def main(args: Array[String]): Unit = {

    val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();

    import spark.implicits._
    val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
    val flights = flightDf.as[flight]

    flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)

    flights.take(5)

    flights
      .take(5)
      .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
      .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))

    flights.show(5)
  } // main
}
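For context on the error itself (my interpretation, not from the thread): this SerializedLambda ClassCastException is commonly seen when an application is launched from an IDE against a standalone master without shipping the application jar, so the executors cannot load the class that defines the lambda. A hedged sketch of one common workaround, assuming the project has been packaged as a jar first; the jar path below is a hypothetical example:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: tell Spark to ship the packaged application jar to the cluster,
// so executors can deserialize lambdas defined in it.
// "target/scala-2.12/sessiontest.jar" is an assumed sbt-package output path.
val spark = SparkSession.builder()
  .master("spark://192.168.0.38:7077")
  .appName("Spark Session take")
  .config("spark.jars", "target/scala-2.12/sessiontest.jar")
  .getOrCreate()
```

Submitting the jar with spark-submit instead of running it directly from the IDE achieves the same effect.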
Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
~/spark-3.0.0-preview2-bin-hadoop2.7$ sbin/start-slave.sh spark://192.168.0.38:7077
~/spark-3.0.0-preview2-bin-hadoop2.7$ sbin/start-master.sh
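For reference, the standalone daemons are normally brought up master-first, so the worker has a master to register with (a sketch assuming the default distribution layout):

```shell
cd ~/spark-3.0.0-preview2-bin-hadoop2.7
sbin/start-master.sh                              # master listens on spark://<host>:7077
sbin/start-slave.sh spark://192.168.0.38:7077     # worker registers with the master
```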
On Fri, 27 Mar 2020 at 06:12, Zahid Rahman <za...@gmail.com> wrote:
> sbin/start-master.sh
> sbin/start-slave.sh spark://192.168.0.38:7077
>
> Backbutton.co.uk
> ¯\_(ツ)_/¯
> ♡۶Java♡۶RMI ♡۶
> Make Use Method {MUM}
> makeuse.org
> <http://www.backbutton.co.uk>
>
>
> On Fri, 27 Mar 2020 at 05:59, Wenchen Fan <cl...@gmail.com> wrote:
>
>> Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
>> just include Spark dependency in IntelliJ?
>>
>> On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com>
>> wrote:
>>
>>> I have configured in IntelliJ as external jars
>>> spark-3.0.0-preview2-bin-hadoop2.7/jar
>>>
>>> not pulling anything from maven.
>>>
>>> Backbutton.co.uk
>>> ¯\_(ツ)_/¯
>>> ♡۶Java♡۶RMI ♡۶
>>> Make Use Method {MUM}
>>> makeuse.org
>>> <http://www.backbutton.co.uk>
>>>
>>>
>>> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>>>
>>>> Which Spark/Scala version do you use?
>>>>
>>>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> with the following sparksession configuration
>>>>>
>>>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>>>
>>>>> this line works
>>>>>
>>>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>>
>>>>> however if change the master url like so, with the ip address then the
>>>>> following error is produced by the position of .take(5)
>>>>>
>>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>>
>>>>>
>>>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>>>> instance of java.lang.invoke.SerializedLambda to field
>>>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>>>
>>>>> BUT if I remove take(5) or change the position of take(5) or insert
>>>>> an extra take(5) as illustrated in code then it works. I don't see why the
>>>>> position of take(5) should cause such an error or be caused by changing the
>>>>> master url
>>>>>
>>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>> flights.take(5)
>>>>>
>>>>> flights
>>>>> .take(5)
>>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>>> flights.show(5)
>>>>>
>>>>>
>>>>> complete code if you wish to replicate it.
>>>>>
>>>>> import org.apache.spark.sql.SparkSession
>>>>>
>>>>> object sessiontest {
>>>>>
>>>>> // define specific data type class then manipulate it using the filter and map functions
>>>>> // this is also known as an Encoder
>>>>> case class flight (DEST_COUNTRY_NAME: String,
>>>>> ORIGIN_COUNTRY_NAME:String,
>>>>> count: BigInt)
>>>>>
>>>>>
>>>>> def main(args:Array[String]): Unit ={
>>>>>
>>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>>
>>>>> import spark.implicits._
>>>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>>>> val flights = flightDf.as[flight]
>>>>>
>>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>>
>>>>> flights.take(5)
>>>>>
>>>>> flights
>>>>> .take(5)
>>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>>> flights.show(5)
>>>>>
>>>>> } // main
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Backbutton.co.uk
>>>>> ¯\_(ツ)_/¯
>>>>> ♡۶Java♡۶RMI ♡۶
>>>>> Make Use Method {MUM}
>>>>> makeuse.org
>>>>> <http://www.backbutton.co.uk>
>>>>>
>>>>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
sbin/start-master.sh
sbin/start-slave.sh spark://192.168.0.38:7077
On Fri, 27 Mar 2020 at 05:59, Wenchen Fan <cl...@gmail.com> wrote:
> Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
> just include Spark dependency in IntelliJ?
>
> On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com> wrote:
>
>> I have configured in IntelliJ as external jars
>> spark-3.0.0-preview2-bin-hadoop2.7/jar
>>
>> not pulling anything from maven.
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>>
>> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>>
>>> Which Spark/Scala version do you use?
>>>
>>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> with the following sparksession configuration
>>>>
>>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>>
>>>> this line works
>>>>
>>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>>
>>>> however if change the master url like so, with the ip address then the
>>>> following error is produced by the position of .take(5)
>>>>
>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>
>>>>
>>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>>> instance of java.lang.invoke.SerializedLambda to field
>>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>>
>>>> BUT if I remove take(5) or change the position of take(5) or insert an
>>>> extra take(5) as illustrated in code then it works. I don't see why the
>>>> position of take(5) should cause such an error or be caused by changing the
>>>> master url
>>>>
>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>> flights.take(5)
>>>>
>>>> flights
>>>> .take(5)
>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>> flights.show(5)
>>>>
>>>>
>>>> complete code if you wish to replicate it.
>>>>
>>>> import org.apache.spark.sql.SparkSession
>>>>
>>>> object sessiontest {
>>>>
>>>> // define specific data type class then manipulate it using the filter and map functions
>>>> // this is also known as an Encoder
>>>> case class flight (DEST_COUNTRY_NAME: String,
>>>> ORIGIN_COUNTRY_NAME:String,
>>>> count: BigInt)
>>>>
>>>>
>>>> def main(args:Array[String]): Unit ={
>>>>
>>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>>
>>>> import spark.implicits._
>>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>>> val flights = flightDf.as[flight]
>>>>
>>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>>
>>>> flights.take(5)
>>>>
>>>> flights
>>>> .take(5)
>>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>>> flights.show(5)
>>>>
>>>> } // main
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Backbutton.co.uk
>>>> ¯\_(ツ)_/¯
>>>> ♡۶Java♡۶RMI ♡۶
>>>> Make Use Method {MUM}
>>>> makeuse.org
>>>> <http://www.backbutton.co.uk>
>>>>
>>>
Re: BUG: take with SparkSession.master[url]
Posted by Wenchen Fan <cl...@gmail.com>.
Your Spark cluster, spark://192.168.0.38:7077, how is it deployed if you
just include Spark dependency in IntelliJ?
On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <za...@gmail.com> wrote:
> I have configured in IntelliJ as external jars
> spark-3.0.0-preview2-bin-hadoop2.7/jar
>
> not pulling anything from maven.
>
> Backbutton.co.uk
> ¯\_(ツ)_/¯
> ♡۶Java♡۶RMI ♡۶
> Make Use Method {MUM}
> makeuse.org
> <http://www.backbutton.co.uk>
>
>
> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
>
>> Which Spark/Scala version do you use?
>>
>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com>
>> wrote:
>>
>>>
>>> with the following sparksession configuration
>>>
>>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>>
>>> this line works
>>>
>>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>>
>>> however if change the master url like so, with the ip address then the
>>> following error is produced by the position of .take(5)
>>>
>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>
>>>
>>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID
>>> 1, 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>>> instance of java.lang.invoke.SerializedLambda to field
>>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>>> of org.apache.spark.rdd.MapPartitionsRDD
>>>
>>> BUT if I remove take(5) or change the position of take(5) or insert an
>>> extra take(5) as illustrated in code then it works. I don't see why the
>>> position of take(5) should cause such an error or be caused by changing the
>>> master url
>>>
>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>> flights.take(5)
>>>
>>> flights
>>> .take(5)
>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>> flights.show(5)
>>>
>>>
>>> complete code if you wish to replicate it.
>>>
>>> import org.apache.spark.sql.SparkSession
>>>
>>> object sessiontest {
>>>
>>> // define specific data type class then manipulate it using the filter and map functions
>>> // this is also known as an Encoder
>>> case class flight (DEST_COUNTRY_NAME: String,
>>> ORIGIN_COUNTRY_NAME:String,
>>> count: BigInt)
>>>
>>>
>>> def main(args:Array[String]): Unit ={
>>>
>>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>>
>>> import spark.implicits._
>>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>> val flights = flightDf.as[flight]
>>>
>>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>>
>>> flights.take(5)
>>>
>>> flights
>>> .take(5)
>>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>>> flights.show(5)
>>>
>>> } // main
>>> }
>>>
>>>
>>>
>>>
>>>
>>> Backbutton.co.uk
>>> ¯\_(ツ)_/¯
>>> ♡۶Java♡۶RMI ♡۶
>>> Make Use Method {MUM}
>>> makeuse.org
>>> <http://www.backbutton.co.uk>
>>>
>>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
I have configured in IntelliJ as external jars
spark-3.0.0-preview2-bin-hadoop2.7/jar
not pulling anything from maven.
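As an alternative to registering external jars by hand, the same dependency could be pulled from Maven Central via sbt (a sketch; the Scala version and the version string matching the spark-3.0.0-preview2 binary distribution named above are assumptions):

```scala
// build.sbt sketch (assumption: scalaVersion := "2.12.10", matching the preview2 build)
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.0-preview2"
```

Either way, the jars on the driver's compile classpath are separate from what the executors can see, which is why the deployment question matters here.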
On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
> Which Spark/Scala version do you use?
>
> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
>>
>> with the following sparksession configuration
>>
>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>
>> this line works
>>
>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>>
>> however if change the master url like so, with the ip address then the
>> following error is produced by the position of .take(5)
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>>
>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
>> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>> instance of java.lang.invoke.SerializedLambda to field
>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>> of org.apache.spark.rdd.MapPartitionsRDD
>>
>> BUT if I remove take(5) or change the position of take(5) or insert an
>> extra take(5) as illustrated in code then it works. I don't see why the
>> position of take(5) should cause such an error or be caused by changing the
>> master url
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>>
>> complete code if you wish to replicate it.
>>
>> import org.apache.spark.sql.SparkSession
>>
>> object sessiontest {
>>
>> // define specific data type class then manipulate it using the filter and map functions
>> // this is also known as an Encoder
>> case class flight (DEST_COUNTRY_NAME: String,
>> ORIGIN_COUNTRY_NAME:String,
>> count: BigInt)
>>
>>
>> def main(args:Array[String]): Unit ={
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>> import spark.implicits._
>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>> val flights = flightDf.as[flight]
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>> } // main
>> }
>>
>>
>>
>>
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>
Re: BUG: take with SparkSession.master[url]
Posted by Zahid Rahman <za...@gmail.com>.
I have configured in IntelliJ as external jars
spark-3.0.0-preview2-bin-hadoop2.7/jar
not pulling anything from maven.
Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>
On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cl...@gmail.com> wrote:
> Which Spark/Scala version do you use?
>
> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
>>
>> with the following sparksession configuration
>>
>> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>>
>> this line works
>>
>> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>>
>> however if change the master url like so, with the ip address then the
>> following error is produced by the position of .take(5)
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>>
>> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
>> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
>> instance of java.lang.invoke.SerializedLambda to field
>> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
>> of org.apache.spark.rdd.MapPartitionsRDD
>>
>> BUT if I remove take(5) or change the position of take(5) or insert an
>> extra take(5) as illustrated in code then it works. I don't see why the
>> position of take(5) should cause such an error or be caused by changing the
>> master url
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>>
>> complete code if you wish to replicate it.
>>
>> import org.apache.spark.sql.SparkSession
>>
>> object sessiontest {
>>
>> // define specific data type class then manipulate it using the filter and map functions
>> // this is also known as an Encoder
>> case class flight (DEST_COUNTRY_NAME: String,
>> ORIGIN_COUNTRY_NAME:String,
>> count: BigInt)
>>
>>
>> def main(args:Array[String]): Unit ={
>>
>> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>>
>> import spark.implicits._
>> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>> val flights = flightDf.as[flight]
>>
>> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>>
>> flights.take(5)
>>
>> flights
>> .take(5)
>> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
>> flights.show(5)
>>
>> } // main
>> }
>>
>>
>>
>>
>>
>> Backbutton.co.uk
>> ¯\_(ツ)_/¯
>> ♡۶Java♡۶RMI ♡۶
>> Make Use Method {MUM}
>> makeuse.org
>> <http://www.backbutton.co.uk>
>>
>
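A note on the stack trace quoted above: a `java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.rdd.MapPartitionsRDD.f` is a common symptom of the application classes not being on the executors' classpath. With `master("local[*]")` everything runs in one JVM, so the lambdas always resolve; with `master("spark://192.168.0.38:7077")` the closures passed to `.filter`/`.map` are serialized and shipped to remote executors, which cannot deserialize them without the application jar. That would also explain why `flights.take(5)` on its own succeeds: it ships no user-defined lambda. A configuration sketch (the jar path below is a placeholder for illustration, not taken from the original report):

```scala
import org.apache.spark.sql.SparkSession

object sessiontest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Spark Session take")
      // Ship the packaged application jar to the executors so the
      // filter/map lambdas (and the flight case class) can be
      // deserialized there. Path is a placeholder.
      .config("spark.jars", "target/scala-2.12/sessiontest_2.12-0.1.jar")
      .getOrCreate()
    // ... rest of the job unchanged ...
    spark.stop()
  }
}
```

Submitting the packaged jar with spark-submit achieves the same thing, since spark-submit distributes the primary jar automatically.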
Re: BUG: take with SparkSession.master[url]
Posted by Wenchen Fan <cl...@gmail.com>.
Which Spark/Scala version do you use?
On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <za...@gmail.com> wrote:
>
> with the following SparkSession configuration
>
> val spark = SparkSession.builder().master("local[*]").appName("Spark Session take").getOrCreate();
>
> this line works
>
> flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
>
> however, if I change the master URL like so, to the cluster's IP address,
> then the following error is produced depending on the position of .take(5)
>
> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>
>
> 20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1,
> 192.168.0.38, executor 0): java.lang.ClassCastException: cannot assign
> instance of java.lang.invoke.SerializedLambda to field
> org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in instance
> of org.apache.spark.rdd.MapPartitionsRDD
>
> BUT if I remove take(5), change the position of take(5), or insert an
> extra take(5) as illustrated in the code below, then it works. I don't see
> why the position of take(5) should cause such an error, or why the error
> should depend on the master URL.
>
> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
> flights.take(5)
>
> flights
> .take(5)
> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
> flights.show(5)
>
>
> complete code if you wish to replicate it.
>
> import org.apache.spark.sql.SparkSession
>
> object sessiontest {
>
> // define a specific data type (case class), then manipulate it using the
> // filter and map functions; Spark derives an Encoder for this class
> case class flight (DEST_COUNTRY_NAME: String,
> ORIGIN_COUNTRY_NAME:String,
> count: BigInt)
>
>
> def main(args:Array[String]): Unit ={
>
> val spark = SparkSession.builder().master("spark://192.168.0.38:7077").appName("Spark Session take").getOrCreate();
>
> import spark.implicits._
> val flightDf = spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
> val flights = flightDf.as[flight]
>
> flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada").map(flight_row => flight_row).take(5)
>
> flights.take(5)
>
> flights
> .take(5)
> .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
> .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME,fr.count + 5))
> flights.show(5)
>
> } // main
> }
>
>
>
>
>
>
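For completeness, the same reproduction run through spark-submit would ship the application classes to the executors automatically. This is a sketch only; the jar path and object name mirror the code above but the packaging layout is assumed:

```shell
# Package the application (e.g. with sbt package), then submit it to the
# standalone master. Paths are placeholders.
spark-submit \
  --master spark://192.168.0.38:7077 \
  --class sessiontest \
  target/scala-2.12/sessiontest_2.12-0.1.jar
```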