Posted to user@spark.apache.org by Zahid Rahman <za...@gmail.com> on 2020/03/27 21:49:02 UTC

what a plava !

I was very impressed with the amount of material available from
https://github.com/databricks/Spark-The-Definitive-Guide/
-- over 450 *megabytes*.

I have corrected the Scala code by adding
*.sort(desc("sum(total_cost)"))* to the code provided on page 34 (see
below).

I have noticed numerous uses of exclamation marks, almost to the point of
overuse. For example:
page 23: Let's specify some more *transformations!*
page 24: you've read your first explain *plan!*
page 26: Notice that these plans compile to the exact same underlying *plan!*
page 29: The last step is our *action!*
page 34: The best thing about structured streaming ... rapidly ... with
*virtually no code*

1. I have never read a science book that stirred such frustration.
Is Spark, already difficult to understand, made more complicated by the
proliferation of languages: Scala, Java, Python, SQL and R?

2. Secondly, is the Spark architecture made more complex by competing
technologies?

I have a Spark cluster set up with a master and a slave (worker) to
load-balance heavy activity, like so:
sbin/start-master.sh
sbin/start-slave.sh spark://192.168.0.38:7077
For load balancing, I imagine, conceptually speaking (although I haven't
tried it), that I can have as many slaves (workers) on other physical
machines as I like, simply by downloading the Spark zip file and running
workers from those other machine(s) with:
sbin/start-slave.sh spark://192.168.0.38:7077
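
For what it's worth, the standalone scripts can also launch every worker in
one go; a sketch I haven't tried, assuming passwordless ssh from the master
and Spark unpacked at the same path on each machine (the worker IPs below
are illustrative):

# conf/slaves on the master machine: one worker host per line
192.168.0.39
192.168.0.40

# then start the master plus all listed workers, from the master machine
sbin/start-all.sh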

*My question is: under the circumstances, do I need to bother with Mesos
or YARN?*

Collins dictionary
The exclamation mark is used after exclamations and emphatic expressions.

   - I can’t believe it!
   - Oh, no! Look at this mess!

The exclamation mark loses its effect if it is overused. It is better to
use a full stop after a sentence expressing mild excitement or humour.

   It was such a beautiful day.
   I felt like a perfect banana.


import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, desc, window}

object RetailData {

  def main(args: Array[String]): Unit = {

    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Retail Data")
      .getOrCreate()

    // Create a static DataFrame from the retail CSV files,
    // reading the header row and inferring column types.
    val staticDataFrame = spark.read.format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("/data/retail-data/by-day/*.csv")

    // Register a temp view for SQL access and keep the inferred
    // schema for later reuse (e.g. by a streaming read).
    staticDataFrame.createOrReplaceTempView("retail_data")
    val staticSchema = staticDataFrame.schema

    // Daily total cost per customer, biggest spender first.
    staticDataFrame
      .selectExpr(
        "CustomerId", "UnitPrice * Quantity as total_cost", "InvoiceDate")
      .groupBy(col("CustomerId"), window(col("InvoiceDate"), "1 day"))
      .sum("total_cost")
      .sort(desc("sum(total_cost)"))
      .show(1)

  } // main

} // object
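
Incidentally, the book keeps that schema around because its next step turns
the same read into a stream. A minimal, untested sketch of that streaming
read, reusing the inferred staticSchema (schema inference is not available
on a stream by default; maxFilesPerTrigger just simulates a slow stream):

    val streamingDataFrame = spark.readStream
      .schema(staticSchema)            // reuse the schema from the static read
      .option("maxFilesPerTrigger", 1) // pull one file per micro-batch
      .format("csv")
      .option("header", "true")
      .load("/data/retail-data/by-day/*.csv")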



Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>

Re: what a plava !

Posted by Zahid Rahman <za...@gmail.com>.
That confirms that the three technologies compete for the same space, as
I suspected but wasn't sure. I can focus on the APIs and not waste any
unnecessary time even looking at Mesos and YARN.


Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>


On Sat, 28 Mar 2020 at 02:07, Sean Owen <sr...@gmail.com> wrote:

> Spark standalone is a resource manager like YARN and Mesos. It is
> specific to Spark, and is therefore simpler, as it assumes it can take
> over whole machines.
> YARN and Mesos are for mediating resource usage across applications on
> a cluster, which may be running more than Spark apps.
>
> On Fri, Mar 27, 2020 at 7:30 PM Zahid Rahman <za...@gmail.com> wrote:
> > [...]

Re: what a plava !

Posted by Sean Owen <sr...@gmail.com>.
Spark standalone is a resource manager like YARN and Mesos. It is
specific to Spark, and is therefore simpler, as it assumes it can take
over whole machines.
YARN and Mesos are for mediating resource usage across applications on
a cluster, which may be running more than Spark apps.
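Concretely, the application itself does not change across the three; only
the master URL passed at submit time does. A sketch (host/ports and jar
name illustrative):

bin/spark-submit --master spark://192.168.0.38:7077 --class RetailData retail.jar  # standalone
bin/spark-submit --master yarn --class RetailData retail.jar                       # YARN (HADOOP_CONF_DIR must point at the cluster config)
bin/spark-submit --master mesos://192.168.0.38:5050 --class RetailData retail.jar  # Mesos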

On Fri, Mar 27, 2020 at 7:30 PM Zahid Rahman <za...@gmail.com> wrote:
> [...]



Re: what a plava !

Posted by Zahid Rahman <za...@gmail.com>.
OK, thanks.

On the issue of load balancing / clustering:

I believe I can set up clustering like so:
sbin/start-master.sh
sbin/start-slave.sh spark://master:port

*on another machine*
sbin/start-slave.sh spark://master:port

Do YARN and Mesos do anything different from that?

Are the Spark clustering setup, YARN, and Mesos competing technologies
for the same space / functionality?

Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>


On Fri, 27 Mar 2020 at 22:38, Sean Owen <sr...@gmail.com> wrote:

> [...]

Re: what a plava !

Posted by Sean Owen <sr...@gmail.com>.
- dev@, which is more for project devs to communicate. Cross-posting
is discouraged too.

The book isn't from the Spark OSS project, so not really the place to
give feedback here.

I don't quite understand the context of your other questions, but I
would suggest elaborating on them in individual, clear emails to increase
the chance that someone will answer.

On Fri, Mar 27, 2020 at 4:49 PM Zahid Rahman <za...@gmail.com> wrote:
> [...]
