Posted to user@spark.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2019/03/23 21:12:53 UTC

Where does the Driver run?

I have researched this for a significant amount of time and find answers
that seem to be for a slightly different question than mine.

The Spark 2.3.3 cluster is running fine. I see the GUI on “
http://master-address:8080", there are 2 idle workers, as configured.

I have a Scala application that creates a context and starts execution of a
Job. I *do not use spark-submit*; I start the Job programmatically, and this
is where many explanations fork from my question.

In "my-app" I create a new SparkConf, with the following code (slightly
abbreviated):

      conf.setAppName("my-job")
      conf.setMaster("spark://master-address:7077")
      conf.set("deployMode", "cluster")
      // other settings like driver and executor memory requests
      // the driver and executor memory requests are for all mem on the
      // slaves, more than mem available on the launching machine with "my-app"
      val jars = listJars("/path/to/lib")
      conf.setJars(jars)
      …
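
A minimal sketch of how such a conf is presumably turned into a context (the
exact wiring inside "my-app" is not shown here):

      // Sketch only: wherever this constructor runs is where the driver runs,
      // since creating the SparkContext starts the driver in the current JVM.
      import org.apache.spark.SparkContext
      val sc = new SparkContext(conf)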

When I launch the job I see 2 executors running on the 2 workers/slaves.
Everything seems to run fine and sometimes completes successfully. Frequent
failures are the reason for this question.

Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
taking all cluster resources. With a Yarn cluster I would expect the
“Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
Master, so where is the Driver part of the Job running?

If it is running in the Master, we are in trouble, because I start the
Master on one of my 2 Workers, sharing resources with one of the Executors.
Executor mem + driver mem is > available mem on a Worker. I can change this
but need to understand where the Driver part of the Spark Job runs. Is it
in the Spark Master, or inside an Executor, or ???

The “Driver” creates and broadcasts some large data structures, so the need
for an answer is more critical than with more typical tiny Drivers.
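
To make the memory concern concrete, a minimal sketch (the model built here is
only a stand-in for the real data structures):

      // Hypothetical illustration: the broadcast value must first fit in the
      // driver JVM's heap, wherever that driver JVM actually runs.
      def loadLargeModel(): Map[String, Array[Double]] =   // stand-in builder
        (1 to 1000000).map(i => i.toString -> Array.fill(10)(1.0)).toMap
      val modelBc = sc.broadcast(loadLargeModel())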

Thanks for your help!

Re: Where does the Driver run?

Posted by ayan guha <gu...@gmail.com>.
Have you tried Apache Livy?

On Fri, 29 Mar 2019 at 9:32 pm, Jianneng Li <ji...@workday.com> wrote:

> Hi Pat,
>
> Now that I understand your terminology better, the method I described was
> actually closer to spark-submit than what you referred to as
> "programmatically". You want to have SparkContext running in the launcher
> program, and also the driver somehow running on the cluster, and
> unfortunately I don't think you can do that.
>
> So yes, it does look like you need to refactor. If you need to actively
> use SparkContext to submit more jobs after the Spark application has
> started, you can write a custom Spark driver that, for example, runs an HTTP
> server that can receive requests and call SparkContext accordingly.
>
> Best,
>
> Jianneng
>
> ------------------------------
> *From:* Pat Ferrel <pa...@occamsmachete.com>
> *Sent:* Thursday, March 28, 2019 10:10 AM
> *To:* Jianneng Li
> *Cc:* user@spark.apache.org; akhld@hacked.work; andrew.melo@gmail.com;
> andrey@actionml.com
>
> *Subject:* Re: Where does the Driver run?
>
> Thanks for the pointers. We’ll investigate.
>
> We have been told that the “Driver” is run in the launching JVM because
> deployMode = cluster is ignored if spark-submit is not used to launch.
>
> You are saying that there is a loophole and if you use one of these client
> classes there is a way to run part of the app on the cluster, and you have
> seen this for Yarn?
>
> To explain more, we create a SparkConf, and then a SparkContext, which we
> pass around implicitly to functions that I would define as the Spark
> Driver. It seems that if you do not use spark-submit, the entire launching
> app/JVM process is considered the Driver AND is always run in client mode.
>
> I hope your loophole pays off or we will have to do a major refactoring.
>
>
> From: Jianneng Li <ji...@workday.com> <ji...@workday.com>
> Reply: Jianneng Li <ji...@workday.com> <ji...@workday.com>
> Date: March 28, 2019 at 2:03:47 AM
> To: pat@occamsmachete.com <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: andrew.melo@gmail.com <an...@gmail.com> <an...@gmail.com>,
> user@spark.apache.org <us...@spark.apache.org> <us...@spark.apache.org>,
> akhld@hacked.work <ak...@hacked.work> <ak...@hacked.work>
> Subject:  Re: Where does the Driver run?
>
> Hi Pat,
>
> The driver runs in the same JVM as SparkContext. You didn't go into detail
> about how you "launch" the job (i.e. how the SparkContext is created), so
> it's hard for me to guess where the driver is.
>
> For reference, we've had success launching Spark programmatically to YARN
> in cluster mode by creating a SparkConf like you did and using it to call
> this class:
> https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
>
> I haven't tried this myself, but for standalone mode you might be able to
> use this:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala
>
> Lastly, you can always check where Spark processes run by executing ps on
> the machine, i.e. `ps aux | grep java`.
>
> Best,
>
> Jianneng
>
>
>
> *From:* Pat Ferrel <pa...@occamsmachete.com>
> *Date:* Monday, March 25, 2019 at 12:58 PM
> *To:* Andrew Melo <an...@gmail.com>
> *Cc:* user <us...@spark.apache.org>, Akhil Das <ak...@hacked.work>
> *Subject:* Re: Where does the Driver run?
>
>
>
> I’m beginning to agree with you and find it rather surprising that this is
> mentioned nowhere explicitly (maybe I missed?). It is possible to serialize
> code to be executed in executors to various nodes. It also seems possible
> to serialize the “driver” bits of code although I’m not sure how the
> boundary would be defined. All code is in the jars we pass to Spark so
> until now I did not question the docs.
>
>
>
> I see no mention of a distinction between running a driver in spark-submit
> vs being programmatically launched for any of the Spark Master types:
> Standalone, Yarn, Mesos, k8s.
>
>
>
> We are building a Machine Learning Server in OSS. It has pluggable Engines
> for different algorithms. Some of these use Spark so it is highly desirable
> to offload driver code to the cluster since we don’t want the driver
> embedded in the Server process. The Driver portion of our training workflow
> could be very large indeed and so could force the scaling of the server to
> worst case.
>
>
>
> I hope someone knows how to run “Driver” code on the cluster when our
> server is launching the code. So deployMode = cluster, deploy method =
> programmatic launch.
>
>
>
>
> From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
> Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
> Date: March 25, 2019 at 11:40:07 AM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
> <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
>
>
> Hi Pat,
>
>
>
> Indeed, I don't think that it's possible to use cluster mode w/o
> spark-submit. All the docs I see appear to always describe needing to use
> spark-submit for cluster mode -- it's not even compatible with spark-shell.
> But it makes sense to me -- if you want Spark to run your application's
> driver, you need to package it up and send it to the cluster manager. You
> can't start spark one place and then later migrate it to the cluster. It's
> also why you can't use spark-shell in cluster mode either, I think.
>
>
>
> Cheers
>
> Andrew
>
>
>
> On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> In the GUI while the job is running the app-id link brings up logs to both
> executors, The “name” link goes to 4040 of the machine that launched the
> job but is not resolvable right now so the page is not shown. I’ll try the
> netstat but the use of port 4040 was a good clue.
>
>
>
> By what you say below this indicates the Driver is running on the
> launching machine, the client to the Spark Cluster. This should be the case
> in deployMode = client.
>
>
>
> Can someone explain what is going on? The evidence seems to say that
> deployMode = cluster *does not work* as described unless you use
> spark-submit (and I’m only guessing at that).
>
>
>
> Further; if we don’t use spark-submit we can’t use deployMode = cluster ???
>
>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 24, 2019 at 7:45:07 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
>
>
> There's also a driver ui (usually available on port 4040), after running
> your code, I assume you are running it on your machine, visit
> localhost:4040 and you will get the driver UI.
>
>
>
> If you think the driver is running on your master/executor nodes, login to
> those machines and do a
>
>
>
>    netstat -napt | grep -i listen
>
>
>
> You will see the driver listening on 404x there, this won't be the case
> mostly as you are not doing Spark-submit or using the deployMode=cluster.
>
>
>
> On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:
>
> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
>
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
>
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>
>
>
>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
>
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
>
>
> [image: image.png]
>
>
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> I have researched this for a significant amount of time and find answers
> that seem to be for a slightly different question than mine.
>
>
>
> The Spark 2.3.3 cluster is running fine. I see the GUI on “
> http://master-address:8080", there are 2 idle workers, as configured.
>
>
>
> I have a Scala application that creates a context and starts execution of
> a Job. I *do not use spark-submit*, I start the Job programmatically and
> this is where many explanations fork from my question.
>
>
>
> In "my-app" I create a new SparkConf, with the following code (slightly
> abbreviated):
>
>
>
>       conf.setAppName(“my-job")
>
>       conf.setMaster(“spark://master-address:7077”)
>
>       conf.set(“deployMode”, “cluster”)
>
>       // other settings like driver and executor memory requests
>
>       // the driver and executor memory requests are for all mem on the
> slaves, more than
>
>       // mem available on the launching machine with “my-app"
>
>       val jars = listJars(“/path/to/lib")
>
>       conf.setJars(jars)
>
>       …
>
>
>
> When I launch the job I see 2 executors running on the 2 workers/slaves.
> Everything seems to run fine and sometimes completes successfully. Frequent
> failures are the reason for this question.
>
>
>
> Where is the Driver running? I don’t see it in the GUI, I see 2 Executors
> taking all cluster resources. With a Yarn cluster I would expect the
> “Driver" to run on/in the Yarn Master but I am using the Spark Standalone
> Master, where is the Driver part of the Job running?
>
>
>
> If it is running in the Master, we are in trouble because I start the
> Master on one of my 2 Workers sharing resources with one of the Executors.
> Executor mem + driver mem is > available mem on a Worker. I can change this
> but need to understand where the Driver part of the Spark Job runs. Is it
> in the Spark Master, or inside an Executor, or ???
>
>
>
> The “Driver” creates and broadcasts some large data structures so the need
> for an answer is more critical than with more typical tiny Drivers.
>
>
>
> Thanks for your help!
>
>
>
>
> --
>
> Cheers!
>
>
>
> --
Best Regards,
Ayan Guha

Re: Where does the Driver run?

Posted by Jianneng Li <ji...@workday.com>.
Hi Pat,

Now that I understand your terminology better, the method I described was actually closer to spark-submit than what you referred to as "programmatically". You want to have SparkContext running in the launcher program, and also the driver somehow running on the cluster, and unfortunately I don't think you can do that.

So yes, it does look like you need to refactor. If you need to actively use SparkContext to submit more jobs after the Spark application has started, you can write a custom Spark driver that, for example, runs an HTTP server that can receive requests and call SparkContext accordingly.
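
A rough sketch of that shape, using the JDK's built-in HttpServer; the port, the endpoint path, and the placeholder job are invented for illustration:

    import java.net.InetSocketAddress
    import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}
    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: this main class would itself be submitted to the cluster
    // (for example via spark-submit with deploy mode "cluster"), so the
    // SparkContext, and with it the driver, lives out on a Worker while the
    // launching server just talks to it over HTTP.
    object HttpDriver {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("http-driver"))
        val server = HttpServer.create(new InetSocketAddress(8090), 0)
        server.createContext("/train", new HttpHandler {
          override def handle(exchange: HttpExchange): Unit = {
            // Placeholder job: replace with the real training workflow.
            val n = sc.parallelize(1 to 1000000).count()
            val body = s"counted $n\n".getBytes("UTF-8")
            exchange.sendResponseHeaders(200, body.length)
            exchange.getResponseBody.write(body)
            exchange.close()
          }
        })
        server.start()
      }
    }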

Best,

Jianneng

________________________________
From: Pat Ferrel <pa...@occamsmachete.com>
Sent: Thursday, March 28, 2019 10:10 AM
To: Jianneng Li
Cc: user@spark.apache.org; akhld@hacked.work; andrew.melo@gmail.com; andrey@actionml.com
Subject: Re: Where does the Driver run?

Thanks for the pointers. We’ll investigate.

We have been told that the “Driver” is run in the launching JVM because deployMode = cluster is ignored if spark-submit is not used to launch.

You are saying that there is a loophole and if you use one of these client classes there is a way to run part of the app on the cluster, and you have seen this for Yarn?

To explain more, we create a SparkConf, and then a SparkContext, which we pass around implicitly to functions that I would define as the Spark Driver. It seems that if you do not use spark-submit, the entire launching app/JVM process is considered the Driver AND is always run in client mode.

I hope your loophole pays off or we will have to do a major refactoring.


From: Jianneng Li <ji...@workday.com>
Reply: Jianneng Li <ji...@workday.com>
Date: March 28, 2019 at 2:03:47 AM
To: pat@occamsmachete.com<ma...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: andrew.melo@gmail.com<ma...@gmail.com> <an...@gmail.com>, user@spark.apache.org<ma...@spark.apache.org> <us...@spark.apache.org>, akhld@hacked.work<ma...@hacked.work> <ak...@hacked.work>
Subject:  Re: Where does the Driver run?

Hi Pat,

The driver runs in the same JVM as SparkContext. You didn't go into detail about how you "launch" the job (i.e. how the SparkContext is created), so it's hard for me to guess where the driver is.

For reference, we've had success launching Spark programmatically to YARN in cluster mode by creating a SparkConf like you did and using it to call this class: https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

I haven't tried this myself, but for standalone mode you might be able to use this: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala

Lastly, you can always check where Spark processes run by executing ps on the machine, i.e. `ps aux | grep java`.

Best,

Jianneng




From: Pat Ferrel <pa...@occamsmachete.com>>
Date: Monday, March 25, 2019 at 12:58 PM
To: Andrew Melo <an...@gmail.com>>
Cc: user <us...@spark.apache.org>>, Akhil Das <ak...@hacked.work>>
Subject: Re: Where does the Driver run?



I’m beginning to agree with you and find it rather surprising that this is mentioned nowhere explicitly (maybe I missed?). It is possible to serialize code to be executed in executors to various nodes. It also seems possible to serialize the “driver” bits of code although I’m not sure how the boundary would be defined. All code is in the jars we pass to Spark so until now I did not question the docs.



I see no mention of a distinction between running a driver in spark-submit vs being programmatically launched for any of the Spark Master types: Standalone, Yarn, Mesos, k8s.



We are building a Machine Learning Server in OSS. It has pluggable Engines for different algorithms. Some of these use Spark so it is highly desirable to offload driver code to the cluster since we don’t want the driver embedded in the Server process. The Driver portion of our training workflow could be very large indeed and so could force the scaling of the server to worst case.



I hope someone knows how to run “Driver” code on the cluster when our server is launching the code. So deployMode = cluster, deploy method = programmatic launch.



From: Andrew Melo <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com>
Date: March 25, 2019 at 11:40:07 AM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work>, user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



Hi Pat,



Indeed, I don't think that it's possible to use cluster mode w/o spark-submit. All the docs I see appear to always describe needing to use spark-submit for cluster mode -- it's not even compatible with spark-shell. But it makes sense to me -- if you want Spark to run your application's driver, you need to package it up and send it to the cluster manager. You can't start spark one place and then later migrate it to the cluster. It's also why you can't use spark-shell in cluster mode either, I think.



Cheers

Andrew



On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com>> wrote:

In the GUI while the job is running the app-id link brings up logs to both executors, The “name” link goes to 4040 of the machine that launched the job but is not resolvable right now so the page is not shown. I’ll try the netstat but the use of port 4040 was a good clue.



By what you say below this indicates the Driver is running on the launching machine, the client to the Spark Cluster. This should be the case in deployMode = client.



Can someone explain what is going on? The evidence seems to say that deployMode = cluster does not work as described unless you use spark-submit (and I’m only guessing at that).



Further; if we don’t use spark-submit we can’t use deployMode = cluster ???



From: Akhil Das <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work>
Date: March 24, 2019 at 7:45:07 PM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



There's also a driver ui (usually available on port 4040), after running your code, I assume you are running it on your machine, visit localhost:4040 and you will get the driver UI.



If you think the driver is running on your master/executor nodes, login to those machines and do a



   netstat -napt | grep -i listen



You will see the driver listening on 404x there, this won't be the case mostly as you are not doing Spark-submit or using the deployMode=cluster.



On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com>> wrote:

Thanks, I have seen this many times in my research. Paraphrasing docs: “in deployMode ‘cluster' the Driver runs on a Worker in the cluster”



When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1 with addresses that match slaves). When I look at memory usage while the job runs I see virtually identical usage on the 2 Workers. This would support your claim and contradict Spark docs for deployMode = cluster.



The evidence seems to contradict the docs. I am now beginning to wonder if the Driver only runs in the cluster if we use spark-submit????





From: Akhil Das <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work>
Date: March 23, 2019 at 9:26:50 PM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



If you are starting your "my-app" on your local machine, that's where the driver is running.



[image: image.png]



Hope this helps. <https://spark.apache.org/docs/latest/cluster-overview.html>



On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com>> wrote:

I have researched this for a significant amount of time and find answers that seem to be for a slightly different question than mine.



The Spark 2.3.3 cluster is running fine. I see the GUI on “http://master-address:8080", there are 2 idle workers, as configured.



I have a Scala application that creates a context and starts execution of a Job. I *do not use spark-submit*, I start the Job programmatically and this is where many explanations fork from my question.



In "my-app" I create a new SparkConf, with the following code (slightly abbreviated):



      conf.setAppName(“my-job")

      conf.setMaster(“spark://master-address:7077”)

      conf.set(“deployMode”, “cluster”)

      // other settings like driver and executor memory requests

      // the driver and executor memory requests are for all mem on the slaves, more than

      // mem available on the launching machine with “my-app"

      val jars = listJars(“/path/to/lib")

      conf.setJars(jars)

      …



When I launch the job I see 2 executors running on the 2 workers/slaves. Everything seems to run fine and sometimes completes successfully. Frequent failures are the reason for this question.



Where is the Driver running? I don’t see it in the GUI, I see 2 Executors taking all cluster resources. With a Yarn cluster I would expect the “Driver" to run on/in the Yarn Master but I am using the Spark Standalone Master, where is the Driver part of the Job running?



If it is running in the Master, we are in trouble because I start the Master on one of my 2 Workers sharing resources with one of the Executors. Executor mem + driver mem is > available mem on a Worker. I can change this but need to understand where the Driver part of the Spark Job runs. Is it in the Spark Master, or inside an Executor, or ???



The “Driver” creates and broadcasts some large data structures so the need for an answer is more critical than with more typical tiny Drivers.



Thanks for your help!




--

Cheers!



Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Thanks for the pointers. We’ll investigate.

We have been told that the “Driver” is run in the launching JVM because
deployMode = cluster is ignored if spark-submit is not used to launch.

You are saying that there is a loophole and if you use one of these client
classes there is a way to run part of the app on the cluster, and you have
seen this for Yarn?

To explain more, we create a SparkConf, and then a SparkContext, which we
pass around implicitly to functions that I would define as the Spark
Driver. It seems that if you do not use spark-submit, the entire launching
app/JVM process is considered the Driver AND is always run in client mode.
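
For instance, a minimal sketch of that pattern (the function and its arguments are made up):

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    // Sketch of "passing the context around implicitly": any function that
    // takes an implicit SparkContext is driver code, and it executes in
    // whichever JVM created that context, i.e. the launching process here.
    def trainModel(events: Seq[String])(implicit sc: SparkContext): Long = {
      val rdd: RDD[String] = sc.parallelize(events)
      rdd.distinct().count()
    }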

I hope your loophole pays off or we will have to do a major refactoring.


From: Jianneng Li <ji...@workday.com> <ji...@workday.com>
Reply: Jianneng Li <ji...@workday.com> <ji...@workday.com>
Date: March 28, 2019 at 2:03:47 AM
To: pat@occamsmachete.com <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: andrew.melo@gmail.com <an...@gmail.com> <an...@gmail.com>,
user@spark.apache.org <us...@spark.apache.org> <us...@spark.apache.org>,
akhld@hacked.work <ak...@hacked.work> <ak...@hacked.work>
Subject:  Re: Where does the Driver run?

Hi Pat,

The driver runs in the same JVM as SparkContext. You didn't go into detail
about how you "launch" the job (i.e. how the SparkContext is created), so
it's hard for me to guess where the driver is.

For reference, we've had success launching Spark programmatically to YARN
in cluster mode by creating a SparkConf like you did and using it to call
this class:
https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

I haven't tried this myself, but for standalone mode you might be able to
use this:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala

Lastly, you can always check where Spark processes run by executing ps on
the machine, i.e. `ps aux | grep java`.

Best,

Jianneng



*From:* Pat Ferrel <pa...@occamsmachete.com>
*Date:* Monday, March 25, 2019 at 12:58 PM
*To:* Andrew Melo <an...@gmail.com>
*Cc:* user <us...@spark.apache.org>, Akhil Das <ak...@hacked.work>
*Subject:* Re: Where does the Driver run?



I’m beginning to agree with you and find it rather surprising that this is
mentioned nowhere explicitly (maybe I missed?). It is possible to serialize
code to be executed in executors to various nodes. It also seems possible
to serialize the “driver” bits of code although I’m not sure how the
boundary would be defined. All code is in the jars we pass to Spark so
until now I did not question the docs.



I see no mention of a distinction between running a driver in spark-submit
vs being programmatically launched for any of the Spark Master types:
Standalone, Yarn, Mesos, k8s.



We are building a Machine Learning Server in OSS. It has pluggable Engines
for different algorithms. Some of these use Spark so it is highly desirable
to offload driver code to the cluster since we don’t want the driver
embedded in the Server process. The Driver portion of our training workflow
could be very large indeed and so could force the scaling of the server to
worst case.



I hope someone knows how to run “Driver” code on the cluster when our
server is launching the code. So deployMode = cluster, deploy method =
programmatic launch.




From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Date: March 25, 2019 at 11:40:07 AM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
<us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



Hi Pat,



Indeed, I don't think that it's possible to use cluster mode w/o
spark-submit. All the docs I see appear to always describe needing to use
spark-submit for cluster mode -- it's not even compatible with spark-shell.
But it makes sense to me -- if you want Spark to run your application's
driver, you need to package it up and send it to the cluster manager. You
can't start spark one place and then later migrate it to the cluster. It's
also why you can't use spark-shell in cluster mode either, I think.



Cheers

Andrew



On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

In the GUI while the job is running the app-id link brings up logs to both
executors, The “name” link goes to 4040 of the machine that launched the
job but is not resolvable right now so the page is not shown. I’ll try the
netstat but the use of port 4040 was a good clue.



By what you say below this indicates the Driver is running on the launching
machine, the client to the Spark Cluster. This should be the case in
deployMode = client.



Can someone explain what is going on? The evidence seems to say that
deployMode = cluster *does not work* as described unless you use
spark-submit (and I’m only guessing at that).



Further; if we don’t use spark-submit we can’t use deployMode = cluster ???




From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Date: March 24, 2019 at 7:45:07 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



There's also a driver ui (usually available on port 4040), after running
your code, I assume you are running it on your machine, visit
localhost:4040 and you will get the driver UI.



If you think the driver is running on your master/executor nodes, login to
those machines and do a



   netstat -napt | grep -i listen



You will see the driver listening on 404x there, this won't be the case
mostly as you are not doing Spark-submit or using the deployMode=cluster.



On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:

Thanks, I have seen this many times in my research. Paraphrasing docs: “in
deployMode ‘cluster' the Driver runs on a Worker in the cluster”



When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
with addresses that match slaves). When I look at memory usage while the
job runs I see virtually identical usage on the 2 Workers. This would
support your claim and contradict Spark docs for deployMode = cluster.



The evidence seems to contradict the docs. I am now beginning to wonder if
the Driver only runs in the cluster if we use spark-submit????






From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Date: March 23, 2019 at 9:26:50 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



If you are starting your "my-app" on your local machine, that's where the
driver is running.



[image: image.png]



Hope this helps.
<https://spark.apache.org/docs/latest/cluster-overview.html>



On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

I have researched this for a significant amount of time and find answers
that seem to be for a slightly different question than mine.



The Spark 2.3.3 cluster is running fine. I see the GUI on “
http://master-address:8080", there are 2 idle workers, as configured.



I have a Scala application that creates a context and starts execution of a
Job. I *do not use spark-submit*, I start the Job programmatically and this
is where many explanations fork from my question.



In "my-app" I create a new SparkConf, with the following code (slightly
abbreviated):



      conf.setAppName(“my-job")

      conf.setMaster(“spark://master-address:7077”)

      conf.set(“deployMode”, “cluster”)

      // other settings like driver and executor memory requests

      // the driver and executor memory requests are for all mem on the
slaves, more than

      // mem available on the launching machine with “my-app"

      val jars = listJars(“/path/to/lib")

      conf.setJars(jars)

      …



When I launch the job I see 2 executors running on the 2 workers/slaves.
Everything seems to run fine and sometimes completes successfully. Frequent
failures are the reason for this question.



Where is the Driver running? I don’t see it in the GUI, I see 2 Executors
taking all cluster resources. With a Yarn cluster I would expect the
“Driver" to run on/in the Yarn Master but I am using the Spark Standalone
Master, where is the Driver part of the Job running?



If it is running in the Master, we are in trouble because I start the
Master on one of my 2 Workers sharing resources with one of the Executors.
Executor mem + driver mem is > available mem on a Worker. I can change this
but need to understand where the Driver part of the Spark Job runs. Is it
in the Spark Master, or inside an Executor, or ???



The “Driver” creates and broadcasts some large data structures so the need
for an answer is more critical than with more typical tiny Drivers.



Thanks for your help!




--

Cheers!

Re: Where does the Driver run?

Posted by Jianneng Li <ji...@workday.com>.
Hi Pat,

The driver runs in the same JVM as SparkContext. You didn't go into detail about how you "launch" the job (i.e. how the SparkContext is created), so it's hard for me to guess where the driver is.

For reference, we've had success launching Spark programmatically to YARN in cluster mode by creating a SparkConf like you did and using it to call this class: https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

I haven't tried this myself, but for standalone mode you might be able to use this: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala
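
Another route worth noting here, distinct from calling those Client classes directly, is Spark's public launcher API, which forks a real spark-submit from the launching process; a hedged sketch, with the paths and main class invented:

    import org.apache.spark.launcher.SparkLauncher

    // Sketch: SparkLauncher runs spark-submit under the hood, so with deploy
    // mode "cluster" the driver should be started out on the cluster rather
    // than inside this JVM. Paths and the main class are placeholders.
    val handle = new SparkLauncher()
      .setSparkHome("/opt/spark")
      .setMaster("spark://master-address:7077")
      .setDeployMode("cluster")
      .setAppResource("/path/to/lib/my-app.jar")
      .setMainClass("com.example.MyJob")
      .setConf("spark.driver.memory", "4g")
      .startApplication()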

Lastly, you can always check where Spark processes run by executing ps on the machine, i.e. `ps aux | grep java`.

Best,

Jianneng




From: Pat Ferrel <pa...@occamsmachete.com>
Date: Monday, March 25, 2019 at 12:58 PM
To: Andrew Melo <an...@gmail.com>
Cc: user <us...@spark.apache.org>, Akhil Das <ak...@hacked.work>
Subject: Re: Where does the Driver run?



I’m beginning to agree with you and find it rather surprising that this is mentioned nowhere explicitly (maybe I missed?). It is possible to serialize code to be executed in executors to various nodes. It also seems possible to serialize the “driver” bits of code although I’m not sure how the boundary would be defined. All code is in the jars we pass to Spark so until now I did not question the docs.



I see no mention of a distinction between running a driver in spark-submit vs being programmatically launched for any of the Spark Master types: Standalone, Yarn, Mesos, k8s.



We are building a Machine Learning Server in OSS. It has pluggable Engines for different algorithms. Some of these use Spark so it is highly desirable to offload driver code to the cluster since we don’t want the driver embedded in the Server process. The Driver portion of our training workflow could be very large indeed and so could force the scaling of the server to worst case.



I hope someone knows how to run “Driver” code on the cluster when our server is launching the code. So deployMode = cluster, deploy method = programmatic launch.



From: Andrew Melo <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com>
Date: March 25, 2019 at 11:40:07 AM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work>, user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



Hi Pat,



Indeed, I don't think that it's possible to use cluster mode w/o spark-submit. All the docs I see appear to always describe needing to use spark-submit for cluster mode -- it's not even compatible with spark-shell. But it makes sense to me -- if you want Spark to run your application's driver, you need to package it up and send it to the cluster manager. You can't start spark one place and then later migrate it to the cluster. It's also why you can't use spark-shell in cluster mode either, I think.



Cheers

Andrew



On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com>> wrote:

In the GUI while the job is running the app-id link brings up logs to both executors, The “name” link goes to 4040 of the machine that launched the job but is not resolvable right now so the page is not shown. I’ll try the netstat but the use of port 4040 was a good clue.



By what you say below this indicates the Driver is running on the launching machine, the client to the Spark Cluster. This should be the case in deployMode = client.



Can someone explain what is going on? The evidence seems to say that deployMode = cluster does not work as described unless you use spark-submit (and I’m only guessing at that).



Further; if we don’t use spark-submit we can’t use deployMode = cluster ???



From: Akhil Das <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work>
Date: March 24, 2019 at 7:45:07 PM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



There's also a driver ui (usually available on port 4040), after running your code, I assume you are running it on your machine, visit localhost:4040 and you will get the driver UI.



If you think the driver is running on your master/executor nodes, login to those machines and do a



   netstat -napt | grep -i listen



You will see the driver listening on 404x there, this won't be the case mostly as you are not doing Spark-submit or using the deployMode=cluster.



On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com>> wrote:

Thanks, I have seen this many times in my research. Paraphrasing docs: “in deployMode ‘cluster' the Driver runs on a Worker in the cluster”



When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1 with addresses that match slaves). When I look at memory usage while the job runs I see virtually identical usage on the 2 Workers. This would support your claim and contradict Spark docs for deployMode = cluster.



The evidence seems to contradict the docs. I am now beginning to wonder if the Driver only runs in the cluster if we use spark-submit????





From: Akhil Das <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work>
Date: March 23, 2019 at 9:26:50 PM
To: Pat Ferrel <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?



If you are starting your "my-app" on your local machine, that's where the driver is running.



[image: image.png]



Hope this helps. <https://spark.apache.org/docs/latest/cluster-overview.html>



On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com>> wrote:

I have researched this for a significant amount of time and find answers that seem to be for a slightly different question than mine.



The Spark 2.3.3 cluster is running fine. I see the GUI on “http://master-address:8080", there are 2 idle workers, as configured.



I have a Scala application that creates a context and starts execution of a Job. I *do not use spark-submit*, I start the Job programmatically and this is where many explanations fork from my question.



In "my-app" I create a new SparkConf, with the following code (slightly abbreviated):



      conf.setAppName(“my-job")

      conf.setMaster(“spark://master-address:7077”)

      conf.set(“deployMode”, “cluster”)

      // other settings like driver and executor memory requests

      // the driver and executor memory requests are for all mem on the slaves, more than

      // mem available on the launching machine with “my-app"

      val jars = listJars(“/path/to/lib")

      conf.setJars(jars)

      …



When I launch the job I see 2 executors running on the 2 workers/slaves. Everything seems to run fine and sometimes completes successfully. Frequent failures are the reason for this question.



Where is the Driver running? I don’t see it in the GUI, I see 2 Executors taking all cluster resources. With a Yarn cluster I would expect the “Driver" to run on/in the Yarn Master but I am using the Spark Standalone Master, where is the Driver part of the Job running?



If it is running in the Master, we are in trouble because I start the Master on one of my 2 Workers sharing resources with one of the Executors. Executor mem + driver mem is > available mem on a Worker. I can change this but need to understand where the Driver part of the Spark Job runs. Is it in the Spark Master, or inside an Executor, or ???



The “Driver” creates and broadcasts some large data structures so the need for an answer is more critical than with more typical tiny Drivers.



Thanks for your help!




--

Cheers!



Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
I’m beginning to agree with you and find it rather surprising that this is
mentioned nowhere explicitly (maybe I missed it?). It is possible to serialize
code to be executed in executors to various nodes. It also seems possible
to serialize the “driver” bits of code although I’m not sure how the
boundary would be defined. All code is in the jars we pass to Spark so
until now I did not question the docs.

I see no mention of a distinction between running a driver in spark-submit
vs being programmatically launched for any of the Spark Master types:
Standalone, Yarn, Mesos, k8s.

We are building a Machine Learning Server in OSS. It has pluggable Engines
for different algorithms. Some of these use Spark so it is highly desirable
to offload driver code to the cluster since we don’t want the driver
embedded in the Server process. The Driver portion of our training workflow
could be very large indeed and so could force the scaling of the server to
worst case.

I hope someone knows how to run “Driver” code on the cluster when our
server is launching the code. So deployMode = cluster, deploy method =
programmatic launch.


From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Date: March 25, 2019 at 11:40:07 AM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
<us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?

Hi Pat,

Indeed, I don't think that it's possible to use cluster mode w/o
spark-submit. All the docs I see appear to always describe needing to use
spark-submit for cluster mode -- it's not even compatible with spark-shell.
But it makes sense to me -- if you want Spark to run your application's
driver, you need to package it up and send it to the cluster manager. You
can't start spark one place and then later migrate it to the cluster. It's
also why you can't use spark-shell in cluster mode either, I think.

Cheers
Andrew

On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

> In the GUI while the job is running the app-id link brings up logs to both
> executors, The “name” link goes to 4040 of the machine that launched the
> job but is not resolvable right now so the page is not shown. I’ll try the
> netstat but the use of port 4040 was a good clue.
>
> By what you say below this indicates the Driver is running on the
> launching machine, the client to the Spark Cluster. This should be the case
> in deployMode = client.
>
> Can someone explain what is going on? The evidence seems to say that
> deployMode = cluster *does not work* as described unless you use
> spark-submit (and I’m only guessing at that).
>
> Further; if we don’t use spark-submit we can’t use deployMode = cluster ???
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 24, 2019 at 7:45:07 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> There's also a driver ui (usually available on port 4040), after running
> your code, I assume you are running it on your machine, visit
> localhost:4040 and you will get the driver UI.
>
> If you think the driver is running on your master/executor nodes, login to
> those machines and do a
>
>    netstat -napt | grep -i listen
>
> You will see the driver listening on 404x there, this won't be the case
> mostly as you are not doing Spark-submit or using the deployMode=cluster.
>
> On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:
>
>> Thanks, I have seen this many times in my research. Paraphrasing docs:
>> “in deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>>
>> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
>> with addresses that match slaves). When I look at memory usage while the
>> job runs I see virtually identical usage on the 2 Workers. This would
>> support your claim and contradict Spark docs for deployMode = cluster.
>>
>> The evidence seems to contradict the docs. I am now beginning to wonder
>> if the Driver only runs in the cluster if we use spark-submit????
>>
>>
>>
>> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Date: March 23, 2019 at 9:26:50 PM
>> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
>> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
>> Subject:  Re: Where does the Driver run?
>>
>> If you are starting your "my-app" on your local machine, that's where the
>> driver is running.
>>
>> [image: image.png]
>>
>> Hope this helps.
>> <https://spark.apache.org/docs/latest/cluster-overview.html>
>>
>> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>>
>>> I have researched this for a significant amount of time and find answers
>>> that seem to be for a slightly different question than mine.
>>>
>>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>>> http://master-address:8080", there are 2 idle workers, as configured.
>>>
>>> I have a Scala application that creates a context and starts execution
>>> of a Job. I *do not use spark-submit*, I start the Job programmatically and
>>> this is where many explanations fork from my question.
>>>
>>> In "my-app" I create a new SparkConf, with the following code (slightly
>>> abbreviated):
>>>
>>>       conf.setAppName(“my-job")
>>>       conf.setMaster(“spark://master-address:7077”)
>>>       conf.set(“deployMode”, “cluster”)
>>>       // other settings like driver and executor memory requests
>>>       // the driver and executor memory requests are for all mem on the
>>> slaves, more than
>>>       // mem available on the launching machine with “my-app"
>>>       val jars = listJars(“/path/to/lib")
>>>       conf.setJars(jars)
>>>       …
>>>
>>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>>> Everything seems to run fine and sometimes completes successfully. Frequent
>>> failures are the reason for this question.
>>>
>>> Where is the Driver running? I don’t see it in the GUI, I see 2
>>> Executors taking all cluster resources. With a Yarn cluster I would expect
>>> the “Driver" to run on/in the Yarn Master but I am using the Spark
>>> Standalone Master, where is the Driver part of the Job running?
>>>
>>> If it is running in the Master, we are in trouble because I start the
>>> Master on one of my 2 Workers sharing resources with one of the Executors.
>>> Executor mem + driver mem is > available mem on a Worker. I can change this
>>> but need to understand where the Driver part of the Spark Job runs. Is it
>>> in the Spark Master, or inside an Executor, or ???
>>>
>>> The “Driver” creates and broadcasts some large data structures so the
>>> need for an answer is more critical than with more typical tiny Drivers.
>>>
>>> Thanks for your help!
>>>
>>
>>
>> --
>> Cheers!
>>
>>

Re: Where does the Driver run?

Posted by Andrew Melo <an...@gmail.com>.
Hi Pat,

Indeed, I don't think that it's possible to use cluster mode w/o
spark-submit. All the docs I see appear to always describe needing to use
spark-submit for cluster mode -- it's not even compatible with spark-shell.
But it makes sense to me -- if you want Spark to run your application's
driver, you need to package it up and send it to the cluster manager. You
can't start Spark in one place and then later migrate it to the cluster. It's
also why you can't use spark-shell in cluster mode either, I think.

Cheers
Andrew

On Mon, Mar 25, 2019 at 11:22 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

> In the GUI while the job is running the app-id link brings up logs to both
> executors, The “name” link goes to 4040 of the machine that launched the
> job but is not resolvable right now so the page is not shown. I’ll try the
> netstat but the use of port 4040 was a good clue.
>
> By what you say below this indicates the Driver is running on the
> launching machine, the client to the Spark Cluster. This should be the case
> in deployMode = client.
>
> Can someone explain what is going on? The evidence seems to say that
> deployMode = cluster *does not work* as described unless you use
> spark-submit (and I’m only guessing at that).
>
> Further; if we don’t use spark-submit we can’t use deployMode = cluster ???
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 24, 2019 at 7:45:07 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> There's also a driver ui (usually available on port 4040), after running
> your code, I assume you are running it on your machine, visit
> localhost:4040 and you will get the driver UI.
>
> If you think the driver is running on your master/executor nodes, login to
> those machines and do a
>
>    netstat -napt | grep -i listen
>
> You will see the driver listening on 404x there, this won't be the case
> mostly as you are not doing Spark-submit or using the deployMode=cluster.
>
> On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:
>
>> Thanks, I have seen this many times in my research. Paraphrasing docs:
>> “in deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>>
>> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
>> with addresses that match slaves). When I look at memory usage while the
>> job runs I see virtually identical usage on the 2 Workers. This would
>> support your claim and contradict Spark docs for deployMode = cluster.
>>
>> The evidence seems to contradict the docs. I am now beginning to wonder
>> if the Driver only runs in the cluster if we use spark-submit????
>>
>>
>>
>> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Date: March 23, 2019 at 9:26:50 PM
>> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
>> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
>> Subject:  Re: Where does the Driver run?
>>
>> If you are starting your "my-app" on your local machine, that's where the
>> driver is running.
>>
>> [image: image.png]
>>
>> Hope this helps.
>> <https://spark.apache.org/docs/latest/cluster-overview.html>
>>
>> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>>
>>> I have researched this for a significant amount of time and find answers
>>> that seem to be for a slightly different question than mine.
>>>
>>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>>> http://master-address:8080", there are 2 idle workers, as configured.
>>>
>>> I have a Scala application that creates a context and starts execution
>>> of a Job. I *do not use spark-submit*, I start the Job programmatically and
>>> this is where many explanations fork from my question.
>>>
>>> In "my-app" I create a new SparkConf, with the following code (slightly
>>> abbreviated):
>>>
>>>       conf.setAppName(“my-job")
>>>       conf.setMaster(“spark://master-address:7077”)
>>>       conf.set(“deployMode”, “cluster”)
>>>       // other settings like driver and executor memory requests
>>>       // the driver and executor memory requests are for all mem on the
>>> slaves, more than
>>>       // mem available on the launching machine with “my-app"
>>>       val jars = listJars(“/path/to/lib")
>>>       conf.setJars(jars)
>>>       …
>>>
>>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>>> Everything seems to run fine and sometimes completes successfully. Frequent
>>> failures are the reason for this question.
>>>
>>> Where is the Driver running? I don’t see it in the GUI, I see 2
>>> Executors taking all cluster resources. With a Yarn cluster I would expect
>>> the “Driver" to run on/in the Yarn Master but I am using the Spark
>>> Standalone Master, where is the Driver part of the Job running?
>>>
>>> If it is running in the Master, we are in trouble because I start the
>>> Master on one of my 2 Workers sharing resources with one of the Executors.
>>> Executor mem + driver mem is > available mem on a Worker. I can change this
>>> but need to understand where the Driver part of the Spark Job runs. Is it
>>> in the Spark Master, or inside an Executor, or ???
>>>
>>> The “Driver” creates and broadcasts some large data structures so the
>>> need for an answer is more critical than with more typical tiny Drivers.
>>>
>>> Thanks for your help!
>>>
>>
>>
>> --
>> Cheers!
>>
>>

Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
In the GUI, while the job is running, the app-id link brings up logs for both
executors. The “name” link goes to port 4040 of the machine that launched the
job, but that address is not resolvable right now so the page is not shown.
I’ll try the netstat, but the use of port 4040 was a good clue.

By what you say below this indicates the Driver is running on the launching
machine, the client to the Spark Cluster. This should be the case in
deployMode = client.

Can someone explain what is going on? The evidence seems to say that
deployMode = cluster *does not work* as described unless you use
spark-submit (and I’m only guessing at that).

Further: if we don’t use spark-submit, we can’t use deployMode = cluster ???


From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Date: March 24, 2019 at 7:45:07 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?

There's also a driver UI, usually available on port 4040. After running
your code (I assume you are running it on your machine), visit
localhost:4040 and you will get the driver UI.

If you think the driver is running on your master/executor nodes, log in to
those machines and do a

   netstat -napt | grep -i listen

You will see the driver listening on 404x there; this mostly won't be the
case, since you are not doing spark-submit or using deployMode=cluster.

On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:

> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
> [image: image.png]
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> I have researched this for a significant amount of time and find answers
>> that seem to be for a slightly different question than mine.
>>
>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>> http://master-address:8080", there are 2 idle workers, as configured.
>>
>> I have a Scala application that creates a context and starts execution of
>> a Job. I *do not use spark-submit*, I start the Job programmatically and
>> this is where many explanations fork from my question.
>>
>> In "my-app" I create a new SparkConf, with the following code (slightly
>> abbreviated):
>>
>>       conf.setAppName(“my-job")
>>       conf.setMaster(“spark://master-address:7077”)
>>       conf.set(“deployMode”, “cluster”)
>>       // other settings like driver and executor memory requests
>>       // the driver and executor memory requests are for all mem on the
>> slaves, more than
>>       // mem available on the launching machine with “my-app"
>>       val jars = listJars(“/path/to/lib")
>>       conf.setJars(jars)
>>       …
>>
>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>> Everything seems to run fine and sometimes completes successfully. Frequent
>> failures are the reason for this question.
>>
>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>> taking all cluster resources. With a Yarn cluster I would expect the
>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>> Master, so where is the Driver part of the Job running?
>>
>> If it is running in the Master, we are in trouble because I start the
>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>> Executor mem + driver mem is > available mem on a Worker. I can change this
>> but need to understand where the Driver part of the Spark Job runs. Is it
>> in the Spark Master, or inside an Executor, or ???
>>
>> The “Driver” creates and broadcasts some large data structures, so the need
>> for an answer is more critical than with more typical tiny Drivers.
>>
>> Thanks for your help!
>>
>
>
> --
> Cheers!
>
>

Re: Where does the Driver run?

Posted by Akhil Das <ak...@hacked.work>.
There's also a driver UI, usually available on port 4040. After running
your code (I assume you are running it on your machine), visit
localhost:4040 and you will get the driver UI.

If you think the driver is running on your master/executor nodes, log in to
those machines and do a

   netstat -napt | grep -i listen

You will see the driver listening on 404x there; this mostly won't be the
case, since you are not doing spark-submit or using deployMode=cluster.
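
If you want to confirm it from inside the application itself, something like
the rough sketch below should work, assuming you already have a SparkContext
named sc:

      import java.net.InetAddress

      // Rough diagnostic sketch, assuming an existing SparkContext `sc`:
      // prints where Spark thinks the driver is and where this code is executing.
      println("spark.driver.host = " + sc.getConf.get("spark.driver.host", "unset"))
      println("driver web UI     = " + sc.uiWebUrl.getOrElse("disabled"))
      println("this JVM's host   = " + InetAddress.getLocalHost.getHostName)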

On Mon, 25 Mar 2019, 01:03 Pat Ferrel, <pa...@occamsmachete.com> wrote:

> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
> [image: image.png]
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> I have researched this for a significant amount of time and find answers
>> that seem to be for a slightly different question than mine.
>>
>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>> http://master-address:8080", there are 2 idle workers, as configured.
>>
>> I have a Scala application that creates a context and starts execution of
>> a Job. I *do not use spark-submit*, I start the Job programmatically and
>> this is where many explanations fork from my question.
>>
>> In "my-app" I create a new SparkConf, with the following code (slightly
>> abbreviated):
>>
>>       conf.setAppName(“my-job")
>>       conf.setMaster(“spark://master-address:7077”)
>>       conf.set(“deployMode”, “cluster”)
>>       // other settings like driver and executor memory requests
>>       // the driver and executor memory requests are for all mem on the
>> slaves, more than
>>       // mem available on the launching machine with “my-app"
>>       val jars = listJars(“/path/to/lib")
>>       conf.setJars(jars)
>>       …
>>
>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>> Everything seems to run fine and sometimes completes successfully. Frequent
>> failures are the reason for this question.
>>
>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>> taking all cluster resources. With a Yarn cluster I would expect the
>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>> Master, so where is the Driver part of the Job running?
>>
>> If it is running in the Master, we are in trouble because I start the
>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>> Executor mem + driver mem is > available mem on a Worker. I can change this
>> but need to understand where the Driver part of the Spark Job runs. Is it
>> in the Spark Master, or inside an Executor, or ???
>>
>> The “Driver” creates and broadcasts some large data structures, so the need
>> for an answer is more critical than with more typical tiny Drivers.
>>
>> Thanks for your help!
>>
>
>
> --
> Cheers!
>
>

Re: Where does the Driver run?

Posted by Arko Provo Mukherjee <ar...@gmail.com>.
Hello,

Is spark.driver.memory set per job, or shared across jobs? You should do
load testing before setting this.

Thanks & regards
Arko


On Sun, Mar 24, 2019 at 3:09 PM Pat Ferrel <pa...@occamsmachete.com> wrote:

>
> 2 Slaves, one of which is also Master.
>
> Node 1 & 2 are slaves. Node 1 is where I run start-all.sh.
>
> The machines both have 60g of free memory (leaving about 4g for the master
> process on Node 1). The only constraint on the Driver and Executors is
> spark.driver.memory = spark.executor.memory = 60g.
>
> BTW, I would expect this to create one Executor, one Driver, and the Master
> across the 2 Workers.
>
>
>
>
> From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
> Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
> Date: March 24, 2019 at 12:46:35 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
> <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> Hi Pat,
>
> On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> Thanks, I have seen this many times in my research. Paraphrasing docs:
>> “in deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>>
>> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
>> with addresses that match slaves). When I look at memory usage while the
>> job runs I see virtually identical usage on the 2 Workers. This would
>> support your claim and contradict Spark docs for deployMode = cluster.
>>
>> The evidence seems to contradict the docs. I am now beginning to wonder
>> if the Driver only runs in the cluster if we use spark-submit????
>>
>
> Where/how are you starting "./sbin/start-master.sh"?
>
> Cheers
> Andrew
>
>
>>
>>
>>
>> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
>> Date: March 23, 2019 at 9:26:50 PM
>> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
>> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
>> Subject:  Re: Where does the Driver run?
>>
>> If you are starting your "my-app" on your local machine, that's where the
>> driver is running.
>>
>> [image: image.png]
>>
>> Hope this helps.
>> <https://spark.apache.org/docs/latest/cluster-overview.html>
>>
>> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>>
>>> I have researched this for a significant amount of time and find answers
>>> that seem to be for a slightly different question than mine.
>>>
>>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>>> http://master-address:8080", there are 2 idle workers, as configured.
>>>
>>> I have a Scala application that creates a context and starts execution
>>> of a Job. I *do not use spark-submit*, I start the Job programmatically and
>>> this is where many explanations fork from my question.
>>>
>>> In "my-app" I create a new SparkConf, with the following code (slightly
>>> abbreviated):
>>>
>>>       conf.setAppName(“my-job")
>>>       conf.setMaster(“spark://master-address:7077”)
>>>       conf.set(“deployMode”, “cluster”)
>>>       // other settings like driver and executor memory requests
>>>       // the driver and executor memory requests are for all mem on the
>>> slaves, more than
>>>       // mem available on the launching machine with “my-app"
>>>       val jars = listJars(“/path/to/lib")
>>>       conf.setJars(jars)
>>>       …
>>>
>>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>>> Everything seems to run fine and sometimes completes successfully. Frequent
>>> failures are the reason for this question.
>>>
>>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>>> taking all cluster resources. With a Yarn cluster I would expect the
>>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>>> Master, so where is the Driver part of the Job running?
>>>
>>> If it is running in the Master, we are in trouble because I start the
>>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>>> Executor mem + driver mem is > available mem on a Worker. I can change this
>>> but need to understand where the Driver part of the Spark Job runs. Is it
>>> in the Spark Master, or inside an Executor, or ???
>>>
>>> The “Driver” creates and broadcasts some large data structures, so the
>>> need for an answer is more critical than with more typical tiny Drivers.
>>>
>>> Thanks for your help!
>>>
>>
>>
>> --
>> Cheers!
>>
>>

Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
2 Slaves, one of which is also Master.

Node 1 & 2 are slaves. Node 1 is where I run start-all.sh.

The machines both have 60g of free memory (leaving about 4g for the master
process on Node 1). The only constraint on the Driver and Executors is
spark.driver.memory = spark.executor.memory = 60g.

BTW, I would expect this to create one Executor, one Driver, and the Master
across the 2 Workers.
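
If the Driver does end up on one of the Workers, I suppose we would have to
size it something like the sketch below (placeholder numbers, not a
recommendation) so one node can hold an Executor plus the Driver and the
Master process:

      // Rough sizing sketch only; 50g/8g are placeholder values that leave a few
      // GB for the Master process and JVM/OS overhead on the busiest node.
      conf.set("spark.executor.memory", "50g")
      conf.set("spark.driver.memory", "8g")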




From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Date: March 24, 2019 at 12:46:35 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
<us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?

Hi Pat,

On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel <pa...@occamsmachete.com> wrote:

> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>

Where/how are you starting "./sbin/start-master.sh"?

Cheers
Andrew


>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
> [image: image.png]
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> I have researched this for a significant amount of time and find answers
>> that seem to be for a slightly different question than mine.
>>
>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>> http://master-address:8080", there are 2 idle workers, as configured.
>>
>> I have a Scala application that creates a context and starts execution of
>> a Job. I *do not use spark-submit*, I start the Job programmatically and
>> this is where many explanations fork from my question.
>>
>> In "my-app" I create a new SparkConf, with the following code (slightly
>> abbreviated):
>>
>>       conf.setAppName(“my-job")
>>       conf.setMaster(“spark://master-address:7077”)
>>       conf.set(“deployMode”, “cluster”)
>>       // other settings like driver and executor memory requests
>>       // the driver and executor memory requests are for all mem on the
>> slaves, more than
>>       // mem available on the launching machine with “my-app"
>>       val jars = listJars(“/path/to/lib")
>>       conf.setJars(jars)
>>       …
>>
>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>> Everything seems to run fine and sometimes completes successfully. Frequent
>> failures are the reason for this question.
>>
>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>> taking all cluster resources. With a Yarn cluster I would expect the
>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>> Master, so where is the Driver part of the Job running?
>>
>> If it is running in the Master, we are in trouble because I start the
>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>> Executor mem + driver mem is > available mem on a Worker. I can change this
>> but need to understand where the Driver part of the Spark Job runs. Is it
>> in the Spark Master, or inside an Executor, or ???
>>
>> The “Driver” creates and broadcasts some large data structures, so the need
>> for an answer is more critical than with more typical tiny Drivers.
>>
>> Thanks for your help!
>>
>
>
> --
> Cheers!
>
>

Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
2 Slaves, one of which is also Master.

Node 1 & 2 are slaves. Node 1 is where I run start-all.sh.

The machines both have 60g of free memory (leaving about 4g for the master
process on Node 1). The only constraint on the Driver and Executors is
spark.driver.memory = spark.executor.memory = 60g.


From: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Reply: Andrew Melo <an...@gmail.com> <an...@gmail.com>
Date: March 24, 2019 at 12:46:35 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: Akhil Das <ak...@hacked.work> <ak...@hacked.work>, user
<us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?

Hi Pat,

On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel <pa...@occamsmachete.com> wrote:

> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>

Where/how are you starting "./sbin/start-master.sh"?

Cheers
Andrew


>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
> [image: image.png]
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> I have researched this for a significant amount of time and find answers
>> that seem to be for a slightly different question than mine.
>>
>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>> http://master-address:8080", there are 2 idle workers, as configured.
>>
>> I have a Scala application that creates a context and starts execution of
>> a Job. I *do not use spark-submit*, I start the Job programmatically and
>> this is where many explanations fork from my question.
>>
>> In "my-app" I create a new SparkConf, with the following code (slightly
>> abbreviated):
>>
>>       conf.setAppName(“my-job")
>>       conf.setMaster(“spark://master-address:7077”)
>>       conf.set(“deployMode”, “cluster”)
>>       // other settings like driver and executor memory requests
>>       // the driver and executor memory requests are for all mem on the
>> slaves, more than
>>       // mem available on the launching machine with “my-app"
>>       val jars = listJars(“/path/to/lib")
>>       conf.setJars(jars)
>>       …
>>
>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>> Everything seems to run fine and sometimes completes successfully. Frequent
>> failures are the reason for this question.
>>
>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>> taking all cluster resources. With a Yarn cluster I would expect the
>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>> Master, so where is the Driver part of the Job running?
>>
>> If it is running in the Master, we are in trouble because I start the
>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>> Executor mem + driver mem is > available mem on a Worker. I can change this
>> but need to understand where the Driver part of the Spark Job runs. Is it
>> in the Spark Master, or inside an Executor, or ???
>>
>> The “Driver” creates and broadcasts some large data structures, so the need
>> for an answer is more critical than with more typical tiny Drivers.
>>
>> Thanks for your help!
>>
>
>
> --
> Cheers!
>
>

Re: Where does the Driver run?

Posted by Andrew Melo <an...@gmail.com>.
Hi Pat,

On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel <pa...@occamsmachete.com> wrote:

> Thanks, I have seen this many times in my research. Paraphrasing docs: “in
> deployMode ‘cluster' the Driver runs on a Worker in the cluster”
>
> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
> with addresses that match slaves). When I look at memory usage while the
> job runs I see virtually identical usage on the 2 Workers. This would
> support your claim and contradict Spark docs for deployMode = cluster.
>
> The evidence seems to contradict the docs. I am now beginning to wonder if
> the Driver only runs in the cluster if we use spark-submit????
>

Where/how are you starting "./sbin/start-master.sh"?

Cheers
Andrew


>
>
>
> From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
> Date: March 23, 2019 at 9:26:50 PM
> To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
> Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
> Subject:  Re: Where does the Driver run?
>
> If you are starting your "my-app" on your local machine, that's where the
> driver is running.
>
> [image: image.png]
>
> Hope this helps.
> <https://spark.apache.org/docs/latest/cluster-overview.html>
>
> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:
>
>> I have researched this for a significant amount of time and find answers
>> that seem to be for a slightly different question than mine.
>>
>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>> http://master-address:8080", there are 2 idle workers, as configured.
>>
>> I have a Scala application that creates a context and starts execution of
>> a Job. I *do not use spark-submit*, I start the Job programmatically and
>> this is where many explanations fork from my question.
>>
>> In "my-app" I create a new SparkConf, with the following code (slightly
>> abbreviated):
>>
>>       conf.setAppName(“my-job")
>>       conf.setMaster(“spark://master-address:7077”)
>>       conf.set(“deployMode”, “cluster”)
>>       // other settings like driver and executor memory requests
>>       // the driver and executor memory requests are for all mem on the
>> slaves, more than
>>       // mem available on the launching machine with “my-app"
>>       val jars = listJars(“/path/to/lib")
>>       conf.setJars(jars)
>>       …
>>
>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>> Everything seems to run fine and sometimes completes successfully. Frequent
>> failures are the reason for this question.
>>
>> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
>> taking all cluster resources. With a Yarn cluster I would expect the
>> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
>> Master, so where is the Driver part of the Job running?
>>
>> If it is running in the Master, we are in trouble because I start the
>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>> Executor mem + driver mem is > available mem on a Worker. I can change this
>> but need to understand where the Driver part of the Spark Job runs. Is it
>> in the Spark Master, or inside an Executor, or ???
>>
>> The “Driver” creates and broadcasts some large data structures, so the need
>> for an answer is more critical than with more typical tiny Drivers.
>>
>> Thanks for your help!
>>
>
>
> --
> Cheers!
>
>

Re: Where does the Driver run?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Thanks, I have seen this many times in my research. Paraphrasing docs: “in
deployMode ‘cluster' the Driver runs on a Worker in the cluster”

When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
with addresses that match slaves). When I look at memory usage while the
job runs I see virtually identical usage on the 2 Workers. This would
support your claim and contradict Spark docs for deployMode = cluster.

The evidence seems to contradict the docs. I am now beginning to wonder if
the Driver only runs in the cluster if we use spark-submit????



From: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Reply: Akhil Das <ak...@hacked.work> <ak...@hacked.work>
Date: March 23, 2019 at 9:26:50 PM
To: Pat Ferrel <pa...@occamsmachete.com> <pa...@occamsmachete.com>
Cc: user <us...@spark.apache.org> <us...@spark.apache.org>
Subject:  Re: Where does the Driver run?

If you are starting your "my-app" on your local machine, that's where the
driver is running.

[image: image.png]

Hope this helps.
<https://spark.apache.org/docs/latest/cluster-overview.html>

On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

> I have researched this for a significant amount of time and find answers
> that seem to be for a slightly different question than mine.
>
> The Spark 2.3.3 cluster is running fine. I see the GUI on “
> http://master-address:8080", there are 2 idle workers, as configured.
>
> I have a Scala application that creates a context and starts execution of
> a Job. I *do not use spark-submit*, I start the Job programmatically and
> this is where many explanations fork from my question.
>
> In "my-app" I create a new SparkConf, with the following code (slightly
> abbreviated):
>
>       conf.setAppName(“my-job")
>       conf.setMaster(“spark://master-address:7077”)
>       conf.set(“deployMode”, “cluster”)
>       // other settings like driver and executor memory requests
>       // the driver and executor memory requests are for all mem on the
> slaves, more than
>       // mem available on the launching machine with “my-app"
>       val jars = listJars(“/path/to/lib")
>       conf.setJars(jars)
>       …
>
> When I launch the job I see 2 executors running on the 2 workers/slaves.
> Everything seems to run fine and sometimes completes successfully. Frequent
> failures are the reason for this question.
>
> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
> taking all cluster resources. With a Yarn cluster I would expect the
> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
> Master, so where is the Driver part of the Job running?
>
> If it is running in the Master, we are in trouble because I start the
> Master on one of my 2 Workers, sharing resources with one of the Executors.
> Executor mem + driver mem is > available mem on a Worker. I can change this
> but need to understand where the Driver part of the Spark Job runs. Is it
> in the Spark Master, or inside an Executor, or ???
>
> The “Driver” creates and broadcasts some large data structures, so the need
> for an answer is more critical than with more typical tiny Drivers.
>
> Thanks for your help!
>


--
Cheers!

Re: Where does the Driver run?

Posted by Akhil Das <ak...@hacked.work>.
If you are starting your "my-app" on your local machine, that's where the
driver is running.

[image: image.png]

Hope this helps.
<https://spark.apache.org/docs/latest/cluster-overview.html>

On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel <pa...@occamsmachete.com> wrote:

> I have researched this for a significant amount of time and find answers
> that seem to be for a slightly different question than mine.
>
> The Spark 2.3.3 cluster is running fine. I see the GUI on “
> http://master-address:8080", there are 2 idle workers, as configured.
>
> I have a Scala application that creates a context and starts execution of
> a Job. I *do not use spark-submit*, I start the Job programmatically and
> this is where many explanations fork from my question.
>
> In "my-app" I create a new SparkConf, with the following code (slightly
> abbreviated):
>
>       conf.setAppName(“my-job")
>       conf.setMaster(“spark://master-address:7077”)
>       conf.set(“deployMode”, “cluster”)
>       // other settings like driver and executor memory requests
>       // the driver and executor memory requests are for all mem on the
> slaves, more than
>       // mem available on the launching machine with “my-app"
>       val jars = listJars(“/path/to/lib")
>       conf.setJars(jars)
>       …
>
> When I launch the job I see 2 executors running on the 2 workers/slaves.
> Everything seems to run fine and sometimes completes successfully. Frequent
> failures are the reason for this question.
>
> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
> taking all cluster resources. With a Yarn cluster I would expect the
> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
> Master, so where is the Driver part of the Job running?
>
> If it is running in the Master, we are in trouble because I start the
> Master on one of my 2 Workers, sharing resources with one of the Executors.
> Executor mem + driver mem is > available mem on a Worker. I can change this
> but need to understand where the Driver part of the Spark Job runs. Is it
> in the Spark Master, or inside an Executor, or ???
>
> The “Driver” creates and broadcasts some large data structures, so the need
> for an answer is more critical than with more typical tiny Drivers.
>
> Thanks for your help!
>


-- 
Cheers!

Re: Where does the Driver run?

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi,

I have explained this in my LinkedIn article "The Operational Advantages of
Spark as a Distributed Processing Framework"
<https://www.linkedin.com/pulse/operational-advantages-spark-distributed-processing-mich/>

An extract

*2) YARN Deployment Modes*

The term *Deployment mode of Spark* simply means "where the driver program
will be run". There are two modes, namely *Spark Client Mode*
<https://spark.apache.org/docs/latest/running-on-yarn.html> and *Spark
Cluster Mode* <https://spark.apache.org/docs/latest/cluster-overview.html>.
These are described below:

*In Client mode, the driver daemon runs on the node through which you submit
the Spark job to your cluster.* This is often done through an Edge Node. This
mode is valuable when you want to use Spark interactively, as in our case
where we would like to display high-value prices in the dashboard. In Client
mode you do not need to reserve any resources from your cluster for the
driver daemon.

*In Cluster mode, you submit the Spark job to your cluster and the driver
daemon runs inside your cluster, in the application master.* In this mode you
do not get to use the Spark job interactively, as the client through which
you submit the job is gone as soon as it successfully submits the job to the
cluster. You will have to reserve some resources for the driver daemon
process, as it will be running in your cluster.
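
As a rough illustration of the two modes (using the SparkLauncher API, which
is equivalent to spark-submit; the jar and class names below are
placeholders):

      import org.apache.spark.launcher.SparkLauncher

      // Sketch only: the same application submitted in the two deploy modes.
      def submit(deployMode: String) =
        new SparkLauncher()
          .setAppResource("/path/to/app.jar")  // placeholder
          .setMainClass("com.example.App")     // placeholder
          .setMaster("yarn")
          .setDeployMode(deployMode)
          .launch()

      submit("client")   // driver daemon runs on the submitting/edge node
      submit("cluster")  // driver daemon runs inside the cluster's application master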

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 23 Mar 2019 at 21:13, Pat Ferrel <pa...@occamsmachete.com> wrote:

> I have researched this for a significant amount of time and find answers
> that seem to be for a slightly different question than mine.
>
> The Spark 2.3.3 cluster is running fine. I see the GUI on “
> http://master-address:8080", there are 2 idle workers, as configured.
>
> I have a Scala application that creates a context and starts execution of
> a Job. I *do not use spark-submit*, I start the Job programmatically and
> this is where many explanations fork from my question.
>
> In "my-app" I create a new SparkConf, with the following code (slightly
> abbreviated):
>
>       conf.setAppName(“my-job")
>       conf.setMaster(“spark://master-address:7077”)
>       conf.set(“deployMode”, “cluster”)
>       // other settings like driver and executor memory requests
>       // the driver and executor memory requests are for all mem on the
> slaves, more than
>       // mem available on the launching machine with “my-app"
>       val jars = listJars(“/path/to/lib")
>       conf.setJars(jars)
>       …
>
> When I launch the job I see 2 executors running on the 2 workers/slaves.
> Everything seems to run fine and sometimes completes successfully. Frequent
> failures are the reason for this question.
>
> Where is the Driver running? I don’t see it in the GUI; I see 2 Executors
> taking all cluster resources. With a Yarn cluster I would expect the
> “Driver” to run on/in the Yarn Master, but I am using the Spark Standalone
> Master, so where is the Driver part of the Job running?
>
> If it is running in the Master, we are in trouble because I start the
> Master on one of my 2 Workers, sharing resources with one of the Executors.
> Executor mem + driver mem is > available mem on a Worker. I can change this
> but need to understand where the Driver part of the Spark Job runs. Is it
> in the Spark Master, or inside an Executor, or ???
>
> The “Driver” creates and broadcasts some large data structures, so the need
> for an answer is more critical than with more typical tiny Drivers.
>
> Thanks for your help!
>