Posted to dev@spark.apache.org by cbruegg <ma...@cbruegg.com> on 2016/06/30 18:36:19 UTC

Debugging Spark itself in standalone cluster mode

Hello everyone,

I'm a student assistant in research at the University of Paderborn, working
on integrating Spark (v1.6.2) with a new network resource management system.
I have already taken a deep dive into the source code of spark-core w.r.t.
its scheduling systems.

We are running a cluster in standalone mode consisting of a master node and
three slave nodes. Am I right to assume that tasks are scheduled within the
TaskSchedulerImpl using the DAGScheduler in this mode? I need to find a
place where the execution plan (and each stage) for a job is computed and
can be analyzed, so I placed some breakpoints in these two classes.
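
For what it's worth, the stage structure can apparently also be inspected
without a debugger via RDD.toDebugString; a minimal sketch in spark-shell
(the groupByKey is only there to force a shuffle, i.e. a stage boundary):

  scala> val rdd = sc.parallelize(1 to 1000).map(i => (i % 10, i)).groupByKey()
  scala> println(rdd.toDebugString)  // indentation marks shuffle/stage boundaries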

The remote debugging session in IntelliJ IDEA was established by running
the following commands on the master node beforehand:

  export SPARK_WORKER_OPTS="-Xdebug
-Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"
  export SPARK_MASTER_OPTS="-Xdebug
-Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"
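
For reference, the same debug agent can be attached on current JVMs with
the single -agentlib:jdwp option; an equivalent sketch with the same settings:

  export SPARK_MASTER_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=4000"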

Port 4000 has been forwarded to my local machine. Unfortunately, none of my
breakpoints in these classes get hit when I run a job like
sc.parallelize(1 to 1000).count() in spark-shell on the master node (using
--master spark://...), though when I pause all threads I can see that the
process I am debugging runs some kind of event queue, so the debugger is at
least connected to /something/.

Am I relying on false assumptions, or should these breakpoints in fact get hit?
I am not too familiar with Spark, so please bear with me if I got something
wrong. Many thanks in advance for your help.

Best regards,
Christian Brüggemann





Re: Debugging Spark itself in standalone cluster mode

Posted by cbruegg <ma...@cbruegg.com>.
Thanks for the guidance! Setting --driver-java-options in spark-shell
instead of SPARK_MASTER_OPTS made the debugger connect to the right JVM. My
breakpoints get hit now.
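
For anyone who finds this later, a sketch of the kind of invocation that
worked (the master host and ports are placeholders for our setup):

  ./bin/spark-shell --master spark://master:7077 \
    --driver-java-options "-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n"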


Re: Debugging Spark itself in standalone cluster mode

Posted by nirandap <ni...@gmail.com>.
Guys,

Aren't the TaskScheduler and DAGScheduler part of the SparkContext? So the
debug configs need to be set in the JVM where the SparkContext is
running? [1]

But yes, I agree: if you really need to inspect the execution itself, you
need to set those configs on the executors as well [2].

[1]
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-sparkcontext.html
[2]
http://spark.apache.org/docs/latest/configuration.html#runtime-environment
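
For the executor side, a sketch using the standard property documented in
[2] (the debug port is a placeholder, and this assumes at most one executor
per worker node so the listen ports don't clash):

  ./bin/spark-shell --master spark://master:7077 \
    --conf "spark.executor.extraJavaOptions=-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=5005,suspend=n"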





-- 
Niranda
@n1r44 <https://twitter.com/N1R44>
+94-71-554-8430
https://pythagoreanscript.wordpress.com/





Re: Debugging Spark itself in standalone cluster mode

Posted by Reynold Xin <rx...@databricks.com>.
Yes, scheduling is centralized in the driver.

For debugging, I think you'd want to set the executor JVM flags, not the
worker JVM flags.
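
To spell that out, each kind of JVM in standalone mode has its own knob
(these are the real env var / property names):

  SPARK_MASTER_OPTS                -> standalone Master daemon
  SPARK_WORKER_OPTS                -> standalone Worker daemons
  --driver-java-options            -> driver JVM (DAGScheduler / TaskSchedulerImpl run here)
  spark.executor.extraJavaOptions  -> executor JVMs (where tasks actually run)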

