Posted to user@spark.apache.org by Paolo Patierno <pp...@live.com> on 2017/05/31 07:42:41 UTC

Worker node log not shown

Hi all,


I have a simple cluster with one master and one worker. On another machine I launch the driver, which at some point has the following lines of code:


max.foreachRDD(rdd -> {

    LOG.info("*** max.foreachRDD");

    rdd.foreach(value -> {

        LOG.info("*** rdd.foreach");

    });
});

The message "*** max.foreachRDD" is visible in the console of the driver machine ... and it's ok.
I can't see the "*** rdd.foreach" message that should be executed on the worker node right ? Btw on the worker node console I can't see it. Why ?

I need to log what happens in the code executed on the worker node (it works when I run with master local[*], but not when the job is submitted to a worker node).

Here is the log4j.properties file I put in the /conf dir:


# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO


Thanks
Paolo.


Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor

Twitter : @ppatierno<http://twitter.com/ppatierno>
Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
Blog : DevExperience<http://paolopatierno.wordpress.com/>

Re: Worker node log not shown

Posted by Eike von Seggern <ei...@sevenval.com>.
2017-05-31 10:48 GMT+02:00 Paolo Patierno <pp...@live.com>:

> No, it's running in standalone mode as a Docker image on Kubernetes.
>
>
> The only way I found was to access the "stderr" file created under the
> "work" directory in SPARK_HOME, but is that the right way?
>

I think that is the right way. I haven't looked in the documentation, but I
think in a standalone cluster you have a master node that manages your
worker node, each node running one "management" process. When you submit a
job, these management processes spawn "executor" processes whose
stdout/stderr go to $SPARK_HOME/work/…, but are not piped through the
management processes. The logs should also be available through the web UI
on port 8081 of the worker.
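
On a standalone worker those files usually sit under a per-application
directory; something like this (the application and executor IDs here are
made up):

$SPARK_HOME/work/app-20170531074241-0000/0/stderr

And if you want the executors to pick up your own log4j.properties instead
of the default, one commonly used approach (just a sketch; the paths are
placeholders) is to ship the file with the job and point the executor JVMs
at it:

spark-submit \
  --files /local/path/to/log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
  your-app.jar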

Best

Eike

Re: Worker node log not shown

Posted by Ryan <ry...@gmail.com>.
I think you need to get the logger within the lambda; otherwise it's the
driver-side logger, which can't work on the executors.
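
Something like this (just a sketch against Log4j 1.x, which the
log4j.properties above suggests you're using; the logger names are
arbitrary):

import org.apache.log4j.Logger;

max.foreachRDD(rdd -> {
    // This function runs on the driver.
    Logger.getLogger("driver-side").info("*** max.foreachRDD");

    rdd.foreach(value -> {
        // This lambda runs on an executor: look the logger up here,
        // inside the closure, so it is created in the executor JVM
        // rather than captured (and serialized) from the driver.
        Logger.getLogger("worker-side").info("*** rdd.foreach " + value);
    });
});

Note the output will still end up in the executor's stdout/stderr files
under the worker's work directory, not on the worker daemon's console.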

On Wed, May 31, 2017 at 4:48 PM, Paolo Patierno <pp...@live.com> wrote:

> No, it's running in standalone mode as a Docker image on Kubernetes.
>
>
> The only way I found was to access the "stderr" file created under the
> "work" directory in SPARK_HOME, but is that the right way?
>
>
> *Paolo Patierno*
>
> *Senior Software Engineer (IoT) @ Red Hat **Microsoft MVP on **Windows
> Embedded & IoT*
> *Microsoft Azure Advisor*
>
> Twitter : @ppatierno <http://twitter.com/ppatierno>
> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
> Blog : DevExperience <http://paolopatierno.wordpress.com/>
>
>
> ------------------------------
> *From:* Alonso Isidoro Roman <al...@gmail.com>
> *Sent:* Wednesday, May 31, 2017 8:39 AM
> *To:* Paolo Patierno
> *Cc:* user@spark.apache.org
> *Subject:* Re: Worker node log not shown
>
> Are you running the code with YARN?
>
> If so, find the application ID through the web UI, then run the following
> command:
>
> yarn logs -applicationId <your_application_id>
>
> Alonso Isidoro Roman
> about.me/alonso.isidoro.roman
>
> 2017-05-31 9:42 GMT+02:00 Paolo Patierno <pp...@live.com>:
>
>> Hi all,
>>
>>
>> I have a simple cluster with one master and one worker. On another
>> machine I launch the driver, which at some point has the following
>> lines of code:
>>
>>
>> max.foreachRDD(rdd -> {
>>
>>     LOG.info("*** max.foreachRDD");
>>
>>     rdd.foreach(value -> {
>>
>>         LOG.info("*** rdd.foreach");
>>
>>     });
>> });
>>
>>
>> The "*** max.foreachRDD" message is visible in the console on the driver
>> machine, which is fine.
>> The "*** rdd.foreach" message should be logged on the worker node, right?
>> However, I can't see it in the worker node console. Why?
>>
>> I need to log what happens in the code executed on the worker node (it
>> works when I run with master local[*], but not when the job is submitted
>> to a worker node).
>>
>> Here is the log4j.properties file I put in the /conf dir:
>>
>> # Set everything to be logged to the console
>> log4j.rootCategory=INFO, console
>> log4j.appender.console=org.apache.log4j.ConsoleAppender
>> log4j.appender.console.target=System.err
>> log4j.appender.console.layout=org.apache.log4j.PatternLayout
>> log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>
>> # Settings to quiet third party logs that are too verbose
>> log4j.logger.org.spark-project.jetty=WARN
>> log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
>> log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
>> log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
>>
>>
>>
>> Thanks
>> Paolo.
>>
>>
>> *Paolo Patierno*
>>
>> *Senior Software Engineer (IoT) @ Red Hat **Microsoft MVP on **Windows
>> Embedded & IoT*
>> *Microsoft Azure Advisor*
>>
>> Twitter : @ppatierno <http://twitter.com/ppatierno>
>> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
>> Blog : DevExperience <http://paolopatierno.wordpress.com/>
>>
>
>

Re: Worker node log not shown

Posted by Paolo Patierno <pp...@live.com>.
No, it's running in standalone mode as a Docker image on Kubernetes.


The only way I found was to access the "stderr" file created under the "work" directory in SPARK_HOME, but is that the right way?


Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor

Twitter : @ppatierno<http://twitter.com/ppatierno>
Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
Blog : DevExperience<http://paolopatierno.wordpress.com/>


________________________________
From: Alonso Isidoro Roman <al...@gmail.com>
Sent: Wednesday, May 31, 2017 8:39 AM
To: Paolo Patierno
Cc: user@spark.apache.org
Subject: Re: Worker node log not shown

Are you running the code with YARN?

If so, find the application ID through the web UI, then run the following command:

yarn logs -applicationId <your_application_id>

Alonso Isidoro Roman
about.me/alonso.isidoro.roman


2017-05-31 9:42 GMT+02:00 Paolo Patierno <pp...@live.com>>:

Hi all,


I have a simple cluster with one master and one worker. On another machine I launch the driver, which at some point has the following lines of code:


max.foreachRDD(rdd -> {

    LOG.info("*** max.foreachRDD");

    rdd.foreach(value -> {

        LOG.info("*** rdd.foreach");

    });
});

The "*** max.foreachRDD" message is visible in the console on the driver machine, which is fine.
The "*** rdd.foreach" message should be logged on the worker node, right? However, I can't see it in the worker node console. Why?

I need to log what happens in the code executed on the worker node (it works when I run with master local[*], but not when the job is submitted to a worker node).

Here is the log4j.properties file I put in the /conf dir:


# Set everything to be logged to the console
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark-project.jetty=WARN
log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO


Thanks
Paolo.


Paolo Patierno
Senior Software Engineer (IoT) @ Red Hat
Microsoft MVP on Windows Embedded & IoT
Microsoft Azure Advisor

Twitter : @ppatierno<http://twitter.com/ppatierno>
Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
Blog : DevExperience<http://paolopatierno.wordpress.com/>


Re: Worker node log not shown

Posted by Alonso Isidoro Roman <al...@gmail.com>.
Are you running the code with YARN?

If so, find the application ID through the web UI, then run the following
command:

yarn logs -applicationId <your_application_id>
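
For example, assuming YARN log aggregation is enabled (the application ID
below is made up), this dumps the aggregated container logs, including the
executor output:

yarn logs -applicationId application_1496214000000_0001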

Alonso Isidoro Roman
about.me/alonso.isidoro.roman

2017-05-31 9:42 GMT+02:00 Paolo Patierno <pp...@live.com>:

> Hi all,
>
>
> I have a simple cluster with one master and one worker. On another machine
> I launch the driver, which at some point has the following lines of code:
>
>
> max.foreachRDD(rdd -> {
>
>     LOG.info("*** max.foreachRDD");
>
>     rdd.foreach(value -> {
>
>         LOG.info("*** rdd.foreach");
>
>     });
> });
>
>
> The "*** max.foreachRDD" message is visible in the console on the driver
> machine, which is fine.
> The "*** rdd.foreach" message should be logged on the worker node, right?
> However, I can't see it in the worker node console. Why?
>
> I need to log what happens in the code executed on the worker node (it
> works when I run with master local[*], but not when the job is submitted
> to a worker node).
>
> Here is the log4j.properties file I put in the /conf dir:
>
> # Set everything to be logged to the console
> log4j.rootCategory=INFO, console
> log4j.appender.console=org.apache.log4j.ConsoleAppender
> log4j.appender.console.target=System.err
> log4j.appender.console.layout=org.apache.log4j.PatternLayout
> log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>
> # Settings to quiet third party logs that are too verbose
> log4j.logger.org.spark-project.jetty=WARN
> log4j.logger.org.spark-project.jetty.util.component.AbstractLifeCycle=ERROR
> log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
> log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
>
>
>
> Thanks
> Paolo.
>
>
> *Paolo Patierno*
>
> *Senior Software Engineer (IoT) @ Red Hat **Microsoft MVP on **Windows
> Embedded & IoT*
> *Microsoft Azure Advisor*
>
> Twitter : @ppatierno <http://twitter.com/ppatierno>
> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
> Blog : DevExperience <http://paolopatierno.wordpress.com/>
>