Posted to user@spark.apache.org by "Michael Williams (SSI)" <Mi...@ssigroup.com> on 2022/02/21 14:15:09 UTC

Logging to determine why driver fails

Hello,

We have a POC using Spark 3.2.1, and none of us has any prior Spark experience.  Our setup uses the native Spark REST API (http://localhost:6066/v1/submissions/create) on the master node (not Livy, not Spark Job Server).  We have successfully submitted Python jobs via this endpoint, but since implementing .NET for Apache Spark and attempting to trigger those jobs through the API, the driver (on the worker) simply reports "failed", and no log files are created because the failure occurs before the application starts.
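For reference, the request we POST looks roughly like the sketch below. The field names follow the standalone master's (undocumented) CreateSubmissionRequest schema; the jar paths, app name, and versions are placeholders from our setup, not values anyone else should rely on:

```python
import json

def build_submission(app_resource, main_class, app_args, spark_master):
    """Build a CreateSubmissionRequest body for the standalone REST API
    (http://<master>:6066/v1/submissions/create). All paths below are
    illustrative placeholders."""
    return {
        "action": "CreateSubmissionRequest",
        "appResource": app_resource,
        "mainClass": main_class,
        "appArgs": app_args,
        "clientSparkVersion": "3.2.1",
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "sparkProperties": {
            "spark.app.name": "dotnet-poc",
            "spark.master": spark_master,
            "spark.submit.deployMode": "cluster",
        },
    }

# .NET for Apache Spark jobs are launched through DotnetRunner;
# the jar and app paths here are hypothetical.
payload = json.dumps(build_submission(
    "file:/opt/jobs/microsoft-spark-3-2_2.12-2.1.1.jar",
    "org.apache.spark.deploy.dotnet.DotnetRunner",
    ["/opt/jobs/app.zip", "MyApp"],
    "spark://master:7077",
))
print(payload)
```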

Is there a logging configuration that would increase the logging detail on the worker for Spark's internal processes and tell us specifically which error is occurring?

Thank you,
Mike



This electronic message may contain information that is Proprietary, Confidential, or legally privileged or protected. It is intended only for the use of the individual(s) and entity named in the message. If you are not an intended recipient of this message, please notify the sender immediately and delete the material from your computer. Do not deliver, distribute or copy this message and do not disclose its contents or take any action in reliance on the information it contains. Thank You.

Re: Logging to determine why driver fails

Posted by Artemis User <ar...@dtechspace.com>.
Spark uses log4j for logging.  There is a log4j properties template file 
in the conf directory.  Just remove the ".template" extension and edit 
the contents of log4j.properties to meet your needs.  More info on log4j 
can be found at logging.apache.org...
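For example, a minimal log4j.properties along these lines raises the detail on the machinery that launches the driver (log4j 1.x syntax, since Spark 3.2.x still ships log4j 1.2; the logger names below are the usual Spark packages, adjust to taste):

```properties
# Console appender, as in Spark's shipped template
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Turn up detail on the deploy/launcher code paths that run
# before the application itself starts
log4j.logger.org.apache.spark.deploy=DEBUG
log4j.logger.org.apache.spark.launcher=DEBUG
```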

On 2/21/22 9:15 AM, Michael Williams (SSI) wrote:

Re: Logging to determine why driver fails

Posted by Artemis User <ar...@dtechspace.com>.
Another issue I'd like to mention, which we have run into in the past: 
Spark 3.2.1 is bundled with log4j version 1.2.17.  That jar file is 
missing some APIs (e.g. the enhanced RollingFileAppender from the extras 
package), so you may encounter ClassNotFound exceptions.  To resolve 
that issue, download the apache-log4j-extras-1.2.17.jar file (it should 
be downloadable from Apache's web site) and place it in Spark's jars 
directory.

Not sure why Spark didn't bundle the extras jar with the package. Hope 
this little bug will be resolved in the next release...
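If you want to confirm whether a class is actually present in a jar before chasing ClassNotFound errors, you can peek inside it; a quick sketch (the jar path and class name in the comment are just examples from a typical install):

```python
import zipfile

def jar_contains_class(jar_path, class_name):
    """Return True if the fully-qualified class is packaged in the jar.
    Class files are stored as path-like entries, e.g.
    org/apache/log4j/rolling/RollingFileAppender.class"""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# Example usage (path is hypothetical):
# jar_contains_class("/opt/spark/jars/apache-log4j-extras-1.2.17.jar",
#                    "org.apache.log4j.rolling.RollingFileAppender")
```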

On 2/21/22 9:37 AM, Michael Williams (SSI) wrote:

RE: Logging to determine why driver fails

Posted by "Michael Williams (SSI)" <Mi...@ssigroup.com>.
Thank you.
