Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2021/03/15 18:38:00 UTC

[jira] [Commented] (BEAM-11735) Logging from DoFn doesn't work with Spark Runner in cluster mode

    [ https://issues.apache.org/jira/browse/BEAM-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301861#comment-17301861 ] 

Kenneth Knowles commented on BEAM-11735:
----------------------------------------

Pinging [~iemejia] who may know something about this.

> Logging from DoFn doesn't work with Spark Runner in cluster mode
> ----------------------------------------------------------------
>
>                 Key: BEAM-11735
>                 URL: https://issues.apache.org/jira/browse/BEAM-11735
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark, sdk-java-core
>    Affects Versions: 2.26.0, 2.27.0
>         Environment: Cloudera 6, Hadoop 3, Spark 2.4
>            Reporter: Claudio Venturini
>            Priority: P2
>              Labels: SLF4J, log-aggregation, log4j, logging, spark
>
> Log messages emitted by a DoFn are not logged by the Spark executors when the pipeline runs on Spark in cluster deploy mode (on YARN). Tested on Cloudera 6 with Spark 2.4.
> I made a test project to reproduce the issue: [https://github.com/ventuc/beam-log-test]. Run it with:
> {{spark-submit --class beam.tests.log.LogTesting --name LogTesting --deploy-mode cluster --master yarn --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" --files $HOME/log4j.properties beam-log-test-0.0.1-SNAPSHOT.jar}}
> To retrieve logs from YARN run:
> {{yarn logs -applicationId <app_id>}}
> As you can see, log messages from the beam.tests.log package appear only in the driver's log, and not in the executors' logs.
>  
> There is no documentation on how to handle logging in Beam with the Spark runner. Please document it, as also requested in BEAM-792.
>  
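
For anyone reproducing the report, the following is a rough sketch (not part of the original issue) of one way to check which YARN containers actually emitted messages from a given logger. The function name count_by_container is hypothetical, and the script assumes that each container's section in the aggregated output starts with a line beginning with "Container: " followed by the container ID; verify that against your own output before relying on it.

```shell
# Count, per container section, the lines that mention a given string.
# Assumption: each section of the aggregated log starts with a line of the
# form "Container: <container_id> on <host>".
count_by_container() {
  pattern="$1"
  awk -v pat="$pattern" '
    /^Container: / {
      current = $2
      if (!(current in count)) { count[current] = 0; order[++n] = current }
      next
    }
    # Plain substring match, so dots in package names are not treated as regex.
    current != "" && index($0, pat) > 0 { count[current]++ }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
  '
}

# Usage sketch (replace <app_id> with the real application ID):
#   yarn logs -applicationId <app_id> | count_by_container beam.tests.log
```

If the behaviour described in the issue is present, only the driver's container should show a non-zero count for beam.tests.log, while the executor containers show zero.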



--
This message was sent by Atlassian Jira
(v8.3.4#803005)