Posted to issues@spark.apache.org by "Milan Brna (JIRA)" <ji...@apache.org> on 2015/10/27 14:42:27 UTC

[jira] [Updated] (SPARK-11346) Spark EventLog for completed applications

     [ https://issues.apache.org/jira/browse/SPARK-11346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Milan Brna updated SPARK-11346:
-------------------------------
    Attachment: eventLogTest.scala

> Spark EventLog for completed applications
> -----------------------------------------
>
>                 Key: SPARK-11346
>                 URL: https://issues.apache.org/jira/browse/SPARK-11346
>             Project: Spark
>          Issue Type: Question
>    Affects Versions: 1.5.1
>         Environment: Centos 6.7
>            Reporter: Milan Brna
>         Attachments: eventLogTest.scala
>
>
> Environment description: Spark 1.5.1, built the following way:
> ./dev/change-scala-version.sh 2.11
> export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
> ./make-distribution.sh --name custom-spark --tgz -Phadoop-2.6 -Pyarn -Dscala-2.11 -Phive -Phive-thriftserver
> 4 node standalone cluster (node1-node4)
> Master configuration in spark-defaults.conf:
> spark.eventLog.enabled            true
> spark.eventLog.dir                hdfs://node1:38200/user/spark-events
> The same configuration was created on all 4 nodes during the event logging tests.
> The cluster is started from node1 (the master) with ./start-all.sh; the thrift server and the history server are additionally started.
> A simple application (see the attached Scala file eventLogTest.scala) is executed from a remote laptop, using the IntelliJ GUI.
> When conf.set("spark.eventLog.enabled","true") and conf.set("spark.eventLog.dir","hdfs://node1:38200/user/spark-events")
> are un-commented, the application's event log directory is created in hdfs://node1:38200/user/spark-events and contains data.
> The history server sees and presents the content correctly. Everything is fine so far.
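> For reference, the driver side of the test looks roughly like this (a minimal sketch only; the attached eventLogTest.scala is authoritative, and the object name, master URL, and job body here are illustrative assumptions):
>
> import org.apache.spark.{SparkConf, SparkContext}
>
> object EventLogTest {
>   def main(args: Array[String]): Unit = {
>     val conf = new SparkConf()
>       .setAppName("eventLogTest")
>       .setMaster("spark://node1:7077")  // assumed standalone master URL
>     // The two lines being commented/un-commented in the tests:
>     conf.set("spark.eventLog.enabled", "true")
>     conf.set("spark.eventLog.dir", "hdfs://node1:38200/user/spark-events")
>     val sc = new SparkContext(conf)
>     sc.parallelize(1 to 100).count()  // trivial job so something gets logged
>     sc.stop()  // stopping the context finalizes the event log
>   }
> }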
> However, if both parameters are turned off in the application (commented out in the source), no event log directory is ever created for the application.
> I'd expect the spark.eventLog.enabled and spark.eventLog.dir settings from spark-defaults.conf, which is present on all four nodes, to be sufficient for the application (even a remote one) to create an event log.
> Additionally, I have experimented with the following options in spark-env.sh on all four nodes:
> SPARK_MASTER_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> SPARK_WORKER_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> SPARK_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> SPARK_CONF_DIR="/u01/com/app/spark-1.5.1-bin-cdma-spark/conf"
> SPARK_HISTORY_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> SPARK_SHUFFLE_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> SPARK_DAEMON_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=hdfs://node1:38200/user/spark-events"
> and I have even tried to set the following option in the application's SparkContext configuration:
> conf.set("spark.submit.deployMode","cluster")
> but none of these settings caused an event log to appear for the completed application.
> An event log is present for applications started from the cluster servers themselves, e.g. pyspark or the thrift server.
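> (Presumably the same holds for a jar submitted with spark-submit from a cluster node, since spark-submit itself reads conf/spark-defaults.conf; the class name and jar path below are placeholders.)
> ./bin/spark-submit --master spark://node1:7077 --class EventLogTest /path/to/eventLogTest.jar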
> Question: Is it correct behaviour that an application executed from remote IntelliJ produces no event log unless these options are explicitly set in the application's Scala code, i.e. that the settings in the spark-defaults.conf file are ignored?
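> A possible driver-side workaround would be to load a copy of the cluster's spark-defaults.conf into the SparkConf explicitly (sketch only; the local path is a placeholder, and java.util.Properties happens to accept the whitespace-separated format of that file):
>
> import java.io.FileInputStream
> import java.util.Properties
> import scala.collection.JavaConverters._
> import org.apache.spark.SparkConf
>
> // Read a locally copied spark-defaults.conf and apply every spark.* entry.
> val props = new Properties()
> props.load(new FileInputStream("/path/to/spark-defaults.conf"))  // placeholder path
> val conf = new SparkConf()
> for (key <- props.stringPropertyNames().asScala if key.startsWith("spark.")) {
>   conf.set(key, props.getProperty(key).trim)
> }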


