Posted to user@spark.apache.org by Andrew Lee <al...@hotmail.com> on 2014/07/03 01:49:36 UTC

RE: write event logs with YARN

Hi Christophe,
Make sure you have 3 slashes in the hdfs scheme.
e.g.
hdfs:///<server_name>:9000/user/<user_name>/spark-events
and in the spark-defaults.conf as well:
spark.eventLog.dir=hdfs:///<server_name>:9000/user/<user_name>/spark-events

> Date: Thu, 19 Jun 2014 11:18:51 +0200
> From: christophe.preaud@kelkoo.com
> To: user@spark.apache.org
> Subject: write event logs with YARN
> 
> Hi,
> 
> I am trying to use the new Spark history server in 1.0.0 to view finished applications (launched on YARN), without success so far.
> 
> Here are the relevant configuration properties in my spark-defaults.conf:
> 
> spark.yarn.historyServer.address=<server_name>:18080
> spark.ui.killEnabled=false
> spark.eventLog.enabled=true
> spark.eventLog.compress=true
> spark.eventLog.dir=hdfs://<server_name>:9000/user/<user_name>/spark-events
> 
> And the history server has been launched with the command below:
> 
> /opt/spark/sbin/start-history-server.sh hdfs://<server_name>:9000/user/<user_name>/spark-events
> 
> 
> However, the finished applications do not appear in the history server UI (though the UI itself works correctly).
> Apparently, the problem is that the APPLICATION_COMPLETE file is not created:
> 
> hdfs dfs -stat %n spark-events/<application_name>-1403166516102/*
> COMPRESSION_CODEC_org.apache.spark.io.LZFCompressionCodec
> EVENT_LOG_2
> SPARK_VERSION_1.0.0
> 
> Indeed, if I manually create an empty APPLICATION_COMPLETE file in the above directory, the application can now be viewed normally in the history server.
> 
> Finally, here is the relevant part of the YARN application log, which seems to imply that
> the DFS FileSystem is already closed when Spark tries to create the APPLICATION_COMPLETE file:
> 
> (...)
> 14/06/19 08:29:29 INFO ApplicationMaster: finishApplicationMaster with SUCCEEDED
> 14/06/19 08:29:29 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
> 14/06/19 08:29:29 INFO ApplicationMaster: AppMaster received a signal.
> 14/06/19 08:29:29 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1397477394591_0798
> 14/06/19 08:29:29 INFO ApplicationMaster$$anon$1: Invoking sc stop from shutdown hook
> 14/06/19 08:29:29 INFO SparkUI: Stopped Spark web UI at http://dc1-ibd-corp-hadoop-02.corp.dc1.kelkoo.net:54877
> 14/06/19 08:29:29 INFO DAGScheduler: Stopping DAGScheduler
> 14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Shutting down all executors
> 14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Asking each executor to shut down
> 14/06/19 08:29:30 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
> 14/06/19 08:29:30 INFO ConnectionManager: Selector thread was interrupted!
> 14/06/19 08:29:30 INFO ConnectionManager: ConnectionManager stopped
> 14/06/19 08:29:30 INFO MemoryStore: MemoryStore cleared
> 14/06/19 08:29:30 INFO BlockManager: BlockManager stopped
> 14/06/19 08:29:30 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
> 14/06/19 08:29:30 INFO BlockManagerMaster: BlockManagerMaster stopped
> Exception in thread "Thread-44" java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1365)
>         at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1307)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:384)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:380)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:380)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:324)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
>         at org.apache.spark.util.FileLogger.createWriter(FileLogger.scala:117)
>         at org.apache.spark.util.FileLogger.newFile(FileLogger.scala:181)
>         at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:129)
>         at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
>         at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.SparkContext.stop(SparkContext.scala:989)
>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:443)
> 14/06/19 08:29:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 
> 
> Am I missing something, or is it a bug?
> 
> Thanks,
> Christophe.
> 
> Kelkoo SAS
> A simplified joint-stock company (Société par Actions Simplifiée)
> Share capital: €4,168,964.30
> Registered office: 8, rue du Sentier, 75002 Paris
> 425 093 069 RCS Paris
> 
> This message and its attachments are confidential and intended exclusively for their addressees. If you are not the intended recipient, please delete this message and notify the sender.

Re: write event logs with YARN

Posted by Christophe Préaud <ch...@kelkoo.com>.
Hi Andrew,

Thanks for your explanation; I can confirm that the entries show up in the history server UI when I create empty APPLICATION_COMPLETE files for each of them.
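For anyone hitting the same issue, the workaround can be scripted. Below is a sketch against a local directory standing in for the HDFS event log root (the function name is invented; the directory layout follows the listings earlier in the thread). On a real cluster, the equivalent would be `hdfs dfs -touchz <app_dir>/APPLICATION_COMPLETE` for each application directory:

```python
from pathlib import Path

def add_complete_markers(event_log_root):
    """Create an empty APPLICATION_COMPLETE marker in every application
    directory under event_log_root that is missing one; return the list
    of markers that were created."""
    created = []
    for app_dir in sorted(event_log_root.iterdir()):
        if app_dir.is_dir():
            marker = app_dir / "APPLICATION_COMPLETE"
            if not marker.exists():
                marker.touch()  # empty file, like `hdfs dfs -touchz`
                created.append(marker)
    return created
```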

Christophe.

On 03/07/2014 18:27, Andrew Or wrote:
Hi Christophe, another Andrew speaking.

Your configuration looks fine to me. From the stack trace it seems that we are in fact closing the file system prematurely elsewhere in the system, such that when it tries to write the APPLICATION_COMPLETE file it throws the exception you see. This does look like a potential bug in Spark. Tracing the source of this may take a little while, but we will start looking into it.

I'm assuming if you manually create your own APPLICATION_COMPLETE file then the entries should show up. Unfortunately I don't see another workaround for this, but we'll fix this as soon as possible.

Andrew



Re: write event logs with YARN

Posted by Andrew Or <an...@databricks.com>.
Hi Christophe, another Andrew speaking.

Your configuration looks fine to me. From the stack trace it seems that we
are in fact closing the file system prematurely elsewhere in the system,
such that when it tries to write the APPLICATION_COMPLETE file it throws
the exception you see. This does look like a potential bug in Spark.
Tracing the source of this may take a little while, but we will start
looking into it.
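The shutdown-ordering bug described above can be modelled in a few lines: one shutdown hook closes a shared filesystem handle before a later hook tries to write the completion marker. This is only an illustrative sketch (the class and hook names are invented, not Spark's internals), but it reproduces the shape of the stack trace:

```python
# Toy model (invented names) of the premature-close bug: one shutdown
# hook closes the shared filesystem handle before the event logger's
# hook tries to create the APPLICATION_COMPLETE file.
class SharedFileSystem:
    def __init__(self):
        self.is_open = True

    def create(self, name):
        if not self.is_open:
            raise IOError("Filesystem closed")
        return name

    def close(self):
        self.is_open = False


fs = SharedFileSystem()
shutdown_hooks = [
    fs.close,                                   # runs first: closes fs too early
    lambda: fs.create("APPLICATION_COMPLETE"),  # runs second: fails
]

errors = []
for hook in shutdown_hooks:
    try:
        hook()
    except IOError as exc:
        errors.append(str(exc))

print(errors)  # ['Filesystem closed']
```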

I'm assuming if you manually create your own APPLICATION_COMPLETE file then
the entries should show up. Unfortunately I don't see another workaround
for this, but we'll fix this as soon as possible.

Andrew



Re: write event logs with YARN

Posted by Christophe Préaud <ch...@kelkoo.com>.
Hi Andrew,

This does not work (the application failed); I get the following error when I put 3 slashes in the hdfs scheme:
(...)
Caused by: java.lang.IllegalArgumentException: Pathname /dc1-ibd-corp-hadoop-01.corp.dc1.kelkoo.net:9000/user/kookel/spark-events/kelkoo.searchkeywordreport-1404374686442 from hdfs:/dc1-ibd-corp-hadoop-01.corp.dc1.kelkoo.net:9000/user/kookel/spark-events/kelkoo.searchkeywordreport-1404374686442 is not a valid DFS filename.
(...)
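The error above is consistent with how a URI with an empty authority is parsed: with three slashes, the authority component is empty and `<server_name>:9000` is swallowed into the path, which HDFS then rejects as an invalid filename. A minimal illustration with Python's standard URI parser (the hostname here is a placeholder):

```python
from urllib.parse import urlparse

# Two slashes: the host:port pair is the URI authority (placeholder host).
two = urlparse("hdfs://namenode.example.com:9000/user/kookel/spark-events")

# Three slashes: the authority is empty, so "namenode.example.com:9000"
# becomes the first path segment -- an invalid DFS filename.
three = urlparse("hdfs:///namenode.example.com:9000/user/kookel/spark-events")

print(two.netloc)    # namenode.example.com:9000
print(two.path)      # /user/kookel/spark-events
print(three.netloc)  # (empty string)
print(three.path)    # /namenode.example.com:9000/user/kookel/spark-events
```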

Besides, I do not think that there is an issue with the hdfs path name since only the empty APPLICATION_COMPLETE file is missing (with "spark.eventLog.dir=hdfs://<server_name>:9000/user/<user_name>/spark-events"), all other files are correctly created, e.g.:
hdfs dfs -ls spark-events/kelkoo.searchkeywordreport-1404376178470
Found 3 items
-rwxrwx---   1 kookel supergroup          0 2014-07-03 08:29 spark-events/kelkoo.searchkeywordreport-1404376178470/COMPRESSION_CODEC_org.apache.spark.io.LZFCompressionCodec
-rwxrwx---   1 kookel supergroup     137948 2014-07-03 08:32 spark-events/kelkoo.searchkeywordreport-1404376178470/EVENT_LOG_2
-rwxrwx---   1 kookel supergroup          0 2014-07-03 08:29 spark-events/kelkoo.searchkeywordreport-1404376178470/SPARK_VERSION_1.0.0

Your help is appreciated though; do not hesitate if you have any other idea on how to fix this.

Thanks,
Christophe.
