Posted to user@spark.apache.org by Craig Vanderborgh <cr...@gmail.com> on 2013/10/15 22:03:51 UTC

Logging from Spark Jobs on a Mesos Cluster

Hello All,

I now have a real Mesos dev cluster set up, and it appears to be working.
There is one major oddity, though, that I need help with.

First, my configuration is the following:

Chronos 1.0 snapshot running on a single-node ZooKeeper (localhost:2181)
Spark-0.7.3
Mesos-0.14.0-rc4 (a Mesosphere branch)

The problem is this: Let's say that I run the SparkPi job as follows, from
Chronos:

/usr/sbin/run spark.examples.SparkPi mesos:5050 1000

The job's log output does not show up anywhere!

So I changed the test to redirect the output to a file in /tmp:

/usr/sbin/run spark.examples.SparkPi mesos:5050 1000 >& /tmp/sparkpi.log


This WORKS, but the log file ends up on "one of" the cluster nodes: usually
not the Mesos master and not the Chronos machine, although it DOES land on
the expected machine SOME of the time.

In other words: the log output appears on whichever Mesos node ends up
running the driver program, and that node varies from run to run.  This is
nasty!

What is the correct way to log from Spark jobs on a Mesos cluster so that
the output appears at a predictable, expected place?

Please advise.

Thanks in advance,
Craig Vanderborgh

Re: Logging from Spark Jobs on a Mesos Cluster

Posted by Paco Nathan <ce...@gmail.com>.
One approach could be: use the Mesos UI, click into the Chronos task, then
click its "Sandbox" link to locate the task's stdout.

There's work in progress to generalize how log output gets handled.
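
In the meantime, here is a minimal sketch of a wrapper around the redirect
from the original command. It does not make the log land on a fixed machine;
it only tags the file name with the host that ran the job and prints the
path, which should then show up in the task's sandbox stdout if the scheduler
captures it (an assumption, not something verified on this setup). The
function name run_logged and the file naming scheme are hypothetical. It also
uses "> file 2>&1" in place of ">& file", since the latter is a csh/bash form
that a plain POSIX sh may reject.

```shell
#!/bin/sh
# Hypothetical helper: run a command with stdout and stderr captured in a
# log file whose name records the host that actually ran it.
run_logged() {
    log_dir=$1
    shift
    # uname -n is the POSIX way to get the node name.
    log_file="$log_dir/sparkpi.$(uname -n).log"
    # Portable equivalent of ">& file": redirect stdout to the file,
    # then point stderr at the same place.
    "$@" > "$log_file" 2>&1
    status=$?
    # Print where the log went so it is visible in the task's own stdout.
    echo "$log_file"
    return $status
}

# Stand-in command so the sketch runs anywhere. On the cluster the call
# would instead look like:
#   run_logged /tmp /usr/sbin/run spark.examples.SparkPi mesos:5050 1000
LOG_PATH=$(run_logged "${TMPDIR:-/tmp}" sh -c 'echo to-stdout; echo to-stderr >&2')
echo "log written to: $LOG_PATH"
```

The file still lives on whichever node ran the driver, so this only helps
with finding it, not with centralizing it.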

