Posted to dev@spark.apache.org by Ankur Gupta <an...@cloudera.com.INVALID> on 2018/08/21 21:19:10 UTC

Persisting driver logs in yarn client mode (SPARK-25118)

Hi all,

I want to highlight a problem that we face here at Cloudera and start a
discussion on how to go about solving it.

*Problem Statement:*
Our customers reach out to us when they face problems in their Spark
applications. Those problems can be related to Spark, environment issues,
their own code, or something else altogether. A lot of the time these
customers run their Spark applications in YARN client mode, which, as we
all know, uses a ConsoleAppender to print logs to the console. These
customers usually send us their YARN logs to troubleshoot. As you may have
figured, those logs do not contain the driver logs, which makes it
difficult for us to troubleshoot the issue. In that scenario our customers
end up running the application again, piping the output to a log file or
using a local log appender, and then sending over that file.

I believe there are other users in the community who face a similar
problem, where the central team managing Spark clusters has difficulty
helping end users because they ran their application in the shell or in
YARN client mode (I am not sure what the equivalent is in Mesos).

Additionally, there may be teams who want to capture all of these logs so
that they can be analyzed at some later point in time. The fact that driver
logs are not part of the YARN logs means these teams either capture only
partial logs or find it difficult to capture all of them.

*Proposed Solution:*
One "low touch" approach would be to create an ApplicationListener which
listens for application start and application end events. On application
start, this listener would attach a log appender that writes to a local or
remote (e.g., HDFS) log file in an application-specific directory, and on
application end it would move this file to YARN's remote application
directory (or the equivalent Mesos directory). This way the logs would be
available as part of the YARN logs.
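
To make the proposal concrete, here is a minimal sketch using Python's
standard `logging` module as a stand-in for log4j. Everything here is an
illustrative assumption, not a Spark or YARN API: the class, the event
method names, and the archive directory standing in for YARN's remote
application dir.

```python
import logging
import os
import shutil
import tempfile

class DriverLogPersister:
    """Illustrative sketch only: attach a file appender on application
    start and move the resulting log file to an archive directory
    (standing in for YARN's remote application dir) on application end."""

    def __init__(self, app_id, archive_dir):
        self.app_id = app_id
        self.archive_dir = archive_dir
        self.local_path = os.path.join(
            tempfile.gettempdir(), f"{app_id}-driver.log")
        self.handler = None

    def on_application_start(self):
        # Attach a file handler alongside any existing console appender,
        # capturing INFO-level detail in an application-specific file.
        self.handler = logging.FileHandler(self.local_path)
        self.handler.setLevel(logging.INFO)
        logging.getLogger().addHandler(self.handler)

    def on_application_end(self):
        # Detach, flush, and move the file so it lives with the YARN logs.
        root = logging.getLogger()
        root.removeHandler(self.handler)
        self.handler.close()
        os.makedirs(self.archive_dir, exist_ok=True)
        shutil.move(
            self.local_path,
            os.path.join(self.archive_dir, f"{self.app_id}-driver.log"))
```

The per-application file name also addresses the "which logs belong to
which application" problem raised later in this thread.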

I am also interested in hearing about other ideas that the community may
have about this. Or if someone has already solved this problem, then I
would like them to contribute their solution to the community.

Thanks,
Ankur

Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Henry Robinson <he...@apache.org>.
On Mon, 27 Aug 2018 at 13:04, Ankur Gupta <an...@cloudera.com.invalid>
wrote:

> Thanks all for your responses.
>
> So I believe a solution that accomplishes the following will be a good
> solution:
>
> 1. Writes logs to HDFS asynchronously
>

In the limit, this could perform just as slowly at shutdown as synchronous
logging (imagine a job that produces a huge amount of log output and then
immediately completes). Do you plan to wait for the logging to complete,
wait up to some maximum time, or just exit quickly no matter how much log
shipping has been done?
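
A bounded wait is one concrete answer to this question. The sketch below
is plain Python and purely illustrative (the `AsyncLogShipper` class and
its sink are hypothetical, not Spark code): log lines are drained on a
background thread, and shutdown waits at most a configurable number of
seconds for the backlog to flush before giving up.

```python
import queue
import threading

class AsyncLogShipper:
    """Illustrative sketch: ship log lines on a background thread, with a
    bounded wait at shutdown so a burst of logging cannot stall exit
    indefinitely."""
    _SENTINEL = object()

    def __init__(self, sink):
        # sink is anything with append(); a real shipper would write to HDFS.
        self._q = queue.Queue()
        self._sink = sink
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        while True:
            item = self._q.get()
            if item is self._SENTINEL:
                break
            self._sink.append(item)

    def log(self, line):
        self._q.put(line)

    def close(self, max_wait_seconds=5.0):
        """Bounded shutdown: wait up to max_wait_seconds for the backlog."""
        self._q.put(self._SENTINEL)
        self._worker.join(timeout=max_wait_seconds)
        return not self._worker.is_alive()  # True if everything flushed
```

Returning whether the flush completed lets the caller decide whether to
warn about dropped log lines instead of blocking forever.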


> 2. Writes logs at INFO level while ensuring that console logs are written
> at WARN level by default (in shell mode)
> 3. Optionally, moves this file to Yarn's Remote Application Dir (to ensure
> that shutdown operation does not slow down significantly)
>
> If this resolves all the concerns, then I can work on a PR to add this
> functionality.

Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Ankur Gupta <an...@cloudera.com.INVALID>.
Thanks all for your responses.

So I believe a solution that accomplishes the following would be a good
solution:

1. Writes logs to HDFS asynchronously
2. Writes logs at INFO level while ensuring that console logs are written
at WARN level by default (in shell mode)
3. Optionally, moves this file to YARN's remote application directory (to
ensure that the shutdown operation does not slow down significantly)

If this resolves all the concerns, then I can work on a PR to add this
functionality.
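
Point 2 maps naturally onto a dual-appender log4j configuration. A sketch
in log4j 1.x properties syntax (the appender names and file path are
examples, not Spark defaults):

```properties
# Illustrative configuration: keep the console quiet at WARN while a
# file appender captures INFO-level driver logs.
log4j.rootLogger=INFO, console, driverFile

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Threshold=WARN
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j.appender.driverFile=org.apache.log4j.FileAppender
log4j.appender.driverFile.File=/tmp/spark-driver.log
log4j.appender.driverFile.Threshold=INFO
log4j.appender.driverFile.layout=org.apache.log4j.PatternLayout
log4j.appender.driverFile.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

The console threshold keeps the shell readable while the file appender
retains the INFO-level detail needed for troubleshooting.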

On Fri, Aug 24, 2018 at 3:12 PM Marcelo Vanzin <va...@cloudera.com.invalid>
wrote:


Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Marcelo Vanzin <va...@cloudera.com.INVALID>.
I think this would be useful, but I also share Saisai's and Marco's
concern about the extra step when shutting down the application. If
that could be minimized this would be a much more interesting feature.

e.g. you could upload logs incrementally to HDFS, asynchronously,
while the app is running. Or you could pipe them to the YARN AM over
Spark's RPC (losing some logs at the beginning and end of the driver
execution). Or maybe something else.

There is also the issue of shell logs being at "warn" level by
default, so even if you write these to a file, they're not really that
useful for debugging. So a solution that keeps that behavior, but
writes INFO logs to this new sink, would be great.

If you can come up with a solution to those problems I think this
could be a good feature.
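
The incremental-upload idea can be sketched in a few lines. The class
below is illustrative only (plain Python, with local files standing in for
HDFS; a real implementation would use an HDFS client with append/hflush):
it remembers how far it has shipped and copies only the new tail on each
cycle, so shutdown only has to perform one final small sync instead of
moving the whole file.

```python
import os

class IncrementalUploader:
    """Illustrative sketch: periodically copy newly appended bytes of a
    local driver log to a remote destination while the app is running."""

    def __init__(self, local_path, remote_path):
        self.local_path = local_path
        self.remote_path = remote_path
        self.offset = 0  # how many bytes have been shipped so far

    def sync_once(self):
        """Ship any bytes appended since the last call; return count shipped."""
        with open(self.local_path, "rb") as src:
            src.seek(self.offset)
            chunk = src.read()
        if chunk:
            # Appending to a local file stands in for an HDFS append here.
            with open(self.remote_path, "ab") as dst:
                dst.write(chunk)
            self.offset += len(chunk)
        return len(chunk)
```

Calling `sync_once` on a timer during the run leaves only the last few
seconds of output to flush at application end.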


On Wed, Aug 22, 2018 at 10:01 AM, Ankur Gupta
<an...@cloudera.com.invalid> wrote:


-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Ankur Gupta <an...@cloudera.com.INVALID>.
Thanks for your responses, Saisai and Marco.

I agree that the "rename" operation can be time-consuming on object
storage, which can potentially delay the shutdown.

I also agree that customers/users have a way to use log appenders to write
log files and then send them along with the YARN application logs, but I
still think that is a cumbersome process. Also, there is the issue that
customers cannot easily identify which logs belong to which application
without reading the log file. And if users run multiple applications with
the default log4j configuration on the same host, they can end up writing
to the same log file.

Because of the issues mentioned above, we can maybe think of this as an
optional feature, disabled by default but turned on by customers. This
would solve the problems mentioned above and reduce the overhead on
users/customers, while adding a bit of overhead during the shutdown phase
of the Spark application.

Thanks,
Ankur

On Wed, Aug 22, 2018 at 1:36 AM Marco Gaido <ma...@gmail.com> wrote:


Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Marco Gaido <ma...@gmail.com>.
I agree with Saisai. You can also configure log4j to append anywhere other
than the console. Many companies have their own systems for collecting and
monitoring logs, and they just customize the log4j configuration. I am not
sure how needed this change would be.

Thanks,
Marco

On Wed, 22 Aug 2018 at 04:31, Saisai Shao <sa...@gmail.com>
wrote:


Re: Persisting driver logs in yarn client mode (SPARK-25118)

Posted by Saisai Shao <sa...@gmail.com>.
One issue I can think of is that this moving of the driver log at
application end is quite time-consuming, which will significantly delay the
shutdown. We have already suffered from this "rename" problem for event
logs on object stores; moving the driver log would make the problem worse.

For a vanilla Spark-on-YARN client application, I think the user could
redirect the console output to a log file and provide both the driver log
and the YARN application log to the customers; this does not seem like a
big overhead.

Just my two cents.

Thanks
Saisai

Ankur Gupta <an...@cloudera.com.invalid> wrote on Wed, Aug 22, 2018,
5:19 AM:
