Posted to dev@flink.apache.org by Gyula Fóra <gy...@gmail.com> on 2019/10/03 08:55:54 UTC

[DISCUSS] Improve Flink logging with contextual information

Hi all!

We have been thinking that it would be a great improvement to add
contextual information to the Flink logs:

 - Container / yarn / host info to JM/TM logs
 - Job info (job id/ jobname) to task logs

I think this should be similar to how the metric scopes are set up and should
be able to provide the same information for logs. Ideally it would be user
configurable.

We are wondering what would be the best way to do this, and would like to
ask for opinions or past experiences.

Our natural first thought was setting NDC / MDC in the different threads
but it seems to be a somewhat fragile mechanism as it can be easily
"cleared" or deleted by the user.
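For illustration, an MDC-based approach would look roughly like this in a log4j2 PatternLayout (a sketch; the `jobId` and `containerId` keys are hypothetical, not keys Flink sets today):

```xml
<!-- Sketch: surface MDC keys in the layout via %X{key}.
     The keys below are hypothetical; something (the framework or
     user code) must put them into the MDC, and user code can
     clear or overwrite them at any time, hence the fragility. -->
<PatternLayout pattern="%d{ISO8601} %-5p [%X{containerId}][%X{jobId}] %c - %m%n"/>
```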

What do you think?

Gyula

Re: [DISCUSS] Improve Flink logging with contextual information

Posted by Yang Wang <da...@gmail.com>.
+1 to Rong’s approach.

Using a Java option and log4j, we could save the user logs to different files.
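As a sketch of this idea (the `logFileSuffix` property name is hypothetical): pass a JVM option at launch and let log4j 1.x resolve it from the system properties into the appender's file name:

```properties
# Sketch: the JVM is launched with e.g. -DlogFileSuffix=container_001
# (hypothetical property name); log4j resolves ${logFileSuffix}
# from the system properties when building the file path.
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=${logFileSuffix}-taskmanager.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
```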

Best
Yang

Gyula Fóra <gy...@gmail.com> wrote on Fri, Oct 18, 2019 at 4:41 PM:


Re: [DISCUSS] Improve Flink logging with contextual information

Posted by Gyula Fóra <gy...@gmail.com>.
Hi all!

Thanks for the answers, this has been very helpful and we were able to set up
a similar scheme using the env variables.

Cheers,
Gyula

On Tue, Oct 15, 2019 at 9:55 AM Paul Lam <pa...@gmail.com> wrote:


Re: [DISCUSS] Improve Flink logging with contextual information

Posted by Paul Lam <pa...@gmail.com>.
+1 to Rong’s approach. We use a similar solution to the log context problem
on YARN setups. FYI.

Regarding container contextual information, we collect logs via ELK, so the
log file paths (which contain the application id and container id) and the host
are attached to the logs. But if you don’t want a new log collector, you can
also use the system env variables in your log pattern. Flink sets the container
information into the system env variables, which can be found in the container
launch script.
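As an illustration of the env-variable approach (treat the variable names as assumptions to verify against your container launch script): a logback pattern can resolve environment variables directly, with a default when the variable is unset:

```xml
<!-- Sketch: logback substitutes ${VAR} from system properties and
     the OS environment, with ${VAR:-default} as a fallback.
     CONTAINER_ID is typically exported in the YARN container
     environment; verify the exact name on your setup. -->
<encoder>
  <pattern>%d{ISO8601} %-5level [${CONTAINER_ID:-unknown}] %logger - %msg%n</pattern>
</encoder>
```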

Regarding job contextual information, we’ve tried MDC on task threads, but it
ended up with poor readability because Flink system threads are not set with the
MDC variables (in my case, user info), so now we use the user name from the
system env as the logger pattern variable instead. However, for job id/name,
I’m afraid they cannot be found in the default system env variables. You may
need to find a way to set them into the system env or system properties.
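To sketch that last point (the `jobName` property and the launch flag are hypothetical, not something Flink provides): pass the job name as a JVM system property at launch and read it with a fallback, so a log pattern can reference it via a system-property lookup:

```java
// Sketch: job metadata passed as a JVM system property at launch,
// e.g. via env.java.opts: -DjobName=my-pipeline (hypothetical flag).
public class JobLogContext {

    // Resolve the job name once, with a fallback so log patterns
    // never render an empty field when the property is missing.
    static String jobName() {
        return System.getProperty("jobName", "unknown-job");
    }

    public static void main(String[] args) {
        // Stand-in for launching the JVM with -DjobName=my-pipeline.
        System.setProperty("jobName", "my-pipeline");
        System.out.println("[" + jobName() + "] task started");
    }
}
```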

Best,
Paul Lam

> On Oct 15, 2019, at 12:50, Rong Rong <wa...@gmail.com> wrote:


Re: [DISCUSS] Improve Flink logging with contextual information

Posted by Rong Rong <wa...@gmail.com>.
Hi Gyula,

Sorry for the late reply. I think this is definitely a challenge in terms of
log visibility.
However, for your requirement, I think you can customize your Flink job by
using a custom log formatter/encoder (e.g. in log4j.properties or logback.xml)
and a suitable logger implementation.

One example you can follow is to provide customFields in your log encoding
[1,2] and use a supported appender to append your log to a file.
You can also use a more customized appender to log the data into an
external database (for example, Elasticsearch, accessed via Kibana).
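For instance, with the logstash-logback-encoder referenced in [1,2], static context can be injected via its customFields element (the variable names below are placeholders assumed to be set at launch time):

```xml
<!-- Sketch: a LogstashEncoder emitting JSON log lines with extra
     static fields. JOB_NAME/CONTAINER_ID are placeholder variables
     resolved by logback from the environment at startup. -->
<appender name="json" class="ch.qos.logback.core.FileAppender">
  <file>taskmanager.json.log</file>
  <encoder class="net.logstash.logback.encoder.LogstashEncoder">
    <customFields>{"jobName":"${JOB_NAME:-unknown}","containerId":"${CONTAINER_ID:-unknown}"}</customFields>
  </encoder>
</appender>
```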

One challenge you might face is how to configure this contextual
information dynamically. In our setup, the contextual information is
configured as system env params when the job launches, so loggers can
dynamically resolve it at start time.

Please let me know if any of the suggestions above helps.

Cheers,
Rong

[1]
https://github.com/logstash/logstash-logback-encoder/blob/master/src/test/resources/logback-test.xml#L13
[2] https://github.com/logstash/logstash-logback-encoder

On Thu, Oct 3, 2019 at 1:56 AM Gyula Fóra <gy...@gmail.com> wrote:

> Hi all!
>
> We have been thinking that it would be a great improvement to add
> contextual information to the Flink logs:
>
>  - Container / yarn / host info to JM/TM logs
>  - Job info (job id/ jobname) to task logs
>
> I this should be similar to how the metric scopes are set up and should be
> able to provide the same information for logs. Ideally it would be user
> configurable.
>
> We are wondering what would be the best way to do this, and would like to
> ask for opinions or past experiences.
>
> Our natural first thought was setting NDC / MDC in the different threads
> but it seems to be a somewhat fragile mechanism as it can be easily
> "cleared" or deleted by the user.
>
> What do you think?
>
> Gyula
>