Posted to user@spark.apache.org by Shivani Rao <ra...@gmail.com> on 2014/06/03 01:18:20 UTC

using Log4j to log INFO level messages on workers

Hello Spark fans,

I am trying to log messages from my Spark application. When the main()
function logs using log.info(), it works great, but when I try the same
call from the code that presumably runs on the workers, I initially got a
serialization error. To work around that, I created a new logger in the
code that operates on the data, which solved the serialization issue, but
now there is no output in the console or in the worker node logs. I don't
see any application-level log messages in the Spark logs either. When I
use println() instead, I do see console output being generated.

I tried the following:

a) passing log4j.properties via -Dlog4j.properties on the Java command
line that launches the Spark application
b) setting the properties within the worker by calling log.addAppender(new
ConsoleAppender)

Neither of them works.

What am I missing?


Thanks,
Shivani
-- 
Software Engineer
Analytics Engineering Team@ Box
Mountain View, CA
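A minimal Scala sketch of the setup described above, assuming Log4j 1.x (as bundled with Spark 1.x); the object name MyJob and the logger names are illustrative, not from the original post. The driver-side logger is created in main(), while the worker-side logger is looked up inside the closure so that no Logger instance needs to be serialized:

import org.apache.log4j.Logger
import org.apache.spark.{SparkConf, SparkContext}

object MyJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("logging-demo"))

    // Driver-side logger: these messages show up in the driver console.
    val log = Logger.getLogger(getClass)
    log.info("driver says hello")

    sc.parallelize(1 to 10).foreach { i =>
      // Worker-side: obtain the logger inside the closure, so the task closure
      // captures no Logger instance and nothing non-serializable.
      val workerLog = Logger.getLogger("MyJob.worker")
      // These messages land in the executor's stderr/stdout on the worker node,
      // not in the driver console.
      workerLog.info(s"processing element $i")
    }

    sc.stop()
  }
}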

Re: using Log4j to log INFO level messages on workers

Posted by Shivani Rao <ra...@gmail.com>.
Hello Alex

Thanks for the link. Yes, creating a singleton object for logging outside
the code that gets executed on the workers definitely works. The problem I
am facing now, though, is with configuring the logger: I don't see any log
messages in the application's worker logs.

a) When I use println(), I see the messages from the workers being logged
in the application's main driver output.
b) When I use the logger, I see log messages from main() but not from the
workers.

Maybe I should upload an MWE (minimal working example) to demonstrate my
point.

Thanks
Shivani
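
On the configuration side, here is a sketch of one commonly used way to make a custom log4j.properties visible to the executors as well as the driver, assuming Spark 1.x and spark-submit; the file contents, class name, and jar name are illustrative, and the exact options can differ by deployment mode:

# log4j.properties (illustrative): route INFO and above to a console appender
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n

# Ship the file to each executor and point both JVMs at it
spark-submit \
  --class MyJob \
  --files log4j.properties \
  --driver-java-options "-Dlog4j.configuration=file:log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  myjob.jar

Note that even with this in place, messages logged inside tasks normally end up in each executor's own stderr/stdout files (visible through the web UI or the worker's work directory), not in the driver console.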


On Mon, Jun 2, 2014 at 10:33 PM, Alex Gaudio <ad...@gmail.com> wrote:

> Hi,
>
>
> I had the same problem with PySpark.  Here's how I resolved it:
>
> What I've found in Python (not sure about Scala) is that if the function
> being serialized was written in the same Python module as the main
> function, then logging fails.  If the serialized function is in a separate
> module, then logging does not fail.  I just created this gist to demo the
> situation and the (Python) solution.  Is there a similar way to do this in
> Scala?
>
> https://gist.github.com/adgaudio/0191e14717af68bbba81
>
>
> Alex
>
>
> On Mon, Jun 2, 2014 at 7:18 PM, Shivani Rao <ra...@gmail.com> wrote:
>
>> Hello Spark fans,
>>
>> I am trying to log messages from my Spark application. When the main()
>> function logs using log.info(), it works great, but when I try the same
>> call from the code that presumably runs on the workers, I initially got a
>> serialization error. To work around that, I created a new logger in the
>> code that operates on the data, which solved the serialization issue, but
>> now there is no output in the console or in the worker node logs. I don't
>> see any application-level log messages in the Spark logs either. When I
>> use println() instead, I do see console output being generated.
>>
>> I tried the following:
>>
>> a) passing log4j.properties via -Dlog4j.properties on the Java command
>> line that launches the Spark application
>> b) setting the properties within the worker by calling log.addAppender(new
>> ConsoleAppender)
>>
>> Neither of them works.
>>
>> What am I missing?
>>
>>
>> Thanks,
>> Shivani
>> --
>> Software Engineer
>> Analytics Engineering Team@ Box
>> Mountain View, CA
>>
>
>


-- 
Software Engineer
Analytics Engineering Team@ Box
Mountain View, CA

Re: using Log4j to log INFO level messages on workers

Posted by Alex Gaudio <ad...@gmail.com>.
Hi,


I had the same problem with PySpark.  Here's how I resolved it:

What I've found in Python (not sure about Scala) is that if the function
being serialized was written in the same Python module as the main
function, then logging fails.  If the serialized function is in a separate
module, then logging does not fail.  I just created this gist to demo the
situation and the (Python) solution.  Is there a similar way to do this in
Scala?

https://gist.github.com/adgaudio/0191e14717af68bbba81


Alex
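
On the Scala side, a rough analogue of the gist (a sketch only; the WorkerLog object and its method are invented here, not taken from the gist) is to hide the logger behind a standalone singleton object. A closure that calls WorkerLog.info(...) compiles to a static reference, so no Logger instance is captured and serialized, and the lazy field is created fresh in each executor JVM:

import org.apache.log4j.Logger

// Illustrative singleton: defined outside the class whose closures get serialized.
object WorkerLog extends Serializable {
  // @transient + lazy: never shipped with a closure; initialized on first use in each JVM.
  @transient lazy val log: Logger = Logger.getLogger("WorkerLog")
  def info(msg: String): Unit = log.info(msg)
}

// Usage inside an RDD operation, e.g.:
//   rdd.foreach { x => WorkerLog.info("saw " + x) }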


On Mon, Jun 2, 2014 at 7:18 PM, Shivani Rao <ra...@gmail.com> wrote:

> Hello Spark fans,
>
> I am trying to log messages from my Spark application. When the main()
> function logs using log.info(), it works great, but when I try the same
> call from the code that presumably runs on the workers, I initially got a
> serialization error. To work around that, I created a new logger in the
> code that operates on the data, which solved the serialization issue, but
> now there is no output in the console or in the worker node logs. I don't
> see any application-level log messages in the Spark logs either. When I
> use println() instead, I do see console output being generated.
>
> I tried the following:
>
> a) passing log4j.properties via -Dlog4j.properties on the Java command
> line that launches the Spark application
> b) setting the properties within the worker by calling log.addAppender(new
> ConsoleAppender)
>
> Neither of them works.
>
> What am I missing?
>
>
> Thanks,
> Shivani
> --
> Software Engineer
> Analytics Engineering Team@ Box
> Mountain View, CA
>