Posted to user@nutch.apache.org by og...@yahoo.com on 2006/08/12 05:14:34 UTC

log4j.properties bug(?)

Hello,

I noticed the following line in conf/log4j.properties:

  log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}

I noticed that the ${hadoop.log.dir}/${hadoop.log.file} sometimes gets interpreted as "/", indicating that the 2 hadoop properties there are undefined.

I also noticed that the bin/nutch script defines these two properties while invoking Nutch, like this:

  NUTCH_OPTS="$NUTCH_OPTS -Dhadoop.log.dir=$NUTCH_LOG_DIR"

I assume the idea is that the JVM knows about the hadoop.log.dir system property, and then log4j knows about it, too.
However, it doesn't _always_ work.
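
To spell out the mechanism as I understand it: -D turns a shell value into a JVM system property, and log4j resolves the ${...} references from system properties when it parses log4j.properties. So with both properties set (the values below are made up), the appender line expands like this:

  # given -Dhadoop.log.dir=/opt/nutch/logs -Dhadoop.log.file=hadoop.log
  log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
  # expands to:
  log4j.appender.DRFA.File=/opt/nutch/logs/hadoop.log
  # with both properties unset, the same line collapses to just "/"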

That is, when invoking various bin/nutch commands as described in http://lucene.apache.org/nutch/tutorial8.html , this fails, and the system attempts to write to "/", which, of course, is a directory, not a file.

On the other hand, I also run Nutch using commands described in http://wiki.apache.org/nutch/NutchHadoopTutorial , which are slightly different.  I noticed that when I did that, log4j worked as designed - those 2 properties were defined and logging really went to logs/hadoop.log.

I'm not yet familiar enough with all the internals and configs to figure out what to change to fix this, but can somebody else more familiar with the setup fix this?

Otis




Re: log4j.properties bug(?)

Posted by e w <ep...@gmail.com>.
Thanks for pointing this out! I've sent two messages to the lists asking where
the Fetcher logs have disappeared to, and no one else seemed to be
experiencing this problem. Hardwiring the "log4j.appender.DRFA.File"
variable to a specified filename has solved this, and the logs are back. If
anyone finds the correct solution, please share it.
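
For reference, the change amounts to replacing the substitution line in conf/log4j.properties with a literal path; the path below is just a placeholder, not one from my setup:

  # Hardwired workaround: no ${...} substitution left to fail.
  log4j.appender.DRFA.File=/path/to/nutch/logs/hadoop.log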

-Ed

On 8/12/06, ogjunk-nutch@yahoo.com <og...@yahoo.com> wrote:
>
> Hello,
>
> I noticed the following line in conf/log4j.properties:
>
>   log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
>
> I noticed that the ${hadoop.log.dir}/${hadoop.log.file} sometimes gets
> interpreted as "/", indicating that the 2 hadoop properties there are
> undefined.
>
> I also noticed that the bin/nutch script defines these two properties
> while invoking Nutch, like this:
>
>   NUTCH_OPTS="$NUTCH_OPTS -Dhadoop.log.dir=$NUTCH_LOG_DIR"
>
> I assume the idea is that the JVM knows about the hadoop.log.dir system
> property, and then log4j knows about it, too.
> However, it doesn't _always_ work.
>
> That is, when invoking various bin/nutch commands as described in
> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the system
> attempts to write to "/", which, of course, is a directory, not a file.
>
> On the other hand, I also run Nutch using commands described in
> http://wiki.apache.org/nutch/NutchHadoopTutorial , which are slightly
> different.  I noticed that when I did that, log4j worked as designed - those
> 2 properties were defined and logging really went to logs/hadoop.log.
>
> I'm not yet familiar enough with all the internals and configs to figure
> out what to change to fix this, but can somebody else more familiar with the
> setup fix this?
>
> Otis
>
>
>
>

Re: [Nutch-general] log4j.properties bug(?)

Posted by e w <ep...@gmail.com>.
Hi Sami,

In case it helps (since I've experienced the same issue): I'm running on a
multiple-node setup and run dfs and the nutch commands the same as Otis.

However, with my "fix" of hard-wiring the path of the hadoop.log file in
log4j.properties, I get multiple machines and threads trying to write
simultaneously to the same file. I haven't looked at the code to see
whether the logging code tries to lock the file before writing, but the
logs certainly show time-stamps that are all over the place.
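
If hard-wiring is the only option for now, a per-host variant might at least stop the interleaving. This is untested, and log.hostname is a property name I made up; the trick is just that log4j's ${...} substitution reads Java system properties, so each node can be handed its own value at launch:

  # in the launch script: pass each node's hostname as a system property
  NUTCH_OPTS="$NUTCH_OPTS -Dlog.hostname=`hostname`"
  # in conf/log4j.properties: one file per node instead of one shared file
  log4j.appender.DRFA.File=/path/to/nutch/logs/hadoop-${log.hostname}.log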

Using the nightly build from a month or so ago, I believe the fetcher
logs were written to the individual tasktracker logs. Ought this not to
still be the case?

-Ed


On 8/13/06, ogjunk-nutch@yahoo.com <og...@yahoo.com> wrote:
>
> Hi Sami,
>
> This is a single box setup.  I start dfs and mapred daemons with
> bin/start-all.sh .
> I'm using bin/nutch with various commands (inject, generate, crawl,
> updatedb, merge, invertlinks, index, dedup ...), as described in
> http://lucene.apache.org/nutch/tutorial8.html.
>
> Otis
>
> ----- Original Message ----
> From: Sami Siren <ss...@gmail.com>
> To: nutch-user@lucene.apache.org
> Sent: Saturday, August 12, 2006 1:48:40 AM
> Subject: Re: [Nutch-general] log4j.properties bug(?)
>
> ogjunk-nutch@yahoo.com wrote:
> > Hi Sami,
> >
> > ----- Original Message ----
> > ogjunk-nutch@yahoo.com wrote:
> >> I assume the idea is that the JVM knows about the hadoop.log.dir system
> >>  property, and then log4j knows about it, too. However, it doesn't
> >> _always_ work.
> >>
> >> That is, when invoking various bin/nutch commands as described in
> >> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
> >> system attempts to write to "/", which, of course, is a directory,
> >> not a file.
> >>
> > Can you be more precise on this one - what commands do fail? What
> > kind of configuration are you running this on?
> >
> >
> > I'll have to look at another server's logs tomorrow, but I can tell
> > you that the error is much like the one in
> > http://issues.apache.org/jira/browse/NUTCH-307 :
> >
> > java.io.FileNotFoundException: / (Is a directory)
> >  cr06:   at java.io.FileOutputStream.openAppend(Native Method)
> >  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
> >  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
> >  cr06:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
> >  cr06:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
> >  cr06:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
> >  cr06:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
> >
> >
> >
> > There is really not much to any kind of particular configuration; it
> > is just that those properties are unset, so when log4j has to
> > interpret this:
> >
> > log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
> >
> >
> > It gets interpreted as:
> >
> > log4j.appender.DRFA.File=/
> >
> > Because those 2 properties are undefined. And that will happen if you
> > follow this tutorial: http://lucene.apache.org/nutch/tutorial8.html
> > This tutorial uses things like inject, generate, fetch, etc., while
> > the 0.8 tutorial on Wiki does not.  When you use the 0.8 tutorial
> > from the Wiki, the properties do get set somehow, so everything
> > works.  So it's a matter of those properties not getting set.
>
> What I meant by configuration was: are you running on one box and
> executing your tasks with LocalJobRunner, or do you use one or more boxes
> and run your jobs with TaskTracker?
>
> --
>   Sami Siren
>
>
>
>
>

Re: [Nutch-general] log4j.properties bug(?)

Posted by og...@yahoo.com.
Hi Sami,

This is a single-box setup.  I start the dfs and mapred daemons with bin/start-all.sh.
I'm using bin/nutch with various commands (inject, generate, crawl, updatedb, merge, invertlinks, index, dedup ...), as described in http://lucene.apache.org/nutch/tutorial8.html.
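
Concretely, the sequence goes roughly like this (paraphrasing the tutorial; the crawl/* and urls paths are from my setup, and <segment> stands for whichever directory generate created under crawl/segments):

  bin/start-all.sh
  bin/nutch inject crawl/crawldb urls
  bin/nutch generate crawl/crawldb crawl/segments
  bin/nutch fetch <segment>
  bin/nutch updatedb crawl/crawldb <segment>
  bin/nutch invertlinks crawl/linkdb crawl/segments/*
  bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
  bin/nutch dedup crawl/indexes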

Otis

----- Original Message ----
From: Sami Siren <ss...@gmail.com>
To: nutch-user@lucene.apache.org
Sent: Saturday, August 12, 2006 1:48:40 AM
Subject: Re: [Nutch-general] log4j.properties bug(?)

ogjunk-nutch@yahoo.com wrote:
> Hi Sami,
> 
> ----- Original Message ----
> ogjunk-nutch@yahoo.com wrote:
>> I assume the idea is that the JVM knows about the hadoop.log.dir system
>>  property, and then log4j knows about it, too. However, it doesn't 
>> _always_ work.
>> 
>> That is, when invoking various bin/nutch commands as described in 
>> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
>> system attempts to write to "/", which, of course, is a directory,
>> not a file.
>> 
> Can you be more precise on this one - what commands do fail? What
> kind of configuration are you running this on?
> 
> 
> I'll have to look at another server's logs tomorrow, but I can tell
> you that the error is much like the one in
> http://issues.apache.org/jira/browse/NUTCH-307 :
> 
> java.io.FileNotFoundException: / (Is a directory)
>  cr06:   at java.io.FileOutputStream.openAppend(Native Method)
>  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
>  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
>  cr06:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
>  cr06:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
>  cr06:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
>  cr06:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
> 
> 
> 
> There is really not much to any kind of particular configuration; it
> is just that those properties are unset, so when log4j has to
> interpret this:
> 
> log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
> 
> 
> It gets interpreted as:
> 
> log4j.appender.DRFA.File=/
> 
> Because those 2 properties are undefined. And that will happen if you
> follow this tutorial: http://lucene.apache.org/nutch/tutorial8.html 
> This tutorial uses things like inject, generate, fetch, etc., while
> the 0.8 tutorial on Wiki does not.  When you use the 0.8 tutorial
> from the Wiki, the properties do get set somehow, so everything
> works.  So it's a matter of those properties not getting set.

What I meant by configuration was: are you running on one box and
executing your tasks with LocalJobRunner, or do you use one or more boxes
and run your jobs with TaskTracker?

--
  Sami Siren





Re: [Nutch-general] log4j.properties bug(?)

Posted by Sami Siren <ss...@gmail.com>.
ogjunk-nutch@yahoo.com wrote:
> Hi Sami,
> 
> ----- Original Message ----
> ogjunk-nutch@yahoo.com wrote:
>> I assume the idea is that the JVM knows about the hadoop.log.dir system
>>  property, and then log4j knows about it, too. However, it doesn't 
>> _always_ work.
>> 
>> That is, when invoking various bin/nutch commands as described in 
>> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
>> system attempts to write to "/", which, of course, is a directory,
>> not a file.
>> 
> Can you be more precise on this one - what commands do fail? What
> kind of configuration are you running this on?
> 
> 
> I'll have to look at another server's logs tomorrow, but I can tell
> you that the error is much like the one in
> http://issues.apache.org/jira/browse/NUTCH-307 :
> 
> java.io.FileNotFoundException: / (Is a directory)
>  cr06:   at java.io.FileOutputStream.openAppend(Native Method)
>  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
>  cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
>  cr06:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
>  cr06:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
>  cr06:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215)
>  cr06:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
> 
> 
> 
> There is really not much to any kind of particular configuration; it
> is just that those properties are unset, so when log4j has to
> interpret this:
> 
> log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
> 
> 
> It gets interpreted as:
> 
> log4j.appender.DRFA.File=/
> 
> Because those 2 properties are undefined. And that will happen if you
> follow this tutorial: http://lucene.apache.org/nutch/tutorial8.html 
> This tutorial uses things like inject, generate, fetch, etc., while
> the 0.8 tutorial on Wiki does not.  When you use the 0.8 tutorial
> from the Wiki, the properties do get set somehow, so everything
> works.  So it's a matter of those properties not getting set.

What I meant by configuration was: are you running on one box and
executing your tasks with LocalJobRunner, or do you use one or more boxes
and run your jobs with TaskTracker?

--
  Sami Siren

Re: [Nutch-general] log4j.properties bug(?)

Posted by og...@yahoo.com.
Hi Sami,

----- Original Message ----
ogjunk-nutch@yahoo.com wrote:
> 
> I assume the idea is that the JVM knows about the hadoop.log.dir system
> property, and then log4j knows about it, too. However, it doesn't
> _always_ work.
> 
> That is, when invoking various bin/nutch commands as described in
> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
> system attempts to write to "/", which, of course, is a directory, not
> a file.
> 
Can you be more precise on this one - what commands do fail? What kind 
of configuration are you running this on?


I'll have to look at another server's logs tomorrow, but I can tell you that the error is much like the one in http://issues.apache.org/jira/browse/NUTCH-307 :

java.io.FileNotFoundException: / (Is a directory) 
 cr06:   at java.io.FileOutputStream.openAppend(Native Method) 
 cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:177) 
 cr06:   at java.io.FileOutputStream.<init>(FileOutputStream.java:102) 
 cr06:   at org.apache.log4j.FileAppender.setFile(FileAppender.java:289) 
 cr06:   at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163) 
 cr06:   at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:215) 
 cr06:   at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256) 


There is really not much to any kind of particular configuration; it is just that those properties are unset, so when log4j has to interpret this:

  log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}


It gets interpreted as:

  log4j.appender.DRFA.File=/

Because those two properties are undefined.
And that will happen if you follow this tutorial: http://lucene.apache.org/nutch/tutorial8.html
That tutorial uses commands like inject, generate, fetch, etc., while the 0.8 tutorial on the Wiki does not.  When you use the 0.8 tutorial from the Wiki, the properties do get set somehow, so everything works.  So it's a matter of those properties not getting set.
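
Until the script itself is fixed, here is a sketch of the stopgap I have in mind: define both properties yourself before running anything, so the substitutions can resolve. That bin/nutch picks these up from the environment is my reading of the quoted NUTCH_OPTS line, not something I've verified:

  # make sure both ${...} substitutions in conf/log4j.properties resolve
  export NUTCH_LOG_DIR=$PWD/logs
  export NUTCH_OPTS="-Dhadoop.log.file=hadoop.log"
  bin/nutch inject crawl/crawldb urls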

Thanks,
Otis




Re: log4j.properties bug(?)

Posted by Sami Siren <ss...@gmail.com>.
ogjunk-nutch@yahoo.com wrote:
> 
> I assume the idea is that the JVM knows about the hadoop.log.dir system
> property, and then log4j knows about it, too. However, it doesn't
> _always_ work.
> 
> That is, when invoking various bin/nutch commands as described in
> http://lucene.apache.org/nutch/tutorial8.html , this fails, and the
> system attempts to write to "/", which, of course, is a directory, not
> a file.
> 
Can you be more precise on this one - what commands do fail? What kind 
of configuration are you running this on?

--
  Sami Siren