Posted to dev@nutch.apache.org by "Jason Calabrese (JIRA)" <ji...@apache.org> on 2006/03/19 03:45:58 UTC
[jira] Created: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
PdfParser and RSSParser Log4j appender redirection
--------------------------------------------------
Key: NUTCH-236
URL: http://issues.apache.org/jira/browse/NUTCH-236
Project: Nutch
Type: Bug
Versions: 0.8-dev
Environment: Linux, Nutch embedded in another application
Reporter: Jason Calabrese
Priority: Minor
I just found a bug in the way the log messages from Hadoop LogFormatter are
added as a new appender to the Log4j rootLogger in the PdfParser and RSSParser.
Since a new Log4j appender is created and added to the root logger each time
these classes are loaded, log messages start getting repeated.
I'm using Nutch/Hadoop inside another application, so others may not be seeing
this problem.
I think the fix is as simple as setting a name for the new appender
before adding it, and then checking at the beginning of the constructor
whether it has already been added.
Also, as the comment says in both the PdfParser and RSSParser, this code should
be moved to a common place.
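The guard described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from PdfParser, RSSParser, or any later patch: it uses java.util.logging from the JDK so that it runs without the log4j jar, and the names RedirectInstaller, NamedHandler, and "nutch-log-redirect" are hypothetical. In log4j 1.x the same idea would presumably rely on setting a name on the appender and looking it up on the root logger with getAppender(name) before calling addAppender.

```java
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical shared helper; the class and method names are illustrative only. */
final class RedirectInstaller {

    /** Hypothetical name used to recognise a handler installed by this helper. */
    static final String HANDLER_NAME = "nutch-log-redirect";

    /** A handler that carries a name, standing in for a named log4j appender. */
    static final class NamedHandler extends Handler {
        private final String name;
        private int published;

        NamedHandler(String name) { this.name = name; }

        String name() { return name; }
        int published() { return published; }

        // Counts records instead of writing them, to keep the sketch small.
        @Override public synchronized void publish(LogRecord record) { published++; }
        @Override public void flush() { }
        @Override public void close() { }
    }

    private RedirectInstaller() { }

    /** Returns the already installed handler with the given name, or null. */
    static NamedHandler find(Logger logger, String name) {
        for (Handler h : logger.getHandlers()) {
            if (h instanceof NamedHandler && ((NamedHandler) h).name().equals(name)) {
                return (NamedHandler) h;
            }
        }
        return null;
    }

    /** Adds the named handler only if it is not attached yet; returns true if added. */
    static synchronized boolean installOnce(Logger logger) {
        if (find(logger, HANDLER_NAME) != null) {
            return false; // already redirected, so do not add a duplicate
        }
        logger.addHandler(new NamedHandler(HANDLER_NAME));
        return true;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("example.parser");
        logger.setUseParentHandlers(false); // keep the demo output quiet
        // Simulate several parser instances being constructed.
        for (int i = 0; i < 3; i++) {
            installOnce(logger);
        }
        logger.info("parsed one document");
        System.out.println("deliveries: " + find(logger, HANDLER_NAME).published());
    }
}
```

Keeping the check and the add in one synchronized static method means every parser constructor can call it, and only the first call changes the logger.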
I'd be happy to make these changes and submit a patch, but I wanted to know if
the change would be welcome first. Also, does anyone know a good place for
the new util method? Maybe a new static method on LogFormatter, but then the
log4j jar would need to be added to the common lib and the classpath.
It would also be good to create a property in nutch-site.xml that could disable this logging appender redirection.
As I said above, I'd be more than happy to do this work; I'll just need some guidance to follow the project's conventions.
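A switch of the kind proposed for nutch-site.xml could look roughly like this. It is a sketch under assumptions: java.util.Properties stands in for the Nutch configuration object, and the property name parser.log4j.redirect.enabled is hypothetical, not an existing Nutch setting. The default is "enabled", so behavior would only change for users who set the property.

```java
import java.util.Properties;

/** Hypothetical reader for an on/off switch controlling the appender redirection. */
final class RedirectConfig {

    /** Hypothetical property name; not an actual Nutch setting. */
    static final String KEY = "parser.log4j.redirect.enabled";

    private RedirectConfig() { }

    /** Returns true unless the property is present and set to something other than "true". */
    static boolean redirectEnabled(Properties conf) {
        // Missing property means "enabled", which preserves the current behavior.
        String value = conf.getProperty(KEY, "true");
        return Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("default: " + redirectEnabled(conf));
        conf.setProperty(KEY, "false");
        System.out.println("after override: " + redirectEnabled(conf));
    }
}
```

A parser would consult this flag before installing the appender, which would let an embedding application that manages its own log4j configuration opt out entirely.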
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
[jira] Commented: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=comments#action_12414599 ]
Chris A. Mattmann commented on NUTCH-236:
-----------------------------------------
Hi Jason,
I'll have a patch prepared for this issue shortly, and I'll attach it to JIRA by this Sunday night.
Thanks,
Chris
[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ]
Chris A. Mattmann updated NUTCH-236:
------------------------------------
Attachment: NUTCH-236.Mattmann.060806.patch.txt
Okay, a bit late, as usual with me :-)
This patch implements Jason's suggestion for the following two issues:
1. Move log4j root logger redirection and appender code to common place (moved to utility method in org.apache.nutch.parse.ParseUtil)
2. Rename appender before adding it, and make sure it hasn't been added already before adding it
Jason's original suggestion was to move the common root logger redirection code to LogFormatter in Hadoop, but I chose not to do that in order to keep the code within Nutch and not make this patch span the two projects. If there is a pressing need to have the utility code within Hadoop, however, I can probably move the method to LogFormatter in Hadoop.
Additionally, I only ran unit-level tests on this; I didn't run a full system test in an environment where the behavior that caused this issue has already been seen. It would be great if someone like Jason could test this in his own environment and see if it fixes the issue.
[jira] Closed: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Jerome Charron (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ]
Jerome Charron closed NUTCH-236:
--------------------------------
Fix Version: 0.8-dev
Resolution: Fixed
As a side effect, this issue is solved by NUTCH-303, since Nutch now uses Jakarta Commons Logging with log4j as the default implementation.
[jira] Updated: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ]
Chris A. Mattmann updated NUTCH-236:
------------------------------------
Due Date: 05/Jun/06
[jira] Commented: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=comments#action_12371664 ]
Chris A. Mattmann commented on NUTCH-236:
-----------------------------------------
> I'd be happy to make these changes and submit a patch, but I wanted to know if
> the change would be welcome first.
I think the change makes sense. +1
> Also does anyone know a good place for the new util method?
I believe there is a generic lib-log4j plugin that right now just contains the common log4j jars depended on by other plugins. Maybe that would be a good place to put it. What do others think? I don't think that any log4j jar would need to be added to the common lib or the classpath in this case.
> It would also be good to create a property in nutch-site.xml that could disable this logging appender redirection.
Yup, I think so.
> Like I said above I'd be more than happy to do this work, I'll just need some guidance to follow the project's conventions.
I think that the steps to create a patch are roughly as follows (paraphrased from Doug a while back):
1. svn checkout [latest nutch revision]
2. make changes in that checked-out version
3. if you added any new files, type: svn add /path/to/new/files
4. type svn status to make sure that your changes are being seen by svn
5. type svn diff > mypatch.txt
As for coding standards, I believe that Nutch uses Sun's coding standards. More info about how to contribute to Nutch is available on the Wiki at this page:
http://wiki.apache.org/nutch/HowToContribute
[jira] Assigned: (NUTCH-236) PdfParser and RSSParser Log4j appender redirection
Posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-236?page=all ]
Chris A. Mattmann reassigned NUTCH-236:
---------------------------------------
Assign To: Chris A. Mattmann