Posted to dev@nutch.apache.org by "Tejas Patil (JIRA)" <ji...@apache.org> on 2013/04/28 03:38:16 UTC
[jira] [Commented] (NUTCH-346) Improve readability of logs/hadoop.log
[ https://issues.apache.org/jira/browse/NUTCH-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643863#comment-13643863 ]
Tejas Patil commented on NUTCH-346:
-----------------------------------
I think this will be a good addition, as the log file currently contains the following block multiple times:
{noformat}
2013-04-27 15:33:24,346 INFO plugin.PluginRepository - Plugins: looking in: /home/tejas/Desktop/nutch/trunk/runtime/local/plugins
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Registered Plugins:
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - SOLRIndexWriter (indexer-solr)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Registered Extension-Points:
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch Index Writer (org.apache.nutch.indexer.IndexWriter)
2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2013-04-27 15:33:24,433 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2013-04-27 15:33:24,433 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2013-04-27 15:33:24,433 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
{noformat}
I am not sure how useful this information is for debugging, and because it is logged with every phase, it makes the log file larger. The patch would avoid the repetition.
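To get a rough idea of how much of a given log this repetition accounts for, something like the following Python sketch could be used. It is only an illustration and not part of the patch: the function name `summarize` and the regular expression are assumptions, written against the line layout shown in the excerpt above (timestamp, level, logger, " - ", message).

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
# "2013-04-27 15:33:24,432 INFO plugin.PluginRepository - Registered Plugins:"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def summarize(lines, logger="plugin.PluginRepository"):
    """Count INFO lines from one logger and how often each message repeats.

    Returns (total_lines, matching_lines, Counter mapping message -> count).
    Lines that do not match the assumed layout count toward the total only.
    """
    total = 0
    matching = 0
    messages = Counter()
    for line in lines:
        total += 1
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group("level") == "INFO" and m.group("logger") == logger:
            matching += 1
            messages[m.group("msg")] += 1
    return total, matching, messages
```

Passing an open file handle for logs/hadoop.log to `summarize` and printing `messages.most_common(5)` should show which PluginRepository messages repeat most often, and `matching / total` gives the share of the log they occupy.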
> Improve readability of logs/hadoop.log
> --------------------------------------
>
> Key: NUTCH-346
> URL: https://issues.apache.org/jira/browse/NUTCH-346
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Environment: ubuntu dapper
> Reporter: Renaud Richardet
> Priority: Minor
> Fix For: 1.7, 2.2
>
> Attachments: log4j_plugins.diff
>
>
> adding
> log4j.logger.org.apache.nutch.plugin.PluginRepository=WARN
> to conf/log4j.properties
> dramatically improves the readability of the logs in logs/hadoop.log (removes all INFO messages from PluginRepository)