Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2018/07/03 20:45:00 UTC
[jira] [Commented] (NUTCH-2614) NPE in CrawlDbReader
[ https://issues.apache.org/jira/browse/NUTCH-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531904#comment-16531904 ]
Sebastian Nagel commented on NUTCH-2614:
----------------------------------------
The lines in the stack trace
{code}
513 LongWritable totalCnt = ((LongWritable) stats.get("T"));
514 stats.remove("T");
515 LOG.info("TOTAL urls:\t" + totalCnt.get());
{code}
suggest a trivial reason: an empty CrawlDb. With no "T" entry in stats, totalCnt is null and totalCnt.get() throws. To reproduce:
{noformat}
% rm -r crawldb/ # make sure to start a new CrawlDb
% bin/nutch inject crawldb/ /dev/null
% bin/nutch readdb crawldb/ -stats
...
Exception in thread "main" java.lang.NullPointerException
at org.apache.nutch.crawl.CrawlDbReader.processStatJob(CrawlDbReader.java:555)
at org.apache.nutch.crawl.CrawlDbReader.run(CrawlDbReader.java:914)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.CrawlDbReader.main(CrawlDbReader.java:980)
{noformat}
That's also reproducible with earlier versions (I've tried 1.14). Should be trivial to fix.
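A minimal sketch of the kind of null guard that would avoid the NPE, not the actual patch. To keep it runnable without Hadoop on the classpath, a plain Map<String, Long> stands in for the stats map of LongWritable values; the class name StatsNullGuard and the helper totalUrls are hypothetical and do not exist in Nutch.

```java
import java.util.HashMap;
import java.util.Map;

public class StatsNullGuard {

    // Hypothetical helper: removes the "T" (total) entry and returns its value,
    // falling back to 0 when the entry is missing, as with an empty CrawlDb.
    static long totalUrls(Map<String, Long> stats) {
        Long totalCnt = stats.remove("T"); // null if there is no "T" entry
        return totalCnt == null ? 0L : totalCnt;
    }

    public static void main(String[] args) {
        // Empty stats map, standing in for the result of reading an empty CrawlDb.
        Map<String, Long> empty = new HashMap<>();
        System.out.println("TOTAL urls:\t" + totalUrls(empty));

        // Non-empty stats map: the stored total is reported unchanged.
        Map<String, Long> stats = new HashMap<>();
        stats.put("T", 42L);
        System.out.println("TOTAL urls:\t" + totalUrls(stats));
    }
}
```

Whether readdb -stats should report zero URLs or a message that the CrawlDb is empty is a design choice for the real fix; the sketch only shows that the missing entry has to be checked before calling get().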
> NPE in CrawlDbReader
> --------------------
>
> Key: NUTCH-2614
> URL: https://issues.apache.org/jira/browse/NUTCH-2614
> Project: Nutch
> Issue Type: Bug
> Components: crawldb
> Affects Versions: 1.14, 1.15
> Reporter: Markus Jelsma
> Priority: Major
> Fix For: 1.15
>
>
> Got this in master:
> {code}
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.nutch.crawl.CrawlDbReader.processStatJob(CrawlDbReader.java:555)
> at org.apache.nutch.crawl.CrawlDbReader.run(CrawlDbReader.java:914)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.nutch.crawl.CrawlDbReader.main(CrawlDbReader.java:980)
> {code}
> Not sure why it happens or which commit caused the problem.