Posted to user@nutch.apache.org by Koch Martina <Ko...@huberverlag.de> on 2009/02/10 15:47:01 UTC
"old" crawldb not readable with current trunk
Hi,
I just upgraded from trunk version 28.12.2008 to trunk version 04.02.2009.
Now I'm trying to read my old crawldbs, e.g. with the command "bin/nutch readdb <crawldb> -stats", but I always get the following error:
2009-02-10 15:41:05,541 DEBUG mapred.MapTask - Writing local split to /tmp/CRAWLNAME.default.xyz/mapred/local/localRunner/split.dta
2009-02-10 15:41:05,588 DEBUG mapred.TaskRunner - attempt_local_0001_m_000000_0 Progress/ping thread started
2009-02-10 15:41:05,588 INFO mapred.MapTask - numReduceTasks: 1
2009-02-10 15:41:05,588 INFO mapred.MapTask - io.sort.mb = 100
2009-02-10 15:41:05,698 INFO mapred.MapTask - data buffer = 79691776/99614720
2009-02-10 15:41:05,698 INFO mapred.MapTask - record buffer = 262144/327680
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding MAP_OUTPUT_BYTES
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding MAP_OUTPUT_RECORDS
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding COMBINE_INPUT_RECORDS
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding COMBINE_OUTPUT_RECORDS
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding MAP_INPUT_RECORDS
2009-02-10 15:41:05,713 DEBUG mapred.Counters - Adding MAP_INPUT_BYTES
2009-02-10 15:41:05,729 WARN mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:81)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:164)
at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:262)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:1817)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1790)
at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
Caused by: java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:73)
... 13 more
With the older version of the trunk I can read the crawldb without difficulty.
Are the old files no longer readable with the new trunk version since the upgrade to Lucene 2.4?
Is there anything I can do to re-use my old data with the new version?
Kind regards,
Martina
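For anyone reading the trace: the bottom frame is ConcurrentHashMap.get throwing, and that class throws NullPointerException for a null key. A plausible mechanism (inferred from the stack frames, not verified against the Hadoop 0.19 sources) is that MapWritable.readFields resolves a serialized class id the reader does not recognize to a null Class, and ReflectionUtils.newInstance then does a constructor-cache lookup with that null key. The null-key behaviour itself is easy to confirm with plain JDK code:

```java
import java.util.concurrent.ConcurrentHashMap;

// Demonstrates the root NPE in the trace above: ReflectionUtils.newInstance
// caches constructors in a ConcurrentHashMap keyed by Class, and
// ConcurrentHashMap rejects null keys. A null class reaching that cache
// lookup produces exactly this NullPointerException.
public class NpeDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<Class<?>, String> constructorCache = new ConcurrentHashMap<>();
        try {
            constructorCache.get(null); // throws: ConcurrentHashMap forbids null keys
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NPE from ConcurrentHashMap.get(null)");
        }
    }
}
```

If that reading is right, the NPE is a symptom of a serialization-format mismatch in the crawldb, not of anything wrong with the Hadoop map itself.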
Re: "old" crawldb not readable with current trunk
Posted by Doğacan Güney <do...@gmail.com>.
Hi Koch,
Sorry, I thought that would have fixed your problem.
How big is your crawldb? If it is small, would you mind sending it to
me so I can have a look?
On Wed, Feb 11, 2009 at 10:24 AM, Koch Martina <Ko...@huberverlag.de> wrote:
> Hi Doğacan,
>
> thanks for your reply!
>
> I applied the patch, but I still get the same error message.
> [...]
--
Doğacan Güney
AW: "old" crawldb not readable with current trunk
Posted by Koch Martina <Ko...@huberverlag.de>.
Hi Doğacan,
thanks for your reply!
I applied the patch, but I still get the same error message.
I also tried to merge the old crawldb into a new one and then run readdb on it, but even the merge step fails with the following error message:
2009-02-11 08:35:31,520 INFO jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-02-11 08:35:31,707 INFO mapred.FileInputFormat - Total input paths to process : 1
2009-02-11 08:35:32,004 INFO mapred.JobClient - Running job: job_local_0001
2009-02-11 08:35:32,004 INFO mapred.FileInputFormat - Total input paths to process : 1
2009-02-11 08:35:32,082 INFO mapred.MapTask - numReduceTasks: 1
2009-02-11 08:35:32,082 INFO mapred.MapTask - io.sort.mb = 100
2009-02-11 08:35:32,191 INFO mapred.MapTask - data buffer = 79691776/99614720
2009-02-11 08:35:32,191 INFO mapred.MapTask - record buffer = 262144/327680
2009-02-11 08:35:32,222 WARN mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:81)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:164)
at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:262)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:1817)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1790)
at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:186)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:170)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
Caused by: java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:73)
... 13 more
2009-02-11 08:35:33,003 FATAL crawl.CrawlDbMerger - CrawlDb merge: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
at org.apache.nutch.crawl.CrawlDbMerger.merge(CrawlDbMerger.java:119)
at org.apache.nutch.crawl.CrawlDbMerger.run(CrawlDbMerger.java:178)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.CrawlDbMerger.main(CrawlDbMerger.java:150)
I ran the merge step in debug mode and saw that the new code lines in CrawlDbMerger are never reached. The error occurs earlier, somewhere in the merge method.
Kind regards,
Martina
-----Original Message-----
From: Doğacan Güney [mailto:dogacan@gmail.com]
Sent: Tuesday, February 10, 2009 22:54
To: nutch-user@lucene.apache.org
Subject: Re: "old" crawldb not readable with current trunk
On Tue, Feb 10, 2009 at 4:47 PM, Koch Martina <Ko...@huberverlag.de> wrote:
> [...]
>
Try again in a couple of days. This is a known bug (NUTCH-683); I will commit the patch very soon. Meanwhile, you can apply the patch there manually.
> Kind regards,
> Martina
>
--
Doğacan Güney
Re: "old" crawldb not readable with current trunk
Posted by Doğacan Güney <do...@gmail.com>.
On Tue, Feb 10, 2009 at 4:47 PM, Koch Martina <Ko...@huberverlag.de> wrote:
> [...]
>
Try again in a couple of days. This is a known bug (NUTCH-683); I will commit the patch very soon. Meanwhile, you can apply the patch there manually.
> Kind regards,
> Martina
>
--
Doğacan Güney
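The NUTCH-683 fix aside, the failure mode in this thread is typical of Writable-based on-disk formats: readFields must recognize every layout that was ever written, usually by leading each record with a version byte. A minimal, hypothetical sketch of that pattern (illustrative only, not Nutch's actual CrawlDatum code):

```java
import java.io.*;

// Hypothetical record showing version-checked, Writable-style serialization.
public class Record {
    static final byte CUR_VERSION = 2;
    int score;

    void write(DataOutput out) throws IOException {
        out.writeByte(CUR_VERSION); // tag every record with the format version
        out.writeInt(score);
    }

    void readFields(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version > CUR_VERSION) {
            throw new IOException("unsupported version: " + version);
        }
        // version < CUR_VERSION: older layouts would be upgraded here
        score = in.readInt();
    }

    public static void main(String[] args) throws IOException {
        Record written = new Record();
        written.score = 42;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        written.write(new DataOutputStream(bytes));

        Record read = new Record();
        read.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(read.score); // prints 42
    }
}
```

When the stored bytes follow a layout the reader never learned to handle, as with a crawldb written by a different MapWritable implementation, deserialization fails outright rather than silently misreading the data.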