Posted to dev@nutch.apache.org by "Mike Baranczak (JIRA)" <ji...@apache.org> on 2012/10/11 03:23:03 UTC

[jira] [Commented] (NUTCH-1477) NPE when injecting with DataFileAvroStore

    [ https://issues.apache.org/jira/browse/NUTCH-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473756#comment-13473756 ] 

Mike Baranczak commented on NUTCH-1477:
---------------------------------------

I tried upgrading the Avro library to the latest version (1.7.2), but that just produces a different error:

org.apache.gora.util.GoraException: org.apache.avro.AvroRuntimeException: Not a Specific class: class org.apache.nutch.storage.WebPage
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
	at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
	at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
	at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:228)
	at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:248)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:258)
Caused by: org.apache.avro.AvroRuntimeException: Not a Specific class: class org.apache.nutch.storage.WebPage
	at org.apache.avro.specific.SpecificData.createSchema(SpecificData.java:213)
	at org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:154)
	at org.apache.avro.specific.SpecificDatumReader.setSchema(SpecificDatumReader.java:62)
	at org.apache.gora.avro.PersistentDatumReader.setSchema(PersistentDatumReader.java:69)
	at org.apache.gora.avro.PersistentDatumReader.<init>(PersistentDatumReader.java:63)
	at org.apache.gora.store.impl.DataStoreBase.initialize(DataStoreBase.java:87)
	at org.apache.gora.store.impl.FileBackedDataStoreBase.initialize(FileBackedDataStoreBase.java:63)
	at org.apache.gora.avro.store.AvroStore.initialize(AvroStore.java:80)
	at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
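
One possible reading of the trace above, not confirmed anywhere in this thread: SpecificData.createSchema appears to locate a record's schema by reflecting on a static field of the generated class, and reports "Not a Specific class" when that field is missing. If the Gora-generated WebPage exposes its schema under a different field name than the newer Avro expects, the lookup would fail in just this way. The sketch below reproduces only that mechanism with the plain JDK. The class names, the String-typed schema fields, and the field names `_SCHEMA` and `SCHEMA$` are assumptions made for illustration; this is not the Avro or Gora source.

```java
import java.lang.reflect.Field;

public class SchemaFieldLookupSketch {

    // Hypothetical stand-in for a class generated by an older code generator.
    // The field name "_SCHEMA" is an assumption for this sketch.
    public static class OldStyleRecord {
        public static final String _SCHEMA =
            "{\"type\":\"record\",\"name\":\"OldStyleRecord\",\"fields\":[]}";
    }

    // Hypothetical stand-in for a class generated by a newer code generator.
    // The field name "SCHEMA$" is likewise an assumption.
    public static class NewStyleRecord {
        public static final String SCHEMA$ =
            "{\"type\":\"record\",\"name\":\"NewStyleRecord\",\"fields\":[]}";
    }

    // Reads a static field by name and fails with a message similar to the one
    // in the trace when the class does not declare that field.
    public static String lookupSchema(Class<?> c, String fieldName) {
        try {
            Field f = c.getDeclaredField(fieldName);
            return (String) f.get(null); // static field, so no instance is needed
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException("Not a Specific class: " + c, e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The lookup succeeds when the expected field name is present.
        System.out.println(lookupSchema(NewStyleRecord.class, "SCHEMA$"));

        // The lookup fails when the class only declares the other field name.
        try {
            lookupSchema(OldStyleRecord.class, "SCHEMA$");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, swapping the Avro jar alone would not help, because the generated classes and the Avro runtime would need to agree on how the schema is exposed.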

                
> NPE when injecting with DataFileAvroStore
> -----------------------------------------
>
>                 Key: NUTCH-1477
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1477
>             Project: Nutch
>          Issue Type: Bug
>          Components: storage
>    Affects Versions: 2.1
>         Environment: Java 1.6.0_35
>            Reporter: Mike Baranczak
>
> Fresh installation of Nutch 2.1, configured to use DataFileAvroStore. Injection job throws NullPointerException, see below. No error when I switch to MemStore.
> java.lang.NullPointerException
> 	at org.apache.avro.io.BinaryEncoder.writeString(BinaryEncoder.java:133)
> 	at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:176)
> 	at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:171)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
> 	at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
> 	at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
> 	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
> 	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> 	at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
> 	at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira