Posted to user@nutch.apache.org by 高睿 <ga...@163.com> on 2012/12/29 05:32:17 UTC

RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

Hi,

I'm using Nutch 2.1 (inside Eclipse) + Solr 4.0.0 with schema-solr4.xml. The run configuration in Eclipse is:
org.apache.nutch.crawl.Crawler
urls -solr http://localhost:8080/solr/#/collection2 -threads 1 -depth 1 -topN 3
-Dhadoop.log.dir=logs -Dhadoop.log.file=hadoop.log

Occasionally it works fine, but most of the time there is an exception in the console:
Adding 1 documents
Exception in thread "main" java.lang.RuntimeException: job failed: name=solr-index, jobid=job_local_0006
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
    at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:46)
    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:192)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)

In hadoop.log:
2012-12-29 12:22:53,109 INFO  solr.SolrWriter - Adding 1 documents
2012-12-29 12:22:53,187 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2012-12-29 12:22:53,187 WARN  mapred.LocalJobRunner - job_local_0006
java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:472)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:92)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:53)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
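
A note on reading the message above: the "60" in "expected 2, but 60" is the first byte of the response that the client tried to decode as javabin. Byte 60 is the ASCII character '<', which suggests the response began with XML or HTML rather than binary data. The sketch below only decodes that byte and parses the -solr URL from the run configuration with java.net.URI. The class and method names are hypothetical, the "looks like markup" check is an assumed heuristic and not SolrJ logic, and whether the '#' in the URL is related to the failure is not established in this thread.

```java
import java.net.URI;

class JavabinErrorHint {

    // Returns the ASCII character that a leading response byte corresponds to.
    static char asAscii(int firstByte) {
        return (char) firstByte;
    }

    // Assumed heuristic (not SolrJ code): a response that starts with '<'
    // is probably XML or HTML rather than javabin.
    static boolean looksLikeMarkup(int firstByte) {
        return asAscii(firstByte) == '<';
    }

    public static void main(String[] args) {
        int reported = 60; // the value from "expected 2, but 60"
        System.out.println("byte " + reported + " = '" + asAscii(reported) + "'");
        System.out.println("looks like markup: " + looksLikeMarkup(reported));

        // The -solr URL from the run configuration, split into its URI parts.
        URI configured = URI.create("http://localhost:8080/solr/#/collection2");
        System.out.println("path     = " + configured.getPath());
        System.out.println("fragment = " + configured.getFragment());
    }
}
```

Parsed this way, the path is "/solr/" and "/collection2" is only the fragment. A fragment is normally not sent to the server by HTTP clients, so it may be worth checking which URL the indexer actually requests. How SolrJ treats this particular URL is an assumption that the thread does not confirm.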

Thanks.

Regards,
Rui

Re:Re:RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

Posted by 高睿 <ga...@163.com>.

Hi,

I have figured out how to solve this problem. It may have been caused by changes I made to schema.xml. Here are the steps:
1. Shut down Tomcat.
2. Clean the index folder (e.g. D:\Program Files\Apache Software Foundation\Tomcat 7.0\solr\collection2\data).
3. Clean the Nutch data folder (e.g. E:\tmp\hadoop-xxxxx).
4. Delete the records in the Nutch DB table 'webpage'. (This may not be necessary.)
5. Synchronize schema.xml between Nutch and Solr.
6. Start Tomcat.
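
The two folder-cleaning steps and the schema synchronization check above could be scripted roughly as follows. This is a minimal sketch, not part of Nutch or Solr: the class name, method names and the four command-line arguments are hypothetical, and it assumes Tomcat has already been shut down so that no index files are locked. It empties a directory while keeping the directory itself, and reports whether two schema.xml copies are byte-for-byte identical. It does not touch the 'webpage' table.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.Stream;

class CrawlStateCleanup {

    // Deletes everything below dir but keeps dir itself.
    // Returns the number of files and subdirectories removed.
    static int emptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return 0;
        }
        int removed = 0;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            Iterable<Path> entries = walk.sorted(Comparator.reverseOrder())::iterator;
            for (Path p : entries) {
                if (!p.equals(dir)) {
                    Files.delete(p);
                    removed++;
                }
            }
        }
        return removed;
    }

    // True when both files have exactly the same bytes.
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println(
                "usage: CrawlStateCleanup <solrDataDir> <nutchTmpDir> <nutchSchema> <solrSchema>");
            return;
        }
        System.out.println("removed from index folder: " + emptyDirectory(Paths.get(args[0])));
        System.out.println("removed from Nutch folder: " + emptyDirectory(Paths.get(args[1])));
        System.out.println("schemas identical: "
            + sameContent(Paths.get(args[2]), Paths.get(args[3])));
    }
}
```

The paths are passed as arguments rather than hard-coded because the folder names in this thread are specific to one machine.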

Good luck.

Re:RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format

Posted by 高睿 <ga...@163.com>.

Hi,

The problem is solved, although I don't know exactly how to reproduce it or what fixed it. If you hit this exception, try cleaning a folder like: E:\tmp\hadoop-ibmsz\mapred





