Posted to dev@nutch.apache.org by "Steven W (JIRA)" <ji...@apache.org> on 2016/06/27 13:24:52 UTC
[jira] [Comment Edited] (NUTCH-2267) Solr indexer fails at the end of the job with a java error message
[ https://issues.apache.org/jira/browse/NUTCH-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350979#comment-15350979 ]
Steven W edited comment on NUTCH-2267 at 6/27/16 1:24 PM:
----------------------------------------------------------
I think this is a valid bug; however, it's actually a JAR mismatch between the Solr and Hadoop dependencies. There's an easy solution, though: just change the following in the indexer-solr SolrUtils.java class:
SystemDefaultHttpClient httpClient = new SystemDefaultHttpClient();
CloudSolrClient sc = new CloudSolrClient(url.replace('|', ','), httpClient);
I'm working on a PR now.
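The "Bad return type" VerifyError reported below is the classic symptom of two httpclient versions competing on the classpath. A small, stdlib-only check can show which archive a given class is actually loaded from (the class and method names here are hypothetical, for illustration only):

```java
// Hypothetical diagnostic: print which JAR (or classpath entry) a class
// was loaded from, to confirm whether Hadoop's bundled httpclient is
// shadowing the newer one SolrJ was compiled against.
public class WhichJar {
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            java.security.CodeSource src = c.getProtectionDomain().getCodeSource();
            // Bootstrap-loaded classes (e.g. java.lang.String) have no code source.
            return src == null ? "(bootstrap classpath)" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // On a Nutch/Hadoop node this would print the path of the winning
        // httpclient JAR; off such a node it reports the class as absent.
        System.out.println(locationOf("org.apache.http.impl.client.CloseableHttpClient"));
    }
}
```

Running this with the same classpath as the indexing job should tell you whether the httpclient classes resolve to a Hadoop lib directory or to the Solr plugin's JARs.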
> Solr indexer fails at the end of the job with a java error message
> ------------------------------------------------------------------
>
> Key: NUTCH-2267
> URL: https://issues.apache.org/jira/browse/NUTCH-2267
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.12
> Environment: Hadoop v2.7.2, Solr 6 in cloud configuration with ZooKeeper 3.4.6. I use the master branch from GitHub, currently on commit da252eb7b3d2d7b70 (NUTCH-2263: mingram and maxgram support for Unigram Cosine Similarity Model is provided.)
> Reporter: kaveh minooie
> Fix For: 1.13
>
>
> this is what I was getting first:
> 16/05/23 13:52:27 INFO mapreduce.Job: map 100% reduce 100%
> 16/05/23 13:52:27 INFO mapreduce.Job: Task Id : attempt_1462499602101_0119_r_000000_0, Status : FAILED
> Error: Bad return type
> Exception Details:
> Location:
> org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
> Reason:
> Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
> Current Frame:
> bci: @58
> flags: { }
> locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
> stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
> Bytecode:
> 0x0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
> 0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
> 0x0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
> 0x0000030: b800 104e 2d2c b800 0f2d b0
> Stackmap Table:
> append_frame(@47,Object[#143])
> 16/05/23 13:52:28 INFO mapreduce.Job: map 100% reduce 0%
> as you can see, the failed reducer gets re-spawned. Then I found this issue:
> https://issues.apache.org/jira/browse/SOLR-7657 and updated my Hadoop config file. After that, the indexer seems to be able to finish (I got the documents in Solr, it seems), but I still get the error message at the end of the job:
> 16/05/23 16:39:26 INFO mapreduce.Job: map 100% reduce 99%
> 16/05/23 16:39:44 INFO mapreduce.Job: map 100% reduce 100%
> 16/05/23 16:39:57 INFO mapreduce.Job: Job job_1464045047943_0001 completed successfully
> 16/05/23 16:39:58 INFO mapreduce.Job: Counters: 53
> File System Counters
> FILE: Number of bytes read=42700154855
> FILE: Number of bytes written=70210771807
> FILE: Number of read operations=0
> FILE: Number of large read operations=0
> FILE: Number of write operations=0
> HDFS: Number of bytes read=8699202825
> HDFS: Number of bytes written=0
> HDFS: Number of read operations=537
> HDFS: Number of large read operations=0
> HDFS: Number of write operations=0
> Job Counters
> Launched map tasks=134
> Launched reduce tasks=1
> Data-local map tasks=107
> Rack-local map tasks=27
> Total time spent by all maps in occupied slots (ms)=49377664
> Total time spent by all reduces in occupied slots (ms)=32765064
> Total time spent by all map tasks (ms)=3086104
> Total time spent by all reduce tasks (ms)=1365211
> Total vcore-milliseconds taken by all map tasks=3086104
> Total vcore-milliseconds taken by all reduce tasks=1365211
> Total megabyte-milliseconds taken by all map tasks=12640681984
> Total megabyte-milliseconds taken by all reduce tasks=8387856384
> Map-Reduce Framework
> Map input records=25305474
> Map output records=25305474
> Map output bytes=27422869763
> Map output materialized bytes=27489888004
> Input split bytes=15225
> Combine input records=0
> Combine output records=0
> Reduce input groups=16061459
> Reduce shuffle bytes=27489888004
> Reduce input records=25305474
> Reduce output records=230
> Spilled Records=54688613
> Shuffled Maps =134
> Failed Shuffles=0
> Merged Map outputs=134
> GC time elapsed (ms)=88103
> CPU time spent (ms)=3361270
> Physical memory (bytes) snapshot=144395186176
> Virtual memory (bytes) snapshot=751590166528
> Total committed heap usage (bytes)=156232056832
> IndexerStatus
> indexed (add/update)=230
> Shuffle Errors
> BAD_ID=0
> CONNECTION=0
> IO_ERROR=0
> WRONG_LENGTH=0
> WRONG_MAP=0
> WRONG_REDUCE=0
> SkippingTaskCounters
> MapProcessedRecords=25305474
> ReduceProcessedGroups=16061459
> File Input Format Counters
> Bytes Read=8699187600
> File Output Format Counters
> Bytes Written=0
> Exception in thread "main" java.lang.VerifyError: Bad return type
> Exception Details:
> Location:
> org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;)Lorg/apache/http/impl/client/CloseableHttpClient; @57: areturn
> Reason:
> Type 'org/apache/http/impl/client/SystemDefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
> Current Frame:
> bci: @57
> flags: { }
> locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/SystemDefaultHttpClient' }
> stack: { 'org/apache/http/impl/client/SystemDefaultHttpClient' }
> Bytecode:
> 0x0000000: bb00 0359 2ab7 0004 4cb2 0005 b900 0601
> 0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
> 0x0000020: b600 0a2b b600 0bb6 000c b900 0d02 00b8
> 0x0000030: 000e 4d2c 2bb8 000f 2cb0
> Stackmap Table:
> append_frame(@47,Object[#143])
> at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:189)
> at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:162)
> at org.apache.nutch.indexwriter.solr.SolrUtils.getSolrClients(SolrUtils.java:54)
> at org.apache.nutch.indexwriter.solr.SolrIndexWriter.open(SolrIndexWriter.java:78)
> at org.apache.nutch.indexer.IndexWriters.open(IndexWriters.java:75)
> at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:148)
> at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
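For reference, the Hadoop config change mentioned above (the SOLR-7657 workaround) is a classpath-precedence setting; the file and value below are a hedged sketch, assuming mapred-site.xml on Hadoop 2.7.x, and should be checked against your Hadoop version's documentation:

```xml
<!-- mapred-site.xml: let the job's own JARs (Solr's newer httpclient)
     take precedence over Hadoop's bundled copies on the task classpath. -->
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
```

Note that the final VerifyError in this report is thrown in the driver JVM (RunJar, not a map/reduce task), so the analogous client-side switch, e.g. exporting HADOOP_USER_CLASSPATH_FIRST=true before launching the job, may be needed as well.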
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)