Posted to user@nutch.apache.org by Tom Chiverton <tc...@extravision.com> on 2016/10/14 13:30:39 UTC
Nutch 2, Solr 5 - solrdedup causes ClassCastException:
I've tried using both Solr 6 and 5 with the latest Nutch 2, and with
both I am getting an error from Nutch's bin/crawl.
/mnt/nutch/nutch/runtime/local/bin/nutch solrdedup -D
mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D
mapred.reduce.tasks.speculative.execution=false -D
mapred.map.tasks.speculative.execution=false -D
mapred.compress.map.output=true http://localhost:8983/solr/nutch
Exception in thread "main" java.lang.RuntimeException: job failed:
name=apache-nutch-2.3.1.jar, jobid=job_local2123017879_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:383)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.run(SolrDeleteDuplicates.java:393)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.main(SolrDeleteDuplicates.java:403)
Error running:
/mnt/nutch/nutch/runtime/local/bin/nutch solrdedup -D
mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D
mapred.reduce.tasks.speculative.execution=false -D
mapred.map.tasks.speculative.execution=false -D
mapred.compress.map.output=true http://localhost:8983/solr/nutch
Failed with exit value 1.
hadoop.log says
java.lang.Exception: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecordReader.nextKeyValue(SolrDeleteDuplicates.java:233)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Which appears to be related to the digest field somehow...
Is this a known bug? Do I need a particular version of Nutch with a
particular Solr or something?
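The "Caused by" frame points at SolrDeleteDuplicates.java:233, where the digest field's value is cast straight to String. When a Solr field is defined as multiValued, SolrJ hands the value back as a list, so that cast blows up exactly as in the log above. A minimal sketch of the failure mode (the MD5-style digest value is made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class DigestCastDemo {
    public static void main(String[] args) {
        // A multiValued Solr field comes back from SolrJ as a list of values,
        // not a bare String, even when only one value is stored.
        Object digestValue = new ArrayList<>(List.of("1f0e3dad99908345f7439f8ffabdffc4"));
        try {
            // Mirrors the blind cast in SolrDeleteDuplicates: (String) doc.getFieldValue("digest")
            String digest = (String) digestValue;
            System.out.println("digest = " + digest);
        } catch (ClassCastException e) {
            // This branch is taken: ArrayList cannot be cast to String.
            System.out.println("ClassCastException, as in hadoop.log");
        }
    }
}
```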
--
*Tom Chiverton*
Lead Developer
e: tc@extravision.com <ma...@extravision.com>
p: 0161 817 2922
t: @extravision <http://www.twitter.com/extravision>
w: www.extravision.com <http://www.extravision.com/>
Extravision - email worth seeing <http://www.extravision.com/>
Registered in the UK at: 107 Timber Wharf, 33 Worsley Street,
Manchester, M15 4LD.
Company Reg No: 05017214 VAT: GB 824 5386 19
This e-mail is intended solely for the person to whom it is addressed
and may contain confidential or privileged information.
Any views or opinions presented in this e-mail are solely of the author
and do not necessarily represent those of Extravision Ltd.
Re: Nutch 2, Solr 5 - solrdedup causes ClassCastException:
Posted by Tom Chiverton <tc...@extravision.com>.
Where would this be configured? I'm creating the Solr core by just doing
"solr/bin/solr create_core -c nutch"
Should I be feeding it a special schema file somehow?
Tom
On 14/10/16 14:39, Markus Jelsma wrote:
> Your digest field is configured as multi valued, which should not be the case.
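A core created with plain `create_core` uses Solr's default data-driven configset, which guesses field definitions from incoming documents and can leave `digest` multi-valued. One way to correct it in place is Solr's Schema API; this is a sketch, assuming the core name `nutch` from the command above and a plain `string` type for the field:

```shell
# Sketch: redefine the digest field as single-valued via Solr's Schema API.
# Assumes Solr on localhost:8983 and a core named "nutch".
curl -X POST -H 'Content-type:application/json' \
  http://localhost:8983/solr/nutch/schema -d '{
    "replace-field": {
      "name": "digest",
      "type": "string",
      "stored": true,
      "indexed": true,
      "multiValued": false
    }
  }'
```

Alternatively, copying the schema.xml shipped in Nutch's conf/ directory into the core's configuration before indexing avoids the guessing entirely. Documents already indexed under the old definition would need reindexing either way.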
RE: Nutch 2, Solr 5 - solrdedup causes ClassCastException:
Posted by Markus Jelsma <ma...@openindex.io>.
According to the source:
https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/indexer/solr/SolrDeleteDuplicates.java#L233
Your digest field is configured as multi valued, which should not be the case.
M.
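As a quick check of that diagnosis (again assuming the core name `nutch`), the field's live definition can be queried through the Schema API:

```shell
# Sketch: show how Solr currently defines the digest field.
# "multiValued": true in the response would confirm the cause.
curl "http://localhost:8983/solr/nutch/schema/fields/digest"
```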
-----Original message-----
> From:Tom Chiverton <tc...@extravision.com>
> Sent: Friday 14th October 2016 15:31
> To: user@nutch.apache.org
> Subject: Nutch 2, Solr 5 - solrdedup causes ClassCastException: