Posted to user@nutch.apache.org by Denis Sinner <de...@dkd.de> on 2012/01/24 13:25:06 UTC

Delete Duplicates Error

Hello,

I have set up a Nutch crawler and am trying to index into a Solr core that other applications also write to. The data gets indexed, but I get the following error:

SolrDeleteDuplicates: starting at 2012-01-24 12:59:43
SolrDeleteDuplicates: Solr url: http://192.168.0.47:8080/solr/core_en/
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
	at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:392)
	at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:372)
	at org.apache.nutch.crawl.Crawl.run(Crawl.java:153)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

If I index into an empty core on the same Solr server, I don't get this exception.
Any hints on how to solve it? I would be very thankful.

Thanks,

Denis

-- 

[Developer]

dkd Internet Service GmbH
development // kommunikation // design
Kaiserstraße 73
60329 Frankfurt/Main

fon:  +49 69 2475218-0
fax:  +49 69 2475218-99
e-mail: denis.sinner@dkd.de
twitter: http://twitter.com/dkd_de
facebook: http://www.facebook.com/www.dkd.de
web: http://www.dkd.de

Registration court: Amtsgericht Frankfurt am Main
Registration number: HRB 45590
Managing directors: Olivier Dobberkau, Søren Schaffstein, Götz Wegenast, Christian Zabanski

Current projects:
http://www.spielwarenmesse-eg.de – Relaunch & Responsive Design (TYPO3)
http://www.horsch.com – Relaunch Website (TYPO3)
http://www.dosb.de – Refresh Website (TYPO3)