Posted to solr-user@lucene.apache.org by Laura McCord <lm...@ucmerced.edu> on 2014/03/24 19:33:06 UTC

Indexer: java.io.IOException: Job failed!

Hi,

I’m trying to integrate Solr with Nutch. I performed all of the necessary steps, but after Nutch completes the crawl, the indexing step fails with a "Connection refused" error:

2014-03-24 11:42:43,062 INFO  indexer.IndexerMapReduce - IndexerMapReduce: crawldb: TestCrawl/crawldb
2014-03-24 11:42:43,062 INFO  indexer.IndexerMapReduce - IndexerMapReduce: linkdb: TestCrawl/linkdb
2014-03-24 11:42:43,062 INFO  indexer.IndexerMapReduce - IndexerMapReduces: adding segment: TestCrawl/segments/20140324113941
2014-03-24 11:42:43,304 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-03-24 11:42:43,942 INFO  anchor.AnchorIndexingFilter - Anchor deduplication is: off
2014-03-24 11:42:44,456 INFO  indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2014-03-24 11:42:44,465 INFO  solr.SolrUtils - Authenticating as: <my username>
2014-03-24 11:42:44,483 INFO  solr.SolrMappingReader - source: content dest: content
2014-03-24 11:42:44,483 INFO  solr.SolrMappingReader - source: title dest: title
2014-03-24 11:42:44,483 INFO  solr.SolrMappingReader - source: host dest: host
2014-03-24 11:42:44,483 INFO  solr.SolrMappingReader - source: segment dest: segment
2014-03-24 11:42:44,483 INFO  solr.SolrMappingReader - source: boost dest: boost
2014-03-24 11:42:44,484 INFO  solr.SolrMappingReader - source: digest dest: digest
2014-03-24 11:42:44,484 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
2014-03-24 11:42:44,484 INFO  solr.SolrMappingReader - source: url dest: id
2014-03-24 11:42:44,484 INFO  solr.SolrMappingReader - source: url dest: url
2014-03-24 11:42:44,616 INFO  solr.SolrIndexWriter - Indexing 22 documents
2014-03-24 11:42:44,704 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,704 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,707 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,707 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,707 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,707 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,708 INFO  solr.SolrIndexWriter - Indexing 22 documents
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused
2014-03-24 11:42:44,709 INFO  httpclient.HttpMethodDirector - Retrying request
2014-03-24 11:42:44,715 WARN  mapred.LocalJobRunner - job_local319933392_0001
java.io.IOException
	at org.apache.nutch.indexwriter.solr.SolrIndexWriter.makeIOException(SolrIndexWriter.java:173)
	at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:159)
	at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:118)
	at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:44)
	at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:467)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:535)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
	at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:155)
	... 6 more
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at java.net.Socket.connect(Socket.java:478)
	at java.net.Socket.<init>(Socket.java:375)
	at java.net.Socket.<init>(Socket.java:249)
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:122)
	at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
	at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
	at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
	at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:422)
	... 9 more
2014-03-24 11:42:45,705 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!


My Solr instance is installed on Tomcat and is protected using tomcat-users.xml. I read that I should set the following authentication properties in the nutch-default.xml file:

solr.server.url
solr.auth
solr.auth.username
solr.auth.password

It appears that my username is being used for authentication; however, the connection is still refused.

Any ideas?

Thanks in advance,
Laura
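
A note on the trace above: "Connection refused" is raised when the TCP socket is opened, before any HTTP request (and therefore any authentication exchange) reaches the server, so it usually means nothing is listening at the host and port given in solr.server.url. The "Authenticating as" log line most likely only shows that credentials were configured. Below is a minimal Python sketch for checking reachability separately from Nutch; can_connect is an illustrative helper, and the host and port shown are hypothetical placeholders, not values taken from this thread's configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical values; substitute the host and port from solr.server.url.
    host, port = "localhost", 8080
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If this reports the port as unreachable, the problem is the address or port rather than the username and password.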

Re: Indexer: java.io.IOException: Job failed!

Posted by Laura McCord <lm...@ucmerced.edu>.
So the problem might be that I’m running Solr on Tomcat on port 8080. Is there a way to resolve this so I can run the command successfully?

Thanks,
 Laura
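
If Solr is served by Tomcat on port 8080, the value of solr.server.url would presumably need to name that port explicitly, because an http URL without a port resolves to port 80. The following Python sketch shows which port a given URL actually targets; effective_port is an illustrative helper and the URLs are hypothetical examples, not the values from this setup.

```python
from urllib.parse import urlsplit

# Standard default ports for the two schemes handled here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    for url in ("http://localhost/solr", "http://localhost:8080/solr"):
        print(url, "->", effective_port(url))
```

Comparing the result with the port Tomcat is listening on is a quick way to spot a mismatch before rerunning the indexing job.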


