Posted to user@nutch.apache.org by Fred Gilmore <fg...@mail.utexas.edu> on 2010/08/19 20:57:59 UTC

solrindex, Nutch 1.0 and httpclient

I've checked the archives and the patch list, but it is still possible
I missed the answer.  My apologies if this has come up before.

I've got a Solr multicore setup which I've secured (loosely) with
path-based (admin, update) basic authentication under Tomcat.  It works
as it should from the Solr side.  However, solrindex under Nutch 1.0
appears to use a hard-coded httpclient that does not pass credentials
along with its requests, so the push into Solr fails.

My question is, has anyone else run into this and developed a
workaround?  Or has this been patched since the Nutch 1.0 general
release and I missed it?  If not, I will take the balance of my
questions (how then to leave Solr select statements open but
IP-restrict admin/update) to the solr-user list :-)

thanks,

Fred

In other words this:

$nutch solrindex http://username:password@127.0.0.1/$collection \
    $crawldir/crawldb $crawldir/linkdb $crawldir/segments/*



results in this hadoop.log:

2010-08-18 16:43:11,711 INFO  auth.AuthChallengeProcessor - basic authentication scheme selected
2010-08-18 16:43:11,720 INFO  httpclient.HttpMethodDirector - No credentials available for BASIC 'Basic Authentication'@127.0.0.1:80
2010-08-18 16:43:11,779 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Unauthorized

Unauthorized

request: http://127.0.0.1/txtell/update?wt=javabin&version=2.2
         at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:343)
         at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
         at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
         at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
         at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:69)
         at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
2010-08-18 16:43:12,134 FATAL solr.SolrIndexer - SolrIndexer: java.io.IOException: Job failed!
         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
         at org.apache.nutch.indexer.solr.SolrIndexer.indexSolr(SolrIndexer.java:73)
         at org.apache.nutch.indexer.solr.SolrIndexer.run(SolrIndexer.java:95)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
         at org.apache.nutch.indexer.solr.SolrIndexer.main(SolrIndexer.java:104)
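
For illustration only: the "No credentials available for BASIC" line suggests the user:password part of the URL never reaches the HTTP client. Below is a rough sketch of what a client would have to do with such a URL: pull out the credentials, build the Basic Authorization header value, and send the request to the URL without the credentials. This is not Nutch or SolrJ code. The class and method names are made up for the example, it uses only the Java standard library, and it assumes Java 8 or later (for java.util.Base64).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Nutch or SolrJ.
class BasicAuthSketch {

    // Returns the "user:password" part of the URL, or null if there is none.
    static String userInfo(String url) {
        return URI.create(url).getUserInfo();
    }

    // Builds the value of an HTTP "Authorization" header for Basic auth
    // from a "user:password" string.
    static String basicAuthHeader(String userInfo) {
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    // Returns the same URL with the credentials removed, which is the form
    // the request line in the log above has.
    static String stripUserInfo(String url) {
        URI u = URI.create(url);
        try {
            return new URI(u.getScheme(), null, u.getHost(), u.getPort(),
                    u.getPath(), u.getQuery(), u.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URL: " + url, e);
        }
    }

    public static void main(String[] args) {
        // Placeholder credentials, as in the command above.
        String url = "http://username:password@127.0.0.1/txtell";
        String creds = userInfo(url);
        System.out.println("Target URL:    " + stripUserInfo(url));
        if (creds != null) {
            System.out.println("Authorization: " + basicAuthHeader(creds));
        }
    }
}
```

Whether the cleanest fix is a header like this or registering the credentials with the underlying HTTP client is a separate question, and one I have not tested against Nutch 1.0.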