Posted to user@nutch.apache.org by ma...@Automationdirect.com on 2012/09/06 21:17:26 UTC
SolrDeleteDuplicates bug
Hi,
I am using Nutch 1.5.1 with Solr 3.6 and I am getting an IOException while deduplicating. The error from hadoop.log is below. It looks like NUTCH-1251 <https://issues.apache.org/jira/browse/NUTCH-1251>, which is marked as fixed in 1.6. Can someone confirm that this is the same error? Also, where can I download version 1.6? It is not available on any of the mirror sites, such as:
http://www.carfab.com/apachesoftware/nutch/
Thanks in advance.
java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Error executing query
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRecordReader(SolrDeleteDuplicates.java:236)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
    at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRecordReader(SolrDeleteDuplicates.java:234)
    ... 4 more
Caused by: org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request: http://vmdev02:8080/solr/nutch/select?q=id:[* TO *]&fl=id,boost,tstamp,digest&start=0&rows=3032&wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
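
The "request:" line in the trace can be replayed by hand to see what the server actually complains about, since the details of an HTTP 500 are typically in the Solr or servlet container log rather than in hadoop.log. Below is a minimal Python sketch that only builds the URL-encoded request. The host and core come from the log above; the helper name build_select_url and the wt=json setting are assumptions made for illustration and are not part of Nutch. A match-all query (*:*) is built as well, so the two can be compared in case the range query on id is what the server rejects.

```python
from urllib.parse import urlencode

# Base URL copied from the "request:" line in the log above.
SOLR_SELECT = "http://vmdev02:8080/solr/nutch/select"


def build_select_url(query, rows=3032, base=SOLR_SELECT):
    """Return a URL-encoded Solr select URL for the given query string."""
    params = {
        "q": query,
        "fl": "id,boost,tstamp,digest",  # same field list as the failing request
        "start": 0,
        "rows": rows,
        "wt": "json",  # assumption: JSON is easier to read by hand than javabin
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # The query sent by SolrDeleteDuplicates in the trace above.
    print(build_select_url("id:[* TO *]"))
    # Standard Solr match-all query, for comparison.
    print(build_select_url("*:*"))
```

Pasting each printed URL into a browser or curl on a machine that can reach the Solr host should show whether both queries fail or only the first one.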
Re: SolrDeleteDuplicates bug
Posted by ma...@Automationdirect.com.
Thanks Julien.
On 9/6/12 3:44 PM, "Julien Nioche" <li...@gmail.com> wrote:
>Hi,
>
>1.6 has not been released yet; it is the current trunk in SVN. See
>http://nutch.apache.org/version_control.html
Re: SolrDeleteDuplicates bug
Posted by Julien Nioche <li...@gmail.com>.
Hi,
1.6 has not been released yet; it is the current trunk in SVN. See
http://nutch.apache.org/version_control.html
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
Re: SolrDeleteDuplicates bug
Posted by ma...@Automationdirect.com.
This may help someone else: I got the jar file, nutch-1.6-SNAPSHOT.jar, from
https://builds.apache.org/job/nutch-trunk-maven/330/org.apache.nutch$nutch/
I replaced the Nutch jar in the /lib directory with it, and that fixed my issue.
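
For anyone repeating the jar swap, here is a small Python sketch of one cautious way to do it: the old jar is renamed to a .bak file rather than deleted, so the release jar can be restored if the snapshot misbehaves. The function name replace_jar and the paths in the commented call are hypothetical; the name of the jar shipped with 1.5.1 and the location of the lib directory depend on the installation.

```python
import shutil
from pathlib import Path


def replace_jar(lib_dir, old_name, new_jar):
    """Back up lib_dir/old_name as <old_name>.bak, then copy new_jar into lib_dir.

    Returns the path of the backup file.
    """
    lib_dir = Path(lib_dir)
    old_jar = lib_dir / old_name
    backup = old_jar.with_name(old_jar.name + ".bak")
    # Rename instead of deleting so the original jar can be put back later.
    # The .bak suffix also keeps it from matching a *.jar classpath pattern.
    old_jar.rename(backup)
    # Copy the new jar in under its own file name, preserving timestamps.
    shutil.copy2(new_jar, lib_dir / Path(new_jar).name)
    return backup


# Hypothetical call; adjust both the directory and the old jar name:
# replace_jar("/path/to/nutch/lib", "nutch-1.5.1.jar",
#             "/tmp/nutch-1.6-SNAPSHOT.jar")
```

Keep in mind that a snapshot jar is an unreleased trunk build, so it may differ from the eventual 1.6 release.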
On 9/6/12 3:17 PM, "Arora, Madhvi" <ma...@Automationdirect.com> wrote:
>Hi,
>
>I am using Nutch 1.5.1 with SOLR 3.6. I am getting an IOException error
>while deduping. The error in Hadoop.log file is as below. This looks like
>the Nutch error:
>NUTCH-1251<https://issues.apache.org/jira/browse/NUTCH-1251>. It says
>that the fixed version is in 1.6. Just wanted to confirm that it is this
>particular error. Also wanted to find out where to download the version
>1.6. It is not available in any of the mirror sites such as:
>http://www.carfab.com/apachesoftware/nutch/
>
>Thanks in advance.