Posted to user@nutch.apache.org by ma...@Automationdirect.com on 2012/09/06 21:17:26 UTC

SolrDeleteDuplicates bug

Hi,

I am using Nutch 1.5.1 with Solr 3.6 and I am getting an IOException while deduping. The error from the hadoop.log file is below. This looks like NUTCH-1251 <https://issues.apache.org/jira/browse/NUTCH-1251>, which is marked as fixed in 1.6. I wanted to confirm that this is the same error, and to find out where to download version 1.6. It is not available on any of the mirror sites, such as:
http://www.carfab.com/apachesoftware/nutch/

Thanks in advance.



java.io.IOException: org.apache.solr.client.solrj.SolrServerException: Error executing query
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRecordReader(SolrDeleteDuplicates.java:236)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:197)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
        at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.getRecordReader(SolrDeleteDuplicates.java:234)
        ... 4 more
Caused by: org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request: http://vmdev02:8080/solr/nutch/select?q=id:[* TO *]&fl=id,boost,tstamp,digest&start=0&rows=3032&wt=javabin&version=2
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
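
One way to narrow this down is to replay the failing request outside Nutch. The sketch below only builds the URLs and does not contact the server. The host, core and field list are taken from the log above; the helper name build_select_url, the reduced rows value and wt=json are assumptions made for illustration. It also builds a variant with Solr's match-all query *:* so the two responses can be compared, for example in a browser.

```python
from urllib.parse import urlencode

# Base URL taken from the failing request in the log above.
SOLR_SELECT = "http://vmdev02:8080/solr/nutch/select"


def build_select_url(query, rows=10, base=SOLR_SELECT):
    """Build a select URL with the same field list as the failing request.

    Hypothetical helper: the name and default values are not part of Nutch.
    """
    params = [
        ("q", query),
        ("fl", "id,boost,tstamp,digest"),
        ("start", 0),
        ("rows", rows),  # assumed small value, only to keep the response short
        ("wt", "json"),  # assumed: readable output instead of javabin
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # The range query that SolrDeleteDuplicates sent, per the log.
    print(build_select_url("id:[* TO *]"))
    # Solr's match-all query, for comparison.
    print(build_select_url("*:*"))
```

If the first URL returns the same Internal Server Error and the second does not, the problem is likely in how the server handles the range query on the id field rather than in the Nutch job itself.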

Re: SolrDeleteDuplicates bug

Posted by ma...@Automationdirect.com.
Thanks Julien. 


Re: SolrDeleteDuplicates bug

Posted by Julien Nioche <li...@gmail.com>.
Hi,

1.6 has not been released yet; it is the current trunk in SVN. See
http://nutch.apache.org/version_control.html


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: SolrDeleteDuplicates bug

Posted by ma...@Automationdirect.com.
This may help someone else: I got the jar file (nutch-1.6-SNAPSHOT.jar) from
https://builds.apache.org/job/nutch-trunk-maven/330/org.apache.nutch$nutch/
and replaced the Nutch jar in the /lib directory with it. That fixed my
issue.
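
A minimal sketch of that jar swap, assuming the existing jar in the lib directory matches the pattern *nutch-*.jar. The function name, the pattern and the .bak suffix are hypothetical choices, not part of Nutch. The old jar is renamed rather than deleted so the change can be undone.

```python
import shutil
from pathlib import Path


def swap_nutch_jar(lib_dir, new_jar, backup_suffix=".bak"):
    """Move existing Nutch jars aside and copy new_jar into lib_dir.

    Hypothetical helper. Returns the list of backup paths it created.
    """
    lib_dir, new_jar = Path(lib_dir), Path(new_jar)
    backups = []
    # Assumed naming pattern for the Nutch jar already in the lib directory.
    for old in sorted(lib_dir.glob("*nutch-*.jar")):
        if old.name == new_jar.name:
            continue  # same file name; the copy below overwrites it
        backup = old.with_name(old.name + backup_suffix)
        old.rename(backup)  # no longer ends in .jar after the rename
        backups.append(backup)
    shutil.copy2(new_jar, lib_dir / new_jar.name)
    return backups
```

It could be called as swap_nutch_jar("/path/to/nutch/lib", "nutch-1.6-SNAPSHOT.jar"), where the lib path is a placeholder for the actual install location.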
