Posted to user@nutch.apache.org by Matthias Paul <ma...@gmail.com> on 2010/06/01 17:08:55 UTC

solrdedup crashes if digest-field not compiled

Hi,

if I run bin/nutch solrdedup <SOLR_URL> it crashes with
java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:388)
    at org.apache.hadoop.io.Text.set(Text.java:178).

I suppose this happens because my Solr index contains not only documents
from Nutch but also records from a database, so not all records have the
digest field populated.
What can I do?
Isn't there a way to override the query that Nutch sends to Solr
(id:[* TO *])?

Thanks
Matthias

Re: solrdedup crashes if digest-field not compiled

Posted by Matthias Paul <ma...@gmail.com>.
> Dedup will not work without digest field. Perhaps we can extend solrdedup
> so it skips all documents with a digest field. Will that work for you?

You mean skip all documents *without* a digest field?
Yes, that would work.
But wouldn't it be better for performance reasons to query only for
documents that already have the field populated?
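
A minimal sketch of the query-side idea, not actual Nutch code: in Solr's
query syntax, a range query of the form field:[* TO *] matches only documents
that have a value for that field, so replacing id:[* TO *] with
digest:[* TO *] would restrict the result set to records carrying a digest.
The field name "digest" is taken from this thread; the class, the helper
methods, and the base URL are hypothetical and only illustrate how such a
request could be assembled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DigestQuery {
    // Field name as used in this thread; adjust it if your schema differs.
    static final String DIGEST_FIELD = "digest";

    // Open-ended range query: matches only documents that have a value for the field.
    static String hasFieldQuery(String field) {
        return field + ":[* TO *]";
    }

    // Hypothetical helper: builds a select URL for a given Solr base URL.
    static String selectUrl(String solrBase, String query, int rows) {
        try {
            String q = URLEncoder.encode(query, "UTF-8");
            return solrBase + "/select?q=" + q + "&rows=" + rows;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The base URL is an assumption for illustration only.
        String url = selectUrl("http://localhost:8983/solr", hasFieldQuery(DIGEST_FIELD), 10);
        System.out.println(url);
    }
}
```

Filtering in the query keeps records without a digest from ever reaching the
dedup job, which is the performance argument made above.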

Matthias

Re: solrdedup crashes if digest-field not compiled

Posted by Doğacan Güney <do...@gmail.com>.
Hi,

On Tue, Jun 1, 2010 at 18:08, Matthias Paul <ma...@gmail.com> wrote:

> Hi,
>
> if I run bin/nutch solrdedup <SOLR_URL> it crashes with
> java.lang.NullPointerException
>    at org.apache.hadoop.io.Text.encode(Text.java:388)
>    at org.apache.hadoop.io.Text.set(Text.java:178).
>
> I suppose this happens because in my solr-index there are not only
> documents
> from nutch but also from a database. So not all records have the
> digest-field compiled.
> What can I do?
> Isn't there the possibility to override the query which Nutch sends to Solr
> id:[* TO *]?
>
>
Dedup will not work without digest field. Perhaps we can extend solrdedup so
it skips all documents with a digest field. Will that work for you?
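
A rough sketch of the skipping approach, under the assumption that the
NullPointerException comes from a null digest value being handed to
Text.set (the stack trace appears consistent with that, but this is not
confirmed here). Plain maps stand in for Solr documents, and the class and
method names are hypothetical; this is not the solrdedup implementation.
Records that lack a digest value are passed over before any further
processing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkipMissingDigest {
    // Returns only the documents that carry a non-null "digest" value.
    static List<Map<String, Object>> withDigest(List<Map<String, Object>> docs) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Object digest = doc.get("digest");
            if (digest == null) {
                // For example, a record indexed from a database rather than by Nutch.
                continue;
            }
            kept.add(doc);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample records for illustration only.
        Map<String, Object> crawled = new LinkedHashMap<>();
        crawled.put("id", "http://example.com/");
        crawled.put("digest", "abc123");

        Map<String, Object> fromDatabase = new LinkedHashMap<>();
        fromDatabase.put("id", "db-42");

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(crawled);
        docs.add(fromDatabase);

        System.out.println(withDigest(docs).size());
    }
}
```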


> Thanks
> Matthias
>



-- 
Doğacan Güney