Posted to user@nutch.apache.org by Raymond Balmès <ra...@gmail.com> on 2009/04/15 16:47:05 UTC

Problems with custom field query

Hi guys,

I'm quite new to Nutch, so please be patient.

I'm using nutch-1.0 and I get no results when I query a custom field
created by my indexing filter.
My custom field is 'TOKENIZED', but contrary to what the following thread
suggests, that does not solve the problem for me:
http://www.nabble.com/Custom-field-query-to13123141.html#a13123141

My indexing is fine: checking with Luke shows that my field is indeed there,
and I can query it as I want with Luke.

Another thread suggests applying a patch from Julien Nioche. Did that patch
make it into nutch-1.0?
http://www.nabble.com/-jira--Created%3A-%28NUTCH-563%29-Include-custom-fields-in-BasicQueryFilter-to12982495.html#a13014569
If yes, then it does not work for me.


One subtlety, though: not all documents have that field, only some (this is
how I built my indexing filter). Could that be the problem?

Also, I'm using the out-of-the-box servlet. search.jsp uses NutchBean for
searching. Could that be the issue?

Thanks for any pointers on where to look.

-Ray-

Re: Problems with custom field query

Posted by Raymond Balmès <ra...@gmail.com>.
No other answers?

Well, another question: is there a method one can call to list all the
fields that Nutch knows about at a given point? I could not find one.

-Ray-

Re: Problems with custom field query

Posted by Raymond Balmès <ra...@gmail.com>.
Just in case somebody bumps into the same issue in the future.

SOLVED... I found that my field names were causing the problem: they
contained uppercase letters and '-', so Query.parse would not find the
fields. After lowercasing them, there are no problems.

While such names are perfectly fine with Lucene/Luke or even Solr, the Nutch
query parser seems to lowercase clauses... room for improvement?
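
For readers hitting the same problem, here is a minimal sketch of the
workaround described above: picking index field names that still match after
a query parser has lowercased them. The class and method names
(FieldNameNormalizer, normalize, isQuerySafe) are hypothetical and not part
of Nutch. Dropping '-' and other non-alphanumeric characters is an
assumption made for illustration; the thread only confirms that lowercased
names worked.

```java
// Hypothetical helper, not part of Nutch: derives a field name that is
// all lowercase and contains only letters and digits.
final class FieldNameNormalizer {
    private FieldNameNormalizer() {}

    // Lowercases the name and drops every character that is not a letter
    // or digit (assumption: '-' is removed rather than replaced).
    static String normalize(String rawName) {
        String lower = rawName.toLowerCase(java.util.Locale.ROOT);
        StringBuilder out = new StringBuilder(lower.length());
        for (char c : lower.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // True when the name is non-empty and already in normalized form.
    static boolean isQuerySafe(String name) {
        return !name.isEmpty() && name.equals(normalize(name));
    }

    public static void main(String[] args) {
        System.out.println(normalize("Product-Code")); // prints "productcode"
    }
}
```

An indexing filter could pass its field names through normalize once, when
adding fields, so that the indexed name and the name typed in a query agree.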

-Ray-

Re: Problems with custom field query

Posted by Raymond Balmès <ra...@gmail.com>.
OK, I had not. I fixed nutch-site.xml and reloaded the servlet... still not
working.

I hardwired the following into my query:

query.addRequiredTerm("of","anchor")
(I checked with Luke that "of" is indeed one of the terms in anchor)
Same problem... maybe I'm missing something else.

Another strange thing is that the query parser only seems to recognize url:
fields; other fields get parsed as normal text?

-Ray-

Re: Problems with custom field query

Posted by Julien Nioche <li...@gmail.com>.
Hi Raymond,

As you can see on https://issues.apache.org/jira/browse/NUTCH-563 the patch
has been included.

Did you specify the fields to include in nutch-site.xml e.g.

<property>
  <name>query.basic.description.boost</name>
  <value>1.0</value>
  <description>Declares a custom field and its boost to be added to the
  default fields of the Lucene query.
  </description>
</property>

where "description" in the property name is the name of the field and the
value is its weight?
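
As an illustration of the naming pattern in the property above, the
following sketch builds the property name and a nutch-site.xml fragment for
an arbitrary field. The class BoostPropertySketch and the field name
"productcode" are hypothetical; only the query.basic.<field>.boost pattern
comes from this message.

```java
// Hypothetical helper, not part of Nutch: shows how the property name is
// composed from a custom field name.
final class BoostPropertySketch {
    private BoostPropertySketch() {}

    // Composes the property name following query.basic.<field>.boost.
    static String propertyName(String fieldName) {
        return "query.basic." + fieldName + ".boost";
    }

    // Renders a <property> fragment that could be pasted into nutch-site.xml.
    static String toXml(String fieldName, float boost) {
        return "<property>\n"
             + "  <name>" + propertyName(fieldName) + "</name>\n"
             + "  <value>" + boost + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        // "productcode" is an example field name chosen for illustration.
        System.out.println(toXml("productcode", 1.0f));
    }
}
```

One property of this kind would be needed per custom field that should be
searched by default.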

Julien

-- 
DigitalPebble Ltd
http://www.digitalpebble.com