Posted to user@nutch.apache.org by ahammad <ah...@gmail.com> on 2009/01/27 21:32:38 UTC

Issue with index-more query-more plugins

Hello all,

I've reached the stage where I need to index documents like msword and pdf.
So I enabled the correct plugins to do just that. I also enabled the
index-more and query-more plugins in order to index the field "subType",
which records the file type.

When I open the index in Luke, I can search, as an example, for "ISP
+subType:pdf", and sure enough, it returns all the PDF files that contain
"ISP" somewhere in them. When I do that in a fresh Nutch web interface, it
returns no results. Note that if I search for ISP without the +subType
filter, I get some results back.

What could be causing this? I'm not sure where to start looking to be
honest.

Thank you all for your help and support. Without the people on these mailing
lists, a lot of people like me would be completely lost. I hope you can help
me figure this out.

Cheers
-- 
View this message in context: http://www.nabble.com/Issue-with-index-more-query-more-plugins-tp21693809p21693809.html
Sent from the Nutch - User mailing list archive at Nabble.com.


Re: Issue with index-more query-more plugins

Posted by Doğacan Güney <do...@gmail.com>.
On Tue, Jan 27, 2009 at 10:32 PM, ahammad <ah...@gmail.com> wrote:
>
> Hello all,
>
> I've reached the stage where I need to index documents like msword and pdf.
> So I enabled the correct plugins to do just that. I also enabled the
> index-more and query-more plugins in order to index the field "subType",
> which records the file type.
>
> When I open the index in Luke, I can search, as an example, for "ISP
> +subType:pdf", and sure enough, it returns all the PDF files that contain
> "ISP" somewhere in them. When I do that in a fresh Nutch web interface, it
> returns no results. Note that if I search for ISP without the +subType
> filter, I get some results back.
>

You can run
bin/nutch org.apache.nutch.searcher.NutchBean query

(replacing "query" with your search terms) to see the hits for a query
easily from the command line.

Now, if you are using a local setup (meaning no IndexServers),
try adding a System.out.println around line 98 of IndexSearcher.java.
This way we can know the resulting Lucene query. You can then run this
query in Luke and check the results.
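
A minimal sketch of the kind of temporary debug line I mean, written so it
runs on its own. The names here (QueryDebugSketch, TranslatedQuery,
translate, translateWithDebug) are hypothetical stand-ins, not the actual
code in IndexSearcher.java; in the real file you would print the translated
Lucene query object at the point where it is created.

```java
public class QueryDebugSketch {

    /** Hypothetical stand-in for the translated query object. */
    static final class TranslatedQuery {
        private final String text;

        TranslatedQuery(String text) {
            this.text = text;
        }

        // Lucene query objects also have a readable toString(),
        // which is what makes a plain println useful for debugging.
        @Override
        public String toString() {
            return text;
        }
    }

    /**
     * Hypothetical stand-in for the translation step. This is NOT how
     * Nutch translates queries; it only gives the sketch something to print.
     */
    static TranslatedQuery translate(String userQuery) {
        return new TranslatedQuery(userQuery.trim());
    }

    /** Translates the query and prints the result before returning it. */
    static TranslatedQuery translateWithDebug(String userQuery) {
        TranslatedQuery luceneQuery = translate(userQuery);
        // The temporary debug line: show exactly what will be searched.
        System.out.println("DEBUG translated query: " + luceneQuery);
        return luceneQuery;
    }

    public static void main(String[] args) {
        translateWithDebug("ISP +subType:pdf");
    }
}
```

Whatever the real line prints is the string to paste into Luke, so you can
compare its results with what the web interface returns.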

> What could be causing this? I'm not sure where to start looking to be
> honest.
>
> Thank you all for your help and support. Without the people on these mailing
> lists, a lot of people like me would be completely lost. I hope you can help
> me figure this out.
>
> Cheers



-- 
Doğacan Güney