Posted to user@nutch.apache.org by Maria Sifniotis <se...@yahoo.com> on 2008/07/08 17:09:28 UTC

browsing query at Servlet level

Hello all!

Newish Nutch user here, so I ask for your patience!

I have set up Nutch, fiddled with some things I wanted changed, and got my index and servlet all working fine. Query results return fine; all is set.

My question is: instead of a text query against the content of a page, is it possible to do a browsing function? I'll illustrate because I may not be making any sense.

One of my objectives is to see if a webpage has images and to indicate that as a field in the indexing phase, such as
doc.add(new Field("hasImgs", "YES", Field.Store.YES, Field.Index.TOKENIZED));
This works OK!

Suppose I harvest 10 web pages; 5 of those have images and 5 don't. How can I direct my bean.search to look at the hasImgs field for the value YES and then display only those pages?

I don't care about text-based queries here; I just want to see how many of my indexed pages have a YES in their respective field. I can do this with Luke, but I need it to be in the Tomcat application.

Any clues?

Thank you very much!

Maria


      

Re: browsing query at Servlet level

Posted by Maria Sifniotis <se...@yahoo.com>.
Hi John, thanks for the answer!

I was looking at the addRequiredTerm() method the other day, but from what I saw, does it not require a query term anyway, in the sense of a keyword to be searched for in the content?

Maybe I did not understand it very well from the docs. I'll have a look tomorrow at work and update.

Cheers again for the help,

Maria



--- On Tue, 7/8/08, John Thompson <jo...@gmail.com> wrote:

> From: John Thompson <jo...@gmail.com>
> Subject: Re: browsing query at Servlet level
> To: nutch-user@lucene.apache.org, senthais@yahoo.com
> Date: Tuesday, July 8, 2008, 1:05 PM
> Hi Maria,
> 
> If I understand what you want correctly, I think you want to use the
> addRequiredTerm() method of the Query object before you use your query
> in the search:
>
> addRequiredTerm(String term, String field)
>           Add a required term in a specified field.
> 
> http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/Query.html
> 
> HTH,
> John
> 


      

Re: browsing query at Servlet level

Posted by John Thompson <jo...@gmail.com>.
Hi Maria,

If I understand what you want correctly, I think you want to use the
addRequiredTerm() method of the Query object before you use your query in the
search:

addRequiredTerm(String term, String field)
          Add a required term in a specified field.

http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/Query.html#addRequiredTerm%28java.lang.String,%20java.lang.String%29

http://lucene.apache.org/nutch/apidocs/org/apache/nutch/searcher/Query.html

HTH,
John
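
To make the idea behind addRequiredTerm(term, field) concrete, here is a minimal stand-in sketch in plain Java. It does not use Nutch or Lucene at all: the class RequiredTermSketch, its methods, and the example.com URLs are hypothetical and exist only for illustration. It also assumes that pages without images simply carry no hasImgs field, which the thread does not state. What it shows is that the "term" is matched against the named field only, so no keyword from the page content is involved, and the number of hits is the count of pages with images.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in model only: hypothetical class, NOT a Nutch or Lucene type.
final class RequiredTermSketch {

    // One indexed page, reduced to a map of field name -> stored value.
    // Assumption: pages without images get no "hasImgs" field at all.
    static Map<String, String> page(String url, boolean hasImages) {
        Map<String, String> fields = new HashMap<>();
        fields.put("url", url);
        if (hasImages) {
            fields.put("hasImgs", "YES");
        }
        return fields;
    }

    // Keep only the pages whose given field holds exactly the given term.
    // The argument order (term, field) mirrors addRequiredTerm(term, field).
    static List<Map<String, String>> requireTerm(
            List<Map<String, String>> pages, String term, String field) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> p : pages) {
            if (term.equals(p.get(field))) {
                hits.add(p);
            }
        }
        return hits;
    }

    // Ten sample pages; every second one is marked as having images.
    static List<Map<String, String>> samplePages() {
        List<Map<String, String>> pages = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            pages.add(page("http://example.com/page" + i, i % 2 == 0));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits =
                requireTerm(samplePages(), "YES", "hasImgs");
        System.out.println(hits.size() + " pages have images");
        for (Map<String, String> hit : hits) {
            System.out.println(hit.get("url"));
        }
    }
}
```

In the real application the filtering would be done by the index rather than by a loop, with the Query carrying the required term ("YES") for the field ("hasImgs") being passed to bean.search; whether the field is searchable as-is in a given Nutch setup is something to verify against the docs.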
