Posted to dev@nutch.apache.org by aditya naga hemanth kumar <ad...@gmail.com> on 2007/08/20 14:26:59 UTC

How to get results without a query based on the date

Hi everyone,

I've been working on a news system similar to Google News. We have
a set of news web pages that are crawled using nutch-0.9. I cluster the
pages using the carrot-clustering plugin. To get the recent pages I need to
search by the date of the page. I modified the query-parser plugins of
Nutch, following those of Lucene, so that they support queries like
[date:yyyymmdd-yyyymmdd queryterm], where queryterm is the query.

But a news system will not have any queries; it just displays the latest
news articles in clusters from the whole index. I want to retrieve the
latest news pages based on their date.
I index the date of every document as a Lucene field.

My requirement is that I give just the date range [date:yyyymmdd-yyyymmdd] as
the query, with no query term other than the date, and get back the documents
whose date falls in the specified range. But Nutch isn't giving any
results. It requires a query term.

Can anyone help me get the documents based on their
modified date when no query term is given?

Any help in this regard would be greatly appreciated.

Thanks in anticipation
Aditya Veluguri

Re: How to get results without a query based on the date

Posted by qi wu <ch...@gmail.com>.
Take a look at the index-more and query-more plugins, which support indexing and searching based on date information. You can easily customize your own based on that code. I have implemented my own plugin successfully.
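
For illustration only, here is a minimal sketch of the behaviour asked for in the question: a query consisting solely of a date range, with hits returned newest first. It uses plain Java and an in-memory list instead of Nutch or Lucene, so the class DateOnlyQuery, the Page type, and the search method are all hypothetical names, not part of either library. It assumes dates are stored as yyyymmdd strings, which compare correctly as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateOnlyQuery {

    // Hypothetical stand-in for an indexed page: a URL plus a yyyymmdd date field.
    public static class Page {
        public final String url;
        public final String date;

        public Page(String url, String date) {
            this.url = url;
            this.date = date;
        }
    }

    // Accepts exactly "date:yyyymmdd-yyyymmdd" with no additional query term.
    private static final Pattern RANGE = Pattern.compile("^date:(\\d{8})-(\\d{8})$");

    // Returns {start, end} from the query string, or throws if it is malformed.
    public static String[] parseRange(String query) {
        Matcher m = RANGE.matcher(query.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected date:yyyymmdd-yyyymmdd, got: " + query);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Keeps every page whose date lies in the inclusive range, newest first.
    public static List<Page> search(List<Page> index, String query) {
        String[] range = parseRange(query);
        List<Page> hits = new ArrayList<>();
        for (Page p : index) {
            // Fixed-width yyyymmdd strings sort the same way as the dates they encode.
            if (p.date.compareTo(range[0]) >= 0 && p.date.compareTo(range[1]) <= 0) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparing((Page p) -> p.date).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<Page> index = Arrays.asList(
                new Page("http://example.com/a", "20070818"),
                new Page("http://example.com/b", "20070820"),
                new Page("http://example.com/c", "20070701"));
        for (Page p : search(index, "date:20070815-20070820")) {
            System.out.println(p.date + " " + p.url);
        }
    }
}
```

In a real Nutch setup the range check would be done by the index rather than by a loop over all pages; the sketch only shows that a date range alone is enough to select and order the results.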
