Posted to dev@nutch.apache.org by aditya naga hemanth kumar <ad...@gmail.com> on 2007/08/20 14:26:59 UTC
How to get results without a query based on the date
Hi everyone,
I've been working on a news system similar to Google News. We have
a set of news web pages that are crawled using nutch-0.9. I cluster the
pages using the carrot-clustering plugin. To get the recent pages I need to
search using the date of the page. I modified the query-parser plugins of
Nutch, using those of Lucene, so that they support queries like
[date:yyyymmdd-yyyymmdd queryterm] where queryterm is the query.
But a news system will not have any queries; it just displays the latest
news articles in clusters from the whole index. I want to retrieve the
latest news pages based on their date.
I index the date of every document as a Lucene field.
My requirement is that I give only the date range [date:yyyymmdd-yyyymmdd] as
the query and get back the documents whose date falls in that range, with
no query term other than the date. But Nutch isn't giving any
results. It requires a query term.
Can anyone help me get the documents based on the
modified date when no query term is given?
Any help in this regard would be greatly appreciated.
Thanks in anticipation
Aditya Veluguri
Re: How to get results without a query based on the date
Posted by qi wu <ch...@gmail.com>.
Try taking a look at the index-more and query-more plugins, which support indexing and searching based on date information. You can easily customize your own based on that code. I have implemented my own plugin successfully.
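
To make the date-only case concrete, here is a minimal sketch in plain Java of what such a lookup has to do once the date is stored as a yyyymmdd field. It is not Nutch or Lucene code: the class DateRangeSketch, the Page type, and the methods parseRange and latestInRange are hypothetical names made up for illustration. It relies only on the fact that fixed-width yyyymmdd strings compare lexicographically in the same order as the dates they encode, so a range check needs no date parsing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: plain Java, not the Nutch or Lucene API.
final class DateRangeSketch {

    // Hypothetical stand-in for an indexed page: a URL plus its date field (yyyymmdd).
    static final class Page {
        final String url;
        final String date;

        Page(String url, String date) {
            this.url = url;
            this.date = date;
        }
    }

    // Splits "yyyymmdd-yyyymmdd" into {lower, upper}; rejects anything else.
    static String[] parseRange(String range) {
        String[] parts = range.trim().split("-");
        if (parts.length != 2
                || !parts[0].matches("\\d{8}")
                || !parts[1].matches("\\d{8}")) {
            throw new IllegalArgumentException("expected yyyymmdd-yyyymmdd, got: " + range);
        }
        return parts;
    }

    // Returns the URLs of pages whose date lies in the inclusive range, newest first.
    // No query term is involved: the date range is the only selection criterion.
    static List<String> latestInRange(List<Page> pages, String range) {
        String[] bounds = parseRange(range);
        List<Page> hits = new ArrayList<>();
        for (Page p : pages) {
            // Fixed-width digit strings order the same way as the dates they encode.
            if (p.date.compareTo(bounds[0]) >= 0 && p.date.compareTo(bounds[1]) <= 0) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparing((Page p) -> p.date).reversed());
        List<String> urls = new ArrayList<>();
        for (Page p : hits) {
            urls.add(p.url);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<Page> pages = new ArrayList<>();
        pages.add(new Page("http://example.com/a", "20070818"));
        pages.add(new Page("http://example.com/b", "20070820"));
        pages.add(new Page("http://example.com/c", "20070701"));
        System.out.println(latestInRange(pages, "20070818-20070820"));
    }
}
```

In a real setup the same selection would be done by the index rather than by scanning a list, and the resulting hits would then be handed to the clustering step.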