Posted to user@lucenenet.apache.org by Richard Wilde <ri...@wildesoft.net> on 2012/11/28 22:55:02 UTC
Get number of documents
Hi, I have the following code that I inherited that searches for
"Extract:Simon AND Type:Contact". I want to do the following:
1) Find the total number of documents that match the query
2) Return only the first X IDs (from maxResults)
However, when I look at this code I can see that 10,000 is passed into the
TopScoreDocCollector.Create method, and the code loops through the hits
until it reaches maxResults, where it breaks out of the loop.
I am sure there is a better approach to this, but Google so far has not
helped me out here. Can anyone point me in the right direction? I am using
v2.9.4.2 - will v3.0.3 serve me better here?
The code I have is:-
    var indexSearch = new IndexSearcher(indexReader);
    var queryParser = new QueryParser(Version.LUCENE_29, "Extract", Analyzer);
    var special = "Simon AND Type:Contact";
    var collector = TopScoreDocCollector.Create(10000, true);
    indexSearch.Search(queryParser.Parse(special), collector);
    var hits = collector.TopDocs().ScoreDocs;
    for (var i = 0; i < hits.Length; i++)
    {
        if (i >= maxResults) break;
        var document = indexSearch.Doc(hits[i].Doc);
        luceneDocuments.Add(document);
    }
Thanks
Rippo
Re: Get number of documents
Posted by Simon Svensson <si...@devhost.se>.
Hi.
1.
TopScoreDocCollector.TopDocs
<http://lucene.apache.org/core/old_versioned_docs/versions/2_9_0/api/all/org/apache/lucene/search/TopDocsCollector.html#topDocs%28%29>
returns a TopDocs
<http://lucene.apache.org/core/old_versioned_docs/versions/2_9_0/api/all/org/apache/lucene/search/TopDocs.html>
instance, which has two properties: |scoreDocs| and |totalHits|.
You're looking for the latter.
2.
The first parameter to TopScoreDocCollector.Create is the |numHits|
parameter. Pass |maxResults| instead of 10000 if you only want the first
|maxResults| results.
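Put together, the two points might look like the minimal sketch below. It is not tested against v2.9.4.2. The ContactSearch class, the Search helper and its out parameter are hypothetical names used only for illustration, and the TotalHits property casing is an assumption based on the ScoreDocs and Doc members used in the question; depending on the Lucene.Net version, these members may instead be the lowercase fields totalHits and scoreDocs.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class ContactSearch // hypothetical wrapper class
{
    // Hypothetical helper: returns at most maxResults documents and reports
    // the total number of matching documents through totalHits.
    // Assumption: maxResults is greater than zero.
    public static List<Document> Search(
        IndexReader indexReader, Analyzer analyzer, int maxResults, out int totalHits)
    {
        var indexSearch = new IndexSearcher(indexReader);
        var queryParser = new QueryParser(Version.LUCENE_29, "Extract", analyzer);
        var query = queryParser.Parse("Simon AND Type:Contact");

        // Collect only as many hits as will actually be returned.
        var collector = TopScoreDocCollector.Create(maxResults, true);
        indexSearch.Search(query, collector);

        var topDocs = collector.TopDocs();

        // Counts every match for the query, not just the collected ones.
        // Assumption: property casing; may be topDocs.totalHits in some versions.
        totalHits = topDocs.TotalHits;

        var luceneDocuments = new List<Document>();
        foreach (var hit in topDocs.ScoreDocs)
        {
            luceneDocuments.Add(indexSearch.Doc(hit.Doc));
        }
        return luceneDocuments;
    }
}
```

Because the collector never holds more than maxResults hits, the early break in the original loop is no longer needed.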
// Simon