Posted to java-user@lucene.apache.org by xin liu <li...@yahoo.com> on 2008/07/11 00:07:41 UTC

how to get total hit count for each Searchable?

Hi,
I have individual indexes for Audio, Image and PDF files. We build common meta fields for them. When I search for a string, I want the search by default to return mixed results from these 3 indexes, ranked by relevancy. But I also want to know the hit count for each individual index type. For example, I want to get:
Mixed together total hit count: 105, with the first 10 HitItem.
Total hit in Audio: 73
Total hit in Image: 17
Total hit in PDF: 15

Right now, I'm doing it the following way:
1. Get one Searchable instance for the Audio index, one for Image, and one for PDF;
2. Construct a ParallelMultiSearcher with the above 3 Searchables as parameters, and call its search to get the total hit count and the first 10 hit items;
3. Call the Audio Searchable to get the total hit count for Audio;
4. Call the Image Searchable to get the total hit count for Image;
5. Call the PDF Searchable to get the total hit count for PDF.

So Lucene needs to do 6 search operations for these 3 indexes. Performance will definitely be an issue.
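
The six searches above could in principle be reduced to three: search each index once, keep its hit count, and merge the scored hits to pick the overall top 10. The following is a minimal sketch of that bookkeeping in plain Java, not Lucene code. The `Hit` and `Merged` types and the `merge` method are hypothetical names made up for illustration, and the sketch assumes the scores from the separate indexes are comparable. That assumption may not hold in practice: as far as I know, MultiSearcher-style searchers combine term statistics across the sub-indexes before scoring, which independent searches would not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; none of these types are Lucene classes. */
class PerIndexCounts {

    /** One scored hit, tagged with the name of the index it came from. */
    static final class Hit {
        final String source;
        final int doc;
        final float score;

        Hit(String source, int doc, float score) {
            this.source = source;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Per-index totals, the overall total, and the merged top-N hits. */
    static final class Merged {
        final Map<String, Integer> countsBySource;
        final int total;
        final List<Hit> top;

        Merged(Map<String, Integer> countsBySource, int total, List<Hit> top) {
            this.countsBySource = countsBySource;
            this.total = total;
            this.top = top;
        }
    }

    /** Takes the hits of one search per index and derives all requested numbers. */
    static Merged merge(Map<String, List<Hit>> hitsBySource, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        List<Hit> all = new ArrayList<>();
        int total = 0;
        for (Map.Entry<String, List<Hit>> e : hitsBySource.entrySet()) {
            counts.put(e.getKey(), e.getValue().size()); // per-index hit count
            total += e.getValue().size();                // running overall count
            all.addAll(e.getValue());
        }
        // Highest score first; assumes scores are comparable across indexes.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        List<Hit> top = new ArrayList<>(all.subList(0, Math.min(n, all.size())));
        return new Merged(counts, total, top);
    }

    public static void main(String[] args) {
        Map<String, List<Hit>> results = new LinkedHashMap<>();
        results.put("audio", Arrays.asList(new Hit("audio", 0, 0.9f), new Hit("audio", 1, 0.4f)));
        results.put("image", Arrays.asList(new Hit("image", 0, 0.7f)));
        results.put("pdf", new ArrayList<Hit>());
        Merged m = merge(results, 2);
        System.out.println(m.total + " " + m.countsBySource);
    }
}
```

Collecting every hit from every index is only reasonable for modest result sets; for large ones, each per-index search would keep just its own top 10 plus a counter.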

Can any experts give me some advice? Thanks!

Tony

Re: how to get total hit count for each Searchable?

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 11, 2008, at 1:13 PM, xin liu wrote:
> I have individual indexes for Audio, Image and PDF files. We
> build common meta fields for these different data types. When I
> search for a string, I want the search to return mixed search
> results from these 3 different indexes based on relevancy. So I use
> the ParallelMultiSearcher class to do the search. But I also want to
> know the hit count for each individual index type. For
> example, I want to get:
>    Mixed together total hit count: 103, with the first 10 HitItem.
>    Total hit in Audio: 73
>    Total hit in Image: 17
>    Total hit in PDF: 13
>
> Right now, I'm doing it the following way:
> 1. Get one Searchable instance for the Audio index, one for Image,
> and one for PDF;
> 2. Construct a ParallelMultiSearcher with the above 3 Searchables as
> parameters, and call its search to get the total hit count and the
> first 10 hit items;
> 3. Call the Audio Searchable to get the total hit count for Audio;
> 4. Call the Image Searchable to get the total hit count for Image;
> 5. Call the PDF Searchable to get the total hit count for PDF.
>
> So Lucene needs to do 6 search operations for these 3 indexes.
> Performance will definitely be an issue.
>
> Any better solution for this? Thanks!

Solr - <http://lucene.apache.org/solr> - features faceting along the  
lines of your needs.

However, Solr does not currently support ParallelMultiSearcher, though
it does support distributed searching across sharded Solr instances.
Under the covers, Solr is simply a Lucene index. I don't see any reason
Solr couldn't be enhanced to support ParallelMultiSearcher, but right
now it only uses a single file-based IndexSearcher.
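
Faceting in this sense amounts to grouping the matching documents by a field value and counting each group, rather than running one search per category. A minimal sketch of the idea in plain Java follows. It is not Solr's implementation or API, and it assumes all documents live in one index and carry a hypothetical `type` field (audio, image, pdf); the `doc` and `facetCounts` helpers are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of facet counting; not Solr or Lucene code. */
class FacetCountSketch {

    /** Builds a stand-in document: just a map of field name to value. */
    static Map<String, String> doc(String type, String title) {
        Map<String, String> fields = new HashMap<>();
        fields.put("type", type);   // assumed field naming the data type
        fields.put("title", title);
        return fields;
    }

    /** Counts matching documents per distinct value of the given field. */
    static Map<String, Integer> facetCounts(List<Map<String, String>> matches, String field) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> d : matches) {
            String value = d.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum); // one pass, one counter per value
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> matches = new ArrayList<>();
        matches.add(doc("audio", "interview"));
        matches.add(doc("pdf", "manual"));
        matches.add(doc("audio", "podcast"));
        System.out.println(facetCounts(matches, "type"));
    }
}
```

The sum of the per-type counts is the overall hit count, so a single query yields both the mixed total and the breakdown.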

	Erik



how to get total hit count for each Searchable?

Posted by xin liu <li...@yahoo.com>.
Hi,

I have individual indexes for Audio, Image and PDF files. We build common meta fields for these different data types. When I search for a string, I want the search to return mixed search results from these 3 different indexes based on relevancy, so I use the ParallelMultiSearcher class to do the search. But I also want to know the hit count for each individual index type. For example, I want to get:
    Mixed together total hit count: 103, with the first 10 HitItem.
    Total hit in Audio: 73
    Total hit in Image: 17
    Total hit in PDF: 13

Right now, I'm doing it the following way:
1. Get one Searchable instance for the Audio index, one for Image, and one for PDF;
2. Construct a ParallelMultiSearcher with the above 3 Searchables as parameters, and call its search to get the total hit count and the first 10 hit items;
3. Call the Audio Searchable to get the total hit count for Audio;
4. Call the Image Searchable to get the total hit count for Image;
5. Call the PDF Searchable to get the total hit count for PDF.

So Lucene needs to do 6 search operations for these 3 indexes. Performance will definitely be an issue.

Is there any better solution for this? Thanks!

Tony