Posted to dev@lucene.apache.org by Robert Newson <rn...@connected.com> on 2005/06/23 22:38:59 UTC
Hits not serializable.
Can Hits be made serializable?
I'm finding that almost all of the time for a remote search is spent
lazily retrieving document objects.
I'd like to create a remote interface with a method like:
Hits search(Query query, Filter filter, int prefetch)
The remote end would call Hits.doc() for the first $prefetch entries.
This will make a huge difference to remote searching performance:
total  fetch  server1  server2  server3
  862    699       86       69       96
For now, I'll use Document[] as the return value, but Hits feels more
natural.
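As a rough illustration of the prefetch idea, here is a minimal sketch that does not use any Lucene classes. StoredDoc, PrefetchedHits and PrefetchDemo are hypothetical stand-ins invented for this example: the server loads the first `prefetch` matches eagerly and returns them, together with the total hit count, as one serializable object, so the client pays for a single round trip instead of one call per document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stored document: just an id and a body. */
class StoredDoc implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final String body;

    StoredDoc(int id, String body) {
        this.id = id;
        this.body = body;
    }
}

/** Hypothetical serializable result: total hit count plus the prefetched documents. */
class PrefetchedHits implements Serializable {
    private static final long serialVersionUID = 1L;
    final int totalHits;
    final List<StoredDoc> docs;

    PrefetchedHits(int totalHits, List<StoredDoc> docs) {
        this.totalHits = totalHits;
        this.docs = docs;
    }
}

class PrefetchDemo {
    /** Server side: copy the first `prefetch` matches into the result eagerly. */
    static PrefetchedHits search(List<StoredDoc> matches, int prefetch) {
        int n = Math.min(prefetch, matches.size());
        return new PrefetchedHits(matches.size(), new ArrayList<>(matches.subList(0, n)));
    }

    /** Simulates the wire: the whole result is serialized and deserialized once. */
    static PrefetchedHits roundTrip(PrefetchedHits hits) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(hits);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PrefetchedHits) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        List<StoredDoc> matches = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            matches.add(new StoredDoc(i, "message " + i));
        }
        PrefetchedHits page = roundTrip(search(matches, 25));
        System.out.println(page.docs.size() + " of " + page.totalHits + " hits prefetched");
    }
}
```

Whether the real return type is Hits or Document[], the point is the same: everything the first page needs travels in one response.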
B.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: Hits not serializable. (bulk document retrieval)
Posted by Robert Newson <rn...@connected.com>.
Thanks for the suggestion. I have solved this problem locally, but I'm
wondering whether this should be in Lucene core.
I have seven machines in a rack, each holding Lucene indexes of about 30
million messages. I'm trying to search across them with
RemoteSearcher and ParallelMultiSearcher.
Search times are impressive: only hundreds of milliseconds (for
multi-term queries).
Unfortunately, in order for the search to be useful, I need to pull back
a page's worth of hits. In my case this is the first 25 results.
With the current out-of-the-box API this causes 50 sequential RMI calls,
which seriously degrades the total time that the client must wait for a
response.
ParallelMultiSearcher itself is pretty reasonable, though I have my own
re-implementation using the java.util.concurrent framework. However, the
Lucene API is simply not optimised for retrieving Documents in bulk.
Obviously we can all work around it in different ways, but I feel that
it should be core functionality.
Searchable could have a bulk retrieval method, and ParallelMultiSearcher
should be able to execute it *in parallel* against each underlying searcher.
I've implemented it locally. If anyone feels that this addresses a
genuine problem, let me know.
In short, should Lucene provide an efficient document paging facility,
or is it not considered core?
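To make the bulk retrieval idea concrete, here is a minimal sketch using java.util.concurrent. It is not the local implementation mentioned above and it uses no Lucene classes: BulkSearchable, ParallelBulkFetcher and BulkFetchDemo are hypothetical names, and documents are represented as plain strings. Each underlying searcher receives one bulk request, and all requests run concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical interface: one call returns many documents (plain strings here). */
interface BulkSearchable {
    List<String> docs(int[] ids);
}

class ParallelBulkFetcher {
    /**
     * Issues one bulk request per searcher, all concurrently, and
     * concatenates the results in searcher order.
     */
    static List<String> fetchAll(List<BulkSearchable> searchers, List<int[]> idsPerSearcher) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, searchers.size()));
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int i = 0; i < searchers.size(); i++) {
                final BulkSearchable searcher = searchers.get(i);
                final int[] ids = idsPerSearcher.get(i);
                tasks.add(() -> searcher.docs(ids));
            }
            List<String> all = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("bulk fetch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}

class BulkFetchDemo {
    /** In-memory stand-in for a remote index; a real one would sit behind RMI. */
    static BulkSearchable fakeServer(final String name) {
        return ids -> {
            List<String> out = new ArrayList<>();
            for (int id : ids) {
                out.add(name + ":" + id);
            }
            return out;
        };
    }

    public static void main(String[] args) {
        List<BulkSearchable> servers = new ArrayList<>();
        servers.add(fakeServer("server1"));
        servers.add(fakeServer("server2"));
        List<int[]> ids = new ArrayList<>();
        ids.add(new int[] {3, 7});
        ids.add(new int[] {12});
        System.out.println(ParallelBulkFetcher.fetchAll(servers, ids));
    }
}
```

With this shape, a page of 25 hits spread over several servers costs at most one remote call per server, and the client waits roughly as long as the slowest server rather than the sum of all calls.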
B.
P.S. I'm using a CVS snapshot of Lucene 1.9.
Nrupal Akolkar wrote:
> Hi,
> Try the following:
> 1. Write a subclass of the class that contains the search(...) method you
> listed, and declare that subclass serializable.
> 2. Have the subclass override search(...) with the same body as the
> superclass method.
> 3. Rebuild your Lucene 1.** file with ant and try it out the way you want.
> I think this will solve your problem.
> Nrupal
Re: Hits not serializable.
Posted by Nrupal Akolkar <nr...@gmail.com>.
Hi,
Try the following:
1. Write a subclass of the class that contains the search(...) method you
listed, and declare that subclass serializable.
2. Have the subclass override search(...) with the same body as the
superclass method.
3. Rebuild your Lucene 1.** file with ant and try it out the way you want.
I think this will solve your problem.
Nrupal