Posted to dev@lucene.apache.org by "Jason Rutherglen (JIRA)" <ji...@apache.org> on 2008/11/27 01:20:44 UTC

[jira] Created: (LUCENE-1471) Faster MultiSearcher.search merge docs

Faster MultiSearcher.search merge docs 
---------------------------------------

                 Key: LUCENE-1471
                 URL: https://issues.apache.org/jira/browse/LUCENE-1471
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
    Affects Versions: 2.4
            Reporter: Jason Rutherglen
            Priority: Minor


MultiSearcher.search places sorted search results from individual searchers into a PriorityQueue.  This can be made more efficient by taking advantage of the fact that the results returned by each searcher are already sorted.  

The proposed solution places an iterator over each sub-searcher's results into a custom PriorityQueue that produces the sorted ScoreDocs.
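
A minimal sketch of the idea, assuming plain Java collections rather than Lucene's own classes: each sub-searcher's already-sorted hit list gets a cursor, the cursors sit in a priority queue keyed on their current hit, and the top n hits are pulled off one at a time. The names SortedHitsMerge, Hit, Cursor and mergeTopN are hypothetical and are not taken from any of the attached patches; the tie-break on doc id is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical names throughout; this is not Lucene's API.
final class SortedHitsMerge {

  /** A scored hit; a simplified stand-in for a ScoreDoc. */
  static final class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
  }

  /** Cursor over one sub-searcher's hits, which are already sorted by descending score. */
  private static final class Cursor {
    final List<Hit> hits;
    int pos = 0;
    Cursor(List<Hit> hits) { this.hits = hits; }
    Hit current() { return hits.get(pos); }
    boolean advance() { return ++pos < hits.size(); }
  }

  /** Merges m pre-sorted hit lists and returns the top n hits overall. */
  static List<Hit> mergeTopN(List<List<Hit>> perSearcher, int n) {
    // The queue holds at most one cursor per sub-searcher, ordered by its current hit:
    // higher score first, ties broken by lower doc id (an assumed tie-break).
    PriorityQueue<Cursor> queue = new PriorityQueue<>((a, b) -> {
      int byScore = Float.compare(b.current().score, a.current().score);
      return byScore != 0 ? byScore : Integer.compare(a.current().doc, b.current().doc);
    });
    for (List<Hit> hits : perSearcher) {
      if (!hits.isEmpty()) queue.add(new Cursor(hits)); // init queue
    }
    List<Hit> merged = new ArrayList<>();
    while (merged.size() < n && !queue.isEmpty()) {
      Cursor best = queue.poll();           // take the cursor with the best current hit
      merged.add(best.current());
      if (best.advance()) queue.add(best);  // re-queue it on its next hit, if any
    }
    return merged;
  }

  public static void main(String[] args) {
    List<List<Hit>> results = Arrays.asList(
        Arrays.asList(new Hit(1, 0.9f), new Hit(2, 0.4f)),
        Arrays.asList(new Hit(10, 0.8f), new Hit(11, 0.7f), new Hit(12, 0.1f)));
    for (Hit h : mergeTopN(results, 3)) {
      System.out.println(h.doc + " " + h.score);
    }
  }
}
```

The queue never holds more than one entry per sub-searcher, so each step costs on the order of lg( m ) rather than lg( n ).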

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654755#action_12654755 ] 

Michael McCandless commented on LUCENE-1471:
--------------------------------------------

Luke, it looks like the 2nd patch lost the necessary mods to FieldDocSortedHitQueue -- can you post a new patch that includes it?  Thanks.

> Faster MultiSearcher.search merge docs 
> ---------------------------------------
>
>                 Key: LUCENE-1471
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1471
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1471.patch, multisearcher.patch, multisearcher.take2.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> MultiSearcher.search places sorted search results from individual searchers into a PriorityQueue.  This can be made more efficient by taking advantage of the fact that the results returned by each searcher are already sorted.  
> The proposed solution places an iterator over each sub-searcher's results into a custom PriorityQueue that produces the sorted ScoreDocs.



[jira] Updated: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Nezda updated LUCENE-1471:
-------------------------------

    Attachment: multisearcher.patch



[jira] Updated: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Nezda updated LUCENE-1471:
-------------------------------

    Attachment: multisearcher.take2.patch

Patch covering MultiSearcher and ParallelMultiSearcher



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654791#action_12654791 ] 

Mark Miller commented on LUCENE-1471:
-------------------------------------

Re: threads, something makes me think an approach more like IndexWriter's merge threads would be better, i.e. a max of 3 or n threads in use at once. One thread per sub-searcher worries me.



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651876#action_12651876 ] 

Luke Nezda commented on LUCENE-1471:
------------------------------------

I had a look at this code and it looks like an easy opportunity.  Here's my analysis:
  * let m = searchables.length
  * let n = nDocs
- Current performance: n * m * lg( n )
    n * m * lg( n ) + // fill queue
    n * lg( n )       // drain queue into scoreDocs[]
  if each searcher read has n worse documents than the one before it
- Possible performance: n * lg( m )
    m * lg( m ) + // init queue
    n * lg( m )   // drain & fill queue

I'll attach a patch for the {{MultiSearcher}} {{search()}} methods that supports searching both with and without {{Sort}}.  It's a little kludgy: I had to remove {{final}} from {{FieldDocSortedHitQueue}}'s {{lessThan}} method and do some casting.  All tests pass.  I doubt much search time is tied up here since this is all in-memory and n and m are usually small.
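
To get a feel for the size of the gap, the two estimates above can be plugged into a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are hypothetical, the values m = 10 and n = 100 are picked for illustration, and the numbers are operation-count estimates, not measurements.

```java
// Hypothetical helper names; the formulas are the ones from the analysis above.
final class MergeCostEstimate {

  /** Base-2 logarithm. */
  static double lg(double x) {
    return Math.log(x) / Math.log(2);
  }

  /** Current approach: n * m * lg(n) to fill the queue plus n * lg(n) to drain it. */
  static double currentCost(int m, int n) {
    return (double) n * m * lg(n) + n * lg(n);
  }

  /** Proposed approach: m * lg(m) to init the queue plus n * lg(m) to drain and refill it. */
  static double proposedCost(int m, int n) {
    return m * lg(m) + n * lg(m);
  }

  public static void main(String[] args) {
    int m = 10;   // number of sub-searchers (searchables.length)
    int n = 100;  // number of requested hits (nDocs)
    System.out.printf("current  ~ %.0f%n", currentCost(m, n));
    System.out.printf("proposed ~ %.0f%n", proposedCost(m, n));
  }
}
```

For m = 10 and n = 100 the estimates come out at roughly 7,300 versus roughly 365, which is a large ratio but a small absolute number of operations either way.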



[jira] Updated: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated LUCENE-1471:
-------------------------------------

    Attachment: LUCENE-1471.patch

LUCENE-1471.patch

Implements the MultiSearcher.search methods using a PriorityQueue of iterators over sorted ScoreDocs/FieldDocs.  



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654720#action_12654720 ] 

Luke Nezda commented on LUCENE-1471:
------------------------------------

* Simplified MultiSearcherThread
** Pulled out result merging functionality (it was serialized on hq anyway)
    and made it much more similar to the parent merge logic (actually so similar it felt a little dirty)
** Made it a non-static inner class to cut down on parameters, though after moving the merge logic this only saved the searchables[] ref.
* Made fields searchables[] and starts[] final; really the parent's versions of these same fields should probably just be protected final
* Fixed some javadoc typos
* Patch created against 724620; supersedes previous multisearcher.patch.  All tests pass.




[jira] Assigned: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1471:
------------------------------------------

    Assignee: Michael McCandless



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654520#action_12654520 ] 

Michael McCandless commented on LUCENE-1471:
--------------------------------------------


I agree performance improvement is probably smallish since m & n are
usually small; still it'd be good to improve it, especially since
we're discussing cutting over sort-by-field searching in IndexSearcher
to the MultiSearcher approach, and, sometimes m & n may not be small.

There are two different patches here.  I think the approaches are
mostly the same (ie use 2nd pqueue to extract top N merged results),
but on quick inspection there are some differences:

  * The first one shares a common source for the big switch statement
    (by extending FieldDocSortedHitQueue) on SortField.getType(), which
    is great.

  * First one passes all tests; 2nd one fails at least 3 tests (all
    due to the AUTO SortField -- what's the fix here?).

  * Code style is closer to Lucene's in the first one (opening {'s not
    on separate lines, no leading _ in variable names).

I'm sure there are other differences I'm missing.  Can you two work
together to merge the two patches into a single one?  Thanks.




[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654529#action_12654529 ] 

Jason Rutherglen commented on LUCENE-1471:
------------------------------------------

The patches seem to implement the same concept?  I'm using the 2nd one because FieldDocSortedHitQueue is not public (it should be) and some other class is final that made using the 1st patch impossible.  

If there is no performance difference then the 1st patch is less code and re-uses Lucene more so the 1st looks best.

Mike M:
"I agree performance improvement is probably smallish since m & n are
usually small; "
If results are in the hundreds then the performance matters.  With core counts
growing (because we don't have nanotech processors yet), parallel multi-threaded searching should be the norm
for systems that care about response time.  



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654712#action_12654712 ] 

Luke Nezda commented on LUCENE-1471:
------------------------------------

I will prepare a similar derivative patch that covers MultiSearcher and ParallelMultiSearcher.



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654594#action_12654594 ] 

Jason Rutherglen commented on LUCENE-1471:
------------------------------------------

Wouldn't it be good to remove BitVector and replace it with OpenBitSet?  OBS is faster and already has a DocIdSetIterator.  It just needs to implement compression of the bitset when writing to disk (d-gaps?).  This would be a big win for almost *all* searches.  We could also create an interface so that any bitset implementation could be used.  

Such as:
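{code}
import java.io.IOException;
import org.apache.lucene.store.IndexOutput;
public interface WriteableBitSet {
  // Write this bitset to the given index output.
  public void write(IndexOutput output) throws IOException;
}
{code}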



[jira] Updated: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated LUCENE-1471:
-------------------------------------

    Comment: was deleted



[jira] Commented: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654546#action_12654546 ] 

Michael McCandless commented on LUCENE-1471:
--------------------------------------------

bq. The patches seem to implement the same concept?

That's my impression.

bq. I'm using the 2nd one because FieldDocSortedHitQueue is not public (it should be) and some other class is final that made using the 1st patch impossible.

The first patch works fine, w/o making FieldDocSortedHitQueue public.

bq. If there is no performance difference then the 1st patch is less code and re-uses Lucene more so the 1st looks best.

OK, I'll go forward with the first patch.

{quote}
If results are in the hundreds then the performance matters. With core counts
growing (because we don't have nanotech processors yet), parallel multi-threaded searching should be the norm
for systems that care about response time.
{quote}

I would love to find a clean way to make Lucene's searching "naturally" concurrent, so that more cores would in fact greatly reduce the worst case latency.  Our inability to properly use concurrency on the search side to reduce a single query's latency (we can of course use concurrency to improve net throughput, today) will soon be a big limitation.  ParallelMultiSearcher ought to work, but it requires you to manually partition.  And it should pool threads or use ExecutorService.  But I don't see how this applies to this issue...
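
For illustration only, here is a rough sketch of what pooling threads via ExecutorService could look like: sub-searches are submitted as tasks to a fixed-size pool, so the number of threads stays capped no matter how many sub-searchers there are. BoundedParallelSearch and searchAll are hypothetical names, each task simply returns a hit count as a stand-in for a real per-sub-searcher search, and this is not how ParallelMultiSearcher is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; each Callable stands in for searching one sub-index.
final class BoundedParallelSearch {

  /** Runs every sub-search on the shared pool and sums the per-searcher hit counts. */
  static int searchAll(ExecutorService pool, List<Callable<Integer>> subSearches)
      throws InterruptedException, ExecutionException {
    // invokeAll blocks until all tasks have completed.
    List<Future<Integer>> futures = pool.invokeAll(subSearches);
    int total = 0;
    for (Future<Integer> f : futures) {
      total += f.get(); // rethrows a task failure as ExecutionException
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    // At most 3 threads, however many sub-searchers there are.
    ExecutorService pool = Executors.newFixedThreadPool(3);
    try {
      List<Callable<Integer>> tasks = new ArrayList<>();
      for (int i = 1; i <= 8; i++) {
        final int hits = i;      // pretend sub-searcher i found i hits
        tasks.add(() -> hits);
      }
      System.out.println("total hits: " + searchAll(pool, tasks));
    } finally {
      pool.shutdown();
    }
  }
}
```

Passing the pool in, rather than creating it per query, lets one bounded set of threads be shared across concurrent searches.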



[jira] Updated: (LUCENE-1471) Faster MultiSearcher.search merge docs

Posted by "Luke Nezda (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Nezda updated LUCENE-1471:
-------------------------------

    Attachment: multisearcher.take3.patch

Doh.  Sorry Michael, I reverted my local changes and tested this patch :).

I agree, Mark: an unbounded number of threads is a little worrisome.
