You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2007/12/12 12:09:43 UTC

[jira] Created: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Add insertWithOverflow to PriorityQueue
---------------------------------------

                 Key: LUCENE-1089
                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
             Project: Lucene - Java
          Issue Type: Improvement
    Affects Versions: 2.3
            Reporter: Shai Erera
             Fix For: 2.3


This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Assigned: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1089:
------------------------------------------

    Assignee: Michael McCandless

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>             Fix For: 2.3
>
>         Attachments: PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1089:
---------------------------------------

    Attachment: LUCENE-1089.take3.patch

Patch looks good; I made a few changes (attached take3):

  * Put back the minScore optimization in TopDocCollector.collect &
    FuzzyQuery, but, using reusableSD.score as the bar.  This saves
    changing reusableSD & calling insertWithOverflow for every
    candidate.  Then I put back TopFieldDocCollector.collect since it
    can't make that same optimization.

  * I implemented PriorityQueue.insert by calling insertWithOverflow
    (so we keep the tricky insert logic as single source).

  * I changed reusableSD in TopDocCollector to private (it was
    protected -- was there a reason?).

  * Made another small optimization to PriorityQueue that saves the if
    statement in top() (this was #2 in Nadav's additional suggestions).

  * Small javadoc fixes.



> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551227 ] 

Shai Erera commented on LUCENE-1089:
------------------------------------

1. Actually there was why I removed the minScore. There's no point in evaluating minScore with current score because it's been evaluated again in insertWithOverflow. You don't lose anything by just populating reusableSD and calling insertWithOverflow. For that reason I removed collect() from TopFieldDocCollector. To me it looks like a cleaner code, and gain, performance wise, you only make the same comparison twice.

2. That's a good one.

3. Initially I thought TopFieldDocCollector would use it as FieldDoc (which extends ScoreDoc) and using TopDocCollector's collect() method. Then I implemented a protected newScoreDoc(doc, score) which was overidden by TopFieldDocCollector. That way, if TFDC uses TDC's collect() method, all it needs to do is override newScoreDoc(doc, score) to return a FieldDoc.

4. Initially I implemented top() the same as you did, but then there were tests that created a PQ of size 0. However, with the change to initialize, it is safe to do it.

I think that TDC and TFDC as I put them in the patch are extended cleaner and that way we don't do any extra logic (that's the reason why you would have needed to write the comment in TFDC's collect() method about simply comparing the score. Please re-consider my changes.

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551462 ] 

Michael McCandless commented on LUCENE-1089:
--------------------------------------------

{quote}
1. Actually there was why I removed the minScore. There's no point in evaluating minScore with current score because it's been evaluated again in insertWithOverflow. You don't lose anything by just populating reusableSD and calling insertWithOverflow. For that reason I removed collect() from TopFieldDocCollector. To me it looks like a cleaner code, and gain, performance wise, you only make the same comparison twice.
{quote}

But, imagine a query that has 55K hits to be collected (the avg. from
your tests).

With the original patch, you saved a tiny number of allocations (~70
in your tests) yet added 55K inits of recycledSD and 55K additional
method calls, which is surely a net loss of performance for the
"typical" query.  We set out here to improve performance of searching,
but I think the original patch does the reverse.

The proposed (modified) patch, puts back the original optimization
(well, close to it: the original "minScore" is actually the minScore
in the PQ) so we don't have the extra 55K MM inits+method call.

I suppose we could also just make the API addition to PriorityQueue,
but not change how TDC and TFDC and FQ call PriorityQueue.


> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael Busch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch updated LUCENE-1089:
----------------------------------

         Priority: Minor  (was: Major)
    Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551510 ] 

Michael McCandless commented on LUCENE-1089:
--------------------------------------------

OK, thanks.  I plan to commit in a day or two.

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-1089:
-------------------------------

    Attachment: PriorityQueue.patch

The fixes done are:
TopDocCollector:
1. Use insertWithOverflow instead of insert.
2. Get rid of unnecessary member variables (numHits and minScore).

TestPriorityQueue:
1. Add testInsertWithOverflow.

PriorityQueue:
1. Add insertWithOverflow
2. Change heap to protected.
3. user heap[1] in insert and insertWithOverflow instead of calling top().

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>             Fix For: 2.3
>
>         Attachments: PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Updated: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-1089:
-------------------------------

    Attachment: PriorityQueue-2.patch

This one modifies other classes that called insert() with calls to insertWithOverflow():
TopFieldDocCollector
QualityQueriesFinder
FuzzyQuery

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1089.
----------------------------------------

       Resolution: Fixed
    Lucene Fields: [New, Patch Available]  (was: [Patch Available, New])

I just committed this.  Thanks Shai!

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1089) Add insertWithOverflow to PriorityQueue

Posted by "Shai Erera (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551501 ] 

Shai Erera commented on LUCENE-1089:
------------------------------------

You're right, I wasn't thinking in those terms. Your changes should remain.

> Add insertWithOverflow to PriorityQueue
> ---------------------------------------
>
>                 Key: LUCENE-1089
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1089
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3
>            Reporter: Shai Erera
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: LUCENE-1089.take3.patch, PriorityQueue-2.patch, PriorityQueue.patch
>
>
> This feature proposes to add an insertWithOverflow to PriorityQueue so that callers can reuse the objects that are being dropped off the queue. Also, it changes heap to protected for easier extensibility of PQ

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org