Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/06/13 20:42:51 UTC

[jira] [Created] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Optimize runs forever if you keep deleting docs at the same time
----------------------------------------------------------------

                 Key: LUCENE-3197
                 URL: https://issues.apache.org/jira/browse/LUCENE-3197
             Project: Lucene - Java
          Issue Type: Bug
            Reporter: Michael McCandless
            Priority: Minor
             Fix For: 3.3, 4.0


Because we "cascade" merges for an optimize... if you also delete documents while the merges are running, then the merge policy will see the resulting single segment as still not optimized (since it has pending deletes) and do a single-segment merge, and will repeat indefinitely (as long as your app keeps deleting docs).
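To make the loop concrete, here is a minimal toy model of the cascade described above. It is only a sketch: none of these names are Lucene APIs, and deletesDuringMerge is a hypothetical hook standing in for "the app deleted this many docs while one merge ran". The cap on rounds exists only so the example terminates.

```java
import java.util.function.IntSupplier;

/** Toy model only: none of these names are Lucene APIs. */
public class CascadeLoopSketch {

    /**
     * Counts how many single-segment merges run before the segment is
     * seen without pending deletes, giving up after maxRounds.
     *
     * @param deletesDuringMerge hypothetical hook: number of docs the app
     *                           deletes while one merge is running
     */
    static int mergesUntilOptimized(IntSupplier deletesDuringMerge, int maxRounds) {
        // Deletes that arrived while the initial N-to-1 merge was running.
        int pendingDeletes = deletesDuringMerge.getAsInt();
        int rounds = 0;
        // The policy treats "has pending deletes" as "not optimized" and cascades.
        while (pendingDeletes > 0 && rounds < maxRounds) {
            rounds++;
            // The merge drops the deletes it saw, but new ones may arrive meanwhile.
            pendingDeletes = deletesDuringMerge.getAsInt();
        }
        return rounds;
    }

    public static void main(String[] args) {
        // The app eventually stops deleting, so the cascade ends.
        int[] burst = {5, 2, 1, 0};
        int[] i = {0};
        System.out.println(mergesUntilOptimized(
                () -> burst[Math.min(i[0]++, burst.length - 1)], 1000)); // prints 3
        // The app never stops deleting: only the artificial cap ends the loop.
        System.out.println(mergesUntilOptimized(() -> 1, 1000)); // prints 1000
    }
}
```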

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3197:
---------------------------------------

    Component/s: core/index


[jira] [Updated] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3197:
---------------------------------------

    Attachment: LUCENE-3197.patch

Patch.



[jira] [Commented] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049786#comment-13049786 ] 

Michael McCandless commented on LUCENE-3197:
--------------------------------------------

Right, this has been the intended semantics of a background optimize for some time, i.e., when it returns it only ensures that whatever was not optimized as of when it was called has been merged away.

This already works correctly for newly added docs, meaning if you continue adding docs / flushing new segments while the optimize runs, it knows that the newly flushed segments do not have to be merged away.

But for new deletions we are not handling it correctly, which leads to the forever running merges.
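The "as of when it was called" semantics for newly flushed segments can be pictured with a small sketch. This is a toy model under stated assumptions: segments are plain strings, and the snapshot set and stillToMerge helper are hypothetical names, not IndexWriter internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: segment names are plain strings, not Lucene classes. */
public class OptimizeSnapshotSketch {

    /** Returns the live segments that an optimize started on the snapshot still has to merge. */
    static List<String> stillToMerge(Set<String> snapshot, List<String> liveSegments) {
        List<String> result = new ArrayList<>();
        for (String seg : liveSegments) {
            // Segments flushed after the optimize call are not in the snapshot.
            if (snapshot.contains(seg)) {
                result.add(seg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>(Arrays.asList("_0", "_1", "_2"));
        Set<String> snapshot = new HashSet<>(live); // taken when optimize is called
        live.add("_3"); // flushed while the optimize is running
        System.out.println(stillToMerge(snapshot, live)); // prints [_0, _1, _2]
    }
}
```

New deletions have no equivalent of this snapshot in the toy model, which is the gap described above.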



[jira] [Commented] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049109#comment-13049109 ] 

Michael McCandless commented on LUCENE-3197:
--------------------------------------------

One simple way to fix this would be to have IW disregard the MergePolicy if ever it asks to do a single-segment merge of a segment that had already been produced by merging for the current optimize call.

But... I don't really like this, as it could be that some unusual MergePolicy out there sometimes wants to do such merging.

So I think a better solution (API breaking to the MergePolicy, but that's OK because it's @experimental) is to change the segmentsToOptimize argument. Currently it's just a set recording which segments need to be optimized away. I think we should change it to a Map<String,Boolean>, where the Boolean indicates whether the segment was created by a merge in the current optimize session. Then I'll fix our MPs to not cascade in such a case.
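A rough sketch of how a merge policy might consult such a map follows. It is an illustration of the proposal only, not the attached patch or the real MergePolicy signature: the shouldMergeAgain helper is hypothetical, and "optimized" is simplified to "has no deletions".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the proposal; this is not the committed MergePolicy API. */
public class SegmentsToOptimizeSketch {

    /**
     * Decides whether a single remaining segment should be merged again.
     *
     * @param segmentsToOptimize segment name -> true if the segment was produced
     *                           by a merge in the current optimize session
     */
    static boolean shouldMergeAgain(Map<String, Boolean> segmentsToOptimize,
                                    String segment, boolean hasDeletions) {
        Boolean createdByThisOptimize = segmentsToOptimize.get(segment);
        if (createdByThisOptimize == null) {
            return false; // not part of this optimize (e.g. flushed after the call)
        }
        if (createdByThisOptimize) {
            return false; // already merged once for this call: do not cascade
        }
        return hasDeletions; // an original segment: merge it to drop its deletes
    }

    public static void main(String[] args) {
        Map<String, Boolean> segmentsToOptimize = new LinkedHashMap<>();
        segmentsToOptimize.put("_a", Boolean.FALSE); // existed when optimize was called
        segmentsToOptimize.put("_m", Boolean.TRUE);  // produced by this optimize's merge

        System.out.println(shouldMergeAgain(segmentsToOptimize, "_a", true)); // prints true
        System.out.println(shouldMergeAgain(segmentsToOptimize, "_m", true)); // prints false
        System.out.println(shouldMergeAgain(segmentsToOptimize, "_z", true)); // prints false
    }
}
```

The point of the Boolean is that the decision stays with the merge policy: IW only reports where a segment came from, and a policy that really does want to re-merge such a segment is still free to do so.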



[jira] [Assigned] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-3197:
------------------------------------------

    Assignee: Michael McCandless



[jira] [Resolved] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-3197.
----------------------------------------

    Resolution: Fixed



[jira] [Commented] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049254#comment-13049254 ] 

Hoss Man commented on LUCENE-3197:
----------------------------------

Is the possibility of a never-ending optimize in this situation (never-ending deletes) really something we need to "fix"?

I mean ... isn't this what the user should expect?  They've asked for a single segment w/o deletes, and then while we try to give it to them they keep deleting -- how is it bad that optimize doesn't stop until it's completely done?



[jira] [Commented] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049298#comment-13049298 ] 

Yonik Seeley commented on LUCENE-3197:
--------------------------------------

Regardless of whether one views this as a bug or not, I think the more useful semantics are to at least "merge all of the current segments into 1 and remove all *currently* deleted docs" (i.e. I agree with Mike).  The alternative is that optimize is dangerous in the presence of index updates (i.e. applications should discontinue updates if they call optimize).

