Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2012/06/15 15:51:43 UTC

[jira] [Created] (LUCENE-4147) rollback/preparecommit thread hazard

Robert Muir created LUCENE-4147:
-----------------------------------

             Summary: rollback/preparecommit thread hazard
                 Key: LUCENE-4147
                 URL: https://issues.apache.org/jira/browse/LUCENE-4147
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Robert Muir


Found by http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/70/

rollback should never throw this exception, since its documentation says it clears any pending commit.

But it calls closeInternal outside of any sync block, so it looks like there is a race here.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397957#comment-13397957 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

Thanks Simon; I'll re-beast.
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4147:
---------------------------------------

    Attachment: LUCENE-4147.patch

Patch, acquiring the commitLock around close and rollback, and adding
the ensureOpen before prepareCommit.  However, the test still fails
after a few hundred beasting iterations; I think this is because of
thread safety issues where one thread calls docWriter.abort while
another is flushing ... not sure how to fix that one yet.  Simon, maybe
you can have a look?
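
For readers following along, here is a minimal sketch of the locking idea in this patch, not the patch itself. ToyWriter, ClosedException and hasFile are made-up names for illustration; only commitLock, ensureOpen, prepareCommit and rollback come from the discussion. The assumption is that holding one lock for the whole of prepareCommit and rollback is enough to keep rollback from deleting a file that a prepared commit still needs.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for a writer with a two-phase commit; not Lucene's IndexWriter. */
class ToyWriter {
    /** Hypothetical stand-in for Lucene's AlreadyClosedException. */
    static class ClosedException extends IllegalStateException {
        ClosedException() {
            super("this writer is closed");
        }
    }

    // Held for the whole of prepareCommit and rollback, so the two never overlap.
    private final Object commitLock = new Object();
    private final Set<String> files = new HashSet<>(); // pretend directory listing
    private String pendingCommit;                      // file owned by the prepared commit
    private boolean closed;
    private int generation;

    private synchronized void ensureOpen() {
        if (closed) {
            throw new ClosedException();
        }
    }

    /** Writes a pending commit file, unless the writer was already rolled back. */
    String prepareCommit() {
        synchronized (commitLock) {
            ensureOpen(); // checked under commitLock, so rollback cannot slip in after it
            synchronized (this) {
                pendingCommit = "segments_" + (++generation);
                files.add(pendingCommit);
                return pendingCommit;
            }
        }
    }

    /** Clears the pending commit, deletes its file, and closes the writer. */
    void rollback() {
        synchronized (commitLock) {
            synchronized (this) {
                if (pendingCommit != null) {
                    files.remove(pendingCommit);
                    pendingCommit = null;
                }
                closed = true;
            }
        }
    }

    synchronized boolean hasFile(String name) {
        return files.contains(name);
    }
}
```

With this shape, a prepareCommit that loses the race to rollback fails with the closed exception instead of writing a file that is about to be deleted. The sketch always takes commitLock before the writer monitor, so the two locks cannot be acquired in opposite orders.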

                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397008#comment-13397008 ] 

Simon Willnauer commented on LUCENE-4147:
-----------------------------------------

I see what's happening. There is a thread that started flushing before we call rollback but finishes after we have already wiped its files. I think we have no choice here but to wait for the flushes to finish with flushControl.waitForFlush(). I will prepare a new patch tomorrow.
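
A rough sketch of what "wait for the flushes to finish" could look like, assuming a simple counter of in-flight flushes guarded by a monitor. ToyFlushControl, flushStarted, flushFinished and flushing are hypothetical names; only waitForFlush is taken from the comment above, and Lucene's real flush control does considerably more than this.

```java
/** Toy flush bookkeeping; not Lucene's DocumentsWriterFlushControl. */
class ToyFlushControl {
    private int flushingWriters; // number of flushes currently in flight

    /** Called by a flushing thread before it starts writing files. */
    synchronized void flushStarted() {
        flushingWriters++;
    }

    /** Called by a flushing thread when it is done, on success or failure. */
    synchronized void flushFinished() {
        flushingWriters--;
        if (flushingWriters == 0) {
            notifyAll(); // wake up anyone blocked in waitForFlush
        }
    }

    /** Blocks until no flush is in flight; rollback would call this before deleting files. */
    synchronized void waitForFlush() {
        boolean interrupted = false;
        while (flushingWriters > 0) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized int flushing() {
        return flushingWriters;
    }
}
```

The wait sits in a loop on the counter, so a spurious wakeup or an early notifyAll cannot let rollback proceed while a flush is still writing.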
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4147:
---------------------------------------

    Attachment: LUCENE-4147.patch

Test case that fails more easily from the bug ...
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4147:
---------------------------------------

    Attachment: fail.log

Hmm, hit a failure (verbose log attached)... haven't tried to understand it yet.
                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396772#comment-13396772 ] 

Simon Willnauer commented on LUCENE-4147:
-----------------------------------------

One more thing: I think this is a general problem that existed before, so we might need a CHANGES.TXT entry?
                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395854#comment-13395854 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

bq. mike, do we really need to acquire the commit lock?

The problem is rollback forcefully clears the pendingCommit and then deletes any files it had (alone) referenced, so if a commit is running concurrently the fsyncs will fail since the files were deleted.

Also: it doesn't really make sense to allow rollback and commit to proceed concurrently?  Why would an app need this?  Seems like we can simplify the code by making them exclusive.

bq. regarding the thread safety issue in DocWriter can you paste the trace?

Will do ... need to re-beast.
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4147:
---------------------------------------

    Attachment: fail.log

Here's the verbose output from a failure w/ the patch ... 
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4147:
------------------------------------

    Attachment: LUCENE-4147.patch

Patch fixing the DWPT issue. The problem was that we didn't close the DW before aborting. That means we didn't invalidate the thread states in DWPTThreadPool, and an already waiting thread could acquire a state before we eventually closed the DW. If that happens together with a low RAM buffer / low maxBufferedDocs, we hit an exception on flush since IFD has already deleted the files. Now that we first close and then abort, this can't happen anymore; the indexing thread gets an AlreadyClosedException instead.
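
To make the close-then-abort ordering concrete, here is a small sketch under simplifying assumptions: a pool with a single thread state, and a boolean standing in for the files that IFD deletes. ToyStatePool, ToyDocWriter, ClosedException, closeThenAbort and indexAndFlush are hypothetical names, not the classes touched by the patch.

```java
/** Toy pool with a single thread state; not Lucene's thread pool for DWPTs. */
class ToyStatePool {
    /** Hypothetical stand-in for Lucene's AlreadyClosedException. */
    static class ClosedException extends IllegalStateException {
        ClosedException() {
            super("pool is closed");
        }
    }

    private boolean busy;   // the single state is currently held
    private boolean closed; // set by close(); invalidates the state

    /** Blocks until the state is free, or fails once the pool is closed. */
    synchronized void acquire() {
        boolean interrupted = false;
        while (busy && !closed) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        if (closed) {
            throw new ClosedException(); // waiting threads end up here after close()
        }
        busy = true;
    }

    synchronized void release() {
        busy = false;
        notifyAll();
    }

    synchronized void close() {
        closed = true;
        notifyAll();
    }
}

/** Toy document writer that closes its pool before aborting. */
class ToyDocWriter {
    final ToyStatePool pool = new ToyStatePool();
    private boolean filesDeleted;

    /** Close first, then abort: no thread can pick up the state after this point. */
    void closeThenAbort() {
        pool.close();
        synchronized (this) {
            filesDeleted = true; // stands in for the file deletion done on abort
        }
    }

    /** An indexing thread's flush: needs the state, and fails if the files are gone. */
    void indexAndFlush() {
        pool.acquire(); // throws ClosedException once closeThenAbort() has run
        try {
            synchronized (this) {
                if (filesDeleted) {
                    throw new IllegalStateException("flush after files were deleted");
                }
            }
        } finally {
            pool.release();
        }
    }
}
```

Because acquire re-checks the closed flag after every wakeup, a thread that was already waiting when close ran gets the closed exception rather than a state it could flush from.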
                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396718#comment-13396718 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

I think we should commit the current patch, and then leave the issue open for the docWriter abort/flush thread safety.

The current patch should fix the Jenkins test failures we're seeing (but the new test here may still sometimes fail until we fix the abort/flush thread safety issue).
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4147:
------------------------------------

    Attachment: LUCENE-4147.patch

I feared that this would hang at some point. I moved the docWriter abort / close out of the sync block in IW rollbackInternal and have beasted the new test + all other tests for hours now. I think it is fine to move that out; there is really no need to keep the IW lock since we already hold the commit lock. I haven't seen a failure so far.
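
A sketch of why waiting outside the writer's monitor matters. The thread does not spell out the actual lock cycle from deadlock.log, so the cycle below is an assumption for illustration: a flushing thread needs the writer monitor to finish, so a rollback that waits for flushes while holding that monitor would never return. ToyRollbackWriter and all of its members are hypothetical names.

```java
/** Toy writer showing where rollback waits for flushes; not Lucene's IndexWriter. */
class ToyRollbackWriter {
    private final Object commitLock = new Object();   // serializes commit and rollback
    private final Object flushMonitor = new Object(); // guards activeFlushes
    private int activeFlushes;
    private boolean closed; // guarded by this
    private int published;  // guarded by this

    /** A flush needs the writer monitor at the end to publish its result. */
    void flush() {
        synchronized (flushMonitor) {
            activeFlushes++;
        }
        try {
            synchronized (this) {
                if (!closed) {
                    published++;
                }
            }
        } finally {
            synchronized (flushMonitor) {
                activeFlushes--;
                flushMonitor.notifyAll();
            }
        }
    }

    /** Marks the writer closed, then waits for in-flight flushes under commitLock only. */
    void rollback() {
        synchronized (commitLock) {
            synchronized (this) {
                closed = true; // short critical section, released before waiting
            }
            // Wait without holding the writer monitor, so flush() can still finish.
            synchronized (flushMonitor) {
                boolean interrupted = false;
                while (activeFlushes > 0) {
                    try {
                        flushMonitor.wait();
                    } catch (InterruptedException e) {
                        interrupted = true;
                    }
                }
                if (interrupted) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }

    int activeFlushes() {
        synchronized (flushMonitor) {
            return activeFlushes;
        }
    }
}
```

If the wait loop were moved inside the synchronized (this) block, a flush that had already bumped activeFlushes would block on the writer monitor while rollback blocked on the flush, which is the kind of hang this sketch is meant to avoid.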
                


[jira] [Resolved] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-4147.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 5.0
                   4.0
    


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13395826#comment-13395826 ] 

Simon Willnauer commented on LUCENE-4147:
-----------------------------------------

Mike, do we really need to acquire the commit lock? From my perspective it would be enough to add an ensureOpen when we assign pendingCommit (inside the sync block) so that racing threads hit AlreadyClosedException.

Regarding the thread safety issue in DocWriter, can you paste the trace?

                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398524#comment-13398524 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

OK, patch looks good, and beasting ran for 4 hours w/ no failures/hangs ... I'll commit.  Thanks Simon!
                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396853#comment-13396853 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

bq. The problem was that we didn't close the DW before aborting.

Aha!  Thanks.

I'll beast this, and add a CHANGES entry ...
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4147:
---------------------------------------

    Attachment: deadlock.log

Well, the good news is beasting didn't uncover a failure ... but the bad news is: it uncovered a deadlock/hang!!  I'm attaching thread stacks.
                


[jira] [Updated] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Simon Willnauer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4147:
------------------------------------

    Attachment: LUCENE-4147.patch

new patch that waits for running flushes in abort after all possible DWPTs are aborted. 


                


[jira] [Commented] (LUCENE-4147) rollback/preparecommit thread hazard

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398388#comment-13398388 ] 

Michael McCandless commented on LUCENE-4147:
--------------------------------------------

Thanks Simon, I'll review & beast ...
                