Posted to issues@activemq.apache.org by "Jamie goodyear (JIRA)" <ji...@apache.org> on 2018/10/08 16:06:00 UTC

[jira] [Commented] (AMQ-7067) KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different data files.

    [ https://issues.apache.org/jira/browse/AMQ-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642067#comment-16642067 ] 

Jamie goodyear commented on AMQ-7067:
-------------------------------------

Hi Gary,

The crux of this patch is to prevent the PageFile from being GC'd, so that on recovery the Commit/Rollback can be processed as per the existing logic. No changes to the existing processors are required.

Please note that the problem was happening with non-XA transactions as well, hence the addition of non-XA unit tests. The commit/rollback would get lost via GC; by updating the code to mark those indexes so that they are not GC'd, the behavior is fixed for both XA and non-XA transactions. Adding a new structure and logic to track the TX lifecycle would not touch the non-XA transaction case.
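
To illustrate the idea of marking files as not eligible for GC, here is a minimal Java sketch. It is not the patch itself: the class name, the method names, and the rule for when a file stays pinned (while the file holding the matching prepare is still being kept) are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from the patch or from KahaDB.
final class PinnedOutcomeSketch {

    // Each entry is {prepareFileId, outcomeFileId} for a transaction whose
    // commit/rollback record landed in a different data file than its prepare.
    private final List<int[]> splitTransactions = new ArrayList<>();

    // Remember where the prepare and its commit/rollback were written.
    void recordOutcome(int prepareFileId, int outcomeFileId) {
        if (prepareFileId != outcomeFileId) {
            splitTransactions.add(new int[] {prepareFileId, outcomeFileId});
        }
    }

    // Remove from the GC candidates every file that holds a commit/rollback
    // whose prepare lives in a file that is NOT itself about to be collected.
    Set<Integer> filterGcCandidates(Set<Integer> candidates) {
        Set<Integer> result = new TreeSet<>(candidates);
        for (int[] tx : splitTransactions) {
            boolean prepareFileSurvives = !candidates.contains(tx[0]);
            if (prepareFileSurvives) {
                result.remove(tx[1]); // keep the outcome so recovery can match it
            }
        }
        return result;
    }
}
```

With a prepare recorded in file 1 and its commit in file 2, a candidate set containing only file 2 comes back empty, while a candidate set containing both files is left unchanged. Because the sketch does not look at the transaction type, the same rule covers XA and non-XA transactions.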

> KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different data files.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-7067
>                 URL: https://issues.apache.org/jira/browse/AMQ-7067
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: KahaDB, XA
>    Affects Versions: 5.15.6
>            Reporter: Jamie goodyear
>            Priority: Critical
>             Fix For: 5.16.0, 5.15.7
>
>         Attachments: amq7067test.patch
>
>
> KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different pagefiles.
> Scenario:
> An XA transaction is started, and a message is prepared and sent into the broker.
> We then send enough messages into the broker to fill the page file (100 messages with 512 * 1024 characters in the message payload). This forces a new pagefile to be created.
> Commit the XA transaction. The commit will land on the new page file.
> Restart the broker.
> Upon restart, a KahaDB recovery is executed.
> The prepare in PageFile 1 is not matched to the commit on PageFile 2; as such, it will appear in recovered message state.
> Looking deeper into this scenario, it appears that the commit message is GC'd, hence the prepare & commit cannot be matched.
> The MessageDatabase only checks the following for GC:
> // Don't GC files referenced by in-progress tx
> if (inProgressTxRange[0] != null) {
>     for (int pendingTx = inProgressTxRange[0].getDataFileId(); pendingTx <= inProgressTxRange[1].getDataFileId(); pendingTx++) {
>         gcCandidateSet.remove(pendingTx);
>     }
> }
> We need to become aware of where the prepare & commits occur in pagefiles with respect to GCing files.
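
The gap in the quoted check can be shown with a small, self-contained Java model. This is a simplified sketch, not the real MessageDatabase code: data file ids are plain integers, the in-progress range is an int array (or null when no transaction is in progress), and the class and method names are hypothetical. It also assumes that the data file holding the prepare is kept alive for some other reason, so only the file holding the commit is a GC candidate.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Simplified, hypothetical model of the quoted GC check; not the real MessageDatabase code.
final class GcCandidateSketch {

    // Mirrors the quoted loop: drop every data file id between the first and last
    // in-progress transaction locations from the GC candidates.
    // inProgressTxRange is null when no transaction is in progress.
    static Set<Integer> applyInProgressCheck(Set<Integer> candidates, int[] inProgressTxRange) {
        Set<Integer> result = new TreeSet<>(candidates);
        if (inProgressTxRange != null) {
            for (int id = inProgressTxRange[0]; id <= inProgressTxRange[1]; id++) {
                result.remove(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prepared but not yet committed: the transaction is in progress in data
        // file 1, so file 1 is protected even if it is otherwise a candidate.
        Set<Integer> beforeCommit = new TreeSet<>(Arrays.asList(1));
        System.out.println(applyInProgressCheck(beforeCommit, new int[] {1, 1}));

        // After the commit lands in data file 2 the transaction is no longer in
        // progress, so the check protects nothing and file 2 remains a candidate.
        Set<Integer> afterCommit = new TreeSet<>(Arrays.asList(2));
        System.out.println(applyInProgressCheck(afterCommit, null));
    }
}
```

If file 2 is then collected while file 1 survives, a later recovery sees the prepare without its commit, which matches the dangling transaction described in the scenario.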


