You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@activemq.apache.org by "Jamie goodyear (JIRA)" <ji...@apache.org> on 2018/10/04 15:54:00 UTC

[jira] [Comment Edited] (AMQ-7067) KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different pagefiles.

    [ https://issues.apache.org/jira/browse/AMQ-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638417#comment-16638417 ] 

Jamie goodyear edited comment on AMQ-7067 at 10/4/18 3:53 PM:
--------------------------------------------------------------

I've attached a stand alone test case to reproduce the issue.

Here's how to use the test case:

0. Setup AMQ 5.15.6 broker on local system. Use default configuration, no special settings required. Start broker.
 1. Extract AMQ7067 zip file.
2. Run XATransactionTest while Broker is live using :  mvn clean compile exec:java -Dexec.mainClass="XATransactionTest"

The test case will output:

[INFO] — exec-maven-plugin:1.6.0:java (default-cli) @ transaction —
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in [jar:file:/Users/icbts/.m2/repository/org/apache/activemq/activemq-all/5.15.6/activemq-all-5.15.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in [jar:file:/Users/icbts/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See [http://www.slf4j.org/codes.html#multiple_bindings] for an explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 createDanglingTransaction txid: XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]
 createXATransaction txid: XID:[55,globalId=000015333ec,branchId=000015333ec]
 Pending Transactions 1: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]]
  INFO | I/O exception (org.apache.http.NoHttpResponseException) caught when processing request to {}->[http://localhost:8161:] The target server failed to respond
  INFO | Retrying request to {}->[http://localhost:8161|http://localhost:8161/]
 Pending Transactions 2: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1], XID:[55,globalId=000015333ec,branchId=000015333ec]]
 [INFO] ------------------------------------------------------------------------

In the Broker log you'll see:

 INFO | KahaDB is version 6
  INFO | Recovering from the journal @1:741
  INFO | Recovery replayed 429 operations from the journal in 0.081 seconds.
  WARN | Recovered prepared XA TX: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]]
  WARN | Recovered prepared XA TX: [XID:[55,globalId=000015333ec,branchId=000015333ec]]
  
 On JConsole you'll see two messages in org.apache.activemq > Broker > RecoveredXaTransaction MBean.

Given we commit the transaction, I'd expect there to not be a message in Recovered state.

Note:  If we do not send additional messages to force a new page file, than we do not see the message in Recovered state after broker restart.

It appears to be that the prepared statement in db-1.log,. and commit in db-2.log will result in the commit getting GC'd (hence no pare to match) and having the message being seen as being in prepared state, and hence getting recovered.


was (Author: jgoodyear):
I've attached a stand alone test case to reproduce the issue.

Here's how to use the test case:

0. Setup AMQ 5.15.6 broker on local system. Use default configuration, no special settings required. Start broker.
1. Extract AMQ7067 zip file.
2. Use maven to build the test case.
3. Run XATransactionTest while Broker is live using :  mvn clean compile exec:java -Dexec.mainClass="XATransactionTest"

The test case will output:


[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ transaction ---
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/icbts/.m2/repository/org/apache/activemq/activemq-all/5.15.6/activemq-all-5.15.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/icbts/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
createDanglingTransaction txid: XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]
createXATransaction txid: XID:[55,globalId=000015333ec,branchId=000015333ec]
Pending Transactions 1: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]]
 INFO | I/O exception (org.apache.http.NoHttpResponseException) caught when processing request to {}->http://localhost:8161: The target server failed to respond
 INFO | Retrying request to {}->http://localhost:8161
Pending Transactions 2: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1], XID:[55,globalId=000015333ec,branchId=000015333ec]]
[INFO] ------------------------------------------------------------------------

In the Broker log you'll see:

 INFO | KahaDB is version 6
 INFO | Recovering from the journal @1:741
 INFO | Recovery replayed 429 operations from the journal in 0.081 seconds.
 WARN | Recovered prepared XA TX: [XID:[55,globalId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1,branchId=ffffffffffffffffffffffffffffffffffffffe9ffffffb8fffffff7ffffffa1]]
 WARN | Recovered prepared XA TX: [XID:[55,globalId=000015333ec,branchId=000015333ec]]
 
On JConsole you'll see two messages in org.apache.activemq > Broker > RecoveredXaTransaction MBean.

Given we commit the transaction, I'd expect there to not be a message in Recovered state.

Note:  If we do not send additional messages to force a new page file, than we do not see the message in Recovered state after broker restart.

It appears to be that the prepared statement in db-1.log,. and commit in db-2.log will result in the commit getting GC'd (hence no pare to match) and having the message being seen as being in prepared state, and hence getting recovered.

> KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different pagefiles.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-7067
>                 URL: https://issues.apache.org/jira/browse/AMQ-7067
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: KahaDB, XA
>    Affects Versions: 5.15.6
>            Reporter: Jamie goodyear
>            Priority: Critical
>             Fix For: 5.16.0, 5.15.7
>
>         Attachments: AMQ7067.zip
>
>
> KahaDB Recovery can experience a dangling transaction when prepare and commit occur on different pagefiles.
> Scenario:
> A XA Transaction is started, message is prepared and sent into Broker.
> We then send into broker enough messages to file page file (100 message with 512 * 1024 characters in message payload). This forces a new pagefile to be created.
> Commit the XA transaction. Commit will land on the new page file.
> Restart the Broker.
> Upon restart a KahaDB recovery is executed.
> The prepare in PageFile 1 is not matched to Commit on PageFile 2, as such, it will appear in recovered message state.
> Looking deeper into this scenario, it appears that the commit message is GC'd, hence the prepare & commit can not be matched. 
> The MessageDatabase only checks the following for GC:
> {color:#808080}// Don't GC files referenced by in-progress tx
> {color}{color:#cc7832}if {color}(inProgressTxRange[{color:#6897bb}0{color}] != {color:#cc7832}null{color}) {
>  {color:#cc7832}for {color}({color:#cc7832}int {color}pendingTx=inProgressTxRange[{color:#6897bb}0{color}].getDataFileId(){color:#cc7832}; {color}pendingTx <= inProgressTxRange[{color:#6897bb}1{color}].getDataFileId(){color:#cc7832}; {color}pendingTx++) {
>  gcCandidateSet.remove(pendingTx){color:#cc7832};
> {color} }
> }
> We need to become aware of where the prepare & commits occur in pagefiles with respect to GCing files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)