Posted to issues@ignite.apache.org by "Denis Chudov (Jira)" <ji...@apache.org> on 2022/09/27 14:59:00 UTC
[jira] [Updated] (IGNITE-17578) Transactions: async cleanup processing on tx commit

     [ https://issues.apache.org/jira/browse/IGNITE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denis Chudov updated IGNITE-17578:
----------------------------------
    Description: 
h3. Motivation

According to the tx commit process design, control must be returned to the outer logic right after the COMMITED/ABORTED txn state is replicated. The follow-up cleanup process, which sends replica cleanup requests to all enlisted replication groups, should be asynchronous.

Currently this is not the case:
{code:java}
/**
 * Process transaction finish request:
 * <ol>
 *     <li>Evaluate commit timestamp.</li>
 *     <li>Run specific raft {@code FinishTxCommand} command, that will apply txn state to corresponding txStateStorage.</li>
 *     <li>Send cleanup requests to all enlisted primary replicas.</li>
 * </ol>
 * This operation is NOT idempotent, because of commit timestamp evaluation.
 *
 * @param request Transaction finish request.
 * @return future result of the operation.
 */
private CompletableFuture<Object> processTxFinishAction(TxFinishRequest request) {
    HybridTimestamp commitTimestamp = hybridClock.now();

    List<String> aggregatedGroupIds = request.groups().values().stream().flatMap(List::stream).collect(Collectors.toList());

    UUID txId = request.txId();

    boolean commit = request.commit();

    CompletableFuture<Object> chaneStateFuture = raftClient.run(
            new FinishTxCommand(
                    txId,
                    commit,
                    commitTimestamp,
                    aggregatedGroupIds
            )
    );

    // TODO: https://issues.apache.org/jira/browse/IGNITE-17578
    chaneStateFuture.thenRun(
            () -> request.groups().forEach(
                    (recipientNode, replicationGroupIds) -> txManager.cleanup(
                            recipientNode,
                            replicationGroupIds,
                            txId,
                            commit,
                            commitTimestamp
                    )
            )
    );

    return chaneStateFuture;
}
{code}
In addition, the cleanup process (which is guaranteed to be idempotent) is expected to be repeated until it succeeds.
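
One possible shape of the asynchronous hand-off is sketched below. It is only an illustration under stated assumptions: {{AsyncCleanupSketch}}, {{finishThenCleanup}}, the {{cleanup}} callback and the dedicated {{cleanupExecutor}} are hypothetical names and do not come from the Ignite code base. The state-replication future is returned unchanged, and the cleanup requests are dispatched on a separate executor only after that future completes successfully.

```java
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiFunction;

// Hypothetical sketch: class, method and parameter names are illustrative, not Ignite API.
class AsyncCleanupSketch {
    /**
     * Returns the state-replication future as is; cleanup is started on a separate executor
     * and never delays the caller.
     */
    static <T> CompletableFuture<T> finishThenCleanup(
            CompletableFuture<T> changeStateFuture,
            Map<String, List<String>> groups,
            BiFunction<String, List<String>, CompletableFuture<Void>> cleanup,
            Executor cleanupExecutor
    ) {
        // Runs only if the txn state was replicated successfully, and on cleanupExecutor
        // rather than on the thread that completes the state future.
        changeStateFuture.thenRunAsync(
                () -> groups.forEach((node, groupIds) -> cleanup.apply(node, groupIds)),
                cleanupExecutor
        );

        // The caller waits for state replication only.
        return changeStateFuture;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService cleanupExecutor = Executors.newSingleThreadExecutor();
        Queue<String> cleanedNodes = new ConcurrentLinkedQueue<>();
        CompletableFuture<String> stateFuture = new CompletableFuture<>();

        CompletableFuture<String> result = finishThenCleanup(
                stateFuture,
                Map.of("node-1", List.of("group-1", "group-2")),
                (node, groupIds) -> {
                    // Stand-in for sending a cleanup request to a primary replica.
                    cleanedNodes.add(node);
                    return CompletableFuture.<Void>completedFuture(null);
                },
                cleanupExecutor
        );

        // Simulates the raft command finishing.
        stateFuture.complete("COMMITED");
        System.out.println("finish result: " + result.join());

        cleanupExecutor.shutdown();
        cleanupExecutor.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("cleaned nodes: " + cleanedNodes);
    }
}
```

In this sketch the futures returned by the cleanup callback are deliberately ignored; handling their failures is a separate concern.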
h3. Definition of Done
 * Sending cleanup requests should be implemented asynchronously.
 * Cleanup failures, including timeouts, should trigger another cleanup attempt, repeated until success. There's no failure handler currently, so retrying is the only option.
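
The retry-until-success requirement could look roughly like the following sketch. All names ({{CleanupRetrySketch}}, {{cleanupUntilSuccess}}, the {{attempt}} supplier, the timeout value) are assumptions made for illustration, not existing Ignite API. A timed-out attempt is treated like any other failure, and the retry here is immediate; when to schedule the next attempt is left out of this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch: names are illustrative, not Ignite API. Requires Java 9+ (orTimeout, failedFuture).
class CleanupRetrySketch {
    /** Repeats the cleanup attempt until one of the attempts completes successfully. */
    static CompletableFuture<Void> cleanupUntilSuccess(
            Supplier<CompletableFuture<Void>> attempt,
            long timeoutMillis,
            Executor retryExecutor
    ) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        tryOnce(attempt, timeoutMillis, retryExecutor, result);
        return result;
    }

    private static void tryOnce(
            Supplier<CompletableFuture<Void>> attempt,
            long timeoutMillis,
            Executor retryExecutor,
            CompletableFuture<Void> result
    ) {
        CompletableFuture<Void> single;
        try {
            // An attempt that does not answer in time fails with TimeoutException.
            single = attempt.get().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (RuntimeException e) {
            single = CompletableFuture.failedFuture(e);
        }

        // Hop to retryExecutor so that repeated failures do not grow the call stack.
        single.whenCompleteAsync((ignored, error) -> {
            if (error == null) {
                result.complete(null);
            } else {
                // Cleanup is idempotent, so repeating a possibly half-applied attempt is safe.
                tryOnce(attempt, timeoutMillis, retryExecutor, result);
            }
        }, retryExecutor);
    }

    public static void main(String[] args) {
        ExecutorService retryExecutor = Executors.newSingleThreadExecutor();
        AtomicInteger calls = new AtomicInteger();

        // Simulated cleanup: fails once, hangs once (runs into the timeout), then succeeds.
        Supplier<CompletableFuture<Void>> attempt = () -> {
            int n = calls.incrementAndGet();
            if (n == 1) {
                return CompletableFuture.<Void>failedFuture(new IllegalStateException("replica unavailable"));
            }
            if (n == 2) {
                return new CompletableFuture<Void>();
            }
            return CompletableFuture.<Void>completedFuture(null);
        };

        cleanupUntilSuccess(attempt, 100, retryExecutor).join();
        System.out.println("attempts: " + calls.get());
        retryExecutor.shutdown();
    }
}
```

The retry is safe only because cleanup is stated to be idempotent; the same loop around the non-idempotent finish request would not be.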

h3. Implementation Notes

It seems that a cleanup executor, properly shared between replicas, will suit us. The executor is needed so that the next cleanup attempt can be planned after a failure: the attempt should be performed not right after the failure, but after successful rehashing of replicas, when their state allows the cleanup attempt to succeed with high probability.
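
A minimal sketch of such a shared executor follows. It is an illustration under assumptions: the class name {{SharedCleanupExecutorSketch}}, its methods and the delay values are hypothetical, and a doubling delay is used only as a stand-in for the "replicas are ready again" condition, which this issue does not specify.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch: names and delay values are illustrative, not Ignite API. Requires Java 9+.
class SharedCleanupExecutorSketch {
    private final ScheduledExecutorService scheduler;
    private final long initialDelayMillis;
    private final long maxDelayMillis;

    SharedCleanupExecutorSketch(ScheduledExecutorService scheduler, long initialDelayMillis, long maxDelayMillis) {
        this.scheduler = scheduler;
        this.initialDelayMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Starts cleanup right away; each failed attempt plans the next one after a growing delay. */
    CompletableFuture<Void> submit(Supplier<CompletableFuture<Void>> attempt) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        scheduler.execute(() -> runAttempt(attempt, initialDelayMillis, result));
        return result;
    }

    private void runAttempt(Supplier<CompletableFuture<Void>> attempt, long delayMillis, CompletableFuture<Void> result) {
        CompletableFuture<Void> single;
        try {
            single = attempt.get();
        } catch (RuntimeException e) {
            single = CompletableFuture.failedFuture(e);
        }

        single.whenComplete((ignored, error) -> {
            if (error == null) {
                result.complete(null);
                return;
            }
            // Double the delay for the attempt after the next one, up to the configured cap.
            long nextDelayMillis = Math.min(delayMillis * 2, maxDelayMillis);
            // The delay is a stand-in for waiting until the replicas are likely to accept cleanup.
            scheduler.schedule(() -> runAttempt(attempt, nextDelayMillis, result), delayMillis, TimeUnit.MILLISECONDS);
        });
    }

    public static void main(String[] args) {
        // One scheduler instance is meant to be shared by all replicas on the node.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        SharedCleanupExecutorSketch cleanupExecutor = new SharedCleanupExecutorSketch(scheduler, 10, 80);
        AtomicInteger calls = new AtomicInteger();

        // Simulated cleanup that fails twice before succeeding.
        Supplier<CompletableFuture<Void>> attempt = () -> calls.incrementAndGet() < 3
                ? CompletableFuture.<Void>failedFuture(new IllegalStateException("replica not ready"))
                : CompletableFuture.<Void>completedFuture(null);

        cleanupExecutor.submit(attempt).join();
        System.out.println("attempts: " + calls.get());
        scheduler.shutdown();
    }
}
```

A real implementation would more likely trigger the next attempt from a replica state change than from a timer; the timer keeps the example self-contained.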

 

 

  was:
h3. Motivation

According to the tx commit process design, control must be returned to the outer logic right after the COMMITED/ABORTED txn state is replicated. The follow-up cleanup process, which sends replica cleanup requests to all enlisted replication groups, should be asynchronous.

Currently this is not the case:
{code:java}
/**
 * Process transaction finish request:
 * <ol>
 *     <li>Evaluate commit timestamp.</li>
 *     <li>Run specific raft {@code FinishTxCommand} command, that will apply txn state to corresponding txStateStorage.</li>
 *     <li>Send cleanup requests to all enlisted primary replicas.</li>
 * </ol>
 * This operation is NOT idempotent, because of commit timestamp evaluation.
 *
 * @param request Transaction finish request.
 * @return future result of the operation.
 */
private CompletableFuture<Object> processTxFinishAction(TxFinishRequest request) {
    HybridTimestamp commitTimestamp = hybridClock.now();

    List<String> aggregatedGroupIds = request.groups().values().stream().flatMap(List::stream).collect(Collectors.toList());

    UUID txId = request.txId();

    boolean commit = request.commit();

    CompletableFuture<Object> chaneStateFuture = raftClient.run(
            new FinishTxCommand(
                    txId,
                    commit,
                    commitTimestamp,
                    aggregatedGroupIds
            )
    );

    // TODO: https://issues.apache.org/jira/browse/IGNITE-17578
    chaneStateFuture.thenRun(
            () -> request.groups().forEach(
                    (recipientNode, replicationGroupIds) -> txManager.cleanup(
                            recipientNode,
                            replicationGroupIds,
                            txId,
                            commit,
                            commitTimestamp
                    )
            )
    );

    return chaneStateFuture;
}
{code}
In addition, the cleanup process (which is guaranteed to be idempotent) is expected to be repeated until it succeeds.
h3. Definition of Done
 * Sending cleanup requests should be implemented asynchronously.
 * Cleanup failures, including timeouts, should trigger another cleanup attempt, repeated until success. There's no failure handler currently, so retrying is the only option.

h3. Implementation Notes

It seems that a cleanup executor, properly shared between replicas, will suit us.

 

 


> Transactions: async cleanup processing on tx commit
> ---------------------------------------------------
>
>                 Key: IGNITE-17578
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17578
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3, transaction3_rw



--
This message was sent by Atlassian Jira
(v8.20.10#820010)