Posted to jira@kafka.apache.org by "Jason Gustafson (Jira)" <ji...@apache.org> on 2020/11/30 20:02:00 UTC

[jira] [Updated] (KAFKA-10778) Stronger log fencing after write failure

     [ https://issues.apache.org/jira/browse/KAFKA-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Gustafson updated KAFKA-10778:
------------------------------------
    Description: 
If a log append operation fails with an IO error, the broker attempts to fail the log dir that the log resides in. Currently this is done asynchronously, so there is no guarantee that additional appends won't be attempted before the log is fenced. This can be a problem for EOS (exactly-once semantics) because of the need to maintain consistent producer state. An append goes through the following steps:

1. Iterate through batches to build producer state and collect completed transactions
2. Append the batches to the log 
3. Update the offset/timestamp indexes
4. Update log end offset
5. Apply individual producer state to `ProducerStateManager`
6. Update the transaction index
7. Update completed transactions and advance LSO

One example of how this process can go wrong is if the index updates in step 3 fail. In this case, the batches appended in step 2 are already in the log, but the producer state they carry has not yet been applied to `ProducerStateManager` (step 5). If the append is retried before the log is fenced, the retried batches are not recognized as duplicates and the log can end up with duplicate records. Other failure modes are probably possible as well.
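
To make the failure concrete, here is a toy model of the sequence described above. It is only a sketch under simplifying assumptions: `ToyLog`, `Batch`, `lastSequence` and `failIndexUpdate` are hypothetical names invented for illustration, not Kafka classes, and the duplicate check is reduced to a single sequence comparison per producer.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the append path. None of these names are Kafka's real classes.
class ToyLog {
    // A batch reduced to the two fields that matter for duplicate detection.
    static final class Batch {
        final long producerId;
        final int sequence;

        Batch(long producerId, int sequence) {
            this.producerId = producerId;
            this.sequence = sequence;
        }
    }

    final List<Batch> log = new ArrayList<>();               // what step 2 writes to
    final Map<Long, Integer> lastSequence = new HashMap<>();  // stand-in for ProducerStateManager
    boolean failIndexUpdate = false;                          // injects a failure in step 3

    void append(Batch batch) throws IOException {
        // Step 1: validate the batch against the in-memory producer state.
        Integer last = lastSequence.get(batch.producerId);
        if (last != null && batch.sequence <= last) {
            return; // recognized as a duplicate, nothing is written
        }
        // Step 2: append the batch to the log.
        log.add(batch);
        // Step 3: update the indexes; this is where the simulated IO error occurs.
        if (failIndexUpdate) {
            throw new IOException("simulated index update failure");
        }
        // Step 5: apply the producer state. Never reached if step 3 failed.
        lastSequence.put(batch.producerId, batch.sequence);
    }
}

class ToyLogDemo {
    public static void main(String[] args) {
        ToyLog log = new ToyLog();
        ToyLog.Batch batch = new ToyLog.Batch(1L, 0);

        log.failIndexUpdate = true;
        try {
            log.append(batch);
        } catch (IOException e) {
            // The producer sees an error and retries the same batch.
        }

        log.failIndexUpdate = false; // the retry arrives before the log is fenced
        try {
            log.append(batch);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        System.out.println("copies in log: " + log.log.size());
    }
}
```

Running `ToyLogDemo` reports two copies of the same batch: the first attempt wrote it in step 2 but never reached step 5, so the retry passed the duplicate check.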

I'm sure we can come up with some way to fix this specific case, but the general fencing approach is slippery enough that we'll have a hard time convincing ourselves that it handles all potential cases. It would be simpler to add synchronous fencing logic for the case when an append fails due to an IO error. For example, we can set a flag to indicate that the log is closed for additional read/write operations.
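
A minimal sketch of what such a flag could look like, assuming all appends and reads are serialized by a single lock. `FencedLog`, `failNextWrite` and the choice of `IllegalStateException` are hypothetical and only illustrate the idea; this is not how Kafka's log class is actually written.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of synchronous fencing. Not Kafka's actual log implementation.
class FencedLog {
    private final Object lock = new Object();
    private final List<String> records = new ArrayList<>();
    private boolean fenced = false; // guarded by lock
    boolean failNextWrite = false;  // hook to simulate an IO error

    void append(String record) throws IOException {
        synchronized (lock) {
            // Reject the operation if an earlier append already failed.
            if (fenced) {
                throw new IllegalStateException("log is fenced after an earlier IO error");
            }
            try {
                write(record);
            } catch (IOException e) {
                // Fence before the lock is released, so no later append can
                // observe the partially updated state.
                fenced = true;
                throw e;
            }
        }
    }

    List<String> read() {
        synchronized (lock) {
            if (fenced) {
                throw new IllegalStateException("log is fenced after an earlier IO error");
            }
            return new ArrayList<>(records);
        }
    }

    boolean isFenced() {
        synchronized (lock) {
            return fenced;
        }
    }

    // Stand-in for the multi-step append: the data is written, then a later step fails.
    private void write(String record) throws IOException {
        records.add(record);
        if (failNextWrite) {
            throw new IOException("simulated failure after the write");
        }
    }
}
```

Because the flag is set inside the same critical section that performed the failed write, a retried append is rejected immediately instead of racing with the asynchronous log dir failure.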

  was:
If a log operation fails with an IO error, the broker attempts to fail the log dir that it resides in. Currently this is done asynchronously, which means there is no guarantee that additional appends won't be attempted before the log is fenced. This can be a problem for EOS because of the need to maintain consistent producer state.

1. Iterate through batches to build producer state and collect completed transactions
2. Append the batches to the log 
3. Update the offset/timestamp indexes
4. Update log end offset
5. Apply individual producer state to `ProducerStateManager`
6. Update the transaction index
7. Update completed transactions and advance LSO

One example of how this process can go wrong is if the index updates in step 3 fail. In this case, the log will contain updated producer state which has not been reflected in `ProducerStateManager`. If the append is retried before the log is fenced, then we can have duplicates. There are probably other potential failures that are possible as well.

I'm sure we can come up with some way to fix this specific case, but the general fencing approach is slippery enough that we'll have a hard time convincing ourselves that it handles all potential cases. It would be simpler to add synchronous fencing logic for the case when an append fails due to an IO error. For example, we can mark a flag to indicate that the log is closed for additional read/write operations.


> Stronger log fencing after write failure
> ----------------------------------------
>
>                 Key: KAFKA-10778
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10778
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Priority: Major
