Posted to issues@zookeeper.apache.org by "anaud (Jira)" <ji...@apache.org> on 2020/09/25 03:21:00 UTC

[jira] [Updated] (ZOOKEEPER-3947) truncate in FileTxnLog.java is buggy and fails to correctly truncate a file containing a single transaction only the follower saw

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anaud updated ZOOKEEPER-3947:
-----------------------------
    Description: A follower that was previously the leader may hold a transaction that only it saw, stored as the single entry in the log file from the last epoch. When this follower tries to sync with the new leader, it receives a TRUNC request. In the truncate method of FileTxnLog.java, however, the FileTxnIterator fast-forwards to the log file containing the sole transaction, and pos points to that transaction, which is the very one that should be truncated. As a result, calling setLength(pos) does not discard it. In addition, itr.goToNextLog() does not return true in this case if the log file was the last one. Consequently, the transaction remains in the follower's log and gets applied to the follower's in-memory data tree. Therefore, convergence fails.

> truncate in FileTxnLog.java is buggy and fails to correctly truncate a file containing a single transaction only the follower saw
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3947
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3947
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.7.0
>            Reporter: anaud
>            Priority: Major
>
> A follower that was previously the leader may hold a transaction that only it saw, stored as the single entry in the log file from the last epoch. When this follower tries to sync with the new leader, it receives a TRUNC request. In the truncate method of FileTxnLog.java, however, the FileTxnIterator fast-forwards to the log file containing the sole transaction, and pos points to that transaction, which is the very one that should be truncated. As a result, calling setLength(pos) does not discard it. In addition, itr.goToNextLog() does not return true in this case if the log file was the last one. Consequently, the transaction remains in the follower's log and gets applied to the follower's in-memory data tree. Therefore, convergence fails.
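
The failure mode described above can be modeled in isolation. The following is a minimal sketch, not ZooKeeper's FileTxnLog code: it assumes a hypothetical record layout (8-byte zxid, 4-byte payload length, payload) and assumes one reading of the report, namely that pos is taken after the reader has already consumed the sole transaction. All class and method names here are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical model of a one-record transaction log; not ZooKeeper's format.
class TruncateSketch {

    // Appends one record: 8-byte zxid, 4-byte payload length, payload bytes.
    static void append(File f, long zxid, byte[] payload) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.seek(raf.length());
            raf.writeLong(zxid);
            raf.writeInt(payload.length);
            raf.write(payload);
        }
    }

    // Assumed buggy behaviour: the position is read after the first record
    // has been consumed, so it lies past the record that should be removed.
    static long positionAfterFirstRecord(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.readLong();              // zxid
            int len = raf.readInt();     // payload length
            raf.skipBytes(len);          // payload
            return raf.getFilePointer();
        }
    }

    // Alternative: the offset where the first record with zxid > targetZxid
    // starts, or the file length if no such record exists.
    static long offsetOfFirstRecordBeyond(File f, long targetZxid) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            while (raf.getFilePointer() < raf.length()) {
                long start = raf.getFilePointer();
                long zxid = raf.readLong();
                int len = raf.readInt();
                if (zxid > targetZxid) {
                    return start;
                }
                raf.skipBytes(len);
            }
            return raf.length();
        }
    }

    // Truncates the file to pos and returns the resulting length in bytes.
    static long truncateTo(File f, long pos) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(pos);
            return raf.length();
        }
    }

    // Creates a temporary log file holding a single 3-byte-payload record.
    static File newLogWithSoleTxn(long zxid) throws IOException {
        File f = File.createTempFile("log-sketch", ".bin");
        f.deleteOnExit();
        append(f, zxid, new byte[] {1, 2, 3});
        return f;
    }

    public static void main(String[] args) throws IOException {
        long soleZxid = 0x200000001L;      // transaction only the old leader saw
        long truncateZxid = 0x100000005L;  // zxid the new leader asks to truncate to

        File a = newLogWithSoleTxn(soleZxid);
        long kept = truncateTo(a, positionAfterFirstRecord(a));
        System.out.println("truncate at reader position: " + kept + " bytes remain");

        File b = newLogWithSoleTxn(soleZxid);
        long cut = truncateTo(b, offsetOfFirstRecordBeyond(b, truncateZxid));
        System.out.println("truncate at record start: " + cut + " bytes remain");
    }
}
```

In this model the first truncation leaves all 15 bytes of the record in place, while the second leaves 0 bytes. Whether the actual fix should compute the offset this way or delete the file outright is not stated in the report.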



--
This message was sent by Atlassian Jira
(v8.3.4#803005)