You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Denis Chudov (Jira)" <ji...@apache.org> on 2022/12/21 09:02:00 UTC
[jira] [Commented] (IGNITE-18050) Possible phantom reads during sorted index scan
[ https://issues.apache.org/jira/browse/IGNITE-18050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17650459#comment-17650459 ]
Denis Chudov commented on IGNITE-18050:
---------------------------------------
[~v.pyatkov] lgtm.
> Possible phantom reads during sorted index scan
> -----------------------------------------------
>
> Key: IGNITE-18050
> URL: https://issues.apache.org/jira/browse/IGNITE-18050
> Project: Ignite
> Issue Type: Bug
> Reporter: Denis Chudov
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
> Time Spent: 40m
> Remaining Estimate: 0h
>
> According to transaction protocol IEP [1], locking protocol for scan operation over sorted index looks as follows (we assume the forward scans):
> // scan
> If the currentKey is bigger than the upper bound, return NOT_FOUND but take a lock anyway.
> If currentKey matches the upper bound no need to take a lock on the next key.
> If the upper bound is null, take a lock on +INF.
> If currentKey is not found, take a lock on +INF.
> S_commit(currentKey)
> // insert
> IX_short(nextKey) // released after the insertion
> X_commit(currentKey) // acquired before releasing IX_short
> For example, scan(2, false, 4, false) must take S locks on keys 3 and 5. Note that this will falsely lock the insertion of key 4, but we can’t do anything here due to the chosen conservative approach.
> Also we are going to prevent phantom reads by peeking next key before taking the lock.
> Consider table *t* with field *a* and sorted index:
> 3 -> rowId(3)
> 6 -> rowId(6)
> Three concurrent transactions are running, below each line is a moment in time (assuming select actually performs scan operation):
> {code:java}
> tx1 | tx2 | tx3
> performs select * from t where a >= 3 and a <= 6| performs insert(5) | performs insert(4)
> next(): currentKey = 3 | |
> S_commit(3) | |
> peek(): nextKey = 6 | |
> // hangs for a while | |
> | IX_short(6) |
> | X_commit(5) |
> | commit() |
> | | IX_short(5)
> | | X_commit(4)
> | | commit()
> S_commit(6) | |
> next(): currentKey = 5 | |
> ... | |
> select returns following: | |
> 3 | |
> 5 | |
> 6 | |
> | |
> the same select is repeated. | |
> now it returns: | |
> 3 | |
> 4 | |
> 5 | |
> 6 | |
> {code}
> We see a phantom read for second select within tx1, which violates serializability.
> So we should either repeat peeking next key in a loop until next() operation returns the key that was actually S_locked, or take IX_short lock on previous key during insert. The former doesn't seem to be efficient.
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-91%3A+Transaction+protocol
--
This message was sent by Atlassian Jira
(v8.20.10#820010)