You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Vladislav Pyatkov (Jira)" <ji...@apache.org> on 2022/11/22 13:55:00 UTC

[jira] [Assigned] (IGNITE-18050) Possible phantom reads during sorted index scan

     [ https://issues.apache.org/jira/browse/IGNITE-18050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladislav Pyatkov reassigned IGNITE-18050:
------------------------------------------

    Assignee: Vladislav Pyatkov

> Possible phantom reads during sorted index scan
> -----------------------------------------------
>
>                 Key: IGNITE-18050
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18050
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Assignee: Vladislav Pyatkov
>            Priority: Major
>              Labels: ignite-3
>
> According to transaction protocol IEP [1], locking protocol for scan operation over sorted index looks as follows (we assume the forward scans):
> // scan
> If the currentKey is bigger than the upper bound, return NOT_FOUND but take a lock anyway.
> If currentKey matches the upper bound no need to take a lock on the next key.
> If the upper bound is null, take a lock on +INF.
> If currentKey is not found, take a lock on +INF.
> S_commit(currentKey)
> // insert
> IX_short(nextKey) // released after the insertion
> X_commit(currentKey) // acquired before releasing IX_short
> For example, scan(2, false, 4, false) must take S locks on keys 3 and 5. Note that this will falsely lock the insertion of key 4, but we can’t do anything here due to the chosen conservative approach. 
> Also we are going to prevent phantom reads by peeking next key before taking the lock.
> Consider table *t* with field *a* and sorted index:
> 3 -> rowId(3)
> 6 -> rowId(6)
> Three concurrent transactions are running, below each line is a moment in time (assuming select actually performs scan operation):
> {code:java}
> tx1                                             | tx2                  | tx3
> performs select * from t where a >= 3 and a <= 6| performs insert(5)   | performs insert(4)
> next(): currentKey = 3                          |                      |
> S_commit(3)                                     |                      |
> peek(): nextKey = 6                             |                      |
> // hangs for a while                            |                      |
>                                                 | IX_short(6)          |
>                                                 | X_commit(5)          |
>                                                 | commit()             |
>                                                 |                      | IX_short(5)
>                                                 |                      | X_commit(4)
>                                                 |                      | commit()
> S_commit(6)                                     |                      |
> next(): currentKey = 5                          |                      |
> ...                                             |                      |
> select returns following:                       |                      |
> 3                                               |                      |
> 5                                               |                      |
> 6                                               |                      |
>                                                 |                      |
> the same select is repeated.                    |                      |
> now it returns:                                 |                      |
> 3                                               |                      |
> 4                                               |                      |
> 5                                               |                      |
> 6                                               |                      |
> {code}
> We see a phantom read for second select within tx1, which violates serializability.
> So we should either repeat peeking next key in a loop until next() operation returns the key that was actually S_locked, or take IX_short lock on previous key during insert. The former doesn't seem to be efficient.
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-91%3A+Transaction+protocol



--
This message was sent by Atlassian Jira
(v8.20.10#820010)