Posted to issues@ignite.apache.org by "Alexander Lapin (Jira)" <ji...@apache.org> on 2022/10/10 09:18:00 UTC
[jira] [Updated] (IGNITE-17859) Update indexes on data modifications

     [ https://issues.apache.org/jira/browse/IGNITE-17859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Lapin updated IGNITE-17859:
-------------------------------------
    Description: 
h3. Motivation

For the sake of better performance and common sense, indexes need to be integrated into the data manipulation flow, which implies:
 * Integrating indexes into the process of efficiently resolving keys to rowIds, both within pure read/scan operations and as part of modification operations, in order to find the proper rowId of the row to be modified.
 * Integrating indexes into the data modification process itself: not only should the data be updated, but all corresponding indexes should also be populated with the updated entries, with cleanup on transaction rollback.

This Jira issue covers the second part.

*Definition* *of Done*
 * All indexes with the relevant schema version are populated with the modified rows.
 * All pending index entries are removed on tx rollback.

h3. Implementation Notes
 # It seems to make sense to introduce new Index abstractions that manage the update logic internally. Something like the following:
 ** Index
 ** HashIndex extends Index
 ** HashUniqueIndex extends HashIndex
 ** SortedIndex extends Index
 ** SortedUniqueIndex extends SortedIndex
 # To determine which indexes to update, both during the update itself and during rollback, it would be useful to add an extra parameter _schemaVersion_ to each operation enlisted in the transaction. That makes it possible to select only the set of indexes with the relevant schema version.
 # During transaction rollback, it is necessary to clean up the outdated pending entries that were touched during the tx lifetime.
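
As an illustration of the first two notes, here is a minimal sketch of the proposed hierarchy together with selection by schema version. Only the five index type names come from this description. The fields, constructors and the relevantIndexes helper are hypothetical, and binding each index to a single schema version is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: only the five index type names come from the issue description. */
final class IndexSelectionSketch {
    /** Base abstraction. Binding an index to one schema version is an assumption. */
    abstract static class Index {
        final String name;
        final int schemaVersion;

        Index(String name, int schemaVersion) {
            this.name = name;
            this.schemaVersion = schemaVersion;
        }
    }

    static class HashIndex extends Index {
        HashIndex(String name, int schemaVersion) {
            super(name, schemaVersion);
        }
    }

    static class HashUniqueIndex extends HashIndex {
        HashUniqueIndex(String name, int schemaVersion) {
            super(name, schemaVersion);
        }
    }

    static class SortedIndex extends Index {
        SortedIndex(String name, int schemaVersion) {
            super(name, schemaVersion);
        }
    }

    static class SortedUniqueIndex extends SortedIndex {
        SortedUniqueIndex(String name, int schemaVersion) {
            super(name, schemaVersion);
        }
    }

    /** Returns only the indexes whose schema version matches the one carried by the operation. */
    static List<Index> relevantIndexes(List<Index> all, int schemaVersion) {
        List<Index> res = new ArrayList<>();

        for (Index idx : all) {
            if (idx.schemaVersion == schemaVersion) {
                res.add(idx);
            }
        }

        return res;
    }
}
```

With such a helper, an operation enlisted with schemaVersion = 2 would touch only the indexes registered for version 2, both on update and on rollback.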

h4. More details about the *first* item.

The index itself may declare update() and other methods that contain index-type-specific lock management and uniqueness processing logic. For example, the following is suggested for HashIndex.update:

 
{code:java}
@Override
public CompletableFuture<Void> update(UUID txId, TxState txState, Tuple oldRow, Tuple newRow, VersionChain<Tuple> rowId) {
    // Project both row versions onto the indexed columns; a tombstone row maps to a tombstone value.
    Tuple oldVal = oldRow == Tuple.TOMBSTONE ? Tuple.TOMBSTONE : oldRow.select(col);
    Tuple newVal = newRow == Tuple.TOMBSTONE ? Tuple.TOMBSTONE : newRow.select(col);

    List<CompletableFuture<?>> futs = new ArrayList<>();

    // The index is affected only if the indexed value has changed.
    if (!oldVal.equals(newVal)) {
        if (oldVal.length() > 0) {
            // Lock the old key in IX mode and register the lock with the transaction.
            Lock lock0 = lockTable.getOrAddEntry(oldVal);

            txState.addLock(lock0);

            futs.add(lock0.acquire(txId, LockMode.IX));

            // Do not remove bookmarks due to multi-versioning.
        }

        if (newVal.length() > 0) {
            Lock lock0 = lockTable.getOrAddEntry(newVal);

            txState.addLock(lock0);

            // Insert the new entry once the lock is granted and register an undo action for rollback.
            futs.add(lock0.acquire(txId, LockMode.IX).thenAccept(ignored0 -> {
                if (index.insert(newVal, rowId)) {
                    txState.addUndo(() -> index.remove(newVal, rowId));
                }
            }));
        }
    }

    // Completes when all lock acquisitions and index insertions have completed.
    return CompletableFuture.allOf(futs.toArray(new CompletableFuture[0]));
} {code}
Further details can be found in [https://github.com/ascherbakoff/ai3-txn-mvp].

 

The detailed lock management design is described in [IEP-91-Locking model|https://cwiki.apache.org/confluence/display/IGNITE/IEP-91%3A+Transaction+protocol#IEP91:Transactionprotocol-Lockingmodel]. See the index-related sections; for example, the following lock flow is suggested for HashIndex:

 
{code:java}
Non-unique locks
  // scan
  S_commit(key)

  // insert
  IX_commit(key)

  // delete
  IX_commit(key) {code}
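
To make the consequences of this flow concrete, here is a small sketch that maps each operation to the lock mode listed above and checks pairwise compatibility. It assumes the usual compatibility rules for these two modes (S with S and IX with IX coexist, S and IX conflict). The class, enum and method names are hypothetical and are not taken from IEP-91.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch: only the S and IX modes from the non-unique HashIndex flow are modelled. */
final class HashIndexLockFlowSketch {
    enum LockMode { S, IX }

    enum Op { SCAN, INSERT, DELETE }

    /** Lock mode taken on the index key by each operation, as listed in the flow above. */
    static final Map<Op, LockMode> MODE = new EnumMap<>(Op.class);

    static {
        MODE.put(Op.SCAN, LockMode.S);
        MODE.put(Op.INSERT, LockMode.IX);
        MODE.put(Op.DELETE, LockMode.IX);
    }

    /** Assumed compatibility for these two modes: equal modes coexist, S and IX conflict. */
    static boolean compatible(LockMode held, LockMode requested) {
        return held == requested;
    }

    /** True if two operations of different transactions on the same key cannot proceed together. */
    static boolean conflict(Op first, Op second) {
        return !compatible(MODE.get(first), MODE.get(second));
    }
}
```

Under these assumptions, concurrent inserts and deletes on the same non-unique key do not block each other, while a scan of that key conflicts with both.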
h4. More details about the *third* item.

 

Please see the cleanup flow described in https://issues.apache.org/jira/browse/IGNITE-17673
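
As a rough illustration of removing pending index entries on rollback, the sketch below follows the addUndo pattern from the HashIndex.update snippet above. It is not the flow from IGNITE-17673. ToyIndex, insertPending and the reverse-order replay of undo actions are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of undo-based index cleanup; not the IGNITE-17673 flow. */
final class UndoLogSketch {
    /** Per-transaction state holding the undo actions registered during the tx lifetime. */
    static final class TxState {
        private final Deque<Runnable> undoLog = new ArrayDeque<>();

        void addUndo(Runnable action) {
            undoLog.push(action);
        }

        /** Runs the undo actions in reverse registration order (an assumption) and clears the log. */
        void rollback() {
            while (!undoLog.isEmpty()) {
                undoLog.pop().run();
            }
        }

        /** On commit the entries stay in the index, so the undo actions are dropped. */
        void commit() {
            undoLog.clear();
        }
    }

    /** Toy index storing (key, rowId) pairs as strings. */
    static final class ToyIndex {
        private final Set<String> entries = new HashSet<>();

        boolean insert(String key, long rowId) {
            return entries.add(key + "->" + rowId);
        }

        boolean remove(String key, long rowId) {
            return entries.remove(key + "->" + rowId);
        }

        int size() {
            return entries.size();
        }
    }

    /** Inserts an entry and registers its removal only if the entry was actually added. */
    static void insertPending(TxState tx, ToyIndex index, String key, long rowId) {
        if (index.insert(key, rowId)) {
            tx.addUndo(() -> index.remove(key, rowId));
        }
    }
}
```

Registering the undo action only when insert returns true means that a rollback removes just the entries this transaction added and leaves entries that already existed in place.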

> Update indexes on data modifications
> ------------------------------------
>
>                 Key: IGNITE-17859
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17859
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)