Posted to issues@ignite.apache.org by "Ivan Bessonov (Jira)" <ji...@apache.org> on 2022/09/13 14:24:00 UTC
[jira] [Updated] (IGNITE-17673) Extend MV partition storage API with methods to help cleaning up SQL indices
[ https://issues.apache.org/jira/browse/IGNITE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Bessonov updated IGNITE-17673:
-----------------------------------
Description:
To allow indices to be cleaned up, we need an extra API in the partition storage.
In pseudo-code, the cleanup should look like the following:
{code:java}
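BinaryRow oldRow = partition.addWrite(rowId, txId, partitionId, newRow);

if (oldRow != null) {
    // Indexes that may still hold a stale entry for oldRow.
    Set<Index> allIndexes = getAllIndexes();

    for (BinaryRow version : partition.scanVersions(rowId)) {
        // An index must keep its entry if a remaining version still matches oldRow.
        // removeIf avoids modifying the set while a for-each loop iterates over it.
        allIndexes.removeIf(index -> index.rowsMatch(oldRow, version));

        if (allIndexes.isEmpty()) {
            break;
        }
    }

    // Whatever is left is not referenced by any version, so drop the old entry.
    for (Index index : allIndexes) {
        index.remove(oldRow);
    }
}{code}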
Now, let me explain this a bit.
First of all, the real implementation will probably look a bit different: the cursor has to be closed, and oldRow must be converted to a binary tuple. The row-matching algorithm shouldn't live in the index itself, because it depends on versioned row schemas, and indexes don't know about them. Keeping a set and removing from it doesn't look optimal either, and so on. This is just a sketch.
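To make these caveats concrete, here is a minimal, self-contained sketch of how the cleanup might look with the cursor closed via try-with-resources, the row matching kept outside the index, and boolean flags instead of a shrinking set. Every type here (`VersionCursor`, `ListCursor`, `SketchIndex`, `IndexCleanupSketch`) is a hypothetical stand-in, not the Ignite API, and rows are modelled as plain `String[]` columns instead of binary rows or tuples.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical cursor: an iterator that must be closed, with no checked exception. */
interface VersionCursor<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close();
}

/** Hypothetical list-backed cursor, used only to exercise the sketch. */
final class ListCursor<T> implements VersionCursor<T> {
    private final Iterator<T> delegate;
    boolean closed;

    ListCursor(List<T> versions) {
        this.delegate = versions.iterator();
    }

    @Override public boolean hasNext() { return delegate.hasNext(); }
    @Override public T next() { return delegate.next(); }
    @Override public void close() { closed = true; }
}

/** Hypothetical single-column index; it records removals and knows nothing about schemas. */
final class SketchIndex {
    final int column;
    final List<String[]> removed = new ArrayList<>();

    SketchIndex(int column) {
        this.column = column;
    }

    void remove(String[] row) {
        removed.add(row);
    }
}

final class IndexCleanupSketch {
    /** Row matching lives outside the index; in this sketch it compares one column. */
    static boolean rowsMatch(SketchIndex index, String[] a, String[] b) {
        return a[index.column].equals(b[index.column]);
    }

    static void cleanup(String[] oldRow, VersionCursor<String[]> versions, List<SketchIndex> indexes) {
        // One flag per index instead of removing elements from a set.
        boolean[] stillReferenced = new boolean[indexes.size()];
        int undecided = indexes.size();

        // try-with-resources closes the cursor even if we stop scanning early.
        try (VersionCursor<String[]> cursor = versions) {
            while (undecided > 0 && cursor.hasNext()) {
                String[] version = cursor.next();
                for (int i = 0; i < indexes.size(); i++) {
                    if (!stillReferenced[i] && rowsMatch(indexes.get(i), oldRow, version)) {
                        stillReferenced[i] = true;
                        undecided--;
                    }
                }
            }
        }

        // Indexes with no matching version no longer need the entry for oldRow.
        for (int i = 0; i < indexes.size(); i++) {
            if (!stillReferenced[i]) {
                indexes.get(i).remove(oldRow);
            }
        }
    }
}
```

The scan stops as soon as every index has found a matching version, which mirrors the early `break` in the pseudo-code above.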
Second, from the API standpoint, the method for getting the versions of a single key is pretty close to what I imagine:
{code:java}
Cursor<BinaryRow> scanVersions(RowId rowId);{code}
Versions should be returned from newest to oldest. The timestamp itself doesn't seem to be necessary.
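As an illustration of the newest-to-oldest ordering only, here is a small in-memory sketch. `VersionChainSketch` and its `append` method are hypothetical and do not model transactions, write intents, or timestamps; rows are plain strings, and a plain `Iterator` stands in for `Cursor`.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical in-memory version chains, keyed by row id. */
final class VersionChainSketch {
    private final Map<Long, Deque<String>> chains = new HashMap<>();

    /** Adds a new version at the head of the chain, so the head is always the newest. */
    void append(long rowId, String row) {
        chains.computeIfAbsent(rowId, id -> new ArrayDeque<>()).addFirst(row);
    }

    /** Iterates over the versions of one row from newest to oldest, without timestamps. */
    Iterator<String> scanVersions(long rowId) {
        Deque<String> chain = chains.get(rowId);
        if (chain == null) {
            return Collections.<String>emptyIterator();
        }
        return chain.iterator();
    }
}
```

Because each new version is pushed to the head of the deque, iterating from head to tail yields the required order without storing or comparing timestamps.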
> Extend MV partition storage API with methods to help cleaning up SQL indices
> ----------------------------------------------------------------------------
>
> Key: IGNITE-17673
> URL: https://issues.apache.org/jira/browse/IGNITE-17673
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Bessonov
> Priority: Major
> Labels: ignite-3