Posted to dev@ignite.apache.org by Andrey Mashenkov <an...@gmail.com> on 2022/01/28 18:20:54 UTC

[DISCUSS] Index corruption scenarios to be fixed.

Hi Igniters,

I've created a ticket [1] with a PR [2] where I wrote 3 tests for some
scenarios that lead to index corruption.
The scenarios are described in detail in the PR, so you may look at the code
for a better understanding.

Please feel free to create a separate ticket for fixing any of the following
scenarios.
The first scenario is quite simple, but the others impact compatibility, and
the proper fix has to be discussed first.

1. The first scenario.
The user has a cache with the data in some cache group and creates an index.
While the index is building in the background, the user drops the cache.
The user expects that the index build will be canceled and the incomplete
index tree will be dropped.
Then the user creates an empty cache with the same name, and the first put
operation fails with a corrupted tree exception.
At first glance, it looks like we forgot to drop the index tree.

Actually, we faced this issue in GridGain and the fix is trivial.
However, I've added this fix to the PR without success. Most likely, there is
another bug or the code is much different,
because GG doesn't have the IndexQuery feature that was added by Maxim
Timonin recently.

Max, would you please assist and validate the test
ResumeCreateIndexTest.testIncompleteIndexDroppedOnCacheDestroy?
Maybe I've missed something.

2. The second scenario.
I've found that Ignite creates a system index "Affinity_Key" if the
affinity key is configured,
but it can be shadowed by a user index that starts with the affinity column
(composite or not).

It seems Ignite assumes that it can save memory and omit the system affinity
index if it can be replaced with the user one.
That was ok for older Ignite versions, which had no support for dynamic
indices, but now this 'feature' is painful.

If a user creates such an index on a cache with existing data, and later
drops the index,
the affinity index tree will be inconsistent because it missed all the
operations between the creation and the drop of the custom affinity index.
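
To make the second scenario concrete, here is a toy model in plain Java. It
is only a sketch and not Ignite code: the class and all of its members are
hypothetical, and it assumes that updates go to the system affinity tree
only while no user index shadows it.

```java
import java.util.TreeSet;

// Toy model only: none of these names exist in Ignite.
class ShadowedAffinityIndexSketch {
    /** Stands in for the persistent system affinity tree. */
    final TreeSet<Integer> systemAffinityTree = new TreeSet<>();

    /** Tree of a user index that starts with the affinity column; null when absent. */
    TreeSet<Integer> userIndexTree;

    /** Stands in for the actual cache content. */
    final TreeSet<Integer> data = new TreeSet<>();

    void put(int key) {
        data.add(key);

        // Assumption: while the user index exists, it shadows the system index,
        // so the system tree receives no updates.
        if (userIndexTree != null)
            userIndexTree.add(key);
        else
            systemAffinityTree.add(key);
    }

    void createUserIndex() {
        // The new index is built from the existing data.
        userIndexTree = new TreeSet<>(data);
    }

    void dropUserIndex() {
        // The system tree becomes active again without being rebuilt.
        userIndexTree = null;
    }

    boolean systemTreeConsistent() {
        return systemAffinityTree.equals(data);
    }

    /** Replays the scenario; returns whether the system tree still matches the data. */
    static boolean reproduce() {
        ShadowedAffinityIndexSketch cache = new ShadowedAffinityIndexSketch();

        cache.put(1);            // goes to the system tree
        cache.createUserIndex(); // shadows the system tree
        cache.put(2);            // missed by the system tree
        cache.dropUserIndex();

        return cache.systemTreeConsistent();
    }
}
```

In this model reproduce() returns false: key 2 was written while the system
tree was shadowed, so the tree no longer matches the data after the drop.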

3. The third scenario.
Similar to the second one.
If a user drops a cache with a custom affinity index and then creates a
new cache with the same name,
the newly created cache will use the 'outdated' system affinity tree from the
previous cache.
This happens because Ignite does NOT register the system index (with its tree)
if there is a custom affinity index, and forgets to clean up the index tree
on cache drop.
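
The third scenario can be sketched in the same way. Again, this is a toy
model with hypothetical names, not Ignite code. It assumes that trees are
looked up in storage by cache and index name, and that cache destroy cleans
up the trees of registered indexes only.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names exist in Ignite.
class StaleTreeOnRecreateSketch {
    /** Stands in for persistent index storage: "cache/index" -> tree content. */
    final Map<String, Set<Integer>> storedTrees = new HashMap<>();

    /** Indexes that are registered, i.e. receive updates and get cleaned up. */
    final Set<String> registered = new HashSet<>();

    static String sys(String cache) { return cache + "/AFFINITY_KEY"; }

    static String user(String cache) { return cache + "/USER_AFF_IDX"; }

    void createCache(String cache) {
        // An existing tree with the same name is reused as is.
        storedTrees.computeIfAbsent(sys(cache), k -> new TreeSet<>());
        registered.add(sys(cache));
    }

    void put(String cache, int key) {
        for (String idx : registered)
            if (idx.startsWith(cache + "/"))
                storedTrees.get(idx).add(key);
    }

    void createCustomAffinityIndex(String cache) {
        storedTrees.put(user(cache), new TreeSet<>(storedTrees.get(sys(cache))));
        registered.add(user(cache));

        // The system index is unregistered, but its tree stays in storage.
        registered.remove(sys(cache));
    }

    void destroyCache(String cache) {
        // Assumption: only the trees of registered indexes are dropped.
        for (Iterator<String> it = registered.iterator(); it.hasNext();) {
            String idx = it.next();

            if (idx.startsWith(cache + "/")) {
                storedTrees.remove(idx);
                it.remove();
            }
        }
    }

    /** Replays the scenario; returns the system tree seen by the re-created cache. */
    static Set<Integer> reproduce() {
        StaleTreeOnRecreateSketch store = new StaleTreeOnRecreateSketch();

        store.createCache("c");
        store.put("c", 1);
        store.createCustomAffinityIndex("c");
        store.put("c", 2);
        store.destroyCache("c");
        store.createCache("c"); // empty cache, same name

        return store.storedTrees.get(sys("c"));
    }
}
```

In this model the re-created cache is empty, yet its system affinity tree
still holds key 1 from the destroyed cache.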

The fix for 2 and 3 is not trivial.
1. If we register the system affinity index unconditionally, then
users with a custom index will get a corrupted tree on node start after the
upgrade.
2. If we decide to drop the system affinity index, this will affect UX.
Some existing users may experience a slowdown for some queries (a very common
case) where the affinity index is involved and will have to create an index on
the affinity column manually.
New users may have poor performance out of the box.
3. I'm not sure we can detect at node start whether the affinity index should
be dropped or rebuilt, and correctly substitute the system index with the
custom one (and vice versa) on the fly.


I lean toward the second option.
WDYT?
Any ideas?


[1] https://issues.apache.org/jira/browse/IGNITE-16426
[2] https://github.com/apache/ignite/pull/9780
-- 
Best regards,
Andrey V. Mashenkov

Re: [DISCUSS] Index corruption scenarios to be fixed.

Posted by Maksim Timonin <ti...@gmail.com>.
Hi Andrey,

I've left comments on the PR related to the first scenario.

I have some thoughts about scenarios 2 and 3.

I think Ignite should support proxying for such indexes. If there is a
shadowed index, then Ignite should handle it differently. When the user
drops the index, Ignite should rename it and keep using it as the affinity
index, even if the index is defined with more columns than the affinity
index requires. In this case I think we should log a WARN for the user
with a recommendation to explicitly create an affinity index to replace the
wide index with a narrow one. Ignite will automatically detect the new index
as more suitable for the affinity index.
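
A rough sketch of this idea as a toy model in plain Java. All names here are
hypothetical, and the selection rule (the narrowest index that starts with
the affinity column wins) is an assumption made for illustration, not
existing Ignite behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: class, method and index names are hypothetical, not Ignite API.
class AffinityIndexProxySketch {
    /** Hypothetical name given to a dropped index that is kept as the affinity index. */
    static final String RETAINED_NAME = "RETAINED_AFFINITY_IDX";

    final String affinityColumn;

    /** Index name -> indexed columns, in creation order. */
    final Map<String, List<String>> indexes = new LinkedHashMap<>();

    final List<String> warnings = new ArrayList<>();

    AffinityIndexProxySketch(String affinityColumn) {
        this.affinityColumn = affinityColumn;
    }

    void createIndex(String name, String... columns) {
        indexes.put(name, Arrays.asList(columns));
    }

    /** Returns the narrowest index that starts with the affinity column, or null. */
    String affinityIndex() {
        String best = null;

        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            List<String> cols = e.getValue();

            if (cols.isEmpty() || !cols.get(0).equals(affinityColumn))
                continue;

            if (best == null || cols.size() < indexes.get(best).size())
                best = e.getKey();
        }

        return best;
    }

    void dropIndex(String name) {
        boolean servedAsAffinity = name.equals(affinityIndex());
        List<String> cols = indexes.remove(name);

        // Nothing to retain if another suitable index can take over.
        if (cols == null || !servedAsAffinity || affinityIndex() != null)
            return;

        // Rename instead of destroying, so the tree stays consistent.
        indexes.put(RETAINED_NAME, cols);

        if (cols.size() > 1)
            warnings.add("Wide index retained as affinity index; "
                + "consider creating an index on " + affinityColumn + " only.");
    }
}
```

For example, with affinity column ORG_ID, dropping an index on (ORG_ID, NAME)
keeps it under the retained name and records one warning. Creating an index
on (ORG_ID) afterwards makes affinityIndex() return the new, narrower index.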

It looks a little bit complicated, but on the other hand it is pretty safe.
Am I missing something here?

On Fri, Jan 28, 2022 at 11:12 PM Maksim Timonin <ti...@gmail.com>
wrote:

> Hi Andrey,
>
> > Max would you please assist and validate the test
> > ResumeCreateIndexTest.testIncompleteIndexDroppedOnCacheDestroy, maybe
> > I've missed smth.
>
> Sure, I'll check it and other issues you mentioned on Monday.
>
> Thank you for investigating these things.
>
> Maksim

Re: [DISCUSS] Index corruption scenarios to be fixed.

Posted by Maksim Timonin <ti...@gmail.com>.
Hi Andrey,

> Max would you please assist and validate the test
> ResumeCreateIndexTest.testIncompleteIndexDroppedOnCacheDestroy, maybe I've
> missed smth.

Sure, I'll check it and other issues you mentioned on Monday.

Thank you for investigating these things.

Maksim
