Posted to dev@ignite.apache.org by Stanislav Lukyanov <st...@gmail.com> on 2018/12/19 14:32:01 UTC

RE: Asynchronous index rebuild

Hi Vladimir,

Thanks for this summary!

Why the third option and not the second?

The process of join-leave-build indexes-rejoin sounds kind of heavy.
Topology changes are complex because of PME, so I think the fewer PMEs, the better.

Rejoin to build indexes also means that not having indexes for one cache prevents the node
from serving operations for other caches (for which the node has indexes, or which don’t have indexes at all).

I really like the idea of re-using late affinity assignment here.
The semantics are simple, and the implementation looks reasonably simple as well (although it adds another bit of complexity to PME).
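
For illustration, applications can already wait for index rebuild explicitly today (if I'm reading the public API right, via IgniteCache#indexReadyFuture()); the idea would be for the late affinity switch to wait for the same condition implicitly, so that primaries never move to a node whose indexes are still being rebuilt. A rough sketch of the explicit variant, just to show the condition we'd be reusing:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;

    public class WaitForIndexes {
        public static void main(String[] args) {
            Ignite ignite = Ignition.start();

            // "person" is just an example cache name.
            IgniteCache<Integer, Object> cache = ignite.cache("person");

            // Blocks until all indexes for this cache are rebuilt on the
            // local node. The proposal is to make the late affinity
            // primary switch wait for the equivalent internal condition
            // instead of leaving it to the application.
            cache.indexReadyFuture().get();
        }
    }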

Why do you prefer to go with the join-leave-rejoin approach?

Thanks,
Stan

From: Vladimir Ozerov
Sent: November 30, 2018, 13:09
To: dev
Subject: Asynchronous index rebuild

Igniters,

During work on an index rebuild bug [1] I realized that the asynchronous
nature of our index rebuild logic may lead to long-running SQL queries in user
applications, indistinguishable from hangs.

The fundamental problem here is that we consider a node ready to become a
partition owner when partition data is rebalanced. But the correct rule from
the user's standpoint is that both partitions and indexes are ready. This
always holds for in-memory mode and when a fresh node with persistence is
started, as indexes are filled with data during rebalance. But it does not
hold for several other cases:
1) When a new index was created while a node was down
2) When index.bin was corrupted or removed
3) Possible in the future: rebalance through partition files [2], as the current
IEP doesn't cover secondary indexes.

Let's think about how to overcome this problem. First, we may cover it on the SQL
level: extract used indexes on the query planning phase, attach them to query
messages, check on the mapper side that the indexes are ready, and re-try if
not (see the rough sketch below).
Second, we may make index rebuild a part of the rebalance procedure, so that
a node does not become primary until both partitions and indexes are ready.
Third, we may detect missing indexes during join. Then the node may leave the
topology, rebuild indexes locally, and re-join. For the missing index.bin
case, the rebuild process may be started even without joining the node to the
cluster, though it potentially may lead to rebuilding an index which doesn't
exist in the cluster anymore.
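
A minimal sketch of the mapper-side check for the first option; the names below (usedIndexes, readyIndexes, the retry response) are hypothetical placeholders, not actual query-processing internals:

    import java.util.Collection;
    import java.util.Set;

    public class IndexReadinessCheck {
        /**
         * The planner would attach the names of the indexes it used to the
         * map query request; the mapper node compares them against its own
         * set of fully built indexes and requests a retry if any is missing.
         */
        static boolean indexesReady(Collection<String> usedIndexes,
                                    Set<String> readyIndexes) {
            return readyIndexes.containsAll(usedIndexes);
        }

        // On the mapper:
        //   if (!indexesReady(req.usedIndexes(), localReadyIndexes))
        //       send a "retry" response instead of executing the map query.
    }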

The third option appears to be the most promising to me. What do you
think?

Vladimir.

[1] https://issues.apache.org/jira/browse/IGNITE-10291
[2]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing


Re: Asynchronous index rebuild

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi Stan,

The thing is that rebuilding an index while the node is not operational might be an
order of magnitude faster than rebuilding it while the node is online. This is
the only reason why I thought the third option was worth considering.

Vladimir.


