Posted to issues@ignite.apache.org by "Matija Polajnar (Jira)" <ji...@apache.org> on 2020/05/29 12:05:00 UTC
[jira] [Comment Edited] (IGNITE-12297) Detect lost partitions is not happened during cluster activation
[ https://issues.apache.org/jira/browse/IGNITE-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975026#comment-16975026 ]
Matija Polajnar edited comment on IGNITE-12297 at 5/29/20, 12:04 PM:
---------------------------------------------------------------------
For the record, as discussed in IGNITE-10226, this bug caused us to get a *org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Cannot run update query. Node must own all the necessary partitions.* This happened on a one-node "cluster".
A sane but sometimes difficult-to-execute workaround was provided by [~jokser]:
{quote}1) Start another node; this is a topology event that will trigger detection of lost partitions.
2) Stop the started node.
3) If your partition loss policy != IGNORE, explicitly trigger `resetLostPartitions`.
This should help return the partition to the OWNING state.
{quote}
It works, but you need to configure another node for the cluster. A dangerous and ugly but more practical workaround is to have this reflection-based method ready to invoke when you need it:
{code:java}
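// Intended for a subclass of IgniteSpringBean: the field "g" is read from this instance.
public void resetMovingPartitions() {
    try {
        // Reach the internal kernal through reflection; none of this is public API.
        Field igniteKernalField = IgniteSpringBean.class.getDeclaredField("g");
        igniteKernalField.setAccessible(true);
        IgniteKernal igniteKernal = (IgniteKernal)igniteKernalField.get(this);
        GridKernalContextImpl kernalContext = (GridKernalContextImpl)igniteKernal.context();
        // Ask the exchange manager to schedule a resend of the partition maps.
        kernalContext.cache().context().exchange().scheduleResendPartitions();
    } catch (IllegalAccessException | NoSuchFieldException | ClassCastException e) {
        // The internal layout is not what this code expects; fail loudly.
        throw new AssertionError(e);
    }
}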
{code}
It works for us.
was (Author: matijap):
For the record, as discussed in IGNITE-10266, this bug caused us to get a *org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Cannot run update query. Node must own all the necessary partitions.* This happened on a one-node "cluster".
A sane but sometimes difficult-to-execute workaround was provided by [~jokser]:
{quote}1) Start another node; this is a topology event that will trigger detection of lost partitions.
2) Stop the started node.
3) If your partition loss policy != IGNORE, explicitly trigger `resetLostPartitions`.
This should help return the partition to the OWNING state.
{quote}
It works, but you need to configure another node for the cluster. A dangerous and ugly but more practical workaround is to have this reflection-based method ready to invoke when you need it:
{code:java}
public void resetMovingPartitions() {
    try {
        Field igniteKernalField = IgniteSpringBean.class.getDeclaredField("g");
        igniteKernalField.setAccessible(true);
        IgniteKernal igniteKernal = (IgniteKernal)igniteKernalField.get(this);
        GridKernalContextImpl kernalContext = (GridKernalContextImpl)igniteKernal.context();
        kernalContext.cache().context().exchange().scheduleResendPartitions();
    } catch (IllegalAccessException | NoSuchFieldException | ClassCastException e) {
        throw new AssertionError(e);
    }
}
{code}
It works for us.
> Detect lost partitions is not happened during cluster activation
> ----------------------------------------------------------------
>
> Key: IGNITE-12297
> URL: https://issues.apache.org/jira/browse/IGNITE-12297
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.4
> Reporter: Pavel Kovalenko
> Priority: Major
> Labels: newbie
>
> We invoke `detectLostPartitions` during PME only when a server node joins or leaves.
> However, we can activate a persistent cluster where a partition has MOVING status on all nodes. In this case, the partition may stay in the MOVING state indefinitely, until some other topology event occurs.
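
To make the behaviour described in this issue easier to follow, here is a toy model of it in Java. It is only a sketch and not Ignite code: the class `LostPartitionToyModel`, its enums and its methods are hypothetical names invented for illustration. It assumes a one-node cluster whose partitions were all persisted in MOVING state, that detection marks such ownerless partitions as LOST, and that an explicit reset returns LOST partitions to OWNING.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are Ignite classes. */
final class LostPartitionToyModel {
    enum State { OWNING, MOVING, LOST }
    enum ExchangeEvent { ACTIVATE, SERVER_JOIN, SERVER_LEFT }

    /** Events that trigger lost-partition detection, per the issue description. */
    private static final EnumSet<ExchangeEvent> DETECTING =
        EnumSet.of(ExchangeEvent.SERVER_JOIN, ExchangeEvent.SERVER_LEFT);

    /** Partition id -> state, as seen by the whole (one-node) cluster. */
    private final Map<Integer, State> partitions = new HashMap<>();

    LostPartitionToyModel(int partitionCount) {
        // Assumption: every partition was persisted in MOVING state before restart.
        for (int p = 0; p < partitionCount; p++)
            partitions.put(p, State.MOVING);
    }

    /** Simulates a partition map exchange for the given event. */
    void onExchange(ExchangeEvent evt) {
        // Activation is not in DETECTING, so it never runs detection.
        if (DETECTING.contains(evt))
            detectLostPartitions();
    }

    /** Assumption: a partition that is MOVING everywhere has no owner and becomes LOST. */
    private void detectLostPartitions() {
        partitions.replaceAll((p, s) -> s == State.MOVING ? State.LOST : s);
    }

    /** Models an explicit reset: LOST partitions become OWNING again. */
    void resetLostPartitions() {
        partitions.replaceAll((p, s) -> s == State.LOST ? State.OWNING : s);
    }

    State state(int partition) {
        return partitions.get(partition);
    }

    public static void main(String[] args) {
        LostPartitionToyModel cluster = new LostPartitionToyModel(2);

        // Activation alone leaves the partition stuck in MOVING.
        cluster.onExchange(ExchangeEvent.ACTIVATE);
        System.out.println("after activation: " + cluster.state(0));

        // Workaround from the comment: start a node, stop it, then reset.
        cluster.onExchange(ExchangeEvent.SERVER_JOIN);
        cluster.onExchange(ExchangeEvent.SERVER_LEFT);
        cluster.resetLostPartitions();
        System.out.println("after workaround: " + cluster.state(0));
    }
}
```

In this model, activation never reaches `detectLostPartitions`, so the partition stays MOVING. Only the join/left pair followed by a reset brings it back to OWNING, which mirrors the three-step workaround quoted in the comment.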
--
This message was sent by Atlassian Jira
(v8.3.4#803005)