Posted to user@ignite.apache.org by Cody Yancey <ya...@uber.com> on 2017/01/26 16:14:16 UTC
Reloading from Persistent Store after Losing a Node
Hello Ignite users!
I have a use case where I am doing SQL queries on a sharded cache, and I
need to ensure that SQL queries always return The Right Answer even if some
nodes in the ring are lost. As I have rigorously confirmed, SQL queries
only apply to data in the cache (as opposed to in the write-through
persistent store but lost from the cache). Also, when you lose a node, you
don't lose persisted data, but data IS now gone from the cache (unless
there is an in-cache backup of the relevant cache partitions).
Now, I *could* do this by just setting the backup factor for the cache
equal to the number of nodes I can stand to lose, and then setting a
TopologyValidator on the cache to ensure I always have more nodes in the
ring than that number. If the TopologyValidator ever sees the number of
nodes fall below this survivability threshold, I crash the app and let
everything get reloaded from the persistent store when the nodes
automatically start back up.
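A minimal sketch of that threshold rule, written in plain Java rather than against the Ignite API so it can be read on its own. SurvivabilityCheck and isValid are hypothetical names used only for illustration; in Ignite the same decision would sit inside a TopologyValidator, whose validate method receives the current nodes and returns a boolean.

```java
import java.util.Collection;

// Hypothetical stand-in for the decision a TopologyValidator would make.
// Node IDs are plain strings here; this is an assumption for illustration.
final class SurvivabilityCheck {
    private final int backups;

    SurvivabilityCheck(int backups) {
        if (backups < 0) {
            throw new IllegalArgumentException("backups must be >= 0");
        }
        this.backups = backups;
    }

    // The topology is valid only while there are more live nodes than
    // the number of nodes we can afford to lose (the backup factor).
    boolean isValid(Collection<String> liveNodeIds) {
        return liveNodeIds.size() > backups;
    }
}
```

With a backup factor of 2, a three-node ring passes the check and a two-node ring fails it, which is the point where the app would be crashed and reloaded.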
This technique has a lot of false positives: cases where we lose too many
nodes, but slowly enough that Ignite is well able to shift the data around
and avoid data loss, so we shouldn't have had to crash the app.
Therefore, I would rather be a little smarter about this for the sake of
uptime.
Ideally, in the TopologyValidator logic, while reads and writes to the
cache are blocked, I would be able to:
1.) Detect when a lost partition has no viable backup,
2.) Reload from the persistent store.
The problem I am facing is, I can't find a clean and efficient way of
figuring out #1 from the information the TopologyValidator gives you.
And even if I could, #2 hangs forever, which makes sense because the cache
isn't readable or writeable until AFTER the topology has been validated.
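To make step 1 concrete, here is a rough sketch of the check itself in plain Java. It assumes a snapshot of which nodes held each partition (primary plus backups) was recorded before the node loss, which is exactly the information the TopologyValidator does not hand you. LostPartitions, withoutSurvivingCopy, and the string node IDs are all hypothetical and not part of Ignite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: finds partitions whose every copy was on a lost node.
final class LostPartitions {
    private LostPartitions() {
    }

    // ownersByPartition: partition ID -> node IDs that held a copy
    //                    (primary and backups) before the topology change.
    // liveNodes:         node IDs still present in the ring.
    // Returns the partition IDs with no surviving copy, in ascending order.
    static List<Integer> withoutSurvivingCopy(
            Map<Integer, List<String>> ownersByPartition,
            Set<String> liveNodes) {
        List<Integer> lost = new ArrayList<>();
        // TreeMap gives a deterministic, sorted iteration order.
        for (Map.Entry<Integer, List<String>> entry
                : new TreeMap<>(ownersByPartition).entrySet()) {
            boolean anyAlive = false;
            for (String owner : entry.getValue()) {
                if (liveNodes.contains(owner)) {
                    anyAlive = true;
                    break;
                }
            }
            if (!anyAlive) {
                lost.add(entry.getKey());
            }
        }
        return lost;
    }
}
```

Only the partitions returned here would need reloading from the persistent store; any partition with at least one live owner can be recovered from its in-cache backup.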
Has anyone faced a similar challenge and has some wisdom to share? Am I
making this way more complicated than it needs to be?
Thanks in advance,
Cody
Re: Reloading from Persistent Store after Losing a Node
Posted by Cody Yancey <ya...@uber.com>.
Ah thank you. This is exactly what I was looking for!
Thanks,
Cody
On Tue, Jan 31, 2017 at 8:07 AM, Vladimir Ozerov <vo...@gridgain.com>
wrote:
> Hi Cody,
>
> I think you can try using EventType.EVT_CACHE_REBALANCE_PART_DATA_LOST. It
> is fired when data is lost.
Re: Reloading from Persistent Store after Losing a Node
Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi Cody,
I think you can try using EventType.EVT_CACHE_REBALANCE_PART_DATA_LOST. It
is fired when data is lost.
--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Reloading-from-Persistent-Store-after-Losing-a-Node-tp10259p10339.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.