Posted to user@ignite.apache.org by Raymond Wilson <ra...@trimble.com> on 2019/08/28 03:50:16 UTC

Issue with recovery of Kubernetes deployed server node after abnormal shutdown

We have an Ignite grid deployed on a Kubernetes cluster using an AWS EFS
volume to store the persistent data for all nodes in the grid.

The Ignite-based services running on those pods respond to SIGTERM-style
graceful shutdown and restart events by reattaching to the persistent
stores in the EFS volume.

Ignite maintains a lock file in the persistence folder for each node that
indicates whether that persistence store is owned by a running Ignite server
node. When the node shuts down gracefully the lock file is removed, allowing
a new Ignite node in a Kubernetes pod to use it.

If an Ignite server node hosted in a Kubernetes pod is subject to abnormal
termination (e.g. via SIGKILL or a failure in the underlying EC2 server
hosting the K8s pod), then the lock file is not removed. When a new K8s pod
starts up to replace the one that failed, it does not reattach to the
existing node persistence folder due to the lock file. Instead it creates
another node persistence folder, which leads to apparent data loss.

This can be seen in the log fragment below where a new pod examines the
node00 folder, finds a lock file and proceeds to create a node01 folder due
to that lock.

[image: image.png]

My question is: What is the best way to manage this so that abnormal
termination recovery copes with the orphaned lock file without the need for
DevOps intervention?

Thanks,
Raymond.

Re: Issue with recovery of Kubernetes deployed server node after abnormal shutdown

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

I think the mere presence of the lock file is not enough. It can be
reacquired if the previous node is down. Maybe it can't be reacquired due
to, e.g., file permission issues?

Setting consistentId will prevent the node from starting at all if it can't
lock the storage.
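For reference, a minimal sketch of doing this through the Java API, assuming
persistence on the default data region; the id value and class name here are
illustrative, not taken from the thread:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class StartServerNode {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // A fixed, stable identity for this node. With this set, the node
            // is tied to one specific persistence folder and, as noted above,
            // fails to start if it cannot lock that storage rather than
            // silently creating a new folder.
            cfg.setConsistentId("server-node-0");

            // Enable native persistence on the default data region.
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();
            storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
            cfg.setDataStorageConfiguration(storageCfg);

            Ignition.start(cfg);
        }
    }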

Regards,
-- 
Ilya Kasnacheev


Sat, 31 Aug 2019 at 08:52, Raymond Wilson <ra...@trimble.com>:

> Hi Ilya,
>
> It is curious that you do not see the lock failure error.
>
> Currently our approach is that the Kubernetes nodes (pods) are stateless
> and are provisioned against the EFS volume at the point they are created.
> In this way the consistent ID as such is part of the persistent store and
> is inherited by the Kubernetes pod when it attaches to the persistent
> volume.
> In general this works really well, except in this case, where the lock file
> is left behind after abnormal node termination.
>
> The particular issue seems to occur due to the presence of the lock file
> at the point the Ignite node in the Kubernetes pod tries to access the
> persistent store. I.e., the new pod sees the lock file and determines that
> this persistent volume is not available for the new pod to access, so it
> creates a new node persistence folder.
>
> We are happy to modify our approach to align with Ignite best practices.
> Does assigning consistent IDs manually, rather than using the default
> consistent ID, mean that the presence of the lock file does not cause an
> issue? How would we align consistent ID specification with Kubernetes'
> automatic pod replacement on Ignite node failure?
>
> Thanks,
> Raymond.
>
>
> On Sat, Aug 31, 2019 at 2:24 AM Ilya Kasnacheev <il...@gmail.com>
> wrote:
>
>> Hello!
>>
>> Maybe I misunderstand something, but my recommendation would be to provide
>> a consistentId for all nodes. This way, it would be impossible to boot with
>> a wrong/different data dir.
>>
>> It's not obvious why the "Unable to acquire lock" error happens; I didn't
>> see that. What's your target OS? Are you sure all other instances are
>> completely stopped at the time of this node startup?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Wed, 28 Aug 2019 at 06:50, Raymond Wilson <ra...@trimble.com>:
>>
>>> We have an Ignite grid deployed on a Kubernetes cluster using an AWS EFS
>>> volume to store the persistent data for all nodes in the grid.
>>>
>>> The Ignite-based services running on those pods respond to SIGTERM-style
>>> graceful shutdown and restart events by reattaching to the persistent
>>> stores in the EFS volume.
>>>
>>> Ignite maintains a lock file in the persistence folder for each node
>>> that indicates whether that persistence store is owned by a running Ignite
>>> server node. When the node shuts down gracefully the lock file is removed,
>>> allowing a new Ignite node in a Kubernetes pod to use it.
>>>
>>> If an Ignite server node hosted in a Kubernetes pod is subject to
>>> abnormal termination (e.g. via SIGKILL or a failure in the underlying EC2
>>> server hosting the K8s pod), then the lock file is not removed. When a new
>>> K8s pod starts up to replace the one that failed, it does not reattach to
>>> the existing node persistence folder due to the lock file. Instead it
>>> creates another node persistence folder, which leads to apparent data loss.
>>>
>>> This can be seen in the log fragment below where a new pod examines the
>>> node00 folder, finds a lock file and proceeds to create a node01 folder due
>>> to that lock.
>>>
>>> [image: image.png]
>>>
>>> My question is: What is the best way to manage this so that abnormal
>>> termination recovery copes with the orphaned lock file without the need for
>>> DevOps intervention?
>>>
>>> Thanks,
>>> Raymond.
>>>
>>>

Re: Issue with recovery of Kubernetes deployed server node after abnormal shutdown

Posted by Raymond Wilson <ra...@trimble.com>.
Hi Ilya,

It is curious that you do not see the lock failure error.

Currently our approach is that the Kubernetes nodes (pods) are stateless
and are provisioned against the EFS volume at the point they are created.
In this way the consistent ID as such is part of the persistent store and
is inherited by the Kubernetes pod when it attaches to the persistent
volume.
In general this works really well, except in this case, where the lock file
is left behind after abnormal node termination.

The particular issue seems to occur due to the presence of the lock file at
the point the Ignite node in the Kubernetes pod tries to access the
persistent store. I.e., the new pod sees the lock file and determines that
this persistent volume is not available for the new pod to access, so it
creates a new node persistence folder.

We are happy to modify our approach to align with Ignite best practices.
Does assigning consistent IDs manually, rather than using the default
consistent ID, mean that the presence of the lock file does not cause an
issue? How would we align consistent ID specification with Kubernetes'
automatic pod replacement on Ignite node failure?
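(Sketching one possible way to do that, purely as an assumption on our side
rather than a confirmed recommendation: run the server pods as a Kubernetes
StatefulSet so that a replacement pod keeps the same stable name, and derive
the consistentId from that name, e.g. via the pod's HOSTNAME environment
variable.)

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class StatefulSetServerNode {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // In a StatefulSet, pod N is recreated with the same name after a
            // failure (e.g. ignite-0), so the replacement pod resolves the same
            // consistentId and therefore the same persistence folder.
            String podName = System.getenv("HOSTNAME");
            cfg.setConsistentId(podName);

            Ignition.start(cfg);
        }
    }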

Thanks,
Raymond.


On Sat, Aug 31, 2019 at 2:24 AM Ilya Kasnacheev <il...@gmail.com>
wrote:

> Hello!
>
> Maybe I misunderstand something, but my recommendation would be to provide
> a consistentId for all nodes. This way, it would be impossible to boot with
> a wrong/different data dir.
>
> It's not obvious why the "Unable to acquire lock" error happens; I didn't
> see that. What's your target OS? Are you sure all other instances are
> completely stopped at the time of this node startup?
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Wed, 28 Aug 2019 at 06:50, Raymond Wilson <ra...@trimble.com>:
>
>> We have an Ignite grid deployed on a Kubernetes cluster using an AWS EFS
>> volume to store the persistent data for all nodes in the grid.
>>
>> The Ignite-based services running on those pods respond to SIGTERM-style
>> graceful shutdown and restart events by reattaching to the persistent
>> stores in the EFS volume.
>>
>> Ignite maintains a lock file in the persistence folder for each node that
>> indicates whether that persistence store is owned by a running Ignite server
>> node. When the node shuts down gracefully the lock file is removed, allowing
>> a new Ignite node in a Kubernetes pod to use it.
>>
>> If an Ignite server node hosted in a Kubernetes pod is subject to abnormal
>> termination (e.g. via SIGKILL or a failure in the underlying EC2 server
>> hosting the K8s pod), then the lock file is not removed. When a new K8s pod
>> starts up to replace the one that failed, it does not reattach to the
>> existing node persistence folder due to the lock file. Instead it creates
>> another node persistence folder, which leads to apparent data loss.
>>
>> This can be seen in the log fragment below where a new pod examines the
>> node00 folder, finds a lock file and proceeds to create a node01 folder due
>> to that lock.
>>
>> [image: image.png]
>>
>> My question is: What is the best way to manage this so that abnormal
>> termination recovery copes with the orphaned lock file without the need for
>> DevOps intervention?
>>
>> Thanks,
>> Raymond.
>>
>>

Re: Issue with recovery of Kubernetes deployed server node after abnormal shutdown

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Maybe I misunderstand something, but my recommendation would be to provide
a consistentId for all nodes. This way, it would be impossible to boot with
a wrong/different data dir.

It's not obvious why the "Unable to acquire lock" error happens; I didn't
see that. What's your target OS? Are you sure all other instances are
completely stopped at the time of this node startup?

Regards,
-- 
Ilya Kasnacheev


Wed, 28 Aug 2019 at 06:50, Raymond Wilson <ra...@trimble.com>:

> We have an Ignite grid deployed on a Kubernetes cluster using an AWS EFS
> volume to store the persistent data for all nodes in the grid.
>
> The Ignite-based services running on those pods respond to SIGTERM-style
> graceful shutdown and restart events by reattaching to the persistent
> stores in the EFS volume.
>
> Ignite maintains a lock file in the persistence folder for each node that
> indicates whether that persistence store is owned by a running Ignite server
> node. When the node shuts down gracefully the lock file is removed, allowing
> a new Ignite node in a Kubernetes pod to use it.
>
> If an Ignite server node hosted in a Kubernetes pod is subject to abnormal
> termination (e.g. via SIGKILL or a failure in the underlying EC2 server
> hosting the K8s pod), then the lock file is not removed. When a new K8s pod
> starts up to replace the one that failed, it does not reattach to the
> existing node persistence folder due to the lock file. Instead it creates
> another node persistence folder, which leads to apparent data loss.
>
> This can be seen in the log fragment below where a new pod examines the
> node00 folder, finds a lock file and proceeds to create a node01 folder due
> to that lock.
>
> [image: image.png]
>
> My question is: What is the best way to manage this so that abnormal
> termination recovery copes with the orphaned lock file without the need for
> DevOps intervention?
>
> Thanks,
> Raymond.
>
>