Posted to dev@ignite.apache.org by Vladimir Ozerov <vo...@gridgain.com> on 2015/04/10 09:58:06 UTC

IGFS behavior when node is stopping, but there are active file writes.

Hi,

Currently, if IGFS is writing a file (i.e. an output stream is open) and a
concurrent node stop is requested, we do not wait for these writes to
finish, but simply stop the node. This can leave behind corrupted,
partially written data.

A better approach is to wait for the writes to finish when the "cancel" flag
is false, and to stop immediately when the "cancel" flag is true.

But the problem is that a file could be opened for write indefinitely (e.g.
some kind of logging facility constantly appending to a file). As a result,
if someone wants to perform a graceful node shutdown for administrative
tasks and such an "eternal" file write happens to be running on this node,
the only option is to forcefully shut down the node, which can still lead
to data corruption.

I think we should consider the following:
1) Add cross-cache transactions for meta/data cache updates. E.g. it is
currently possible that the file length in the meta cache is updated, but
the block in the data cache is not (e.g. due to a node stop between these
two actions).
2) Add the ability to configure backups for the data cache. Currently, for
some reason, we prevent the user from doing this. As a result, there is no
fault tolerance at all for PRIMARY mode.
3) Optionally wait for output streams to close on node stop, possibly with
a sensible configurable timeout.
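To make option 3 concrete, here is a minimal self-contained sketch (plain Java, not the actual IGFS code; the class and method names are mine) of a tracker that lets a stopping node wait for open output streams to close, but only up to a configurable timeout:

```java
// Hypothetical sketch: a node-stop routine waits for open streams,
// falling back to forceful stop when the timeout elapses.
public class StreamTracker {
    private int open; // number of currently open output streams

    public synchronized void streamOpened() { open++; }

    public synchronized void streamClosed() {
        open--;
        notifyAll(); // wake a stopping thread waiting in awaitClose()
    }

    /**
     * Wait until all streams are closed or the timeout elapses.
     * Returns true if the shutdown is clean (no streams left open).
     */
    public synchronized boolean awaitClose(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (open > 0) {
            long left = deadline - System.currentTimeMillis();
            if (left <= 0)
                return false; // timed out; caller may force-stop the node
            wait(left);
        }
        return true;
    }
}
```

On stop, the node would call `awaitClose(timeout)`; a `false` return means the timeout expired and the node may fall back to a forceful stop.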

Thoughts?

Vladimir.

Re: IGFS behavior when node is stopping, but there are active file writes.

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Atri,

We already do this. When something is written to an IGFS stream, we first
batch the data a little, then create/update a block and save it to the data
cache.

But the problem is that our design has two caches: a meta cache and a data
cache. As their names imply, the meta cache stores file system metadata
such as parent-child relationships, file lengths, etc., while the data
cache stores the actual file data blocks.
The problem is that currently there are no transactional semantics between
these two caches. As a result, an abrupt node stop can leave the file
length updated in the meta cache while the data was never written. IGFS
will treat such a file as corrupted, since it looks like some data was lost.

IGFS does not have transactional semantics between these two caches
because, at the time it was developed, Ignite (GridGain back then) had no
cross-cache transactions. As of Ignite 1.0.0 we do have cross-cache
transactions, so it is now possible to implement a consistent solution.
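For illustration only, here is a toy model (plain Java maps, not the Ignite cache API; all names are mine) of why the two updates need to be atomic. Performing the data-block put and the meta-length update under a single lock stands in for what a cross-cache transaction would guarantee: a reader either sees both changes or neither.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of IGFS's two caches: "meta" holds file lengths,
// "data" holds blocks. The shared lock mimics a cross-cache
// transaction spanning both updates.
public class FsModel {
    private final Map<String, Long> meta = new HashMap<>();   // path -> file length
    private final Map<String, byte[]> data = new HashMap<>(); // blockId -> bytes
    private final Object txLock = new Object();               // stand-in for txStart()/commit()

    public void appendBlock(String path, String blockId, byte[] bytes) {
        synchronized (txLock) {
            // Both updates happen atomically with respect to readers;
            // a node stop cannot interleave between them from a
            // reader's point of view.
            data.put(blockId, bytes);
            meta.merge(path, (long) bytes.length, Long::sum);
        }
    }

    public long length(String path) {
        synchronized (txLock) { return meta.getOrDefault(path, 0L); }
    }

    public boolean hasBlock(String blockId) {
        synchronized (txLock) { return data.containsKey(blockId); }
    }
}
```

Without the shared lock (or, in the real system, without a cross-cache transaction), a stop between the two `put` operations is exactly the corruption scenario described above.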

Vladimir.

On Fri, Apr 10, 2015 at 12:14 PM, Atri Sharma <at...@gmail.com> wrote:

> On Fri, Apr 10, 2015 at 1:28 PM, Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
> > [original proposal quoted in full; snipped]
>
>
> I think we should actually do two things:
>
> 1) Immediately terminate all open streams.
> 2) Flush whatever we have in the data cache.
>
> I am not too familiar with how the data cache works, but I think we should
> either keep flushing it at regular intervals (if we do not do so already)
> or allow the user to decide the checkpoint frequency.
>
> Please correct me if I am missing something.
>

Re: IGFS behavior when node is stopping, but there are active file writes.

Posted by Atri Sharma <at...@gmail.com>.
On Fri, Apr 10, 2015 at 1:28 PM, Vladimir Ozerov <vo...@gridgain.com>
wrote:

> [original proposal quoted in full; snipped]
>


I think we should actually do two things:

1) Immediately terminate all open streams.
2) Flush whatever we have in the data cache.

I am not too familiar with how the data cache works, but I think we should
either keep flushing it at regular intervals (if we do not do so already)
or allow the user to decide the checkpoint frequency.

Please correct me if I am missing something.
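As a sketch of the interval-flush idea (illustrative names only, not the real IGFS internals), here is a write-behind buffer flushed on a user-configurable period, with one final flush on clean shutdown:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical write-behind buffer: blocks accumulate in memory and a
// background timer flushes them to the "data cache" (a list here) on a
// user-chosen interval. close() performs a final flush on clean stop.
public class PeriodicFlusher {
    private final List<byte[]> buffer = new ArrayList<>();
    private final List<byte[]> flushed = new ArrayList<>(); // stands in for the data cache
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();

    public PeriodicFlusher(long intervalMs) {
        timer.scheduleAtFixedRate(this::flush, intervalMs, intervalMs,
            TimeUnit.MILLISECONDS);
    }

    public synchronized void write(byte[] block) { buffer.add(block); }

    public synchronized void flush() {
        flushed.addAll(buffer);
        buffer.clear();
    }

    public synchronized int flushedCount() { return flushed.size(); }

    public void close() throws InterruptedException {
        timer.shutdown();
        timer.awaitTermination(1, TimeUnit.SECONDS);
        flush(); // final flush so nothing buffered is lost on a clean stop
    }
}
```

Note that a periodic flush only bounds the amount of data at risk; it does not by itself solve the meta/data consistency problem Vladimir describes, so it would complement, not replace, cross-cache transactions.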