Posted to dev@cassandra.apache.org by Stefan Miklosovic <st...@instaclustr.com> on 2021/11/03 20:53:00 UTC

The most reliable way to determine the last time node was up

Hi,

We see a lot of cases out there where a node was down for longer than
the GC grace period, and once that node is back up there are a lot of
zombie data issues ... you know the story.

We would like to implement some kind of check which would detect this,
so that the node would not start in the first place, no issues would
arise at all, and it would be up to operators to figure out what to do
with it first.

There are a couple of ideas we have been exploring, each with various
pros and cons, and I would like to know what you think about them.

1) Register a shutdown hook on "drain". This is already there (1).
The "drain" method does quite a lot of work and is called on
shutdown, so our idea is to write a timestamp into a new
system.local column, something like "lastly_drained", which would be
read back on startup.
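
Only as a sketch, the hook side could look like the following; the
column name and the persistence helper are placeholders, not actual
Cassandra code:

import java.time.Instant;

public final class LastDrainedHook
{
    public static void install()
    {
        // Runs on SIGTERM/SIGINT (and normal exit), but never on SIGKILL.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> persistLastDrained(Instant.now()), "lastly-drained-hook"));
    }

    private static void persistLastDrained(Instant now)
    {
        // Placeholder: a real implementation would issue something like
        //   UPDATE system.local SET lastly_drained = ? WHERE key = 'local'
        // and flush, so the value survives the JVM exiting right after.
        System.out.println("lastly_drained = " + now);
    }
}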

The disadvantage of this approach, or of any approach based on
shutdown hooks, is that it only reacts to SIGTERM and SIGINT. If the
node is killed via SIGKILL, the JVM just stops and there is basically
nothing we can guarantee would leave any traces behind.

If the node is killed and that value is never overwritten, then on the
next startup the timestamp might be older than 10 days, so the check
would falsely conclude that the node should not be started.

2) Do this on startup: check how old all your sstables and commit
logs are, and if no file was modified less than 10 days ago, abort
the start. There is a pretty big chance that your node did at least
something in 10 days, nothing needs to be added to system tables or
similar, and it would be just another StartupCheck.
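
A rough sketch of such a StartupCheck, with assumed data and
commitlog paths and a 10-day threshold mirroring the default
gc_grace_seconds:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.stream.Stream;

public final class FileAgeStartupCheck
{
    public static void main(String[] args) throws IOException
    {
        Instant threshold = Instant.now().minus(Duration.ofDays(10));
        boolean recentActivity =
                anyFileNewerThan(Paths.get("/var/lib/cassandra/data"), threshold)
             || anyFileNewerThan(Paths.get("/var/lib/cassandra/commitlog"), threshold);
        if (!recentActivity)
        {
            System.err.println("No sstable or commitlog modified within 10 days; aborting startup");
            System.exit(1);
        }
    }

    private static boolean anyFileNewerThan(Path root, Instant threshold) throws IOException
    {
        if (!Files.isDirectory(root))
            return false; // e.g. a brand new node that has no data yet
        try (Stream<Path> files = Files.walk(root))
        {
            return files.filter(Files::isRegularFile)
                        .anyMatch(p -> mtime(p).isAfter(threshold));
        }
    }

    private static Instant mtime(Path p)
    {
        try
        {
            return Files.getLastModifiedTime(p).toInstant();
        }
        catch (IOException e)
        {
            return Instant.EPOCH; // unreadable entries count as old
        }
    }
}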

The disadvantage of this is that some dev clusters, for example, may
run for more than 10 days while just sitting there doing absolutely
nothing at all: nobody interacts with them, nobody repairs them, they
just sit there. And when nobody talks to these nodes, no files are
modified, right?

It seems there is no silver bullet here; what is your opinion on this?

Regards

(1) https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L786-L799


Re: The most reliable way to determine the last time node was up

Posted by Paulo Motta <pa...@gmail.com>.
> I would expect that if nobody talks to a node and no operation is
> running, it does not produce any "side effects".

In order to track the last checkpoint timestamp, you need to persist it
periodically to protect against losing state during an ungraceful
shutdown (i.e. kill -9).

However, you're right that this may generate tons of sstables if we
persist it periodically to a system table, even if we skip the commit
log. We could tune system.local compaction to use LCS, but it would
still generate periodic compaction activity. In this case an external
marker file sounds much simpler and cleaner.

The downsides I see to the marker file approach are:
a) External clients cannot query the last checkpoint time easily.
b) The state is lost if the marker file is removed.

However, we could solve these issues with:
a) exposing the info via a system table
b) falling back to min(last commitlog/sstable timestamp)

I prefer an explicit mechanism to track the last checkpoint (i.e. a
marker file) over an implicit min(last commitlog/sstable timestamp),
so we don't create unnecessary coupling between different subsystems.
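
Roughly, the startup read with that fallback could look like this
sketch, where the paths are placeholders and the newest file
modification time serves as the liveness bound:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.stream.Stream;

public final class LastCheckpoint
{
    // The marker file, when present, is authoritative; otherwise fall
    // back to the newest commitlog/sstable modification time.
    public static Instant estimate(Path marker, Path commitlog, Path data) throws IOException
    {
        if (Files.exists(marker))
            return Files.getLastModifiedTime(marker).toInstant();
        try (Stream<Path> cl = Files.walk(commitlog); Stream<Path> dt = Files.walk(data))
        {
            return Stream.concat(cl, dt)
                         .filter(Files::isRegularFile)
                         .map(LastCheckpoint::mtime)
                         .max(Instant::compareTo)
                         .orElse(Instant.EPOCH); // no files at all: treat as unknown
        }
    }

    private static Instant mtime(Path p)
    {
        try
        {
            return Files.getLastModifiedTime(p).toInstant();
        }
        catch (IOException e)
        {
            return Instant.EPOCH; // unreadable entries count as old
        }
    }
}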

Cheers,

Paulo


Re: The most reliable way to determine the last time node was up

Posted by Stefan Miklosovic <st...@instaclustr.com>.
Yes, this is a combination of the system.local and "marker file"
approaches, basically updating that field periodically.

However, when a mutation is made against a system table (in this
example), it goes to the commit log and is then propagated to an
sstable on disk, no? So in our hypothetical scenario, even if a node
is not touched by anybody, it would still behave like it _does_
something. I would expect that if nobody talks to a node and no
operation is running, it produces no "side effects".

I just do not want to generate any unnecessary noise. A node which
does not do anything should not change its data. I am not sure whether
it is like that already, or whether an inactive node still writes new
sstables after some time; I doubt that.


Re: The most reliable way to determine the last time node was up

Posted by Paulo Motta <pa...@gmail.com>.
How about a last_checkpoint (or better name) system.local column that
is updated periodically (i.e. every minute) and on drain? This would
give a lower time bound on when the node was last live without
requiring an external marker file.
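
As a rough sketch, with the column name and the actual write as
placeholders for an internal system.local update:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class CheckpointHeartbeat
{
    public static void start()
    {
        ScheduledExecutorService heartbeat = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "last-checkpoint-heartbeat");
            t.setDaemon(true); // must not keep a stopping JVM alive
            return t;
        });
        // Refresh the hypothetical column once a minute ...
        heartbeat.scheduleAtFixedRate(CheckpointHeartbeat::checkpoint, 0, 1, TimeUnit.MINUTES);
        // ... and once more on shutdown/drain for a tighter bound.
        Runtime.getRuntime().addShutdownHook(new Thread(CheckpointHeartbeat::checkpoint));
    }

    private static void checkpoint()
    {
        // Placeholder for an internal write such as:
        //   UPDATE system.local SET last_checkpoint = ? WHERE key = 'local'
        System.out.println("last_checkpoint = " + System.currentTimeMillis());
    }
}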


Re: The most reliable way to determine the last time node was up

Posted by Stefan Miklosovic <st...@instaclustr.com>.
The third option would be to have a thread running in the background
"touching" some (empty) marker file. It is the simplest solution, but
I do not like the idea of this marker file; it feels dirty. But hey,
as long as it is an opt-in feature for people who know what they
want, why not, right ...
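
The toucher itself would be tiny; a sketch, with the marker path left
as a placeholder:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class AliveMarker
{
    public static void start(Path marker)
    {
        ScheduledExecutorService toucher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "alive-marker");
            t.setDaemon(true); // must not keep a stopping JVM alive
            return t;
        });
        toucher.scheduleAtFixedRate(() -> touch(marker), 0, 1, TimeUnit.MINUTES);
    }

    private static void touch(Path marker)
    {
        try
        {
            if (Files.notExists(marker))
                Files.createFile(marker);
            Files.setLastModifiedTime(marker, FileTime.from(Instant.now()));
        }
        catch (IOException e)
        {
            // Best effort: a missed touch only widens the uncertainty window.
        }
    }
}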


Re: The most reliable way to determine the last time node was up

Posted by Brandon Williams <dr...@gmail.com>.
If you always drain you won't have any commit logs.


Re: The most reliable way to determine the last time node was up

Posted by Elliott Sims <el...@backblaze.com>.
To deal with this, I've just made a very small Bash script that looks at
commitlog age, then set the script as an "ExecStartPre=" in systemd:

if [[ -d '/opt/cassandra/data/data' && \
      $(/usr/bin/find /opt/cassandra/data/commitlog/ -name 'CommitLog*.log' -mtime -8 | wc -l) -eq 0 ]]; then
  # No commitlog segment modified in the last 8 days: refuse to start.
  >&2 echo "ERROR: precheck failed, Cassandra data too old"
  exit 10
fi

The first conditional is there to reduce false positives on brand-new
machines with no data.
I suspect it'll false-positive if your writes are extremely rare (that
is, basically read-only), but at that point you may not need it at all.
(Adjust as needed for your grace period and paths.)
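
For the systemd side this is just a drop-in on the Cassandra unit,
something like the following (the script path is an assumption); a
non-zero exit from "ExecStartPre=" prevents the unit from starting,
which is what makes the "exit 10" above effective:

[Service]
ExecStartPre=/usr/local/bin/cassandra-age-precheck.sh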


Re: The most reliable way to determine the last time node was up

Posted by Berenguer Blasi <be...@gmail.com>.
Apologies, I missed Paulo's reply due to my email client's threading funnies...


Re: The most reliable way to determine the last time node was up

Posted by Berenguer Blasi <be...@gmail.com>.
What about an hourly heartbeat 'lastSeenAlive' timestamp? my 2cts.
