Posted to user@cassandra.apache.org by fald 1970 <fa...@gmail.com> on 2019/04/17 08:42:14 UTC

Fwd: gc_grace config for time series database

Hi,

According to these facts:
1. If a node is down for longer than max_hint_window_in_ms (3 hours by
default), the coordinator stops writing new hints.
2. The main purpose of the gc_grace_seconds property is to prevent zombie
data, and it also determines how long the coordinator keeps hint files.

When we use Cassandra for time-series data where:
A) every row of data has a TTL and there are no explicit deletes, so we are
not so worried about zombies, and
B) every minute there are hundreds of write requests to each node, so if one
of the nodes is down for longer than max_hint_window_in_ms we have to run a
manual repair on that node anyway, and the hints stored on the coordinator
won't be needed.

So finally the question: is it a good idea to set gc_grace_seconds equal to
max_hint_window_in_ms (divided by 1000 to convert to seconds), for example
setting them both to 3 hours? Why keep the tombstones for 10 days when they
won't be needed at all?
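
For example, something like this (keyspace and table names are just
placeholders, and the yaml line is a node-level setting, not CQL):

-- per-table tombstone grace period, here 3 hours (10800 seconds)
ALTER TABLE ks.sensor_data WITH gc_grace_seconds = 10800;

-- matching hint window in cassandra.yaml; 10800000 ms = 3 hours,
-- which is already the default:
--   max_hint_window_in_ms: 10800000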

Best Regards
Federica Albertini

Re: gc_grace config for time series database

Posted by Stefan Miklosovic <st...@instaclustr.com>.
I am wrong in this paragraph:

>> On the other hand, say a node was down, the data was TTLed on the healthy
>> nodes and tombstones were created, then you start the node which was down
>> and, while its copy is still counting down, you hit that node with an update.

It does not matter how long that dead node was dead. Once you start the
database it computes the TTL regardless; it does not suddenly stop counting
the time it was dead. It would just mean the data would be TTLed on that
node when it should not be, because the other, healthy nodes could have
received updates after they stopped writing hints.
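
A quick way to see what is actually stored, reusing the test.test table and
row id from the example in my earlier mail (ttl() and writetime() are
built-in functions):

-- writetime() returns the cell's write timestamp (microseconds since epoch),
-- ttl() the seconds remaining; the expiry point is the write time plus the
-- original TTL, no matter how long a replica was down in between
SELECT writetime(value), ttl(value)
  FROM test.test
 WHERE id = 4f860bf0-d793-4408-8330-a809c6cf6375;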

But you say you don't ever update, so that is not applicable here.

It is an interesting question and I won't give you an ultimate answer; maybe
somebody else will give their opinion on this. I am curious what
consequences, if any, setting them equal would have.



On Wed, 17 Apr 2019 at 23:12, onmstester onmstester
<on...@zoho.com.invalid> wrote:
>
> I do not use a table-level default TTL (every row has its own TTL), and no updates occur to the rows.
> I suppose that (because of the immutable nature of everything in Cassandra) Cassandra keeps only the insertion timestamp plus the original TTL, and computes the TTL of a row from these two and the current system timestamp whenever needed (when you select the TTL or when compaction occurs).
> So something like this should be attached to every row: "this row was inserted at 4/17/2019 12:20 PM and should be deleted in 2 months", so whatever happens to the row's replicas, my intention of removing it on 6/17 should not change!
>
> Would you suggest that my idea of "gc_grace = max_hint = 3 hours" for a time-series db is not reasonable?
>
> Sent using Zoho Mail

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: gc_grace config for time series database

Posted by onmstester onmstester <on...@zoho.com.INVALID>.
I do not use a table-level default TTL (every row has its own TTL), and no updates occur to the rows.

I suppose that (because of the immutable nature of everything in Cassandra) Cassandra keeps only the insertion timestamp plus the original TTL, and computes the TTL of a row from these two and the current system timestamp whenever needed (when you select the TTL or when compaction occurs).

So something like this should be attached to every row: "this row was inserted at 4/17/2019 12:20 PM and should be deleted in 2 months", so whatever happens to the row's replicas, my intention of removing it on 6/17 should not change!
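
Roughly what I mean, with a made-up keyspace and table (names, schema and
numbers are only an illustration):

-- every row gets its own TTL at write time; no table-level default_time_to_live
CREATE KEYSPACE IF NOT EXISTS metrics
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS metrics.sensor_data (
    sensor_id int,
    ts        timestamp,
    value     double,
    PRIMARY KEY (sensor_id, ts)
);

-- 5184000 seconds = 60 days, i.e. "delete this in about 2 months"
INSERT INTO metrics.sensor_data (sensor_id, ts, value)
VALUES (42, '2019-04-17 12:20:00+0000', 17.5)
USING TTL 5184000;

-- writetime() shows the insertion timestamp, ttl() the seconds left until
-- the expiry point that was fixed at insert time
SELECT writetime(value), ttl(value)
  FROM metrics.sensor_data
 WHERE sensor_id = 42 AND ts = '2019-04-17 12:20:00+0000';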



Would you suggest that my idea of "gc_grace = max_hint = 3 hours" for a time-series db is not reasonable?


Sent using https://www.zoho.com/mail/

Re: gc_grace config for time series database

Posted by Stefan Miklosovic <st...@instaclustr.com>.
The TTL value decreases every second and is set back to the original TTL
whenever an update occurs on that row (see the example below). Does it not
logically follow that if a node is down for some time while updates keep
occurring on the live nodes, and hinted handoffs are saved for three hours
and stop after that, then your data on the other nodes would not be deleted,
because the TTLs are reset on every update and the countdown starts again
(which is correct), but it would be deleted on the node that was down,
because it did not receive the updates? If you then query that node, the
data will not be there, although it should be.

On the other hand, say a node was down, the data was TTLed on the healthy
nodes and tombstones were created, then you start the node which was down
and, while its copy is still counting down, you hit that node with an
update. There is then no tombstone on the previously dead node, but there
are tombstones on the healthy ones, and if you purge tombstones after 3
hours the previously dead node will never get that information, so your
data might actually end up being resurrected when it is replicated back to
the always-healthy nodes as part of a repair.

Do you see a flaw in my reasoning?

cassandra@cqlsh> DESCRIBE TABLE test.test;

CREATE TABLE test.test (
    id uuid PRIMARY KEY,
    value text
) WITH bloom_filter_fp_chance = 0.6
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 60
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';


cassandra@cqlsh> select ttl(value) from test.test where id =
4f860bf0-d793-4408-8330-a809c6cf6375;

 ttl(value)
------------
         25

(1 rows)
cassandra@cqlsh> UPDATE test.test SET value = 'c' WHERE  id =
4f860bf0-d793-4408-8330-a809c6cf6375;
cassandra@cqlsh> select ttl(value) from test.test where id =
4f860bf0-d793-4408-8330-a809c6cf6375;

 ttl(value)
------------
         59

(1 rows)
cassandra@cqlsh> select * from test.test  ;

 id                                   | value
--------------------------------------+-------
 4f860bf0-d793-4408-8330-a809c6cf6375 |     c
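
If it helps, what is currently in effect for a table can also be read back
from the schema (just a quick check via system_schema):

SELECT gc_grace_seconds, default_time_to_live
  FROM system_schema.tables
 WHERE keyspace_name = 'test' AND table_name = 'test';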


On Wed, 17 Apr 2019 at 19:18, fald 1970 <fa...@gmail.com> wrote:
>
>
>
> Hi,
>
> According to these facts:
> 1. If a node is down for longer than max_hint_window_in_ms (3 hours by default), the coordinator stops writing new hints.
> 2. The main purpose of the gc_grace_seconds property is to prevent zombie data, and it also determines how long the coordinator keeps hint files.
>
> When we use Cassandra for time-series data where:
> A) every row of data has a TTL and there are no explicit deletes, so we are not so worried about zombies, and
> B) every minute there are hundreds of write requests to each node, so if one of the nodes is down for longer than max_hint_window_in_ms we have to run a manual repair on that node anyway, and the hints stored on the coordinator won't be needed.
>
> So finally the question: is it a good idea to set gc_grace_seconds equal to max_hint_window_in_ms (divided by 1000 to convert to seconds),
> for example setting them both to 3 hours? Why keep the tombstones for 10 days when they won't be needed at all?
>
> Best Regards
> Federica Albertini

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org