Posted to user@cassandra.apache.org by Aiman Parvaiz <ai...@flipagram.com> on 2015/06/04 19:57:46 UTC

Reading too many tombstones

Hi everyone,
We are running a 10-node Cassandra 2.0.9 cluster without vnodes. We are
running into an issue where we are reading too many tombstones and hence
getting lots of WARN messages and some ERROR "query aborted" messages.

cass-prod4 2015-06-04 14:38:34,307 WARN ReadStage:1998
SliceQueryFilter.collectReducedColumns - Read 46 live and 1560 tombstoned
cells in ABC.home_feed (see tombstone_warn_threshold). 100 columns was
requested, slices=[-], delInfo={deletedAt=-9223372036854775808,
localDeletion=2147483647}

cass-prod2 2015-05-31 12:55:55,331 ERROR ReadStage:1953
SliceQueryFilter.collectReducedColumns - Scanned over 100000 tombstones in
ABC.home_feed; query aborted (see tombstone_fail_threshold)

As you can see, all of this is happening for the CF home_feed. This CF
basically maintains a feed with a TTL of 2592000 seconds (30 days),
gc_grace_seconds of 864000 (10 days), and SizeTieredCompactionStrategy.

Repairs have been running regularly and automatic compactions are occurring
normally too.

I could definitely use some help figuring out how to tackle this issue.

Up till now I have the following ideas:

1) Set gc_grace_seconds to 0, run a manual compaction on this CF, and then
bump gc_grace_seconds back up (a rough command sketch follows after this
list).

2) Set gc_grace_seconds to 0, run a manual compaction on this CF, and leave
it at zero. In that case we would have to be careful about running repairs.

3) I am also considering moving to DateTieredCompactionStrategy.
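
For option 1, the sequence would look roughly like this (just a sketch using
our keyspace/CF names, not something I have actually run yet).

In cqlsh:

  ALTER TABLE ABC.home_feed WITH gc_grace_seconds = 0;

Then on each node:

  nodetool compact ABC home_feed

And once the compactions have finished, back in cqlsh:

  ALTER TABLE ABC.home_feed WITH gc_grace_seconds = 864000;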

What would be the best approach here for my feed use case? Any help is
appreciated.

Thanks

Re: Reading too many tombstones

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
What actually happens is that STCS, as well as LCS, mixes old and fresh data
during the compaction process.

So all the fragments of a row that you deleted (or whose TTL expired) are
spread among multiple SSTables, and they all need to be gathered in the same
compaction to be fully evicted. In a time series model (using wide rows)
there can be quite a few fragments per row, depending on how the primary key
is sharded and on your insert / update workload. This is due to issues
around distributed deletes: tombstones are effectively special inserts, and
if you remove them without removing all the fragments of the row, old,
incorrect data may come back. So you need to run repairs within the grace
period (usually 10 days) to avoid ghosts, i.e. make sure every node has
received the tombstone, and then compact all the fragments (from multiple
SSTables) at once at the node level.
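
For the repair part, something like this on every node, at least once per
gc_grace window, should do it (with -pr each node only repairs its primary
ranges, so the command has to be run on all nodes):

  nodetool repair -pr ABC home_feed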

Using STCS, the only way to make sure of this is to run a major compaction
(nodetool compact mykeyspace mycf). But a major compaction also has negative
impacts (it leaves you with one huge SSTable that will rarely be compacted
again), so if you choose this option you should read about it first.
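
By the way, to check whether the SSTables really are full of purgeable
tombstones, the sstablemetadata tool shipped in the tools directory should
report an estimated droppable tombstone ratio per SSTable (the path below is
just the default data location, adjust for your setup):

  tools/bin/sstablemetadata /var/lib/cassandra/data/ABC/home_feed/*-Data.db \
    | grep "Estimated droppable tombstones"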

Another approach would be to periodically truncate your data. Depending on
your needs, of course...

Anyway, handling tombstones (and thus TTLs) properly has always been a
tricky issue. It is precisely for this kind of use case that DTCS was
designed: this strategy groups data by date, which makes sense for time
series with constant TTLs.

See https://labs.spotify.com/2014/12/18/date-tiered-compaction/ and
http://www.datastax.com/dev/blog/datetieredcompactionstrategy

Hope this will help.

NB: I haven't used this myself (yet ;-)). DTCS is available from Cassandra
2.0.11; you should look at the changelog for the improvements / issues
around DTCS and, imho, go directly to the latest minor (2.0.15?) if you want
to use this compaction strategy.
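
If you do try it, the switch itself is just an ALTER TABLE; the subproperty
values below are only illustrative, not a tuned recommendation:

  ALTER TABLE ABC.home_feed
  WITH compaction = {
    'class': 'DateTieredCompactionStrategy',
    'base_time_seconds': '3600',
    'max_sstable_age_days': '30'
  };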

C*heers,

Alain





Re: Reading too many tombstones

Posted by Sebastian Estevez <se...@datastax.com>.
Check out the compaction subproperties for tombstones.

http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html?scroll=compactSubprop__compactionSubpropertiesDTCS
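
For example, something along these lines makes STCS more aggressive about
purging tombstones without waiting for a full size tier to compact (the
values are illustrative only; unchecked_tombstone_compaction needs 2.0.9+):

  ALTER TABLE ABC.home_feed
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.2',
    'tombstone_compaction_interval': '86400',
    'unchecked_tombstone_compaction': 'true'
  };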

Re: Reading too many tombstones

Posted by Aiman Parvaiz <ai...@flipagram.com>.
Thanks Carlos for pointing me in that direction; I have some interesting
findings to share. In December last year there was a redesign of home_feed
and it was migrated to a new CF. Initially all the data in home_feed had a
TTL of 1 year, but the migrated data was inserted with a TTL of 30 days.
Now, digging a bit deeper, I found that home_feed still has data from
January 2015 with a TTL of 1275094 seconds (roughly 14 days).

This data is for the same id from home_feed:
 date                     | ttl(description)
--------------------------+------------------
 2015-04-03 21:22:58+0000 |           759791
 2015-04-03 04:50:11+0000 |           412706
 2015-03-30 22:18:58+0000 |           759791
 2015-03-29 15:20:36+0000 |          1978689
 2015-03-28 14:41:28+0000 |          1275116
 2015-03-28 14:31:25+0000 |          1275116
 2015-03-18 19:23:44+0000 |          2512936
 2015-03-13 17:51:01+0000 |          1978689
 2015-02-12 15:41:01+0000 |          1978689
 2015-01-18 02:36:27+0000 |          1275094
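
For reference, the listing above came from a cqlsh query roughly like the
one below (the column and key names are approximations of our schema):

  SELECT date, ttl(description) FROM home_feed WHERE id = <feed id> LIMIT 10;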


I am not sure what happened in that migration, but I think that when loading
the feed we are reading this old data (the feed queries 1000 rows per page
to display to the user), and in order to read it we have to scan over lots
of tombstones (the newer data has TTLs working correctly), hence the errors.
I am also not sure how much DateTiered compaction would help us in this
situation. If anyone has any suggestions on how to handle this, either at
the systems or the developer level, please pitch in.

Thanks


-- 
Lead Systems Architect
10351 Santa Monica Blvd, Suite 3310
Los Angeles CA 90025

Re: Reading too many tombstones

Posted by Carlos Rolo <ro...@pythian.com>.
The TTL'd data will only be removed after gc_grace_seconds, so your data
with a 30-day TTL will still be in Cassandra for 10 more days (40 in total).
Has your data been there longer than that? Otherwise this is expected
behaviour, and you should probably do something in your data model to avoid
scanning tombstoned data.
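
In concrete numbers for your CF: 2592000 s (30-day TTL) + 864000 s
(gc_grace_seconds) = 3456000 s, i.e. about 40 days before a cell is even
eligible for purging, and it is actually dropped only when a compaction
picks up the SSTable holding it.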

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
<http://linkedin.com/in/carlosjuzarterolo>*
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

Re: Reading too many tombstones

Posted by Aiman Parvaiz <ai...@flipagram.com>.
Yeah, we don't update old data. One thing I am curious about is why we are
running into so many tombstones with compaction happening normally. Is
compaction not removing tombstones?

Re: Reading too many tombstones

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
DateTiered is fantastic if you've got time series, TTLed data.  That means
no updates to old data.
