Posted to dev@cassandra.apache.org by chandi datta <ch...@gmail.com> on 2019/02/12 15:11:23 UTC

Re: Related to large partition and row out of order

Hi,

We have been facing a weird issue for a long time.

High-level table definition: primary key
((column1, column2, column3), column4, column5)

Issue: we keep generating very large partitions, sometimes as large as 6 GB
for a single partition.
Distribution: DSE 5.1
            : Cassandra 3.11.2

After some debugging, here are our findings:
1. There are multiple mutations for the same row with the same timestamp.
Millions of rows exist with the exact same information (only their positions
differ). I have not been able to reproduce this on a local system.

    [67:1:0]@139420369 Row[info=[ts=1541027088688857] ]: 1727728, 2014-11-24 | [status=TE ts=1541027088688857]
    [67:1:0]@139420413 Row[info=[ts=1479790801000000] ]: 1727728,  | [status=TE ts=1479790801000000]
    [67:1:0]@155707642 Row[info=[ts=1494652842062000] ]: 1727727, 2008-09-23 | [status=OK ts=1528736995521967]
    [67:1:0]@155707693 Row[info=[ts=1479790801000000] ]: 1727727,  | [status=TE ts=1479790801000000]
    [67:1:0]@155707720 Row[info=[ts=1494652842062000] ]: 1727727, 2008-09-23 | [status=OK ts=1528736995521967]
    [67:1:0]@155707771 Row[info=[ts=1479790801000000] ]: 1727727,  | [status=TE ts=1479790801000000]

2. We ran a regular major compaction, a user-defined compaction, and a normal
scrub. Nothing helped. But a scrub with the --reinsert-overflowed-ttl option
removed all duplicate rows. I did not see any overflowed localExpirationTime
values anywhere in that sstable, though.


3. A normal scrub does not report that the above rows are out of order, but
with the --reinsert-overflowed-ttl fix it reports that rows are out of order
and creates a new sstable. So the scrubber's OrderCheckerIterator computeNext
method is detecting these out-of-order rows.
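For anyone following along, the order check the scrubber performs boils down to
comparing each row's clustering key against the last valid one while walking a
supposedly sorted sstable. A minimal sketch of that logic in Python (the names
and row model here are illustrative, not Cassandra's actual API):

```python
def check_order(rows, key=lambda row: row):
    """Walk rows from an (assumed sorted) sstable-like iterator and flag any
    row whose clustering key is not strictly greater than the last valid one.
    A real scrubber would divert flagged rows into a separate sstable."""
    prev = None
    for row in rows:
        k = key(row)
        if prev is not None and k <= prev:
            # Duplicate or descending key: out of order.
            yield ("out-of-order", row)
        else:
            yield ("ok", row)
            prev = k

# Duplicate clustering keys, like in the dump above, trip the check:
results = list(check_order([1727727, 1727727, 1727728]))
```

Note that equal keys count as out of order too, which is why duplicated rows
would be reported once an order check actually runs.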


Our questions: under what circumstances can we expect row out-of-order
situations? Why does normal compaction not compact these rows with the same
timestamp? I ran this through the 3.11 master code base a few times, but every
time it just blindly copies the rows into a new object, as there is no
transformation at all.
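One possible explanation for the compaction question, offered as an assumption
rather than a confirmed diagnosis: a compaction-style merge iterator trusts each
input sstable to be sorted and duplicate-free, so it only reconciles rows that
different inputs produce at the same merge position. Duplicate rows sitting at
different positions inside a single sstable are never compared with one another
and pass through untouched. A simplified sketch (not Cassandra's real code):

```python
def compaction_merge(iterators):
    """Merge heads across several sorted inputs, reconciling only when
    different inputs yield the same key at the same merge position.
    Rows within one input are never compared against each other."""
    iters = [iter(it) for it in iterators]
    heads = [next(it, None) for it in iters]
    out = []
    while any(h is not None for h in heads):
        smallest = min(h for h in heads if h is not None)
        # Advance every iterator whose head matches; their rows would be
        # reconciled into a single output row.
        for i, h in enumerate(heads):
            if h == smallest:
                heads[i] = next(iters[i], None)
        out.append(smallest)
    return out

# Equal keys from *different* inputs meet at the same position and merge:
compaction_merge([[1, 2, 3], [2, 3, 4]])
# But duplicates *within* one input simply pass through, one by one:
compaction_merge([[1727728, 1727727, 1727727]])
```

If this model is right, it would explain why compaction copies the duplicate
rows into the new sstable without any transformation.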

Thanks in advance!

On Tue, Feb 12, 2019 at 9:00 AM chandi datta <ch...@gmail.com> wrote:

> [quoted text trimmed]

Re: Related to large partition and row out of order

Posted by Benedict Elliott Smith <be...@apache.org>.
This sounds like something you should report upstream as a bug to DataStax.

I’m unaware of any bugs in Cassandra mainline that cause this behaviour, but if you can reproduce the creation of these partitions in 3.11.2, we can help to diagnose the cause.  But without source code it would be a fool’s errand.

If you simply want to mitigate the problem, your best bet is probably to write your own scrub tool that specifically filters out duplicate cells.  The scrub tool is quite simple, so you could probably take the existing code as a starting point and modify it quite easily.
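A duplicate-filtering pass of that kind could be as simple as keeping, for each
clustering key, the row with the highest write timestamp. A rough Python sketch
of the idea (the (key, timestamp, value) row model is made up for illustration;
a real tool would operate on Cassandra's row types inside the scrubber):

```python
def dedup_rows(rows):
    """Keep one row per clustering key, preferring the highest write
    timestamp. `rows` is an iterable of (clustering_key, timestamp, value)
    tuples; first-seen key order is preserved in the output."""
    best = {}    # clustering_key -> (timestamp, value) with the max timestamp
    order = []   # first-seen order of clustering keys
    for key, ts, value in rows:
        if key not in best:
            order.append(key)
            best[key] = (ts, value)
        elif ts > best[key][0]:
            best[key] = (ts, value)
    return [(k, *best[k]) for k in order]
```

Run over duplicates like those in the original dump, this keeps a single row
per clustering key and drops the stale same-key copies.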

Compaction and scrub do not generally handle this kind of problem, as it is a very specific form of corruption.  It is unclear that it is even desirable to handle it automatically, or what a generally applicable approach for handling it would be.


> On 12 Feb 2019, at 15:11, chandi datta <ch...@gmail.com> wrote:
> 
> [quoted text trimmed]


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org