Posted to dev@cassandra.apache.org by Sam Klock <sk...@akamai.com.INVALID> on 2018/12/19 21:51:55 UTC

Question about PartitionUpdate.singleRowUpdate()

Cassandra devs,

I have a question about the implementation of
PartitionUpdate.singleRowUpdate(), in particular the choice to use
EncodingStats.NO_STATS when building the resulting PartitionUpdate.  Is
there a functional reason for that -- i.e., is it safe to modify it to
use an EncodingStats built from deletionInfo, row, and staticRow?

Context: under Cassandra 3.0.17, we have a table using TWCS
(TimeWindowCompactionStrategy) and a secondary index.  We've been having
a problem with the sstables for the index lingering essentially forever,
even though the corresponding sstables for the parent table are removed
about when we expect them to be.  We traced the problem to the use of
EncodingStats.NO_STATS in singleRowUpdate(), which is used to create the
index updates when we write to the parent table.  It appears that
NO_STATS makes Cassandra think the memtables for the index contain data
from September 2015, which in turn prevents it from dropping expired
sstables for the index (all of which are much more recent than that).
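
To make the suspected mechanism concrete, here is a minimal, self-contained
Java sketch. It is a simplified model, not Cassandra code: the class and
method names are hypothetical, the exact placeholder day in September 2015
is an assumption, and the drop rule (an expired sstable can go only if its
newest data is older than the oldest data the memtable claims to hold) is
my reading of the behavior described above.

```java
import java.time.Instant;
import java.util.List;

// Simplified, hypothetical model; none of these names are Cassandra classes.
final class ExpiredSSTableSketch {
    // Assumed placeholder: an instant in September 2015, in microseconds.
    // The exact day is an assumption made for illustration.
    static final long PLACEHOLDER_MICROS = micros("2015-09-22T00:00:00Z");

    // Convert an ISO-8601 instant to microseconds since the Unix epoch.
    static long micros(String iso) {
        return Instant.parse(iso).toEpochMilli() * 1000L;
    }

    // Oldest write timestamp the memtable believes it holds, taken as the
    // minimum over the stats attached to each update it received.
    static long memtableMinTimestamp(List<Long> updateMinTimestamps) {
        long min = Long.MAX_VALUE;
        for (long ts : updateMinTimestamps) {
            min = Math.min(min, ts);
        }
        return min;
    }

    // Assumed rule: an expired sstable may be dropped only if everything in
    // it is older than the oldest data the memtable claims to contain.
    static boolean canDrop(long sstableMaxTimestamp, long memtableMin) {
        return sstableMaxTimestamp < memtableMin;
    }

    public static void main(String[] args) {
        long sstableMax = micros("2018-12-01T00:00:00Z");
        long realWrite = micros("2018-12-19T00:00:00Z");

        long withPlaceholder = memtableMinTimestamp(List.of(PLACEHOLDER_MICROS));
        long withRealStats = memtableMinTimestamp(List.of(realWrite));

        System.out.println("placeholder stats, can drop: " + canDrop(sstableMax, withPlaceholder));
        System.out.println("real stats, can drop:        " + canDrop(sstableMax, withRealStats));
    }
}
```

In this model the placeholder makes the memtable look older than every
sstable, so nothing is ever eligible to be dropped; with stats taken from
the actual write, the December 1 sstable becomes droppable.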

Experimentally, modifying singleRowUpdate() to build an EncodingStats
from its inputs (plus the MutableDeletionInfo it creates) seems to fix
the problem.  We don't have any insight into why the existing logic uses
NO_STATS, however, so we don't know if this change is really safe.  Does
it sound like we're on the right track?  (Also: I'm sure we'd be happy
to open an issue and submit a patch if this sounds like it would be
useful generally.)
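
For illustration only, the general shape of the change (deriving the stats
from the update's own inputs rather than using a fixed placeholder) could
be sketched as below. This is not the actual patch and does not use
Cassandra's classes: StatsSketch, fromInputs(), and the placeholder values
are hypothetical stand-ins, and the per-field fallback is an assumption.

```java
import java.util.List;

// Hypothetical stand-in for per-update encoding stats; not the Cassandra class.
final class StatsSketch {
    // Assumed placeholder values (an instant in September 2015), for illustration only.
    static final long PLACEHOLDER_TIMESTAMP_MICROS = 1_442_880_000_000_000L;
    static final int PLACEHOLDER_DELETION_SECONDS = 1_442_880_000;
    static final StatsSketch NO_STATS =
        new StatsSketch(PLACEHOLDER_TIMESTAMP_MICROS, PLACEHOLDER_DELETION_SECONDS);

    final long minTimestamp;        // microseconds since the Unix epoch
    final int minLocalDeletionTime; // seconds since the Unix epoch

    StatsSketch(long minTimestamp, int minLocalDeletionTime) {
        this.minTimestamp = minTimestamp;
        this.minLocalDeletionTime = minLocalDeletionTime;
    }

    // Derive stats from the values actually present in the update. A field
    // falls back to the placeholder only when no value was observed for it.
    static StatsSketch fromInputs(List<Long> writeTimestamps, List<Integer> localDeletionTimes) {
        long minTs = writeTimestamps.stream()
                                    .mapToLong(Long::longValue)
                                    .min()
                                    .orElse(PLACEHOLDER_TIMESTAMP_MICROS);
        int minDel = localDeletionTimes.stream()
                                       .mapToInt(Integer::intValue)
                                       .min()
                                       .orElse(PLACEHOLDER_DELETION_SECONDS);
        return new StatsSketch(minTs, minDel);
    }

    public static void main(String[] args) {
        // One row write and one tombstone, both from December 2018.
        StatsSketch stats = fromInputs(List.of(1_545_256_315_000_000L), List.of(1_545_256_315));
        System.out.println("min timestamp:           " + stats.minTimestamp);
        System.out.println("min local deletion time: " + stats.minLocalDeletionTime);
    }
}
```

With stats built this way, an update carrying only recent data reports a
recent minimum timestamp instead of the placeholder's 2015 value.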

Thanks,
SK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org


Re: Question about PartitionUpdate.singleRowUpdate()

Posted by Sam Klock <sk...@akamai.com.INVALID>.
Thanks for the feedback.  I've opened:

https://issues.apache.org/jira/browse/CASSANDRA-14941

SK

On 2018-12-19 17:03, Jeff Jirsa wrote:
> Definitely worth a JIRA. Suspect it may be slow to get a response this
> close to the holidays, but a JIRA will be a bit more durable than the
> mailing list post.


Re: Question about PartitionUpdate.singleRowUpdate()

Posted by Jeff Jirsa <jj...@gmail.com>.
Definitely worth a JIRA. Suspect it may be slow to get a response this
close to the holidays, but a JIRA will be a bit more durable than the
mailing list post.