Posted to commits@cassandra.apache.org by "Stefan Podkowinski (JIRA)" <ji...@apache.org> on 2019/01/24 15:19:04 UTC

[jira] [Created] (CASSANDRA-14999) Incorrect fallback calculation of getApproximateKeyCount

Stefan Podkowinski created CASSANDRA-14999:
----------------------------------------------

             Summary: Incorrect fallback calculation of getApproximateKeyCount
                 Key: CASSANDRA-14999
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14999
             Project: Cassandra
          Issue Type: Bug
            Reporter: Stefan Podkowinski


Computing a key count for a number of sstables depends on a probabilistic HyperLogLog data structure for estimating the cardinality of keys. In case of any errors, we fall back to [some code|https://github.com/apache/cassandra/blob/7d138e20ea987d44fffbc47de4674b253b7431ff/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L294] that does not calculate the cardinality, but simply sums the estimated key counts of all sstables. For larger numbers of sstables with identical keys, this sum can be many times higher than the actual number of distinct keys.
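
To illustrate the difference, here is a minimal, self-contained sketch. It is not the Cassandra code: the class and method names are made up for this example, and exact HashSets stand in for the per-sstable HyperLogLog estimators. It only shows how summing per-sstable counts diverges from the cardinality of the union when the same keys appear in every sstable.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names are not taken from Cassandra.
public class KeyCountFallbackSketch {

    // Fallback-style estimate: add up the key count of each sstable,
    // ignoring that the same key may live in several sstables.
    static long sumOfPerTableCounts(List<Set<String>> sstables) {
        long total = 0;
        for (Set<String> keys : sstables)
            total += keys.size();
        return total;
    }

    // Cardinality of the union of all keys. This is the quantity that
    // merged cardinality estimators approximate; computed exactly here.
    static long distinctKeyCount(List<Set<String>> sstables) {
        Set<String> union = new HashSet<>();
        for (Set<String> keys : sstables)
            union.addAll(keys);
        return union.size();
    }

    public static void main(String[] args) {
        // Three sstables that all contain the same 1000 keys.
        List<Set<String>> sstables = new ArrayList<>();
        for (int t = 0; t < 3; t++) {
            Set<String> keys = new HashSet<>();
            for (int k = 0; k < 1000; k++)
                keys.add("key" + k);
            sstables.add(keys);
        }
        System.out.println("sum of per-sstable counts: " + sumOfPerTableCounts(sstables)); // 3000
        System.out.println("distinct keys: " + distinctKeyCount(sstables));               // 1000
    }
}
```

With fully overlapping keys, the summed value grows linearly with the number of sstables while the distinct key count stays constant.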

We should have a look at the possible implications of that. Do we depend on this value for sizing bloom filters?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
