Posted to commits@cassandra.apache.org by "Branimir Lambov (Jira)" <ji...@apache.org> on 2021/07/14 12:05:00 UTC

[jira] [Comment Edited] (CASSANDRA-12922) Bloom filter miss counts are not measured correctly

    [ https://issues.apache.org/jira/browse/CASSANDRA-12922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380536#comment-17380536 ] 

Branimir Lambov edited comment on CASSANDRA-12922 at 7/14/21, 12:04 PM:
------------------------------------------------------------------------

To be honest, I did not realize there is a "there is no index file" exit path; an index is required for normal operation. (The only place we can do without one is {{Scrubber}}, which does not rely on {{getPosition}}.)

You are right that, for the bloom filter, not having an index file should not matter. But if such a scenario does occur, it would be due to a problem with the index rather than with the precision of the bloom filter; the relevant data may actually be present. Instead of recording a false positive, I would thus prefer to move the {{ifile == null}} check to the beginning of the method and leave it outside the coverage of the bloom filter tracker.
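
To illustrate the ordering proposed above, here is a minimal, self-contained sketch. It is not the actual {{BigTableReader}} code: the class {{SketchReader}}, its counter fields, the simulated filter, and the {{-1}} "not found" return value are all hypothetical stand-ins chosen for illustration. The only point it tries to show is that a missing index returns before any bloom filter statistic is touched, while a key that passes the filter but is absent from the index is counted as a false positive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an sstable reader; NOT Cassandra's real classes or API.
class SketchReader {
    // Hypothetical counters standing in for a bloom filter tracker.
    long truePositives;
    long falsePositives;
    long trueNegatives;

    // Simulated bloom filter: keys it reports as "possibly present".
    private final Set<String> filterSays;
    // Simulated index; null models the "there is no index file" case.
    private final Map<String, Long> index;

    SketchReader(Set<String> filterSays, Map<String, Long> index) {
        this.filterSays = filterSays;
        this.index = index;
    }

    /** Returns the position recorded for the key, or -1 if it cannot be located. */
    long getPosition(String key) {
        // Checked first: a missing index is an index problem, so it stays
        // outside the bloom filter statistics entirely.
        if (index == null) {
            return -1;
        }
        // The filter rejects the key: a correct rejection.
        if (!filterSays.contains(key)) {
            trueNegatives++;
            return -1;
        }
        // The filter passed the key but the index has no entry for it:
        // this is the miss that should be recorded as a false positive.
        Long position = index.get(key);
        if (position == null) {
            falsePositives++;
            return -1;
        }
        truePositives++;
        return position;
    }
}
```

With this ordering, a reader constructed with a null index leaves all three counters at zero no matter which keys are requested, so an index problem cannot distort the reported bloom filter precision.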



> Bloom filter miss counts are not measured correctly
> ---------------------------------------------------
>
>                 Key: CASSANDRA-12922
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12922
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Aleksei Zotov
>            Priority: Normal
>              Labels: lhf
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
>         Attachments: 12922-trunk.txt
>
>
> Bloom filter hits and misses are evaluated incorrectly in {{BigTableReader.getPosition}}: we properly record hits, but not misses. In particular, if we don't find a match for a key in the index, which is where almost all non-matches will be rejected, [we don't record a bloom filter false positive|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/big/BigTableReader.java#L228].
> This leads to very misleading output from e.g. {{nodetool tablestats}}.


