Posted to commits@cassandra.apache.org by "Igor Novgorodov (JIRA)" <ji...@apache.org> on 2017/04/03 15:20:41 UTC

[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index

     [ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Novgorodov updated CASSANDRA-13403:
----------------------------------------
    Description: 
I have the following table:
{code}
CREATE TABLE cservice.bulks_recipients (
    recipient text,
    bulk_id uuid,
    datetime_final timestamp,
    datetime_sent timestamp,
    request_id uuid,
    status int,
    PRIMARY KEY (recipient, bulk_id)
) WITH CLUSTERING ORDER BY (bulk_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
{code}

There are 11 rows in it:
{code}
> select * from bulks_recipients;

...
(11 rows)
{code}

Let's query by index (all rows have the same *bulk_id*):
{code}
> select * from bulks_recipients where bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;

...
(11 rows)
{code}
OK, the index query returns all 11 rows.

Now I run *nodetool repair --partitioner-range --job-threads 4 --full* on each node in the cluster, one node at a time.

After the repairs have finished:
{code}
> select * from bulks_recipients where bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;

...
(2 rows)
{code}
Only two rows are returned.

Yet all the rows are still in the table:
{code}
> select * from bulks_recipients;

...
(11 rows)
{code}

If I then run an incremental repair on a random node, the index query returns a different number of rows, for example around 7.

Dropping the index and recreating it fixes the issue. Is this a bug, or am I running the repair the wrong way?
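
For reference, the workaround looks roughly like the sketch below. It reuses the keyspace, table and index names from the schema above. The comment about the index being built in the background is an assumption, not something verified in this report.
{code}
-- Drop the SASI index that returns incomplete results after the repair.
DROP INDEX cservice.bulk_recipients_bulk_id;

-- Recreate it with the same definition as in the schema above.
CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';

-- Assumption: the index may still be building right after CREATE returns,
-- so this query can show fewer rows until the build completes.
SELECT * FROM cservice.bulks_recipients WHERE bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;
{code}
Once the index has been rebuilt, the last query should again return the same 11 rows as the unfiltered select.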


> nodetool repair breaks SASI index
> ---------------------------------
>
>                 Key: CASSANDRA-13403
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13403
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>         Environment: 3.10
>            Reporter: Igor Novgorodov


