Posted to commits@cassandra.apache.org by "mlowicki (JIRA)" <ji...@apache.org> on 2015/01/31 20:52:34 UTC
[jira] [Created] (CASSANDRA-8712) Out-of-sync secondary index

mlowicki created CASSANDRA-8712:
-----------------------------------

             Summary: Out-of-sync secondary index
                 Key: CASSANDRA-8712
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8712
             Project: Cassandra
          Issue Type: Bug
         Environment: 2.1.2
            Reporter: mlowicki


I have the following table with a secondary index:
{code}
CREATE TABLE entity (
    user_id text,
    data_type_id int,
    version bigint,
    id text,
    cache_guid text,
    client_defined_unique_tag text,
    ctime timestamp,
    deleted boolean,
    folder boolean,
    mtime timestamp,
    name text,
    originator_client_item_id text,
    parent_id text,
    position blob,
    server_defined_unique_tag text,
    specifics blob,
    PRIMARY KEY (user_id, data_type_id, version, id)
) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
CREATE INDEX index_entity_parent_id ON entity (parent_id);
{code}

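It turned out that the index became out of sync with the table data. A query through the index returns 16 rows, while scanning the partition and filtering client-side finds only 10: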
{code}
>>> # consistency level 6 is LOCAL_QUORUM in the Python driver
>>> Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count()
16
 
>>> counter = 0
>>> for e in Entity.objects.filter(user_id='255824802'):
...     if e.parent_id and e.parent_id == parent_id:
...         counter += 1
... 
>>> counter
10
{code}

After a couple of hours (at night) the index was fine again, but once users presumably started interacting with the DB, the same problem reappeared. As a temporary workaround we'll try rebuilding the indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/

I ran a simple script to check for this anomaly. Before the index was rebuilt, 10378 of the 4024856 folders checked had this issue.
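
The checking script itself is not attached to this report. A minimal sketch of such a check, under stated assumptions, could look like the following. The function name `find_index_mismatches` and the `indexed_count` callable are hypothetical and only illustrate the comparison; the callable stands in for whatever runs the index-backed count query. Note that this sketch only inspects folders that appear in the full scan, so stale index entries for folders with no remaining rows would go unnoticed.

```python
from collections import Counter


def find_index_mismatches(scanned_rows, indexed_count):
    """Compare per-folder row counts from a full scan with index-backed counts.

    scanned_rows: iterable of (user_id, parent_id) pairs read without the index.
    indexed_count: hypothetical callable (user_id, parent_id) -> int that runs
        the index-backed count query.
    Returns {(user_id, parent_id): (index_count, scan_count)} for mismatches.
    """
    # Count rows per (user_id, parent_id) from the scan, skipping rows that
    # have no parent, as in the client-side loop shown earlier.
    scan_counts = Counter(
        (user_id, parent_id)
        for user_id, parent_id in scanned_rows
        if parent_id
    )
    mismatches = {}
    for (user_id, parent_id), scan_count in scan_counts.items():
        index_count = indexed_count(user_id, parent_id)
        if index_count != scan_count:
            mismatches[(user_id, parent_id)] = (index_count, scan_count)
    return mismatches
```

With cqlengine, `indexed_count` could wrap a query such as `Entity.objects.filter(user_id=user_id, parent_id=parent_id).count()`, and `scanned_rows` could be produced by iterating over `Entity.objects.filter(user_id=user_id)`.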



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)