Posted to commits@cassandra.apache.org by "Mathijs Vogelzang (JIRA)" <ji...@apache.org> on 2015/06/03 17:17:39 UTC
[jira] [Updated] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys
[ https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathijs Vogelzang updated CASSANDRA-9540:
-----------------------------------------
Description:
We are investigating an odd issue in which one of our clients gets no data on his dashboard. Cassandra appears not to return the stored data for one particular key (referred to as "brokenkey" from now on).
Some background:
We have a row where we store a "metadata" column and data in columns "bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection in the UI, we know which buckets we need to retrieve: only bucket/0, or bucket/0 and bucket/1, and so on (we always need to retrieve "metadata").
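As a rough illustration of that column selection, here is a minimal Python sketch. It assumes the date selection has already been reduced to a number of buckets; the helper names `columns_to_fetch` and `in_clause` are hypothetical and not taken from our code base. The string rendering is only meant to mirror the cqlsh queries in this report; application code would normally bind the values as parameters instead.

```python
def columns_to_fetch(num_buckets):
    """Column names needed for a selection spanning `num_buckets` buckets.

    "metadata" is always included, followed by bucket/0 .. bucket/(n-1).
    """
    return ["metadata"] + [f"bucket/{i}" for i in range(num_buckets)]


def in_clause(columns):
    """Render column names as the textAsBlob(...) list of a CQL IN restriction."""
    return ", ".join(f"textAsBlob('{name}')" for name in columns)


# Selection that needs the first two buckets.
cols = columns_to_fetch(2)
print(cols)
print("column1 IN (" + in_clause(cols) + ")")
```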
A typical query may look like this (selecting column1 just to show which columns are returned; normally we would of course select value):
{noformat}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey');
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey');
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
{noformat}
These two queries work as expected and return the information that we actually stored.
However, when we "filter" for certain columns, the brokenkey starts behaving very weirdly:
{noformat}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
*** As expected, querying for more information doesn't really matter for the working key ***
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
blobAsText(column1)
---------------------
bucket/0
(1 rows)
*** Cassandra stops giving us the metadata column when asking for a few more columns! ***
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
key | column1 | value
-----+---------+-------
(0 rows)
*** Adding the bogus column name even makes it return nothing from this row anymore! ***
{noformat}
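The discrepancy can be stated precisely: every column that is stored in the row and is named in the IN list should come back. Below is a minimal Python sketch of that check, using only the column names from the cqlsh output above. The function name `missing_columns` is made up for illustration and is not part of any Cassandra tooling.

```python
def missing_columns(stored, requested, returned):
    """Columns that are stored and were requested, but were not returned."""
    expected = set(stored) & set(requested)
    return sorted(expected - set(returned))


# Values copied from the cqlsh output for 'install/brokenkey'.
stored = ["bucket/0", "metadata"]   # result of the unfiltered SELECT
requested = ["metadata", "bucket/0", "bucket/1", "bucket/2"]
returned = ["bucket/0"]             # result of the SELECT with column1 IN (...)

print(missing_columns(stored, requested, returned))  # ['metadata']
```

For 'install/workingkey' the same check yields an empty list, since both stored columns are returned.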
There are at least two rows that malfunction like this in our table (which is quite old and has gone through a number of Cassandra upgrades). I upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this problem) and then compacted, repaired and scrubbed this column family, but none of that helped.
Our table structure is:
{noformat}
cqlsh:AppBrain> describe table "GroupedSeries";
CREATE TABLE "AppBrain"."GroupedSeries" (
key blob,
column1 blob,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 1.0
AND speculative_retry = 'NONE';
{noformat}
Let me know if I can give more information that may be helpful.
was:
We are investigating an odd issue in which one of our clients gets no data on his dashboard. Cassandra appears not to return the stored data for one particular key (referred to as "brokenkey" from now on).
Some background:
We have a row where we store a "metadata" column and data in columns "bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection in the UI, we know which buckets we need to retrieve: only bucket/0, or bucket/0 and bucket/1, and so on (we always need to retrieve "metadata").
A typical query may look like this (selecting column1 just to show which columns are returned; normally we would of course select value):
{{noformat}}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey');
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey');
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
{{/noformat}}
These two queries work as expected and return the information that we actually stored.
However, when we "filter" for certain columns, the brokenkey starts behaving very weirdly:
{{noformat}}
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
blobAsText(column1)
---------------------
bucket/0
metadata
(2 rows)
*** As expected, querying for more information doesn't really matter for the working key ***
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
blobAsText(column1)
---------------------
bucket/0
(1 rows)
*** Cassandra stops giving us the metadata column when asking for a few more columns! ***
cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
key | column1 | value
-----+---------+-------
(0 rows)
*** Adding the bogus column name even makes it return nothing from this row anymore! ***
{{/noformat}}
There are at least two rows that malfunction like this in our table (which is quite old and has gone through a number of Cassandra upgrades). I upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this problem) and then compacted, repaired and scrubbed this column family, but none of that helped.
Our table structure is:
{{noformat}}
cqlsh:AppBrain> describe table "GroupedSeries";
CREATE TABLE "AppBrain"."GroupedSeries" (
key blob,
column1 blob,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 1.0
AND speculative_retry = 'NONE';
{{/noformat}}
Let me know if I can give more information that may be helpful.
> Cql query doesn't return right information when using IN on columns for some keys
> ---------------------------------------------------------------------------------
>
> Key: CASSANDRA-9540
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
> Project: Cassandra
> Issue Type: Bug
> Components: API
> Environment: Cassandra 2.1.5
> Reporter: Mathijs Vogelzang
>