Posted to user@cassandra.apache.org by sam_ <am...@yahoo.com> on 2011/04/15 08:43:00 UTC
Duplicate result of get_indexed_slices, depending on indexClause.count
Hi All,
I have been using Cassandra 0.7.2 and 0.7.4 with the Thrift API (from Java).
I noticed that when I query a column family with indexed columns,
get_indexed_slices sometimes returns a duplicate result, depending on the
number of rows in the CF and the count that I set in IndexClause.count.
It also depends on the order of the rows in the CF.
For example, consider the following CF, which I call Attributes:
create column family Attributes with comparator=UTF8Type
and column_metadata=[
{column_name: range_id, validation_class: LongType, index_type: KEYS},
{column_name: attr_key, validation_class: UTF8Type, index_type: KEYS},
{column_name: attr_val, validation_class: BytesType, index_type: KEYS}
];
And suppose I have the following rows in the CF:
key        range_id  attr_key  attr_val
"1/@1/0"   1         "A"       "1"
"1/5/0"    1         "B"       "1000"
"3/@1/0"   2         "A"       "1"
"3/5/0"    2         "B"       "1001"
"5/@1/0"   3         "A"       "2"
"5/5/0"    3         "B"       "1002"
"7/@1/0"   4         "A"       "2"
"7/5/0"    4         "B"       "1003"
Now if I run a query with an IndexClause like this (in pseudocode):
attr_key == "A" AND attr_val == "1"
with indexClause.count = 4;
Then I will get the rows with the following keys from get_indexed_slices:
"1/@1/0", "3/@1/0", "3/@1/0"
The last key is a duplicate!
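To make the expected result concrete, here is a minimal in-memory sketch in Java. It does not use the Thrift API at all; it only applies the same two equality conditions to the eight example rows. The class name ExpectedIndexedSlice and the query helper are hypothetical, and the rows are scanned in the order listed above, which is an assumption (on a real cluster the order depends on the partitioner).

```java
import java.util.ArrayList;
import java.util.List;

public class ExpectedIndexedSlice {
    // Example rows of the Attributes CF: {key, range_id, attr_key, attr_val}.
    static final String[][] ROWS = {
        {"1/@1/0", "1", "A", "1"},
        {"1/5/0",  "1", "B", "1000"},
        {"3/@1/0", "2", "A", "1"},
        {"3/5/0",  "2", "B", "1001"},
        {"5/@1/0", "3", "A", "2"},
        {"5/5/0",  "3", "B", "1002"},
        {"7/@1/0", "4", "A", "2"},
        {"7/5/0",  "4", "B", "1003"},
    };

    // Returns the keys of rows whose attr_key and attr_val both match,
    // stopping once 'count' keys have been collected.
    static List<String> query(String attrKey, String attrVal, int count) {
        List<String> keys = new ArrayList<String>();
        for (String[] row : ROWS) {
            if (keys.size() >= count) {
                break;
            }
            if (row[2].equals(attrKey) && row[3].equals(attrVal)) {
                keys.add(row[0]);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Only two rows satisfy both conditions, so each key appears once.
        System.out.println(query("A", "1", 4)); // prints [1/@1/0, 3/@1/0]
    }
}
```

With count = 4 only two rows match, so a correct result contains "1/@1/0" and "3/@1/0" exactly once each.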
This is very sensitive to the order of rows in the CF, the number of rows,
and the value you set in indexClause.count. I noticed that when the number of
rows in the CF is twice the indexClause.count, this issue may occur,
depending on the order of rows in the CF.
This seems to be a bug, and it occurs in both 0.7.2 and 0.7.4.
Is there a solution to this problem?
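One possible client-side stopgap, sketched here as an assumption rather than a confirmed fix, is to drop repeated row keys after the call returns. The DedupKeys class and dedup helper below are hypothetical names, and the sketch assumes the row keys have already been decoded to strings; it does not touch the server-side behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class DedupKeys {
    // Drops repeated row keys while keeping the order of first appearance.
    static List<String> dedup(List<String> keys) {
        return new ArrayList<String>(new LinkedHashSet<String>(keys));
    }

    public static void main(String[] args) {
        // The keys reported in this message, including the duplicate.
        List<String> returned = Arrays.asList("1/@1/0", "3/@1/0", "3/@1/0");
        System.out.println(dedup(returned)); // prints [1/@1/0, 3/@1/0]
    }
}
```

A LinkedHashSet is used so that the order in which keys came back is preserved, which matters if the last key is later used as the start key for paging.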
Many Thanks,
Sam
Re: Duplicate result of get_indexed_slices, depending on indexClause.count
Posted by Jonathan Ellis <jb...@gmail.com>.
https://issues.apache.org/jira/browse/CASSANDRA-2406
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com