Posted to user@cassandra.apache.org by "Fryar, Dexter" <de...@hp.com> on 2011/04/07 18:19:18 UTC
Secondary Index Updates Break CLI and Client Code Reading
Creating a column with a secondary index and a validator, in a column family that also has a default validator, and then renaming or dropping the index later results in read errors.
Is there an easy way around this problem without having to keep an invalid definition for a column that will get deleted or expired?
1) create a secondary index on a column with a validator and a default validator
2) insert a row
3) read and verify the row
4) update the CF/index/name/validator
5) read the CF and get an error (CLI or Pycassa)
CLI Commands to create the row and CF/Index
create column family cf_testing with comparator=UTF8Type and default_validation_class=UTF8Type and column_metadata=[{column_name: colour, validation_class: LongType, index_type: KEYS}];
set cf_testing['key']['colour']='1234';
list cf_testing;
update column family cf_testing with comparator=UTF8Type and default_validation_class=UTF8Type and column_metadata=[{column_name: color, validation_class: LongType, index_type: KEYS}];
ERROR from the CLI:
list cf_testing;
Using default limit of 100
-------------------
RowKey: key
invalid UTF8 bytes 00000000000004d2
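One possible reading of the error above, offered as a sketch rather than a confirmed diagnosis: the hex string 00000000000004d2 is 1234 laid out as an 8-byte big-endian long, which is how a LongType value would be stored. After the update, the metadata names "color" rather than "colour", so the old column presumably falls back to the UTF8Type default validator, and those bytes are not valid UTF-8. The helper names below (pack_long, unpack_long, unpack_utf8, validator_for) are hypothetical stand-ins, not Cassandra or Pycassa APIs, and the lookup rule in validator_for is an assumption.

```python
import binascii
import struct


def pack_long(value):
    """Encode an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)


def unpack_long(raw):
    """Decode an 8-byte big-endian signed long."""
    return struct.unpack(">q", raw)[0]


def unpack_utf8(raw):
    """Decode bytes as UTF-8 text; raises UnicodeDecodeError on bad input."""
    return raw.decode("utf-8")


def validator_for(column, column_metadata, default):
    """Assumed rule: use the per-column validator if one exists, else the default."""
    return column_metadata.get(column, default)


# The value written by: set cf_testing['key']['colour']='1234';
stored = pack_long(1234)
stored_hex = binascii.hexlify(stored).decode("ascii")

# Toy model of the column metadata before and after the update statement.
before = {"colour": unpack_long}
after = {"color": unpack_long}

# Before the update, 'colour' is read back as a long.
value_before = validator_for("colour", before, unpack_utf8)(stored)

# After the update, 'colour' has no entry and falls back to the UTF-8 default.
try:
    validator_for("colour", after, unpack_utf8)(stored)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(stored_hex)
print(value_before)
print(decode_failed)
```

If this model is right, the stored data itself is intact and only the interpretation of the bytes changed, which would also explain why writes keep working while reads of the old row fail.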
Here is the Pycassa client code that shows this error too.
badindex.py
#!/usr/local/bin/python2.7

import pycassa
import uuid
import sys

def main():
    try:
        keyspace="badindex"
        serverPoolList = ['localhost:9160']
        pool = pycassa.connect(keyspace, serverPoolList)
    except:
        print "couldn't get a connection"
        sys.exit()

    cfname="cf_testing"
    cf = pycassa.ColumnFamily(pool, cfname)
    results = cf.get_range(start='key', finish='key', row_count=1)
    for key, columns in results:
        print key, '=>', columns

if __name__ == "__main__":
    main()
Re: Secondary Index Updates Break CLI and Client Code Reading :: DebugLog Attached
Posted by Jonathan Ellis <jb...@gmail.com>.
Addressed on the issue you created,
https://issues.apache.org/jira/browse/CASSANDRA-2436.
On Thu, Apr 7, 2011 at 12:19 PM, Fryar, Dexter <de...@hp.com> wrote:
> I have also attached the debug log covering each step. I've even tried going back and updating the CF with the old index, to no avail. You can insert/write all you want, but reads will fail whenever they hit a row affected by one of these cases.
>
> log4j-server.properties
> log4j.rootLogger=DEBUG,stdout,R
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
RE: Secondary Index Updates Break CLI and Client Code Reading :: DebugLog Attached
Posted by "Fryar, Dexter" <de...@hp.com>.
I have also attached the debug log covering each step. I've even tried going back and updating the CF with the old index, to no avail. You can insert/write all you want, but reads will fail whenever they hit a row affected by one of these cases.
log4j-server.properties
log4j.rootLogger=DEBUG,stdout,R