Posted to user@cassandra.apache.org by Sam Hodgson <ho...@hotmail.com> on 2012/03/18 14:02:10 UTC

Secondary Index Validation Type Parse Error

Hi All,

I'm getting the following parse error when trying to create a CF with a secondary index using BytesType validation. The index is for a column called 'subject':

java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: cannot parse 'subject' as hex bytes

I'm doing all my validation in PHP, but I'm unable to validate some UTF-8 sources accurately (using mb_detect_encoding). Cassandra picks up on bits of non-UTF-8 text that the PHP check misses, so it throws exceptions. I figured I'd set everything to BytesType to try and effectively turn off validation in Cassandra?

I'm using the following to try and build the CF:

create column family subjects
  with column_type = 'Standard'
  and comparator = 'BytesType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and rows_cached = 200000.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'SerializingCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and column_metadata=[{column_name: subject, validation_class: BytesType,  index_type: KEYS}];

Any help is greatly appreciated! :)

Cheers

Sam

Re: Secondary Index Validation Type Parse Error

Posted by aaron morton <aa...@thelastpickle.com>.
> java.lang.RuntimeException: org.apache.cassandra.db.marshal.MarshalException: cannot parse 'subject' as hex bytes
This has to do with the create column family statement...

>   and comparator = 'BytesType'
This tells Cassandra that all column names in this CF should be interpreted as raw bytes. BytesType expects string input to be a hexadecimal-formatted string.

>   and column_metadata=[{column_name: subject, validation_class: BytesType,  index_type: KEYS}]; 

This tells Cassandra to create a secondary index on the column named 'subject'. However, column names will be interpreted as hex, and 'subject' is not a valid hex string.
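
To illustrate the point, here is a minimal Python sketch of what "parse as hex bytes" means. It is not Cassandra's code; the function names parse_hex_column_name and to_hex are made up for this example. It shows that the literal string 'subject' cannot be read as hex, while the hex encoding of its UTF-8 bytes can:

```python
# Illustrative only: mimics the idea of reading a column name as hex bytes.
# These helpers are hypothetical and are not part of Cassandra or any client.

def parse_hex_column_name(name: str) -> bytes:
    """Interpret `name` as a hex string; raises ValueError if it is not hex."""
    return bytes.fromhex(name)


def to_hex(name: str) -> str:
    """Return the hex representation of the UTF-8 bytes of `name`."""
    return name.encode("utf-8").hex()


try:
    parse_hex_column_name("subject")
except ValueError as err:
    # 's', 'u', 'j' and 't' are not hex digits, so parsing fails.
    print("cannot parse 'subject' as hex bytes:", err)

hex_name = to_hex("subject")
print(hex_name)                         # hex form of the name
print(parse_hex_column_name(hex_name))  # round-trips back to the raw bytes
```

With a BytesType comparator, the hex form is the kind of input the parser is looking for, which is why a human-readable name is rejected.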

Assuming the column names are not the things you are validating in PHP, consider changing the comparator to UTF8Type:

create column family subjects
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and rows_cached = 200000.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'SerializingCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and column_metadata=[{column_name: subject, validation_class: BytesType,  index_type: KEYS}];
 

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com


On 19/03/2012, at 2:22 AM, Sam Hodgson wrote:

> Hi, me again. Sorry, I've just read that BytesType expects hex input, so my question now is how to create a column that will accept non-validated text as input? I think I can maybe get round this by forcing UTF-8 encoding regardless of whether the string is already identified as UTF-8, but it seems like I'm missing some fundamental knowledge about Cassandra validation?
> 
> Cheers
> 
> Sam


RE: Secondary Index Validation Type Parse Error

Posted by Sam Hodgson <ho...@hotmail.com>.
Hi, me again. Sorry, I've just read that BytesType expects hex input, so my question now is how to create a column that will accept non-validated text as input? I think I can maybe get round this by forcing UTF-8 encoding regardless of whether the string is already identified as UTF-8, but it seems like I'm missing some fundamental knowledge about Cassandra validation?
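
The "force UTF-8" idea could look roughly like the sketch below. It is written in Python rather than PHP purely for illustration, the function name ensure_utf8 is made up, and the ISO-8859-1 fallback is an assumption about where the non-UTF-8 text comes from. The key point is that a strict decode is used as the check, instead of guessing the encoding:

```python
# Illustrative sketch: ensure_utf8 is a hypothetical helper, and the
# ISO-8859-1 fallback is an assumption about the source data.

def ensure_utf8(raw: bytes, fallback_encoding: str = "iso-8859-1") -> bytes:
    """Return `raw` unchanged if it is valid UTF-8, else transcode it."""
    try:
        raw.decode("utf-8")  # strict by default: raises on any invalid byte
        return raw
    except UnicodeDecodeError:
        # Treat the bytes as the fallback encoding and re-encode as UTF-8.
        return raw.decode(fallback_encoding).encode("utf-8")


print(ensure_utf8(b"caf\xc3\xa9"))  # already valid UTF-8, returned as-is
print(ensure_utf8(b"caf\xe9"))      # Latin-1 input, re-encoded as UTF-8
```

Anything that passes through a check like this should be accepted by a UTF8Type validator, though text that was never Latin-1 to begin with would come out as the wrong characters.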

Cheers

Sam
