Posted to user@cassandra.apache.org by Hans Melgers <Ha...@anachron.com> on 2013/04/23 11:37:07 UTC

readable (not hex encoded) column names using sstable2json

Hello,

Using Cassandra 1.0.7 sstable2json on some tables I get readable column
names. This leads to problems (java.lang.NumberFormatException: Non-hex
characters in) when importing later.

We're trying to move data over to another cluster but this prevents us
from doing so. Could it have to do with using a custom Serializer<T>?

Here is some example output:

D:\Java\apache-cassandra-1.0.7\bin>sstable2json d:\var\lib\cassandra\data2\depsi\ACCOUNT_RECEIVERS-hc-1-Data.db
{
"236964236561393231626331386666353431356161613361363733376666396230386339": [["dep.1205050","",1364383456519006]],
"236964233963383065623663663833653461623838636232666638306638636439303432": [["dep.1057162","",1364383456664000]],
[GOES ON here]

The value "dep.1205050" is literally what we put in there. It's not hex
encoded.
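
To make the difference concrete, here is a minimal Python sketch (an illustration only, not something taken from the Cassandra tools). It checks which strings in the output above are hex and decodes the first row key, assuming the key bytes are UTF-8 text. The names ROW_KEY_HEX, is_hex and decode_row_key are made up for this example.

```python
# First row key from the sstable2json output above, with the wrapped
# line joined back into a single string.
ROW_KEY_HEX = "236964236561393231626331386666353431356161613361363733376666396230386339"


def is_hex(text: str) -> bool:
    """Return True if text is a non-empty, even-length string of hex digits."""
    hex_digits = set("0123456789abcdefABCDEF")
    return len(text) > 0 and len(text) % 2 == 0 and all(c in hex_digits for c in text)


def decode_row_key(hex_key: str) -> str:
    """Decode a hex-encoded row key, assuming the underlying bytes are UTF-8 text."""
    return bytes.fromhex(hex_key).decode("utf-8")


if __name__ == "__main__":
    print(is_hex(ROW_KEY_HEX))          # the row key is a hex string
    print(is_hex("dep.1205050"))        # the column name is not
    print(decode_row_key(ROW_KEY_HEX))  # readable form of the row key
```

The decoded key starts with "#id#", so the row key is ordinary text that was hex encoded for the dump, whereas the column name was written out as-is.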

Kind regards,
Hans Melgers




Re: readable (not hex encoded) column names using sstable2json

Posted by aaron morton <aa...@thelastpickle.com>.
> First of all thanks for the response. We’re trying to copy existing data into a keyspace with a different name on the same server. I’m not sure why our operations team wants this.
You can just rename the files. 

> java.lang.NumberFormatException: Non-hex characters in hertz.246944493-2012
>         at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:59)
>         at org.apache.cassandra.utils.ByteBufferUtil.hexToBytes(ByteBufferUtil.java:496)
>         at org.apache.cassandra.tools.SSTableImport.stringAsType(SSTableImport.java:523)
>         at org.apache.cassandra.tools.SSTableImport.access$000(SSTableImport.java:52)
>         at org.apache.cassandra.tools.SSTableImport$JsonColumn.<init>(SSTableImport.java:106)
It's choking on the column name. 
The metadata used by the SSTableImport process says that the column name is BytesType, hence the call to ByteBufferUtil.hexToBytes(). 
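
The failure can be reproduced in a few lines. The sketch below is a simplified Python stand-in, not the actual SSTableImport code: string_as_type and hex_to_bytes are hypothetical helpers that only mimic the behaviour described above, where a column name is parsed as hex when the comparator is BytesType and taken as text when it is UTF8Type.

```python
def hex_to_bytes(text: str) -> bytes:
    """Decode a hex string, rejecting anything that contains non-hex characters."""
    try:
        return bytes.fromhex(text)
    except ValueError as exc:
        # Mirrors the wording of the error in the stack trace above.
        raise ValueError(f"Non-hex characters in {text}") from exc


def string_as_type(text: str, comparator: str) -> bytes:
    """Hypothetical simplification: turn a JSON column name into bytes per comparator."""
    if comparator == "BytesType":
        return hex_to_bytes(text)
    if comparator == "UTF8Type":
        return text.encode("utf-8")
    raise ValueError(f"comparator not covered by this sketch: {comparator}")


if __name__ == "__main__":
    name = "hertz.246944493-2012"
    print(string_as_type(name, "UTF8Type"))  # accepted as plain text
    try:
        string_as_type(name, "BytesType")    # treated as hex, so it fails
    except ValueError as err:
        print("ERROR:", err)
```

With the comparator set to UTF8Type the same name is accepted, which is why the schema seen by the import matters.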

Check the schema in the destination system. (But really, just poke the Ops team with a stick until they copy the files. Rename them as you need, but keep the numbered files together.)
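
One way to read "keep the numbered files together" is that every component file sharing a generation number belongs to the same sstable. The Python sketch below groups file names that way before a copy. It assumes names of the form <ColumnFamily>-<version>-<generation>-<Component>, as in ACCOUNT_RECEIVERS-hc-1-Data.db; the Index.db component in the example and the helper group_by_generation are assumptions for illustration, not taken from this thread.

```python
import re
from collections import defaultdict

# Assumed file name layout: <ColumnFamily>-<version>-<generation>-<Component>
SSTABLE_NAME = re.compile(
    r"^(?P<cf>\w+)-(?P<version>[a-z]+)-(?P<generation>\d+)-(?P<component>.+)$"
)


def group_by_generation(filenames):
    """Group sstable component files by (column family, version, generation).

    Names that do not match the assumed layout are ignored.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = SSTABLE_NAME.match(name)
        if match is None:
            continue
        key = (match["cf"], match["version"], int(match["generation"]))
        groups[key].append(name)
    return {key: sorted(names) for key, names in groups.items()}


if __name__ == "__main__":
    files = [
        "ACCOUNT_RECEIVERS-hc-1-Data.db",
        "ACCOUNT_RECEIVERS-hc-1-Index.db",
        "ACCOUNT_RECEIVERS-hc-2-Data.db",
        "notes.txt",
    ]
    for key, names in sorted(group_by_generation(files).items()):
        print(key, names)
```

Copying or renaming one whole group at a time keeps the components of each sstable consistent with each other.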

Cheers

 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/04/2013, at 8:08 PM, Hans Melgers <Ha...@anachron.com> wrote:

> First of all thanks for the response. We’re trying to copy existing data into a keyspace with a different name on the same server. I’m not sure why our operations team wants this.
>  
> We’re also looking into the sstable copy approach you suggested and that could work. Still, I think it’s odd the column names are not hex encoded. Below you’ll find some logging from the json2sstable run showing the exception. The JSON files are not manually edited at all, btw.
>  
> This is the CF definition I used as an example. It’s used to model parent-child relations, so the column names are foreign keys. I used this CF as an example because it’s nice and small.
>  
> create column family ACCOUNT_RECEIVERS
>   with column_type = 'Standard'
>   and comparator = 'UTF8Type'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'BytesType'
>   and rows_cached = 0.0
>   and row_cache_save_period = 0
>   and row_cache_keys_to_save = 0
>   and keys_cached = 200000.0
>   and key_cache_save_period = 14400
>   and read_repair_chance = 1.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and row_cache_provider = 'ConcurrentLinkedHashCacheProvider'
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';
>  
> log:
>  
> ---> sFile : /data/dump/ACCOUNT_RECEIVERS-hd-14-Data.db.json ACCOUNT_RECEIVERS-hd-14-Data.db.json
> bas  -  ACCOUNT  -  /data/dump/ACCOUNT_RECEIVERS-hd-14-Data.db.json  - /data/cassandra-data/bas/ACCOUNT_RECEIVERS-hd-14-Data.db
> Counting keys to import, please wait... (NOTE: to skip this use -n <num_keys>)
> Importing 24735 keys...
> java.lang.NumberFormatException: Non-hex characters in hertz.246944493-2012
>         at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:59)
>         at org.apache.cassandra.utils.ByteBufferUtil.hexToBytes(ByteBufferUtil.java:496)
>         at org.apache.cassandra.tools.SSTableImport.stringAsType(SSTableImport.java:523)
>         at org.apache.cassandra.tools.SSTableImport.access$000(SSTableImport.java:52)
>         at org.apache.cassandra.tools.SSTableImport$JsonColumn.<init>(SSTableImport.java:106)
>         at org.apache.cassandra.tools.SSTableImport.addColumnsToCF(SSTableImport.java:191)
>         at org.apache.cassandra.tools.SSTableImport.addToStandardCF(SSTableImport.java:174)
>         at org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:362)
>         at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:255)
>         at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:479)
> ERROR: Non-hex characters in hertz.246944493-2012


RE: readable (not hex encoded) column names using sstable2json

Posted by Hans Melgers <Ha...@anachron.com>.
First of all thanks for the response. We're trying to copy existing data
into a keyspace with a different name on the same server. I'm not sure
why our operations team wants this.

We're also looking into the sstable copy approach you suggested and that
could work. Still, I think it's odd the column names are not hex
encoded. Below you'll find some logging from the json2sstable run
showing the exception. The JSON files are not manually edited at all,
btw.

This is the CF definition I used as an example. It's used to model
parent-child relations, so the column names are foreign keys. I used this
CF as an example because it's nice and small.

 

create column family ACCOUNT_RECEIVERS
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 0
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'ConcurrentLinkedHashCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';

 

log:

---> sFile : /data/dump/ACCOUNT_RECEIVERS-hd-14-Data.db.json ACCOUNT_RECEIVERS-hd-14-Data.db.json
bas  -  ACCOUNT  -  /data/dump/ACCOUNT_RECEIVERS-hd-14-Data.db.json  - /data/cassandra-data/bas/ACCOUNT_RECEIVERS-hd-14-Data.db
Counting keys to import, please wait... (NOTE: to skip this use -n <num_keys>)
Importing 24735 keys...
java.lang.NumberFormatException: Non-hex characters in hertz.246944493-2012
        at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:59)
        at org.apache.cassandra.utils.ByteBufferUtil.hexToBytes(ByteBufferUtil.java:496)
        at org.apache.cassandra.tools.SSTableImport.stringAsType(SSTableImport.java:523)
        at org.apache.cassandra.tools.SSTableImport.access$000(SSTableImport.java:52)
        at org.apache.cassandra.tools.SSTableImport$JsonColumn.<init>(SSTableImport.java:106)
        at org.apache.cassandra.tools.SSTableImport.addColumnsToCF(SSTableImport.java:191)
        at org.apache.cassandra.tools.SSTableImport.addToStandardCF(SSTableImport.java:174)
        at org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:362)
        at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:255)
        at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:479)
ERROR: Non-hex characters in hertz.246944493-2012

 

 

 

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Wednesday, 24 April 2013 5:37
To: user@cassandra.apache.org
Subject: Re: readable (not hex encoded) column names using sstable2json

 

What is the CF definition?
What are the errors you are getting?

> We're trying to move data over to another cluster but this prevents us from doing so.
Is there a reason you are converting the SSTables to JSON?
You could just copy the sstables.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

 


 


Re: readable (not hex encoded) column names using sstable2json

Posted by aaron morton <aa...@thelastpickle.com>.
What is the CF definition?
What are the errors you are getting?

> We're trying to move data over to another cluster but this prevents us from doing so. 
Is there a reason you are converting the SSTables to JSON?
You could just copy the sstables.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/04/2013, at 9:37 PM, Hans Melgers <Ha...@anachron.com> wrote:

> Hello,
> 
> Using Cassandra 1.0.7 sstable2json on some tables I get readable column
> names. This leads to problems (java.lang.NumberFormatException: Non-hex
> characters in) when importing later.
> 
> We're trying to move data over to another cluster but this prevents us
> from doing so. Could it have to do with using a custom Serializer<T>?
> 
> Here is some example output:
> 
> D:\Java\apache-cassandra-1.0.7\bin>sstable2json d:\var\lib\cassandra\data2\depsi\ACCOUNT_RECEIVERS-hc-1-Data.db
> {
> "236964236561393231626331386666353431356161613361363733376666396230386339": [["dep.1205050","",1364383456519006]],
> "236964233963383065623663663833653461623838636232666638306638636439303432": [["dep.1057162","",1364383456664000]],
> [GOES ON here]
> 
> The value "dep.1205050" is literally what we put in there. It's not hex
> encoded.
> 
> Kind regards,
> Hans Melgers
> 
> 
>