Posted to user@cassandra.apache.org by mo...@gmail.com on 2009/07/21 18:18:30 UTC

keys and column names cannot be utf-8

Is there a timeline for committing 185? The UTF-8 error still exists.
In my experiments, keys and column names still cannot be UTF-8.

This is a major restriction.
Please push this fix to trunk.
Thanks

On Sun, Jul 19, 2009 at 6:11 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> That should be partially solved in trunk now that 139 is committed,
> and more solved when we commit 185 soon.
>
> On Sun, Jul 19, 2009 at 3:43 AM, <mo...@gmail.com> wrote:
> > Any utf-8 keyword causes cassandra to crash!
> >
>

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
This is definitely a bug, not an improvement.
The Python Thrift client is unusable without UTF-8 or Unicode support, as
much of the web is UTF-8 or Unicode.
https://issues.apache.org/jira/browse/THRIFT-395

Jonathan, CPython is the default way to run Django, Pylons, or any of the
other frameworks;
using Jython or Java is not an option.

If someone can tell me how hard it is to fix the Python Thrift client, that
would tell me whether we can use Cassandra or not.

On Tue, Jul 21, 2009 at 2:18 PM, <mo...@gmail.com> wrote:

> Hey jonathan
> this is not in the wiki or any documentation. Give that searching for
> UTF8Type gave the tag of
> adding  *CompareWith="UTF8Type" *
>   <ColumnFamily ColumnType="Super" Name="Super1"/>
>
> All my inserts and queries go to a super column family Super1
>
> So should i change this to
>
> <ColumnFamily ColumnType="Super" *CompareWith="UTF8Type" * Name="Super1"/>
> does this work in python thrift
> if it does - that would be perfect
>
> but this doesnt explain why keys cannot be utf8
>
> thanks
>
> On Tue, Jul 21, 2009 at 2:06 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> did you read the new section in the config xml explaining how to use a
>> UTF8 comparator?
>>
>> also: thrift itself is just plain broken for unicode support in some
>> languages; see THRIFT-395
>>
>> I think the short version is that when you have a java server, unicode
>> will work with java or C# clients but not with anything else
>>
>> (so if you are using a python client for instance switching to jython
>> might be a workaround)
>>
>> On Tue, Jul 21, 2009 at 4:00 PM, <mo...@gmail.com> wrote:
>> > Not fixed
>> > The following utf8 key names and column names still give an error.
>> > cass: 2009-07-21 13:55:35,597 error 98. ìµì§
>> >                                             ì¤ ìì§
>> > Ûïº] (1)icasso's, instruments de musique sur un guéridon]
>> (1)Ûïº[irancel]
>> > (1)ïº
>> > cass: 2009-07-21 13:55:55,093 error 377. friday night lights
>> > s03e01[âmegaupload..50 error 321. instruments de musique sur un
>> guéridon[[
>> > comâ
>> >     cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15
>> ç«¥æãçé]
>> > (1)
>> > cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for
>> pc[dragon
>> > balĺz pc games download] (1)
>> > cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï»
>> ﺳ ï»
>> >
>> > 导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430.
>> > æç³æµ·[大å
>> > cass: 2009-07-21 13:56:59,287 error 1510. cinquième
>> république[définition
>> > de  république?] (1)                                      导æç³æµ·]
>> (1)
>> > cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival
>> in
>> > navratt
>> > ri golu] (1)
>> > cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning
>> > numbers[www.tnlottery] (1)
>> > cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.comcheats[all
>> > the buildabear.com cheats and codes] (1)
>> > cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto
>> shippuuden 78
>> > subbed torrent] (1)
>> >
>> > On Tue, Jul 21, 2009 at 10:34 AM, Eric Evans <ee...@rackspace.com>
>> wrote:
>> >>
>> >> On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
>> >> > Is there any timeline on when commit 185 will be done as the utf8
>> >> > error still exists
>> >>
>> >> 185 was committed yesterday.
>> >>
>> >> https://issues.apache.org/jira/browse/CASSANDRA-185
>> >>
>> >> --
>> >> Eric Evans
>> >> eevans@rackspace.com
>> >>
>> >
>> >
>> >
>> > --
>> > Bidegg worlds best auction site
>> > http://bidegg.com
>> >
>>
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>



-- 
Bidegg worlds best auction site
http://bidegg.com

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Still gives an error. x.search and x.related are Unicode strings, and when
they are used as a key or column name the following errors come up:

In [5]: x.search
Out[5]: u'\ucd5c\uc9c4\uc2e4 \uc774\ud63c'
In [6]: x.related
Out[6]: u'\ucd5c\uc9c4\uc2e4 \uc774\ud63c'
In [7]: client.insert('Table1', x.search, ColumnPath('Super1', 'Related',
x.related), pickle.dumps(dict(count=1)), time.time(), 0)
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (1149, 0))

---------------------------------------------------------------------------
TApplicationException                     Traceback (most recent call last)

/home/mark/<ipython console> in <module>()

/home/mark/work/common/cassandra/Cassandra.pyc in insert(self, table, key, column_path, value, timestamp, block_for)
    359     """
    360     self.send_insert(table, key, column_path, value, timestamp, block_for)
--> 361     self.recv_insert()
    362
    363   def send_insert(self, table, key, column_path, value, timestamp, block_for):

/home/mark/work/common/cassandra/Cassandra.pyc in recv_insert(self)
    380       x.read(self._iprot)
    381       self._iprot.readMessageEnd()
--> 382       raise x
    383     result = insert_result()
    384     result.read(self._iprot)

TApplicationException: Internal error processing insert




INFO - Cassandra starting up...
DEBUG - insert
ERROR - Internal error processing insert
java.lang.NullPointerException
        at org.apache.cassandra.service.ThriftValidation.validateColumnPath(ThriftValidation.java:61)
        at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:262)
        at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:927)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:796)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
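
For anyone reproducing this: a minimal sketch of turning the Unicode
strings into UTF-8 bytes before they are handed to the generated client.
build_insert_args and utf8 are hypothetical helpers, not part of the
Cassandra or Thrift APIs, and nothing here talks to a server. Whether the
row key accepts such bytes on this trunk is not established in the thread,
and this is not claimed to be the cause of the NullPointerException above.

```python
import pickle


def utf8(text):
    """Encode text to UTF-8 bytes; leave values that are already bytes alone."""
    return text if isinstance(text, bytes) else text.encode("utf-8")


def build_insert_args(key, super_column, column, payload):
    """Hypothetical helper: prepare byte-encoded names and a pickled value.

    It only builds the arguments; passing them to client.insert(...) is
    left to the caller.
    """
    return {
        "key": utf8(key),
        "super_column": utf8(super_column),
        "column": utf8(column),
        "value": pickle.dumps(payload),
    }


# The value of x.search / x.related shown in the session above.
search = u"\ucd5c\uc9c4\uc2e4 \uc774\ud63c"
args = build_insert_args(search, "Related", search, dict(count=1))

# The names are now plain bytes and decode back to the original text.
assert args["column"].decode("utf-8") == search
assert pickle.loads(args["value"]) == {"count": 1}
```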



On Tue, Jul 21, 2009 at 3:04 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> did you check to make sure all the nodes were running and had no
> exceptions in their logs?
>
> On Tue, Jul 21, 2009 at 4:46 PM, <mo...@gmail.com> wrote:
> > Strange this happened. in the 4 server nodes that run cassandra, the conf
> > file had
> > ConfA
> > <ColumnFamily ColumnSort="Name"
> > Name="Standard1"  FlushPeriodInMinutes="60"/>
> > <ColumnFamily ColumnSort="Name"  Name="Standard2"/>
> > <ColumnFamily ColumnSort="Time"  Name="StandardByTime1"/>
> > <ColumnFamily ColumnType="Super"   Name="Super1"/>
> > I changed it to the following and doing nodeprobe after restarting
> > cassandra, the other 3 nodes are down,
> > ConfB
> > <ColumnFamily ColumnSort="Name"
> > Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
> > <ColumnFamily
> ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
> > <ColumnFamily
> > ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
> > <ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
> > CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
> > If i revert ConfB and set ConfA, all 4 nodes show up in nodeprobe in all
> the
> > 4 nodes
> > I m unsure how to debug this
> > On Tue, Jul 21, 2009 at 2:32 PM, <mo...@gmail.com> wrote:
> >>
> >> if this would be the conf/storage-conf.xml
> >> <ColumnFamily ColumnSort="Name"
> >> Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
> >> <ColumnFamily
> ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
> >> <ColumnFamily
> >> ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
> >> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
> >> CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
> >> Jonathan can you clarify if this will guarantee proper python thrift
> utf8
> >> behavior thanks
> >> On Tue, Jul 21, 2009 at 2:29 PM, Jonathan Ellis <jb...@gmail.com>
> wrote:
> >>>
> >>> you may also want to specify CompareSubcolumnsWith.
> >>>
> >>> On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
> >>> > thanks jonathan
> >>> > trying this
> >>> > <ColumnFamily
> ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
> >>> >
> >>> > On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com>
> >>> >> wrote:
> >>> >> >> does this work in python thrift
> >>> >> >
> >>> >> > probably not, given the thrift utf8 bugs.
> >>> >>
> >>> >> to correct myself: now that we are using binary data in the thrift
> api
> >>> >> it can't screw us over.  so yes, UTF8Type should be fine.
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Bidegg worlds best auction site
> >>> > http://bidegg.com
> >>> >
> >>
> >>
> >>
> >> --
> >> Bidegg worlds best auction site
> >> http://bidegg.com
> >
> >
> >
> > --
> > Bidegg worlds best auction site
> > http://bidegg.com
> >
>
>




Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
did you check to make sure all the nodes were running and had no
exceptions in their logs?

On Tue, Jul 21, 2009 at 4:46 PM, <mo...@gmail.com> wrote:
> Strange this happened. in the 4 server nodes that run cassandra, the conf
> file had
> ConfA
> <ColumnFamily ColumnSort="Name"
> Name="Standard1"  FlushPeriodInMinutes="60"/>
> <ColumnFamily ColumnSort="Name"  Name="Standard2"/>
> <ColumnFamily ColumnSort="Time"  Name="StandardByTime1"/>
> <ColumnFamily ColumnType="Super"   Name="Super1"/>
> I changed it to the following and doing nodeprobe after restarting
> cassandra, the other 3 nodes are down,
> ConfB
> <ColumnFamily ColumnSort="Name"
> Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
> <ColumnFamily ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
> <ColumnFamily
> ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
> If i revert ConfB and set ConfA, all 4 nodes show up in nodeprobe in all the
> 4 nodes
> I m unsure how to debug this
> On Tue, Jul 21, 2009 at 2:32 PM, <mo...@gmail.com> wrote:
>>
>> if this would be the conf/storage-conf.xml
>> <ColumnFamily ColumnSort="Name"
>> Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
>> <ColumnFamily ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
>> <ColumnFamily
>> ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
>> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
>> CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
>> Jonathan can you clarify if this will guarantee proper python thrift utf8
>> behavior thanks
>> On Tue, Jul 21, 2009 at 2:29 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>
>>> you may also want to specify CompareSubcolumnsWith.
>>>
>>> On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
>>> > thanks jonathan
>>> > trying this
>>> > <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
>>> >
>>> > On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com>
>>> >> wrote:
>>> >> >> does this work in python thrift
>>> >> >
>>> >> > probably not, given the thrift utf8 bugs.
>>> >>
>>> >> to correct myself: now that we are using binary data in the thrift api
>>> >> it can't screw us over.  so yes, UTF8Type should be fine.
>>> >
>>> >
>>> >
>>> > --
>>> > Bidegg worlds best auction site
>>> > http://bidegg.com
>>> >
>>
>>
>>
>> --
>> Bidegg worlds best auction site
>> http://bidegg.com
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Strange, this happened. On the 4 server nodes that run Cassandra, the conf
file had

ConfA
<ColumnFamily ColumnSort="Name" Name="Standard1" FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name" Name="Standard2"/>
<ColumnFamily ColumnSort="Time" Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super" Name="Super1"/>

I changed it to the following, and running nodeprobe after restarting
Cassandra shows the other 3 nodes as down.

ConfB
<ColumnFamily ColumnSort="Name" Name="Standard1" CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name" CompareWith="UTF8Type" Name="Standard2"/>
<ColumnFamily ColumnSort="Time" CompareWith="UTF8Type" Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>

If I revert ConfB and restore ConfA, all 4 nodes show up in nodeprobe on all
4 nodes.

I'm unsure how to debug this.
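
One cheap check before restarting nodes is to confirm that the edited
fragment is still well-formed XML and that every ColumnFamily carries the
comparator you expect. The sketch below uses only Python's standard
library. The <ColumnFamilies> wrapper element and the comparators helper
are assumptions made for illustration; the real storage-conf.xml nesting
is not shown in this thread, and this does not explain why the nodes were
down.

```python
import xml.etree.ElementTree as ET

# ConfB from this message, wrapped in an assumed root element so it parses
# as a standalone document.
CONF_B = """
<ColumnFamilies>
  <ColumnFamily ColumnSort="Name" Name="Standard1" CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
  <ColumnFamily ColumnSort="Name" CompareWith="UTF8Type" Name="Standard2"/>
  <ColumnFamily ColumnSort="Time" CompareWith="UTF8Type" Name="StandardByTime1"/>
  <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
</ColumnFamilies>
"""


def comparators(xml_text):
    """Map each ColumnFamily name to (CompareWith, CompareSubcolumnsWith).

    ET.fromstring raises ET.ParseError if the text is not well-formed,
    for example if stray characters were pasted inside a tag.
    """
    root = ET.fromstring(xml_text)
    return {
        cf.get("Name"): (cf.get("CompareWith"), cf.get("CompareSubcolumnsWith"))
        for cf in root.iter("ColumnFamily")
    }


for name, (compare, sub_compare) in sorted(comparators(CONF_B).items()):
    print(name, compare, sub_compare)
```

Running the same check on every node's copy of the file would at least
rule out a malformed or inconsistent config.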

On Tue, Jul 21, 2009 at 2:32 PM, <mo...@gmail.com> wrote:

> if this would be the conf/storage-conf.xml
>
> <ColumnFamily ColumnSort="Name" Name="Standard1"  *CompareWith="UTF8Type"
>  FlushPeriodInMinutes="60"/>*
> <ColumnFamily ColumnSort="Name"  *CompareWith="UTF8Type"
>  Name="Standard2"/>*
> <ColumnFamily ColumnSort="Time"  *CompareWith="UTF8Type"
>  Name="StandardByTime1"/>*
> <ColumnFamily ColumnType="Super" *CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type" *Name="Super1"/>
>
> Jonathan can you clarify if this will guarantee proper python thrift utf8
> behavior thanks
>
> On Tue, Jul 21, 2009 at 2:29 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> you may also want to specify CompareSubcolumnsWith.
>>
>> On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
>> > thanks jonathan
>> > trying this
>> > <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
>> >
>> > On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com>
>> wrote:
>> >>
>> >> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com>
>> wrote:
>> >> >> does this work in python thrift
>> >> >
>> >> > probably not, given the thrift utf8 bugs.
>> >>
>> >> to correct myself: now that we are using binary data in the thrift api
>> >> it can't screw us over.  so yes, UTF8Type should be fine.
>> >
>> >
>> >
>> > --
>> > Bidegg worlds best auction site
>> > http://bidegg.com
>> >
>>
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>




Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
guarantee?  in a pre-alpha trunk?  no, that is too strong a word.

but that's what's *supposed* to work, so I will fix it if it doesn't. :)

On Tue, Jul 21, 2009 at 4:32 PM, <mo...@gmail.com> wrote:
> if this would be the conf/storage-conf.xml
> <ColumnFamily ColumnSort="Name"
> Name="Standard1"  CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
> <ColumnFamily ColumnSort="Name"  CompareWith="UTF8Type" Name="Standard2"/>
> <ColumnFamily
> ColumnSort="Time"  CompareWith="UTF8Type" Name="StandardByTime1"/>
> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type" Name="Super1"/>
> Jonathan can you clarify if this will guarantee proper python thrift utf8
> behavior thanks
> On Tue, Jul 21, 2009 at 2:29 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>
>> you may also want to specify CompareSubcolumnsWith.
>>
>> On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
>> > thanks jonathan
>> > trying this
>> > <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
>> >
>> > On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com>
>> > wrote:
>> >>
>> >> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com>
>> >> wrote:
>> >> >> does this work in python thrift
>> >> >
>> >> > probably not, given the thrift utf8 bugs.
>> >>
>> >> to correct myself: now that we are using binary data in the thrift api
>> >> it can't screw us over.  so yes, UTF8Type should be fine.
>> >
>> >
>> >
>> > --
>> > Bidegg worlds best auction site
>> > http://bidegg.com
>> >
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Would this be the right conf/storage-conf.xml?

<ColumnFamily ColumnSort="Name" Name="Standard1" CompareWith="UTF8Type" FlushPeriodInMinutes="60"/>
<ColumnFamily ColumnSort="Name" CompareWith="UTF8Type" Name="Standard2"/>
<ColumnFamily ColumnSort="Time" CompareWith="UTF8Type" Name="StandardByTime1"/>
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" CompareSubcolumnsWith="UTF8Type" Name="Super1"/>

Jonathan, can you clarify whether this will guarantee proper Python Thrift
UTF-8 behavior? Thanks

On Tue, Jul 21, 2009 at 2:29 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> you may also want to specify CompareSubcolumnsWith.
>
> On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
> > thanks jonathan
> > trying this
> > <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
> >
> > On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com>
> wrote:
> >>
> >> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com>
> wrote:
> >> >> does this work in python thrift
> >> >
> >> > probably not, given the thrift utf8 bugs.
> >>
> >> to correct myself: now that we are using binary data in the thrift api
> >> it can't screw us over.  so yes, UTF8Type should be fine.
> >
> >
> >
> > --
> > Bidegg worlds best auction site
> > http://bidegg.com
> >
>




Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
you may also want to specify CompareSubcolumnsWith.

On Tue, Jul 21, 2009 at 4:27 PM, <mo...@gmail.com> wrote:
> thanks jonathan
> trying this
> <ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
>
> On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>
>> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com> wrote:
>> >> does this work in python thrift
>> >
>> > probably not, given the thrift utf8 bugs.
>>
>> to correct myself: now that we are using binary data in the thrift api
>> it can't screw us over.  so yes, UTF8Type should be fine.
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Why not use UTF8Type or BytesType as the default?

The CompareWith attribute tells Cassandra how to sort the columns
for slicing operations. For backwards compatibility, the default
is to use AsciiType, which is probably NOT what you want.
Other options are UTF8Type, UUIDType, and LongType.
You can also specify the fully-qualified class name to a class
of your choice implementing org.apache.cassandra.db.marshal.IType.

If FlushPeriodInMinutes is configured and positive, it will be
flushed to disk with that period whether it is dirty or not.
This is intended for lightly-used columnfamilies so that they
do not prevent commitlog segments from being purged.

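
To see why an ASCII-only default bites here, the sketch below checks which
of a few names from this thread fit in 7-bit ASCII. It exercises only
Python's own codecs; fits_ascii is a hypothetical helper, and how
AsciiType itself treats such names is not shown in this thread.

```python
def fits_ascii(name):
    """Return True if the text can be represented in 7-bit ASCII."""
    try:
        name.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Sample names taken from the error log and session in this thread.
names = [
    u"friday night lights",
    u"instruments de musique sur un gu\u00e9ridon",
    u"\ucd5c\uc9c4\uc2e4 \uc774\ud63c",
]

for name in names:
    # Second column: how many bytes the name needs once encoded as UTF-8.
    print(fits_ascii(name), len(name.encode("utf-8")))
```
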
On Tue, Jul 21, 2009 at 2:27 PM, <mo...@gmail.com> wrote:

> thanks jonathantrying this
> <ColumnFamily ColumnType="Super" *CompareWith="UTF8Type" *Name="Super1"/>
>
>
> On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>
>> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com> wrote:
>> >> does this work in python thrift
>> >
>> > probably not, given the thrift utf8 bugs.
>>
>> to correct myself: now that we are using binary data in the thrift api
>> it can't screw us over.  so yes, UTF8Type should be fine.
>>
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>




Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
thanks jonathan
trying this
<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>


On Tue, Jul 21, 2009 at 2:24 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com> wrote:
> >> does this work in python thrift
> >
> > probably not, given the thrift utf8 bugs.
>
> to correct myself: now that we are using binary data in the thrift api
> it can't screw us over.  so yes, UTF8Type should be fine.
>




Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
On Tue, Jul 21, 2009 at 4:21 PM, Jonathan Ellis<jb...@gmail.com> wrote:
>> does this work in python thrift
>
> probably not, given the thrift utf8 bugs.

to correct myself: now that we are using binary data in the thrift api
it can't screw us over.  so yes, UTF8Type should be fine.

Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
On Tue, Jul 21, 2009 at 4:18 PM, <mo...@gmail.com> wrote:
> Hey jonathan
> this is not in the wiki or any documentation.

this is trunk.  i wrote it a couple days ago.  feel free to step in
and update the wiki.

> does this work in python thrift

probably not, given the thrift utf8 bugs.  (but you could use
BytesType and at least you will get the right data back.)
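
A rough illustration of the "right data back" point, in plain Python with
no Cassandra involved: names stored as raw UTF-8 bytes decode to exactly
what went in, and sorting those bytes gives the same order as sorting the
text by code point. roundtrip_sorted is a hypothetical helper, and it is
an assumption that BytesType compares names byte by byte in this way.

```python
def roundtrip_sorted(names):
    """Encode names to UTF-8, sort the raw bytes, and decode them again."""
    encoded = sorted(name.encode("utf-8") for name in names)
    return [raw.decode("utf-8") for raw in encoded]


names = [u"r\u00e9publique", u"guitar", u"\ucd5c\uc9c4\uc2e4", u"zebra"]
result = roundtrip_sorted(names)

# Nothing is lost on the way through bytes, and byte order matches
# Python's code point order for text.
assert set(result) == set(names)
assert result == sorted(names)
```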

> if it does - that would be perfect
> but this doesnt explain why keys cannot be utf8

because FB didn't write it and so far neither has anyone else.

-Jonathan

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Hey Jonathan,
this is not in the wiki or any documentation. Searching for
UTF8Type only turned up the suggestion of
adding CompareWith="UTF8Type" to
  <ColumnFamily ColumnType="Super" Name="Super1"/>

All my inserts and queries go to a super column family, Super1.

So should I change this to

<ColumnFamily ColumnType="Super" CompareWith="UTF8Type" Name="Super1"/>
Does this work in Python Thrift?
If it does, that would be perfect.

But this doesn't explain why keys cannot be UTF-8.

thanks

On Tue, Jul 21, 2009 at 2:06 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> did you read the new section in the config xml explaining how to use a
> UTF8 comparator?
>
> also: thrift itself is just plain broken for unicode support in some
> languages; see THRIFT-395
>
> I think the short version is that when you have a java server, unicode
> will work with java or C# clients but not with anything else
>
> (so if you are using a python client for instance switching to jython
> might be a workaround)
>
> On Tue, Jul 21, 2009 at 4:00 PM, <mo...@gmail.com> wrote:
> > Not fixed
> > The following utf8 key names and column names still give an error.
> > cass: 2009-07-21 13:55:35,597 error 98. ìµì§
> >                                             ì¤ ìì§
> > Ûïº] (1)icasso's, instruments de musique sur un guéridon]
> (1)Ûïº[irancel]
> > (1)ïº
> > cass: 2009-07-21 13:55:55,093 error 377. friday night lights
> > s03e01[âmegaupload..50 error 321. instruments de musique sur un
> guéridon[[
> > comâ
> >     cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15
> ç«¥æãçé]
> > (1)
> > cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for
> pc[dragon
> > balĺz pc games download] (1)
> > cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï» ﺳ
> ï»
> >
> > 导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430.
> > æç³æµ·[大å
> > cass: 2009-07-21 13:56:59,287 error 1510. cinquième
> république[définition
> > de  république?] (1)                                      导æç³æµ·] (1)
> > cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival in
> > navratt
> > ri golu] (1)
> > cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning
> > numbers[www.tnlottery] (1)
> > cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.comcheats[all
> > the buildabear.com cheats and codes] (1)
> > cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto shippuuden
> 78
> > subbed torrent] (1)
> >
> > On Tue, Jul 21, 2009 at 10:34 AM, Eric Evans <ee...@rackspace.com>
> wrote:
> >>
> >> On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
> >> > Is there any timeline on when commit 185 will be done as the utf8
> >> > error still exists
> >>
> >> 185 was committed yesterday.
> >>
> >> https://issues.apache.org/jira/browse/CASSANDRA-185
> >>
> >> --
> >> Eric Evans
> >> eevans@rackspace.com
> >>
> >
> >
> >
> > --
> > Bidegg worlds best auction site
> > http://bidegg.com
> >
>




Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
On Tue, Jul 21, 2009 at 4:06 PM, Jonathan Ellis<jb...@gmail.com> wrote:
> (so if you are using a python client for instance switching to jython
> might be a workaround)

that is, using the java thrift client, not the python ones.

Re: keys and column names cannot be utf-8

Posted by Jonathan Ellis <jb...@gmail.com>.
did you read the new section in the config xml explaining how to use a
UTF8 comparator?

also: thrift itself is just plain broken for unicode support in some
languages; see THRIFT-395

I think the short version is that when you have a java server, unicode
will work with java or C# clients but not with anything else

(so if you are using a python client for instance switching to jython
might be a workaround)

On Tue, Jul 21, 2009 at 4:00 PM, <mo...@gmail.com> wrote:
> Not fixed
> The following utf8 key names and column names still give an error.
> cass: 2009-07-21 13:55:35,597 error 98. ìµì§
>                                             ì¤ ìì§
> Ûïº] (1)icasso's, instruments de musique sur un guéridon] (1)Ûïº[irancel]
> (1)ïº
> cass: 2009-07-21 13:55:55,093 error 377. friday night lights
> s03e01[âmegaupload..50 error 321. instruments de musique sur un guéridon[[
> comâ
>     cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15 ç«¥æãçé]
> (1)
> cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for pc[dragon
> balĺz pc games download] (1)
> cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï» ﺳ ï»
>
> 导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430.
> æç³æµ·[大å
> cass: 2009-07-21 13:56:59,287 error 1510. cinquième république[définition
> de  république?] (1)                                      å¯¼æç³æµ·] (1)
> cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival in
> navratt
> ri golu] (1)
> cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning
> numbers[www.tnlottery] (1)
> cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.com cheats[all
> the buildabear.com cheats and codes] (1)
> cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto shippuuden 78
> subbed torrent] (1)
>
> On Tue, Jul 21, 2009 at 10:34 AM, Eric Evans <ee...@rackspace.com> wrote:
>>
>> On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
>> > Is there any timeline on when commit 185 will be done as the utf8
>> > error still exists
>>
>> 185 was committed yesterday.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-185
>>
>> --
>> Eric Evans
>> eevans@rackspace.com
>>
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>

Re: keys and column names cannot be utf-8

Posted by Sandeep Tata <sa...@gmail.com>.
This is after you changed the conf file to use UTF8Type for the column family?

On Tue, Jul 21, 2009 at 2:00 PM, <mo...@gmail.com> wrote:
> Not fixed
> The following utf8 key names and column names still give an error.
> cass: 2009-07-21 13:55:35,597 error 98. ìµì§
>                                             ì¤ ìì§
> Ûïº] (1)icasso's, instruments de musique sur un guéridon] (1)Ûïº[irancel]
> (1)ïº
> cass: 2009-07-21 13:55:55,093 error 377. friday night lights
> s03e01[âmegaupload..50 error 321. instruments de musique sur un guéridon[[
> comâ
>     cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15 ç«¥æãçé]
> (1)
> cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for pc[dragon
> balĺz pc games download] (1)
> cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï» ﺳ ï»
>
> 导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430.
> æç³æµ·[大å
> cass: 2009-07-21 13:56:59,287 error 1510. cinquième république[définition
> de  république?] (1)                                      å¯¼æç³æµ·] (1)
> cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival in
> navratt
> ri golu] (1)
> cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning
> numbers[www.tnlottery] (1)
> cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.com cheats[all
> the buildabear.com cheats and codes] (1)
> cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto shippuuden 78
> subbed torrent] (1)
>
> On Tue, Jul 21, 2009 at 10:34 AM, Eric Evans <ee...@rackspace.com> wrote:
>>
>> On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
>> > Is there any timeline on when commit 185 will be done as the utf8
>> > error still exists
>>
>> 185 was committed yesterday.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-185
>>
>> --
>> Eric Evans
>> eevans@rackspace.com
>>
>
>
>
> --
> Bidegg worlds best auction site
> http://bidegg.com
>

Re: keys and column names cannot be utf-8

Posted by mo...@gmail.com.
Not fixed
The following utf8 key names and column names still give an error.

cass: 2009-07-21 13:55:35,597 error 98. ìµì§
                                            ì¤ ìì§
Ûïº] (1)icasso's, instruments de musique sur un guéridon] (1)Ûïº[irancel]
(1)ïº
cass: 2009-07-21 13:55:55,093 error 377. friday night lights
s03e01[âmegaupload..50 error 321. instruments de musique sur un guéridon[[

comâ
    cass: 2009-07-21 13:56:12,341 error 637. asuka izumi photos[u15 ç«¥æãçé]
(1)
cass: 2009-07-21 13:56:39,380 error 1118. dragonball z games for pc[dragon
balĺz pc games download] (1)
cass: 2009-07-21 13:56:48,976 error 1301. ï»ïº­ïºïºï» ﺳ[ï»ïº­ïºïºï» ﺳ ï»

导æç³æµ·è¯±å¥¸å¯¼è´å¥³çèªæ] ((1)2009-07-21 13:56:55,352 error 1430.
æç³æµ·[大å
cass: 2009-07-21 13:56:59,287 error 1510. cinquième république[définition
de  république?] (1)                                      导æç³æµ·] (1)
cass: 2009-07-21 13:59:38,783 error 1842. navaratri kolu[doll festival in
navratt
ri golu] (1)
cass: 2009-07-21 13:59:39,069 error 1846. tn lottery winning
numbers[www.tnlottery] (1)
cass: 2009-07-21 13:59:39,274 error 1850. www.buildabearville.com cheats[all
the buildabear.com cheats and codes] (1)
cass: 2009-07-21 13:59:39,773 error 1860. shippuuden 78[naruto shippuuden 78
subbed torrent] (1)
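
A note on the garbled characters in the log above: runs such as "ìµ" are
what UTF-8 bytes look like when they are read back as Latin-1. The sketch
below reproduces the effect for the first syllable of the x.search value
quoted elsewhere in this thread. It is an assumption that the log was
damaged this way; the helper names are hypothetical, and bytes already
lost from the archived copy cannot be recovered like this.

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(garbled):
    """Reverse the mistake, provided no bytes were dropped along the way."""
    return garbled.encode("latin-1").decode("utf-8")


original = u"\ucd5c\uc9c4\uc2e4"
garbled = misread_as_latin1(original)

# Each Hangul syllable becomes three Latin-1 characters, some of them
# invisible control characters.
assert len(garbled) == 9
assert undo_misread(garbled) == original
```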


On Tue, Jul 21, 2009 at 10:34 AM, Eric Evans <ee...@rackspace.com> wrote:

> On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
> > Is there any timeline on when commit 185 will be done as the utf8
> > error still exists
>
> 185 was committed yesterday.
>
> https://issues.apache.org/jira/browse/CASSANDRA-185
>
> --
> Eric Evans
> eevans@rackspace.com
>
>



Re: keys and column names cannot be utf-8

Posted by Eric Evans <ee...@rackspace.com>.
On Tue, 2009-07-21 at 09:18 -0700, mobiledreamers@gmail.com wrote:
> Is there any timeline on when commit 185 will be done as the utf8
> error still exists

185 was committed yesterday.

https://issues.apache.org/jira/browse/CASSANDRA-185

-- 
Eric Evans
eevans@rackspace.com