Posted to user@thrift.apache.org by Tushar Sudake <et...@gmail.com> on 2012/08/03 09:50:03 UTC

Issue unicode data over thrift

Hi,

I'm running some performance experiments with a Thrift-based server and
client. The server is written in Java, while the client is written in C++.

The server returns string bytes wrapped in a Java ByteBuffer object.

This works fine with non-Unicode data. But when the data contains Unicode
strings, the client receives or reads (I'm not sure which) corrupted
data, with characters replaced by '?'.

So my question is: is there a specific way to handle Unicode data with
Thrift?

I'm very new to Thrift and can't provide many technical details. (I haven't
written both the client and the server myself :) )


Any help would be greatly appreciated.

Thanks,
Tushar Sudake

Re: Issue unicode data over thrift

Posted by Tushar Sudake <et...@gmail.com>.
Thanks Mark.

Yes, plain strings (not in a ByteBuffer) work just fine. But I need to use
ByteBuffer to meet very fast data transfer requirements, so I'm manually
wrapping the strings in a ByteBuffer before sending them to the client.

Anyway, I figured out the issue. I had not specified UTF-8 as the encoding
on the server side when converting the string to bytes for the ByteBuffer,
so the bytes were presumably produced in the JVM's platform default charset
(Java strings are UTF-16 internally, but that is not what getBytes() emits
when called without a charset). My client only understands UTF-8 :)

On server side:
ByteBuffer.wrap(someString.getBytes("UTF-8"));
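
For anyone hitting the same symptom, here is a minimal sketch of the
difference, using only the JDK (no Thrift involved). The class and method
names (Utf8Wrap, wrapUtf8, roundTripAscii) are made up for illustration, and
US-ASCII merely stands in for a default charset that cannot represent the
text; the actual default on a given server may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Utf8Wrap {
    // Hypothetical helper: encode with an explicit charset, then wrap.
    public static ByteBuffer wrapUtf8(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    // Encode and decode with a charset that cannot represent every character.
    // String.getBytes substitutes the charset's replacement byte, which is '?'
    // for US-ASCII, so the original character is lost for good.
    public static String roundTripAscii(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape to stay source-encoding neutral

        ByteBuffer buf = wrapUtf8(text);
        System.out.println(buf.remaining());      // 5: three ASCII bytes plus two bytes for é

        System.out.println(roundTripAscii(text)); // caf?
    }
}
```

The point of the sketch is that the loss happens at encoding time on the
sender, before anything goes over the wire, so the receiver cannot repair it.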

Thanks again for your reply.

On Tue, Aug 7, 2012 at 1:26 AM, Mark Slee <ms...@fb.com> wrote:

> What data types are you using in your Thrift definition?
>
> If you use the String type, and don't wrap in ByteBuffer, you should get
> unicode strings on the Java side. Are you manually wrapping up your
> strings in a ByteBuffer? A ByteBuffer should not come into play unless you
> are using the binary type, which is intended not to do any unicode
> handling and just treat everything as raw bytes.
>
> Cheers,
> Mark

Re: Issue unicode data over thrift

Posted by Mark Slee <ms...@fb.com>.
What data types are you using in your Thrift definition?

If you use the String type, and don't wrap in ByteBuffer, you should get
unicode strings on the Java side. Are you manually wrapping up your
strings in a ByteBuffer? A ByteBuffer should not come into play unless you
are using the binary type, which is intended not to do any unicode
handling and just treat everything as raw bytes.
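
A small sketch of what "raw bytes" means in practice, using only the JDK
rather than Thrift-generated code: a ByteBuffer carries no charset
information, so whoever reads it has to decode with the same charset the
writer used. The class and method names (BinaryPayloadDecode, decode) are
hypothetical, and ISO-8859-1 is just an example of a mismatched charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BinaryPayloadDecode {
    // Hypothetical helper: decode the readable bytes of a buffer with the given charset.
    // duplicate() gives an independent position, so the caller's buffer is not consumed.
    public static String decode(ByteBuffer payload, Charset charset) {
        return charset.decode(payload.duplicate()).toString();
    }

    public static void main(String[] args) {
        String original = "\u00fcber"; // "über"
        ByteBuffer payload = ByteBuffer.wrap(original.getBytes(StandardCharsets.UTF_8));

        // Same charset on both sides: the text survives.
        System.out.println(decode(payload, StandardCharsets.UTF_8).equals(original));      // true

        // Mismatched charset: the two UTF-8 bytes of ü are read as two separate characters.
        System.out.println(decode(payload, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```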

Cheers,
Mark
