Posted to user@thrift.apache.org by Simon Falsig <si...@newtec.dk> on 2014/12/02 10:47:03 UTC

Thrift C++ library generating locale-specific JSON

Hi,

I'm using Thrift for serializing objects (in C++) into JSON strings, which
I then store as files. This usually works fine, but I just discovered that
if the locale of my system is set to a locale where the decimal separator
is ',' instead of '.' (for instance in Denmark), then the TJSONProtocol
(specifically the writeJSONDouble function that does the double to string
conversion through a boost::lexical_cast) will also use this separator
when serializing doubles. This kinda plays havoc with the JSON
specification, which does not allow for localized formatting, and depends
on always using '.' as a decimal separator. I could imagine that there may
be other more subtle places where the locale can have an effect, but this
is currently the only place I've been having trouble with.
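
[Editor's illustration] The effect described above can be reproduced without Thrift or Boost. The sketch below is only an approximation of the situation: it uses a plain std::ostringstream rather than boost::lexical_cast, and a hypothetical CommaDecimal facet stands in for a locale such as Danish, so the example does not depend on which system locales are installed. The names CommaDecimal, commaLocale and formatWith are made up for this example.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics a locale whose decimal separator is ','.
struct CommaDecimal : std::numpunct<char> {
protected:
  char do_decimal_point() const override { return ','; }
};

// Builds a locale that is the classic "C" locale except for the separator.
std::locale commaLocale() {
  // The locale takes ownership of the facet and deletes it when done.
  return std::locale(std::locale::classic(), new CommaDecimal);
}

// Formats a double using whatever locale is imbued into the stream.
std::string formatWith(const std::locale& loc, double value) {
  std::ostringstream out;
  out.imbue(loc);
  out << value;
  return out.str();
}
```

With these helpers, formatWith(commaLocale(), 3.14) yields "3,14", which is not a valid JSON number, while formatWith(std::locale::classic(), 3.14) yields "3.14".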

I can see that the same problem has been fixed for C# (commit
3da317bda100130b2f615034c46b0944888f0f14 / THRIFT-1245 C# JSON Protocol
uses culture-dependant decimal separator for double) but it doesn't seem
as if there's a similarly easy way to fix boost::lexical_cast...
In any case, I'm not too strong with respect to all this locale stuff, so
if anyone has a quick way to fix this, I'd be very interested.

Note that I'm using Thrift 0.9.0, but as far as I can see there haven't
been any relevant changes to TJSONProtocol.cpp since then.

Thanks in advance,
 - Simon

Re: Thrift C++ library generating locale-specific JSON

Posted by Simon Falsig <si...@newtec.dk>.
> Hi,
>
> How it should look (and hopefully work):
>
> std::string oldLoc = setlocale(LC_NUMERIC, NULL);
> setlocale(LC_NUMERIC, "C");
> std::string numberAsString = /* printf/lexical_cast/iostream */;
> setlocale(LC_NUMERIC, oldLoc.c_str());
>
> Preferably - some RAII construct has to be implemented :)
>
> It's probably the only portable solution. But: setlocale can be sloooow,
> so changing it once at the beginning of generation and restoring it at
> the end may be needed (local changes are better, but I don't know what
> the efficiency expectations are for the code generator).
>
> -KG

Hmm - I'd really prefer not to have to change the locale for anything but
the conversion. We have multiple threads running in parallel, and it's not
possible to say if one of those is trying to get a localized version of a
string at the same time as the locale is changed... Of course, the risk
would probably not be huge, but it's still there...

For reference, I've created an issue in the Jira system now - it can be
found here: https://issues.apache.org/jira/browse/THRIFT-2870

Thanks,
 - Simon

Re: Thrift C++ library generating locale-specific JSON

Posted by Konrad Grochowski <hc...@apache.org>.
Hi,

How it should look (and hopefully work):

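// Copy the name: the pointer returned by setlocale may be invalidated by the next call.
std::string oldLoc = setlocale(LC_NUMERIC, NULL);
setlocale(LC_NUMERIC, "C");
std::string numberAsString = /* printf/lexical_cast/iostream */;
setlocale(LC_NUMERIC, oldLoc.c_str());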

Preferably - some RAII construct has to be implemented :)

It's probably the only portable solution. But: setlocale can be sloooow,
so changing it once at the beginning of generation and restoring it at
the end may be needed (local changes are better, but I don't know what
the efficiency expectations are for the code generator).
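
[Editor's illustration] One possible shape for the RAII construct mentioned above is sketched here. It is an assumption about how such a guard could look, not code from Thrift; the names ScopedCNumericLocale and formatWithCLocale are hypothetical. Note that setlocale changes process-wide state, so this sketch is not safe when other threads format or parse numbers at the same time.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Hypothetical guard: switches LC_NUMERIC to "C" for the lifetime of the
// object and restores the previous setting in the destructor.
class ScopedCNumericLocale {
public:
  ScopedCNumericLocale() {
    // Copy the name: the buffer returned by setlocale may be overwritten
    // by a later setlocale call.
    const char* current = std::setlocale(LC_NUMERIC, nullptr);
    if (current != nullptr) {
      saved_ = current;
      hasSaved_ = true;
    }
    std::setlocale(LC_NUMERIC, "C");
  }
  ~ScopedCNumericLocale() {
    if (hasSaved_) std::setlocale(LC_NUMERIC, saved_.c_str());
  }
  ScopedCNumericLocale(const ScopedCNumericLocale&) = delete;
  ScopedCNumericLocale& operator=(const ScopedCNumericLocale&) = delete;

private:
  std::string saved_;
  bool hasSaved_ = false;
};

// Formats a double with '.' as the separator, whatever the C locale was.
std::string formatWithCLocale(double value) {
  ScopedCNumericLocale guard;
  char buffer[64];
  // 17 significant digits are enough to round-trip an IEEE 754 double.
  std::snprintf(buffer, sizeof(buffer), "%.17g", value);
  return std::string(buffer);
}
```

Calling formatWithCLocale(3.5) returns "3.5", and the LC_NUMERIC setting that was active before the call is active again afterwards. NaN and infinity are not treated specially in this sketch.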

-KG


On 2014-12-02 at 13:35, Randy Abernethy wrote:
> Hello Simon,
>
> If you create a Jira issue for this bug a Thrift developer (could be
> you or any other interested party) will repair the flaw and you can
> track it through the issue ticket.
>
> https://issues.apache.org/jira/browse/THRIFT/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel
>
> The Boost Lexical cast header is 2,747 lines long and it includes
> 10-15 similar headers. Something simpler here would make a lot of
> sense.
>
> Best,
> Randy


Re: Thrift C++ library generating locale-specific JSON

Posted by Simon Falsig <si...@newtec.dk>.
> Hello Simon,
>
> If you create a Jira issue for this bug a Thrift developer (could be you
> or any other interested party) will repair the flaw and you can track it
> through the issue ticket.
>
> https://issues.apache.org/jira/browse/THRIFT/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel
>
> The Boost Lexical cast header is 2,747 lines long and it includes
> 10-15 similar headers. Something simpler here would make a lot of sense.
>
> Best,
> Randy

Thanks for the answer! I've created an issue in the system now - it can be
found here: https://issues.apache.org/jira/browse/THRIFT-2870

I also managed to put together a patch that works on my own system at
least - using ostringstream with an imbued locale instead of the
boost::lexical_cast - but I'm in no way sure if it covers all corner cases
and such...
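
[Editor's illustration] The idea of an ostringstream with an imbued locale can be sketched as follows. This is not the patch attached to THRIFT-2870; the function name formatClassic is hypothetical, and the choice of precision (max_digits10, which needs C++11) is an assumption made for the example. Because only the local stream is affected, no global locale is changed and other threads are not disturbed.

```cpp
#include <iomanip>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: converts a double to text using the classic "C"
// locale for this one stream, independent of the global locale.
std::string formatClassic(double value) {
  std::ostringstream out;
  out.imbue(std::locale::classic());  // affects only this stream
  // max_digits10 significant digits are enough to round-trip a double.
  out << std::setprecision(std::numeric_limits<double>::max_digits10)
      << value;
  return out.str();
}
```

Even when the global C++ locale uses ',' as the decimal separator, formatClassic(3.25) returns "3.25". Special values such as NaN and infinity would still need separate handling before being written as JSON.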

Best regards,
 - Simon

Re: Thrift C++ library generating locale-specific JSON

Posted by Randy Abernethy <ra...@apache.org>.
Hello Simon,

If you create a Jira issue for this bug a Thrift developer (could be
you or any other interested party) will repair the flaw and you can
track it through the issue ticket.

https://issues.apache.org/jira/browse/THRIFT/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel

The Boost Lexical cast header is 2,747 lines long and it includes
10-15 similar headers. Something simpler here would make a lot of
sense.

Best,
Randy
