You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Alexander Rojas <al...@mesosphere.io> on 2017/01/03 14:23:26 UTC
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 3, 2017, 3:23 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Changes
-------
Rebases master.
Rewrites the tests as follows:
- Test with `jsonify` instead of `stringify` since usage of `JSON::Object` adds unnecesary time noise.
- Tests with a random set of strings to avoid compiler optimizations (the same doubles were being repeated over and over again).
- Updated proposals to their latest versions.
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs (updated)
-----
3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing (updated)
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
static const char separator =
std::use_facet<std::numpunct<char>>(stream.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/#review160814
-----------------------------------------------------------
Patch looks great!
Reviews applied: [52877]
Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker_build.sh
- Mesos ReviewBot
On Jan. 5, 2017, 5:05 p.m., Alexander Rojas wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52877/
> -----------------------------------------------------------
>
> (Updated Jan. 5, 2017, 5:05 p.m.)
>
>
> Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
>
>
> Bugs: MESOS-6349
> https://issues.apache.org/jira/browse/MESOS-6349
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Resolves an isse with the JSON serialization functions where floats
> would use the process active locale as a decimal separator instead
> of the JSON conformant period (`.`).
>
>
> Diffs
> -----
>
> 3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
> 3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
>
> Diff: https://reviews.apache.org/r/52877/diff/
>
>
> Testing
> -------
>
> In order to test the different options the following benchmarking program was
> written using [Google benchmark](https://github.com/google/benchmark):
>
> ```c++
> #include <benchmark/benchmark_api.h>
> #include <stout/json.hpp>
>
> #include <iostream>
> #include <random>
> #include <string>
> #include <sstream>
>
> static void BM_Jsonify(benchmark::State& state) {
> while (state.KeepRunning()) {
> state.PauseTiming();
>
> // Randomly generated set of numbers.
> std::vector<double> doubles = {
> 54366462691.1798,
> 3.35465250645312,
> 3056184950.05953,
> 74057564.7741182,
> 280.445791893924,
> 3446.63176368415,
> 419012114.325581,
> 3464212.51004162,
> 1.45156507568354,
> 13304750.4216248,
> 7716457488648.00,
> 700533630.650588,
> 679.659801950981,
> 307059152.688268,
> 5615744.28063731,
> 859.902033900705,
> 293878.810967451,
> 284685668.155903,
> 722683811.462448,
> 407.682284299325,
> 9874834.88341080,
> 4829649.14505646,
> 3045544.72401146,
> 1112707.08627010,
> 8379539.79388719,
> 3004161.89676627,
> 3866662.79849617,
> 3871151.73991937,
> 4090119.26657417,
> 4118699.88616345,
> 2104416.18322520,
> 9896898.63226234,
> 5957851.08773457,
> 3501068.52003430,
> 7524714.36218293,
> 4333874.01982647,
> 9562008.06930384,
> 3374957.45494027,
> 5867075.07463260,
> 815499.697450741,
> 600936.470830026,
> 9661425.72632153,
> 6392256.71537575,
> 7517969.33139398,
> 9031912.03425553,
> 5497593.85752735,
> 815419.808032205,
> 5098659.46057626,
> 930160.667551563,
> 5970944.98217500,
> 6166534.92677787,
> 3541537.67676275,
> 1291933.13549156,
> 9185094.72404290,
> 4507338.03523123,
> 9559754.89147696,
> 6152898.39204769,
> 2358966.41795446,
> 6520510.92218183,
> 2201757.78606032,
> 4960487.80737309,
> 1784969.91832029,
> 3858390.23735070,
> 1439952.27402359,
> 6407199.91163080,
> 6130379.95590661,
> 6427890.23913244,
> 2128879.29010338,
> 8175291.42483598,
> 1587278.27639235,
> 7493343.68705307,
> 4853439.25371342,
> 2564845.15639735,
> 1415661.63929173,
> 6388168.79342734,
> 3003424.90116775,
> 6915390.10600792,
> 7115390.65502377,
> 5288515.90088063,
> 1209208.86882085,
> 4483111.91884606,
> 5974377.97163572,
> 5821054.89489766,
> 8284136.84073623,
> 1607044.34898051,
> 3255087.31267763,
> 2305369.43079747,
> 1282392.57487249,
> 4884797.49134467,
> 5119420.62129117,
> 6276725.07755672,
> 3054999.92210194,
> 3594116.58894982,
> 6691568.49651968,
> 265562.721872220,
> 2864878.07276221,
> 627299.902077148,
> 5885179.44212665,
> 7654144.98508934,
> 590857.599685431
> };
>
> state.ResumeTiming();
>
> benchmark::DoNotOptimize(jsonify(object));
> }
> }
>
> BENCHMARK(BM_Jsonify);
>
> BENCHMARK_MAIN();
> ```
>
> The program creates a JSON object with 50 floats in it, and then it creates
> its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
> are similar but not the same, they were benchmarked as different. While the
> original solutions are non solutions since they produce erroneous results if
> localization is active, they constitute a baseline for comparison with the
> alternatives. These are the benchmarks for the original algorithms and their
> alternatives:
>
> 1. `json.hpp` default:
>
> ```c++
> char buffer[50] {};
> snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Perfomance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:06:35
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 985 ns 986 ns 714227
> ```
>
> 2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
> avoid string copies:
>
> ```c++
> char buffer[50] {};
> const int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:05:34
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 968 ns 969 ns 723417
> ```
>
> 3. Proposal 1, manually searching for a character which is not expected (and it
> is assumed to be the decimal separator) and replacing it:
>
> ```c++
> char buffer[50] {};
> int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> double_);
>
> const char separator =
> std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
> for (int i = 0; i < size; ++i) {
> if (buffer[i] == separator) {
> buffer[i] = '.';
> break;
> }
> }
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
>
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:07:39
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 981 ns 704544
> ```
>
> 4. Proposal 2, the C++ approach, use streams and stream manipulators:
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:09:40
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 977 ns 978 ns 715571
> ```
>
> 5. Proposal 3, try to avoid copies (there are still some being created):
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string representation = ss.str();
> int back = representation.size() - 1;
> for (; back > 0; --back) {
> if (representation[back] != '0') {
> break;
> }
> }
> representation.resize(back + 1);
> *stream_ << representation << (representation[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:10:52
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 984 ns 714935
> ```
>
> 6. Proposal 4: Using thread local variables:
>
> ```c++
> static THREAD_LOCAL std::stringstream* ss = nullptr;
> static THREAD_LOCAL bool initialized = false;
> if (!initialized) {
> ss = new std::stringstream;
> ss->imbue(std::locale::classic());
> *ss << std::setprecission(std::numeric_limits<double>::digits10)
> << std::showpoint;
> initialized = true;
> }
> *ss << double_;
> std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ss->str("");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:12:37
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 980 ns 980 ns 716413
> ```
>
> Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
>
>
> Thanks,
>
> Alexander Rojas
>
>
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 9, 2017, 3:29 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs (updated)
-----
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
double_);
const char separator =
std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Michael Park <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/#review160811
-----------------------------------------------------------
Fix it, then Ship it!
3rdparty/stout/include/stout/json.hpp (lines 690 - 697)
<https://reviews.apache.org/r/52877/#comment232040>
Could you depend on https://reviews.apache.org/r/55296 for this patch, and we won't have to worry about this part?
3rdparty/stout/include/stout/jsonify.hpp (lines 95 - 96)
<https://reviews.apache.org/r/52877/#comment232038>
This fits in one line.
3rdparty/stout/include/stout/jsonify.hpp (lines 131 - 132)
<https://reviews.apache.org/r/52877/#comment232039>
This fits in one line.
- Michael Park
On Jan. 5, 2017, 9:05 a.m., Alexander Rojas wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52877/
> -----------------------------------------------------------
>
> (Updated Jan. 5, 2017, 9:05 a.m.)
>
>
> Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
>
>
> Bugs: MESOS-6349
> https://issues.apache.org/jira/browse/MESOS-6349
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Resolves an isse with the JSON serialization functions where floats
> would use the process active locale as a decimal separator instead
> of the JSON conformant period (`.`).
>
>
> Diffs
> -----
>
> 3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
> 3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
>
> Diff: https://reviews.apache.org/r/52877/diff/
>
>
> Testing
> -------
>
> In order to test the different options the following benchmarking program was
> written using [Google benchmark](https://github.com/google/benchmark):
>
> ```c++
> #include <benchmark/benchmark_api.h>
> #include <stout/json.hpp>
>
> #include <iostream>
> #include <random>
> #include <string>
> #include <sstream>
>
> static void BM_Jsonify(benchmark::State& state) {
> while (state.KeepRunning()) {
> state.PauseTiming();
>
> // Randomly generated set of numbers.
> std::vector<double> doubles = {
> 54366462691.1798,
> 3.35465250645312,
> 3056184950.05953,
> 74057564.7741182,
> 280.445791893924,
> 3446.63176368415,
> 419012114.325581,
> 3464212.51004162,
> 1.45156507568354,
> 13304750.4216248,
> 7716457488648.00,
> 700533630.650588,
> 679.659801950981,
> 307059152.688268,
> 5615744.28063731,
> 859.902033900705,
> 293878.810967451,
> 284685668.155903,
> 722683811.462448,
> 407.682284299325,
> 9874834.88341080,
> 4829649.14505646,
> 3045544.72401146,
> 1112707.08627010,
> 8379539.79388719,
> 3004161.89676627,
> 3866662.79849617,
> 3871151.73991937,
> 4090119.26657417,
> 4118699.88616345,
> 2104416.18322520,
> 9896898.63226234,
> 5957851.08773457,
> 3501068.52003430,
> 7524714.36218293,
> 4333874.01982647,
> 9562008.06930384,
> 3374957.45494027,
> 5867075.07463260,
> 815499.697450741,
> 600936.470830026,
> 9661425.72632153,
> 6392256.71537575,
> 7517969.33139398,
> 9031912.03425553,
> 5497593.85752735,
> 815419.808032205,
> 5098659.46057626,
> 930160.667551563,
> 5970944.98217500,
> 6166534.92677787,
> 3541537.67676275,
> 1291933.13549156,
> 9185094.72404290,
> 4507338.03523123,
> 9559754.89147696,
> 6152898.39204769,
> 2358966.41795446,
> 6520510.92218183,
> 2201757.78606032,
> 4960487.80737309,
> 1784969.91832029,
> 3858390.23735070,
> 1439952.27402359,
> 6407199.91163080,
> 6130379.95590661,
> 6427890.23913244,
> 2128879.29010338,
> 8175291.42483598,
> 1587278.27639235,
> 7493343.68705307,
> 4853439.25371342,
> 2564845.15639735,
> 1415661.63929173,
> 6388168.79342734,
> 3003424.90116775,
> 6915390.10600792,
> 7115390.65502377,
> 5288515.90088063,
> 1209208.86882085,
> 4483111.91884606,
> 5974377.97163572,
> 5821054.89489766,
> 8284136.84073623,
> 1607044.34898051,
> 3255087.31267763,
> 2305369.43079747,
> 1282392.57487249,
> 4884797.49134467,
> 5119420.62129117,
> 6276725.07755672,
> 3054999.92210194,
> 3594116.58894982,
> 6691568.49651968,
> 265562.721872220,
> 2864878.07276221,
> 627299.902077148,
> 5885179.44212665,
> 7654144.98508934,
> 590857.599685431
> };
>
> state.ResumeTiming();
>
> benchmark::DoNotOptimize(jsonify(object));
> }
> }
>
> BENCHMARK(BM_Jsonify);
>
> BENCHMARK_MAIN();
> ```
>
> The program creates a JSON object with 50 floats in it, and then it creates
> its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
> are similar but not the same, they were benchmarked as different. While the
> original solutions are non solutions since they produce erroneous results if
> localization is active, they constitute a baseline for comparison with the
> alternatives. These are the benchmarks for the original algorithms and their
> alternatives:
>
> 1. `json.hpp` default:
>
> ```c++
> char buffer[50] {};
> snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Perfomance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:06:35
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 985 ns 986 ns 714227
> ```
>
> 2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
> avoid string copies:
>
> ```c++
> char buffer[50] {};
> const int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:05:34
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 968 ns 969 ns 723417
> ```
>
> 3. Proposal 1, manually searching for a character which is not expected (and it
> is assumed to be the decimal separator) and replacing it:
>
> ```c++
> char buffer[50] {};
> int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> double_);
>
> const char separator =
> std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
> for (int i = 0; i < size; ++i) {
> if (buffer[i] == separator) {
> buffer[i] = '.';
> break;
> }
> }
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
>
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:07:39
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 981 ns 704544
> ```
>
> 4. Proposal 2, the C++ approach, use streams and stream manipulators:
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:09:40
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 977 ns 978 ns 715571
> ```
>
> 5. Proposal 3, try to avoid copies (there are still some being created):
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string representation = ss.str();
> int back = representation.size() - 1;
> for (; back > 0; --back) {
> if (representation[back] != '0') {
> break;
> }
> }
> representation.resize(back + 1);
> *stream_ << representation << (representation[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:10:52
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 984 ns 714935
> ```
>
> 6. Proposal 4: Using thread local variables:
>
> ```c++
> static THREAD_LOCAL std::stringstream* ss = nullptr;
> static THREAD_LOCAL bool initialized = false;
> if (!initialized) {
> ss = new std::stringstream;
> ss->imbue(std::locale::classic());
> *ss << std::setprecission(std::numeric_limits<double>::digits10)
> << std::showpoint;
> initialized = true;
> }
> *ss << double_;
> std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ss->str("");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:12:37
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 980 ns 980 ns 716413
> ```
>
> Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
>
>
> Thanks,
>
> Alexander Rojas
>
>
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 5, 2017, 6:05 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description (updated)
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs
-----
3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
double_);
const char separator =
std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 5, 2017, 6:04 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description (updated)
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs (updated)
-----
3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
double_);
const char separator =
std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/#review160587
-----------------------------------------------------------
This looks like a much more light-weight solution. It might be worthwhile to at some point add some tests as well.
3rdparty/stout/include/stout/json.hpp (line 690)
<https://reviews.apache.org/r/52877/#comment231717>
Please include `locale` for the facets.
3rdparty/stout/include/stout/json.hpp (line 691)
<https://reviews.apache.org/r/52877/#comment231722>
Please use `foreach` here :/
I am also unsure about your use of `auto`. `trimmed` is just a `string` and you also assumed `char` in the facet above. You probably should also just use an explicit `char&` here.
3rdparty/stout/include/stout/json.hpp (line 693)
<https://reviews.apache.org/r/52877/#comment231718>
We can `break` here.
3rdparty/stout/include/stout/jsonify.hpp (line 60)
<https://reviews.apache.org/r/52877/#comment231719>
Please include `xlocale.h` for `LC_NUMERIC_MASK`.
Since calling `freelocale` on an invalid handle invokes UB, we should probably also assert success (i.e., `CHECK(c_locale_ != 0)`).
3rdparty/stout/include/stout/jsonify.hpp (line 91)
<https://reviews.apache.org/r/52877/#comment231720>
I feel this should be named `guard`; it probably also needs a comment so it isn't removed by accident.
3rdparty/stout/include/stout/jsonify.hpp (line 125)
<https://reviews.apache.org/r/52877/#comment231721>
I feel this should be named `guard`; it probably also needs a comment so it isn't removed by accident.
- Benjamin Bannier
On Jan. 4, 2017, 2:07 p.m., Alexander Rojas wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52877/
> -----------------------------------------------------------
>
> (Updated Jan. 4, 2017, 2:07 p.m.)
>
>
> Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
>
>
> Bugs: MESOS-6349
> https://issues.apache.org/jira/browse/MESOS-6349
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Resolves an isse with the JSON serialization functions where floats
> would use the process active locale as a decimal separator instead
> of the JSON conformant period (`.`).
>
>
> Diffs
> -----
>
> 3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
> 3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
>
> Diff: https://reviews.apache.org/r/52877/diff/
>
>
> Testing
> -------
>
> In order to test the different options the following benchmarking program was
> written using [Google benchmark](https://github.com/google/benchmark):
>
> ```c++
> #include <benchmark/benchmark_api.h>
> #include <stout/json.hpp>
>
> #include <iostream>
> #include <random>
> #include <string>
> #include <sstream>
>
> static void BM_Jsonify(benchmark::State& state) {
> while (state.KeepRunning()) {
> state.PauseTiming();
>
> // Randomly generated set of numbers.
> std::vector<double> doubles = {
> 54366462691.1798,
> 3.35465250645312,
> 3056184950.05953,
> 74057564.7741182,
> 280.445791893924,
> 3446.63176368415,
> 419012114.325581,
> 3464212.51004162,
> 1.45156507568354,
> 13304750.4216248,
> 7716457488648.00,
> 700533630.650588,
> 679.659801950981,
> 307059152.688268,
> 5615744.28063731,
> 859.902033900705,
> 293878.810967451,
> 284685668.155903,
> 722683811.462448,
> 407.682284299325,
> 9874834.88341080,
> 4829649.14505646,
> 3045544.72401146,
> 1112707.08627010,
> 8379539.79388719,
> 3004161.89676627,
> 3866662.79849617,
> 3871151.73991937,
> 4090119.26657417,
> 4118699.88616345,
> 2104416.18322520,
> 9896898.63226234,
> 5957851.08773457,
> 3501068.52003430,
> 7524714.36218293,
> 4333874.01982647,
> 9562008.06930384,
> 3374957.45494027,
> 5867075.07463260,
> 815499.697450741,
> 600936.470830026,
> 9661425.72632153,
> 6392256.71537575,
> 7517969.33139398,
> 9031912.03425553,
> 5497593.85752735,
> 815419.808032205,
> 5098659.46057626,
> 930160.667551563,
> 5970944.98217500,
> 6166534.92677787,
> 3541537.67676275,
> 1291933.13549156,
> 9185094.72404290,
> 4507338.03523123,
> 9559754.89147696,
> 6152898.39204769,
> 2358966.41795446,
> 6520510.92218183,
> 2201757.78606032,
> 4960487.80737309,
> 1784969.91832029,
> 3858390.23735070,
> 1439952.27402359,
> 6407199.91163080,
> 6130379.95590661,
> 6427890.23913244,
> 2128879.29010338,
> 8175291.42483598,
> 1587278.27639235,
> 7493343.68705307,
> 4853439.25371342,
> 2564845.15639735,
> 1415661.63929173,
> 6388168.79342734,
> 3003424.90116775,
> 6915390.10600792,
> 7115390.65502377,
> 5288515.90088063,
> 1209208.86882085,
> 4483111.91884606,
> 5974377.97163572,
> 5821054.89489766,
> 8284136.84073623,
> 1607044.34898051,
> 3255087.31267763,
> 2305369.43079747,
> 1282392.57487249,
> 4884797.49134467,
> 5119420.62129117,
> 6276725.07755672,
> 3054999.92210194,
> 3594116.58894982,
> 6691568.49651968,
> 265562.721872220,
> 2864878.07276221,
> 627299.902077148,
> 5885179.44212665,
> 7654144.98508934,
> 590857.599685431
> };
>
> state.ResumeTiming();
>
> benchmark::DoNotOptimize(jsonify(object));
> }
> }
>
> BENCHMARK(BM_Jsonify);
>
> BENCHMARK_MAIN();
> ```
>
> The program creates a JSON object with 50 floats in it, and then it creates
> its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
> are similar but not the same, they were benchmarked as different. While the
> original solutions are non solutions since they produce erroneous results if
> localization is active, they constitute a baseline for comparison with the
> alternatives. These are the benchmarks for the original algorithms and their
> alternatives:
>
> 1. `json.hpp` default:
>
> ```c++
> char buffer[50] {};
> snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Perfomance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:06:35
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 985 ns 986 ns 714227
> ```
>
> 2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
> avoid string copies:
>
> ```c++
> char buffer[50] {};
> const int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> number.value);
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:05:34
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 968 ns 969 ns 723417
> ```
>
> 3. Proposal 1, manually searching for a character which is not expected (and it
> is assumed to be the decimal separator) and replacing it:
>
> ```c++
> char buffer[50] {};
> int size = snprintf(
> buffer,
> sizeof(buffer),
> "%#.*g",
> std::numeric_limits<double>::digits10,
> double_);
>
> const char separator =
> std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
> for (int i = 0; i < size; ++i) {
> if (buffer[i] == separator) {
> buffer[i] = '.';
> break;
> }
> }
>
> int back = size - 1;
> for (; back > 0; --back) {
> if (buffer[back] != '0') {
> break;
> }
> buffer[back] = '\0';
> }
>
> *stream_ << buffer << (buffer[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:07:39
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 981 ns 704544
> ```
>
> 4. Proposal 2, the C++ approach, use streams and stream manipulators:
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:09:40
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 977 ns 978 ns 715571
> ```
>
> 5. Proposal 3, try to avoid copies (there are still some being created):
>
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
> << std::showpoint << number.value;
>
> std::string representation = ss.str();
> int back = representation.size() - 1;
> for (; back > 0; --back) {
> if (representation[back] != '0') {
> break;
> }
> }
> representation.resize(back + 1);
> *stream_ << representation << (representation[back] == '.' ? "0" : "");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:10:52
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 982 ns 984 ns 714935
> ```
>
> 6. Proposal 4: Using thread local variables:
>
> ```c++
> static THREAD_LOCAL std::stringstream* ss = nullptr;
> static THREAD_LOCAL bool initialized = false;
> if (!initialized) {
> ss = new std::stringstream;
> ss->imbue(std::locale::classic());
> *ss << std::setprecission(std::numeric_limits<double>::digits10)
> << std::showpoint;
> initialized = true;
> }
> *ss << double_;
> std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
> *stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
> ss->str("");
> ```
>
> Performance
>
> ```
> Run on (8 X 2800 MHz CPU s)
> 2017-01-03 15:12:37
> Benchmark Time CPU Iterations
> -------------------------------------------------
> BM_Jsonify 980 ns 980 ns 716413
> ```
>
> Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
>
>
> Thanks,
>
> Alexander Rojas
>
>
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 4, 2017, 2:07 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Changes
-------
Update generation of JSON using `uselocale()` in `jsonify` and changing manually the separator in `stringify()`
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs (updated)
-----
3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
double_);
const char separator =
std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas
Re: Review Request 52877: Fixed wrong float serialization in JSON in
locales different from C.
Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/
-----------------------------------------------------------
(Updated Jan. 3, 2017, 5:06 p.m.)
Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael Park.
Bugs: MESOS-6349
https://issues.apache.org/jira/browse/MESOS-6349
Repository: mesos
Description
-------
Resolves an isse with the JSON serialization functions where floats
would use the process active locale as a decimal separator instead
of the JSON conformant period (`.`).
Diffs
-----
3rdparty/stout/include/stout/json.hpp 62ce15274677112d142a3c829b4a9f06258c9e2c
3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad
Diff: https://reviews.apache.org/r/52877/diff/
Testing (updated)
-------
In order to test the different options the following benchmarking program was
written using [Google benchmark](https://github.com/google/benchmark):
```c++
#include <benchmark/benchmark_api.h>
#include <stout/json.hpp>
#include <iostream>
#include <random>
#include <string>
#include <sstream>
static void BM_Jsonify(benchmark::State& state) {
while (state.KeepRunning()) {
state.PauseTiming();
// Randomly generated set of numbers.
std::vector<double> doubles = {
54366462691.1798,
3.35465250645312,
3056184950.05953,
74057564.7741182,
280.445791893924,
3446.63176368415,
419012114.325581,
3464212.51004162,
1.45156507568354,
13304750.4216248,
7716457488648.00,
700533630.650588,
679.659801950981,
307059152.688268,
5615744.28063731,
859.902033900705,
293878.810967451,
284685668.155903,
722683811.462448,
407.682284299325,
9874834.88341080,
4829649.14505646,
3045544.72401146,
1112707.08627010,
8379539.79388719,
3004161.89676627,
3866662.79849617,
3871151.73991937,
4090119.26657417,
4118699.88616345,
2104416.18322520,
9896898.63226234,
5957851.08773457,
3501068.52003430,
7524714.36218293,
4333874.01982647,
9562008.06930384,
3374957.45494027,
5867075.07463260,
815499.697450741,
600936.470830026,
9661425.72632153,
6392256.71537575,
7517969.33139398,
9031912.03425553,
5497593.85752735,
815419.808032205,
5098659.46057626,
930160.667551563,
5970944.98217500,
6166534.92677787,
3541537.67676275,
1291933.13549156,
9185094.72404290,
4507338.03523123,
9559754.89147696,
6152898.39204769,
2358966.41795446,
6520510.92218183,
2201757.78606032,
4960487.80737309,
1784969.91832029,
3858390.23735070,
1439952.27402359,
6407199.91163080,
6130379.95590661,
6427890.23913244,
2128879.29010338,
8175291.42483598,
1587278.27639235,
7493343.68705307,
4853439.25371342,
2564845.15639735,
1415661.63929173,
6388168.79342734,
3003424.90116775,
6915390.10600792,
7115390.65502377,
5288515.90088063,
1209208.86882085,
4483111.91884606,
5974377.97163572,
5821054.89489766,
8284136.84073623,
1607044.34898051,
3255087.31267763,
2305369.43079747,
1282392.57487249,
4884797.49134467,
5119420.62129117,
6276725.07755672,
3054999.92210194,
3594116.58894982,
6691568.49651968,
265562.721872220,
2864878.07276221,
627299.902077148,
5885179.44212665,
7654144.98508934,
590857.599685431
};
state.ResumeTiming();
benchmark::DoNotOptimize(jsonify(object));
}
}
BENCHMARK(BM_Jsonify);
BENCHMARK_MAIN();
```
The program creates a JSON object with 50 floats in it, and then it creates
its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
are similar but not the same, they were benchmarked as different. While the
original solutions are non solutions since they produce erroneous results if
localization is active, they constitute a baseline for comparison with the
alternatives. These are the benchmarks for the original algorithms and their
alternatives:
1. `json.hpp` default:
```c++
char buffer[50] {};
snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Perfomance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:06:35
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 985 ns 986 ns 714227
```
2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
avoid string copies:
```c++
char buffer[50] {};
const int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
number.value);
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:05:34
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 968 ns 969 ns 723417
```
3. Proposal 1, manually searching for a character which is not expected (and it
is assumed to be the decimal separator) and replacing it:
```c++
char buffer[50] {};
int size = snprintf(
buffer,
sizeof(buffer),
"%#.*g",
std::numeric_limits<double>::digits10,
double_);
const char separator =
std::use_facet<std::numpunct<char>>(*stream_.getloc()).decimal_point();
for (int i = 0; i < size; ++i) {
if (buffer[i] == separator) {
buffer[i] = '.';
break;
}
}
int back = size - 1;
for (; back > 0; --back) {
if (buffer[back] != '0') {
break;
}
buffer[back] = '\0';
}
*stream_ << buffer << (buffer[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:07:39
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 981 ns 704544
```
4. Proposal 2, the C++ approach, use streams and stream manipulators:
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
return out << trimmed << (trimmed.back() == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:09:40
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 977 ns 978 ns 715571
```
5. Proposal 3, try to avoid copies (there are still some being created):
```c++
std::stringstream ss;
ss.imbue(std::locale::classic());
ss << std::setprecision(std::numeric_limits<double>::digits10)
<< std::showpoint << number.value;
std::string representation = ss.str();
int back = representation.size() - 1;
for (; back > 0; --back) {
if (representation[back] != '0') {
break;
}
}
representation.resize(back + 1);
*stream_ << representation << (representation[back] == '.' ? "0" : "");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:10:52
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 982 ns 984 ns 714935
```
6. Proposal 4: Using thread local variables:
```c++
static THREAD_LOCAL std::stringstream* ss = nullptr;
static THREAD_LOCAL bool initialized = false;
if (!initialized) {
ss = new std::stringstream;
ss->imbue(std::locale::classic());
*ss << std::setprecission(std::numeric_limits<double>::digits10)
<< std::showpoint;
initialized = true;
}
*ss << double_;
std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
*stream_ << trimmed << (trimmed.back() == '.' ? "0" : "");
ss->str("");
```
Performance
```
Run on (8 X 2800 MHz CPU s)
2017-01-03 15:12:37
Benchmark Time CPU Iterations
-------------------------------------------------
BM_Jsonify 980 ns 980 ns 716413
```
Surprisingly, proposal 2 seemed to be the most efficient after multiple runs of the suite.
Thanks,
Alexander Rojas