You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@logging.apache.org by Volkan Yazıcı <vo...@gmail.com> on 2020/04/07 20:19:06 UTC

JSON quoting discrepancies

Hello,

There are a couple of concerns I want to share regarding the available
JSON quoting options in the core. I would appreciate your feedback on
them.

1. My first concern is... There are 2 competing JSON quoting
   implementations in the core:

   - JsonUtils.quoteAsString(CharSequence, StringBuilder)
   - StringBuilders.escapeJson(StringBuilder, int)

   escapeJson() is our homegrown solution and quoteAsString() is the one
   we borrowed from Jackson. I am in favor of marking the first as deprecated,
   removing its implementation, and directing it to quoteAsString(). (Jackson
   clone is significantly faster.)

2. JsonUtils.quoteAsString() uses ThreadLocals without honoring the
    ENABLE_THREADLOCALS flag.

3. JsonUtils.quoteAsString() doesn't really need TLA at all; it can just receive
   a char[6] argument, where the call site can deal with the caching of that
   buffer at its own will.

4. escapeJson() produces slightly different results compared to
   quoteAsString() and Jackson. For demonstration purposes, I've
   created the following setup:

@Test
public void test_quoting() throws Exception {
    final ObjectMapper objectMapper = new ObjectMapper();
    final StringBuilder stringBuilder = new StringBuilder();
    final SoftAssertions assertions = new SoftAssertions();
    for (char c = Character.MIN_VALUE;; c++) {

        final String s = "" + c;
        final String jacksonEncodedJson =
                objectMapper.writeValueAsString(s);

        stringBuilder.setLength(0);
        JsonUtils.quoteAsString(s, stringBuilder);
        final String quoteAsStringEncodedJson =
                '"' + stringBuilder.toString() + '"';

        assertions
                .assertThat(quoteAsStringEncodedJson)
                .as("c=%d (quoteAsString)", (int) c)
                .isEqualTo(jacksonEncodedJson);

        stringBuilder.setLength(0);
        stringBuilder.append(c);
        StringBuilders.escapeJson(stringBuilder, 0);
        final String escapeJsonEncodedJson =
                '"' + stringBuilder.toString() + '"';

        assertions
                .assertThat(escapeJsonEncodedJson)
                .as("c=%d (escapeJson)", (int) c)
                .isEqualTo(jacksonEncodedJson);

        if (c == Character.MAX_VALUE) {
            break;
        }

    }
    assertions.assertAll();
}

   This is the output I get:

   The following 33 assertions failed:
   1) [c=127 (escapeJson)] expected:<""[ ]""> but was:<""[\u007F]"">
   2) [c=128 (escapeJson)] expected:<""[?]""> but was:<""[\u0080]"">
   3) [c=129 (escapeJson)] expected:<""[?]""> but was:<""[\u0081]"">
   4) [c=130 (escapeJson)] expected:<""[?]""> but was:<""[\u0082]"">
   5) [c=131 (escapeJson)] expected:<""[?]""> but was:<""[\u0083]"">
   6) [c=132 (escapeJson)] expected:<""[?]""> but was:<""[\u0084]"">
   7) [c=133 (escapeJson)] expected:<""[…]""> but was:<""[\u0085]"">
   8) [c=134 (escapeJson)] expected:<""[?]""> but was:<""[\u0086]"">
   9) [c=135 (escapeJson)] expected:<""[?]""> but was:<""[\u0087]"">
   10) [c=136 (escapeJson)] expected:<""[?]""> but was:<""[\u0088]"">
   11) [c=137 (escapeJson)] expected:<""[?]""> but was:<""[\u0089]"">
   12) [c=138 (escapeJson)] expected:<""[?]""> but was:<""[\u008A]"">
   13) [c=139 (escapeJson)] expected:<""[?]""> but was:<""[\u008B]"">
   14) [c=140 (escapeJson)] expected:<""[?]""> but was:<""[\u008C]"">
   15) [c=141 (escapeJson)] expected:<""[?]""> but was:<""[\u008D]"">
   16) [c=142 (escapeJson)] expected:<""[?]""> but was:<""[\u008E]"">
   17) [c=143 (escapeJson)] expected:<""[?]""> but was:<""[\u008F]"">
   18) [c=144 (escapeJson)] expected:<""[?]""> but was:<""[\u0090]"">
   19) [c=145 (escapeJson)] expected:<""[?]""> but was:<""[\u0091]"">
   20) [c=146 (escapeJson)] expected:<""[?]""> but was:<""[\u0092]"">
   21) [c=147 (escapeJson)] expected:<""[?]""> but was:<""[\u0093]"">
   22) [c=148 (escapeJson)] expected:<""[?]""> but was:<""[\u0094]"">
   23) [c=149 (escapeJson)] expected:<""[?]""> but was:<""[\u0095]"">
   24) [c=150 (escapeJson)] expected:<""[?]""> but was:<""[\u0096]"">
   25) [c=151 (escapeJson)] expected:<""[?]""> but was:<""[\u0097]"">
   26) [c=152 (escapeJson)] expected:<""[?]""> but was:<""[\u0098]"">
   27) [c=153 (escapeJson)] expected:<""[?]""> but was:<""[\u0099]"">
   28) [c=154 (escapeJson)] expected:<""[?]""> but was:<""[\u009A]"">
   29) [c=155 (escapeJson)] expected:<""[?]""> but was:<""[\u009B]"">
   30) [c=156 (escapeJson)] expected:<""[?]""> but was:<""[\u009C]"">
   31) [c=157 (escapeJson)] expected:<""[?]""> but was:<""[\u009D]"">
   32) [c=158 (escapeJson)] expected:<""[?]""> but was:<""[\u009E]"">
   33) [c=159 (escapeJson)] expected:<""[?]""> but was:<""[\u009F]"">

   I am not a JSON expert, but I find this worrying. Would anybody
   provide some further feedback on what is going on here, please?

Kind regards.

Re: JSON quoting discrepancies

Posted by Matt Sicker <bo...@gmail.com>.
Back when the first JSON escape method was written, I think it was
based on the JSON RFC. Looks like it's been modified a bit since then,
so I'm not sure if it's expected behavior or not.

On Tue, 7 Apr 2020 at 15:18, Volkan Yazıcı <vo...@gmail.com> wrote:
>
> Hello,
>
> There are a couple of concerns I want to share regarding the available
> JSON quoting options in the core. I would appreciate your feedback on
> them.
>
> 1. My first concern is... There are 2 competing JSON quoting
>    implementations in the core:
>
>    - JsonUtils.quoteAsString(CharSequence, StringBuilder)
>    - StringBuilders.escapeJson(StringBuilder, int)
>
>    escapeJson() is our homegrown solution and quoteAsString() is the one
>    we borrowed from Jackson. I am in favor of marking the first as deprecated,
>    removing its implementation, and directing it to quoteAsString(). (Jackson
>    clone is significantly faster.)
>
> 2. JsonUtils.quoteAsString() uses ThreadLocals without honoring the
>     ENABLE_THREADLOCALS flag.
>
> 3. JsonUtils.quoteAsString() doesn't really need TLA at all; it can just receive
>    a char[6] argument, where the call site can deal with the caching of that
>    buffer at its own will.
>
> 4. escapeJson() produces slightly different results compared to
>    quoteAsString() and Jackson. For demonstration purposes, I've
>    created the following setup:
>
> @Test
> public void test_quoting() throws Exception {
>     final ObjectMapper objectMapper = new ObjectMapper();
>     final StringBuilder stringBuilder = new StringBuilder();
>     final SoftAssertions assertions = new SoftAssertions();
>     for (char c = Character.MIN_VALUE;; c++) {
>
>         final String s = "" + c;
>         final String jacksonEncodedJson =
>                 objectMapper.writeValueAsString(s);
>
>         stringBuilder.setLength(0);
>         JsonUtils.quoteAsString(s, stringBuilder);
>         final String quoteAsStringEncodedJson =
>                 '"' + stringBuilder.toString() + '"';
>
>         assertions
>                 .assertThat(quoteAsStringEncodedJson)
>                 .as("c=%d (quoteAsString)", (int) c)
>                 .isEqualTo(jacksonEncodedJson);
>
>         stringBuilder.setLength(0);
>         stringBuilder.append(c);
>         StringBuilders.escapeJson(stringBuilder, 0);
>         final String escapeJsonEncodedJson =
>                 '"' + stringBuilder.toString() + '"';
>
>         assertions
>                 .assertThat(escapeJsonEncodedJson)
>                 .as("c=%d (escapeJson)", (int) c)
>                 .isEqualTo(jacksonEncodedJson);
>
>         if (c == Character.MAX_VALUE) {
>             break;
>         }
>
>     }
>     assertions.assertAll();
> }
>
>    This is the output I get:
>
>    The following 33 assertions failed:
>    1) [c=127 (escapeJson)] expected:<""[ ]""> but was:<""[\u007F]"">
>    2) [c=128 (escapeJson)] expected:<""[?]""> but was:<""[\u0080]"">
>    3) [c=129 (escapeJson)] expected:<""[?]""> but was:<""[\u0081]"">
>    4) [c=130 (escapeJson)] expected:<""[?]""> but was:<""[\u0082]"">
>    5) [c=131 (escapeJson)] expected:<""[?]""> but was:<""[\u0083]"">
>    6) [c=132 (escapeJson)] expected:<""[?]""> but was:<""[\u0084]"">
>    7) [c=133 (escapeJson)] expected:<""[…]""> but was:<""[\u0085]"">
>    8) [c=134 (escapeJson)] expected:<""[?]""> but was:<""[\u0086]"">
>    9) [c=135 (escapeJson)] expected:<""[?]""> but was:<""[\u0087]"">
>    10) [c=136 (escapeJson)] expected:<""[?]""> but was:<""[\u0088]"">
>    11) [c=137 (escapeJson)] expected:<""[?]""> but was:<""[\u0089]"">
>    12) [c=138 (escapeJson)] expected:<""[?]""> but was:<""[\u008A]"">
>    13) [c=139 (escapeJson)] expected:<""[?]""> but was:<""[\u008B]"">
>    14) [c=140 (escapeJson)] expected:<""[?]""> but was:<""[\u008C]"">
>    15) [c=141 (escapeJson)] expected:<""[?]""> but was:<""[\u008D]"">
>    16) [c=142 (escapeJson)] expected:<""[?]""> but was:<""[\u008E]"">
>    17) [c=143 (escapeJson)] expected:<""[?]""> but was:<""[\u008F]"">
>    18) [c=144 (escapeJson)] expected:<""[?]""> but was:<""[\u0090]"">
>    19) [c=145 (escapeJson)] expected:<""[?]""> but was:<""[\u0091]"">
>    20) [c=146 (escapeJson)] expected:<""[?]""> but was:<""[\u0092]"">
>    21) [c=147 (escapeJson)] expected:<""[?]""> but was:<""[\u0093]"">
>    22) [c=148 (escapeJson)] expected:<""[?]""> but was:<""[\u0094]"">
>    23) [c=149 (escapeJson)] expected:<""[?]""> but was:<""[\u0095]"">
>    24) [c=150 (escapeJson)] expected:<""[?]""> but was:<""[\u0096]"">
>    25) [c=151 (escapeJson)] expected:<""[?]""> but was:<""[\u0097]"">
>    26) [c=152 (escapeJson)] expected:<""[?]""> but was:<""[\u0098]"">
>    27) [c=153 (escapeJson)] expected:<""[?]""> but was:<""[\u0099]"">
>    28) [c=154 (escapeJson)] expected:<""[?]""> but was:<""[\u009A]"">
>    29) [c=155 (escapeJson)] expected:<""[?]""> but was:<""[\u009B]"">
>    30) [c=156 (escapeJson)] expected:<""[?]""> but was:<""[\u009C]"">
>    31) [c=157 (escapeJson)] expected:<""[?]""> but was:<""[\u009D]"">
>    32) [c=158 (escapeJson)] expected:<""[?]""> but was:<""[\u009E]"">
>    33) [c=159 (escapeJson)] expected:<""[?]""> but was:<""[\u009F]"">
>
>    I am not a JSON expert, but I find this worrying. Would anybody
>    provide some further feedback on what is going on here, please?
>
> Kind regards.



-- 
Matt Sicker <bo...@gmail.com>