Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2021/04/01 06:50:18 UTC

[GitHub] [couchdb] bessbd opened a new pull request #3488: Fix libicu inconsistency

bessbd opened a new pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488


   <!-- Thank you for your contribution!
   
        Please fill in this form by replacing the Markdown comments
        with your text. If a section needs no action, remove it.
   
        Also remember that CouchDB uses the Review-Then-Commit (RTC) model
        of code collaboration. Positive feedback is represented by a +1 from
        committers and negative feedback by a -1. A -1 also means a veto, and
        needs to be addressed to proceed. Once there are no objections, the PR
        can be merged by a CouchDB committer.
   
        See: http://couchdb.apache.org/bylaws.html#decisions for more info. -->
   
   ## Overview
   
   <!-- Please give a brief description of the pull request:
        what problem it solves or how it makes things better. -->
   
   ## Testing recommendations
   
   <!-- Describe how we can test your changes.
        Do they provide any behaviour that end users
        could notice? -->
   
   ## Related Issues or Pull Requests
   
   <!-- If your changes affect multiple components in different
        repositories, please put links to those issues or pull requests here. -->
   
   ## Checklist
   
   - [ ] Code is written and works correctly
   - [ ] Changes are covered by tests
   - [ ] Any new configurable parameters are documented in `rel/overlay/etc/default.ini`
   - [ ] A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] bessbd commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
bessbd commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-814017001


   Superseded by https://github.com/apache/couchdb/pull/3490





[GitHub] [couchdb] nickva edited a comment on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812723151


   I thought of making it a bit more general since I noticed we'd perhaps have to handle `less({[?MAX_JSON_OBJ], _}, _) -> gt` and maybe also `{[?MAX_JSON_OBJ], _}, {[?MAX_JSON_OBJ], _} -> eq`. The one clause would work as is for now, but if we change the logic in the future it might fail unexpectedly. Then there is also the `?HIGHEST_KEY` macro, which I think is not used in `main`, but it may be used one day before CentOS 7 is retired, and then it would unexpectedly fail as well.





[GitHub] [couchdb] bessbd closed pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
bessbd closed pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488


   





[GitHub] [couchdb] nickva edited a comment on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812723151


   I thought of making it a bit more general since I noticed we'd perhaps have to handle `less({[?MAX_JSON_OBJ], _}, _) -> gt` and maybe also `{[?MAX_JSON_OBJ], _}, {[?MAX_JSON_OBJ], _} -> eq`. The one clause would work as is for now, but if we change the logic in the future it might fail unexpectedly. Then there is also the `?HIGHEST_KEY` macro, which I think is not used in `main`, but it may be used one day before CentOS 7 is retired, and then it would unexpectedly fail as well.





[GitHub] [couchdb] bessbd commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
bessbd commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-811688071


   This is related to https://github.com/apache/couchdb/issues/3394





[GitHub] [couchdb] nickva removed a comment on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva removed a comment on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812699068







[GitHub] [couchdb] nickva commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812723151


   I thought of making it a bit more general since I noticed we'd perhaps have to handle `less({[?MAX_JSON_OBJ], _}, _) -> gt` and maybe also `{[?MAX_JSON_OBJ], _}, {[?MAX_JSON_OBJ], _} -> eq`. The clause would work currently, but if we compare a bit differently in the future it might fail unexpectedly. Then there is also the `?HIGHEST_KEY` macro, which I think is not used in `main`, but it may be used one day before CentOS 7 is retired, and then it would unexpectedly fail.





[GitHub] [couchdb] bessbd commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
bessbd commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812734598


   > @bessbd I thought we could do a slightly more thorough check for the max string only in C right before we call the collator code #3490
   
   I responded over at https://github.com/apache/couchdb/pull/3490#issuecomment-812721292





[GitHub] [couchdb] nickva commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812715247


   @bessbd I thought we could do a slightly more thorough check for the max string only in C right before we call the collator code https://github.com/apache/couchdb/pull/3490
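
A check of this kind could look roughly like the sketch below. It is only an illustration, not the code from #3490: `compare_strings`, `is_max_str` and `collate_stub` are hypothetical names, the marker is assumed to be matched exactly (all four bytes, nothing more), and `collate_stub` is a plain bytewise stand-in for the real collator call so that the example stays self-contained.

```c
#include <stddef.h>
#include <string.h>

/* The 4-byte marker from the thread: <<255,255,255,255>>. */
static const unsigned char MAX_STR[] = {0xFF, 0xFF, 0xFF, 0xFF};

/* Returns 1 if buf is exactly the max-string marker, 0 otherwise.
 * Exact matching is an assumption made for this sketch. */
static int is_max_str(const unsigned char *buf, size_t len)
{
    return len == sizeof(MAX_STR) && memcmp(buf, MAX_STR, len) == 0;
}

/* Hypothetical stand-in for the collator call (NOT ICU):
 * bytewise comparison normalized to -1, 0 or 1. */
static int collate_stub(const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = memcmp(a, b, n);
    if (c == 0) {
        /* Common prefix is equal: the shorter string sorts first. */
        c = (alen > blen) - (alen < blen);
    }
    return (c > 0) - (c < 0);
}

/* Compare two string keys, deciding the marker cases before the
 * collator is ever consulted. */
int compare_strings(const unsigned char *a, size_t alen,
                    const unsigned char *b, size_t blen)
{
    int a_max = is_max_str(a, alen);
    int b_max = is_max_str(b, blen);

    if (a_max && b_max) return 0;   /* marker vs marker: equal   */
    if (a_max) return 1;            /* marker vs anything: greater */
    if (b_max) return -1;           /* anything vs marker: less    */
    return collate_stub(a, alen, b, blen);
}
```

Because the marker cases return before the collator is called, the result for the marker no longer depends on how a particular collation library version treats those bytes.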





[GitHub] [couchdb] nickva commented on pull request #3488: Fix libicu inconsistency

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3488:
URL: https://github.com/apache/couchdb/pull/3488#issuecomment-812699068


   It was interesting that we didn't hit this issue on CentOS 7 on 3.x so I was curious and traced the `less` call when running that mango test:
   
   ```
   10:9:04.429561 <0.11140.1> couch_ejson_compare:less_nif([{<<"ÿÿÿÿ">>}], [{[{<<"forename">>,<<"Eddie">>}]}])
   
   10:9:04.429783 <0.11140.1> couch_ejson_compare:less_nif/2 error badarg
   
   10:9:04.429984 <0.11140.1> couch_ejson_compare:less_erl([{<<"ÿÿÿÿ">>}], [{[{<<"forename">>,<<"Eddie">>}]}])
   
   10:9:04.430254 <0.11140.1> couch_ejson_compare:less_list([{<<"ÿÿÿÿ">>}], [{[{<<"forename">>,<<"Eddie">>}]}])
   
   10:9:04.430506 <0.11140.1> couch_ejson_compare:less_erl({<<"ÿÿÿÿ">>}, {[{<<"forename">>,<<"Eddie">>}]})
   
   10:9:04.430745 <0.11140.1> couch_ejson_compare:less_erl/2 --> 1
   
   10:9:04.430903 <0.11140.1> couch_ejson_compare:less_list/2 --> 1
   
   10:9:04.431088 <0.11140.1> couch_ejson_compare:less_erl/2 --> 1
   ```
   
   Apparently in `3.x` we compare an object `{[{<<"forename">>,<<"Eddie">>}]}` against a (highest?) marker, but that marker is not an object but a tuple with one element `{<<"ÿÿÿÿ">>}`. `less_nif/2` then notices the object is not valid ejson and throws a `badarg` error. The code then falls back to Erlang using `less_erl/2` and, at least in this case, returns the correct result (the highest marker is greater than the object, hence `1`).
   
   In `main` we compare against an actual object `{[{<<"ÿÿÿÿ">>, <<>>}]}`, so everything gets unpacked nicely. The old unicode library, at least versions older than 59 (and possibly 53), then didn't error out, but also didn't sort `<<255,255,255,255>>` as the highest element as intended. Here is CentOS 7 with libicu-50.2-4.el7_7.x86_64, for example:
   ```
   > couch_ejson_compare:less(<<"forename">>, <<255,255,255,255>>).
   1
   > couch_ejson_compare:less(<<"/">>, <<255,255,255,255>>).
   -1
   > couch_ejson_compare:less(<<"0">>, <<255,255,255,255>>).
   1
   ```
   
   and on macOS with icu4c 59:
   ```
   > couch_ejson_compare:less(<<"forename">>, <<255,255,255,255>>).
   -1
   > couch_ejson_compare:less(<<"/">>, <<255,255,255,255>>).
   -1
   > couch_ejson_compare:less(<<"0">>, <<255,255,255,255>>).
   -1
   ```
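
The manual checks above could be turned into a small probe that counts how many sample keys fail to sort below the `<<255,255,255,255>>` marker. The following is a minimal sketch in C under stated assumptions: the comparator uses the same -1/0/1 convention as the shell output above, and `cmp_fn`, `bytewise_cmp` and `count_misordered` are hypothetical names. `bytewise_cmp` is a stand-in, not the ICU collator or CouchDB's NIF, so it does not reproduce the libicu 50 behaviour; a real probe would plug in the actual comparison function.

```c
#include <stddef.h>
#include <string.h>

/* Comparator convention from the thread: returns -1, 0 or 1. */
typedef int (*cmp_fn)(const unsigned char *a, size_t alen,
                      const unsigned char *b, size_t blen);

/* Hypothetical stand-in comparator (bytewise), NOT the ICU collator. */
int bytewise_cmp(const unsigned char *a, size_t alen,
                 const unsigned char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = memcmp(a, b, n);
    if (c == 0) {
        /* Common prefix is equal: the shorter string sorts first. */
        c = (alen > blen) - (alen < blen);
    }
    return (c > 0) - (c < 0);
}

/* Counts the sample keys that do NOT sort strictly below the marker,
 * i.e. the keys for which cmp(key, marker) is anything other than -1. */
size_t count_misordered(cmp_fn cmp, const char *const *keys, size_t nkeys)
{
    static const unsigned char marker[] = {0xFF, 0xFF, 0xFF, 0xFF};
    size_t bad = 0;

    for (size_t i = 0; i < nkeys; i++) {
        const unsigned char *k = (const unsigned char *)keys[i];
        if (cmp(k, strlen(keys[i]), marker, sizeof(marker)) != -1) {
            bad++;
        }
    }
    return bad;
}
```

With a comparator that behaves like the icu4c 59 output above, the three sample keys `forename`, `/` and `0` would yield a count of zero; with one that behaves like the libicu 50.2 output, `forename` and `0` would be counted.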
   
   

