Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/09/08 19:11:00 UTC

[jira] [Commented] (ASTERIXDB-2949) SUBSTR function may produce malformed string

    [ https://issues.apache.org/jira/browse/ASTERIXDB-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412142#comment-17412142 ] 

ASF subversion and git services commented on ASTERIXDB-2949:
------------------------------------------------------------

Commit cc6143b4ef5bb3f505478ada2bd95350a0758f6a in asterixdb's branch refs/heads/master from Ali Alsuliman
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=cc6143b ]

[ASTERIXDB-2949][RUN][FUN] SUBSTR function produces malformed string

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
- Fix UTF8StringBuilder grow logic

UTF8StringBuilder initially takes an estimated length of the
string to be written and reserves space at the beginning
of the buffer to later store the length of the data written.
When the actual data written turns out to be longer than the
estimated length, so that more space is needed to store the length,
the string content needs to be shifted.

This patch fixes the starting offset of the data to be shifted.
It also modifies the estimated length calculation in the substring
method of UTF8StringPointable to account for
SUBSTR(input_string, 0, num_chars_to_substring), i.e. start offset = 0.
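
The reserve-then-shift behaviour described above can be illustrated with a
minimal sketch. This is not the UTF8StringBuilder source: the class name
LengthPrefixSketch, its methods, and the prefix byte layout are hypothetical,
and the sketch assumes a variable-length prefix carrying 7 bits per byte, which
is consistent with the 0-127 one-byte range mentioned in the issue.

```java
import java.util.Arrays;

// Hypothetical illustration only; names and byte layout are assumptions.
final class LengthPrefixSketch {

    // Bytes needed to encode `length` at 7 payload bits per byte (assumed format).
    static int prefixSize(int length) {
        int size = 1;
        while (length > 127) {
            length >>>= 7;
            size++;
        }
        return size;
    }

    // Writes `length` into `size` bytes, most significant group first,
    // setting the high bit on every byte except the last.
    static void writePrefix(byte[] buf, int offset, int length, int size) {
        for (int i = size - 1; i >= 0; i--) {
            int bits = (length >>> (7 * i)) & 0x7F;
            buf[offset++] = (byte) (i > 0 ? (bits | 0x80) : bits);
        }
    }

    // Reserves room for the prefix from an estimate, writes the content,
    // then moves the content if the real length needs a different prefix size.
    static byte[] build(byte[] content, int estimatedLength) {
        int reserved = prefixSize(estimatedLength);
        byte[] buf = new byte[reserved + content.length];
        System.arraycopy(content, 0, buf, reserved, content.length);

        int needed = prefixSize(content.length);
        if (needed != reserved) {
            byte[] resized = new byte[needed + content.length];
            // The copy must start at `reserved`, where the content actually
            // begins, not at the start of the buffer.
            System.arraycopy(buf, reserved, resized, needed, content.length);
            buf = resized;
        }
        writePrefix(buf, 0, content.length, needed);
        return buf;
    }

    public static void main(String[] args) {
        byte[] content = new byte[200];
        Arrays.fill(content, (byte) 'a');
        // An estimate of 0 reserves one prefix byte; 200 bytes of content need two.
        byte[] out = build(content, 0);
        System.out.println("prefix bytes: " + (out.length - content.length));
    }
}
```

In this sketch an under-estimate and an accurate estimate yield the same
bytes, which is the property the fix is meant to restore.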

Change-Id: If36253ff884a9c19eaa130c4e5e926f2dd9eea1d
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/12864
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Ali Alsuliman <al...@gmail.com>
Reviewed-by: Ian Maxon <im...@uci.edu>


> SUBSTR function may produce malformed string
> --------------------------------------------
>
>                 Key: ASTERIXDB-2949
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2949
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: FUN - Functions, RT - Runtime
>    Affects Versions: 0.9.3
>            Reporter: Ali Alsuliman
>            Assignee: Ali Alsuliman
>            Priority: Critical
>             Fix For: 0.9.8
>
>
> The SUBSTR function may produce a malformed string. SUBSTR uses a string builder to construct the output substring. Before constructing the string, it gives the builder an estimated length of the output substring and then writes the substring data to the builder's buffer. If the actual data written exceeds the estimated length by enough that the builder needs more space to encode the actual length, and must therefore shift the substring content, the resulting content is malformed, which may lead to failures higher up the stack.
> Also, for the function call SUBSTR(input_string, 0, num_chars_to_substring), i.e. start offset = 0, SUBSTR always estimates the length to be in the range 0-127, so if more than 127 characters are written, it hits the issue described above.


