You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by liuyu000 <gi...@git.apache.org> on 2018/04/27 04:02:21 UTC

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

GitHub user liuyu000 opened a pull request:

    https://github.com/apache/trafodion/pull/1544

    [TRAFODION-3048] Correct the Syntax Descriptions for *ASCII Fucntion* in Trafodion SQL Reference Manual

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liuyu000/trafodion ASCII

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1544.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1544
    
----

----


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1544


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by liuyu000 <gi...@git.apache.org>.
Github user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r185458080
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    @DaveBirdsall Thanks for your explanations :)
    1. I tried `SELECT ASCII('ñ')` on UTF8 column but found a different behaviour: it did **not** generate an error.
    And the 'ñ', which is larger than 127, is **double-byte**.
    ![default](https://user-images.githubusercontent.com/20532956/39507086-f92c42f6-4e0e-11e8-8ec8-f47d2c5ae2bc.png)
    
        I also tried  `SELECT ASCII('ÿ')` on UTF8 column, it did **not** generate an error either (**double-byte**).
    ![default](https://user-images.githubusercontent.com/20532956/39507420-66439ff0-4e10-11e8-8838-764325c20cb8.png)
    
        Then I tried `SELECT ASCII('A')` on UTF8 column, it runs successfully (**single-byte**):
    ![default](https://user-images.githubusercontent.com/20532956/39507733-f2157994-4e11-11e8-8601-1744ce70d0e3.png)
    
    2. I inserted the Chinese character ‘昱’ and tried the same thing, it returned an error (which seems right, **double-byte**):
    ![default](https://user-images.githubusercontent.com/20532956/39507210-82f56d50-4e0f-11e8-9926-3ad3057279fc.png)
    
    
    
    
    
     


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r185953650
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    By "inserting into a <some character set name> column", I meant:
    
    create table t ( x char(4) character set <some character set name>);
    insert into t values ('<some string>');
    select x from t;
    
    You're doing something different. You are selecting a constant expression from DUAL.
    
    By default, the character set for your string constant is ISO88591. So, in your examples, you are executing the ASCII function against an ISO88591 string. Now, 'ñ' and 'ÿ' are valid ISO88591 characters. The behavior you got was the same as the behavior I got when I insert these into an ISO88591 column. In your last example, the compiler senses that the literal string is not ISO88591, so it gives it a datatype of CHAR CHARACTER SET UTF8 (or possibly UCS2). And that results in the 4106 error.
    
    So there is a very subtle difference in behavior here. For literals, the compiler is inferring the datatype of the string from its contents. For column references, the compiler is using the datatype of the column.


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by liuyu000 <gi...@git.apache.org>.
Github user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r187227770
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -573,22 +573,578 @@ ASCII is a {project-name} SQL extension.
     ASCII (character-expression)
     ```
     
    -* `_character-expression`
    +* `_character-expression_`
     +
     is an SQL character value expression that specifies a string of
    -characters. See <<character_value_expressions,Character Value Expressions>>.
    +characters. 
    ++
    +For more information, see <<character_value_expressions,Character Value Expressions>>.
     
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    -first byte in the string is greater than 127, {project-name} SQL returns this
    -error message:
    +ASCII function handles string and column differently:
    --- End diff --
    
    Thanks Dave, your comment has been incorporated  :sunglasses:


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r187071098
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -573,22 +573,578 @@ ASCII is a {project-name} SQL extension.
     ASCII (character-expression)
     ```
     
    -* `_character-expression`
    +* `_character-expression_`
     +
     is an SQL character value expression that specifies a string of
    -characters. See <<character_value_expressions,Character Value Expressions>>.
    +characters. 
    ++
    +For more information, see <<character_value_expressions,Character Value Expressions>>.
     
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    -first byte in the string is greater than 127, {project-name} SQL returns this
    -error message:
    +ASCII function handles string and column differently:
    --- End diff --
    
    Suggested wordsmith: "The ASCII function handles string literals and columns differently:"


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r185041381
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    I just tested this, and this doesn't seem to be the case. I inserted the string 'ñ' into an ISO88591 column, then selected ascii(x) from that column. (The 'ñ' character is the n-with-tilde, which is used in the Spanish language.) The value I got back was 241 (which seems right; the 'ñ' character has an encoding of hex 'f1' = 241 decimal in the ISO88591 code page). So, I think the behavior in Trafodion has changed.
    
    I also tried this on a UTF8 column. There, the ASCII function seems to work fine so long as the encoding is less than or equal to 127. Again, I inserted 'ñ', and then selected ascii(x) from that column. This time I did get an error, but a different error:
    
    *** ERROR[8428] The argument to function ASCII is not valid.
    
    I suspect what is going on there is the SQL run-time is allowing the ASCII function on UTF8 characters only if the UTF8 character is single-byte.


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by liuyu000 <gi...@git.apache.org>.
Github user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r186333746
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    @DaveBirdsall Thanks for your help :yum:  
    
    If I have understood you correctly, do you mean:
    
    ASCII function handles string and column differently:
    
    * For a string, ASCII function accepts only single-byte string expression, or else it returns the following error:
    ```
    *** ERROR[4106] The character set for the operand of function ASCII must be ISO88591. 
    ```
    
    For example.
    ```
    SQL>select ascii('昱') from dual;
    
    *** ERROR[4106] The character set for the operand of function ASCII must be ISO88591. [2018-05-07 06:01:55]
    ```
    
    * For a column, the string expression(s) in this column can be only single-byte,
      * For a string expression in the UTF8 character set, the value of the first byte in the string cannot be greater than 127
      * The first byte in the string can be all characters in the ISO88591 character set
    
      or else it returns the following error:
    ```
    *** ERROR[8428] The argument to function ASCII is not valid.
    ```
    
    For example,
    1. Suppose that we have the UTF8 table _t1_ like this:
    ```
    SQL>showddl t1;
    
    CREATE TABLE TRAFODION.SEABASE.T1
      ( 
        C1                               CHAR(4 CHARS) CHARACTER SET UTF8 COLLATE
          DEFAULT DEFAULT NULL NOT SERIALIZED
      )
     ATTRIBUTES ALIGNED FORMAT 
    ;
     
    -- GRANT SELECT, INSERT, DELETE, UPDATE, REFERENCES ON TRAFODION.SEABASE.T1 TO DB__ROOT WITH GRANT OPTION;
    
    --- SQL operation complete.
    ```
    
    ```
    SQL>select * from t1;
    
    C1
    ----------------
    ñ               
    
    --- 1 row(s) selected.
    ```
    Select the column c1 and the ASCII function returns the error 8428 because the ñ (UTF8) is not single-byte .
    ```
    SQL>select ascii(c1) from t1;
    
    *** ERROR[8428] The argument to function ASCII is not valid. [2018-05-07 02:13:42]
    ```
    
    2. Suppose that we have the ISO88591 table _t2_ like this:
    ```
    SQL>showddl t2;   
     
    CREATE TABLE TRAFODION.SEABASE.T2
      ( 
        C1                               CHAR(4) CHARACTER SET ISO88591 COLLATE
          DEFAULT DEFAULT NULL NOT SERIALIZED
      )
     ATTRIBUTES ALIGNED FORMAT 
    ;
     
    -- GRANT SELECT, INSERT, DELETE, UPDATE, REFERENCES ON TRAFODION.SEABASE.T2 TO DB__ROOT WITH GRANT OPTION;
    
    --- SQL operation complete.
    ```
    
    ```
    SQL>select * from t2;
    
    C1
    ----
    ñ   
    
    --- 1 row(s) selected.
    
    ```
    
    Select the column c1 and return the ASCII code of the first character ñ:
    ```
    SQL>select ascii(c1) from t2;
    
    (EXPR)
    ----------
           241
    
    --- 1 row(s) selected.
    ```
    3. For a UTF8 table contains multiple rows, all string expressions must be single-byte, or else it returns the following error:
    ```
    *** ERROR[8428] The argument to function ASCII is not valid.
    ```
    
    * Suppose that we have the UTF8 table _t6_ like this:
    ```
    SQL>showddl t6;
    
     CREATE TABLE TRAFODION.SEABASE.T6
      ( 
        C1                               VARCHAR(4 CHARS) CHARACTER SET UTF8
          COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
      )
     ATTRIBUTES ALIGNED FORMAT 
    ;
     
    -- GRANT SELECT, INSERT, DELETE, UPDATE, REFERENCES ON TRAFODION.SEABASE.T6 TO DB__ROOT WITH GRANT OPTION;
    
    --- SQL operation complete.
    ```
    ```
    SQL>select * from t6;
    
    C1
    --------------------
    a                   
    ñ                   
    
    --- 2 row(s) selected.
    ```
    The ASCII function returns the error 8428 because ñ (UTF8) is not single-byte.
    ```
    SQL>select ascii(c1) from t6;
    
    *** ERROR[8428] The argument to function ASCII is not valid. [2018-05-07 03:11:58]
    ```
    
    * Suppose that we have the UTF8 table _t8_ like this:
    ```
    SQL>showddl t8;
     
    CREATE TABLE TRAFODION.SEABASE.T8
      ( 
        C1                               VARCHAR(4 CHARS) CHARACTER SET UTF8
          COLLATE DEFAULT DEFAULT NULL NOT SERIALIZED
      )
     ATTRIBUTES ALIGNED FORMAT 
    ;
     
    -- GRANT SELECT, INSERT, DELETE, UPDATE, REFERENCES ON TRAFODION.SEABASE.T8 TO DB__ROOT WITH GRANT OPTION;
    
    --- SQL operation complete.
    ```
    ```
    SQL>select * from t8;
    
    C1
    ----------------
    a               
    b                
    
    --- 2 row(s) selected.
    ```
    Select the column c1 and return the ASCII codes of the first characters for all string expressions in the c1:
    ```
    SQL>select ascii(c1) from t8;
    
    (EXPR)
    ----------
            97
            98
    
    --- 2 row(s) selected.
    ```
    
    4. For a ISO88951 table contains multiple rows, the ASCII function returns the first ASCII codes for all string expressions in the selected column.
    * Suppose we have the ISO88591 table _t7_ like folliowing:
    ```
    SQL>showddl t7;
    
     CREATE TABLE TRAFODION.SCH.T7
      ( 
        C1                               VARCHAR(4) CHARACTER SET ISO88591 COLLATE
          DEFAULT DEFAULT NULL NOT SERIALIZED
      )
     ATTRIBUTES ALIGNED FORMAT 
    ;
     
    -- GRANT SELECT, INSERT, DELETE, UPDATE, REFERENCES ON TRAFODION.SEABASE.T7 TO DB__ROOT WITH GRANT OPTION;
    
    --- SQL operation complete.
    ```
    ```
    SQL>select * from t7;
    
    C1
    ----
    a   
    ñ   
    
    --- 2 row(s) selected.
    ```
    ```
    SQL>select ascii(c1) from t7;
    
    (EXPR)
    ----------
            97
           241
    
    --- 2 row(s) selected.
    ```


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by liuyu000 <gi...@git.apache.org>.
Github user liuyu000 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r186694064
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    @DaveBirdsall thank you for reading patiently :relaxed: :clap:
    I've incorporated them in this PR, could you please help review?  


---

[GitHub] trafodion pull request #1544: [TRAFODION-3048] Correct the Syntax Descriptio...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1544#discussion_r186575593
  
    --- Diff: docs/sql_reference/src/asciidoc/_chapters/sql_functions_and_expressions.adoc ---
    @@ -581,12 +581,12 @@ characters. See <<character_value_expressions,Character Value Expressions>>.
     [[considerations_for_ascii]]
     === Considerations For ASCII
     
    -For a string expression in the UTF8 character set, if the value of the
    +If the value of the
    --- End diff --
    
    Yes! That's correct. Seems like this is needlessly complicated, doesn't it? The compiler writer tried to make a useful compromise between the strong-typing of columns vs. a weaker but perhaps more user-convenient typing for string literals.


---