You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-dev@db.apache.org by "Knut Anders Hatlen (Created) (JIRA)" <ji...@apache.org> on 2011/12/08 16:45:40 UTC

[jira] [Created] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Precision for UPPER function is wrong if the returned value is longer than the literal argument
-----------------------------------------------------------------------------------------------

                 Key: DERBY-5525
                 URL: https://issues.apache.org/jira/browse/DERBY-5525
             Project: Derby
          Issue Type: Bug
          Components: SQL
    Affects Versions: 10.8.2.2
            Reporter: Knut Anders Hatlen


Seen in ij in a database with territory based collation and German locale:

==vv= COPIED FROM IJ CONSOLE =vv==
ij> VALUES UCASE('Straßenbahn');
1
-----------
STRASSENBA&

1 Zeile ausgewählt
==================================

And with JDBC calls:

    Connection c = DriverManager.getConnection(
            "jdbc:derby:memory:db;create=true;territory=de_DE;" +
            "collation=TERRITORY_BASED");
    Statement s = c.createStatement();
    ResultSet rs = s.executeQuery("values upper('Straße')");
    System.out.println(rs.getMetaData().getPrecision(1));
    rs.next();
    System.out.println(rs.getString(1));

This prints

6
STRASSE

The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Rick Hillegas (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165351#comment-13165351 ] 

Rick Hillegas commented on DERBY-5525:
--------------------------------------

Thanks, Knut. There is a parallel problem in the lcase() function. Appendix D of the JDBC spec doesn't give any guidance about what the return type of lcase() and ucase() should be. Since the methods are supposed to operate on any character expression (including CLOB-typed expressions), it may make sense to type the return values as CLOB. We may already raise an exception if the result doesn't fit in the maximum length for a CLOB.

It would be a shame if fixing this bug caused queries to perform worse because a ucase() or lcase() expression became un-indexable. I am not sure that this is an important enough bug to warrant a performance regression. Thanks.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Dag H. Wanvik (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166244#comment-13166244 ] 

Dag H. Wanvik commented on DERBY-5525:
--------------------------------------

Yes, my point was exactly that the metadata type is for the resultset. not the column. So, in order not to get truncation, one would have to write something like:
 
values cast(upper('Straße') as varchar(7));

This also displays correctly:

ij> values upper('Straße') , '1234567';
1      
-------
STRASSE
1234567

So, we should this issue as invalid. Do we generate a truncation warning, though?

> I don't see much value in adding a truncation warning for the case where upper() can be evaluated at compilation-time:
>
> select ... where col1 = upper( 'Straße' )

I think we should. The semantics for the case according to the standard should be the same. Since we *do* generate a result set which shows the truncation, I believe we should generate the warning as well.

As for lcase and ucase in JDBC, aren't these supposed to map to underlying functions if they are available? If so, why should they have different semantics? Sounds confusing to me..

                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Rick Hillegas (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166288#comment-13166288 ] 

Rick Hillegas commented on DERBY-5525:
--------------------------------------

Thanks, Dag. I see your point about generating a truncation warning for the compile-time constant case. Agreed.

I think that we treat lcase() and ucase() as SQL operators (not defined in the Standard) rather than SQL functions. As operators, lcase() and ucase() don't live in a schema, unlike functions. Derby's lcase() and ucase() expressions have variable data type, depending on the type of their arguments. In contrast, if they were SQL functions, they would have fixed return types and fixed argument types. I suppose we could model lcase() and ucase() as functions with thousands of overloads registered in every schema. The implications for DatabaseMetaData.getFunctions()/getFunctionColumns() would be startling!

We also re-purpose lcase() and ucase() as JDBC escape functions. So they can be invoked like this too:

  values ( {fn ucase( 'abc' ) } );

Appendix D of the JDBC spec partially describes the behavior of these escape functions. I don't think that the JDBC spec provides any guidance on the data type issues we're puzzling through here.

I guess what I'm saying is that lcase() and ucase() look to me like Derby extensions to the SQL language. Trying to fit them into the SQL Standard by modelling them as vendor-supplied functions doesn't make much sense to me.

Thanks,
-Rick
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Rick Hillegas (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166201#comment-13166201 ] 

Rick Hillegas commented on DERBY-5525:
--------------------------------------

Thanks for digging into the SQL Standard, Dag and Knut. I agree with your reading:

1) Derby is assigning the correct type to upper( expr ) and lower( expr ), viz., the type of expr.

2) Truncation caused by upper() and lower() should raise a warning.

I can see some value in adding a truncation warning for the case where upper() has to be evaluated at execution-time:

  select upper( col1 ) from ...

I don't see much value in adding a truncation warning for the case where upper() can be evaluated at compilation-time:

  select ... where col1 = upper( 'Straße' )

The behavior of ucase() and lcase() fall outside the SQL Standard. However, I don't see any value in changing these operators to make them behave like SQL functions.

I agree with Knut's interpretation of getPrecision(). For character-typed columns, getPrecision() is identical to getColumnDisplaySize(), i.e., it is the maximum length of the column's data type. The ResultSet is evaluated lazily but the immutable ResultSetMetaData is available before next() is called.

Thanks,
-Rick
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Mamta A. Satoor (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mamta A. Satoor updated DERBY-5525:
-----------------------------------

    Issue & fix info: Release Note Needed,Repro attached
             Urgency: Normal
              Labels: derby_triage10_9  (was: )
    
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>              Labels: derby_triage10_9
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165918#comment-13165918 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

> What is the meaning of "precision" for character strings in the result set? (...) But does this mean the actual column's length or the maximum length of the column type? If we have a VARCHAR(n), I think "n" just as reasonable as LENGTH(<col>) here...

It couldn't be LENGTH(<col>), I think, since there might be many rows, and the ResultSetMetaData is per ResultSet, not per row. The actual length of the value might be different in every row.

> If it were "n", I guess this issue would go away..

Not entirely. Since the returned value is "STRASSE" (7 characters), and the data type of the returned value is VARCHAR(6), we'd either need to truncate the value, or change the returned data type to VARCHAR(7). Otherwise, the value and the type would not be consistent.

> In any case, solving this just for the literal case isn't enough, I think.

Agreed. There might be a benefit of having special handling of the literal case, though. Not only because constant folding would move work from execution time to compile time, but also because the meta-data for expressions like "UPPER('abc')" would stay the same as before, even if the fix for the general case would set the precision to the maximum length of the data type.

> What should happen if an UPPER(col) leads to column type overflow? The SQL standard says "warning: truncation", cf. section 6.29 <string value function>, GR 6) g) ii).

Yes, that sounds reasonable.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Rick Hillegas (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165294#comment-13165294 ] 

Rick Hillegas commented on DERBY-5525:
--------------------------------------

Note the following related problem: What should we return as ResultSetMetaData.getColumnDisplaySize()/getPrecision() for the following query:

   select ucase( a ) from ...

String.toUpperCase() says that an upper case string may not be the same length as the lowercase version. The result length is Locale-specific. Note that users can supply their own custom Locales.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165330#comment-13165330 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

Moving the constant folding earlier, so that it happens before the bind phase, seems to do the trick:

ij> values upper('Straße');
1      
-------
STRASSE

1 row selected

Currently, constant folding happens in the beginning of the optimize phase. I vaguely recall there was a reason to put it exactly there, but I don't remember now.

Even if constant folding would solve the case with a string literal argument, there's still the general problem with "select ucase(a) from ...", mentioned by Rick. Unless we have some knowledge about the locale that we could use to come up with a better estimate, I think we'll have to set the precision to the maximum varchar length. (I also wonder what happens if we have a string close to the maximum length, and converting it to upper case makes it longer than the maximum length. Might run into problems similar to DERBY-4793.)
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Dag H. Wanvik (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165708#comment-13165708 ] 

Dag H. Wanvik commented on DERBY-5525:
--------------------------------------

What is the meaning of "precision" for character strings in the result set? The Javadoc says: "For character data, this is the length in characters." But does this mean the actual column's length or the maximum length of the column type? If we have a VARCHAR(n), I think "n" just as reasonable as LENGTH(<col>) here... but I am not sure. If it were "n", I guess this issue would go away.. In any case, solving this just for the literal case isn't enough, I think. We need to handle the general case. 
What should happen if an UPPER(col) leads to column type overflow? The SQL standard says "warning: truncation", cf. section 6.29 <string value function>, GR 6) g) ii).
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Rick Hillegas (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166336#comment-13166336 ] 

Rick Hillegas commented on DERBY-5525:
--------------------------------------

So what is being proposed is a behavior change which will need a release note. Users will need to type

  values cast(upper('Straße') as varchar(7)); 

rather than

  values upper('Straße')

Thanks,
-Rick
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169255#comment-13169255 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

It should be possible to get the proposed behaviour by making SimpleStringOperatorNode.bindExpression() insert a CastNode on top of the operator and cast the result to the correct type. But until we fix DERBY-5537, the cast will only truncate the value and not generate a truncation warning.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166295#comment-13166295 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

> So, we should this issue as invalid. Do we generate a truncation warning, though?

Right now we're not even truncating the value, so no truncation warning is generated.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166145#comment-13166145 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

My reading of the standard is that UPPER('Straße') should return
'STRASS' and give a truncation warning. I'm basing this on section
6.29 <string value function>:

> Syntax Rules:
>
> 8) If <fold> is specified, then the declared type of the result of
> <fold> is that of the <character value expression>.

The <character value expression> is 'Straße', which has the declared
type CHAR(6) (see 5.3 <literal>, syntax rule 13, which defines the
declared type of a character string literal). This means that the
declared type of the result of <fold> is also CHAR(6).

Furthermore, general rule 6) c) says:

> Let FRML be the length or maximum length in characters of the
> declared type of <fold>.

Since the declared type of <fold> is CHAR(6), FRML must be 6.

Finally, general rule 6) g) ii), mentioned by Dag, says:

> If FRL is greater than FRML, then the result of the <fold> is the
> first FRML characters of FR with length FRML. If any of the
> right-most (FRL – FRML) characters of FR are not <space> characters,
> then a completion condition is raised: warning — string data, right
> truncation.

FRL is the character length of the upper-case representation (which is
7), which is greater than FRML (which is 6). So this rule says that we
should return the first 6 characters of the upper-case representation,
that is 'STRASS', and raise a warning because the removed character is
not a space.
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165272#comment-13165272 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

Reported on derby-user: http://mail-archives.apache.org/mod_mbox/db-derby-user/201112.mbox/%3C20111208122118.123970%40gmx.net%3E
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (DERBY-5525) Precision for UPPER function is wrong if the returned value is longer than the literal argument

Posted by "Knut Anders Hatlen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/DERBY-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165320#comment-13165320 ] 

Knut Anders Hatlen commented on DERBY-5525:
-------------------------------------------

For the string literal case, I thought maybe constant folding might do the trick, so I added this method to SimpleStringOperatorNode:

    ValueNode evaluateConstantExpressions() throws StandardException {
        if (operand instanceof CharConstantNode) {
            CharConstantNode node = (CharConstantNode) operand;
            StringDataValue val = (StringDataValue) node.getValue();
            if (methodName.equals("upper")) {
                val = val.upper(null);
            } else {
                val = val.lower(null);
            }
            return (ValueNode) getNodeFactory().getNode(
                    C_NodeTypes.CHAR_CONSTANT_NODE,
                    val.getString(),
                    getContextManager());
        }

        return this;
    }

However, the meta-data seems to have been calculated before we do constant folding, so the precision is still wrong:

ij> values upper('Straße');
1     
------
STRAS&

1 row selected

However, the constant folding appears to be an improvement, since wrapping calling LENGTH on the result from UPPER changed from

ij> values length(upper('Straße'));
1          
-----------
6          

1 row selected

to

ij> values length(upper('Straße'));
1          
-----------
7          

1 row selected
                
> Precision for UPPER function is wrong if the returned value is longer than the literal argument
> -----------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5525
>                 URL: https://issues.apache.org/jira/browse/DERBY-5525
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.2.2
>            Reporter: Knut Anders Hatlen
>
> Seen in ij in a database with territory based collation and German locale:
> ==vv= COPIED FROM IJ CONSOLE =vv==
> ij> VALUES UCASE('Straßenbahn');
> 1
> -----------
> STRASSENBA&
> 1 Zeile ausgewählt
> ==================================
> And with JDBC calls:
>     Connection c = DriverManager.getConnection(
>             "jdbc:derby:memory:db;create=true;territory=de_DE;" +
>             "collation=TERRITORY_BASED");
>     Statement s = c.createStatement();
>     ResultSet rs = s.executeQuery("values upper('Straße')");
>     System.out.println(rs.getMetaData().getPrecision(1));
>     rs.next();
>     System.out.println(rs.getString(1));
> This prints
> 6
> STRASSE
> The precision is wrong, since the returned value is 7 characters long.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira