Posted to derby-dev@db.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/09/28 09:43:00 UTC

[jira] [Commented] (DERBY-7132) SQLDataException when executing CAST inside a CASE WHEN clause

    [ https://issues.apache.org/jira/browse/DERBY-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610473#comment-17610473 ] 

Stamatis Zampetakis commented on DERBY-7132:
--------------------------------------------

I spent some more time on this problem and it turns out that it is not really specific to CASE WHEN evaluation as I initially thought. As [~rhillegas] mentioned, the problem comes from the fact that the filter/restriction including the CAST is pushed down too far.

In [^Derby-7132-plan.pdf], I sketched what the query plan looks like, highlighting the {{ProjectRestrictResultSet}} operator where the CASE WHEN clause is evaluated. I naively assumed that the CAST would be evaluated after all joins, but that's not the case here. According to the SQL standard, the WHERE clause is applied to the result of the FROM clause, so pushing the CAST so far down is still a bug. Derby, like most (if not all) well-known databases, pushes filters aggressively below inner joins. In the vast majority of cases this is a valid optimization, but when the condition can throw an exception (e.g., CAST) this optimization shouldn't be done (or at least should be done differently).

Just for the sake of completeness the combination of rows triggering the exception is the following:
{noformat}
@2723: { 1, default, hive, 8, 1, src_bucket_tbl } @2239 { 1, 2008-04-08, 0 }
{noformat}
Observe that all conditions in the {{CASE WHEN}} are satisfied, so we attempt to cast 2008-04-08 (PART_KEY_VAL) to a DECIMAL(21,0), which obviously fails.
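
To make the effect concrete, here is a minimal Java sketch of the two evaluation orders. It is not Derby code: the class name {{PushdownSketch}}, the three tiny in-memory tables, and the assumption that the partition holding 2008-04-08 belongs to a different table are all made up for illustration, and the DBS join is left out for brevity. It only shows that the same CASE/CAST filter is harmless when applied after all joins, but throws when it is applied to a row combination that the remaining join would have discarded.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature model of the query; none of this is Derby code.
class PushdownSketch {
    static final class Partition {
        final int partId, tblId;
        Partition(int partId, int tblId) { this.partId = partId; this.tblId = tblId; }
    }
    static final class Tbl {
        final int tblId; final String tblName;
        Tbl(int tblId, String tblName) { this.tblId = tblId; this.tblName = tblName; }
    }
    static final class KeyVal {
        final int partId; final String partKeyVal; final int integerIdx;
        KeyVal(int partId, String partKeyVal, int integerIdx) {
            this.partId = partId; this.partKeyVal = partKeyVal; this.integerIdx = integerIdx;
        }
    }

    // Assumed data: partition 1 (date-valued key) belongs to another table.
    static final Partition[] PARTITIONS = { new Partition(1, 7), new Partition(2, 8) };
    static final Tbl[] TBLS = { new Tbl(7, "other_tbl"), new Tbl(8, "src_bucket_tbl") };
    static final KeyVal[] KEY_VALS = { new KeyVal(1, "2008-04-08", 0), new KeyVal(2, "10", 0) };

    // Models: (CASE WHEN <guards> THEN CAST(PART_KEY_VAL AS DECIMAL) END) = 10
    static boolean caseFilter(Tbl t, KeyVal k) {
        if (!k.partKeyVal.equals("__HIVE_DEFAULT_PARTITION__")
                && t.tblName.equals("src_bucket_tbl")
                && k.integerIdx == 0) {
            // Stand-in for the CAST: throws NumberFormatException on non-numbers.
            return new BigDecimal(k.partKeyVal).compareTo(BigDecimal.TEN) == 0;
        }
        return false; // CASE yields NULL, and NULL = 10 does not qualify the row
    }

    // Filter applied to the result of the FROM clause (all joins done first).
    static List<Integer> filterAfterAllJoins() {
        List<Integer> out = new ArrayList<>();
        for (Partition p : PARTITIONS)
            for (Tbl t : TBLS)
                for (KeyVal k : KEY_VALS)
                    if (p.tblId == t.tblId && p.partId == k.partId && caseFilter(t, k))
                        out.add(p.partId);
        return out;
    }

    // Filter pushed below the join with PARTITIONS: it sees TBLS x KEY_VALS
    // combinations that the join conditions have not yet eliminated.
    static List<Integer> filterPushedBelowJoin() {
        List<Integer> out = new ArrayList<>();
        for (Tbl t : TBLS)
            for (KeyVal k : KEY_VALS)
                if (caseFilter(t, k)) // evaluated before PARTITIONS links t and k
                    for (Partition p : PARTITIONS)
                        if (p.tblId == t.tblId && p.partId == k.partId)
                            out.add(p.partId);
        return out;
    }

    public static void main(String[] args) {
        System.out.println("after all joins: " + filterAfterAllJoins());
        try {
            System.out.println("pushed down: " + filterPushedBelowJoin());
        } catch (NumberFormatException e) {
            System.out.println("pushed down: cast failed on a row the join would have dropped");
        }
    }
}
```

In this sketch the pushed-down variant pairs {{src_bucket_tbl}} with the 2008-04-08 key value, so every guard in the CASE holds and the cast is attempted, even though the join on PARTITIONS would never have produced that pair.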

Apologies for providing this sketch instead of the actual Derby plan, but I didn't find a way to get the plan when there is a {{RuntimeException}}. If you have any tips on this, I may use them in the future.

> SQLDataException when executing CAST inside a CASE WHEN clause
> --------------------------------------------------------------
>
>                 Key: DERBY-7132
>                 URL: https://issues.apache.org/jira/browse/DERBY-7132
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.14.2.0, 10.15.2.0
>            Reporter: Stamatis Zampetakis
>            Priority: Major
>         Attachments: Derby-7132-plan.pdf, derby-dump.tar.gz, schemaddl.sql, uml_schema.svg
>
>
> {code:sql}
> SELECT "PARTITIONS"."PART_ID"
> FROM "PARTITIONS"
>          INNER JOIN "TBLS" ON "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
>          INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID"
>          INNER JOIN "PARTITION_KEY_VALS" "FILTER0" ON "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
> WHERE "DBS"."CTLG_NAME" = 'hive'
>   AND "TBLS"."TBL_NAME" = 'src_bucket_tbl'
>   AND "DBS"."NAME" = 'default'
>   AND "FILTER0"."INTEGER_IDX" = 0
>   AND (((CASE
>              WHEN "FILTER0"."PART_KEY_VAL" <> '__HIVE_DEFAULT_PARTITION__'
>                  AND "TBLS"."TBL_NAME" = 'src_bucket_tbl'
>                  AND "DBS"."NAME" = 'default'
>                  AND "DBS"."CTLG_NAME" = 'hive'
>                  AND "FILTER0"."INTEGER_IDX" = 0 THEN cast("FILTER0"."PART_KEY_VAL" AS decimal(21, 0))
>     END) = 10))
> {code}
> The SQL query above fails with the following stacktrace when attempting to evaluate the CAST expression. Note that the condition inside the CASE WHEN clause guarantees that only legal values (numbers) are passed to the CAST function. Apparently, the operations are somehow reordered and the CAST is evaluated before the condition in the WHEN clause, which results in a non-number being passed to the CAST and causes the exception below.
> {noformat}
> Exception in thread "main" java.sql.SQLDataException: Invalid character string format for type DECIMAL.
> 	at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:84)
> 	at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:230)
> 	at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:424)
> 	at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:353)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2405)
> 	at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:88)
> 	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1436)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1709)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(EmbedPreparedStatement.java:286)
> 	at com.github.zabetak.CaseProblem.main(CaseProblem.java:63)
> Caused by: ERROR 22018: Invalid character string format for type DECIMAL.
> 	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:290)
> 	at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:285)
> 	at org.apache.derby.iapi.types.DataType.invalidFormat(DataType.java:1280)
> 	at org.apache.derby.iapi.types.DataType.setValue(DataType.java:552)
> 	at org.apache.derby.exe.acf81e0010x017fx0812xbaa5x00003a07fe880.e3(Unknown Source)
> 	at org.apache.derby.impl.services.reflect.DirectCall.invoke(ReflectGeneratedClass.java:107)
> 	at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java:302)
> 	at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(NestedLoopJoinResultSet.java:119)
> 	at org.apache.derby.impl.sql.execute.JoinResultSet.openCore(JoinResultSet.java:149)
> 	at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(ProjectRestrictResultSet.java:182)
> 	at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(BasicNoPutResultSetImpl.java:266)
> 	at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:472)
> 	at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:351)
> 	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1344)
> {noformat}
> The problem can be reproduced by running the query above against the Derby database attached to this issue.
> {code:java}
> // 'sql' holds the text of the query shown above.
> try (Connection c = DriverManager.getConnection("jdbc:derby:;databaseName=repro_derby_db");
>      PreparedStatement ps = c.prepareStatement(sql);
>      ResultSet rs = ps.executeQuery()) {
>   while (rs.next()) {
>     System.out.println(rs.getInt(1));
>   }
> }
> {code}
> Unfortunately, I couldn't write a minimal reproducer because slight changes to the order in which operations are performed in the database have an impact on the plan and may hide the problem.


