Posted to issues@trafodion.apache.org by "Hans Zeller (JIRA)" <ji...@apache.org> on 2016/12/15 04:37:58 UTC

[jira] [Commented] (TRAFODION-2400) Incorrect data returned by TMUDF with selection predicate on input table

    [ https://issues.apache.org/jira/browse/TRAFODION-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750363#comment-15750363 ] 

Hans Zeller commented on TRAFODION-2400:
----------------------------------------

The problem happens in preCodeGen. We decide early on which data types to present to the UDR writer, but preCodeGen may later replace an expression, such as a reference to the char(20) column userid, with a char(10) constant 'super-user'. As a result, the executor currently presents a record in the format determined by preCodeGen (using char(10) in this case) to the UDF, which still assumes the original record format with a char(20) column. This causes the UDF to read corrupted data.
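
To make the mismatch concrete, here is a rough Python model of it. This is only a sketch and not Trafodion code: the fixed-width layouts, the 15-byte width for ipAddr, the timestamp value, and the helper names produce_record and udf_read are all assumptions made up for illustration.

```python
import struct

# Hypothetical fixed-width record layouts (not Trafodion's actual row format).
# The UDF was told to expect: userid char(20), ts 8-byte integer, ipAddr char(15).
UDF_LAYOUT = struct.Struct("<20sq15s")
# After the constant substitution, the producer uses char(10) for userid.
PRODUCER_LAYOUT = struct.Struct("<10sq15s")

def produce_record(ts, ip):
    """Pack a row the way the producer does once userid became a char(10) constant."""
    return PRODUCER_LAYOUT.pack(b"super-user", ts, ip.ljust(15).encode())

def udf_read(buf):
    """Read a row assuming the original layout, as the UDF does."""
    # Pad the buffer so the read does not fail outright in this model; a real
    # reader would see whatever bytes happen to follow the record.
    buf = buf.ljust(UDF_LAYOUT.size, b"\x00")
    return UDF_LAYOUT.unpack(buf)

record = produce_record(1481776678000000, "10.0.0.1")
userid, ts, ip = udf_read(record)
print(userid)                  # 'super-user' followed by bytes of ts and ipAddr
print(ts == 1481776678000000)  # False: ts is read from the wrong offset
```

In this model every field after userid is shifted by 10 bytes, so both the timestamp and the address come back wrong. That matches the kind of corrupted userid and ipAddr values described in the issue.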

The fix is to add a cast to the original UDR data type in preCodeGen if needed. We don't want to change the data types used in the UDR during preCodeGen, since the UDR writer may rely on the types staying constant throughout the compilation and execution phases.
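
The idea of the fix can be sketched in the same style. Again this is a hypothetical Python model under assumed layouts, not the actual preCodeGen change: cast_to_char stands in for a SQL CAST to CHAR(20), and the record layout and sample values are invented for illustration.

```python
import struct

# Hypothetical layout the UDF was compiled against (not Trafodion's actual row format).
UDF_LAYOUT = struct.Struct("<20sq15s")
UDR_USERID_WIDTH = 20  # width of the type originally presented to the UDR writer

def cast_to_char(value, width):
    """Model CAST(value AS CHAR(width)): blank-pad to the fixed width."""
    if len(value) > width:
        raise ValueError("value does not fit in char(%d)" % width)
    return value.ljust(width, b" ")

def produce_record(ts, ip):
    """Pack a row in the layout the UDF expects."""
    # The substituted constant is char(10); cast it back to the original type.
    userid = cast_to_char(b"super-user", UDR_USERID_WIDTH)
    return UDF_LAYOUT.pack(userid, ts, ip.ljust(15).encode())

record = produce_record(1481776678000000, "10.0.0.1")
userid, ts, ip = UDF_LAYOUT.unpack(record)
print(userid.rstrip(), ts, ip.decode().rstrip())
```

Because the cast restores the original width, the producer and the UDF agree on every field offset again, and the types seen by the UDR writer never change.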

> Incorrect data returned by TMUDF with selection predicate on input table
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-2400
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2400
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>             Fix For: 2.0-incubating
>
>
> We saw incorrect results from a query with the following characteristics:
>   - the incorrect data is read from the input table
>   - there is an equals predicate col=const on the input table
>   - the constant const has a data type that is smaller than that of the column (e.g. comparing an int column to a constant 1000, which is a smallint, or comparing a char(20) column to a char(1) constant 'x')
> To demonstrate the issue, I added the following to regression test udr/TEST001:
> {noformat}
> SELECT cast(CONVERTTIMESTAMP(ts) as TIME(6)), userid, session_id, ipAddr
> FROM UDF(sessionize_dynamic(TABLE(SELECT userid,
>                                          JULIANTIMESTAMP(ts) as TS,
>                                          ipAddr
>                                   FROM clicks
>                                   WHERE userid='super-user'
>                                   PARTITION BY 1 ORDER BY 2),
>                             'USERID',
>                             'TS',
>                             60000000));
> SELECT cast(CONVERTTIMESTAMP(ts) as TIME(6)), userid, session_id, ipAddr
> FROM UDF(sessionize_dynamic(TABLE(SELECT userid,
>                                          JULIANTIMESTAMP(ts) as TS,
>                                          ipAddr
>                                   FROM clicks
>                                   WHERE userid='super-user'
>                                   PARTITION BY 1 ORDER BY 2),
>                             'USERID',
>                             'TS',
>                             60000000));
> {noformat}
> For some reason I had to run the same select twice; the first one didn't show a corrupted userid and/or ipAddr field.


