You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "David Wayne Birdsall (JIRA)" <ji...@apache.org> on 2015/11/02 23:07:27 UTC
[jira] [Commented] (TRAFODION-1575) Self-referencing update updates the column to a wrong value

    [ https://issues.apache.org/jira/browse/TRAFODION-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986149#comment-14986149 ] 

David Wayne Birdsall commented on TRAFODION-1575:
-------------------------------------------------

The problem is that the subquery is being done as a child of the hybrid_hash_join node, which is above the delete node. So the subquery in effect races the delete. In this case, it loses the race so it sees a changed copy of the source table, MYTABLE. The correct semantics would be to evaluate the subquery before any delete is done.

This problem is a latent bug which became visible with the DELETE costing changes in JIRA TRAFODION-25. Formerly a different plan was chosen, and the race turned out more fortunately.

To see this, one can turn off the DELETE costing code using CQD HBASE_DELETE_COSTING 'OFF', to get the stub costing that was formerly in place.

> Self-referencing update updates the column to a wrong value
> -----------------------------------------------------------
>
>                 Key: TRAFODION-1575
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1575
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 1.3-incubating
>         Environment: Can be reproduced on a workstation
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>             Fix For: 2.0-incubating
>
>
> As shown in the following execution output, the update statement tries to update c2 with count(distinct c2) from the same table. While the subquery ‘select c from (select count(distinct c2) from mytable) dt(c)’ returns the correct result 3 when it is run by itself, the update statement using the same subquery updated the column c2 to 2, instead of 3. The updated value always seems to be 1 less in this case.
> Here is the execution output:
> >>create schema mytest;
> --- SQL operation complete.
> >>
> >>create table mytable (c1 char(1), c2 integer);
> --- SQL operation complete.
> >>
> >>insert into mytable values ('A', 100), ('B', 200), ('C', 300);
> --- 3 row(s) inserted.
> >>select * from mytable order by 1;
> C1 C2
> -- -----------
> A 100
> B 200
> C 300
> --- 3 row(s) selected.
> >>select c from (select count(distinct c2) from mytable) dt(c);
> C
> --------------------
>                    3
> --- 1 row(s) selected.
> >>
> >>prepare xx from update mytable set c2 =
> +>(select c from (select count(distinct c2) from mytable) dt(c))
> +>where c2 = 100;
> --- SQL command prepared.
> >>explain options 'f' xx;
> LC RC OP OPERATOR OPT DESCRIPTION CARD
> ---- ---- ---- -------------------- -------- -------------------- ---------
> 12 . 13 root x 1.00E+001
> 10 11 12 tuple_flow 1.00E+001
> . . 11 trafodion_insert MYTABLE 1.00E+000
> 9 . 10 sort 1.00E+001
> 8 4 9 hybrid_hash_join 1.00E+001
> 6 7 8 nested_join 1.00E+001
> . . 7 trafodion_delete MYTABLE 1.00E+000
> 5 . 6 sort 1.00E+001
> . . 5 trafodion_scan MYTABLE 1.00E+001
> 3 . 4 sort_scalar_aggr 1.00E+000
> 2 . 3 sort_scalar_aggr 1.00E+000
> 1 . 2 hash_groupby 2.00E+000
> . . 1 trafodion_scan MYTABLE 1.00E+002
> --- SQL operation complete.
> >>execute xx;
> --- 1 row(s) updated.
> >>
> >>select * from mytable order by 1;
> C1 C2
> -- -----------
> A 2
> B 200
> C 300
> --- 3 row(s) selected.
> >>
> >>drop schema mytest cascade;
> --- SQL operation complete.
> >>
> The value of C2 in row A above should have been updated to 3.
> This problem was found by Wei-Shiun Tsai.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)