You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/03/15 20:50:41 UTC
[jira] [Commented] (TRAFODION-2537) Salted indexes do not result in parallel index scan plans

    [ https://issues.apache.org/jira/browse/TRAFODION-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926968#comment-15926968 ] 

ASF GitHub Bot commented on TRAFODION-2537:
-------------------------------------------

GitHub user DaveBirdsall opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/1012

    [TRAFODION-2537] Add file_desc to salted secondary indexes indexes_desc

    The problem was a query on a large table with a salted index chose a serial plan when a parallel plan on the salted index would have been superior.
    
    The cause was a bit of missing logic. Function createNAFileSets (optimizer/NATable.cpp) relies on the presence of a files descriptor in the indexes descriptor to deduce that an index is salted. The code that generates these descriptors, Generator::createVirtualTableDesc (generator/Generator.cpp) however only creates a files descriptor for the clustering key or primary key indexes descriptor. So, the fix was to add logic to Generator::createVirtualTableDesc to create a files descriptor when a secondary index is salted.
    
    Note that the only form of salting supported on indexes today is SALT LIKE TABLE. If in the future we add support for salting an index on a different set of columns and a different set of partitions than the table, many additional changes will be needed.
    
    Two regression tests show changes in plan as a result of this change. In both cases, index scans that formerly were serial are now parallel. (Note: The one plan change for seabase/EXPECTED010 is near the bottom.) 
    
    See the JIRA for a discussion of the performance implications of this change.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DaveBirdsall/incubator-trafodion Trafodion2537

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/1012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1012
    
----
commit e5e1356bf1ea70ee4b23d4e921a218f40cdf8577
Author: Dave Birdsall <db...@apache.org>
Date:   2017-03-15T20:42:58Z

    [TRAFODION-2537] Add file_desc to salted secondary indexes indexes_desc

----


> Salted indexes do not result in parallel index scan plans
> ---------------------------------------------------------
>
>                 Key: TRAFODION-2537
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2537
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.1-incubating, 2.2-incubating
>         Environment: All, though it is more likely on a cluster.
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>
> The following script reproduces the problem:
> ?section setup
> create table test2735(a int not null not droppable primary key, b int, c int)
>  salt using 4 partitions;
> -- put 2,000,000 rows into it
> upsert using load into test2735
> select c0+c1*10+c2*100+c3*1000+c4*10000+c5*100000+c6*1000000,
>        c0+2*c1+4*c5,
>        c3+3*c4-7*c2
> from (values(1)) t
> transpose 0,1,2,3,4,5,6,7,8,9 as c0
> transpose 0,1,2,3,4,5,6,7,8,9 as c1
> transpose 0,1,2,3,4,5,6,7,8,9 as c2
> transpose 0,1,2,3,4,5,6,7,8,9 as c3
> transpose 0,1,2,3,4,5,6,7,8,9 as c4
> transpose 0,1,2,3,4,5,6,7,8,9 as c5
> transpose 0,1 as c6;
> update statistics for table test2735 on every column;
> ?section indexsetup
> create index itest2735s on test2735(b,c)
>   salt like table;
> create index itest2735ns on test2735(c);
> ?section doit
> prepare s0 from select a,count(*) from test2735 group by a;
> explain options 'f' s0;
> prepare s1 from select b,count(*) from test2735 group by b;
> explain options 'f' s1;
> prepare s2 from select c,count(*) from test2735 group by c;
> explain options 'f' s2;
> When the above script is run, the explain output is as follows:
> >>prepare s0 from select a,count(*) from test2735 group by a;
> --- SQL command prepared.
> >>explain options 'f' s0;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 1    .    2    root                                                  2.00E+006
> .    .    1    trafodion_index_scan            ITEST2735NS           2.00E+006
> --- SQL operation complete.
> >>
> >>prepare s1 from select b,count(*) from test2735 group by b;
> --- SQL command prepared.
> >>explain options 'f' s1;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 2    .    3    root                                                  6.40E+001
> 1    .    2    hash_groupby                                          6.40E+001
> .    .    1    trafodion_index_scan            ITEST2735S            2.00E+006
> --- SQL operation complete.
> >>
> >>prepare s2 from select c,count(*) from test2735 group by c;
> --- SQL command prepared.
> >>explain options 'f' s2;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 2    .    3    root                                                  1.00E+002
> 1    .    2    sort_groupby                                          1.00E+002
> .    .    1    trafodion_index_scan            ITEST2735NS           2.00E+006
> --- SQL operation complete.
> >>
> In particular, statement S1 chooses a serial plan, but should choose a parallel plan.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)