You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2015/10/29 19:23:27 UTC

[jira] [Created] (CALCITE-945) Sort algorithm that skips low-cardinality leading edge

Julian Hyde created CALCITE-945:
-----------------------------------

             Summary: Sort algorithm that skips low-cardinality leading edge
                 Key: CALCITE-945
                 URL: https://issues.apache.org/jira/browse/CALCITE-945
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
            Assignee: Julian Hyde


Calcite should support a sort algorithm that skips low-cardinality leading edge.

Consider a table EMP sorted on (gender, empno), where gender has 2 distinct values, and the query {code}SELECT * FROM emp ORDER BY empno{code}

You cannot satisfy the query using the native sort order alone, but you can efficiently merge the two sorted runs.

The query {code}SELECT * FROM emp WHERE empno BETWEEN 100 AND 105 ORDER BY empno{code} should be able to use a skip-scan.

This applies to Phoenix, where "salting" columns are often added to the leading edge of the key. The salting column has a fixed cardinality, less than 256, chosen to ensure the right degree of parallelism. Calcite would not need to model salting columns explicitly, just recognize that they are low cardinality. Phoenix has such a sort algorithm; it just need to be modelled as a RelNode with appropriate cost model.

This is issue is closely related to CALCITE-873. An equality constraint (i.e. a filter below the sort) ensures that a column entering the sort has cardinality C <= 1, whereas this issue considers where C is small. If you know C <= 1 you don't need a sort, but if C is small you can use a merge.

Cc [~jamestaylor], [~maryannxue], [~apurtell].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)