You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "mustafasrepo (via GitHub)" <gi...@apache.org> on 2023/04/29 16:41:56 UTC

[GitHub] [arrow-datafusion] mustafasrepo opened a new issue, #6159: Add support for ordering alias

mustafasrepo opened a new issue, #6159:
URL: https://github.com/apache/arrow-datafusion/issues/6159

   ### Is your feature request related to a problem or challenge?
   
   Datafusion can detect aliased columns (exactly same columns). This enables us to do additional optimizations during planning. However, same thing cannot be done in terms of ordering. 
   Consider query below
   ```sql
   (SELECT c9,
                      ROW_NUMBER() OVER(ORDER BY c9 ASC) as rn1
                      FROM aggregate_test_100
                      ORDER BY c9 ASC)
   ```
   Output of this query would satisfy `c9 ASC`. However, it will also satisfy `rn1 ASC`. However, currently we cannot detect this
   query below
   ```sql
   SELECT c9, rn1 FROM (SELECT c9,
                      ROW_NUMBER() OVER(ORDER BY c9 ASC) as rn1
                      FROM aggregate_test_100
                      ORDER BY c9 ASC)
      ORDER BY rn1
   ```
   produces the following plan
   ```
   "SortExec: expr=[rn1@1 ASC NULLS LAST]",
   "  ProjectionExec: expr=[c9@0 as c9, ROW_NUMBER() ORDER BY [aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW@1 as rn1]",
   "    BoundedWindowAggExec: wdw=[ROW_NUMBER(): Ok(Field { name: \"ROW_NUMBER()\", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Range, start_bound: Preceding(UInt32(NULL)), end_bound: CurrentRow }], mode=[Sorted]",
   "      SortExec: expr=[c9@0 ASC NULLS LAST]",
   "        CsvExec: files={1 group: [[SOURCE_PATH]]}, has_header=true, limit=None, projection=[c9]",
   ```
   We could have produced the following if were to detect that `c9 ASC` and `rn1 ASC` defines same ordering.
   ```
   "ProjectionExec: expr=[c9@0 as c9, ROW_NUMBER() ORDER BY [aggregate_test_100.c9 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW@1 as rn1]",
   "  BoundedWindowAggExec: wdw=[ROW_NUMBER(): Ok(Field { name: \"ROW_NUMBER()\", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Range, start_bound: Preceding(UInt32(NULL)), end_bound: CurrentRow }], mode=[Sorted]",
   "    SortExec: expr=[c9@0 ASC NULLS LAST]",
   "      CsvExec: files={1 group: [[SOURCE_PATH]]}, has_header=true, limit=None, projection=[c9]",
   ```
   
   ### Describe the solution you'd like
   
   I would like to have this support.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb closed issue #6159: Add support for ordering alias

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #6159: Add support for ordering alias
URL: https://github.com/apache/arrow-datafusion/issues/6159


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org