Posted to issues@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/01/07 21:57:00 UTC

[jira] [Created] (ARROW-11174) [C++][Dataset] Make Expressions available for projection

Ben Kietzman created ARROW-11174:
------------------------------------

             Summary: [C++][Dataset] Make Expressions available for projection
                 Key: ARROW-11174
                 URL: https://issues.apache.org/jira/browse/ARROW-11174
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
    Affects Versions: 2.0.0
            Reporter: Ben Kietzman
            Assignee: Ben Kietzman
             Fix For: 4.0.0


RecordBatchProjector should be replaced by an expression calling the "project" compute function.

Projection currently supports only reordering and subselection of fields, materializing virtual columns where necessary. Replacing it with an Expression will allow arbitrary expressions to be specified for projected columns:
{code:java}
// project an explicit selection:
// SELECT a as "a", b as "b" ...
project({field_ref("a"), field_ref("b")}, {"a", "b"});

// project an arithmetic expression:
// SELECT a + b as "a + b" ...
project({add(field_ref("a"), field_ref("b"))}, {"a + b"});
{code}
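To make the intent concrete, here is a minimal, self-contained sketch of how a projection built from expressions could be evaluated against a single row. It does not use Arrow's API: Expr, Row, Evaluate and Project are hypothetical toy names restricted to int64 values, and only field_ref, add and the (expressions, names) shape are borrowed from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model, NOT Arrow's Expression class: an expression is a
// field reference, an integer literal, or the sum of two sub-expressions.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  enum Kind { kFieldRef, kLiteral, kAdd } kind;
  std::string name;     // used by kFieldRef
  int64_t value;        // used by kLiteral
  ExprPtr left, right;  // used by kAdd
};

ExprPtr field_ref(std::string name) {
  return std::make_shared<Expr>(
      Expr{Expr::kFieldRef, std::move(name), 0, nullptr, nullptr});
}
ExprPtr literal(int64_t v) {
  return std::make_shared<Expr>(Expr{Expr::kLiteral, "", v, nullptr, nullptr});
}
ExprPtr add(ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(
      Expr{Expr::kAdd, "", 0, std::move(l), std::move(r)});
}

// A row maps column names to values (assumption: every column is int64).
using Row = std::map<std::string, int64_t>;

int64_t Evaluate(const Expr& e, const Row& row) {
  switch (e.kind) {
    case Expr::kFieldRef: return row.at(e.name);  // throws if column is missing
    case Expr::kLiteral:  return e.value;
    case Expr::kAdd:      return Evaluate(*e.left, row) + Evaluate(*e.right, row);
  }
  return 0;  // not reached for valid kinds
}

// Evaluate each projected expression and store it under its output name.
Row Project(const std::vector<ExprPtr>& exprs,
            const std::vector<std::string>& names, const Row& row) {
  assert(exprs.size() == names.size());
  Row out;
  for (std::size_t i = 0; i < exprs.size(); ++i) {
    out[names[i]] = Evaluate(*exprs[i], row);
  }
  return out;
}
```

With a row holding a = 2 and b = 5, Project({add(field_ref("a"), field_ref("b"))}, {"a + b"}, row) yields a single output column named "a + b" with value 7. A real implementation would operate on whole record batches rather than one row at a time.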
This will also allow the same expression optimization machinery used for filters to be directly applied to projections. Virtual columns become a consequence of constant folding:
{code:java}
// project in a partition where a == 3:
assert(
  SimplifyWithGuarantee(
    project({field_ref("a"), field_ref("b")}, {"a", "b"}),
    equal(field_ref("a"), literal(3))
  )
  == project({literal(3), field_ref("b")}, {"a", "b"})
);
{code}
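The constant-folding step can be sketched in isolation. The following is a toy model under stated assumptions, not Arrow's SimplifyWithGuarantee: the guarantee is limited to a single "field == value" equality, values are int64, and Expr, EqualGuarantee, Projection, Simplify, SimplifyProjection and Equals are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy expression tree, NOT Arrow's Expression class.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  enum Kind { kFieldRef, kLiteral, kAdd } kind;
  std::string name;     // used by kFieldRef
  int64_t value;        // used by kLiteral
  ExprPtr left, right;  // used by kAdd
};

ExprPtr field_ref(std::string name) {
  return std::make_shared<Expr>(
      Expr{Expr::kFieldRef, std::move(name), 0, nullptr, nullptr});
}
ExprPtr literal(int64_t v) {
  return std::make_shared<Expr>(Expr{Expr::kLiteral, "", v, nullptr, nullptr});
}
ExprPtr add(ExprPtr l, ExprPtr r) {
  return std::make_shared<Expr>(
      Expr{Expr::kAdd, "", 0, std::move(l), std::move(r)});
}

// Structural equality between two expression trees.
bool Equals(const ExprPtr& l, const ExprPtr& r) {
  if (l->kind != r->kind) return false;
  switch (l->kind) {
    case Expr::kFieldRef: return l->name == r->name;
    case Expr::kLiteral:  return l->value == r->value;
    case Expr::kAdd:
      return Equals(l->left, r->left) && Equals(l->right, r->right);
  }
  return false;
}

// Assumption: the guarantee is a single equality such as a == 3, as a
// partition expression might provide.
struct EqualGuarantee { std::string field; int64_t value; };

// Substitute the guaranteed field with its known value, then fold any
// addition whose operands have both become literals.
ExprPtr Simplify(const ExprPtr& e, const EqualGuarantee& g) {
  switch (e->kind) {
    case Expr::kFieldRef:
      return e->name == g.field ? literal(g.value) : e;
    case Expr::kLiteral:
      return e;
    case Expr::kAdd: {
      ExprPtr l = Simplify(e->left, g);
      ExprPtr r = Simplify(e->right, g);
      if (l->kind == Expr::kLiteral && r->kind == Expr::kLiteral) {
        return literal(l->value + r->value);  // constant folding
      }
      return add(l, r);
    }
  }
  return e;
}

struct Projection {
  std::vector<ExprPtr> exprs;
  std::vector<std::string> names;
};

// Simplify every projected expression; output names are left untouched.
Projection SimplifyProjection(Projection p, const EqualGuarantee& g) {
  for (ExprPtr& e : p.exprs) e = Simplify(e, g);
  return p;
}
```

Simplifying the projection of a and b under the guarantee a == 3 leaves the names {"a", "b"} unchanged and turns the first expression into literal(3), which is the "virtual column" case described above. The same pass also folds add(field_ref("a"), literal(1)) into literal(4).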
 


