Posted to issues@arrow.apache.org by "Will Jones (Jira)" <ji...@apache.org> on 2022/06/16 17:06:00 UTC

[jira] [Created] (ARROW-16844) [C++][Python] Implement to/from substrait for Expression

Will Jones created ARROW-16844:
----------------------------------

             Summary: [C++][Python] Implement to/from substrait for Expression
                 Key: ARROW-16844
                 URL: https://issues.apache.org/jira/browse/ARROW-16844
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++, Python
            Reporter: Will Jones


DataFusion can convert between Substrait expressions and its own internal expressions. (See: [https://github.com/datafusion-contrib/datafusion-substrait].) It would be cool if we had a similar conversion for Acero's Expression class.

This might allow datafusion-python to easily use PyArrow datasets, using Substrait as an intermediate format to pass filters and projections down from DataFusion into the scanner. (See early draft here: [https://github.com/datafusion-contrib/datafusion-python/pull/21].)

One open question is what type the Python object representing a Substrait expression should have. IIUC, Python doesn't have direct bindings to the Substrait protobuf.

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)