Posted to issues@arrow.apache.org by "Jörn Horstmann (Jira)" <ji...@apache.org> on 2020/10/09 08:00:28 UTC

[jira] [Created] (ARROW-10243) [Rust] [Datafusion] Optimize literal expression evaluation

Jörn Horstmann created ARROW-10243:
--------------------------------------

             Summary: [Rust] [Datafusion] Optimize literal expression evaluation
                 Key: ARROW-10243
                 URL: https://issues.apache.org/jira/browse/ARROW-10243
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust, Rust - DataFusion
            Reporter: Jörn Horstmann


While benchmarking the tpch query I noticed that evaluating the physical literal expression takes up a sizable amount of time. I think the creation of the corresponding array for numeric literals can be sped up by creating the Buffer and ArrayData directly instead of going through a builder. That would also allow skipping the null bitmap for non-null literals.
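
To illustrate the difference between the two construction paths, here is a minimal, std-only sketch. `Float64Column`, `literal_via_builder` and `literal_direct` are hypothetical stand-ins for illustration and are not the Arrow API; the real change would use Arrow's Buffer and ArrayData. The sketch assumes the builder path appends element by element and maintains a validity bitmap.

```rust
/// Hypothetical stand-in for an Arrow primitive array: a values buffer plus
/// an optional validity bitmap (`None` means "no nulls").
#[derive(Debug, PartialEq)]
pub struct Float64Column {
    pub values: Vec<f64>,
    pub validity: Option<Vec<u8>>,
}

/// Builder-style path (assumed behaviour): append one element at a time and
/// set the matching bit in a validity bitmap.
pub fn literal_via_builder(value: f64, len: usize) -> Float64Column {
    let mut values = Vec::new();
    let mut validity = vec![0u8; (len + 7) / 8];
    for i in 0..len {
        values.push(value);
        validity[i / 8] |= 1 << (i % 8);
    }
    Float64Column { values, validity: Some(validity) }
}

/// Direct path: allocate and fill the values buffer in one step and skip the
/// bitmap entirely, since a non-null literal has no null slots.
pub fn literal_direct(value: f64, len: usize) -> Float64Column {
    Float64Column { values: vec![value; len], validity: None }
}
```

Both functions produce the same values; the direct path just avoids the per-element bookkeeping and the bitmap allocation.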

I'm also wondering whether it might be possible to cache the created array. For queries without a WHERE clause, I'd expect all batches except the last to have the same length, so the cached array could be reused for them. I'm not sure, though, where to store the cached value.
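
One possible shape for such a cache, as a std-only sketch: `LiteralCache` is a hypothetical type, not something that exists in DataFusion, and it sidesteps the open question of where the cache should live by assuming the caller has mutable access to it. It keeps only the most recently built array and rebuilds when the requested length changes.

```rust
use std::sync::Arc;

/// Hypothetical cache holding the most recently built literal array.
pub struct LiteralCache {
    value: f64,
    cached: Option<Arc<Vec<f64>>>,
    /// Number of times an array actually had to be built (for illustration).
    pub builds: usize,
}

impl LiteralCache {
    pub fn new(value: f64) -> Self {
        LiteralCache { value, cached: None, builds: 0 }
    }

    /// Returns an array of `len` copies of the literal, reusing the cached
    /// array when the batch length matches the previous request.
    pub fn get(&mut self, len: usize) -> Arc<Vec<f64>> {
        if let Some(arr) = &self.cached {
            if arr.len() == len {
                return Arc::clone(arr);
            }
        }
        let arr = Arc::new(vec![self.value; len]);
        self.cached = Some(Arc::clone(&arr));
        self.builds += 1;
        arr
    }
}
```

With equally sized batches, only the first call and the call for the shorter last batch would build a new array; every other call is a reference-count increment.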

Another possible optimization would be to cast literals up front in the logical plan. In the tpch query the literal `1` has type `u64` in the logical plan and then has to go through a cast kernel to be converted to `f64` before it can be used in an arithmetic expression.
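
The idea of casting literals at planning time can be sketched as a small rewrite over a toy expression tree. The `Expr`, `Scalar` and `DataType` enums below are simplified, hypothetical types for illustration and do not match DataFusion's actual definitions; only the `u64` to `f64` case from the description is handled.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    UInt64,
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    UInt64(u64),
    Float64(f64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(Scalar),
    Cast(Box<Expr>, DataType),
    Add(Box<Expr>, Box<Expr>),
}

/// Rewrites `Cast(Literal(u64), Float64)` into `Literal(f64)` so that no cast
/// kernel has to run per batch. All other expressions are left unchanged.
/// Note: `u64 as f64` loses precision for values above 2^53.
pub fn fold_literal_casts(expr: Expr) -> Expr {
    match expr {
        Expr::Cast(inner, to) => match (fold_literal_casts(*inner), to) {
            (Expr::Literal(Scalar::UInt64(v)), DataType::Float64) => {
                Expr::Literal(Scalar::Float64(v as f64))
            }
            (other, to) => Expr::Cast(Box::new(other), to),
        },
        Expr::Add(l, r) => Expr::Add(
            Box::new(fold_literal_casts(*l)),
            Box::new(fold_literal_casts(*r)),
        ),
        other => other,
    }
}
```

Applied to `x + CAST(1 AS f64)`, this yields `x + 1.0`, while a cast of a column is kept as is.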



--
This message was sent by Atlassian Jira
(v8.3.4#803005)