Posted to jira@arrow.apache.org by "Andrew Lamb (Jira)" <ji...@apache.org> on 2021/04/26 13:26:02 UTC
[jira] [Commented] (ARROW-11059) [Rust] [DataFusion] Implement
extensible configuration mechanism
[ https://issues.apache.org/jira/browse/ARROW-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332333#comment-17332333 ]
Andrew Lamb commented on ARROW-11059:
-------------------------------------
Migrated to github: https://github.com/apache/arrow-datafusion/issues/138
> [Rust] [DataFusion] Implement extensible configuration mechanism
> ----------------------------------------------------------------
>
> Key: ARROW-11059
> URL: https://issues.apache.org/jira/browse/ARROW-11059
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
>
> We are getting to the point where there are multiple settings we could add to operators to fine-tune performance. Custom operators provided by crates that extend DataFusion may also need this capability.
> I propose that we add support for key-value configuration options so that we don't need to plumb through each new configuration setting that we add.
> For example, I am about to start on a "coalesce batches" operator, and I would like a setting such as "coalesce.batch.size".
> For built-in settings like this, we can attach metadata such as a description and a default value, and generate the user-facing documentation from that metadata.
> For example, here is how Spark defines configs:
> {code:java}
> val PARQUET_VECTORIZED_READER_ENABLED =
>   buildConf("spark.sql.parquet.enableVectorizedReader")
>     .doc("Enables vectorized parquet decoding.")
>     .version("2.0.0")
>     .booleanConf
>     .createWithDefault(true)
> {code}
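
To make the proposal above concrete, here is a minimal Rust sketch of what a key-value configuration registry could look like. It is not DataFusion's actual API: ConfigDef, ConfigRegistry, example_registry and the default of 4096 are all hypothetical names and values chosen for illustration. Only the key "coalesce.batch.size" comes from the issue text. The sketch assumes values are stored as strings and parsed on read, so that extension crates can set keys the core never registered.

```rust
use std::collections::HashMap;

/// Metadata for one built-in setting (hypothetical type, not a DataFusion API).
#[derive(Debug, Clone)]
pub struct ConfigDef {
    pub key: &'static str,
    pub doc: &'static str,
    pub default: &'static str,
}

/// Registered definitions plus user-supplied overrides, all stored as strings.
#[derive(Debug, Default)]
pub struct ConfigRegistry {
    defs: Vec<ConfigDef>,
    values: HashMap<String, String>,
}

impl ConfigRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a built-in setting together with its documentation and default.
    pub fn register(&mut self, def: ConfigDef) {
        self.defs.push(def);
    }

    /// Set any key, including keys from extension crates that were never registered.
    pub fn set(&mut self, key: &str, value: &str) {
        self.values.insert(key.to_string(), value.to_string());
    }

    /// Look up a value: an explicit override wins, otherwise the registered default.
    pub fn get(&self, key: &str) -> Option<&str> {
        if let Some(v) = self.values.get(key) {
            return Some(v.as_str());
        }
        self.defs.iter().find(|d| d.key == key).map(|d| d.default)
    }

    /// Typed accessor; returns None if the key is unknown or the value does not parse.
    pub fn get_usize(&self, key: &str) -> Option<usize> {
        self.get(key).and_then(|v| v.parse().ok())
    }

    /// Render one documentation line per registered setting.
    pub fn docs(&self) -> String {
        self.defs
            .iter()
            .map(|d| format!("{} (default: {}): {}\n", d.key, d.default, d.doc))
            .collect()
    }
}

/// Build a registry holding the one setting named in the issue.
/// The default of 4096 is a placeholder, not a value taken from DataFusion.
pub fn example_registry() -> ConfigRegistry {
    let mut registry = ConfigRegistry::new();
    registry.register(ConfigDef {
        key: "coalesce.batch.size",
        doc: "Target number of rows per batch produced by the coalesce batches operator.",
        default: "4096",
    });
    registry
}
```

With this shape, an operator would call get_usize("coalesce.batch.size") instead of receiving a dedicated constructor argument, and a custom operator in another crate could read its own keys through the same registry without any change to the core.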