Posted to dev@parquet.apache.org by "Claire McGinty (Jira)" <ji...@apache.org> on 2023/09/19 14:36:00 UTC
[jira] [Created] (PARQUET-2350) Create Configuration key for enabling Byte Stream Split Encoding in ParquetWriter
Claire McGinty created PARQUET-2350:
---------------------------------------
Summary: Create Configuration key for enabling Byte Stream Split Encoding in ParquetWriter
Key: PARQUET-2350
URL: https://issues.apache.org/jira/browse/PARQUET-2350
Project: Parquet
Issue Type: Improvement
Reporter: Claire McGinty
All of the properties in [ParquetWriter|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java] have an associated Configuration key (for example, [ParquetOutputFormat.DICTIONARY_PAGE_SIZE|https://github.com/apache/parquet-mr/blob/910bcc4edc2d707670e02e9ceadd98dacd9f08d2/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java#L140] corresponds to ParquetWriter#withDictionaryPageSize), except for `ParquetWriter#withByteStreamSplitEncoding`.
Can we add a Configuration key for this? I'm happy to open a PR, given some input on the naming convention (perhaps `parquet.encoding.bytestreamsplit.enabled`?).
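
To make the request concrete, here is a rough, self-contained sketch of how such a key might be resolved. It is only an illustration: the class name, the constant names, and the `false` default are assumptions, the key string is just the suggestion above, and a plain `Map` stands in for the Hadoop `Configuration` so the snippet compiles without parquet-mr on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class ByteStreamSplitConfigSketch {
    // Hypothetical key name, taken from the suggestion in this issue.
    // It is not an existing parquet-mr constant.
    static final String BYTE_STREAM_SPLIT_ENABLED = "parquet.encoding.bytestreamsplit.enabled";

    // Assumed default: the encoding stays off unless explicitly enabled.
    static final boolean DEFAULT_BYTE_STREAM_SPLIT_ENABLED = false;

    // Resolves the flag from a key/value configuration (a stand-in for
    // Hadoop's Configuration). A missing key falls back to the default.
    static boolean isByteStreamSplitEnabled(Map<String, String> conf) {
        String raw = conf.get(BYTE_STREAM_SPLIT_ENABLED);
        if (raw == null) {
            return DEFAULT_BYTE_STREAM_SPLIT_ENABLED;
        }
        return Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("unset: " + isByteStreamSplitEnabled(conf));

        conf.put(BYTE_STREAM_SPLIT_ENABLED, "true");
        System.out.println("set:   " + isByteStreamSplitEnabled(conf));
        // In a real patch, the resolved value would presumably be handed to
        // ParquetWriter#withByteStreamSplitEncoding when building the writer.
    }
}
```

In an actual change the lookup would presumably live next to the other keys in ParquetOutputFormat, but where it belongs and what it is called are exactly the points where input is wanted.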