Posted to issues@beam.apache.org by "Etienne Chauchot (Jira)" <ji...@apache.org> on 2021/10/18 11:53:00 UTC

[jira] [Commented] (BEAM-4383) Enable block size support in ParquetIO

    [ https://issues.apache.org/jira/browse/BEAM-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429971#comment-17429971 ] 

Etienne Chauchot commented on BEAM-4383:
----------------------------------------

[~ŁukaszG] Isn't this already implemented in ParquetIO? I see a Sink#withRowGroupSize() method.

> Enable block size support in ParquetIO
> --------------------------------------
>
>                 Key: BEAM-4383
>                 URL: https://issues.apache.org/jira/browse/BEAM-4383
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-ideas, io-java-parquet
>            Reporter: Lukasz Gajowy
>            Priority: P3
>
> The Parquet API supports configuring the block size, which can improve IO performance when working with Parquet files. Currently, ParquetIO does not support it at all, so this looks like room for improvement for this IO.
> A good introduction to this topic: [https://www.dremio.com/tuning-parquet/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)