Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/13 11:31:23 UTC

[GitHub] [iceberg] hililiwei opened a new pull request #3892: Doc: Add enumeration supported by `write.parquet.compression-codec`

hililiwei opened a new pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892


   List the values supported by `write.parquet.compression-codec` in the property description: `uncompressed, snappy, gzip, lzo, brotli, lz4, zstd`
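
As a rough illustration of how the values listed above might be checked before they are written as a table property, here is a minimal, self-contained sketch. It does not call Iceberg or Parquet; `resolve_codec`, `SUPPORTED_CODECS`, `PROPERTY`, and `DEFAULT_CODEC` are hypothetical names, the `gzip` default is taken from the configuration table this pull request edits, and the case-insensitive matching is an assumption of the sketch rather than documented behavior.

```python
# Codec names as listed in this pull request's description.
SUPPORTED_CODECS = ("uncompressed", "snappy", "gzip", "lzo", "brotli", "lz4", "zstd")
PROPERTY = "write.parquet.compression-codec"
DEFAULT_CODEC = "gzip"  # default shown in the configuration table being edited


def resolve_codec(table_properties):
    """Return the codec name to use, or raise ValueError for an unlisted value."""
    # Trimming and lower-casing the value is an assumption of this sketch.
    value = table_properties.get(PROPERTY, DEFAULT_CODEC).strip().lower()
    if value not in SUPPORTED_CODECS:
        expected = ", ".join(SUPPORTED_CODECS)
        raise ValueError(f"unsupported {PROPERTY}: {value!r}; expected one of {expected}")
    return value
```

Calling `resolve_codec({})` falls back to the default, while a value outside the list raises `ValueError`. Whether a listed codec is actually usable still depends on the runtime, for example on the extra codec libraries mentioned later in this thread for `zstd` and `brotli`.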


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3892: Docs: Add enumeration supported by `write.parquet.compression-codec`

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892#discussion_r786307688



##########
File path: site/docs/configuration.md
##########
@@ -38,7 +38,7 @@ Iceberg tables support table properties to configure table behavior, like the de
 | write.parquet.row-group-size-bytes | 134217728 (128 MB) | Parquet row group size                             |
 | write.parquet.page-size-bytes      | 1048576 (1 MB)     | Parquet page size                                  |
 | write.parquet.dict-size-bytes      | 2097152 (2 MB)     | Parquet dictionary page size                       |
-| write.parquet.compression-codec    | gzip               | Parquet compression codec                          |
+| write.parquet.compression-codec    | gzip               | Parquet compression codec; uncompressed, snappy, gzip, lzo, brotli, lz4, zstd. Note that `zstd` requires `ZStandardCodec` to be installed before Hadoop 2.9.0, `brotli` requires `BrotliCodec` to be installed.                        |

Review comment:
       Maybe this should be a link to Parquet docs instead?
   
   This doesn't fit in a table, so we will need to move it or shorten it. I think making it a link fixes both problems.






[GitHub] [iceberg] rdblue commented on pull request #3892: Doc: Add values supported by parquet\avro compression codec

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892#issuecomment-1016962847


   Thanks, @hililiwei! Good to have those listed.




[GitHub] [iceberg] rdblue commented on a change in pull request #3892: Docs: Add enumeration supported by `write.parquet.compression-codec`

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892#discussion_r786950249



##########
File path: site/docs/configuration.md
##########
@@ -38,7 +38,7 @@ Iceberg tables support table properties to configure table behavior, like the de
 | write.parquet.row-group-size-bytes | 134217728 (128 MB) | Parquet row group size                             |
 | write.parquet.page-size-bytes      | 1048576 (1 MB)     | Parquet page size                                  |
 | write.parquet.dict-size-bytes      | 2097152 (2 MB)     | Parquet dictionary page size                       |
-| write.parquet.compression-codec    | gzip               | Parquet compression codec                          |
+| write.parquet.compression-codec    | gzip               | Parquet compression codec; uncompressed, snappy, gzip, etc. For more options: [CompressionCodecName](https://github.com/apache/parquet-mr/blob/parquet-1.12.x/parquet-common/src/main/java/org/apache/parquet/hadoop/metadata/CompressionCodecName.java) |

Review comment:
       Rather than adding significantly more text to a table, let's make "compression codec" the link text. Also, this can simply list the other codec names and still be smaller than what is here. Including "etc" takes nearly as much space as "zstd" so let's just list these: zstd, brotli, lz4, gzip, snappy, uncompressed.
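
One possible reading of this suggestion, sketched as a small script that renders the table row with "compression codec" as the link text followed by the six codec names from the comment. The helper `codec_row` and the constant names are hypothetical, the URL is copied from the proposed row in the diff above, and the exact wording of the description cell is an assumption, not the text that was finally merged.

```python
# URL copied from the proposed table row in the diff above.
CODEC_DOCS_URL = (
    "https://github.com/apache/parquet-mr/blob/parquet-1.12.x/"
    "parquet-common/src/main/java/org/apache/parquet/hadoop/metadata/"
    "CompressionCodecName.java"
)
# Codec names in the order given in the review comment.
CODECS = ["zstd", "brotli", "lz4", "gzip", "snappy", "uncompressed"]


def codec_row(prop, default):
    """Build one markdown table row whose description links 'compression codec'."""
    description = f"Parquet [compression codec]({CODEC_DOCS_URL}): {', '.join(CODECS)}"
    return f"| {prop} | {default} | {description} |"


print(codec_row("write.parquet.compression-codec", "gzip"))
```

Keeping the URL inside the link target rather than in the visible text is what keeps the rendered cell short, which is the concern raised in the comment.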






[GitHub] [iceberg] rdblue merged pull request #3892: Doc: Add values supported by parquet\avro compression codec

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892


   




[GitHub] [iceberg] hililiwei commented on a change in pull request #3892: Docs: Add enumeration supported by `write.parquet.compression-codec`

Posted by GitBox <gi...@apache.org>.
hililiwei commented on a change in pull request #3892:
URL: https://github.com/apache/iceberg/pull/3892#discussion_r786348994



##########
File path: site/docs/configuration.md
##########
@@ -38,7 +38,7 @@ Iceberg tables support table properties to configure table behavior, like the de
 | write.parquet.row-group-size-bytes | 134217728 (128 MB) | Parquet row group size                             |
 | write.parquet.page-size-bytes      | 1048576 (1 MB)     | Parquet page size                                  |
 | write.parquet.dict-size-bytes      | 2097152 (2 MB)     | Parquet dictionary page size                       |
-| write.parquet.compression-codec    | gzip               | Parquet compression codec                          |
+| write.parquet.compression-codec    | gzip               | Parquet compression codec; uncompressed, snappy, gzip, lzo, brotli, lz4, zstd. Note that `zstd` requires `ZStandardCodec` to be installed before Hadoop 2.9.0, `brotli` requires `BrotliCodec` to be installed.                        |

Review comment:
       updated



