Posted to issues@flink.apache.org by "Small Wong (Jira)" <ji...@apache.org> on 2022/05/25 13:05:00 UTC
[jira] [Commented] (FLINK-27777) Can not get the parquet.compression when using native parquet&orc writer to sink hive
[ https://issues.apache.org/jira/browse/FLINK-27777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542040#comment-17542040 ]
Small Wong commented on FLINK-27777:
------------------------------------
[~ruili] Hi, could you help take a look at this?
> Can not get the parquet.compression when using native parquet&orc writer to sink hive
> -------------------------------------------------------------------------------------
>
> Key: FLINK-27777
> URL: https://issues.apache.org/jira/browse/FLINK-27777
> Project: Flink
> Issue Type: Bug
> Reporter: Small Wong
> Priority: Major
> Attachments: image-2022-05-25-20-53-20-412.png, image-2022-05-25-20-57-13-241.png
>
>
> After setting {*}table.exec.hive.fallback-mapred-writer=false{*} and sinking to Hive with the {*}native parquet&orc writer{*}, the *`parquet.compression`* setting cannot be obtained from `{*}formatConf{*}` in class `{*}HiveTableSink{*}`.
> There is no `{*}parquet.compression{*}` entry in `jobConf` or in `{*}sd.getSerdeInfo().getParameters(){*}`. `parquet.compression` exists only in the {*}Hive table properties{*}, as shown in the following table definition.
>
> {code:sql}
> CREATE TABLE `hive_table`(
>   `user_id` int,
>   `order_amount` double)
> PARTITIONED BY (
>   `dt` string,
>   `hr` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://xxxx'
> TBLPROPERTIES (
>   'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>   'sink.partition-commit.delay'='1 h',
>   'sink.partition-commit.policy.kind'='metastore,success-file',
>   'sink.partition-commit.trigger'='partition-time',
>   'transient_lastDdlTime'='1614740641',
>   'parquet.compression'='snappy')
> {code}
>
> !image-2022-05-25-20-53-20-412.png!
> !image-2022-05-25-20-57-13-241.png!
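
The lookup problem described above can be illustrated with a minimal, self-contained sketch. It does not use the real Flink or Hive classes: plain maps stand in for `jobConf`, the SerDe parameters, and the table properties, and the class and method names (`CompressionLookupSketch`, `lookupWithoutTableProps`, `lookupWithTableProps`) are hypothetical. The precedence chosen in the second method (table properties lowest) is an assumption for illustration, not necessarily how `HiveTableSink` should resolve it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class CompressionLookupSketch {

    static final String KEY = "parquet.compression";

    // Mirrors the situation in the report: the format configuration is built
    // only from jobConf and the SerDe parameters, so a key that lives solely
    // in the table properties is never seen.
    static Optional<String> lookupWithoutTableProps(
            Map<String, String> jobConf, Map<String, String> serdeParams) {
        Map<String, String> formatConf = new HashMap<>(jobConf);
        formatConf.putAll(serdeParams);
        return Optional.ofNullable(formatConf.get(KEY));
    }

    // Hypothetical alternative: also merge the table properties.
    // Assumed precedence (lowest to highest): table properties, jobConf, SerDe parameters.
    static Optional<String> lookupWithTableProps(
            Map<String, String> jobConf,
            Map<String, String> serdeParams,
            Map<String, String> tableProps) {
        Map<String, String> formatConf = new HashMap<>(tableProps);
        formatConf.putAll(jobConf);
        formatConf.putAll(serdeParams);
        return Optional.ofNullable(formatConf.get(KEY));
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();      // no parquet.compression here
        Map<String, String> serdeParams = new HashMap<>();  // nor here
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put(KEY, "snappy");                       // only set in TBLPROPERTIES

        System.out.println(lookupWithoutTableProps(jobConf, serdeParams));
        System.out.println(lookupWithTableProps(jobConf, serdeParams, tableProps));
    }
}
```

With the table definition above, the first lookup finds nothing while the second returns `snappy`, which is the gap this issue describes.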
--
This message was sent by Atlassian Jira
(v8.20.7#820007)