Posted to issues@flink.apache.org by "Small Wong (Jira)" <ji...@apache.org> on 2022/05/25 13:05:00 UTC

[jira] [Commented] (FLINK-27777) Can not get the parquet.compression when using native parquet&orc writer to sink hive

    [ https://issues.apache.org/jira/browse/FLINK-27777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542040#comment-17542040 ] 

Small Wong commented on FLINK-27777:
------------------------------------

[~ruili] Hi, could you help take a look at this?

> Can not get the parquet.compression when using native parquet&orc writer to sink hive
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-27777
>                 URL: https://issues.apache.org/jira/browse/FLINK-27777
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Small Wong
>            Priority: Major
>         Attachments: image-2022-05-25-20-53-20-412.png, image-2022-05-25-20-57-13-241.png
>
>
> After setting {*}table.exec.hive.fallback-mapred-writer=false{*} and writing to Hive with the {*}native parquet&orc writer{*}, the *`parquet.compression`* setting cannot be read from `{*}formatConf{*}` in class `{*}HiveTableSink{*}`.
> Neither `jobConf` nor `{*}sd.getSerdeInfo().getParameters(){*}` contains the `{*}parquet.compression{*}` key. It exists only in the {*}Hive table properties{*}, as in the following table definition.
>  
> {code:sql}
> CREATE TABLE `hive_table`(
>   `user_id` int,
>   `order_amount` double)
> PARTITIONED BY (
>   `dt` string,
>   `hr` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://xxxx'
> TBLPROPERTIES (
>   'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
>   'sink.partition-commit.delay'='1 h',
>   'sink.partition-commit.policy.kind'='metastore,success-file',
>   'sink.partition-commit.trigger'='partition-time',
>   'transient_lastDdlTime'='1614740641',
>   'parquet.compression'='snappy') {code}
>  
> !image-2022-05-25-20-53-20-412.png!
> !image-2022-05-25-20-57-13-241.png!
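
To make the reported lookup gap concrete, here is a minimal sketch in plain Java. It is not the actual `HiveTableSink` code: the class `FormatConfSketch`, the method `buildFormatConf`, the key-prefix filter, and the override order are all hypothetical, and plain maps stand in for `jobConf`, the SerDe parameters, and the Hive table properties. It only illustrates that a format configuration assembled from the first two sources alone never sees `parquet.compression`, whereas one that also layers in the table properties does.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: plain maps stand in for jobConf, SerDe parameters and table properties. */
public class FormatConfSketch {

    /**
     * Builds a format configuration by layering three sources.
     * Later sources override earlier ones; this precedence is an assumption of the sketch.
     */
    public static Map<String, String> buildFormatConf(
            Map<String, String> jobConf,
            Map<String, String> serdeParameters,
            Map<String, String> tableProperties) {
        Map<String, String> formatConf = new HashMap<>(jobConf);
        formatConf.putAll(serdeParameters);
        // Copy only format-related keys from the table properties (hypothetical filter),
        // so that sink.* and other unrelated properties are not passed to the writer.
        for (Map.Entry<String, String> entry : tableProperties.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith("parquet.") || key.startsWith("orc.")) {
                formatConf.put(key, entry.getValue());
            }
        }
        return formatConf;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        Map<String, String> serdeParameters = new HashMap<>();
        serdeParameters.put("serialization.format", "1"); // illustrative value

        // Two of the table properties from the DDL in the issue description.
        Map<String, String> tableProperties = new HashMap<>();
        tableProperties.put("sink.partition-commit.delay", "1 h");
        tableProperties.put("parquet.compression", "snappy");

        // Without the table properties the key is absent, matching the reported behavior.
        Map<String, String> withoutProps =
                buildFormatConf(jobConf, serdeParameters, new HashMap<>());
        System.out.println("without table properties: "
                + withoutProps.containsKey("parquet.compression"));

        // With the table properties layered in, the key becomes visible.
        Map<String, String> withProps =
                buildFormatConf(jobConf, serdeParameters, tableProperties);
        System.out.println("with table properties: "
                + withProps.get("parquet.compression"));
    }
}
```

Whether table properties should override SerDe parameters, and which key prefixes should be forwarded, are design decisions this sketch does not settle.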



--
This message was sent by Atlassian Jira
(v8.20.7#820007)