Posted to commits@hudi.apache.org by "Istvan Darvas (Jira)" <ji...@apache.org> on 2022/05/13 18:43:00 UTC

[jira] [Updated] (HUDI-4091) TIMESTAMP_MICROS handling

     [ https://issues.apache.org/jira/browse/HUDI-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Istvan Darvas updated HUDI-4091:
--------------------------------
    Summary: TIMESTAMP_MICROS handling  (was: Timestamp micro handling)

> TIMESTAMP_MICROS handling
> -------------------------
>
>                 Key: HUDI-4091
>                 URL: https://issues.apache.org/jira/browse/HUDI-4091
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.10.1
>         Environment: AWS EMR
>            Reporter: Istvan Darvas
>            Priority: Critical
>         Attachments: b97b9e55-58a4-417b-b71c-f6b2d3860da0-0_0-26-1663_20220512111505310.parquet, before-save.png, example-code.txt
>
>
> Hi Guys!
>  
> I am not able to save timestamp columns with microsecond precision using Hudi.
> I would like to keep microsecond granularity, but only millisecond granularity is preserved.
>  
> I have set this:
> --conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS \
> and also this in the Hudi options:
> "hoodie.parquet.outputtimestamptype": "TIMESTAMP_MICROS",
> but when I read it back (with PySpark, load API), it has only millisecond precision. Unfortunately, I need microseconds in some cases, because without them I run into a Schrödinger's cat situation:
> an entity has more than one state at the same time. Can someone enlighten me on what I should do?
>  
> Before saving, everything is fine ("ts" column).
> Darvi
> Slack thread: [https://apache-hudi.slack.com/archives/C4D716NPQ/p1652347742173779]
>  
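
The precision loss described in the report can be illustrated without Spark or Hudi. The following is a minimal sketch in plain Python, not taken from the attached example-code.txt. It assumes two hypothetical states of the same entity whose "ts" values differ only below the millisecond, and shows that truncating to millisecond precision makes them indistinguishable. The helper name truncate_to_millis and the sample values are invented for illustration.

```python
from datetime import datetime


def truncate_to_millis(ts: datetime) -> datetime:
    """Drop the sub-millisecond part of a timestamp (hypothetical helper)."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# Two hypothetical states of the same entity, 300 microseconds apart.
first_state = datetime(2022, 5, 12, 11, 15, 5, 310_100)
second_state = datetime(2022, 5, 12, 11, 15, 5, 310_400)

# At microsecond precision the two states can still be ordered.
print(first_state < second_state)

# After truncation to milliseconds both states carry the same timestamp,
# so ordering by "ts" can no longer tell which state is newer.
print(truncate_to_millis(first_state) == truncate_to_millis(second_state))
```

If the values read back from the table behave like the truncated ones here, any logic that orders or deduplicates records by "ts" may treat distinct states as simultaneous, which matches the situation the reporter describes.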



--
This message was sent by Atlassian Jira
(v8.20.7#820007)