Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/06 10:23:18 UTC

[GitHub] [hudi] s-sanjay commented on issue #1895: HUDI Dataset backed by Hive Metastore fails on Presto with Unknown converted type TIMESTAMP_MICROS

s-sanjay commented on issue #1895:
URL: https://github.com/apache/hudi/issues/1895#issuecomment-669845428


   Right now Presto does not support reading the TIMESTAMP_MICROS type. This needs to be fixed on the Presto side, and I am working on a fix. (Presto only supports timestamps up to millisecond granularity, so the fix will simply convert microseconds to milliseconds.) I think `spark.sql.parquet.outputTimestampType` is not working because Hudi uses Spark's [SchemaConverters](https://github.com/apache/spark/blob/master/external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L150), which does not look at this option at all. This is likely because that property controls the Parquet type that Spark writes directly, whereas Hudi uses the Avro format to store the file's schema within Parquet.
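
   For context, a minimal sketch (not from this issue) of how that session option is normally set; per the above, it only affects the Parquet timestamp type that Spark writes itself and is bypassed by Hudi's Avro-based schema conversion:

   ```scala
   // Assumed setup: an existing SparkSession named `spark`.
   // This setting controls the timestamp type for Spark's own Parquet writes,
   // but Hudi derives its schema through Avro SchemaConverters, so it is
   // ignored for Hudi-written files.
   spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
   ```
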
   It would be very difficult to change this from the Hudi or Spark side. Right now the easiest option is to use the double type, as mentioned above, until the fix is merged into Presto. I will share the PR link here in a couple of days (I need to refactor it, since my Presto version is a custom internal build).
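
   A minimal sketch of the double-type workaround, assuming a DataFrame `df` with a timestamp column `ts`; the record key field, table name, and path below are hypothetical, not from this issue:

   ```scala
   import org.apache.spark.sql.functions.col

   // Cast the timestamp to a double (epoch seconds with a fractional part)
   // so Presto can read the column until TIMESTAMP_MICROS support lands.
   val withDoubleTs = df
     .withColumn("ts_double", col("ts").cast("double"))
     .drop("ts")

   withDoubleTs.write
     .format("hudi")
     .option("hoodie.table.name", "my_table")
     .option("hoodie.datasource.write.recordkey.field", "uuid")
     .option("hoodie.datasource.write.precombine.field", "ts_double")
     .mode("append")
     .save("/tmp/hudi/my_table")
   ```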


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org