Posted to issues@hive.apache.org by "Ganesha Shreedhara (Jira)" <ji...@apache.org> on 2019/10/21 07:16:00 UTC

[jira] [Commented] (HIVE-15079) Hive cannot read Parquet string timetamps as TIMESTAMP data type

    [ https://issues.apache.org/jira/browse/HIVE-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955820#comment-16955820 ] 

Ganesha Shreedhara commented on HIVE-15079:
-------------------------------------------

I have another instance similar to this, where the data is stored as long and the table schema declares the column type as timestamp.

This works with ORC but throws a ClassCastException when Parquet is used. Long data can be converted to a timestamp (e.g., new Timestamp(longValue)).
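
As a rough illustration of the conversion I mean, here is a minimal sketch. The class and method names are hypothetical and not part of Hive or Parquet. It only assumes java.sql.Timestamp, whose constructor takes milliseconds since the epoch, whereas the Hive wiki quoted below says integer types are interpreted as UNIX seconds, so the unit has to be decided explicitly:

```java
import java.sql.Timestamp;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; this is not Hive's reader code.
class LongToTimestampSketch {

    // java.sql.Timestamp(long) expects milliseconds since the Unix epoch.
    static Timestamp fromEpochMillis(long millis) {
        return new Timestamp(millis);
    }

    // Assumption: the stored long holds seconds since the epoch, matching the
    // "UNIX timestamp in seconds" rule in the Hive wiki for integer types.
    static Timestamp fromEpochSeconds(long seconds) {
        return new Timestamp(TimeUnit.SECONDS.toMillis(seconds));
    }

    public static void main(String[] args) {
        long seconds = 1468766538L; // arbitrary example value

        // Both calls describe the same instant, so both print 1468766538000.
        System.out.println(fromEpochSeconds(seconds).getTime());
        System.out.println(fromEpochMillis(seconds * 1000L).getTime());
    }
}
```

Whether the reader should treat the long as seconds or milliseconds is exactly the kind of decision an automatic conversion would have to make.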

Are there any plans to support automatic type conversion for Parquet files in Hive?

> Hive cannot read Parquet string timetamps as TIMESTAMP data type
> ----------------------------------------------------------------
>
>                 Key: HIVE-15079
>                 URL: https://issues.apache.org/jira/browse/HIVE-15079
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sergio Peña
>            Priority: Major
>
> The Hive Wiki for timestamps specifies that string timestamps can be read by Hive.
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps
> {noformat}
> Supported conversions:
> Integer numeric types: Interpreted as UNIX timestamp in seconds
> Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision
> Strings: JDBC compliant java.sql.Timestamp format "YYYY-MM-DD HH:MM:SS.fffffffff" (9 decimal place precision)
> {noformat}
> This works fine with Text table formats, but when Parquet is used, it throws the following exception:
> {noformat}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
> {noformat}
> How to reproduce
> {noformat}
> > create table t1 (id int, time string) stored as parquet;
> > insert into table t1 values (1,'2016-07-17 14:42:18');
> > alter table t1 replace columns (id int, time timestamp);
> > select * from t1;
> {noformat}
> The above example will run fine if you use a TEXT format instead of PARQUET.
> This issue was raised on PARQUET-723



--
This message was sent by Atlassian Jira
(v8.3.4#803005)