Posted to issues@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2016/04/18 18:56:25 UTC
[jira] [Updated] (HIVE-13534) ClassCastException when trying to access TIMESTAMP columns into parquet file using hive external table
[ https://issues.apache.org/jira/browse/HIVE-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-13534:
-------------------------------
Summary: ClassCastException when trying to access TIMESTAMP columns into parquet file using hive external table (was: Exception when trying to access TIMESTAMP columns into parquet file using hive external table )
> ClassCastException when trying to access TIMESTAMP columns into parquet file using hive external table
> -------------------------------------------------------------------------------------------------------
>
> Key: HIVE-13534
> URL: https://issues.apache.org/jira/browse/HIVE-13534
> Project: Hive
> Issue Type: Bug
> Components: File Formats, Hive, Import/Export, JDBC
> Reporter: Bobeff
> Assignee: Sushanth Sowmyan
> Priority: Critical
>
> The data was imported from a Netezza data source using a Sqoop import job (shown further below).
> The SQL DDL of the source table is the following:
> CREATE TABLE "ADMIN"."MIS_AUX_ITR" (
> "DDEBVAL" DATE,
> "DFINVAL" DATE,
> "NAUX" VARCHAR(6),
> "CDMNITR" VARCHAR(3),
> "CDERIMG" VARCHAR(1),
> "DDERIMG" DATE
> );
> The Sqoop import job is the following:
> sqoop job \
> --create import-name \
> -- import \
> --connect jdbc:netezza://server:port/database \
> --username user \
> --password pwd \
> --table MIS_AUX_ITR \
> --as-parquetfile \
> --target-dir hdfs:///prod/ZA/dee/MIS_AUX_ITR \
> -m 1
> After the import, the Parquet file schema is the following:
> > yarn jar /tmp/parquet-tools-1.6.0.jar schema /prod/ZA/dee/MIS_AUX_ITR/2cf3e971-4c2c-408f-bd86-5d3cf3bd4fa5.parquet
> message MIS_AUX_ITR {
> optional int64 DDEBVAL;
> optional int64 DFINVAL;
> optional binary NAUX (UTF8);
> optional binary CDMNITR (UTF8);
> optional binary CDERIMG (UTF8);
> optional int64 DDERIMG;
> }
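> The schema shows that the DATE columns were written as plain int64 values with no date/timestamp annotation. As a quick sanity check, the first records can be dumped to see the raw longs; this is only a sketch, assuming the same parquet-tools 1.6.0 jar and file as above:
> yarn jar /tmp/parquet-tools-1.6.0.jar head /prod/ZA/dee/MIS_AUX_ITR/2cf3e971-4c2c-408f-bd86-5d3cf3bd4fa5.parquet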
> To access the data stored in the Parquet file, we created the external table below:
> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR
> (
> `DDEBVAL` DATE,
> `DFINVAL` DATE,
> `NAUX` VARCHAR(6),
> `CDMNITR` VARCHAR(3),
> `CDERIMG` VARCHAR(1),
> `DDERIMG` DATE
> )
> COMMENT 'Table DEE MIS_AUX_ITR'
> STORED AS PARQUET
> LOCATION
> '/prod/ZA/dee/MIS_AUX_ITR';
> But when we try to list data from the external table above, we get the following exception:
> hive> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD
> > (
> > `DDEBVAL`DATE,
> > `DFINVAL`DATE,
> > `NAUX`VARCHAR(6),
> > `CDMNITR`VARCHAR(3),
> > `CDERIMG`VARCHAR(1),
> > `DDERIMG`DATE
> > )
> > COMMENT 'Table DEE MIS_AUX_ITR_V_PROD'
> > STORED AS PARQUET
> > LOCATION
> > '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD';
> OK
> Time taken: 0.196 seconds
> hive> select * from za_dee.MIS_AUX_ITR_V_PPROD limit 100;
> OK
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable
> Time taken: 0.529 seconds
> hive>
> We also tried the following external table, declaring the date columns as BIGINT:
> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_BI
> (
> `DDEBVAL` BIGINT,
> `DFINVAL` BIGINT,
> `NAUX` VARCHAR(6),
> `CDMNITR` VARCHAR(3),
> `CDERIMG` VARCHAR(1),
> `DDERIMG` BIGINT
> )
> COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_BI'
> STORED AS PARQUET
> LOCATION '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD';
> With this table the date columns are returned as raw BIGINT values (epoch milliseconds), as shown below:
> hive> select DDEBVAL from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5;
> OK
> 1080770400000
> 1080770400000
> 1080770400000
> 1080770400000
> 1080770400000
> Time taken: 0.081 seconds, Fetched: 5 row(s)
> hive>
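> These raw longs look like epoch milliseconds. A quick check, as a sketch using only standard Hive built-ins (from_unixtime expects seconds, hence the integer division):
> select from_unixtime(1080770400000 DIV 1000);
> This should print the same 2004-04-01 date as the cast shown next, in the cluster's local timezone.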
> However, the date values can be recovered by casting to TIMESTAMP (here the BIGINT is interpreted as milliseconds since the epoch):
> hive> select cast(DDEBVAL as Timestamp) from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5;
> OK
> 2004-04-01 00:00:00
> 2004-04-01 00:00:00
> 2004-04-01 00:00:00
> 2004-04-01 00:00:00
> 2004-04-01 00:00:00
> Time taken: 0.087 seconds, Fetched: 5 row(s)
> hive>
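> Based on this, a possible workaround is to keep the BIGINT external table and expose the original date semantics through a view. This is only a sketch (the view name MIS_AUX_ITR_V_PPROD_DATES is made up; to_date and cast are standard Hive functions, and the BIGINT-to-TIMESTAMP cast relies on the millisecond interpretation observed above):
> CREATE VIEW za_dee.MIS_AUX_ITR_V_PPROD_DATES AS
> SELECT
>   to_date(cast(DDEBVAL as timestamp)) AS DDEBVAL,
>   to_date(cast(DFINVAL as timestamp)) AS DFINVAL,
>   NAUX,
>   CDMNITR,
>   CDERIMG,
>   to_date(cast(DDERIMG as timestamp)) AS DDERIMG
> FROM za_dee.MIS_AUX_ITR_V_PPROD_BI;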
> We also tested an external table using the TIMESTAMP type, as shown below:
> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR
> (
> `DDEBVAL` TIMESTAMP,
> `DFINVAL` TIMESTAMP,
> `NAUX` VARCHAR(6),
> `CDMNITR` VARCHAR(3),
> `CDERIMG` VARCHAR(1),
> `DDERIMG` TIMESTAMP
> )
> COMMENT 'Table DEE MIS_AUX_ITR'
> STORED AS PARQUET
> LOCATION
> '/prod/ZA/dee/MIS_AUX_ITR';
> But we got the same behavior: an exception when trying to access the data (the same also occurs with data imported from an Oracle DB).
> I tried this
> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_TS
> (
> `DDEBVAL` TIMESTAMP,
> `DFINVAL` TIMESTAMP,
> `NAUX` VARCHAR(6),
> `CDMNITR` VARCHAR(3),
> `CDERIMG` VARCHAR(1),
> `DDERIMG` TIMESTAMP
> )
> COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_TS'
> STORED AS PARQUET
> LOCATION
> '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD';
> and then I created and launched the Sqoop job below:
> sqoop job --create import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD -- import \
> --connect jdbc:netezza://<server>:<port>/db \
> --username <user> \
> --password <password> \
> --table MIS_AUX_ITR_V \
> --as-parquetfile \
> --hive-import \
> --hive-overwrite \
> --hive-database za_dee \
> --hive-table MIS_AUX_ITR_V_PPROD_TS \
> -m 1
> sqoop job --exec import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD
> The resulting error is the following:
> 16/04/11 17:15:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-sqoop/compile/3533e18a81a65fb8eb88ec9cef2f4688/codegen_MIS_AUX_ITR_V.jar
> 16/04/11 17:15:09 WARN manager.NetezzaManager: It looks like you are importing from Netezza.
> 16/04/11 17:15:09 WARN manager.NetezzaManager: This transfer can be faster! Use the --direct
> 16/04/11 17:15:09 WARN manager.NetezzaManager: option to exercise a Netezza-specific fast path.
> 16/04/11 17:15:17 INFO mapreduce.ImportJobBase: Beginning import of MIS_AUX_ITR_V
> 16/04/11 17:15:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0
> 16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0
> 16/04/11 17:15:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM "MIS_AUX_ITR_V" AS t WHERE 1=0
> 16/04/11 17:15:19 INFO hive.metastore: Trying to connect to metastore with URI thrift://slhdm007.maif.local:9083
> 16/04/11 17:15:19 INFO hive.metastore: Connected to metastore.
> 16/04/11 17:15:19 ERROR tool.ImportTool: Imported Failed: Cannot convert unsupported type: timestamp
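> For the --hive-import path, a possible workaround is to override the Java type Sqoop uses for the date columns so that it does not have to map them to an unsupported timestamp type. This is only a sketch and is not verified here: --map-column-java is a standard Sqoop option, but whether it is honored together with --as-parquetfile depends on the Sqoop version, and the Hive columns would then need to be declared as STRING instead of TIMESTAMP:
> sqoop job --create import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD -- import \
> --connect jdbc:netezza://<server>:<port>/db \
> --username <user> \
> --password <password> \
> --table MIS_AUX_ITR_V \
> --as-parquetfile \
> --map-column-java DDEBVAL=String,DFINVAL=String,DDERIMG=String \
> --hive-import \
> --hive-overwrite \
> --hive-database za_dee \
> --hive-table MIS_AUX_ITR_V_PPROD_TS \
> -m 1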