Posted to dev@hive.apache.org by Ujjwal Wadhawan <uj...@gmail.com> on 2015/06/01 18:01:44 UTC
Is the HCatRecord iterator ephemeral? (A usage scenario on Timestamp)
Hi,
I wanted to check on a behavior that is reproducible with the timestamp
data type. It can be summarized as:
“When reading from a stored HCatRecord iterator, the *timestamp* column
value of the previous row gets reset to 1970-01-01 00:00:00.0 (or the
locale-adjusted epoch time 0) when the same column in the current row
is *null*.
Columns of other data types in the previous row are not affected by a
*null* in the corresponding column of the current row.”
Please see the forwarded mail below for details and steps to reproduce.
Regards,
Ujjwal
---------- Forwarded message ----------
From: Ujjwal <uj...@gmail.com>
Date: Fri, May 29, 2015 at 2:20 PM
Subject: Re: only timestamp column value of previous row gets reset
To: user@hive.apache.org
Hi all,
The issue can be reproduced in a simple Java program (code attached for
reference/use) where I do not consume the iterator right away after
reading, but store the records it returns in a vector for later use. As per
my understanding, a record should not change once it has been given to the
consumer. However, the timestamp object gets reset under the one condition
explained earlier.
Create a table
---------------------
create table if not exists sample (dtcol date, tscol timestamp, stcol
string) row format delimited fields terminated by ',' stored as textfile;
truncate table sample;
Input data (input)
------------------------
9779-11-21,2014-04-01 11:30:55,abc
9779-11-21,2014-04-04 11:30:55,def
,null,
Load the data
-------------------
hadoop fs -put input /apps/hive/warehouse/sample
Check
---------
hive> select * from sample;
OK
9779-11-21 2014-04-01 11:30:55 abc
9779-11-21 2014-04-04 11:30:55 def
NULL NULL
Time taken: 0.029 seconds, Fetched: 3 row(s)
hive>
Execute
------------
export CLASSPATH=`hadoop classpath`:`hcat -classpath`
java -classpath SampleHCatReader.jar:$CLASSPATH
org.my.internal.SampleHCatReader
Output showing the timestamp reset
------------------------------------------------
HCat record right after reading is 9779-11-21 2014-04-01 11:30:55.0 abc
HCat record right after reading is 9779-11-21 2014-04-04 11:30:55.0 def
HCat record right after reading is null null
HCat record later is 9779-11-21 2014-04-01 11:30:55.0 abc
HCat record later is 9779-11-21 1970-01-01 00:00:00.0 def
HCat record later is null null
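As we see above, the timestamp of the second row is intact right after
reading but is reset to 1970-01-01 00:00:00.0 when the stored record is
printed later, once the row with the null timestamp has been read.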
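One pattern that would produce exactly this output is a reader that hands out a reference to a mutable java.sql.Timestamp and later resets that same object in place when it meets a null. This is only a guess at the mechanism and is not confirmed anywhere in this mail. The sketch below is a standalone model of that pattern, not HCatalog code: the Holder class and the readAll method are hypothetical names invented for illustration. It also shows that storing a defensive copy of each value keeps the stored rows stable.

```java
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimestampReuseSketch {

    // Hypothetical stand-in for a reader-side value holder; NOT a Hive or HCatalog class.
    static final class Holder {
        private Timestamp current = new Timestamp(0L);

        // Assumed behaviour: a non-null value replaces the held object,
        // while a null value resets the currently held object in place.
        Timestamp set(String raw) {
            if (raw == null) {
                current.setTime(0L); // mutates the object handed out for the previous row
                return null;
            }
            current = Timestamp.valueOf(raw);
            return current;
        }
    }

    // Reads every value and stores either the handed-out reference or a defensive copy.
    static List<Timestamp> readAll(List<String> raws, boolean copy) {
        Holder holder = new Holder();
        List<Timestamp> stored = new ArrayList<>();
        for (String raw : raws) {
            Timestamp ts = holder.set(raw);
            stored.add(ts == null || !copy ? ts : (Timestamp) ts.clone());
        }
        return stored;
    }

    public static void main(String[] args) {
        // Same tscol values as the three input rows above.
        List<String> raws = Arrays.asList("2014-04-01 11:30:55", "2014-04-04 11:30:55", null);
        System.out.println("stored references: " + readAll(raws, false));
        System.out.println("stored copies:     " + readAll(raws, true));
    }
}
```

With stored references, the second entry ends up at epoch time 0 while the first is untouched, matching the output above. With stored copies, both non-null values survive. If the real cause is similar, copying the timestamp value out of each HCatRecord before storing it may be a workaround on the consumer side.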
Regards,
Ujjwal W