Posted to dev@hive.apache.org by "Matt McCline (JIRA)" <ji...@apache.org> on 2014/10/04 01:14:34 UTC
[jira] [Resolved] (HIVE-8197) Tez and Vectorization Insert into ORC
Table with timestamp column erroneously repeats the last row's column value
[ https://issues.apache.org/jira/browse/HIVE-8197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline resolved HIVE-8197.
--------------------------------
Resolution: Cannot Reproduce
> Tez and Vectorization Insert into ORC Table with timestamp column erroneously repeats the last row's column value
> -----------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-8197
> URL: https://issues.apache.org/jira/browse/HIVE-8197
> Project: Hive
> Issue Type: Bug
> Environment: Tez and Vectorization.
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
>
> While diagnosing why only(?) a Tez and Vectorized query with min and max aggregates was always returning the column value of the last row read, I discovered that the problem was in creating the test table:
> {code}
> CREATE TABLE alltypesorc_string STORED AS ORC AS SELECT
> ctinyint as ctinyint,
> to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') as ctimestamp1,
> CAST(to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') AS STRING) as stimestamp1
> FROM alltypesorc WHERE ctinyint > 0
> LIMIT 40;
> {code}
> I think it is related to what Prasanth mentioned as a possibility: saving a Timestamp as a Writable object that later gets overwritten. One suspect is the Writable[] records array in VectorFileSinkOperator, in the ProcessOp method. Or perhaps it is in VectorReduceSinkOperator.
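
The suspected failure mode (buffering a reference to a reused, mutable Writable instead of a copy) can be shown in isolation. The following is only a minimal sketch of that general pattern, not Hive's code: MutableTimestamp, bufferByReference and bufferByCopy are hypothetical names invented for the illustration, and the issue itself was resolved as Cannot Reproduce, so this does not claim to show the actual cause.

```java
import java.util.ArrayList;
import java.util.List;

public class WritableReuseSketch {
    // Hypothetical stand-in for a mutable, reusable Writable holding a timestamp.
    static final class MutableTimestamp {
        long millis;

        void set(long millis) {
            this.millis = millis;
        }

        MutableTimestamp copy() {
            MutableTimestamp c = new MutableTimestamp();
            c.set(millis);
            return c;
        }
    }

    // Buggy pattern: every buffered slot points at the same reused object,
    // so each later set() call overwrites what earlier slots appear to hold.
    static List<Long> bufferByReference(long[] input) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffered = new ArrayList<>();
        for (long value : input) {
            reused.set(value);
            buffered.add(reused);
        }
        return values(buffered);
    }

    // Safe pattern: snapshot the reused object before buffering it.
    static List<Long> bufferByCopy(long[] input) {
        MutableTimestamp reused = new MutableTimestamp();
        List<MutableTimestamp> buffered = new ArrayList<>();
        for (long value : input) {
            reused.set(value);
            buffered.add(reused.copy());
        }
        return values(buffered);
    }

    // Reads the buffered rows back, as a writer would when flushing.
    static List<Long> values(List<MutableTimestamp> rows) {
        List<Long> out = new ArrayList<>();
        for (MutableTimestamp row : rows) {
            out.add(row.millis);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] input = {10L, 20L, 30L};
        System.out.println(bufferByReference(input)); // prints [30, 30, 30]
        System.out.println(bufferByCopy(input));      // prints [10, 20, 30]
    }
}
```

With the by-reference variant, every buffered row reports the value of the last row read, which matches the symptom described above. Copying before buffering is the usual remedy when an upstream reader reuses its output objects.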