Posted to dev@hive.apache.org by "Eric Hanson (JIRA)" <ji...@apache.org> on 2014/03/03 18:25:20 UTC
[jira] [Commented] (HIVE-6511) casting from decimal to
tinyint,smallint, int and bigint generates different result when
vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918290#comment-13918290 ]
Eric Hanson commented on HIVE-6511:
-----------------------------------
Can you put this up on ReviewBoard?
> casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
> ------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-6511
> URL: https://issues.apache.org/jira/browse/HIVE-6511
> Project: Hive
> Issue Type: Bug
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: HIVE-6511.1.patch
>
>
> select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from vectortab10korc limit 20 generates the following result when vectorization is enabled:
> {code}
> 4619756289662.078125 -1628520834 -16770 126
> 1553532646710.316406 -1245514442 -2762 54
> 3367942487288.360352 688127224 -776 -8
> 4386447830839.337891 1286221623 12087 55
> -3234165331139.458008 -54957251 27453 61
> -488378613475.326172 1247658269 -16099 29
> -493942492598.691406 -21253559 -19895 73
> 3101852523586.039062 886135874 23618 66
> 2544105595941.381836 1484956709 -23515 37
> -3997512403067.0625 1102149509 30597 -123
> -1183754978977.589355 1655994718 31070 94
> 1408783849655.676758 34576568 -26440 -72
> -2993175106993.426758 417098319 27215 79
> 3004723551798.100586 -1753555402 -8650 54
> 1103792083527.786133 -14511544 -28088 72
> 469767055288.485352 1615620024 26552 -72
> -1263700791098.294434 -980406074 12486 -58
> -4244889766496.484375 -1462078048 30112 -96
> -3962729491139.782715 1525323068 -27332 60
> NULL NULL NULL NULL
> {code}
> When vectorization is disabled, the result looks like this:
> {code}
> 4619756289662.078125 -1628520834 -16770 126
> 1553532646710.316406 -1245514442 -2762 54
> 3367942487288.360352 688127224 -776 -8
> 4386447830839.337891 1286221623 12087 55
> -3234165331139.458008 -54957251 27453 61
> -488378613475.326172 1247658269 -16099 29
> -493942492598.691406 -21253558 -19894 74
> 3101852523586.039062 886135874 23618 66
> 2544105595941.381836 1484956709 -23515 37
> -3997512403067.0625 1102149509 30597 -123
> -1183754978977.589355 1655994719 31071 95
> 1408783849655.676758 34576567 -26441 -73
> -2993175106993.426758 417098319 27215 79
> 3004723551798.100586 -1753555402 -8650 54
> 1103792083527.786133 -14511545 -28089 71
> 469767055288.485352 1615620024 26552 -72
> -1263700791098.294434 -980406074 12486 -58
> -4244889766496.484375 -1462078048 30112 -96
> -3962729491139.782715 1525323069 -27331 61
> NULL NULL NULL NULL
> {code}
> This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, 15, and 19 generate different results.
> vectortab10korc table schema:
> {code}
> t tinyint from deserializer
> si smallint from deserializer
> i int from deserializer
> b bigint from deserializer
> f float from deserializer
> d double from deserializer
> dc decimal(38,18) from deserializer
> bo boolean from deserializer
> s string from deserializer
> s2 string from deserializer
> ts timestamp from deserializer
>
> # Detailed Table Information
> Database: default
> Owner: xyz
> CreateTime: Tue Feb 25 21:54:28 UTC 2014
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Retention: 0
> Location: hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE true
> numFiles 1
> numRows 10000
> rawDataSize 0
> totalSize 344748
> transient_lastDdlTime 1393365281
>
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed: No
> Num Buckets: -1
> Bucket Columns: []
> Sort Columns: []
> Storage Desc Params:
> serialization.format 1
> Time taken: 0.196 seconds, Fetched: 41 row(s)
> {code}
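A note on the quoted outputs: the rows that differ between the two listings are those whose fractional part is at least 0.5, and in each of them the vectorized values are off by one in the direction of rounding to the nearest integer. That pattern is consistent with one path truncating the fraction and the other rounding it before narrowing. The issue does not state the root cause, so the following is only a minimal sketch of that hypothesis in plain Java. The class and method names are hypothetical and this is not Hive code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only: compares "truncate then narrow" with
// "round half-up then narrow" on decimal values taken from the issue.
public class DecimalCastSketch {

    // Drop the fractional part (round toward zero), then keep the low-order 32 bits.
    static int truncateToInt(BigDecimal d) {
        return d.setScale(0, RoundingMode.DOWN).toBigInteger().intValue();
    }

    // Round half-up to the nearest integer first, then keep the low-order 32 bits.
    static int roundToInt(BigDecimal d) {
        return d.setScale(0, RoundingMode.HALF_UP).toBigInteger().intValue();
    }

    public static void main(String[] args) {
        // Two values with a fraction >= 0.5 and one with a fraction < 0.5.
        String[] samples = {
            "-493942492598.691406",
            "1408783849655.676758",
            "4619756289662.078125"
        };
        for (String s : samples) {
            BigDecimal d = new BigDecimal(s);
            int t = truncateToInt(d);
            int r = roundToInt(d);
            // Narrowing casts keep the low-order 16 and 8 bits respectively.
            System.out.println(s
                + " truncate: " + t + " " + (short) t + " " + (byte) t
                + " | round: " + r + " " + (short) r + " " + (byte) r);
        }
    }
}
```

For the sample values, the truncating variant reproduces the int, smallint, and tinyint columns of the non-vectorized output, and the rounding variant reproduces the vectorized output. The two variants agree when the fraction is below 0.5, which matches the rows that are identical in both listings.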