Posted to issues@hive.apache.org by "Mustafa Iman (Jira)" <ji...@apache.org> on 2019/10/03 23:15:00 UTC
[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision
[ https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mustafa Iman updated HIVE-14302:
--------------------------------
Attachment: HIVE-14302.patch
Status: Patch Available (was: Open)
> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> -------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
> Issue Type: Improvement
> Components: Tez
> Affects Versions: 2.2.0
> Reporter: Gopal Vijayaraghavan
> Assignee: Mustafa Iman
> Priority: Major
> Attachments: HIVE-14302.patch
>
>
> Decimal support in the optimized hashtable was decided on the basis of the fact that Decimal(10,1) == Decimal(10,2) when the two sides contain "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
> TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
> predicate: (a is not null and true) (type: boolean)
> Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
> key expressions: _col0 (type: decimal(11,2))
> sort order: +
> Map-reduce partition columns: _col0 (type: decimal(11,2))
> Join Operator (JOIN_8)
> condition map:
> Inner Join 0 to 1
> keys:
> 0 _col0 (type: decimal(11,2))
> 1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
> aggregations: count(1)
> mode: hash
> outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides of the join so that HiveDecimal values can be compared as-is.
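The arithmetic behind the decimal(11,2) in the plan, and the "1.0" vs "1.00" comparison from the description, can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical, java.math.BigDecimal stands in for HiveDecimal, and the widening rule (keep the larger scale and the larger number of integer digits, ignoring any maximum-precision cap) is an assumption that happens to reproduce the type shown in the plan, not a copy of Hive's type-derivation code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalKeySketch {
    // Assumed widening rule: the common type keeps the larger scale...
    static int commonScale(int s1, int s2) {
        return Math.max(s1, s2);
    }

    // ...and enough integer digits for either side (no precision cap applied here).
    static int commonPrecision(int p1, int s1, int p2, int s2) {
        int integerDigits = Math.max(p1 - s1, p2 - s2);
        return integerDigits + commonScale(s1, s2);
    }

    // Stand-in for the cast in the plan: rescale a value to the common scale.
    // Widening the scale never rounds, so UNNECESSARY does not throw here.
    static BigDecimal normalize(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        int scale = commonScale(2, 1);
        int precision = commonPrecision(10, 2, 10, 1);
        System.out.println("common type: decimal(" + precision + "," + scale + ")");

        BigDecimal a = new BigDecimal("1.00"); // value in a decimal(10,2) column
        BigDecimal b = new BigDecimal("1.0");  // value in a decimal(10,1) column

        // equals() is scale-sensitive, compareTo() is not.
        System.out.println("raw equals:        " + a.equals(b));
        System.out.println("raw compareTo==0:  " + (a.compareTo(b) == 0));

        // After normalizing both sides, the keys are equal and hash identically.
        BigDecimal na = normalize(a, scale);
        BigDecimal nb = normalize(b, scale);
        System.out.println("normalized equals: " + na.equals(nb));
        System.out.println("same hashCode:     " + (na.hashCode() == nb.hashCode()));
    }
}
```

With decimal(10,2) and decimal(10,1) as inputs, the rule gives 9 integer digits and scale 2, i.e. decimal(11,2). Once both keys share that scale, a scale-sensitive equality check on the stand-in type agrees with numeric equality, which is the property a hashtable keyed on the value as-is would need.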
--
This message was sent by Atlassian Jira
(v8.3.4#803005)