Posted to dev@hive.apache.org by "Remus Rusanu (JIRA)" <ji...@apache.org> on 2013/06/14 08:52:21 UTC

[jira] [Created] (HIVE-4736) Float aggregate of single value loses precision

Remus Rusanu created HIVE-4736:
----------------------------------

             Summary: Float aggregate of single value loses precision
                 Key: HIVE-4736
                 URL: https://issues.apache.org/jira/browse/HIVE-4736
             Project: Hive
          Issue Type: Sub-task
          Components: Query Processor
    Affects Versions: vectorization-branch
            Reporter: Remus Rusanu
            Assignee: Remus Rusanu
            Priority: Minor


I am seeing differences in the output of TPC-H query 1 that are small but larger than expected. Thanks to Jitendra’s patch, the negative result error is gone. However, the results for
 
Sum(l_extendedprice * ( 1 - l_discount ))
Sum(l_extendedprice * ( 1 - l_discount ) * ( 1 + l_tax ))
 
differ in the low-order 8 or so digits from the results of the same query run against text input. I know the input is float, so it should have all zeros in the low digits when cast to double. I would not expect the answers to match exactly, but I would expect errors in the last few digits, not the last 8. I could be wrong, but I think this may be worth investigating. Any ideas?
 
I also noticed the following when I ran the query against a single row's worth of input (order 1, line 1):
 
SELECT l_returnflag,
       l_linestatus,
       Sum(l_quantity)                                           AS sum_qty,
       Sum(l_extendedprice)                                      AS sum_base_price,
       Sum(l_extendedprice * ( 1 - l_discount ))                 AS sum_disc_price,
       Sum(l_extendedprice * ( 1 - l_discount ) * ( 1 + l_tax )) AS sum_charge,
       Avg(l_quantity)                                           AS avg_qty,
       Avg(l_extendedprice)                                      AS avg_price,
       Avg(l_discount)                                           AS avg_disc,
       Count(*)                                                  AS count_order
FROM   lineitem
WHERE  l_shipdate <= '1998-09-19'
       AND l_orderkey = 1 AND l_linenumber = 1
GROUP  BY l_returnflag,
          l_linestatus
ORDER  BY l_returnflag,
          l_linestatus; 
 
Input row
1       155190  7706    1       17.0    21168.23        0.04    0.02    N       O       1996-03-13  1996-02-12  1996-03-22      DELIVER IN PERSON   TRUCK   egular courts above the
 
 
Result
 
V (vectorized, orc)
N       O       17.0    21168.23046875  20321.501268925873      20727.931285219973      17.0    21168.23046875  0.03999999910593033 1
 
NV (non-vectorized, text)
N       O       17.0    21168.23046875  20321.5                 20727.9296875           17.0    21168.23046875  0.03999999910593033     1
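
For reference, here is a minimal sketch (Python, not Hive code) of one arithmetic pattern that would produce the two result rows above. It assumes the three input columns are 32-bit floats, that one path widens them to double before doing the arithmetic, and that the other path rounds every intermediate result back to float. The helper f32 and all variable names are hypothetical, and whether the two Hive code paths actually evaluate the expressions this way is an assumption, not something confirmed here.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (IEEE 754 double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Column values from the input row, stored as 32-bit floats.
price = f32(21168.23)     # 21168.23046875
discount = f32(0.04)
tax = f32(0.02)

# Assumed pattern A: widen the float inputs to double, then do all
# arithmetic in double precision with no intermediate rounding.
disc_price_double = price * (1.0 - discount)
charge_double = disc_price_double * (1.0 + tax)

# Assumed pattern B: keep everything in float by rounding each
# intermediate result back to 32 bits.
disc_price_float = f32(price * f32(1.0 - discount))   # 20321.5
charge_float = f32(disc_price_float * f32(1.0 + tax))  # 20727.9296875

print("double arithmetic:", disc_price_double, charge_double)
print("float arithmetic: ", disc_price_float, charge_float)
```

Under these assumptions, the double-arithmetic values agree with the V row and the float-arithmetic values agree with the NV row. That would mean the extra low-order digits come from the float inputs not being exactly 0.04 and 0.02 once widened, rather than from accumulation error in the sum.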


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira