Posted to issues@hive.apache.org by "HanCheol Cho (JIRA)" <ji...@apache.org> on 2015/09/03 09:47:48 UTC

[jira] [Commented] (HIVE-3715) float and double calculation is inaccurate in Hive

    [ https://issues.apache.org/jira/browse/HIVE-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728628#comment-14728628 ] 

HanCheol Cho commented on HIVE-3715:
------------------------------------

Hi, 


I think I have run into this problem too, with the following query, while testing Hive:
-----------------------------------------------------------------
explain select
        sum(l_extendedprice * l_discount) as revenue
from
        lineitem
where
        l_shipdate >= '1993-01-01'
        and l_shipdate < '1994-01-01'
        and l_discount between 0.05 and (0.06 + 0.01)
        and l_quantity < 25;
-----------------------------------------------------------------
The result is wrong: in the explain output, l_discount's left boundary is 0.05,
whereas its right boundary is 0.06999999999999999 instead of 0.07.
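
For illustration, the same boundary value appears in any IEEE 754 double arithmetic. The sketch below uses Python (whose floats are doubles) rather than Hive itself, and the in_range helper is a hypothetical stand-in for BETWEEN, not Hive code. It suggests why a row with l_discount = 0.07 would be dropped, and how decimal arithmetic keeps the boundary exact:

```python
from decimal import Decimal

# Neither 0.06 nor 0.01 is exactly representable as a binary double,
# and their rounded sum lands just below the double closest to 0.07.
upper = 0.06 + 0.01
print(repr(upper))      # 0.06999999999999999
print(upper == 0.07)    # False


def in_range(value, low, high):
    """Hypothetical stand-in for `value BETWEEN low AND high` (inclusive)."""
    return low <= value <= high


# With doubles, a discount of 0.07 falls outside the computed range.
double_match = in_range(0.07, 0.05, upper)
print(double_match)     # False

# With exact decimal arithmetic the upper bound is exactly 0.07.
exact_upper = Decimal("0.06") + Decimal("0.01")
decimal_match = in_range(Decimal("0.07"), Decimal("0.05"), exact_upper)
print(decimal_match)    # True
```

If the literals were handled as decimals (or the bound written as a single literal 0.07), the comparison would presumably include those rows; whether Hive can be made to do so here is exactly the open question of this issue.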

Can someone tell me the status of this problem?
I think this is a serious problem that should be patched (and merged into a
release), but it has been open for years...



> float and double calculation is inaccurate in Hive
> --------------------------------------------------
>
>                 Key: HIVE-3715
>                 URL: https://issues.apache.org/jira/browse/HIVE-3715
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.10.0
>            Reporter: Johnny Zhang
>            Assignee: Johnny Zhang
>         Attachments: HIVE-3715.patch.txt
>
>
> I found this while debugging the e2e test failures. Hive miscalculates float and double values. Take a float calculation as an example:
> hive> select f from all100k limit 1;
> 48308.98
> hive> select f/10 from all100k limit 1;
> 4830.898046875   <--added 046875 at the end
> hive> select f*1.01 from all100k limit 1;
> 48792.0702734375  <--should be 48792.0698
> It might be essentially the same problem as http://effbot.org/pyfaq/why-are-floating-point-calculations-so-inaccurate.htm. But since the e2e tests compare the results with MySQL, and MySQL seems to get it right, it is worth fixing in Hive.
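
The float outputs quoted above can be reproduced outside Hive. This is a minimal Python sketch, assuming (not confirmed in this thread) that the FLOAT value is widened to a double before the arithmetic; to_float32 is a hypothetical helper that rounds a double to single precision:

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 48308.98 is not representable in single precision; the nearest float
# is 48308.98046875, which is what the column actually stores.
f = to_float32(48308.98)
print(f)          # 48308.98046875

# Arithmetic on the widened value exposes the extra digits.
divided = f / 10
print(divided)    # 4830.898046875

# The product comes out close to the 48792.0702734375 shown in the report.
multiplied = f * 1.01
print(multiplied)
```

Under that assumption the extra digits come from the stored single-precision value rather than from the division or multiplication itself.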



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)