Posted to issues@hive.apache.org by "HanCheol Cho (JIRA)" <ji...@apache.org> on 2015/09/03 09:47:48 UTC
[jira] [Commented] (HIVE-3715) float and double calculation is inaccurate in Hive
[ https://issues.apache.org/jira/browse/HIVE-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728628#comment-14728628 ]
HanCheol Cho commented on HIVE-3715:
------------------------------------
Hi,
I think I have run into this problem too, with the following query while testing Hive:
-----------------------------------------------------------------
explain select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= '1993-01-01'
and l_shipdate < '1994-01-01'
and l_discount between 0.05 and (0.06 + 0.01)
and l_quantity < 25;
-----------------------------------------------------------------
The result is wrong and, in the explain output, l_discount's lower bound is 0.05,
whereas its upper bound is 0.06999999999999999 rather than 0.07.
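The upper bound shown in the plan is what one would expect if the constant expression 0.06 + 0.01 is evaluated in IEEE 754 double precision. Below is a minimal sketch in Python, used here only because its floats are the same 64-bit doubles as Hive's DOUBLE. It is not Hive code, and it assumes l_discount holds the double nearest to 0.07. The Decimal part is just one illustrative way to get an exact 0.07, not something the issue proposes.

```python
from decimal import Decimal

# Python floats are IEEE 754 binary64, the same format as a Hive DOUBLE.
upper = 0.06 + 0.01
print(upper)           # 0.06999999999999999
print(upper == 0.07)   # False

# Assumption: a row stores l_discount as the double nearest to 0.07.
# Such a row is then rejected by the BETWEEN range.
l_discount = 0.07
print(0.05 <= l_discount <= upper)   # False

# Exact decimal arithmetic yields the intended bound.
exact_upper = Decimal("0.06") + Decimal("0.01")
print(exact_upper)                     # 0.07
print(exact_upper == Decimal("0.07"))  # True
```

If that assumption holds, rows with a discount of exactly 0.07 would be dropped from the sum, which would explain a wrong result.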
Can someone tell me what the status of this problem is?
I think this is a serious problem that should be patched (and merged into a
release), but it has been open for years...
> float and double calculation is inaccurate in Hive
> --------------------------------------------------
>
> Key: HIVE-3715
> URL: https://issues.apache.org/jira/browse/HIVE-3715
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 0.10.0
> Reporter: Johnny Zhang
> Assignee: Johnny Zhang
> Attachments: HIVE-3715.patch.txt
>
>
> I found this while debugging the e2e test failures: Hive miscalculates float and double values. Take float calculation as an example:
> hive> select f from all100k limit 1;
> 48308.98
> hive> select f/10 from all100k limit 1;
> 4830.898046875 <-- added 046875 at the end
> hive> select f*1.01 from all100k limit 1;
> 48792.0702734375 <--should be 48792.0698
> It might be essentially the same problem as http://effbot.org/pyfaq/why-are-floating-point-calculations-so-inaccurate.htm. But since the e2e tests compare the results with MySQL, and MySQL seems to get it right, it is worth fixing in Hive.
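The extra digits in the quoted description are what one would expect if column f is a 32-bit FLOAT whose value is widened to a 64-bit double before the arithmetic. The following Python sketch illustrates that assumption. It is not Hive's actual code path, and to_float32 is a hypothetical helper introduced only for this example.

```python
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest 32-bit float, then widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Assumption: 48308.98 is stored as a 32-bit float, so its true value
# is the nearest representable float32, not 48308.98 itself.
f = to_float32(48308.98)
print(f)         # 48308.98046875

# Dividing the widened value exposes the hidden trailing digits.
print(f / 10)    # 4830.898046875

# The same effect applies to multiplication; compare with the value in the report.
print(f * 1.01)
```

Under this reading the arithmetic itself is correctly rounded. The surprise comes from displaying the float's full widened value rather than its short decimal form.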