You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Jesus Camacho Rodriguez (JIRA)" <ji...@apache.org> on 2017/01/16 13:02:26 UTC
[jira] [Created] (HIVE-15640) Hive/Druid integration: null handling
for metrics
Jesus Camacho Rodriguez created HIVE-15640:
----------------------------------------------
Summary: Hive/Druid integration: null handling for metrics
Key: HIVE-15640
URL: https://issues.apache.org/jira/browse/HIVE-15640
Project: Hive
Issue Type: Bug
Components: Druid integration
Affects Versions: 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Priority: Critical
Null values for metrics in Druid and Hive are not handled the same way (_0.0_ vs _NULL_).
In Druid:
{code:sql}
SELECT i_brand_id, floor_day(`__time`), max(ss_quantity), sum(ss_wholesale_cost) as s
FROM store_sales_sold_time_subset
WHERE floor_day(`__time`) BETWEEN '1999-11-01 00:00:00' AND '1999-11-10 00:00:00'
GROUP BY i_brand_id, floor_day(`__time`)
ORDER BY s
LIMIT 10;
OK
6015006 1999-11-03 00:00:00 0.0 0.0
9011009 1999-11-05 00:00:00 0.0 0.0
8003009 1999-11-03 00:00:00 11.0 1.0299999713897705
10005014 1999-11-05 00:00:00 86.0 1.100000023841858
6008007 1999-11-09 00:00:00 81.0 1.3700000047683716
6003003 1999-11-08 00:00:00 45.0 1.600000023841858
8008009 1999-11-08 00:00:00 98.0 1.7100000381469727
8015003 1999-11-02 00:00:00 10.0 1.7400000095367432
8004008 1999-11-10 00:00:00 45.0 1.7599999904632568
8009009 1999-11-07 00:00:00 81.0 1.7699999809265137
{code}
In Hive:
{code:sql}
SELECT i_brand_id, floor_day(`__time`), max(ss_quantity), sum(ss_wholesale_cost) as s
FROM store_sales_sold_time_subset_hive
WHERE floor_day(`__time`) BETWEEN '1999-11-01 00:00:00' AND '1999-11-10 00:00:00'
GROUP BY i_brand_id, floor_day(`__time`)
ORDER BY s
LIMIT 10;
OK
6015006 1999-11-03 00:00:00 NULL NULL
9011009 1999-11-05 00:00:00 NULL NULL
8003009 1999-11-03 00:00:00 11 1.03
10005014 1999-11-05 00:00:00 86 1.1
6008007 1999-11-09 00:00:00 81 1.37
6003003 1999-11-08 00:00:00 45 1.6
8008009 1999-11-08 00:00:00 98 1.71
8015003 1999-11-02 00:00:00 10 1.74
8004008 1999-11-10 00:00:00 45 1.76
8009009 1999-11-07 00:00:00 81 1.77
{code}
However, for Druid dimensions, NULL values seem to be handled properly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)