You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2018/06/11 17:55:00 UTC
[jira] [Resolved] (KUDU-2463) Different results returned by group
by on count() metric
[ https://issues.apache.org/jira/browse/KUDU-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved KUDU-2463.
-------------------------------
Resolution: Cannot Reproduce
Fix Version/s: n/a
> Different results returned by group by on count() metric
> --------------------------------------------------------
>
> Key: KUDU-2463
> URL: https://issues.apache.org/jira/browse/KUDU-2463
> Project: Kudu
> Issue Type: Bug
> Components: impala
> Affects Versions: 1.5.0
> Reporter: Tomas Farkas
> Priority: Critical
> Fix For: n/a
>
>
> Hi,
> I have a static table in Kudu, no inserts/updates or deletes are running on the cluster. The query returns DIFFERENT result when I change the where condition on one of the primary key columns, which is in the group_by list.
> The created_date is part of the PK and is type of int.
> PK contains subscriber, time, date, identifier and created_date.
> I tried to check if the inserted count is equal to the HDFS table, and noticed on one day, that the count differs based on the where criteria!!
> {quote}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180601 group by created_date;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180601 group by created_date}}
> {{Query submitted at: 2018-06-04 21:06:30 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=ce4e92eda5aaa02f:ea07aa4600000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180603 | 12145131 |}}
> {{| 20180601 | 18076448 | -> 195k MORE!!!}}
> {{| 20180602 | 13325080 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 4 row(s) in 0.37s}}
> {{[10.197.0.164:21000] >}}
> {{[10.197.0.164:21000] >}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180601 group by created_date order by 1;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180601 group by created_date order by 1}}
> {{Query submitted at: 2018-06-04 21:06:55 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=d541a9dda19e28e4:be4a2ca000000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180601 | 18076448 | -> 195k MORE!!!}}
> {{| 20180602 | 13325080 |}}
> {{| 20180603 | 12145131 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 4 row(s) in 1.14s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180528 group by created_date order by 1;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180528 group by created_date order by 1}}
> {{Query submitted at: 2018-06-04 21:07:12 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=774a261fb94ad2bb:aab28b8b00000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180528 | 17607943 |}}
> {{| 20180529 | 20741097 |}}
> {{| 20180530 | 17362364 |}}
> {{| 20180531 | 16877228 |}}
> \{{| 20180601 | 17925671 | -> 44k MORE!! }}
> {{| 20180602 | 13325080 |}}
> {{| 20180603 | 12145131 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 8 row(s) in 0.67s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1}}
> {{Query submitted at: 2018-06-04 21:07:25 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=38483ad3ae5c8eb9:a538cb6300000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180525 | 22309857 |}}
> {{| 20180526 | 15268520 |}}
> {{| 20180527 | 14939691 |}}
> {{| 20180528 | 17607943 |}}
> {{| 20180529 | 20741097 |}}
> {{| 20180530 | 17362364 |}}
> {{| 20180531 | 16903829 |}}
> {{| 20180601 | 18047010 | -> 165k MORE!!!}}
> {{| 20180602 | 13325080 |}}
> {{| 20180603 | 12145131 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 11 row(s) in 0.85s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date}}
> {{Query submitted at: 2018-06-04 21:07:42 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=7343ba31f6b4c86f:621a7b8c00000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180601 | 17881253 | -> CORRECT ONE}}
> {{+---------------+---------+}}
> {{Fetched 1 row(s) in 0.27s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1}}
> {{Query submitted at: 2018-06-04 21:12:02 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=4141df26117f35c3:9ab2f0700000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180525 | 22309857 |}}
> {{| 20180526 | 15268520 |}}
> {{| 20180527 | 14939691 |}}
> {{| 20180528 | 17607943 |}}
> {{| 20180529 | 20741097 |}}
> {{| 20180530 | 17362364 |}}
> {{| 20180531 | 16903829 |}}
> {{| 20180601 | 18047010 | -> AGAIN WRONG RESULT!!}}
> {{| 20180602 | 13325080 |}}
> {{| 20180603 | 12145131 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 11 row(s) in 1.04s}}{{}}{{}}
> Again, no other inserts/selects/updates or deletes were running between these statements on the cluster.
>
> I checked the explain, if there is a difference,but it looks ok. But the result is different!
>
> {{[10.197.0.164:21000] > explain select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date;}}
> {{Query: explain select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date}}
> {{+--------------------------------------------------+}}
> {{| Explain String |}}
> {{+--------------------------------------------------+}}
> {{| Max Per-Host Resource Reservation: Memory=3.94MB |}}
> {{| Per-Host Resource Estimates: Memory=20.00MB |}}
> {{| |}}
> {{| PLAN-ROOT SINK |}}
> {{| | |}}
> {{| 04:EXCHANGE [UNPARTITIONED] |}}
> {{| | |}}
> {{| 03:AGGREGATE [FINALIZE] |}}
> {{| | output: count:merge(*) |}}
> {{| | group by: created_date |}}
> {{| | |}}
> {{| 02:EXCHANGE [HASH(created_date)] |}}
> {{| | |}}
> {{| 01:AGGREGATE [STREAMING] |}}
> {{| | output: count(*) |}}
> {{| | group by: created_date |}}
> {{| | |}}
> {{| 00:SCAN KUDU [base.usage_kudu] |}}
> {{| kudu predicates: created_date = 20180601 |}}
> {{+--------------------------------------------------+}}
> {{Fetched 19 row(s) in 0.06s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date = 20180601 group by created_date}}
> {{Query submitted at: 2018-06-04 21:17:21 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=c449aabea51e7456:612f096400000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180601 | 17881253 |}}
> {{+---------------+---------+}}
> {{Fetched 1 row(s) in 0.38s}}
> {{[10.197.0.164:21000] > explain select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1;}}
> {{Query: explain select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1}}
> {{+--------------------------------------------------+}}
> {{| Explain String |}}
> {{+--------------------------------------------------+}}
> {{| Max Per-Host Resource Reservation: Memory=9.94MB |}}
> {{| Per-Host Resource Estimates: Memory=26.00MB |}}
> {{| |}}
> {{| PLAN-ROOT SINK |}}
> {{| | |}}
> {{| 05:MERGING-EXCHANGE [UNPARTITIONED] |}}
> {{| | order by: created_date ASC |}}
> {{| | |}}
> {{| 02:SORT |}}
> {{| | order by: created_date ASC |}}
> {{| | |}}
> {{| 04:AGGREGATE [FINALIZE] |}}
> {{| | output: count:merge(*) |}}
> {{| | group by: created_date |}}
> {{| | |}}
> {{| 03:EXCHANGE [HASH(created_date)] |}}
> {{| | |}}
> {{| 01:AGGREGATE [STREAMING] |}}
> {{| | output: count(*) |}}
> {{| | group by: created_date |}}
> {{| | |}}
> {{| 00:SCAN KUDU [base.usage_kudu] |}}
> {{| kudu predicates: created_date >= 20180525 |}}
> {{+--------------------------------------------------+}}
> {{Fetched 23 row(s) in 0.05s}}
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1;}}
> {{Query: select created_date, count(*) from base.usage_kudu where created_date >= 20180525 group by created_date order by 1}}
> {{Query submitted at: 2018-06-04 21:17:32 (Coordinator: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
> {{Query progress can be monitored at: [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=bc4a36f2a7ad3280:c7b09a5100000000]}}
> {{+---------------+---------+}}
> {{| created_date | count(*) |}}
> {{+---------------+---------+}}
> {{| 20180525 | 22309857 |}}
> {{| 20180526 | 15268520 |}}
> {{| 20180527 | 14939691 |}}
> {{| 20180528 | 17607943 |}}
> {{| 20180529 | 20741097 |}}
> {{| 20180530 | 17362364 |}}
> {{| 20180531 | 16903829 |}}
> {{| 20180601 | 18047010 |}}
> {{| 20180602 | 13325080 |}}
> {{| 20180603 | 12145131 |}}
> {{| 20180604 | 3788161 |}}
> {{+---------------+---------+}}
> {{Fetched 11 row(s) in 0.88s}}
>
> {quote}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)