Posted to issues@phoenix.apache.org by "Dheeren Beborrtha (JIRA)" <ji...@apache.org> on 2018/10/01 17:26:00 UTC
[jira] [Commented] (PHOENIX-3005) Fixes for COUNT(DISTINCT...) with
DistinctPrefixFilter and indexes
[ https://issues.apache.org/jira/browse/PHOENIX-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634360#comment-16634360 ]
Dheeren Beborrtha commented on PHOENIX-3005:
--------------------------------------------
It looks like there is still an issue here. Should we reopen this issue or create a new Jira?
The query `select count (distinct c1) from tab1 limit 10` returns the correct result on (phoenix 4.7.0.2.5.3.0-37, hbase 1.1.2.2.5) and an incorrect result on (phoenix-4.12.0.2.4.2.0-258, hbase 1.1.2.2.4):
*On phoenix-4.12.0.2.4.2.0-258, hbase 1.1.2.2.4*
Without limit:
0: jdbc:phoenix:thin:url=[http://XXXXXXX-|http://halhb-bdcsce-/]> select count (distinct oidadi) from ADI_DL_DATA ;
+------------------------------------------+
|DISTINCT_COUNT(OIDADI)|
+------------------------------------------+
|1985|
+------------------------------------------+
1 row selected (1.421 seconds)
With limit:
0: jdbc:phoenix:thin:url=[http://XXXXXX-|http://halhb-bdcsce-/]> select count (distinct oidadi) from ADI_DL_DATA limit 10;
+------------------------------------------+
|DISTINCT_COUNT(OIDADI)|
+------------------------------------------+
|97|
+------------------------------------------+
*On phoenix 4.7.0.2.5.3.0-37, hbase 1.1.2.2.5*
Without limit:
0: jdbc:phoenix:thin:url=[http://XXXXXXX-|http://halhb-bdcsce-/]> select count (distinct oidadi) from ADI_DL_DATA ;
+------------------------------------------+
|DISTINCT_COUNT(OIDADI)|
+------------------------------------------+
|1985|
+------------------------------------------+
1 row selected (1.421 seconds)
With limit:
0: jdbc:phoenix:thin:url=[http://XXXXXX-|http://halhb-bdcsce-/]> select count (distinct oidadi) from ADI_DL_DATA limit 10;
+------------------------------------------+
|DISTINCT_COUNT(OIDADI)|
+------------------------------------------+
|1985|
+------------------------------------------+
======================================
With explain:
0: jdbc:phoenix:thin:url=[http://XXXXXXX-|http://halhb-bdcsce-/]> explain select count (distinct oidadi) from ADI_DL_DATA limit 10;
+-------------------------------------------+-----------------------------------------++------------------------------------------------------------------------------------+
|PLAN|EST_BYTES_READ|EST_ROWS_READ|EST_INFO_TS|
+-------------------------------------------+-----------------------------------------++------------------------------------------------------------------------------------+
|CLIENT 128-CHUNK 26176389 ROWS 35232158993 BYTES PARALLEL 16-WAY FULL SCAN OVER ADI_DL_DATA|35232158993|26176389|
|SERVER FILTER BY FIRST KEY ONLY|35232158993|26176389|1537332401319|
|SERVER DISTINCT PREFIX FILTER OVER [OIDADI]|35232158993|26176389|1537332401319|
|SERVER 10 ROW LIMIT|35232158993|26176389|1537332401319|
|SERVER AGGREGATE INTO SINGLE ROW|35232158993|26176389|1537332401319|
|CLIENT 10 ROW LIMIT|35232158993|26176389|1537332401319|
+-------------------------------------------+-----------------------------------------++------------------------------------------------------------------------------------+
6 rows selected (1.573 seconds)
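One possible reading of the plan above, offered as an assumption and not confirmed anywhere in this thread: the "SERVER 10 ROW LIMIT" step is listed before "SERVER AGGREGATE INTO SINGLE ROW", so the limit may be cutting off scanned rows before they are counted, instead of limiting the single aggregated row. The toy Python model below is not Phoenix code; the function names and the data are made up, and it only illustrates how a per-chunk row limit applied before a COUNT(DISTINCT) aggregate would undercount.

```python
# Toy model only (not Phoenix code): compares applying a row limit
# after vs. before a single-row COUNT(DISTINCT) aggregate.

def count_distinct(rows):
    """Aggregate into a single row: the number of distinct values."""
    return len(set(rows))

def limit_after_aggregate(chunks, limit):
    """Expected semantics: aggregate all rows, then limit the result rows."""
    all_rows = [value for chunk in chunks for value in chunk]
    result_rows = [count_distinct(all_rows)]  # always exactly one row
    return result_rows[:limit]

def limit_before_aggregate(chunks, limit):
    """Hypothetical faulty order: each chunk stops after `limit` rows,
    and only the surviving rows are aggregated."""
    kept = [value for chunk in chunks for value in chunk[:limit]]
    return [count_distinct(kept)]

# Made-up data: 4 scan chunks holding 50 distinct values each (200 total).
chunks = [list(range(i * 50, (i + 1) * 50)) for i in range(4)]

print(limit_after_aggregate(chunks, 10))   # [200]
print(limit_before_aggregate(chunks, 10))  # [40]
```

If something like this is happening, the wrong count would depend on how many chunks are scanned and how the limit is applied per scan, so the exact value (97 here) would vary with the table layout.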
CC: [~lhofhansl] [~jamestaylor]
> Fixes for COUNT(DISTINCT...) with DistinctPrefixFilter and indexes
> ------------------------------------------------------------------
>
> Key: PHOENIX-3005
> URL: https://issues.apache.org/jira/browse/PHOENIX-3005
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Major
> Fix For: 4.8.0
>
> Attachments: 3005-v2.txt, 3005-wip-v1.txt, PHOENIX-3005_v1.patch
>
>
> It turns out that PHOENIX-2965 has some bugs with indexes:
> # COUNT(DISTINCT <indexed column>) does not use the DistinctPrefixFilter
> # Once an index is created COUNT(DISTINCT <pk-prefix>) is no longer using the DistinctPrefixFilter
> This jira fixes both issues.
> Was:
> Currently the optimization in PHOENIX-258 is not used for DISTINCT index scans. We should add that as well.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)