Posted to dev@phoenix.apache.org by "Mujtaba Chohan (JIRA)" <ji...@apache.org> on 2017/05/08 21:17:04 UTC

[jira] [Created] (PHOENIX-3836) Estimated row count is twice the actual row count when stats are updated via major compaction

Mujtaba Chohan created PHOENIX-3836:
---------------------------------------

             Summary: Estimated row count is twice the actual row count when stats are updated via major compaction
                 Key: PHOENIX-3836
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3836
             Project: Phoenix
          Issue Type: Bug
            Reporter: Mujtaba Chohan
            Priority: Minor


The estimated row count for a 2M-row table is 3986498 after stats are updated via major compaction, versus 1993250 after {{update statistics}}, i.e. almost exactly double.

{noformat}
Explain plan for count(*) on 2M row table after major compaction:
+--------------------------------------------------------------------------------------+
|                                         PLAN                                         |
+--------------------------------------------------------------------------------------+
| CLIENT 364-CHUNK 3986498 ROWS 3774892993 BYTES PARALLEL 1-WAY FULL SCAN OVER T  |
|     SERVER FILTER BY FIRST KEY ONLY                                                  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                 |
+--------------------------------------------------------------------------------------+

Explain plan for count(*) on 2M row table after update statistics:
+--------------------------------------------------------------------------------------+
|                                         PLAN                                         |
+--------------------------------------------------------------------------------------+
| CLIENT 364-CHUNK 1993250 ROWS 3774892993 BYTES PARALLEL 1-WAY FULL SCAN OVER T  |
|     SERVER FILTER BY FIRST KEY ONLY                                                  |
|     SERVER AGGREGATE INTO SINGLE ROW                                                 |
+--------------------------------------------------------------------------------------+
{noformat}
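
The comparison above could be reproduced roughly as follows. This is only a sketch: it assumes the table is named T as in the plans, and the major compaction is triggered outside Phoenix (the HBase shell command in the comment is one possible way, not necessarily what was used here).
{noformat}
-- Path 1: collect stats explicitly, then check the ROWS estimate
UPDATE STATISTICS T;
EXPLAIN SELECT COUNT(*) FROM T;

-- Path 2: let a major compaction of T collect the stats
-- (run outside Phoenix, e.g. HBase shell: major_compact 'T'),
-- then re-run the same EXPLAIN and compare the ROWS estimate
EXPLAIN SELECT COUNT(*) FROM T;
{noformat}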

The following schema was used, with 2M rows and a 10MB guidepost width:
{noformat}
CREATE TABLE IF NOT EXISTS T (PKA CHAR(15) NOT NULL, PKF CHAR(3) NOT NULL,
 PKP CHAR(15) NOT NULL, CRD DATE NOT NULL, EHI CHAR(15) NOT NULL, STD_COL VARCHAR, INDEXED_COL INTEGER,
 CONSTRAINT PK PRIMARY KEY ( PKA, PKF, PKP, CRD DESC, EHI))
 VERSIONS=1,MULTI_TENANT=true,IMMUTABLE_ROWS=true
{noformat}
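
The report does not say how the 10MB guidepost width was configured. One possible way, assuming a Phoenix version that supports the GUIDE_POSTS_WIDTH table property (otherwise the server-side phoenix.stats.guidepost.width setting would apply), might look like this:
{noformat}
-- 10MB = 10 * 1024 * 1024 bytes (assumed; the report only says "10MB")
ALTER TABLE T SET GUIDE_POSTS_WIDTH = 10485760;
-- Re-collect stats so the guideposts reflect the new width
UPDATE STATISTICS T;
{noformat}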



