You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Sergey Soldatov (JIRA)" <ji...@apache.org> on 2018/01/19 21:33:00 UTC

[jira] [Created] (PHOENIX-4544) Update statistics inconsistent behavior

Sergey Soldatov created PHOENIX-4544:
----------------------------------------

             Summary: Update statistics inconsistent behavior 
                 Key: PHOENIX-4544
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4544
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 5.x
            Reporter: Romil Choksi


Update statistics may not generate the stats information for all dependent indexes. And this behavior may depend on whether the command executed synchronously or asynchronously.
I have a table GIGANTIC_TABLE with ~500k rows with global index I1 and local index I2.
If async is turned on (the default value):
{noformat}
0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL;
No rows affected (0.081 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 5                             |
+-------------------------------+
1 row selected (0.009 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 520                           |
+-------------------------------+
1 row selected (0.014 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 0                             |
+-------------------------------+
1 row selected (0.008 seconds)
0: jdbc:phoenix:>
{noformat}
As we can see there is no records for local index I2. But if we run statistics for indexes:
{noformat}
0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
No rows affected (0.036 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 20                            |
+-------------------------------+
1 row selected (0.007 seconds)
{noformat}
the statistic for local index is generated correctly.
Now we turn async off:
{noformat}
0: jdbc:phoenix:> delete from SYSTEM.STATS;
547 rows affected (0.079 seconds)
0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL;
999,998 rows affected (4.671 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 520                           |
+-------------------------------+
1 row selected (0.04 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 20                            |
+-------------------------------+
1 row selected (0.012 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 0                             |
+-------------------------------+
1 row selected (0.011 seconds)
{noformat}
As we can see we got statistics for the table itself and local index. But not for the global index.
Moreover, if we try to update statistics for indexes:
{noformat}
0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
499,999 rows affected (0.332 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 0                             |
+-------------------------------+
1 row selected (0.009 seconds)
{noformat}
So, still no records for global index.
But if we delete statistics first and run update for indexes:
{noformat}
0: jdbc:phoenix:> delete from SYSTEM.STATS;
541 rows affected (0.024 seconds)
0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX;
999,998 rows affected (0.41 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 5                             |
+-------------------------------+
1 row selected (0.01 seconds)
0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0';
+-------------------------------+
| COUNT(GUIDE_POSTS_ROW_COUNT)  |
+-------------------------------+
| 20                            |
+-------------------------------+
1 row selected (0.01 seconds)
{noformat}
than we got statistics for both local and global indexes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)