You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Chinmay Kulkarni (Jira)" <ji...@apache.org> on 2020/07/20 21:34:00 UTC

[jira] [Updated] (PHOENIX-6032) When phoenix.allow.system.catalog.rollback=true, a view still sees data from a column that was dropped

     [ https://issues.apache.org/jira/browse/PHOENIX-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinmay Kulkarni updated PHOENIX-6032:
--------------------------------------
    Description: 
Start a 4.x server with phoenix.allow.system.catalog.rollback=true, phoenix.system.catalog.splittable=false. Connect to it from a 4.x client with phoenix.allow.system.catalog.rollback=true. Run the following from the 4.x client:

{code:sql}
CREATE TABLE IF NOT EXISTS T (A INTEGER PRIMARY KEY, B INTEGER, C VARCHAR, D INTEGER);
CREATE VIEW IF NOT EXISTS V (VA INTEGER, VB INTEGER) AS SELECT * FROM T WHERE B=200;
UPSERT INTO V(A,B,C,D,VA,VB) VALUES (2, 200, 'def', -20, 91, 101);

SELECT * FROM T;
+----+------+------+------+
| A  |  B   |  C   |  D   |
+----+------+------+------+
| 2  | 200  | def  | -20  |
+----+------+------+------+

SELECT * FROM V;
+----+------+------+------+-----+------+
| A  |  B   |  C   |  D   | VA  |  VB  |
+----+------+------+------+-----+------+
| 2  | 200  | def  | -20  | 91  | 101  |
+----+------+------+------+-----+------+
-- as expected
-- drop a parent column from the view
ALTER VIEW V DROP COLUMN C;

SELECT * FROM V;
+------+----+------+------+-----+------+
|  C   | A  |  B   |  D   | VA  |  VB  |
+------+----+------+------+-----+------+
| def  | 2  | 200  | -20  | 91  | 101  |
+------+----+------+------+-----+------+
-- Column C can still be seen and its ordering is changed for some reason. If you run the drop column again, it is actually dropped
ALTER VIEW V DROP COLUMN C;

SELECT * FROM V;
+----+------+------+-----+------+
| A  |  B   |  D   | VA  |  VB  |
+----+------+------+-----+------+
| 2  | 200  | -20  | 91  | 101  |
+----+------+------+-----+------+
-- Gets dropped when drop column is run a second time.
{code}

The issue seems to be that when splittable SYSTEM.CATALOG rollback is enabled, we store the parent's column metadata along with the view as well. After the first drop column command, metadata for column 'C' of the parent is removed from the view's metadata rows however it is not marked diverged, nor is an EXCLUDED_COLUMN entry made for that column in the view metadata rows.
Because of this, when resolving the view we potentially keep combining the parent table columns and still get column 'C'. When the second drop column command is issued is when we actually add an EXCLUDED_COLUMN linking row for 'C' in the view metadata.


> When phoenix.allow.system.catalog.rollback=true, a view still sees data from a column that was dropped
> ------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6032
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6032
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.15.0
>            Reporter: Chinmay Kulkarni
>            Priority: Blocker
>             Fix For: 4.16.0
>
>
> Start a 4.x server with phoenix.allow.system.catalog.rollback=true, phoenix.system.catalog.splittable=false. Connect to it from a 4.x client with phoenix.allow.system.catalog.rollback=true. Run the following from the 4.x client:
> {code:sql}
> CREATE TABLE IF NOT EXISTS T (A INTEGER PRIMARY KEY, B INTEGER, C VARCHAR, D INTEGER);
> CREATE VIEW IF NOT EXISTS V (VA INTEGER, VB INTEGER) AS SELECT * FROM T WHERE B=200;
> UPSERT INTO V(A,B,C,D,VA,VB) VALUES (2, 200, 'def', -20, 91, 101);
> SELECT * FROM T;
> +----+------+------+------+
> | A  |  B   |  C   |  D   |
> +----+------+------+------+
> | 2  | 200  | def  | -20  |
> +----+------+------+------+
> SELECT * FROM V;
> +----+------+------+------+-----+------+
> | A  |  B   |  C   |  D   | VA  |  VB  |
> +----+------+------+------+-----+------+
> | 2  | 200  | def  | -20  | 91  | 101  |
> +----+------+------+------+-----+------+
> -- as expected
> -- drop a parent column from the view
> ALTER VIEW V DROP COLUMN C;
> SELECT * FROM V;
> +------+----+------+------+-----+------+
> |  C   | A  |  B   |  D   | VA  |  VB  |
> +------+----+------+------+-----+------+
> | def  | 2  | 200  | -20  | 91  | 101  |
> +------+----+------+------+-----+------+
> -- Column C can still be seen and its ordering is changed for some reason. If you run the drop column again, it is actually dropped
> ALTER VIEW V DROP COLUMN C;
> SELECT * FROM V;
> +----+------+------+-----+------+
> | A  |  B   |  D   | VA  |  VB  |
> +----+------+------+-----+------+
> | 2  | 200  | -20  | 91  | 101  |
> +----+------+------+-----+------+
> -- Gets dropped when drop column is run a second time.
> {code}
> The issue seems to be that when splittable SYSTEM.CATALOG rollback is enabled, we store the parent's column metadata along with the view as well. After the first drop column command, metadata for column 'C' of the parent is removed from the view's metadata rows however it is not marked diverged, nor is an EXCLUDED_COLUMN entry made for that column in the view metadata rows.
> Because of this, when resolving the view we potentially keep combining the parent table columns and still get column 'C'. When the second drop column command is issued is when we actually add an EXCLUDED_COLUMN linking row for 'C' in the view metadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)