Posted to issues@phoenix.apache.org by "Geoffrey Jacoby (Jira)" <ji...@apache.org> on 2019/10/04 00:09:00 UTC

[jira] [Comment Edited] (PHOENIX-5502) ALTER INDEX REBUILD removes all rows from already valid/consistent index

    [ https://issues.apache.org/jira/browse/PHOENIX-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16944089#comment-16944089 ] 

Geoffrey Jacoby edited comment on PHOENIX-5502 at 10/4/19 12:08 AM:
--------------------------------------------------------------------

A few minutes after my last post, [~kadir] pointed out offline that there's a likely timestamp problem here: the deletes are issued at either the SCN or LATEST_TIMESTAMP, but the subsequent index rebuild writes cells with the original timestamps of the base table. The delete markers are therefore newer than the rebuilt cells and mask them.
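
To make the masking concrete, here is a minimal sketch of the timestamp rule in Python. It is not Phoenix or HBase code: it assumes a simplified model in which a delete marker hides every cell in the same row whose timestamp is less than or equal to the marker's. The names (Cell, DeleteMarker, visible_cells) and the timestamps 100 and 500 are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    ts: int      # cell timestamp
    value: str


@dataclass(frozen=True)
class DeleteMarker:
    row: str
    ts: int      # assumed rule: masks cells in this row with timestamp <= ts


def visible_cells(cells, markers):
    """Return the cells that no delete marker on the same row masks."""
    return [
        c for c in cells
        if not any(m.row == c.row and c.ts <= m.ts for m in markers)
    ]


# Hypothetical timeline: base row written at t=100, rebuild issued at t=500.
BASE_WRITE_TS = 100
REBUILD_TS = 500

# Index row that existed before the rebuild.
index_cells = [Cell("age=15", BASE_WRITE_TS, "Audi")]

# Rebuild step 1: delete the existing index rows at the current time.
markers = [DeleteMarker("age=15", REBUILD_TS)]

# Rebuild step 2: rewrite the index rows with the base table's original timestamps.
rebuilt_cells = [Cell("age=15", BASE_WRITE_TS, "Audi")]

after_rebuild = visible_cells(index_cells + rebuilt_cells, markers)
print(after_rebuild)  # []

# For contrast: a cell written after the marker's timestamp stays visible.
newer = [Cell("age=15", REBUILD_TS + 1, "Audi")]
print(len(visible_cells(newer, markers)))  # 1
```

In this model the rebuilt cell is hidden because its timestamp (100) is older than the delete marker (500), which matches the empty index seen after the rebuild.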

So how to best handle deletes? Options we came up with are: 
 1. Drop index and recreate
 2. HBase truncate (preferably with preserve regions)
 3. Write delete markers with the same ts as the index rows (not sure how well HBase handles this case)

For normal global indexes, option 2 seems best to me, since it's reasonably quick and preserves the regions, and because recreating the index is non-trivial absent PHOENIX-4286. 

The catch is that for view indexes, we can't truncate because each is co-located with all the other view indexes of the same physical base table. For them we'd need option 1 or 3, and in option 1 the recreated view index would get a new view index id. 

We also can't truncate for local indexes. 


> ALTER INDEX REBUILD removes all rows from already valid/consistent index
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-5502
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5502
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.1, 4.14.2, 4.14.3
>            Reporter: Priyank Porwal
>            Priority: Major
>             Fix For: 4.14.1, 4.14.2, 4.14.3
>
>
> Create Table & Indexes:
> CREATE TABLE DEMO2.PEOPLE (FNAME VARCHAR NOT NULL, LNAME VARCHAR, AGE TINYINT, ZIP INTEGER, CONSTRAINT pk PRIMARY KEY (FNAME, LNAME));
>  CREATE INDEX PEOPLE_BY_ZIP ON DEMO2.PEOPLE(ZIP);
>  CREATE INDEX PEOPLE_BY_AGE ON DEMO2.PEOPLE(AGE);
> Populate Data:
> UPSERT INTO DEMO2.PEOPLE VALUES ('Audi', 'Q5', 15, 65000);
> UPSERT INTO DEMO2.PEOPLE VALUES ('Volkswagon', 'Beetle', 10, 43130);
> UPSERT INTO DEMO2.PEOPLE VALUES ('BMW', 'X3', 4, 15030);
> Query Index:
> SELECT * FROM DEMO2.PEOPLE_BY_AGE;
> <3 rows show up>
> Rebuild Index:
> alter index people_by_age on DEMO2.people rebuild;
> Query Index Again:
> SELECT * FROM DEMO2.PEOPLE_BY_AGE;
> <No rows show up>
>  
> It seems that if the index is already consistent, the rebuild command removes all the index rows. The above is the simpler repro, but I have seen similar behavior where the rebuild command does the right thing the first time on an inconsistent index (caused by truncating the table from the HBase shell), but a second run of the rebuild command removes all the rows.
>  


