Posted to dev@phoenix.apache.org by "Kadir OZDEMIR (Jira)" <ji...@apache.org> on 2020/03/23 16:50:00 UTC

[jira] [Created] (PHOENIX-5795) Supporting selective queries for index rows updated concurrently

Kadir OZDEMIR created PHOENIX-5795:
--------------------------------------

             Summary: Supporting selective queries for index rows updated concurrently
                 Key: PHOENIX-5795
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5795
             Project: Phoenix
          Issue Type: Sub-task
            Reporter: Kadir OZDEMIR


From the consistent indexing design (PHOENIX-5156) perspective, two or more pending updates from different batches on the same data row are concurrent if and only if, for all of these updates, the data table row state has been read from HBase under the row lock and, for none of them, the row lock has been acquired a second time to update the data table. In other words, all of them are in the first update phase at the same time. For concurrent updates, the first two update phases are performed but the last update phase is skipped. This means the data table row will be updated by these updates, but the corresponding index table rows will be left in the unverified status. The read repair process will then repair these unverified index rows during scans.
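
The concurrency rule above can be sketched as a small Python model. This is only an illustration of the definition, not Phoenix code: the names PendingUpdate, are_concurrent and final_index_status are hypothetical, and the sketch assumes that index rows are written as unverified in the first phase and marked verified in the last one.

```python
from dataclasses import dataclass

# Phase markers for a pending update (illustrative names, not Phoenix identifiers).
READ_ROW_STATE = 1   # row state has been read from HBase under the row lock
WRITE_DATA_ROW = 2   # row lock acquired a second time to update the data table


@dataclass
class PendingUpdate:
    batch_id: int
    phase: int = READ_ROW_STATE


def are_concurrent(updates):
    """Two or more updates on one data row are concurrent when all of them
    are still in the first phase, i.e. none has re-acquired the row lock."""
    return len(updates) >= 2 and all(u.phase == READ_ROW_STATE for u in updates)


def final_index_status(updates):
    """Return the status each batch's index row is left with."""
    # Decide up front whether the last phase will be skipped.
    skip_last_phase = are_concurrent(updates)
    status = {}
    for u in updates:
        # The first two phases always run, so the data table row is updated.
        u.phase = WRITE_DATA_ROW
        # Assumption: the last phase is what flips the index row to verified.
        status[u.batch_id] = "UNVERIFIED" if skip_last_phase else "VERIFIED"
    return status
```

In this model, two updates that are both in the first phase leave their index rows unverified, while a single update on the row completes all three phases.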

In addition to leaving index rows unverified, concurrent updates may generate index rows with incorrect row keys. For example, suppose an application issues the very first two upserts on the same row concurrently, and the second upsert does not include one or more of the indexed columns. When these updates arrive at IndexRegionObserver concurrently, the existing row state is null for both of them. This means the index updates are generated solely from the pending updates. The partial upsert with missing indexed columns generates an index row by assuming the missing indexed columns have null values, and this assumption may not be true, as the other concurrent upsert may have non-null values for those columns. After issuing the concurrent updates, if the application attempts to read the row back using a selective query on the index table, and this query maps to an HBase scan that does not cover these unverified rows because of their incorrect row keys, the application will not get the row content back correctly.
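
The failure described above can be reproduced with a toy model. Everything here is an assumption made for illustration: the table has a primary key "id" and an index on ("city", "name"), the index row key is the indexed column values followed by the primary key, a selective query is modeled as a prefix match on that key, and the partial upsert is assumed to be applied to the data table last. None of the function names come from Phoenix.

```python
def index_row_key(existing_row, pending, indexed_cols, pk_col):
    """Build an index row key from the existing row state plus a pending update.
    Indexed columns missing from both are assumed to be null (None)."""
    row = dict(existing_row or {})
    row.update(pending)
    return tuple(row.get(col) for col in indexed_cols) + (row[pk_col],)


def selective_scan(index_keys, prefix):
    """Model a selective index query as a scan over keys with a given prefix."""
    return [key for key in index_keys if key[:len(prefix)] == prefix]


INDEXED = ("city", "name")
upsert_a = {"id": 1, "city": "Paris", "name": "x"}   # full upsert
upsert_b = {"id": 1, "name": "y"}                     # partial upsert, no "city"

# Both upserts are the very first on this row and run concurrently,
# so each one sees a null existing row state.
generated_keys = [index_row_key(None, u, INDEXED, "id") for u in (upsert_a, upsert_b)]

# Data table row once both upserts are applied (assuming upsert_b lands last).
data_row = {**upsert_a, **upsert_b}
correct_key = index_row_key(data_row, {}, INDEXED, "id")

# The application reads the row back by the values it now holds.
hits = selective_scan(generated_keys, ("Paris", "y"))
```

Here the generated keys are ("Paris", "x", 1) and (None, "y", 1), while the key matching the final data row is ("Paris", "y", 1). The selective scan visits neither unverified row, so read repair never gets a chance to fix them and the query returns nothing.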



--
This message was sent by Atlassian Jira
(v8.3.4#803005)