Posted to issues@phoenix.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/04 02:12:00 UTC

[jira] [Commented] (PHOENIX-6141) Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK

    [ https://issues.apache.org/jira/browse/PHOENIX-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696384#comment-17696384 ] 

ASF GitHub Bot commented on PHOENIX-6141:
-----------------------------------------

jpisaac commented on code in PR #1575:
URL: https://github.com/apache/phoenix/pull/1575#discussion_r1125217986


##########
phoenix-core/src/main/java/org/apache/phoenix/coprocessor/ChildLinkMetaDataObserver.java:
##########
@@ -0,0 +1,290 @@
+package org.apache.phoenix.coprocessor;
+
+import org.apache.hadoop.hbase.Cell;
+import org.apache.hadoop.hbase.CellUtil;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Delete;
+import org.apache.hadoop.hbase.client.Mutation;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.coprocessor.ObserverContext;
+import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
+import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
+import org.apache.hadoop.hbase.coprocessor.RegionObserver;
+import org.apache.hadoop.hbase.regionserver.Region;
+import org.apache.hadoop.hbase.regionserver.RegionScanner;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.phoenix.jdbc.PhoenixDatabaseMetaData;
+import org.apache.phoenix.util.EnvironmentEdgeManager;
+import org.apache.phoenix.util.SchemaUtil;
+import org.apache.phoenix.util.ServerUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.sql.SQLException;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Optional;
+
+import static org.apache.phoenix.query.QueryConstants.VERIFIED_BYTES;
+import static org.apache.phoenix.util.ScanUtil.getDummyResult;
+import static org.apache.phoenix.util.ScanUtil.getPageSizeMsForRegionScanner;
+import static org.apache.phoenix.util.ScanUtil.isDummy;
+import static org.apache.phoenix.thirdparty.com.google.common.base.Preconditions.checkArgument;
+
+
+/**
+ * Coprocessor that verifies scanned rows of SYSTEM.CHILD_LINK table
+ */
+public class ChildLinkMetaDataObserver extends BaseScannerRegionObserver implements RegionCoprocessor {

Review Comment:
   @palashc I am assuming the verification is being done out of band, right?
   Any reason we do not want to reuse the ChildLinkMetaDataEndpoint coprocessor instead of adding a new one?
   
   We could extend the ChildLinkMetaDataService with a verifyRows method and move this logic there.





> Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-6141
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6141
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 4.15.0
>            Reporter: Chinmay Kulkarni
>            Assignee: Palash Chauhan
>            Priority: Blocker
>             Fix For: 5.2.0, 5.1.4
>
>
> Before 4.15, "CREATE/DROP VIEW" was an atomic operation, since we issued batch mutations against just the one SYSTEM.CATALOG region. In 4.15 we introduced SYSTEM.CHILD_LINK to store the parent->child links, so CREATE VIEW is no longer atomic: it consists of 2 separate RPCs (one to SYSTEM.CHILD_LINK to add the linking row, and another to SYSTEM.CATALOG to write metadata for the new view).
> If the second RPC, i.e. the write of metadata to SYSTEM.CATALOG, fails after the first RPC has already gone through, the two metadata tables become inconsistent: orphan parent->child linking rows are left behind in SYSTEM.CHILD_LINK. This can cause the following issues:
> # ALTER TABLE calls on the base table will fail
> # DROP TABLE without CASCADE will fail
> # The upgrade path has calls like UpgradeUtil.upgradeTable() which will fail
> # Any metadata consistency checks can be thrown off
> # Unnecessary extra storage of orphan links
> The first 3 issues occur because the orphan linking rows make us wrongly deduce that a base table has child views.
> This Jira aims to make the mutations to SYSTEM.CATALOG and SYSTEM.CHILD_LINK atomic. We can use a 2-phase commit approach like the one used for global indexes, or potentially explore using a transaction manager.
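The 2-phase commit idea from the description can be illustrated with a toy model. This is only a sketch: all class and method names below are hypothetical, in-memory maps stand in for the two HBase system tables, and the UNVERIFIED/VERIFIED flag mirrors the pattern Phoenix uses for global indexes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the 2-phase approach: write the child link as UNVERIFIED,
// commit the catalog row, then flip the link to VERIFIED. On a catalog
// failure the orphan link stays UNVERIFIED, so readers and repair jobs
// can ignore or delete it instead of inferring a child view.
public class TwoPhaseViewCreateSketch {

    enum LinkState { UNVERIFIED, VERIFIED }

    static final Map<String, LinkState> CHILD_LINK = new HashMap<>();
    static final Map<String, String> CATALOG = new HashMap<>();

    // Returns true if the view was fully created.
    static boolean createView(String parent, String view, boolean failCatalogRpc) {
        String linkKey = parent + "->" + view;
        // Phase 1: RPC to SYSTEM.CHILD_LINK, row written as UNVERIFIED.
        CHILD_LINK.put(linkKey, LinkState.UNVERIFIED);
        // Phase 2: RPC to SYSTEM.CATALOG with the view metadata.
        if (failCatalogRpc) {
            return false; // orphan link remains, but it is UNVERIFIED
        }
        CATALOG.put(view, "metadata for " + view);
        // Phase 3: mark the link VERIFIED only after the catalog commit.
        CHILD_LINK.put(linkKey, LinkState.VERIFIED);
        return true;
    }

    // A consistency check (e.g. for ALTER/DROP on the parent) counts only
    // VERIFIED links, so orphan rows no longer imply child views.
    static long verifiedChildCount(String parent) {
        return CHILD_LINK.entrySet().stream()
                .filter(e -> e.getKey().startsWith(parent + "->"))
                .filter(e -> e.getValue() == LinkState.VERIFIED)
                .count();
    }

    public static void main(String[] args) {
        createView("T", "V1", false); // succeeds end to end
        createView("T", "V2", true);  // catalog RPC "fails" mid-way
        System.out.println(verifiedChildCount("T")); // only V1 counts
    }
}
```

In the real system the VERIFIED flag would be a cell on the SYSTEM.CHILD_LINK row (as with global index rows), and the verification/repair pass, such as the one in the PR under review, would resolve or clean up UNVERIFIED links out of band.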



--
This message was sent by Atlassian Jira
(v8.20.10#820010)