Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2020/05/14 21:54:24 UTC

[GitHub] [phoenix] swaroopak opened a new pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

swaroopak opened a new pull request #779:
URL: https://github.com/apache/phoenix/pull/779


   …in IndexTool
   
   IT and unit tests in progress. Sending out for early feedback. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gokceni commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
gokceni commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425461117



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -159,6 +161,16 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
         }
     }
 
+    private boolean shouldRebuildOrVerify() throws IOException {

Review comment:
       Please rename this to shouldVerify.







[GitHub] [phoenix] gjacoby126 commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425473201



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -159,6 +161,16 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
         }
     }
 
+    private boolean shouldRebuildOrVerify() throws IOException {
+        if(verifyType == IndexTool.IndexVerifyType.ONLY) {
+            return true;
+        }
+        byte [] rowKey = generateResultTableRowKey(scan.getTimeRange().getMax(),

Review comment:
       Or Scan, index table name, and region name, whichever you prefer.









[GitHub] [phoenix] priyankporwal commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
priyankporwal commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r426797008



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java
##########
@@ -235,15 +235,22 @@ public static IndexVerifyType fromValue(byte[] value) {
     private static final Option END_TIME_OPTION = new Option("et", "endtime",

Review comment:
       For consistency, these should both use "start-time" / "end-time", just like all other composite-word options.







[GitHub] [phoenix] swaroopak commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425492111



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -1191,6 +1203,9 @@ public boolean next(List<Cell> results) throws IOException {
         }
         Cell lastCell = null;
         int rowCount = 0;
+        if(!shouldRebuildOrVerify()) {

Review comment:
       This was one way to do it without taking any additional parameters. I think read repair won't be impacted, as it would have a different timestamp than the key of the PIT result, so this `if` will not get executed. We can wrap this under a parameter if you think that's the right thing to do.







[GitHub] [phoenix] kadirozde commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425467457



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -1191,6 +1203,9 @@ public boolean next(List<Cell> results) throws IOException {
         }
         Cell lastCell = null;
         int rowCount = 0;
+        if(!shouldRebuildOrVerify()) {

Review comment:
       Have we decided that we would always do this check? Should we do this only when incremental build is specified? At the very least, we should not do this check for read-repair, since it would unnecessarily impact read performance.
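
One possible reading of this suggestion, as a minimal sketch in plain Java: the result lookup is consulted only when an incremental run was requested and the scan is not a read-repair scan. The class `IncrementalCheckSketch`, the method `shouldProcessRegion`, and all of its parameters are hypothetical names chosen for illustration; none of them come from the PR, and the real check may be wired differently.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; these names do not come from the Phoenix code base.
final class IncrementalCheckSketch {

    /**
     * Decides whether a region should be rebuilt or verified.
     * The (potentially expensive) lookup is consulted only for an
     * incremental run that is not a read-repair scan.
     */
    static boolean shouldProcessRegion(boolean incrementalRequested,
                                       boolean readRepair,
                                       BooleanSupplier hasEarlierResult) {
        if (readRepair || !incrementalRequested) {
            // Skip the lookup entirely, so the read path pays no extra cost.
            return true;
        }
        // Incremental run: process only regions with no earlier result.
        return !hasEarlierResult.getAsBoolean();
    }

    public static void main(String[] args) {
        BooleanSupplier earlierResult = () -> true;
        System.out.println(shouldProcessRegion(false, true, earlierResult));
        System.out.println(shouldProcessRegion(true, false, earlierResult));
    }
}
```

Because the lookup is passed in as a supplier, a unit test can count how often it is invoked without mocking a region or a scanner.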







[GitHub] [phoenix] priyankporwal commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
priyankporwal commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r426796578



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java
##########
@@ -235,15 +235,22 @@ public static IndexVerifyType fromValue(byte[] value) {
     private static final Option END_TIME_OPTION = new Option("et", "endtime",
             true, "End time for indextool rebuild or verify");
 
+    private static final Option INCREMENTAL_VERIFY_OPTION = new Option("iv", "incre-verify",

Review comment:
       Nit: Please change this to the full "incremental-verify".







[GitHub] [phoenix] swaroopak merged pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
swaroopak merged pull request #779:
URL: https://github.com/apache/phoenix/pull/779


   





[GitHub] [phoenix] priyankporwal commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
priyankporwal commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r426798261



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java
##########
@@ -235,15 +235,22 @@ public static IndexVerifyType fromValue(byte[] value) {
     private static final Option END_TIME_OPTION = new Option("et", "endtime",
             true, "End time for indextool rebuild or verify");
 
+    private static final Option INCREMENTAL_VERIFY_OPTION = new Option("iv", "incre-verify",

Review comment:
       Should we call this "retry-verify", since that's what the implementation looks like?







[GitHub] [phoenix] kadirozde commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425579243



##########
File path: phoenix-core/src/test/java/org/apache/phoenix/index/ShouldVerifyTest.java
##########
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.index;
+
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.regionserver.Region;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.phoenix.coprocessor.IndexRebuildRegionScanner;
+import org.apache.phoenix.mapreduce.index.IndexTool;
+import org.apache.phoenix.mapreduce.index.IndexVerificationResultRepository;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Matchers;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+
+import java.io.IOException;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.when;
+
+public class ShouldVerifyTest {

Review comment:
       This test class is written to test a tiny method that does a simple check. Given how this test class is constructed, I do not see any value in these tests or in the entire class. I am not sure if you noticed that these tests verify that the following block of code is correct:
   if(a != null || b == IndexTool.IndexVerifyType.ONLY) {
       return true;
   }
   return !c;
   where a, b and c are supplied by the tests. 
   
   I suggest adding a test to an existing integration test to test this incremental verification feature end to end. The Jira does not explain how this feature is activated. It seems one needs to provide the end time parameter. If so, the same end time parameter needs to be used for the base run and the incremental runs. How do we differentiate the result table entries of one run from those of another run when these runs use the same end time? Do we need to differentiate them? How about the start time? What happens if the end times are the same but the start times are different for these runs? All these questions would have been considered if an integration test had been written. Have you considered them?
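
To make the start-time question concrete, here is a small sketch in plain Java. It assumes, purely for illustration, a key built from the end time and region name, and contrasts it with a key that also includes the start time. The thread does not say which layout the real result table uses; `RunKeySketch` and both methods are hypothetical.

```java
// Hypothetical sketch; the real result-table key layout is not shown in this thread.
final class RunKeySketch {

    // Key that ignores the start time: runs sharing an end time share a key.
    static String keyByEndTime(long endTime, String regionName) {
        return endTime + "|" + regionName;
    }

    // Key that includes the start time: runs with different ranges stay distinct.
    static String keyByRange(long startTime, long endTime, String regionName) {
        return startTime + "|" + endTime + "|" + regionName;
    }

    public static void main(String[] args) {
        // Two runs over the same region with the same end time but different start times.
        String base = keyByEndTime(2000L, "region-1");
        String incremental = keyByEndTime(2000L, "region-1");
        System.out.println(base.equals(incremental));

        String baseRange = keyByRange(0L, 2000L, "region-1");
        String incrementalRange = keyByRange(1000L, 2000L, "region-1");
        System.out.println(baseRange.equals(incrementalRange));
    }
}
```

Under the first layout the two runs cannot be told apart by key alone, which is the situation the questions above ask about.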







[GitHub] [phoenix] swaroopak commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425958945



##########
File path: phoenix-core/src/test/java/org/apache/phoenix/index/ShouldVerifyTest.java
##########
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.index;
+
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.regionserver.Region;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.phoenix.coprocessor.IndexRebuildRegionScanner;
+import org.apache.phoenix.mapreduce.index.IndexTool;
+import org.apache.phoenix.mapreduce.index.IndexVerificationResultRepository;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Matchers;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+
+import java.io.IOException;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.when;
+
+public class ShouldVerifyTest {

Review comment:
       @kadirozde Thank you for explaining the missing piece in the PR. I will add an IT in this one. I agree with @gjacoby126, and I faced the same challenge of refactoring it. Some unit tests are better than no tests, I guess :( 







[GitHub] [phoenix] gjacoby126 commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425472240



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -159,6 +161,16 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
         }
     }
 
+    private boolean shouldRebuildOrVerify() throws IOException {
+        if(verifyType == IndexTool.IndexVerifyType.ONLY) {
+            return true;
+        }
+        byte [] rowKey = generateResultTableRowKey(scan.getTimeRange().getMax(),

Review comment:
       Rather than importing generateResultTableRowKey here, it might be cleaner to have a ResultRepository "exists" overload that takes Scan, indexMaintainer, and region name. That keeps the internal details of the table from "leaking" back into the region scanner.
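
A minimal sketch of the encapsulation idea in plain Java, with no HBase dependency. `ResultRepositorySketch`, `record`, `exists`, and the key layout are all hypothetical. The sketch takes the scan's maximum timestamp and an index table name as plain values instead of a Scan and an index maintainer, and it uses an in-memory set as a stand-in for the result table.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; this is not the IndexVerificationResultRepository API.
final class ResultRepositorySketch {

    // In-memory stand-in for the result table.
    private final Set<String> rows = new HashSet<>();

    // The key layout stays private, so callers never build row keys themselves.
    private static String rowKey(long scanMaxTimestamp, String indexTableName, String regionName) {
        return scanMaxTimestamp + "|" + indexTableName + "|" + regionName;
    }

    // Records that a result exists for this timestamp, index table, and region.
    void record(long scanMaxTimestamp, String indexTableName, String regionName) {
        rows.add(rowKey(scanMaxTimestamp, indexTableName, regionName));
    }

    // Callers ask a question about the run instead of passing a raw row key.
    boolean exists(long scanMaxTimestamp, String indexTableName, String regionName) {
        return rows.contains(rowKey(scanMaxTimestamp, indexTableName, regionName));
    }

    public static void main(String[] args) {
        ResultRepositorySketch repository = new ResultRepositorySketch();
        repository.record(100L, "IDX", "region-1");
        System.out.println(repository.exists(100L, "IDX", "region-1"));
        System.out.println(repository.exists(100L, "IDX", "region-2"));
    }
}
```

With this shape, the region scanner would depend only on `exists`, and the key format could change without touching the scanner.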







[GitHub] [phoenix] gjacoby126 commented on a change in pull request #779: PHOENIX-5896: Implement incremental rebuild along the failed regions …

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #779:
URL: https://github.com/apache/phoenix/pull/779#discussion_r425953404



##########
File path: phoenix-core/src/test/java/org/apache/phoenix/index/ShouldVerifyTest.java
##########
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.index;
+
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.regionserver.Region;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.phoenix.coprocessor.IndexRebuildRegionScanner;
+import org.apache.phoenix.mapreduce.index.IndexTool;
+import org.apache.phoenix.mapreduce.index.IndexVerificationResultRepository;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Matchers;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+
+import java.io.IOException;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.when;
+
+public class ShouldVerifyTest {

Review comment:
       @kadirozde, while I agree an IT is also necessary here, it's useful to have a unit test that can quickly verify behavior given a complicated state machine (even if the lines of code under test are few). What makes it tricky is that we're essentially verifying a private method here, which is a code smell. I don't have a great refactor off the top of my head.



