You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2020/06/11 20:25:53 UTC

[GitHub] [phoenix] gjacoby126 opened a new pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

gjacoby126 opened a new pull request #801:
URL: https://github.com/apache/phoenix/pull/801


   …s should be configurable


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439656954



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexVerificationOutputRepository.java
##########
@@ -85,6 +88,15 @@
     public static final byte[] PHASE_BEFORE_VALUE = Bytes.toBytes("BEFORE");
     public static final byte[] PHASE_AFTER_VALUE = Bytes.toBytes("AFTER");
 
+
+    public enum IndexVerificationErrorType {
+        INVALID_ROW,

Review comment:
       Since all of them refer to index rows, and Index is in the name of the enum, isn't that implicit?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r440427105



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -152,6 +158,9 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
             verifyType = IndexTool.IndexVerifyType.fromValue(valueBytes);
             if (verifyType != IndexTool.IndexVerifyType.NONE) {
                 verify = true;
+                boolean shouldLogBeyondMaxLookbackInvalidRows =
+                    config.getBoolean(PHOENIX_LOG_BEYOND_MAX_LOOKBACK_ERROR_ROWS_CONF_KEY,

Review comment:
       So far I have done nothing special to allow it be overridden from client-side. If the IndexTool is already taking all conf parameters from the client and using them on the server-side, then that's the case for this as well, and if it's not, it's not. (At a cursory look it appears that it's only certain flags which the InputFormat explicitly passes as scan attributes to the coproc?)
   
   My original intention was that this would be a cluster-wide setting. I could change it to be overridable. 
   
   @swaroopak - if this were overridable from the client-side, what kind of test would be sufficient? Given the long running time of our index tool tests already, I'm reluctant to add yet-another-end-to-end test just to check that a flag is carried through, but I'm not sure what other options there are to actually verify that each of the several layers passes the config through properly. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446492037



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixServerBuildIndexInputFormat.java
##########
@@ -108,6 +109,12 @@ protected QueryPlan getQueryPlan(final JobContext context, final Configuration c
                 scan.setAttribute(BaseScannerRegionObserver.INDEX_RETRY_VERIFY, Bytes.toBytes(lastVerifyTimeValue));
                 scan.setAttribute(BaseScannerRegionObserver.INDEX_REBUILD_DISABLE_LOGGING_VERIFY_TYPE,
                     getDisableLoggingVerifyType(configuration).toBytes());
+                String shouldLogMaxLookbackOutput =
+                    configuration.get(IndexRebuildRegionScanner.INDEX_VERIFY_ROW_COUNTS_PER_TASK_CONF_KEY);

Review comment:
       PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS rather than INDEX_VERIFY_ROW_COUNTS_PER_TASK_CONF_KEY




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-649888902


   @kadirozde - please take a look at the change to the IndexRebuildRegionScanner when you have a moment. 
   
   When testing the max lookback counters and output logging, I found that ONLY (and I assume BEFORE, AFTER and BOTH) rebuilds were being treated as NONEs because the IndexRebuildRegionScanner wasn't setting verifyType. I'm thinking this got introduced in PHOENIX-5928, but I'm confused why the IndexToolIT.testSecondaryIndex:308 check for the valid counter in a BEFORE rebuild didn't catch this. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-651237031


   @kadirozde - I believe I have addressed all your comments. (Thanks for pointing out the auto-complete errors where the wrong conf key was being used.) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r445913187



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -152,6 +158,9 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
             verifyType = IndexTool.IndexVerifyType.fromValue(valueBytes);
             if (verifyType != IndexTool.IndexVerifyType.NONE) {
                 verify = true;
+                boolean shouldLogBeyondMaxLookbackInvalidRows =
+                    config.getBoolean(PHOENIX_LOG_BEYOND_MAX_LOOKBACK_ERROR_ROWS_CONF_KEY,

Review comment:
       @swaroopak - Added a test in latest commit. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446491820



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -96,6 +102,11 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS =
+        "phoenix.index.mr.log.beyond.max.lookback.errors";
+    public static final boolean DEFAULT_PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS = false;
+    private boolean useProto = true;
+    private byte[] indexRowKey;

Review comment:
       indexRowKey and useProto are not used




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 merged pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 merged pull request #801:
URL: https://github.com/apache/phoenix/pull/801


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r440506967



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -152,6 +158,9 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
             verifyType = IndexTool.IndexVerifyType.fromValue(valueBytes);
             if (verifyType != IndexTool.IndexVerifyType.NONE) {
                 verify = true;
+                boolean shouldLogBeyondMaxLookbackInvalidRows =
+                    config.getBoolean(PHOENIX_LOG_BEYOND_MAX_LOOKBACK_ERROR_ROWS_CONF_KEY,

Review comment:
       It is important that we have this property overridable for each IndexTool run. You could modify the existing/the one added in this PR run of IndexTool to make sure the property changes from the client-side are reflecting properly. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-649964002


   > @kadirozde - please take a look at the change to the IndexRebuildRegionScanner when you have a moment.
   > 
   > When testing the max lookback counters and output logging, I found that ONLY (and I assume BEFORE, AFTER and BOTH) rebuilds were being treated as NONEs because the IndexRebuildRegionScanner wasn't setting verifyType. I'm thinking this got introduced in PHOENIX-5928, but I'm confused why the IndexToolIT.testSecondaryIndex:308 check for the valid counter in a BEFORE rebuild didn't catch this.
   
   I see that verifyType is set in the constructor. Please make sure to check the constructor in both IndexRebuildRegionScanner and its super class GlobalIndexRegionScanner.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446770594



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixServerBuildIndexInputFormat.java
##########
@@ -108,6 +109,12 @@ protected QueryPlan getQueryPlan(final JobContext context, final Configuration c
                 scan.setAttribute(BaseScannerRegionObserver.INDEX_RETRY_VERIFY, Bytes.toBytes(lastVerifyTimeValue));
                 scan.setAttribute(BaseScannerRegionObserver.INDEX_REBUILD_DISABLE_LOGGING_VERIFY_TYPE,
                     getDisableLoggingVerifyType(configuration).toBytes());
+                String shouldLogMaxLookbackOutput =
+                    configuration.get(IndexRebuildRegionScanner.INDEX_VERIFY_ROW_COUNTS_PER_TASK_CONF_KEY);

Review comment:
       Fixed

##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolForNonTxGlobalIndexIT.java
##########
@@ -1006,6 +1011,72 @@ public void testDisableOutputLogging() throws Exception {
         }
     }
 
+    @Test
+    public void testEnableOutputLoggingForMaxLookback() throws Exception {
+        //by default we don't log invalid or missing rows past max lookback age to the
+        // PHOENIX_INDEX_TOOL table. Verify that we can flip that logging on from the client-side
+        // using a system property (such as from the command line) and have it log rows on the
+        // server-side
+        if (!mutable) {
+            return;
+        }
+        String schemaName = generateUniqueName();
+        String dataTableName = generateUniqueName();
+        String dataTableFullName = SchemaUtil.getTableName(schemaName, dataTableName);
+        String indexTableName = generateUniqueName();
+        String indexTableFullName = SchemaUtil.getTableName(schemaName, indexTableName);
+
+        try(Connection conn = DriverManager.getConnection(getUrl())) {
+            ManualEnvironmentEdge injectEdge = new ManualEnvironmentEdge();
+            injectEdge.setValue(EnvironmentEdgeManager.currentTimeMillis());
+            EnvironmentEdgeManager.injectEdge(injectEdge);
+            deleteAllRows(conn,
+                TableName.valueOf(IndexVerificationOutputRepository.OUTPUT_TABLE_NAME));
+            String stmString1 =
+                "CREATE TABLE " + dataTableFullName
+                    + " (ID INTEGER NOT NULL PRIMARY KEY, NAME VARCHAR, ZIP INTEGER) ";
+            conn.createStatement().execute(stmString1);
+
+            injectEdge.incrementValue(1L);
+            String upsertQuery = String.format("UPSERT INTO %s VALUES(?, ?, ?)", dataTableFullName);
+            PreparedStatement stmt1 = conn.prepareStatement(upsertQuery);
+
+            // insert two rows
+            IndexToolIT.upsertRow(stmt1, 1);
+            IndexToolIT.upsertRow(stmt1, 2);
+            conn.commit();
+            injectEdge.incrementValue(1L); //we have to increment time to see our writes
+            //now create an index async so it won't have the two rows in the base table
+
+            String stmtString2 =
+                String.format(
+                    "CREATE INDEX %s ON %s (LPAD(UPPER(NAME, 'en_US'),8,'x')||'_xyz') ASYNC ",
+                    indexTableName, dataTableFullName);
+            conn.createStatement().execute(stmtString2);
+            conn.commit();
+            injectEdge.incrementValue(MAX_LOOKBACK_AGE * 1000);
+            deleteAllRows(conn, TableName.valueOf(IndexVerificationOutputRepository.OUTPUT_TABLE_NAME));
+            getUtility().getConfiguration().
+                set(IndexRebuildRegionScanner.INDEX_VERIFY_ROW_COUNTS_PER_TASK_CONF_KEY, "true");

Review comment:
       Fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439600675



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexVerificationOutputRepository.java
##########
@@ -85,6 +88,15 @@
     public static final byte[] PHASE_BEFORE_VALUE = Bytes.toBytes("BEFORE");
     public static final byte[] PHASE_AFTER_VALUE = Bytes.toBytes("AFTER");
 
+
+    public enum IndexVerificationErrorType {
+        INVALID_ROW,

Review comment:
       shall we be specific with INVALID_INDEX_ROW and same for others? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439666283



##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/index/IndexVerificationOutputRepositoryIT.java
##########
@@ -53,6 +54,7 @@
 import java.util.Map;
 
 import static org.apache.phoenix.coprocessor.MetaDataProtocol.DEFAULT_LOG_TTL;
+import static org.apache.phoenix.mapreduce.index.IndexVerificationOutputRepository.IndexVerificationErrorType.*;

Review comment:
       Done

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -98,6 +103,9 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS_CONF_KEY =
+        "phoenix.log.beyond.max.lookback.age.invalid.rows";
+    public static final boolean DEFAULT_PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS = true;

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439666389



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -98,6 +103,9 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS_CONF_KEY =

Review comment:
       Switched to "ERROR" to encompass both INVALID and MISSING.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-650284537


   @kadirozde - figured out the problem. A previous rebase on this feature branch had re-introduced a private verifyType in IndexRebuildRegionScanner and that was masking the protected verifyType in the superclass. My test passes upon removing the extra verifyType. 
   
   Will get a new commit up and also submit to HadoopQA


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439675076



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -152,6 +158,9 @@ public IndexRebuildRegionScanner(final RegionScanner innerScanner, final Region
             verifyType = IndexTool.IndexVerifyType.fromValue(valueBytes);
             if (verifyType != IndexTool.IndexVerifyType.NONE) {
                 verify = true;
+                boolean shouldLogBeyondMaxLookbackInvalidRows =
+                    config.getBoolean(PHOENIX_LOG_BEYOND_MAX_LOOKBACK_ERROR_ROWS_CONF_KEY,

Review comment:
       Can this property be modified via -D option from the client-side? Good to have a test for that.  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r447110046



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -96,6 +102,11 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS =
+        "phoenix.index.mr.log.beyond.max.lookback.errors";
+    public static final boolean DEFAULT_PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS = false;
+    private boolean useProto = true;
+    private byte[] indexRowKey;

Review comment:
       Fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gokceni commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gokceni commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439562590



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -98,6 +103,9 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS_CONF_KEY =
+        "phoenix.log.beyond.max.lookback.age.invalid.rows";
+    public static final boolean DEFAULT_PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS = true;

Review comment:
       Should this be false? I feel like "beyond max lookback" is kind of analogy to debug log. Why don't we make it false?

##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/index/IndexVerificationOutputRepositoryIT.java
##########
@@ -53,6 +54,7 @@
 import java.util.Map;
 
 import static org.apache.phoenix.coprocessor.MetaDataProtocol.DEFAULT_LOG_TTL;
+import static org.apache.phoenix.mapreduce.index.IndexVerificationOutputRepository.IndexVerificationErrorType.*;

Review comment:
       Let's remove the *




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 edited a comment on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 edited a comment on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-649997131


   I see that verifyType is set in the superclass constructor, if the Scan has the right attribute, as you say. But without manually setting it in the IndexRebuildRegionScanner, I was getting job counters with scan=2, rebuilt=2, and all other counters 0, even though it was an ONLY rebuild of an never-built index of a data table with 2 rows. The debugger showed that verifyType was NONE when it got to working the rebuild sub-tasks. 
   
   Setting verifyType let the MISSING or BEYOND_MAX_LOOKBACK_AGE_MISSING counter be 2 as they should, depending on the age of the rows. 
   
   Could be a test bug on my part, or could be there's some code path where the Scan attribute isn't being set properly. I'll look further in the morning. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-650461976


   OK, @swaroopak , fixed the test bug I mentioned to you and the real bug that revealed. It also now uses Configuration to verify. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] priyankporwal commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
priyankporwal commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r440287703



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -91,13 +95,15 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import com.google.common.base.Throwables;
 import com.google.common.collect.Maps;
 
 public class IndexRebuildRegionScanner extends GlobalIndexRegionScanner {
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_LOG_BEYOND_MAX_LOOKBACK_ERROR_ROWS_CONF_KEY =
+        "phoenix.log.beyond.max.lookback.age.error.rows";

Review comment:
       Can we please have this config be named consistently like the previous one? 
   "phoenix.mr.index.IndexDisableLoggingType=BEFORE" .. It would make it easier to find and put those together whenever they need non-default values.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446469639



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -96,6 +102,11 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS =
+        "phoenix.index.mr.log.beyond.max.lookback.errors";
+    public static final boolean DEFAULT_PHOENIX_INDEX_MR_LOG_BEYOND_MAX_LOOKBACK_ERRORS = false;
+    private boolean useProto = true;
+    private byte[] indexRowKey;

Review comment:
       nit: looks like this made its way in the rebase?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446491937



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -1223,7 +1254,7 @@ public int prepareIndexMutations(Put put, Delete del, Map<byte[], List<Mutation>
                                      Set<byte[]> mostRecentIndexRowKeys) throws IOException {
         List<Mutation> indexMutations = prepareIndexMutationsForRebuild(indexMaintainer, put, del);
         boolean mostRecentDone = false;
-        // Do not populate mostRecentIndexRowKeys when verifyType != IndexTool.IndexVerifyType.ONLY
+        // Do not populate mostRecentIndexRowKeys when verifyType is NONE or AFTER

Review comment:
       Thanks for correcting the comment




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #801:
URL: https://github.com/apache/phoenix/pull/801#issuecomment-649997131


   I see that verifyType is set in the superclass constructor, if the Scan has the right attribute, as you say. But without manually setting it in the IndexRebuildRegionScanner, I was getting job counters with scan=2, rebuilt=2, and all other counters 0, even though it was an ONLY rebuild of an never-built index of a data table with 2 rows. Setting verifyType let the MISSING or BEYOND_MAX_LOOKBACK_AGE_MISSING counter be 2 as they should, depending on the age of the rows. 
   
   Could be a test bug on my part, or could be there's some code path where the Scan attribute isn't being set properly. I'll look further in the morning. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r439600279



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -98,6 +103,9 @@
 
     private static final Logger LOGGER = LoggerFactory.getLogger(IndexRebuildRegionScanner.class);
 
+    public static final String PHOENIX_LOG_BEYOND_MAX_LOOKBACK_INVALID_ROWS_CONF_KEY =

Review comment:
       invalid and missing, right? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #801: PHOENIX-5951 - Index rebuild output logging for past-max-lookback row…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #801:
URL: https://github.com/apache/phoenix/pull/801#discussion_r446491581



##########
File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolForNonTxGlobalIndexIT.java
##########
@@ -1006,6 +1011,72 @@ public void testDisableOutputLogging() throws Exception {
         }
     }
 
+    @Test
+    public void testEnableOutputLoggingForMaxLookback() throws Exception {
+        //by default we don't log invalid or missing rows past max lookback age to the
+        // PHOENIX_INDEX_TOOL table. Verify that we can flip that logging on from the client-side
+        // using a system property (such as from the command line) and have it log rows on the
+        // server-side
+        if (!mutable) {
+            return;
+        }
+        String schemaName = generateUniqueName();
+        String dataTableName = generateUniqueName();
+        String dataTableFullName = SchemaUtil.getTableName(schemaName, dataTableName);
+        String indexTableName = generateUniqueName();
+        String indexTableFullName = SchemaUtil.getTableName(schemaName, indexTableName);
+
+        try(Connection conn = DriverManager.getConnection(getUrl())) {
+            ManualEnvironmentEdge injectEdge = new ManualEnvironmentEdge();
+            injectEdge.setValue(EnvironmentEdgeManager.currentTimeMillis());
+            EnvironmentEdgeManager.injectEdge(injectEdge);
+            deleteAllRows(conn,
+                TableName.valueOf(IndexVerificationOutputRepository.OUTPUT_TABLE_NAME));
+            String stmString1 =
+                "CREATE TABLE " + dataTableFullName
+                    + " (ID INTEGER NOT NULL PRIMARY KEY, NAME VARCHAR, ZIP INTEGER) ";
+            conn.createStatement().execute(stmString1);
+
+            injectEdge.incrementValue(1L);
+            String upsertQuery = String.format("UPSERT INTO %s VALUES(?, ?, ?)", dataTableFullName);
+            PreparedStatement stmt1 = conn.prepareStatement(upsertQuery);
+
+            // insert two rows
+            IndexToolIT.upsertRow(stmt1, 1);
+            IndexToolIT.upsertRow(stmt1, 2);
+            conn.commit();
+            injectEdge.incrementValue(1L); //we have to increment time to see our writes
+            //now create an index async so it won't have the two rows in the base table
+
+            String stmtString2 =
+                String.format(
+                    "CREATE INDEX %s ON %s (LPAD(UPPER(NAME, 'en_US'),8,'x')||'_xyz') ASYNC ",
+                    indexTableName, dataTableFullName);
+            conn.createStatement().execute(stmtString2);
+            conn.commit();
+            injectEdge.incrementValue(MAX_LOOKBACK_AGE * 1000);
+            deleteAllRows(conn, TableName.valueOf(IndexVerificationOutputRepository.OUTPUT_TABLE_NAME));
+            getUtility().getConfiguration().
+                set(IndexRebuildRegionScanner.INDEX_VERIFY_ROW_COUNTS_PER_TASK_CONF_KEY, "true");

Review comment:
       This is not the right conf key to set here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org