You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/02/16 08:05:53 UTC

[GitHub] [hive] pvargacl opened a new pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

pvargacl opened a new pull request #1979:
URL: https://github.com/apache/hive/pull/1979


   
   
   ### What changes were proposed in this pull request?
   Fix how HIVE_COMPACTOR_COMPACT_MM will disable compaction for insert only tables.
   
   
   ### Why are the changes needed?
   Currently if this property is disabled, compaction will be initiated for insert_only tables and fail in MR based compaction
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Current unit tests
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] klcopp commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
klcopp commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576972366



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       @deniskuzZ  I guess the original intent for this config was as a feature flag. Since compaction _is_ unreliable at the moment I don't think it's a tragedy if we keep it for a bit longer. Why don't we open a ticket for its removal and do it when we feel ready?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576638351



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       no until the underlying bug is fixed, but it would enable the full acid tables to work properly




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576624589



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       Does this config makes any sense or just adds extra hassle for end users - "Whether the compactor should compact insert-only tables. A safety switch."? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576993768



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       @klcopp, that sounds good! My point was that once compaction is stable, we should probably revisit some of these toggles.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576634363



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       but what would happen to insert only tables, they won't be compacted ever?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576668055



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       1. This switch is already there it will just cause ClassCastException, rather than handle this properly
   2. This customer is using MR based compaction which works fine, the problem is for mm tables the system will run query based no matter what, we want to turn off query based compaction for every table, until they can get the fix to their cluster, which is a very valid thing to do, otherwise compaction will not run for anything.
   3. Using a switch like this until the upgrade is much better solution then disabling it off for 1000+ tables manually




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576703148



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       Just my thought on that, feel free to ignore them:
   If something is already there doesn't mean it's a good practice. The presence of a switch just indicates how unreliable compaction functionality is that we still need to maintain feature flag. 
   Do we have similar toggle just for MR based compaction? :)
   Until the upgrade they should probably avoid using MM tables. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576708599



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java
##########
@@ -1200,6 +1202,36 @@ public void resolveUserToRunAs() throws Exception {
     Assert.assertEquals(tableName, compacts.get(0).getTablename());
   }
 
+  @Test

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576631100



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       I actually have a customer where this would be very useful. MM compaction makes the Workers to hung indefinitely on their system and since all worker thread stops, compaction is down for every table.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl merged pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl merged pull request #1979:
URL: https://github.com/apache/hive/pull/1979


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576643788



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       what kind of underlying bug? in most cases query based compaction is used for full acid tables as well. this config looks like a real hack to ignore the problem that over time could lead to a more serious issue with accumulated data to be compacted and performance degradation. If we want to improve compaction usability I don't think that's a good approach, but that's just my 10 cents. From code point of view patch looks good to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576651343



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -469,6 +469,12 @@ private boolean isEligibleForCompaction(CompactionInfo ci, ShowCompactResponse c
             "=true so we will not compact it.");
         return false;
       }
+      if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf

Review comment:
       btw, if needed, they could disable auto-compaction on table level




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] klcopp commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
klcopp commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576638280



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##########
@@ -533,6 +533,10 @@ protected Boolean findNextCompactionAndExecute(boolean computeStats) throws Inte
         todo Find a more generic approach to collecting files in the same logical bucket to compact within the same
         task (currently we're using Tez split grouping).
         */
+        if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf
+            .getBoolVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_COMPACT_MM)) {
+          throw new HiveException("Insert only compaction is disabled. Set hive.compactor.compact.insert.only to true to enable it.");
+        }

Review comment:
       Could we put this logic in the QueryCompactorFactory for better readability?

##########
File path: ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java
##########
@@ -1200,6 +1202,36 @@ public void resolveUserToRunAs() throws Exception {
     Assert.assertEquals(tableName, compacts.get(0).getTablename());
   }
 
+  @Test

Review comment:
       Could you add a test for the check in Worker as well?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvargacl commented on a change in pull request #1979: HIVE-24785: Fix HIVE_COMPACTOR_COMPACT_MM property

Posted by GitBox <gi...@apache.org>.
pvargacl commented on a change in pull request #1979:
URL: https://github.com/apache/hive/pull/1979#discussion_r576708427



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##########
@@ -533,6 +533,10 @@ protected Boolean findNextCompactionAndExecute(boolean computeStats) throws Inte
         todo Find a more generic approach to collecting files in the same logical bucket to compact within the same
         task (currently we're using Tez split grouping).
         */
+        if (AcidUtils.isInsertOnlyTable(t.getParameters()) && !HiveConf
+            .getBoolVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_COMPACT_MM)) {
+          throw new HiveException("Insert only compaction is disabled. Set hive.compactor.compact.insert.only to true to enable it.");
+        }

Review comment:
       done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org