You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/07/20 07:54:57 UTC

[GitHub] [hive] SourabhBadhya opened a new pull request, #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

SourabhBadhya opened a new pull request, #3457:
URL: https://github.com/apache/hive/pull/3457

   …f uncommitted data
   
   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://cwiki.apache.org/confluence/display/Hive/HowToContribute
     2. Ensure that you have created an issue on the Hive project JIRA: https://issues.apache.org/jira/projects/HIVE/summary
     3. Ensure you have added or run the appropriate tests for your PR: 
     4. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]HIVE-XXXXX:  Your PR title ...'.
     5. Be sure to keep the PR description updated to reflect all changes.
     6. Please write your PR title to summarize what this PR proposes.
     7. If possible, provide a concise example to reproduce the issue for a faster review.
   
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data when the failure happens before creation of table and after write.
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   The data will lie around without cleanup and will accumulate.
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description, screenshot and/or a reproducable example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Hive versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   No
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   Unit test
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925344280


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Yes you are right.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r931126498


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   > btw, would it be hard to create a completionHook similar to Iceberg one?
   
   We could create one but it would include failures only within Query execution.
   Anything done after query execution (post execution activities) will not be within its scope, which is why I disregarded the Hook approach.
   
   The hooks are used as part of finally block here - 
   https://github.com/apache/hive/blob/b197ed86029f07696e326acb5878f86c286e9e1a/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L118
   
   Cleanup will then be dependent on a HiveConf - `hive.query.lifetime.hooks`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934526189


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();
+      boolean isCTAS = ctx.getHookContext().getQueryState().getCommandType()

Review Comment:
   ````
   ctx.getHookContext().getQueryPlan().getQueryProperties().isCTAS()
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933142617


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {

Review Comment:
   can we make it a generic one: HiveQueryLifeTimeHook?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933135808


##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   `addNecessaryHooks()` will have all the hooks that are absolutely required from now on. Thought this would be a good way to go about. But let me know if this function required so that I can change it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927295389


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Currently `Context ctx` is not available during `rollbackTxn` which is why I chose to store the object.
   However I agree passing `Context ctx` is better.
   There are multiple ways of doing this - 
   First would be to involve passing `Context ctx` to `rollbackTxn` method which would change the HiveTxnManager API itself (I particularly dont like this since this would be a breaking change).
   
   Or we could create a new function in the `HiveTxnManager` interface of the same name and call it from the driver when rollback conditions are satisfied.
   
   My idea was to avoid both and initialise the `destinationTable` in one of the existing APIs but I am open for any other suggestions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933146801


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CtasQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    HiveConf conf = ctx.getHiveConf();
+    PrivateHookContext privateHookContext = (PrivateHookContext) ctx.getHookContext();
+    Context context = privateHookContext.getContext();
+
+    if (hasError && HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table table = context.getDestinationTable();
+      if (table != null) {
+        LOG.info("Performing cleanup as part of rollback: {}", table.getFullTableName().toString());
+        try {
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   can we extract TxnStore into the constructor or init method? that would be really expensive to create a new one, maybe consider ThreadLocal



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934530893


##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -191,6 +191,8 @@ public class Context {
 
   private List<Pair<String, String>> parsedTables = new ArrayList<>();
 
+  private Path destinationPath;

Review Comment:
   change to `location`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925344280


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Yes you are correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ merged pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ merged PR #3457:
URL: https://github.com/apache/hive/pull/3457


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
pvary commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925331780


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Do I understand correctly, that we do not create a table for CTAS, we just create a `destinationTable` object which behaves as a table, but it is not created in the HMS?
   So basically we do not really have to drop the table, we just have to drop the directory?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927401945


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
+          txnHandler.submitForCleanup(rqst, destinationTable.getTTable().getWriteId(), getCurrentTxnId());
+        } catch (InterruptedException | IOException | MetaException e) {
+          throw new RuntimeException("Not able to submit cleanup operation of directory written by CTAS");

Review Comment:
   All three exceptions are thrown within the code. Changed it to LockException instead of RuntimeException.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935200343


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {
+      for (boolean var2 : booleans) {
+        for (boolean var3 : booleans) {
+          for (boolean var4 : booleans) {
+            failureScenarioCleanupCTAS(var1, var2, var3, var4);
+          }
+        }
+      }
+    }
+  }
+
+  public void failureScenarioCleanupCTAS(boolean isPartitioned,
+                                         boolean isDirectInsertEnabled,
+                                         boolean isLocklessReadsEnabled,
+                                         boolean isLocationUsed) throws Exception {
+    String tableName = "atable";
+
+    //Set configurations
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_DIRECT_INSERT_ENABLED, isDirectInsertEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED, isLocklessReadsEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.TXN_CTAS_X_LOCK, true);
+    hiveConf.setIntVar(HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD, 0);
+
+    // Add a '1' at the end of table name for custom location.
+    String querylocation = (isLocationUsed) ? " location '" + getWarehouseDir() + "/" + tableName + "1'" : "";
+    String queryPartitions = (isPartitioned) ? " partitioned by (a)" : "";
+
+    d.run("insert into " + Table.ACIDTBL + "(a,b) values (3,4)");
+    d.run("drop table if exists " + tableName);
+    d.compileAndRespond("create table " + tableName + queryPartitions + " stored as orc" + querylocation +
+            " tblproperties ('transactional'='true') as select * from " + Table.ACIDTBL);
+    long txnId = d.getQueryState().getTxnManager().getCurrentTxnId();
+    DriverContext driverContext = d.getDriverContext();
+    traverseTasksRecursively(driverContext.getPlan().getRootTasks());
+    int assertError = 0;
+    try {
+      d.run();
+    } catch (Exception e) {
+      assertError = 1;
+    }
+
+    runInitiator(hiveConf);

Review Comment:
   Removed. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934330597


##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java:
##########
@@ -3478,6 +3478,17 @@ CompactionResponse compact2(String dbname, String tableName, String partitionNam
    */
   ShowCompactResponse showCompactions() throws TException;
 
+  /**
+   * Submit a request for performing cleanup of output directory. This is particularly
+   * useful for CTAS when the query fails after write and before creation of table.
+   * @return Status of whether the request was successfully submitted. True indicates
+   * the request was successfully submitted and false indicates failure of request submitted.
+   * @throws TException
+   */
+  boolean submitForCleanup(String dbname, String tableName, CompactionType type,

Review Comment:
   As discussed offline, this was already done. The request is created inside this function which actually makes the request to the HMS.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934537997


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();

Review Comment:
   call it after checking the location and operationType. Make sure it's not empty



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934585516


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {

Review Comment:
   do we really need to cover all 16 combinations?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930799014


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   btw, would it be hard to create a completionHook similar to Iceberg one?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927309557


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   The idea is that `destinationTable` object is set only when CTAS operations are performed ([here](https://github.com/apache/hive/pull/3457/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7856)). So no inherent need of checking whether the operation is CTAS.
   
   As far as exclusive lock is enabled, it would be right to perform cleanup when a exclusive lock is acquired otherwise we might have a situation wherein the cleaner is cleaning and concurrent CTAS operations write to the same location causing issues.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927504663


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +496,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() throws LockException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
+          txnHandler.submitForCleanup(rqst, destinationTable.getTTable().getWriteId(), getCurrentTxnId());
+        } catch (InterruptedException | IOException | MetaException e) {

Review Comment:
   we could simply do next:
   ````
   private void cleanupOutputDir() throws MetaException {
   ......
           } catch (InterruptedException | IOException e) {
             throwMetaException(e);
           }
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927487444


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7852,6 +7852,10 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
           throw new SemanticException("Error while getting the full qualified path for the given directory: " + ex.getMessage());
         }
       }
+
+      if (!isNonNativeTable && AcidUtils.isTransactionalTable(destinationTable) && qb.isCTAS()) {

Review Comment:
   is there a reason, why we should exclude them? additional Iceberg-related changes are required?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927498250


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +496,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() throws LockException {

Review Comment:
   I would make the method name more generic, like `cleanupResultDir`/ `cleanupOutputDir`, so it could be extended in the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934484243


##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java:
##########
@@ -3478,6 +3478,17 @@ CompactionResponse compact2(String dbname, String tableName, String partitionNam
    */
   ShowCompactResponse showCompactions() throws TException;
 
+  /**
+   * Submit a request for performing cleanup of output directory. This is particularly
+   * useful for CTAS when the query fails after write and before creation of table.
+   * @return Status of whether the request was successfully submitted. True indicates
+   * the request was successfully submitted and false indicates failure of request submitted.
+   * @throws TException
+   */
+  boolean submitForCleanup(String dbname, String tableName, CompactionType type,

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928844426


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   You mean retain this function - 
   `void rollbackTxn() throws LockException;`
    As well as create a new function -
   `void rollbackTxn(Context ctx) throws LockException;`  in the interface?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934210285


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CtasQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    HiveConf conf = ctx.getHiveConf();
+    PrivateHookContext privateHookContext = (PrivateHookContext) ctx.getHookContext();
+    Context context = privateHookContext.getContext();
+
+    if (hasError && HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table table = context.getDestinationTable();
+      if (table != null) {
+        LOG.info("Performing cleanup as part of rollback: {}", table.getFullTableName().toString());
+        try {
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   Implemented API for this in HiveMetastoreClient - `submitForCleanup`. Made use of this API to send the cleanup request. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934330597


##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java:
##########
@@ -3478,6 +3478,17 @@ CompactionResponse compact2(String dbname, String tableName, String partitionNam
    */
   ShowCompactResponse showCompactions() throws TException;
 
+  /**
+   * Submit a request for performing cleanup of output directory. This is particularly
+   * useful for CTAS when the query fails after write and before creation of table.
+   * @return Status of whether the request was successfully submitted. True indicates
+   * the request was successfully submitted and false indicates failure of request submitted.
+   * @throws TException
+   */
+  boolean submitForCleanup(String dbname, String tableName, CompactionType type,

Review Comment:
   As discussed offline, this was already done. The request is created inside this function which actually makes the request to the HMS.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935414100


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {

Review Comment:
   Done. Nice use of bitwise operators :) .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935413606


##########
ql/src/java/org/apache/hadoop/hive/ql/Driver.java:
##########
@@ -905,6 +905,11 @@ public QueryDisplay getQueryDisplay() {
     return driverContext.getQueryDisplay();
   }
 
+  @VisibleForTesting
+  DriverContext getDriverContext() {

Review Comment:
   Removed. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934683229


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {
+      for (boolean var2 : booleans) {
+        for (boolean var3 : booleans) {
+          for (boolean var4 : booleans) {
+            failureScenarioCleanupCTAS(var1, var2, var3, var4);
+          }
+        }
+      }
+    }
+  }
+
+  public void failureScenarioCleanupCTAS(boolean isPartitioned,
+                                         boolean isDirectInsertEnabled,
+                                         boolean isLocklessReadsEnabled,
+                                         boolean isLocationUsed) throws Exception {
+    String tableName = "atable";
+
+    //Set configurations
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_DIRECT_INSERT_ENABLED, isDirectInsertEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED, isLocklessReadsEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.TXN_CTAS_X_LOCK, true);
+    hiveConf.setIntVar(HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD, 0);
+
+    // Add a '1' at the end of table name for custom location.
+    String querylocation = (isLocationUsed) ? " location '" + getWarehouseDir() + "/" + tableName + "1'" : "";
+    String queryPartitions = (isPartitioned) ? " partitioned by (a)" : "";
+
+    d.run("insert into " + Table.ACIDTBL + "(a,b) values (3,4)");
+    d.run("drop table if exists " + tableName);
+    d.compileAndRespond("create table " + tableName + queryPartitions + " stored as orc" + querylocation +
+            " tblproperties ('transactional'='true') as select * from " + Table.ACIDTBL);
+    long txnId = d.getQueryState().getTxnManager().getCurrentTxnId();
+    DriverContext driverContext = d.getDriverContext();
+    traverseTasksRecursively(driverContext.getPlan().getRootTasks());

Review Comment:
   could be simplified with
   ````
   MapRedTask mrtask = Mockito.spy(new MapRedTask());
       Mockito.doThrow(new RuntimeException()).when(mrtask).execute();
       driverContext.getPlan().setRootTasks(Lists.newArrayList(mrtask));
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935198844


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7983,7 +7984,17 @@ protected boolean enableColumnStatsCollecting() {
     return true;
   }
 
-  private Path getCtasLocation(CreateTableDesc tblDesc) throws SemanticException {
+  private Path getCtasLocation(CreateTableDesc tblDesc, boolean createTableWithSuffix) throws SemanticException {
+    Path destinationPath = getCtasLocationWithoutSuffix(tblDesc);
+    if (createTableWithSuffix) {
+      long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+      String suffix = AcidUtils.getPathSuffix(txnId);
+      destinationPath = new Path(destinationPath.toString() + suffix);
+    }
+    return destinationPath;
+  }
+
+  private Path getCtasLocationWithoutSuffix(CreateTableDesc tblDesc) throws SemanticException {

Review Comment:
   Done



##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();
+      boolean isCTAS = ctx.getHookContext().getQueryState().getCommandType()

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933146801


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CtasQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    HiveConf conf = ctx.getHiveConf();
+    PrivateHookContext privateHookContext = (PrivateHookContext) ctx.getHookContext();
+    Context context = privateHookContext.getContext();
+
+    if (hasError && HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table table = context.getDestinationTable();
+      if (table != null) {
+        LOG.info("Performing cleanup as part of rollback: {}", table.getFullTableName().toString());
+        try {
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   You shouldn't use TxnStore here, but mscClient. see DbTxnManager.getMS()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928844426


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   You mean retain this function - 
   `void rollbackTxn() throws LockException;`
    As well as create a -
   `void rollbackTxn(Context ctx) throws LockException;`  in the interface?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928839830


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   this is a change in API, we should keep the original signature



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934530893


##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -191,6 +191,8 @@ public class Context {
 
   private List<Pair<String, String>> parsedTables = new ArrayList<>();
 
+  private Path destinationPath;

Review Comment:
   change to `location`, please   check what is resDir, maybe it's the same
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934588653


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {

Review Comment:
   Its always better to check all cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935205909


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.FileSinkDesc;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+    QueryPlan queryPlan = ctx.getHookContext().getQueryPlan();
+    if (queryPlan.getAcidSinks() != null && queryPlan.getAcidSinks().size() > 0) {
+      FileSinkDesc fileSinkDesc = queryPlan.getAcidSinks().iterator().next();
+      Table table = fileSinkDesc.getTable();
+      long writeId = fileSinkDesc.getTableWriteId();
+      boolean isCTAS = ctx.getHookContext().getQueryPlan().getQueryProperties().isCTAS();
+      Path destinationPath = pCtx.getContext().getLocation();
+
+      if (destinationPath != null && table != null && isCTAS &&
+              HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+        LOG.info("Performing cleanup as part of rollback: {}", table.getFullTableName().toString());
+        try {
+          CompactionRequest rqst = new CompactionRequest(table.getDbName(), table.getTableName(),
+                  CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationPath.toString(), table.getTTable(), conf));
+          rqst.putToProperties(META_TABLE_LOCATION, destinationPath.toString());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          boolean success = Hive.get(conf).getMSC().submitForCleanup(rqst, writeId,
+                  pCtx.getQueryState().getTxnManager().getCurrentTxnId());
+          if (success) {
+            LOG.info("The cleanup request has been submitted");
+          } else {
+            LOG.info("The cleanup request has not been submitted");
+          }
+        } catch (HiveException | IOException | InterruptedException | TException e) {

Review Comment:
   we shouldn't catch InterruptedException



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934518439


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7983,7 +7984,17 @@ protected boolean enableColumnStatsCollecting() {
     return true;
   }
 
-  private Path getCtasLocation(CreateTableDesc tblDesc) throws SemanticException {
+  private Path getCtasLocation(CreateTableDesc tblDesc, boolean createTableWithSuffix) throws SemanticException {
+    Path destinationPath = getCtasLocationWithoutSuffix(tblDesc);
+    if (createTableWithSuffix) {
+      long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+      String suffix = AcidUtils.getPathSuffix(txnId);
+      destinationPath = new Path(destinationPath.toString() + suffix);
+    }
+    return destinationPath;
+  }
+
+  private Path getCtasLocationWithoutSuffix(CreateTableDesc tblDesc) throws SemanticException {

Review Comment:
   please keep it as `getCtasLocation` and call overloaded getCtasLocation(tblDesc, false)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935199794


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();

Review Comment:
   Made sure that there is atleast one `AcidSink`. Done



##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933133215


##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   just add it directly



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
pvary commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925400948


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Thanks for the explanation!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928844426


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   You mean retain this function - 
   `void rollbackTxn() throws LockException;`
    As well as create a new function -
   `void rollbackTxn(Context ctx) throws LockException;`  in the interface?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934269159


##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java:
##########
@@ -3478,6 +3478,17 @@ CompactionResponse compact2(String dbname, String tableName, String partitionNam
    */
   ShowCompactResponse showCompactions() throws TException;
 
+  /**
+   * Submit a request for performing cleanup of output directory. This is particularly
+   * useful for CTAS when the query fails after write and before creation of table.
+   * @return Status of whether the request was successfully submitted. True indicates
+   * the request was successfully submitted and false indicates failure of request submitted.
+   * @throws TException
+   */
+  boolean submitForCleanup(String dbname, String tableName, CompactionType type,

Review Comment:
   please use a request object, so it could be modified in future
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934566867


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();

Review Comment:
   extract queryPlan into a local var for readability
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935205909


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.FileSinkDesc;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+    QueryPlan queryPlan = ctx.getHookContext().getQueryPlan();
+    if (queryPlan.getAcidSinks() != null && queryPlan.getAcidSinks().size() > 0) {
+      FileSinkDesc fileSinkDesc = queryPlan.getAcidSinks().iterator().next();
+      Table table = fileSinkDesc.getTable();
+      long writeId = fileSinkDesc.getTableWriteId();
+      boolean isCTAS = ctx.getHookContext().getQueryPlan().getQueryProperties().isCTAS();
+      Path destinationPath = pCtx.getContext().getLocation();
+
+      if (destinationPath != null && table != null && isCTAS &&
+              HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+        LOG.info("Performing cleanup as part of rollback: {}", table.getFullTableName().toString());
+        try {
+          CompactionRequest rqst = new CompactionRequest(table.getDbName(), table.getTableName(),
+                  CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationPath.toString(), table.getTTable(), conf));
+          rqst.putToProperties(META_TABLE_LOCATION, destinationPath.toString());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          boolean success = Hive.get(conf).getMSC().submitForCleanup(rqst, writeId,
+                  pCtx.getQueryState().getTxnManager().getCurrentTxnId());
+          if (success) {
+            LOG.info("The cleanup request has been submitted");
+          } else {
+            LOG.info("The cleanup request has not been submitted");
+          }
+        } catch (HiveException | IOException | InterruptedException | TException e) {

Review Comment:
   we shouldn't catch InterruptedException



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925344280


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Yes you are right. Since this scenario (test case written) is about the case when failure occurs before creation of table and after write, the directory must only be deleted.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927489420


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   shouldn't we do cleanup when there is no concurrent CTAS, but the operation was canceled by the user in the middle?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927568434


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   Yes, the cleanup is for cases when there is no concurrent CTAS and a single query fails. But if we disable exclusive locking and perform concurrent CTAS operations and let's say the first query fails, then cleanup is triggered by the first query on the same location and the second query will write to the same location. 
   
   This is the situation I want to avoid which is why perform cleanup only when the exclusive locking on CTAS is enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930790946


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   true, however, Impala could still be using an old API
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928844426


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   You mean retain this function - 
   `void rollbackTxn() throws LockException;`
    As well as create a -
   `void rollbackTxn(Context ctx) throws LockException;` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928497740


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927485805


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   As you mentioned, we need to add an overloaded `rollbackTxn` method into the `HiveTxnManager` interface that accepts `Context ctx` param.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927568434


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   Yes, the cleanup is for cases when there is no concurrent CTAS and a single query fails. But if we disable exclusive locking and perform concurrent CTAS operations and let's say the first query fails, then cleanup is triggered by the first query on the same location as well the second query will write to the same location. 
   
   This is the situation I want to avoid which is why perform cleanup only when the exclusive locking on CTAS is enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r925356465


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Creation of table happens here - As part of DDLTask. 
   https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CreateTableOperation.java#L104-L117
   
   The test case I have written is essentially manipulating the contents of CreateTableDesc to null so that creation of table fails. But in reality, if there is a failure before creation of table then the data written will stay and is not associated to any table and is never cleaned up.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927592887


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7852,6 +7852,10 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
           throw new SemanticException("Error while getting the full qualified path for the given directory: " + ex.getMessage());
         }
       }
+
+      if (!isNonNativeTable && AcidUtils.isTransactionalTable(destinationTable) && qb.isCTAS()) {

Review Comment:
   I enquired with @lcspinter offline, Iceberg tables are treated as external tables and specifically for Iceberg tables, there is a implementation of cleanup here - 
   https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergQueryLifeTimeHook.java



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933135808


##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   `addNecessaryHooks()` will have all the hooks that are absolutely required from now on. Thought this would be a good way to go about. But let me know if this function is required so that I can change it.



##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   Done



##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930796903


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   that would be expensive to create a new txnStore everytime



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930794617


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -598,6 +627,16 @@ public void rollbackTxn() throws LockException {
     }
   }
 
+  @Override
+  public void rollbackTxn(Context ctx) throws LockException {
+    try {
+      cleanupOutputDir(ctx);
+    } catch (TException e) {
+      throw new LockException(ErrorMsg.METASTORE_COMMUNICATION_FAILED.getMsg(), e);
+    }
+    rollbackTxn();

Review Comment:
   we should call rollback first, and then try the cleanup, if cleanup fails, we'll never mark a txn as aborted 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935221669


##########
ql/src/java/org/apache/hadoop/hive/ql/Driver.java:
##########
@@ -905,6 +905,11 @@ public QueryDisplay getQueryDisplay() {
     return driverContext.getQueryDisplay();
   }
 
+  @VisibleForTesting
+  DriverContext getDriverContext() {

Review Comment:
   no need to expose context in test. if you need queryPlan, you can simply call:
   ````
   driver.getPlan()
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934530893


##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -191,6 +191,8 @@ public class Context {
 
   private List<Pair<String, String>> parsedTables = new ArrayList<>();
 
+  private Path destinationPath;

Review Comment:
   change to `location`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r926391951


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   could we pass a `Context ctx` here instead of initializing the destination table when acquiring locks? 



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   shouldn't we check that this was a CTAS operation? do we only need to cleanup if an exclusive lock is enabled?



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());

Review Comment:
   use hive_metastoreConstants.META_TABLE_LOCATION?



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -29,19 +29,10 @@ Licensed to the Apache Software Foundation (ASF) under one
 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
 import org.apache.hadoop.hive.metastore.LockComponentBuilder;
 import org.apache.hadoop.hive.metastore.LockRequestBuilder;
-import org.apache.hadoop.hive.metastore.api.LockComponent;
-import org.apache.hadoop.hive.metastore.api.LockResponse;
-import org.apache.hadoop.hive.metastore.api.LockState;
-import org.apache.hadoop.hive.metastore.api.MetaException;
-import org.apache.hadoop.hive.metastore.api.NoSuchLockException;
-import org.apache.hadoop.hive.metastore.api.NoSuchTxnException;
-import org.apache.hadoop.hive.metastore.api.TxnAbortedException;
-import org.apache.hadoop.hive.metastore.api.TxnToWriteId;
-import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
-import org.apache.hadoop.hive.metastore.api.DataOperationType;
-import org.apache.hadoop.hive.metastore.api.GetOpenTxnsResponse;
-import org.apache.hadoop.hive.metastore.api.TxnType;
+import org.apache.hadoop.hive.metastore.api.*;

Review Comment:
   please avoid wildcard imports



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
+          txnHandler.submitForCleanup(rqst, destinationTable.getTTable().getWriteId(), getCurrentTxnId());
+        } catch (InterruptedException | IOException | MetaException e) {
+          throw new RuntimeException("Not able to submit cleanup operation of directory written by CTAS");

Review Comment:
   should we catch just InterruptedException | IOException and re-throw MetaException here?



##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7852,6 +7852,10 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
           throw new SemanticException("Error while getting the full qualified path for the given directory: " + ex.getMessage());
         }
       }
+
+      if (!isNonNativeTable && AcidUtils.isTransactionalTable(destinationTable) && qb.isCTAS()) {

Review Comment:
   is `!isNonNativeTable` required to exclude Iceberg tables?



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));

Review Comment:
   would be good if we move "ifPurge" under constants as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927401945


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
+          txnHandler.submitForCleanup(rqst, destinationTable.getTTable().getWriteId(), getCurrentTxnId());
+        } catch (InterruptedException | IOException | MetaException e) {
+          throw new RuntimeException("Not able to submit cleanup operation of directory written by CTAS");

Review Comment:
   All three exceptions are thrown within the `try-catch` code. Changed it to LockException instead of RuntimeException.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927387412


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -7852,6 +7852,10 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
           throw new SemanticException("Error while getting the full qualified path for the given directory: " + ex.getMessage());
         }
       }
+
+      if (!isNonNativeTable && AcidUtils.isTransactionalTable(destinationTable) && qb.isCTAS()) {

Review Comment:
   Yes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r928497915


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +496,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() throws LockException {

Review Comment:
   Done



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +496,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() throws LockException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
+          txnHandler.submitForCleanup(rqst, destinationTable.getTTable().getWriteId(), getCurrentTxnId());
+        } catch (InterruptedException | IOException | MetaException e) {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933143569


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CtasQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    HiveConf conf = ctx.getHiveConf();

Review Comment:
   ````
       if (hasError) {
         checkAndRollbackCTAS(ctx);
       }
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r931126498


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   > btw, would it be hard to create a completionHook similar to Iceberg one?
   
   We could create one but it would include failures only within Query execution.
   Anything done after query execution (post execution activities like releasing locks) will not be within its scope, which is why I disregarded the Hook approach.
   
   The hooks are used as part of finally block here - 
   https://github.com/apache/hive/blob/b197ed86029f07696e326acb5878f86c286e9e1a/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L118
   
   Cleanup will then be dependent on a HiveConf - `hive.query.lifetime.hooks`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934564806


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   check this only if it's CTAS, move it under the location and operationType check



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934564806


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {

Review Comment:
   check this if it's CTAS, move it under the location and operationType check



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934568273


##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -191,6 +191,8 @@ public class Context {
 
   private List<Pair<String, String>> parsedTables = new ArrayList<>();
 
+  private Path destinationPath;

Review Comment:
   change this to `location`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935201293


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {
+      for (boolean var2 : booleans) {
+        for (boolean var3 : booleans) {
+          for (boolean var4 : booleans) {
+            failureScenarioCleanupCTAS(var1, var2, var3, var4);
+          }
+        }
+      }
+    }
+  }
+
+  public void failureScenarioCleanupCTAS(boolean isPartitioned,
+                                         boolean isDirectInsertEnabled,
+                                         boolean isLocklessReadsEnabled,
+                                         boolean isLocationUsed) throws Exception {
+    String tableName = "atable";
+
+    //Set configurations
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_DIRECT_INSERT_ENABLED, isDirectInsertEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED, isLocklessReadsEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.TXN_CTAS_X_LOCK, true);
+    hiveConf.setIntVar(HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD, 0);
+
+    // Add a '1' at the end of table name for custom location.
+    String querylocation = (isLocationUsed) ? " location '" + getWarehouseDir() + "/" + tableName + "1'" : "";
+    String queryPartitions = (isPartitioned) ? " partitioned by (a)" : "";
+
+    d.run("insert into " + Table.ACIDTBL + "(a,b) values (3,4)");
+    d.run("drop table if exists " + tableName);
+    d.compileAndRespond("create table " + tableName + queryPartitions + " stored as orc" + querylocation +
+            " tblproperties ('transactional'='true') as select * from " + Table.ACIDTBL);
+    long txnId = d.getQueryState().getTxnManager().getCurrentTxnId();
+    DriverContext driverContext = d.getDriverContext();
+    traverseTasksRecursively(driverContext.getPlan().getRootTasks());

Review Comment:
   Assigning a task like this wont help us in simulating the error because the write will not happen. However, I have replaced the failure in DDLTask my mocking in a similar way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934701343


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {

Review Comment:
   could we create parametrized test:
   ````
   private static Stream<Arguments> generateArgs() {
       return IntStream.range(0, 1 << 4).mapToObj(i ->
         Arguments.of((i & 1) != 0, ((i >>> 1) & 1) != 0, ((i >>> 2) & 1) != 0, ((i >>> 3) & 1) != 0));
     }
   
     @ParameterizedTest
     @MethodSource("generateArgs")
     public void failureScenarioCleanupCTAS(boolean isDirectInsertEnabled,
                                            boolean isLocklessReadsEnabled,
                                            boolean isLocationUsed, 
                                           boolean isLocationUsed) throws Exception {
   ````



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927401158


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));

Review Comment:
   Create a constant called `IF_PURGE` . Done.



##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());
+          rqst.putToProperties("ifPurge", Boolean.toString(true));

Review Comment:
   Created a constant called `IF_PURGE` . Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927400722


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -29,19 +29,10 @@ Licensed to the Apache Software Foundation (ASF) under one
 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
 import org.apache.hadoop.hive.metastore.LockComponentBuilder;
 import org.apache.hadoop.hive.metastore.LockRequestBuilder;
-import org.apache.hadoop.hive.metastore.api.LockComponent;
-import org.apache.hadoop.hive.metastore.api.LockResponse;
-import org.apache.hadoop.hive.metastore.api.LockState;
-import org.apache.hadoop.hive.metastore.api.MetaException;
-import org.apache.hadoop.hive.metastore.api.NoSuchLockException;
-import org.apache.hadoop.hive.metastore.api.NoSuchTxnException;
-import org.apache.hadoop.hive.metastore.api.TxnAbortedException;
-import org.apache.hadoop.hive.metastore.api.TxnToWriteId;
-import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
-import org.apache.hadoop.hive.metastore.api.DataOperationType;
-import org.apache.hadoop.hive.metastore.api.GetOpenTxnsResponse;
-import org.apache.hadoop.hive.metastore.api.TxnType;
+import org.apache.hadoop.hive.metastore.api.*;

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927295389


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Currently `Context ctx` is not available during `rollbackTxn` which is why I chose to store the object.
   However I agree passing `Context ctx` is better.
   There are multiple ways of doing this - 
   First would be to involve passing `Context ctx` to `rollbackTxn` method which would change the HiveTxnManager API itself (I particularly dont like this since this would be a breaking change).
   
   Or we could create a new function in the `HiveTxnManager` interface of the same name and call it from the driver when rollback conditions are satisfied.
   
   My idea was to avoid both and initialise the destination in one of the existing APIs but I am open for any other suggestions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927437027


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+                  destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+                  destinationTable.getTTable(), conf));
+
+          rqst.putToProperties("location", destinationTable.getSd().getLocation());

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r929574342


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##########
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String dbName, String tableN
    * @throws LockException if there is no current transaction or the
    * transaction has already been committed or aborted.
    */
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   Added `void rollbackTxn(Context ctx)` and retained the old signature. Done. 
   
   However most of the places will use `void rollbackTxn(Context ctx)` from now on.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934209248


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CtasQueryLifeTimeHook.java:
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.create;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.Context;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class CtasQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(CtasQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    HiveConf conf = ctx.getHiveConf();

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933135808


##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   `addNecessaryHooks()` will have all the hooks that are absolutely required from now on. Thought this would be a good way to go about. But let me know if I have to change and directly add it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934532620


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();

Review Comment:
   no need for a cast, HookContext is good



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r934588615


##########
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java:
##########
@@ -3080,6 +3084,83 @@ public void testAcidOrcWritePreservesFieldNames() throws Exception {
     hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
   }
 
+  @Test
+  public void testFailureScenariosCleanupCTAS() throws Exception {
+    boolean[] booleans = {true, false};
+    for (boolean var1 : booleans) {
+      for (boolean var2 : booleans) {
+        for (boolean var3 : booleans) {
+          for (boolean var4 : booleans) {
+            failureScenarioCleanupCTAS(var1, var2, var3, var4);
+          }
+        }
+      }
+    }
+  }
+
+  public void failureScenarioCleanupCTAS(boolean isPartitioned,
+                                         boolean isDirectInsertEnabled,
+                                         boolean isLocklessReadsEnabled,
+                                         boolean isLocationUsed) throws Exception {
+    String tableName = "atable";
+
+    //Set configurations
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_DIRECT_INSERT_ENABLED, isDirectInsertEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED, isLocklessReadsEnabled);
+    hiveConf.setBoolVar(HiveConf.ConfVars.TXN_CTAS_X_LOCK, true);
+    hiveConf.setIntVar(HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD, 0);
+
+    // Add a '1' at the end of table name for custom location.
+    String querylocation = (isLocationUsed) ? " location '" + getWarehouseDir() + "/" + tableName + "1'" : "";
+    String queryPartitions = (isPartitioned) ? " partitioned by (a)" : "";
+
+    d.run("insert into " + Table.ACIDTBL + "(a,b) values (3,4)");
+    d.run("drop table if exists " + tableName);
+    d.compileAndRespond("create table " + tableName + queryPartitions + " stored as orc" + querylocation +
+            " tblproperties ('transactional'='true') as select * from " + Table.ACIDTBL);
+    long txnId = d.getQueryState().getTxnManager().getCurrentTxnId();
+    DriverContext driverContext = d.getDriverContext();
+    traverseTasksRecursively(driverContext.getPlan().getRootTasks());
+    int assertError = 0;
+    try {
+      d.run();
+    } catch (Exception e) {
+      assertError = 1;
+    }
+
+    runInitiator(hiveConf);

Review Comment:
   why do you need initiator here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r935200088


##########
ql/src/java/org/apache/hadoop/hive/ql/HiveQueryLifeTimeHook.java:
##########
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.CompactionRequest;
+import org.apache.hadoop.hive.metastore.api.CompactionType;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.ql.hooks.PrivateHookContext;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.plan.HiveOperation;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.IF_PURGE;
+import static org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_LOCATION;
+
+public class HiveQueryLifeTimeHook implements QueryLifeTimeHook {
+
+  private static final Logger LOG = LoggerFactory.getLogger(HiveQueryLifeTimeHook.class);
+
+  @Override
+  public void beforeCompile(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+
+  }
+
+  @Override
+  public void beforeExecution(QueryLifeTimeHookContext ctx) {
+
+  }
+
+  @Override
+  public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
+    if (hasError) {
+      checkAndRollbackCTAS(ctx);
+    }
+  }
+
+  private void checkAndRollbackCTAS(QueryLifeTimeHookContext ctx) {
+    HiveConf conf = ctx.getHiveConf();
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      PrivateHookContext pCtx = (PrivateHookContext) ctx.getHookContext();
+      Table table = ctx.getHookContext().getQueryPlan().getAcidSinks().iterator().next().getTable();

Review Comment:
   Done.



##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -191,6 +191,8 @@ public class Context {
 
   private List<Pair<String, String>> parsedTables = new ArrayList<>();
 
+  private Path destinationPath;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] SourabhBadhya commented on a diff in pull request #3457: HIVE-26414: Aborted/Cancelled CTAS operations must initiate cleanup o…

Posted by GitBox <gi...@apache.org>.
SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r933135808


##########
ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java:
##########
@@ -56,6 +57,7 @@ public class HookRunner {
   HookRunner(HiveConf conf, SessionState.LogHelper console) {
     this.conf = conf;
     this.hooks = new HiveHooks(conf, console);
+    addNecessaryHooks();

Review Comment:
   `addNecessaryHooks()` will have all the hooks that are absolutely required from now on. Thought this would be a good way to go about. But let me know if this function is required so that I can change it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org