Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/07/20 12:27:48 UTC

[GitHub] [hive] jshmchenxi opened a new pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

jshmchenxi opened a new pull request #2500:
URL: https://github.com/apache/hive/pull/2500


   …ead safety problem
   
   ### What changes were proposed in this pull request?
   Add a lock to org.apache.hadoop.hive.ql.exec.ConditionalTask#resolveTask to prevent a race condition when parallel execution is enabled.
   
   ### Why are the changes needed?
   Without the lock, some stages might not be executed because of the race condition, resulting in wrong or partial results.
   Please refer to the JIRA issue.
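
For illustration only, here is a minimal sketch of the general pattern described: a step that prunes unselected children from a shared list, guarded by a `ReentrantLock` so that concurrent callers cannot interleave. The class `ResolveSketch`, its methods, and the stage names are hypothetical and do not reproduce Hive's `ConditionalTask` or the actual patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a task that keeps one selected child and drops the rest.
// It is not Hive's ConditionalTask; all names here are assumptions.
class ResolveSketch {
    private final ReentrantLock resolveLock = new ReentrantLock();
    private final List<String> children = new ArrayList<>();

    ResolveSketch(List<String> candidates) {
        children.addAll(candidates);
    }

    // Drop every child except the selected one. The scan and the removals run
    // under the lock, so concurrent callers never mutate the list at the same time.
    void resolve(String selected) {
        resolveLock.lock();
        try {
            children.removeIf(child -> !child.equals(selected));
        } finally {
            resolveLock.unlock();
        }
    }

    // Copy the current children under the same lock.
    List<String> snapshot() {
        resolveLock.lock();
        try {
            return new ArrayList<>(children);
        } finally {
            resolveLock.unlock();
        }
    }

    // Resolve the same task from several threads, as parallel execution might.
    static List<String> demo() {
        ResolveSketch task = new ResolveSketch(Arrays.asList("stage-1", "stage-2", "stage-3"));
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Thread worker = new Thread(() -> task.resolve("stage-2"));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return task.snapshot();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [stage-2]
    }
}
```

The `try`/`finally` around the critical section ensures the lock is released even if the pruning step throws.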
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Tested on Hive 1.1.0 in our Hadoop cluster.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2500:
URL: https://github.com/apache/hive/pull/2500#discussion_r674195308



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
##########
@@ -32,6 +33,8 @@
 public class ConditionalTask extends Task<ConditionalWork> implements Serializable {
 
   private static final long serialVersionUID = 1L;
+  private static final ReentrantLock RESOLVE_TASK_LOCK = new ReentrantLock();

Review comment:
       Having a JVM-level global lock seems quite problematic. It could block every query running on this HS2 instance.
   Could we find a way to restrict the scope of the lock?
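
To make the scope concern concrete, here is a small sketch contrasting a `static` lock, which is one object for the whole JVM, with an instance-level lock. `LockScopeSketch` and its helper methods are hypothetical names used only for this illustration; they are not part of Hive.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class used only to contrast lock scopes; not Hive code.
class LockScopeSketch {
    // A static lock is a single object shared by every instance in the JVM.
    static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();
    // An instance field gives each object (for example, each session or query) its own lock.
    final ReentrantLock scopedLock = new ReentrantLock();

    // Reports whether a different thread could take the lock right now.
    static boolean freeForOtherThread(ReentrantLock lock) {
        boolean[] acquired = {false};
        Thread probe = new Thread(() -> {
            if (lock.tryLock()) {
                acquired[0] = true;
                lock.unlock();
            }
        });
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing the lock", e);
        }
        return acquired[0];
    }

    // While one holder keeps the global lock, nobody else can proceed.
    static boolean otherCanProceedUnderGlobalLock() {
        GLOBAL_LOCK.lock();
        try {
            return freeForOtherThread(GLOBAL_LOCK);
        } finally {
            GLOBAL_LOCK.unlock();
        }
    }

    // While one holder keeps its own scoped lock, a second holder's lock stays free.
    static boolean otherCanProceedUnderScopedLock() {
        LockScopeSketch first = new LockScopeSketch();
        LockScopeSketch second = new LockScopeSketch();
        first.scopedLock.lock();
        try {
            return freeForOtherThread(second.scopedLock);
        } finally {
            first.scopedLock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("global: " + otherCanProceedUnderGlobalLock()); // prints global: false
        System.out.println("scoped: " + otherCanProceedUnderScopedLock()); // prints scoped: true
    }
}
```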






[GitHub] [hive] jshmchenxi commented on pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on pull request #2500:
URL: https://github.com/apache/hive/pull/2500#issuecomment-883343164


   @deniskuzZ @miklosgergely @pvary Hi, would you take a look? Thanks!




[GitHub] [hive] jshmchenxi commented on pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on pull request #2500:
URL: https://github.com/apache/hive/pull/2500#issuecomment-885525766


   The two failed tests seem to be unrelated:
   
   - Testing / init@postgres / init-metastore / verify / install – org.apache.hadoop.hive.metastore.dbinstall.ITestPostgres
   - Testing / split-05 / PostProcess / testCliDriver[external_jdbc_table] – org.apache.hadoop.hive.cli.split8.TestMiniLlapLocalCliDriver




[GitHub] [hive] jshmchenxi commented on a change in pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on a change in pull request #2500:
URL: https://github.com/apache/hive/pull/2500#discussion_r674732170



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
##########
@@ -32,6 +33,8 @@
 public class ConditionalTask extends Task<ConditionalWork> implements Serializable {
 
   private static final long serialVersionUID = 1L;
+  private static final ReentrantLock RESOLVE_TASK_LOCK = new ReentrantLock();

Review comment:
       I think that's nice! And maybe we could just use `Task.queryState` instead of accessing `queryStateMap`?
   I put the lock at session level because in our use case we seldom run queries in parallel within one session.






[GitHub] [hive] jshmchenxi commented on pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on pull request #2500:
URL: https://github.com/apache/hive/pull/2500#issuecomment-888892175


   @pvary  Thanks for the reply! I've rebased the code and the failures are gone.




[GitHub] [hive] jshmchenxi commented on a change in pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on a change in pull request #2500:
URL: https://github.com/apache/hive/pull/2500#discussion_r674475011



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
##########
@@ -32,6 +33,8 @@
 public class ConditionalTask extends Task<ConditionalWork> implements Serializable {
 
   private static final long serialVersionUID = 1L;
+  private static final ReentrantLock RESOLVE_TASK_LOCK = new ReentrantLock();

Review comment:
       That's a good point! I changed it to a session-scoped lock.






[GitHub] [hive] pvary merged pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
pvary merged pull request #2500:
URL: https://github.com/apache/hive/pull/2500


   




[GitHub] [hive] jshmchenxi commented on pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
jshmchenxi commented on pull request #2500:
URL: https://github.com/apache/hive/pull/2500#issuecomment-888803393


   @pvary Hi, the failed tests seem to be unrelated. Would you help merging this PR?




[GitHub] [hive] pvary commented on pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2500:
URL: https://github.com/apache/hive/pull/2500#issuecomment-888807837


   > @pvary Hi, the failed tests seem to be unrelated. Would you help merging this PR?
   
   We require a green CI run for merge.
   There was a letter on the dev list about the handling of the flaky tests.
   Could you please follow the steps there? I might be mistaken, but I have disabled a JDBC test lately (maybe the same one that is failing here). If it is the same, then you only have to rebase on that change to have it fixed.
   
   Thanks, Peter 




[GitHub] [hive] pvary commented on a change in pull request #2500: Hive-24467: ConditionalTask remove tasks that not selected exists thr…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2500:
URL: https://github.com/apache/hive/pull/2500#discussion_r674504799



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
##########
@@ -32,6 +33,8 @@
 public class ConditionalTask extends Task<ConditionalWork> implements Serializable {
 
   private static final long serialVersionUID = 1L;
+  private static final ReentrantLock RESOLVE_TASK_LOCK = new ReentrantLock();

Review comment:
       What about query level?
   We can change `queryStateMap` to a concurrent map, and put the lock into the Query State. What do you think?
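
One way the query-level idea could look, as a rough sketch: a concurrent map from query id to a per-query state object that owns the lock. The names `QueryLockRegistry`, `QueryStateSketch`, `stateFor`, and `release` are assumptions made for this example and are not Hive's actual classes or methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical registry keyed by query id; names are assumptions, not Hive's API.
class QueryLockRegistry {
    // Hypothetical per-query state that owns the lock.
    static final class QueryStateSketch {
        final ReentrantLock resolveLock = new ReentrantLock();
    }

    private final ConcurrentMap<String, QueryStateSketch> queryStateMap = new ConcurrentHashMap<>();

    // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent callers
    // asking for the same query id always receive the same state object.
    QueryStateSketch stateFor(String queryId) {
        return queryStateMap.computeIfAbsent(queryId, id -> new QueryStateSketch());
    }

    // Drop the state once the query finishes so the map does not grow without bound.
    void release(String queryId) {
        queryStateMap.remove(queryId);
    }

    // Number of queries currently tracked.
    int size() {
        return queryStateMap.size();
    }

    public static void main(String[] args) {
        QueryLockRegistry registry = new QueryLockRegistry();
        ReentrantLock a1 = registry.stateFor("query-a").resolveLock;
        ReentrantLock a2 = registry.stateFor("query-a").resolveLock;
        ReentrantLock b = registry.stateFor("query-b").resolveLock;
        System.out.println(a1 == a2); // prints true: same query, same lock
        System.out.println(a1 == b);  // prints false: different queries do not contend
    }
}
```

With this shape, tasks of one query serialize on that query's lock, while other queries on the same server use different lock objects.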



