Posted to gitbox@hive.apache.org by "ngsg (via GitHub)" <gi...@apache.org> on 2023/04/03 05:06:47 UTC

[GitHub] [hive] ngsg opened a new pull request, #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

ngsg opened a new pull request, #4190:
URL: https://github.com/apache/hive/pull/4190

   ### What changes were proposed in this pull request?
   Proceed with the MapJoin operation even when a storage is empty, if that storage is the RHS table of an AntiJoin.
   
   ### Why are the changes needed?
   The current MapJoinOperator does not join empty RowContainers unless the join contains an OuterJoin.
   As a result, MapJoinOperator never produces rows for an AntiJoin, because an AntiJoin emits a row precisely when a non-empty LHS RowContainer meets an empty RHS RowContainer.
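
To make the description above concrete, here is a minimal sketch in plain Java. It is not Hive code: the class `EmptyRhsAntiJoinSketch`, the `probe` method, the `JoinType` enum and the `proceedOnEmptyAntiRhs` flag are all hypothetical names, and the loop only models a single-key probe. With the flag off, a row whose RHS lookup is empty never reaches the join step, so an anti join emits nothing; with the flag on, the empty RHS is let through for anti joins only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of a hash-join probe loop; not Hive's MapJoinOperator.
final class EmptyRhsAntiJoinSketch {
    enum JoinType { INNER, ANTI }

    /**
     * Probes each LHS key against the RHS key set.
     * proceedOnEmptyAntiRhs = false models the behaviour described as current;
     * true models the behaviour this PR proposes.
     */
    static List<Integer> probe(List<Integer> lhsKeys, Set<Integer> rhsKeys,
                               JoinType type, boolean proceedOnEmptyAntiRhs) {
        List<Integer> out = new ArrayList<>();
        for (Integer key : lhsKeys) {
            boolean rhsHasRows = rhsKeys.contains(key);
            // A non-empty RHS always reaches the join step.
            boolean proceed = rhsHasRows;
            // Proposed change: for an anti join, an empty RHS is itself the match condition.
            if (!rhsHasRows && type == JoinType.ANTI && proceedOnEmptyAntiRhs) {
                proceed = true;
            }
            if (!proceed) {
                continue; // row dropped before the join step
            }
            // Join step: an inner join emits on a match, an anti join emits on no match.
            boolean emit = (type == JoinType.INNER) ? rhsHasRows : !rhsHasRows;
            if (emit) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> lhs = List.of(1, 2, 3);
        Set<Integer> rhs = Set.of(2);
        System.out.println(probe(lhs, rhs, JoinType.ANTI, false));
        System.out.println(probe(lhs, rhs, JoinType.ANTI, true));
        System.out.println(probe(lhs, rhs, JoinType.INNER, true));
    }
}
```

Under these assumptions, `main` prints `[]`, `[1, 3]` and `[2]`: the anti join is empty without the extra check, returns the unmatched keys with it, and the inner join is unaffected by the flag.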
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   I added a qfile test that does not use the vectorized MapJoin operator.
   I also ran all TPC-DS queries on a 1TB dataset with hive.auto.convert.anti.join=true and checked that all queries return the correct results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] amansinha100 commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "amansinha100 (via GitHub)" <gi...@apache.org>.
amansinha100 commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1179845226


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java:
##########
@@ -556,6 +556,9 @@ public void process(Object row, int tag) throws HiveException {
               }
             } else {
               storage[pos] = emptyList;
+              if (pos != 0 && condn[pos - 1].getType() == JoinDesc.ANTI_JOIN) {

Review Comment:
   Just for reference, this check matches the check that is done in CommonJoinOperator.checkAndGenObject() here: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java#L973 .  Looks good to me.  



[GitHub] [hive] sonarcloud[bot] commented on pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "sonarcloud[bot] (via GitHub)" <gi...@apache.org>.
sonarcloud[bot] commented on PR #4190:
URL: https://github.com/apache/hive/pull/4190#issuecomment-1493698830

   Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4190)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells
   
   No Coverage information
   No Duplication information
   
   


[GitHub] [hive] scarlin-cloudera commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "scarlin-cloudera (via GitHub)" <gi...@apache.org>.
scarlin-cloudera commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1179548225


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,75 @@
+set hive.merge.nway.joins=false;
+set hive.vectorized.execution.enabled=false;
+set hive.auto.convert.join=true;
+set hive.auto.convert.anti.join=true;
+
+drop table if exists tt1;
+drop table if exists tt2;
+drop table if exists tt3;
+
+create table tt1 (ws_order_number bigint, ws_ext_ship_cost decimal(7, 2));
+create table tt2 (ws_order_number bigint);
+create table tt3 (wr_order_number bigint);
+
+insert into tt1 values (42, 3093.96), (1041, 299.28), (1378, 85.56), (1378, 719.44), (1395, 145.68);
+insert into tt2 values (1378), (1395);
+insert into tt3 values (42), (1041);
+
+-- The result should be the same regardless of vectorization.
+
+explain

Review Comment:
   We tested locally with the tpcds queries and your patch seems to have fixed the problem.
   
   Thanks!



[GitHub] [hive] ngsg commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "ngsg (via GitHub)" <gi...@apache.org>.
ngsg commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1157957495


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,41 @@
+set hive.merge.nway.joins=false;

Review Comment:
   I added explain and explain cbo for each of the queries.



[GitHub] [hive] ramesh0201 commented on pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "ramesh0201 (via GitHub)" <gi...@apache.org>.
ramesh0201 commented on PR #4190:
URL: https://github.com/apache/hive/pull/4190#issuecomment-1531144303

   +1. Verified the test file with and without the patch. The patch fixes the issue.


[GitHub] [hive] ngsg commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "ngsg (via GitHub)" <gi...@apache.org>.
ngsg commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1157954219


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java:
##########
@@ -556,6 +556,9 @@ public void process(Object row, int tag) throws HiveException {
               }
             } else {
               storage[pos] = emptyList;
+              if (pos != 0 && condn[pos - 1].getType() == JoinDesc.ANTI_JOIN) {

Review Comment:
   Thank you for your review. I added a comment that explains why the join proceeds.



[GitHub] [hive] sonarcloud[bot] commented on pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "sonarcloud[bot] (via GitHub)" <gi...@apache.org>.
sonarcloud[bot] commented on PR #4190:
URL: https://github.com/apache/hive/pull/4190#issuecomment-1496862774

   Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4190)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells
   
   No Coverage information
   No Duplication information
   
   


[GitHub] [hive] sonarcloud[bot] commented on pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "sonarcloud[bot] (via GitHub)" <gi...@apache.org>.
sonarcloud[bot] commented on PR #4190:
URL: https://github.com/apache/hive/pull/4190#issuecomment-1514415422

   Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4190)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells
   
   No Coverage information
   No Duplication information
   
   


[GitHub] [hive] scarlin-cloudera commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "scarlin-cloudera (via GitHub)" <gi...@apache.org>.
scarlin-cloudera commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1157379902


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java:
##########
@@ -556,6 +556,9 @@ public void process(Object row, int tag) throws HiveException {
               }
             } else {
               storage[pos] = emptyList;
+              if (pos != 0 && condn[pos - 1].getType() == JoinDesc.ANTI_JOIN) {

Review Comment:
   Can we add a comment here?  Thanks!



[GitHub] [hive] scarlin-cloudera commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "scarlin-cloudera (via GitHub)" <gi...@apache.org>.
scarlin-cloudera commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1157432947


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,41 @@
+set hive.merge.nway.joins=false;

Review Comment:
   Can we also add the explain cbo plan and explain plan here?



[GitHub] [hive] kasakrisz merged pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "kasakrisz (via GitHub)" <gi...@apache.org>.
kasakrisz merged PR #4190:
URL: https://github.com/apache/hive/pull/4190


[GitHub] [hive] amansinha100 commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "amansinha100 (via GitHub)" <gi...@apache.org>.
amansinha100 commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1170887837


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,75 @@
+set hive.merge.nway.joins=false;
+set hive.vectorized.execution.enabled=false;
+set hive.auto.convert.join=true;
+set hive.auto.convert.anti.join=true;
+
+drop table if exists tt1;
+drop table if exists tt2;
+drop table if exists tt3;
+
+create table tt1 (ws_order_number bigint, ws_ext_ship_cost decimal(7, 2));
+create table tt2 (ws_order_number bigint);
+create table tt3 (wr_order_number bigint);
+
+insert into tt1 values (42, 3093.96), (1041, 299.28), (1378, 85.56), (1378, 719.44), (1395, 145.68);
+insert into tt2 values (1378), (1395);
+insert into tt3 values (42), (1041);
+
+-- The result should be the same regardless of vectorization.
+
+explain

Review Comment:
   The plans for these queries don't show the pattern of MergeJoin --> MapJoin.  Without this pattern, as I noted in the Jira, the wrong results issue was not reproducible.  In order to force this plan, I had to set the following stats:
   alter table tt1 update statistics set ('numRows'='10000000');
   alter table tt2 update statistics set ('numRows'='10000000');
   alter table tt3 update statistics set ('numRows'='2');
   Could you add to or modify the existing tests to set these stats, generate the MergeJoin --> MapJoin plan, and verify that the fix works for that plan?



[GitHub] [hive] ngsg commented on a diff in pull request #4190: HIVE-26659: Proceed MapJoin with empty storage if it performs AntiJoin

Posted by "ngsg (via GitHub)" <gi...@apache.org>.
ngsg commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1170991803


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,75 @@
+set hive.merge.nway.joins=false;
+set hive.vectorized.execution.enabled=false;
+set hive.auto.convert.join=true;
+set hive.auto.convert.anti.join=true;
+
+drop table if exists tt1;
+drop table if exists tt2;
+drop table if exists tt3;
+
+create table tt1 (ws_order_number bigint, ws_ext_ship_cost decimal(7, 2));
+create table tt2 (ws_order_number bigint);
+create table tt3 (wr_order_number bigint);
+
+insert into tt1 values (42, 3093.96), (1041, 299.28), (1378, 85.56), (1378, 719.44), (1395, 145.68);
+insert into tt2 values (1378), (1395);
+insert into tt3 values (42), (1041);
+
+-- The result should be the same regardless of vectorization.
+
+explain

Review Comment:
   I updated the qfile and the expected output to verify the MergeJoin -> MapJoin pattern.
   
   However, I think this issue is unrelated to the MergeJoin -> MapJoin pattern and depends only on MapJoinOperator. For example, the second query in this qfile contains only one Join, but the current Hive returns no rows although tt1 - tt2 is not empty. Also, I have checked that the current Hive returns a wrong result for the first query without the MergeJoin -> MapJoin pattern. (I used mvn test to run both of the queries.)
   
   I think you could reproduce this issue with the MapJoin -> MapJoin pattern if you disable vectorized execution. Could you please give me more details about the configuration that you used to reproduce the problem? I would also be glad if you could run this qfile in your local environment and share the result.
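
As a cross-check of the claim that tt1 - tt2 is not empty, the following sketch recomputes it by hand in plain Java from the values the qfile inserts. It assumes the second query is an anti join of tt1 against tt2 on ws_order_number; the query text is not quoted in this thread, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hand computation over the qfile's literal values; it does not run Hive.
final class Tt1MinusTt2Sketch {
    // ws_order_number values inserted into tt1 and tt2 by the qfile.
    static final long[] TT1_ORDER_NUMBERS = {42L, 1041L, 1378L, 1378L, 1395L};
    static final Set<Long> TT2_ORDER_NUMBERS = Set.of(1378L, 1395L);

    // Keeps the tt1 order numbers that have no matching row in tt2.
    static List<Long> survivingOrderNumbers() {
        List<Long> out = new ArrayList<>();
        for (long orderNumber : TT1_ORDER_NUMBERS) {
            if (!TT2_ORDER_NUMBERS.contains(orderNumber)) {
                out.add(orderNumber);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(survivingOrderNumbers());
    }
}
```

Under that assumption, two of the five tt1 rows survive (order numbers 42 and 1041), so a correct anti join should return rows rather than an empty result.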
   


