You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Seonggon Namgung (Jira)" <ji...@apache.org> on 2023/04/06 10:38:00 UTC

[jira] [Updated] (HIVE-27226) FullOuterJoin with filter expressions is not computed correctly

     [ https://issues.apache.org/jira/browse/HIVE-27226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Seonggon Namgung updated HIVE-27226:
------------------------------------
    Description: 
I tested many OuterJoin queries as an extension of HIVE-27138, and I found that Hive returns incorrect result for a query containing FullOuterJoin with filter expressions. In a nutshell, all JoinOperators that run on Tez engine return incorrect result for OuterJoin queries, and one of the reason for incorrect computation comes from CommonJoinOperator, which is the base of all JoinOperators. I attached the queries and configuration that I used at the bottom of the document. I am still inspecting this problems, and I will share an update once when I find out another reason. Also any comments and opinions would be appreciated.

First of all, I observed that current Hive ignores filter expressions contained in MapJoinOperator. For example, the attached result of query1 shows that MapJoinOperator performs inner join, not full outer join. This problem stems from removal of filterMap. When converting JoinOperator to MapJoinOperator, ConvertJoinMapJoin#convertJoinDynamicPartitionedHashJoin() removes filterMap of MapJoinOperator. Because MapJoinOperator does not evaluate filter expressions if filterMap is null, this change makes MapJoinOperator ignore filter expressions and it always joins tables regardless whether they satisfy filter expressions or not. To solve this problem, I disable FullOuterMapJoinOptimization and apply path for HIVE-27138, which prevents NPE. (The patch is available at the following link: LINK.) The rest of this document uses this modified Hive, but most of problems happen to current Hive, too.

The second problem I found is that Hive returns the same left-null or right-null rows multiple time when it uses MapJoinOperator or CommonMergeJoinOperator. This is caused by the logic of current CommonJoinOperator. Both of the two JoinOperators joins tables in 2 steps. First, they create RowContainers, each of which is a group of rows from one table and has the same key. Second, they call CommonJoinOperator#checkAndGenObject() with created RowContainers. This method checks filterTag of each row in RowContainers and forwards joined row if they meet all filter conditions. For OuterJoin, checkAndGenObject() forwards non-matching rows if there is no matching row in RowContainer. The problem happens when there are multiple RowContainer for the same key and table. For example, suppose that there are two left RowContainers and one right RowContainer. If none of the row in two left RowContainers satisfies filter condition, then checkAndGenObject() will forward Left-Null row for each right row. Because checkAndGenObject() is called with each left RowContainer, there will be two duplicated Left-Null rows for every right row.

In the case of MapJoinOperator, it always creates singleton RowContainer for big table. Therefore, it always produces duplicated non-matching rows. CommonMergeJoinOperator also creates multiple RowContainer for big table, whose size is hive.join.emit.interval. In the below experiment, I also set hive.join.shortcut.unmatched.rows=false, and hive.exec.reducers.max=1 to disable specialized algorithm for OuterJoin of 2 tables and force calling checkAndGenObject() before all rows with the same keys are gathered. I didn't observe this problem when using VectorMapJoinOperator, and I will inspect VectorMapJoinOperator whether we can reproduce the problem with it.

I think the second problem is not limited to FullOuterJoin, but I couldn't find such query as of now. This will also be added to this issue if I can write a query that reproduces the second problem without FullOuterJoin.

I also found that Hive returns wrong result for query2 even when I used VectorMapJoinOperator. I am still inspecting this problem and I will add an update on it when I find out the reason.

 

Experiment:

 
{code:java}
---- Configuration
set hive.optimize.shared.work=false;

-- Std MapJoin
set hive.auto.convert.join=true;
set hive.vectorized.execution.enabled=false;

-- Vec MapJoin
set hive.auto.convert.join=true;
set hive.vectorized.execution.enabled=true;

-- MergeJoin
set hive.auto.convert.join=false;
set hive.vectorized.execution.enabled=false;
set hive.join.shortcut.unmatched.rows=false;
set hive.join.emit.interval=1;
set hive.exec.reducers.max=1;
 
---- Queries
-- Query 1
DROP TABLE IF EXISTS a;
CREATE TABLE a (key string, value string);
INSERT INTO a VALUES (1, 1), (1, 2), (2, 1);
SELECT * FROM a FULL OUTER JOIN a b ON a.key = b.key AND a.key < 0;

-- Query 2
DROP TABLE IF EXISTS b;
CREATE TABLE b (key string, value string);
INSERT INTO b VALUES (1, 0), (1, 1);
SELECT * FROM b FULL OUTER JOIN b a ON a.key = b.key AND a.value > 0 AND b.value > 0;{code}
 

 

Experiment result:

 
{code:java}
-- PostgresSQL
-- Query1
key | value | key | value
-----+-------+-----+-------
   1 |     1 |     |
   1 |     2 |     |
   2 |     1 |     |
     |       |   1 |     2
     |       |   1 |     1
     |       |   2 |     1
(6 rows)

-- Query2
 key | value | key | value
-----+-------+-----+-------
   1 |     0 |     |      
   1 |     1 |   1 |     1
     |       |   1 |     0
(3 rows){code}
{code:java}
-- Query1 Result, current Hive
-- Std MapJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 2      | 1        | 2      | 1        |
| 1      | 2        | 1      | 2        |
| 1      | 2        | 1      | 1        |
| 1      | 1        | 1      | 2        |
| 1      | 1        | 1      | 1        |
+--------+----------+--------+----------+
-- Vec MapJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 1      | 2        | NULL   | NULL     |
| 1      | 1        | NULL   | NULL     |
| NULL   | NULL     | 1      | 2        |
| NULL   | NULL     | 1      | 1        |
| 2      | 1        | NULL   | NULL     |
| NULL   | NULL     | 2      | 1        |
+--------+----------+--------+----------+
-- MergeJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 1      | 1        | NULL   | NULL     |
| NULL   | NULL     | 1      | 1        |
| NULL   | NULL     | 1      | 2        |
| 1      | 2        | NULL   | NULL     |
| NULL   | NULL     | 1      | 1        |
| NULL   | NULL     | 1      | 2        |
| 2      | 1        | NULL   | NULL     |
| NULL   | NULL     | 2      | 1        |
+--------+----------+--------+----------+ {code}
{code:java}
-- Query1 Result, Hive with HIVE-27138 patch, disable FullOuterMapJoinOptimization
-- Std MapJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 2      | 1        | NULL   | NULL     |
| NULL   | NULL     | 2      | 1        |
| 1      | 2        | NULL   | NULL     |
| NULL   | NULL     | 1      | 2        |
| NULL   | NULL     | 1      | 1        |
| 1      | 1        | NULL   | NULL     |
| NULL   | NULL     | 1      | 2        |
| NULL   | NULL     | 1      | 1        |
+--------+----------+--------+----------+
-- Vec MapJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 1      | 2        | NULL   | NULL     |
| 1      | 1        | NULL   | NULL     |
| NULL   | NULL     | 1      | 2        |
| NULL   | NULL     | 1      | 1        |
| 2      | 1        | NULL   | NULL     |
| NULL   | NULL     | 2      | 1        |
+--------+----------+--------+----------+
-- MergeJoin
+--------+----------+--------+----------+
| a.key  | a.value  | b.key  | b.value  |
+--------+----------+--------+----------+
| 1      | 1        | NULL   | NULL     |
| NULL   | NULL     | 1      | 1        |
| NULL   | NULL     | 1      | 2        |
| 1      | 2        | NULL   | NULL     |
| NULL   | NULL     | 1      | 1        |
| NULL   | NULL     | 1      | 2        |
| 2      | 1        | NULL   | NULL     |
| NULL   | NULL     | 2      | 1        |
+--------+----------+--------+----------+ {code}
{code:java}
-- Query2 Result, current Hive
-- Std MapJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 0        | NULL   | NULL     |
| NULL   | NULL     | 1      | 0        |
| NULL   | NULL     | 1      | 1        |
| 1      | 1        | 1      | 0        |
| 1      | 1        | 1      | 1        |
+--------+----------+--------+----------+
-- Vec MapJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 0        | NULL   | NULL     |
| 1      | 1        | 1      | 1        |
| 1      | 1        | 1      | 0        |
+--------+----------+--------+----------+
-- MergeJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 0        | NULL   | NULL     |
| NULL   | NULL     | 1      | 0        |
| 1      | 1        | 1      | 1        |
+--------+----------+--------+----------+ {code}
{code:java}
-- Query2 Result, Hive with HIVE-27138 patch, disable FullOuterMapJoinOptimization
-- Std MapJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 1        | 1      | 1        |
| 1      | 1        | 1      | 0        |
| 1      | 0        | 1      | 1        |
| 1      | 0        | 1      | 0        |
+--------+----------+--------+----------+
-- Vec MapJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 0        | NULL   | NULL     |
| 1      | 1        | 1      | 0        |
| 1      | 1        | 1      | 1        |
+--------+----------+--------+----------+
-- MergeJoin
+--------+----------+--------+----------+
| b.key  | b.value  | a.key  | a.value  |
+--------+----------+--------+----------+
| 1      | 0        | NULL   | NULL     |
| NULL   | NULL     | 1      | 0        |
| 1      | 1        | 1      | 1        |
+--------+----------+--------+----------+ {code}
 

  was:
I tested many OuterJoin queries as an extension of HIVE-27138, and I found that Hive returns incorrect result for a query containing FullOuterJoin with filter expressions. In a nutshell, all JoinOperators that run on Tez engine return incorrect result for OuterJoin queries, and one of the reason for incorrect computation comes from CommonJoinOperator, which is the base of all JoinOperators. I attached the queries and configuration that I used at the bottom of the document. I am still inspecting this problems, and I will share an update once when I find out another reason. Also any comments and opinions would be appreciated.


First of all, I observed that current Hive ignores filter expressions contained in MapJoinOperator. For example, the attached result of query1 shows that MapJoinOperator performs inner join, not full outer join. This problem stems from removal of filterMap. When converting JoinOperator to MapJoinOperator, ConvertJoinMapJoin#convertJoinDynamicPartitionedHashJoin() removes filterMap of MapJoinOperator. Because MapJoinOperator does not evaluate filter expressions if filterMap is null, this change makes MapJoinOperator ignore filter expressions and it always joins tables regardless whether they satisfy filter expressions or not. To solve this problem, I disable FullOuterMapJoinOptimization and apply path for HIVE-27138, which prevents NPE. (The patch is available at the following link: LINK.) The rest of this document uses this modified Hive, but most of problems happen to current Hive, too.


The second problem I found is that Hive returns the same left-null or right-null rows multiple time when it uses MapJoinOperator or CommonMergeJoinOperator. This is caused by the logic of current CommonJoinOperator. Both of the two JoinOperators joins tables in 2 steps. First, they create RowContainers, each of which is a group of rows from one table and has the same key. Second, they call CommonJoinOperator#checkAndGenObject() with created RowContainers. This method checks filterTag of each row in RowContainers and forwards joined row if they meet all filter conditions. For OuterJoin, checkAndGenObject() forwards non-matching rows if there is no matching row in RowContainer. The problem happens when there are multiple RowContainer for the same key and table. For example, suppose that there are two left RowContainers and one right RowContainer. If none of the row in two left RowContainers satisfies filter condition, then checkAndGenObject() will forward Left-Null row for each right row. Because checkAndGenObject() is called with each left RowContainer, there will be two duplicated Left-Null rows for every right row.


In the case of MapJoinOperator, it always creates singleton RowContainer for big table. Therefore, it always produces duplicated non-matching rows. CommonMergeJoinOperator also creates multiple RowContainer for big table, whose size is hive.join.emit.interval. In the below experiment, I also set hive.join.shortcut.unmatched.rows=false, and hive.exec.reducers.max=1 to disable specialized algorithm for OuterJoin of 2 tables and force calling checkAndGenObject() before all rows with the same keys are gathered. I didn't observe this problem when using VectorMapJoinOperator, and I will inspect VectorMapJoinOperator whether we can reproduce the problem with it.


I think the second problem is not limited to FullOuterJoin, but I couldn't find such query as of now. This will also be added to this issue if I can write a query that reproduces the second problem without FullOuterJoin.


I also found that Hive returns wrong result for query2 even when I used VectorMapJoinOperator. I am still inspecting this problem and I will add an update on it when I find out the reason.

 

Experiment:

 
{code:java}
---- Configuration
set hive.optimize.shared.work=false;

-- Std MapJoin
set hive.auto.convert.join=true;
set hive.vectorized.execution.enabled=false;

-- Vec MapJoin
set hive.auto.convert.join=true;
set hive.vectorized.execution.enabled=true;

-- MergeJoin
set hive.auto.convert.join=false;
set hive.vectorized.execution.enabled=false;
set hive.join.shortcut.unmatched.rows=false;
set hive.join.emit.interval=1;
set hive.exec.reducers.max=1;
 
---- Queries
-- Query 1
DROP TABLE IF EXISTS a;
CREATE TABLE a (key string, value string);
INSERT INTO a VALUES (1, 1), (1, 2), (2, 1);
SELECT * FROM a FULL OUTER JOIN a b ON a.key = b.key AND a.key < 0;

-- Query 2
DROP TABLE IF EXISTS b;
CREATE TABLE b (key string, value string);
INSERT INTO b VALUES (1, 0), (1, 1);
SELECT * FROM b FULL OUTER JOIN b a ON a.key = b.key AND a.value > 0 AND b.value > 0;{code}
 

 

Experiment result:

 
{code:java}
-- PostgresSQL
-- Query1
key | value | key | value
-----+-------+-----+-------
   1 |     1 |     |
   1 |     2 |     |
   2 |     1 |     |
     |       |   1 |     2
     |       |   1 |     1
     |       |   2 |     1
(6 rows)

-- Query2
 key | value | key | value
-----+-------+-----+-------
   1 |     0 |     |      
   1 |     1 |   1 |     1
     |       |   1 |     0
(3 rows){code}
 


> FullOuterJoin with filter expressions is not computed correctly
> ---------------------------------------------------------------
>
>                 Key: HIVE-27226
>                 URL: https://issues.apache.org/jira/browse/HIVE-27226
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Seonggon Namgung
>            Priority: Major
>
> I tested many OuterJoin queries as an extension of HIVE-27138, and I found that Hive returns incorrect result for a query containing FullOuterJoin with filter expressions. In a nutshell, all JoinOperators that run on Tez engine return incorrect result for OuterJoin queries, and one of the reason for incorrect computation comes from CommonJoinOperator, which is the base of all JoinOperators. I attached the queries and configuration that I used at the bottom of the document. I am still inspecting this problems, and I will share an update once when I find out another reason. Also any comments and opinions would be appreciated.
> First of all, I observed that current Hive ignores filter expressions contained in MapJoinOperator. For example, the attached result of query1 shows that MapJoinOperator performs inner join, not full outer join. This problem stems from removal of filterMap. When converting JoinOperator to MapJoinOperator, ConvertJoinMapJoin#convertJoinDynamicPartitionedHashJoin() removes filterMap of MapJoinOperator. Because MapJoinOperator does not evaluate filter expressions if filterMap is null, this change makes MapJoinOperator ignore filter expressions and it always joins tables regardless whether they satisfy filter expressions or not. To solve this problem, I disable FullOuterMapJoinOptimization and apply path for HIVE-27138, which prevents NPE. (The patch is available at the following link: LINK.) The rest of this document uses this modified Hive, but most of problems happen to current Hive, too.
> The second problem I found is that Hive returns the same left-null or right-null rows multiple time when it uses MapJoinOperator or CommonMergeJoinOperator. This is caused by the logic of current CommonJoinOperator. Both of the two JoinOperators joins tables in 2 steps. First, they create RowContainers, each of which is a group of rows from one table and has the same key. Second, they call CommonJoinOperator#checkAndGenObject() with created RowContainers. This method checks filterTag of each row in RowContainers and forwards joined row if they meet all filter conditions. For OuterJoin, checkAndGenObject() forwards non-matching rows if there is no matching row in RowContainer. The problem happens when there are multiple RowContainer for the same key and table. For example, suppose that there are two left RowContainers and one right RowContainer. If none of the row in two left RowContainers satisfies filter condition, then checkAndGenObject() will forward Left-Null row for each right row. Because checkAndGenObject() is called with each left RowContainer, there will be two duplicated Left-Null rows for every right row.
> In the case of MapJoinOperator, it always creates singleton RowContainer for big table. Therefore, it always produces duplicated non-matching rows. CommonMergeJoinOperator also creates multiple RowContainer for big table, whose size is hive.join.emit.interval. In the below experiment, I also set hive.join.shortcut.unmatched.rows=false, and hive.exec.reducers.max=1 to disable specialized algorithm for OuterJoin of 2 tables and force calling checkAndGenObject() before all rows with the same keys are gathered. I didn't observe this problem when using VectorMapJoinOperator, and I will inspect VectorMapJoinOperator whether we can reproduce the problem with it.
> I think the second problem is not limited to FullOuterJoin, but I couldn't find such query as of now. This will also be added to this issue if I can write a query that reproduces the second problem without FullOuterJoin.
> I also found that Hive returns wrong result for query2 even when I used VectorMapJoinOperator. I am still inspecting this problem and I will add an update on it when I find out the reason.
>  
> Experiment:
>  
> {code:java}
> ---- Configuration
> set hive.optimize.shared.work=false;
> -- Std MapJoin
> set hive.auto.convert.join=true;
> set hive.vectorized.execution.enabled=false;
> -- Vec MapJoin
> set hive.auto.convert.join=true;
> set hive.vectorized.execution.enabled=true;
> -- MergeJoin
> set hive.auto.convert.join=false;
> set hive.vectorized.execution.enabled=false;
> set hive.join.shortcut.unmatched.rows=false;
> set hive.join.emit.interval=1;
> set hive.exec.reducers.max=1;
>  
> ---- Queries
> -- Query 1
> DROP TABLE IF EXISTS a;
> CREATE TABLE a (key string, value string);
> INSERT INTO a VALUES (1, 1), (1, 2), (2, 1);
> SELECT * FROM a FULL OUTER JOIN a b ON a.key = b.key AND a.key < 0;
> -- Query 2
> DROP TABLE IF EXISTS b;
> CREATE TABLE b (key string, value string);
> INSERT INTO b VALUES (1, 0), (1, 1);
> SELECT * FROM b FULL OUTER JOIN b a ON a.key = b.key AND a.value > 0 AND b.value > 0;{code}
>  
>  
> Experiment result:
>  
> {code:java}
> -- PostgresSQL
> -- Query1
> key | value | key | value
> -----+-------+-----+-------
>    1 |     1 |     |
>    1 |     2 |     |
>    2 |     1 |     |
>      |       |   1 |     2
>      |       |   1 |     1
>      |       |   2 |     1
> (6 rows)
> -- Query2
>  key | value | key | value
> -----+-------+-----+-------
>    1 |     0 |     |      
>    1 |     1 |   1 |     1
>      |       |   1 |     0
> (3 rows){code}
> {code:java}
> -- Query1 Result, current Hive
> -- Std MapJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 2      | 1        | 2      | 1        |
> | 1      | 2        | 1      | 2        |
> | 1      | 2        | 1      | 1        |
> | 1      | 1        | 1      | 2        |
> | 1      | 1        | 1      | 1        |
> +--------+----------+--------+----------+
> -- Vec MapJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 1      | 2        | NULL   | NULL     |
> | 1      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 2        |
> | NULL   | NULL     | 1      | 1        |
> | 2      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 2      | 1        |
> +--------+----------+--------+----------+
> -- MergeJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 1      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 1        |
> | NULL   | NULL     | 1      | 2        |
> | 1      | 2        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 1        |
> | NULL   | NULL     | 1      | 2        |
> | 2      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 2      | 1        |
> +--------+----------+--------+----------+ {code}
> {code:java}
> -- Query1 Result, Hive with HIVE-27138 patch, disable FullOuterMapJoinOptimization
> -- Std MapJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 2      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 2      | 1        |
> | 1      | 2        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 2        |
> | NULL   | NULL     | 1      | 1        |
> | 1      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 2        |
> | NULL   | NULL     | 1      | 1        |
> +--------+----------+--------+----------+
> -- Vec MapJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 1      | 2        | NULL   | NULL     |
> | 1      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 2        |
> | NULL   | NULL     | 1      | 1        |
> | 2      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 2      | 1        |
> +--------+----------+--------+----------+
> -- MergeJoin
> +--------+----------+--------+----------+
> | a.key  | a.value  | b.key  | b.value  |
> +--------+----------+--------+----------+
> | 1      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 1        |
> | NULL   | NULL     | 1      | 2        |
> | 1      | 2        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 1        |
> | NULL   | NULL     | 1      | 2        |
> | 2      | 1        | NULL   | NULL     |
> | NULL   | NULL     | 2      | 1        |
> +--------+----------+--------+----------+ {code}
> {code:java}
> -- Query2 Result, current Hive
> -- Std MapJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 0        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 0        |
> | NULL   | NULL     | 1      | 1        |
> | 1      | 1        | 1      | 0        |
> | 1      | 1        | 1      | 1        |
> +--------+----------+--------+----------+
> -- Vec MapJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 0        | NULL   | NULL     |
> | 1      | 1        | 1      | 1        |
> | 1      | 1        | 1      | 0        |
> +--------+----------+--------+----------+
> -- MergeJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 0        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 0        |
> | 1      | 1        | 1      | 1        |
> +--------+----------+--------+----------+ {code}
> {code:java}
> -- Query2 Result, Hive with HIVE-27138 patch, disable FullOuterMapJoinOptimization
> -- Std MapJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 1        | 1      | 1        |
> | 1      | 1        | 1      | 0        |
> | 1      | 0        | 1      | 1        |
> | 1      | 0        | 1      | 0        |
> +--------+----------+--------+----------+
> -- Vec MapJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 0        | NULL   | NULL     |
> | 1      | 1        | 1      | 0        |
> | 1      | 1        | 1      | 1        |
> +--------+----------+--------+----------+
> -- MergeJoin
> +--------+----------+--------+----------+
> | b.key  | b.value  | a.key  | a.value  |
> +--------+----------+--------+----------+
> | 1      | 0        | NULL   | NULL     |
> | NULL   | NULL     | 1      | 0        |
> | 1      | 1        | 1      | 1        |
> +--------+----------+--------+----------+ {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)