You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Bruce Robbins (Jira)" <ji...@apache.org> on 2023/04/12 22:34:00 UTC
[jira] [Created] (SPARK-43113) Codegen error when full outer join's bound condition has multiple references to the same stream-side column
Bruce Robbins created SPARK-43113:
-------------------------------------
Summary: Codegen error when full outer join's bound condition has multiple references to the same stream-side column
Key: SPARK-43113
URL: https://issues.apache.org/jira/browse/SPARK-43113
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.2, 3.4.0, 3.5.0
Reporter: Bruce Robbins
Example # 1 (sort merge join):
{noformat}
create or replace temp view v1 as
select * from values
(1, 1),
(2, 2),
(3, 1)
as v1(key, value);
create or replace temp view v2 as
select * from values
(1, 22, 22),
(3, -1, -1),
(7, null, null)
as v2(a, b, c);
select *
from v1
full outer join v2
on key = a
and value > b
and value > c;
{noformat}
The join's generated code causes the following compilation error:
{noformat}
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 277, Column 9: Redefinition of local variable "smj_isNull_7"
{noformat}
Example #2 (shuffle hash join):
{noformat}
select /*+ SHUFFLE_HASH(v2) */ *
from v1
full outer join v2
on key = a
and value > b
and value > c;
{noformat}
The shuffle hash join's generated code causes the following compilation error:
{noformat}
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 174, Column 5: Redefinition of local variable "shj_value_1"
{noformat}
With default configuration, both queries end up succeeding, since Spark falls back to running each query with whole-stage codegen disabled.
The issue happens only when the join's bound condition refers to the same stream-side column more than once.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org