Posted to dev@spark.apache.org by Enrico Minack <ma...@Enrico.Minack.dev> on 2019/11/06 14:50:31 UTC
[SPARK-29176][DISCUSS] Optimization should change join type to CROSS
Hi,
I would like to discuss issue SPARK-29176 to see if this is considered a
bug and if so, to sketch out a fix.
In short, the issue is that a valid inner join with a join condition gets
optimized so that no condition is left, but the join type is still INNER.
CheckCartesianProducts then throws an exception. The type should change
to CROSS when the join is optimized in that way.
I understand that with spark.sql.crossJoin.enabled you can make Spark
not throw this exception, but you should not need this work-around for
a valid query.
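A minimal sketch of the behaviour described above, for spark-shell on a Spark 2.x default configuration. The DataFrames and column names are made up for illustration; the point is that a join condition referencing only one side can be pushed below the join as a filter, leaving an INNER join with no condition:

```scala
// Sketch for spark-shell (Spark 2.x defaults); data and names are hypothetical.
import spark.implicits._

val left  = Seq((1, "a"), (2, "b")).toDF("id", "v")
val right = Seq((1, "x"), (3, "y")).toDF("id", "w")

// The condition references only the left side, so the optimizer can push it
// down as a Filter on `left`, leaving an INNER join with no condition.
// CheckCartesianProducts then rejects the plan with an AnalysisException
// unless spark.sql.crossJoin.enabled is set to true.
val joined = left.join(right, left("id") === 1)
joined.explain(true)  // optimized plan shows "Join Inner" with no condition
joined.show()         // fails on Spark 2.x with default settings
```

This is only a sketch of the class of query under discussion; the exact reproduction in the Jira ticket may differ.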
Please let me know what you think about this issue and how I could fix
it. It might affect more rules than the two given in the Jira ticket.
Thanks,
Enrico
Re: [SPARK-29176][DISCUSS] Optimization should change join type to CROSS
Posted by Enrico Minack <ma...@Enrico.Minack.dev>.
So you are saying the optimized inner join with no condition is also a
valid query?
Then I agree the optimizer is not breaking the query, hence it is not a bug.
Enrico
Am 06.11.19 um 15:53 schrieb Sean Owen:
> You asked for an inner join but it turned into a cross-join. This
> might be surprising, hence the error you can disable.
> The query is not invalid in any case. It's just stopping you from
> doing something you may not have meant to do, and which may be expensive.
> However I think we've already changed the default to enable it in
> Spark 3 anyway.
>
> On Wed, Nov 6, 2019 at 8:50 AM Enrico Minack <ma...@enrico.minack.dev> wrote:
>> Hi,
>>
>> I would like to discuss issue SPARK-29176 to see if this is considered a bug and if so, to sketch out a fix.
>>
>> In short, the issue is that a valid inner join with condition gets optimized so that no condition is left, but the type is still INNER. Then CheckCartesianProducts throws an exception. The type should have changed to CROSS when it gets optimized in that way.
>>
>> I understand that with spark.sql.crossJoin.enabled you can make Spark not throw this exception, but I think you should not need this work-around for a valid query.
>>
>> Please let me know what you think about this issue and how I could fix it. It might affect more rules than the two given in the Jira ticket.
>>
>> Thanks,
>> Enrico
---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
Re: [SPARK-29176][DISCUSS] Optimization should change join type to CROSS
Posted by Sean Owen <sr...@gmail.com>.
You asked for an inner join but it turned into a cross-join. This
might be surprising, hence the error you can disable.
The query is not invalid in any case. It's just stopping you from
doing something you may not have meant to do, and which may be expensive.
However I think we've already changed the default to enable it in
Spark 3 anyway.
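For reference, the work-around discussed in this thread can be applied per session. This is a config fragment, not a recommendation; the property name comes from the thread itself:

```scala
// Allow plans that degenerate into a cross join without an explicit CROSS
// keyword. Defaults to false on Spark 2.x; per Sean's note, Spark 3 changes
// the default so the check no longer fires.
spark.conf.set("spark.sql.crossJoin.enabled", "true")
```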
On Wed, Nov 6, 2019 at 8:50 AM Enrico Minack <ma...@enrico.minack.dev> wrote:
>
> Hi,
>
> I would like to discuss issue SPARK-29176 to see if this is considered a bug and if so, to sketch out a fix.
>
> In short, the issue is that a valid inner join with condition gets optimized so that no condition is left, but the type is still INNER. Then CheckCartesianProducts throws an exception. The type should have changed to CROSS when it gets optimized in that way.
>
> I understand that with spark.sql.crossJoin.enabled you can make Spark not throw this exception, but I think you should not need this work-around for a valid query.
>
> Please let me know what you think about this issue and how I could fix it. It might affect more rules than the two given in the Jira ticket.
>
> Thanks,
> Enrico