Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/12/17 02:05:00 UTC

[jira] [Commented] (SPARK-29164) Rewrite coalesce(boolean, booleanLit) as boolean expression

    [ https://issues.apache.org/jira/browse/SPARK-29164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997791#comment-16997791 ] 

Hyukjin Kwon commented on SPARK-29164:
--------------------------------------

Resolving per the discussion in the PR.

> Rewrite coalesce(boolean, booleanLit) as boolean expression
> -----------------------------------------------------------
>
>                 Key: SPARK-29164
>                 URL: https://issues.apache.org/jira/browse/SPARK-29164
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Josh Rosen
>            Priority: Major
>
> I propose the following expression rewrite optimizations:
> {code:java}
> coalesce(x: Boolean, true)  -> x or isnull(x)
> coalesce(x: Boolean, false) -> x and isnotnull(x){code}
> This pattern appears when translating Dataset filters on {{Option[Boolean]}} columns: we might have a typed Dataset filter which looks like
> {code:java}
>  .filter(_.boolCol.getOrElse(DEFAULT_VALUE)){code}
> and the most idiomatic, user-friendly translation of this in Catalyst is to use {{coalesce()}}. However, the {{coalesce()}} form of this expression is not eligible for Parquet / data source filter pushdown.
> (We should write out truth tables to double-check this rewrite's correctness.)
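
The truth tables that the description asks for can be sketched outside Spark. The block below is a minimal model, not Catalyst code: it assumes SQL three-valued logic, represents NULL as Python's None, and defines hypothetical helpers (sql_or, sql_and, coalesce, isnull, isnotnull, truth_table) purely for illustration.

```python
from typing import List, Optional, Tuple

# None stands in for SQL NULL; this is a model of SQL semantics, not Spark itself.
Tri = Optional[bool]


def sql_or(a: Tri, b: Tri) -> Tri:
    """Three-valued OR: TRUE wins, otherwise NULL propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def sql_and(a: Tri, b: Tri) -> Tri:
    """Three-valued AND: FALSE wins, otherwise NULL propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def coalesce(*args: Tri) -> Tri:
    """Return the first non-NULL argument, or NULL if there is none."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def isnull(x: Tri) -> bool:
    return x is None


def isnotnull(x: Tri) -> bool:
    return x is not None


def truth_table() -> List[Tuple[Tri, Tri, Tri, Tri, Tri]]:
    """One row per input: x, coalesce(x, true), x or isnull(x),
    coalesce(x, false), x and isnotnull(x)."""
    rows = []
    for x in (True, False, None):
        rows.append((
            x,
            coalesce(x, True),
            sql_or(x, isnull(x)),
            coalesce(x, False),
            sql_and(x, isnotnull(x)),
        ))
    return rows


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

Under this model, each rewritten form agrees with its coalesce() counterpart for all three inputs (true, false, NULL). Whether Catalyst's own evaluation matches in every case would still need to be confirmed against Spark.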
