Posted to issues@spark.apache.org by "Yuming Wang (JIRA)" <ji...@apache.org> on 2018/09/24 01:02:00 UTC

[jira] [Commented] (SPARK-23985) predicate push down doesn't work with simple compound partition spec

    [ https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625305#comment-16625305 ] 

Yuming Wang commented on SPARK-23985:
-------------------------------------

[~uzadude] It seems this already works:
{code:scala}
withTable("t1") {
  withSQLConf(SQLConf.OPTIMIZER_PLAN_CHANGE_LOG_LEVEL.key -> "warn") {
    spark.range(10).selectExpr("cast(id as string) as a", "id as b", "id").write.saveAsTable("t1")
    val w = spark.sql(
      "select *, row_number() over (partition by concat(a,'lit') order by b) from t1 where a>'1'")
    w.explain()
  }
}
{code}

{noformat}
== Physical Plan ==
*(3) Project [a#11, b#12L, id#13L, row_number() OVER (PARTITION BY concat(a, lit) ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#22]
+- Window [row_number() windowspecdefinition(_w0#23, b#12L ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS row_number() OVER (PARTITION BY concat(a, lit) ORDER BY b ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)#22], [_w0#23], [b#12L ASC NULLS FIRST]
   +- *(2) Sort [_w0#23 ASC NULLS FIRST, b#12L ASC NULLS FIRST], false, 0
      +- Exchange hashpartitioning(_w0#23, 5)
         +- *(1) Project [a#11, b#12L, id#13L, concat(a#11, lit) AS _w0#23]
            +- *(1) Filter (isnotnull(a#11) && (a#11 > 1))
               +- *(1) FileScan parquet default.t1[a#11,b#12L,id#13L] Batched: true, DataFilters: [isnotnull(a#11), (a#11 > 1)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/core/spark-warehouse/t1], PartitionFilters: [], PushedFilters: [IsNotNull(a), GreaterThan(a,1)], ReadSchema: struct<a:string,b:bigint,id:bigint>
17:58:56.582 WARN org.apache.spark.sql.DataFrameSuite: 
{noformat}


> predicate push down doesn't work with simple compound partition spec
> --------------------------------------------------------------------
>
>                 Key: SPARK-23985
>                 URL: https://issues.apache.org/jira/browse/SPARK-23985
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Ohad Raviv
>            Priority: Minor
>
> While predicate push down works with this query:
> {code:sql}
> select *, row_number() over (partition by a order by b) from t1 where a>1
> {code}
> it doesn't work with:
> {code:sql}
> select *, row_number() over (partition by concat(a,'lit') order by b) from t1 where a>1
> {code}
>  
> I added a test to FilterPushdownSuite which I think recreates the problem:
> {code:scala}
>   test("Window: predicate push down -- ohad") {
>     val winExpr = windowExpr(count('b),
>       windowSpec(Concat('a :: Nil) :: Nil, 'b.asc :: Nil, UnspecifiedFrame))
>     val originalQuery = testRelation.select('a, 'b, 'c, winExpr.as('window)).where('a > 1)
>     val correctAnswer = testRelation
>       .where('a > 1).select('a, 'b, 'c)
>       .window(winExpr.as('window) :: Nil, 'a :: Nil, 'b.asc :: Nil)
>       .select('a, 'b, 'c, 'window).analyze
>     comparePlans(Optimize.execute(originalQuery.analyze), correctAnswer)
>   }
> {code}
> I will try to create a PR with a fix.


