Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/05/06 15:14:00 UTC

[jira] [Commented] (SPARK-35329) Split generated switch code into pieces in ExpandExec

    [ https://issues.apache.org/jira/browse/SPARK-35329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340256#comment-17340256 ] 

Apache Spark commented on SPARK-35329:
--------------------------------------

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/32457

> Split generated switch code into pieces in ExpandExec
> -----------------------------------------------------
>
>                 Key: SPARK-35329
>                 URL: https://issues.apache.org/jira/browse/SPARK-35329
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Takeshi Yamamuro
>            Priority: Minor
>
> This ticket aims to split the generated switch code in `ExpandExec` into smaller pieces. On the current master, even a simple query like the one below generates a large method whose size (`maxMethodCodeSize:7448`) is close to `8000` (`CodeGenerator.DEFAULT_JVM_HUGE_METHOD_LIMIT`):
> {code:java}
> scala> val df = Seq(("2016-03-27 19:39:34", 1, "a"), ("2016-03-27 19:39:56", 2, "a"), ("2016-03-27 19:39:27", 4, "b")).toDF("time", "value", "id")
> scala> val rdf = df.select(window($"time", "10 seconds", "3 seconds", "0 second"), $"value").orderBy($"window.start".asc, $"value".desc).select("value")
> scala> sql("SET spark.sql.adaptive.enabled=false")
> scala> import org.apache.spark.sql.execution.debug._ 
> scala> rdf.debugCodegen
> Found 2 WholeStageCodegen subtrees.
> == Subtree 1 / 2 (maxMethodCodeSize:7448; maxConstantPoolSize:189(0.29% used); numInnerClasses:0) ==
>                                     ^^^^
> *(1) Project [window#34.start AS _gen_alias_39#39, value#11]
> +- *(1) Filter ((isnotnull(window#34) AND (cast(time#10 as timestamp) >= window#34.start)) AND (cast(time#10 as timestamp) < window#34.end))
>    +- *(1) Expand [List(named_struct(start, precisetimestampcon...
> {code}
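
For readers unfamiliar with the technique named in the title, the following is a minimal, hypothetical sketch of what "splitting switch code into pieces" can mean. It is not the code Spark generates and not the change made in the pull request. The class name, the method names, and the arithmetic in each case are invented for illustration. The idea is that each case body moves into its own small method, so the method containing the switch holds only dispatch logic and its bytecode size no longer grows with the size of every case body.

```java
// Hypothetical illustration only; not actual Spark-generated code.
final class SplitSwitchSketch {

    // Before: every case body is inlined into one switch, so the
    // enclosing method grows with the number and size of the cases.
    static long expandInlined(int projection, long value) {
        switch (projection) {
            case 0:
                return value * 2;
            case 1:
                return value + 100;
            default:
                return -value;
        }
    }

    // After: each case body lives in its own small method
    // (names are assumptions made for this sketch).
    static long project0(long value) { return value * 2; }
    static long project1(long value) { return value + 100; }
    static long projectDefault(long value) { return -value; }

    // The switch now only dispatches, so it stays small no matter
    // how large the individual case bodies become.
    static long expandSplit(int projection, long value) {
        switch (projection) {
            case 0:  return project0(value);
            case 1:  return project1(value);
            default: return projectDefault(value);
        }
    }

    public static void main(String[] args) {
        // Both variants compute the same result for every case.
        for (int i = 0; i < 3; i++) {
            System.out.println(expandInlined(i, 7L) + " " + expandSplit(i, 7L));
        }
    }
}
```

In a real code generator the case bodies would be much larger than these one-liners, which is where the split starts to matter for the per-method size reported as `maxMethodCodeSize`.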


