Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/11/18 04:55:00 UTC
[jira] [Commented] (SPARK-37369) Avoid redundant ColumnarToRow transition on InMemoryTableScan
[ https://issues.apache.org/jira/browse/SPARK-37369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445629#comment-17445629 ]
Apache Spark commented on SPARK-37369:
--------------------------------------
User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/34642
> Avoid redundant ColumnarToRow transition on InMemoryTableScan
> -------------------------------------------------------------
>
> Key: SPARK-37369
> URL: https://issues.apache.org/jira/browse/SPARK-37369
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: L. C. Hsieh
> Priority: Major
>
> We have a rule that inserts columnar transitions between row-based and columnar query plans. InMemoryTableScanExec can produce columnar output, so if its parent plan isn't columnar, the rule adds a ColumnarToRow between them.
> However, InMemoryTableScanExec is a special query plan because it can convert a cached batch to either a columnar batch or rows.
> In this case, InMemoryTableScanExec converts the cached batch to a columnar batch, and the added ColumnarToRow then converts that batch to rows before the parent plan consumes them.
> Instead of this redundant conversion, we can simply ask InMemoryTableScanExec to produce row output directly.
> ```
> +- Union
>    :- ColumnarToRow
>    :  +- InMemoryTableScan [i#8, j#9]
>    :        +- InMemoryRelation [i#8, j#9], StorageLevel(disk, memory, deserialized, 1 replicas)
> ```
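
The redundancy described in the issue can be illustrated with a toy model. This is a sketch in plain Python, not Spark code: the `Plan` class, its `columnar_output` and `can_emit_rows` flags, and the functions `insert_transitions` and `remove_redundant_transitions` are all hypothetical names made up for illustration, and the actual change in the pull request may be structured differently. The first function mimics a rule that wraps a columnar child of a row-based parent in ColumnarToRow; the second drops that wrapper when the child could have produced rows itself.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class Plan:
    """Hypothetical stand-in for a physical plan node (not a Spark class)."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    columnar_output: bool = False  # node currently emits columnar batches
    can_emit_rows: bool = False    # node could emit rows itself instead


def insert_transitions(plan: Plan) -> Plan:
    """Wrap every columnar child of a row-based parent in ColumnarToRow."""
    new_children = []
    for child in plan.children:
        child = insert_transitions(child)
        if child.columnar_output and not plan.columnar_output:
            child = Plan("ColumnarToRow", [child])
        new_children.append(child)
    return replace(plan, children=new_children)


def remove_redundant_transitions(plan: Plan) -> Plan:
    """Drop a ColumnarToRow whose only child can produce rows on its own."""
    children = [remove_redundant_transitions(c) for c in plan.children]
    if (
        plan.name == "ColumnarToRow"
        and len(children) == 1
        and children[0].can_emit_rows
    ):
        # Ask the child for row output instead of converting afterwards.
        return replace(children[0], columnar_output=False)
    return replace(plan, children=children)


def render(plan: Plan, depth: int = 0) -> str:
    """Render the plan as an indented tree, one node per line."""
    lines = ["  " * depth + plan.name]
    for child in plan.children:
        lines.append(render(child, depth + 1))
    return "\n".join(lines)


# The cached scan can emit either columnar batches or rows.
scan = Plan(
    "InMemoryTableScan",
    [Plan("InMemoryRelation")],
    columnar_output=True,
    can_emit_rows=True,
)

before = insert_transitions(Plan("Union", [scan]))
after = remove_redundant_transitions(before)

print(render(before))
print(render(after))
```

In this model, `before` has the same shape as the plan fragment quoted above (Union, ColumnarToRow, InMemoryTableScan, InMemoryRelation), while `after` has the scan directly under Union with its `columnar_output` flag switched off.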