Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2021/11/18 04:49:00 UTC

[jira] [Created] (SPARK-37369) Avoid redundant ColumnarToRow transition on InMemoryTableScan

L. C. Hsieh created SPARK-37369:
-----------------------------------

             Summary: Avoid redundant ColumnarToRow transition on InMemoryTableScan
                 Key: SPARK-37369
                 URL: https://issues.apache.org/jira/browse/SPARK-37369
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: L. C. Hsieh


We have a rule that inserts columnar transitions between row-based and columnar query plans. InMemoryTableScanExec can produce columnar output, so if its parent plan isn't columnar, the rule adds a ColumnarToRow between them.
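
To make the behavior described above concrete, here is a minimal toy model in Python. It is only a sketch: `Plan`, `insert_transitions`, and `render` are hypothetical names invented for illustration, not Spark classes or APIs, and only the columnar-to-row direction mentioned above is modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Hypothetical stand-in for a physical plan node (not Spark's SparkPlan)."""
    name: str
    supports_columnar: bool = False
    children: List["Plan"] = field(default_factory=list)


def insert_transitions(plan: Plan) -> Plan:
    """Wrap every columnar child of a row-based parent in a ColumnarToRow node."""
    new_children = []
    for child in plan.children:
        child = insert_transitions(child)
        if child.supports_columnar and not plan.supports_columnar:
            # The parent consumes rows but the child emits columnar batches.
            child = Plan("ColumnarToRow", supports_columnar=False, children=[child])
        new_children.append(child)
    return Plan(plan.name, plan.supports_columnar, new_children)


def render(plan: Plan, depth: int = 0) -> List[str]:
    """Return the plan as indented lines, one node per line."""
    lines = ["  " * depth + plan.name]
    for child in plan.children:
        lines.extend(render(child, depth + 1))
    return lines


# A row-based Union over a scan that can produce columnar output.
scan = Plan("InMemoryTableScan", supports_columnar=True)
union = Plan("Union", children=[scan])
result = insert_transitions(union)
print("\n".join(render(result)))
```

In this toy model the row-based `Union` ends up with a `ColumnarToRow` node between it and the scan, which is the shape of plan this issue is about.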

But InMemoryTableScanExec is a special query plan because it can convert cached batches to either columnar batches or rows.

In such a case, we currently ask InMemoryTableScanExec to convert the cached batches to columnar batches, and the added ColumnarToRow then converts them to rows before they reach the parent plan.

So in such a case, we can simply ask InMemoryTableScanExec to produce row output directly and avoid the redundant conversion.

```
               +- Union
                  :- ColumnarToRow
                  :  +- InMemoryTableScan [i#8, j#9]
                  :        +- InMemoryRelation [i#8, j#9], StorageLevel(disk, memory, deserialized, 1 replicas)
```
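
The proposed simplification, applied to a plan shaped like the one above, can be sketched with another small Python toy model. This is an illustration under stated assumptions, not the actual change: `Plan`, `columnar_output`, and `remove_redundant_transition` are hypothetical names, and the InMemoryRelation node is omitted for brevity.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a physical plan node (not Spark's SparkPlan)."""
    name: str
    columnar_output: bool = False
    children: Tuple["Plan", ...] = ()


def remove_redundant_transition(plan: Plan) -> Plan:
    """Drop a ColumnarToRow that sits directly on top of an InMemoryTableScan
    and ask the scan to produce rows itself."""
    children = tuple(remove_redundant_transition(c) for c in plan.children)
    plan = replace(plan, children=children)
    if (
        plan.name == "ColumnarToRow"
        and len(children) == 1
        and children[0].name == "InMemoryTableScan"
    ):
        # Assumption: the scan can decode cached batches straight to rows.
        return replace(children[0], columnar_output=False)
    return plan


# Plan before: Union -> ColumnarToRow -> InMemoryTableScan (columnar output).
before = Plan(
    "Union",
    children=(
        Plan(
            "ColumnarToRow",
            children=(Plan("InMemoryTableScan", columnar_output=True),),
        ),
    ),
)
after = remove_redundant_transition(before)
print(after)
```

After the rewrite, the `Union` reads rows directly from the scan, and the cached batches are converted once instead of twice.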


