Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2015/01/04 22:27:43 UTC

[jira] [Updated] (DRILL-350) Streaming Aggregate physical operator projects columns that may not be needed

     [ https://issues.apache.org/jira/browse/DRILL-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-350:
---------------------------------
    Component/s: Execution - Operators

> Streaming Aggregate physical operator projects columns that may not be needed
> -----------------------------------------------------------------------------
>
>                 Key: DRILL-350
>                 URL: https://issues.apache.org/jira/browse/DRILL-350
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Operators
>            Reporter: Aman Sinha
>             Fix For: Future
>
>
> A query may have GROUP BY keys that are not part of the SELECT list, for example: SELECT SUM(c1) FROM t1 GROUP BY a1, b1.
> The Streaming Aggregate physical operator currently projects all GROUP BY columns, on the assumption that a subsequent Project will drop the unnecessary ones. This is suboptimal because we incur the memory and CPU overhead of populating the output record batch value vectors for columns that are then discarded. Ideally, the operator would keep track of the columns needed by the parent (downstream) operator and project only those GROUP BY columns.
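
To make the proposal concrete, here is a minimal sketch in Python of a streaming aggregate that writes out only the group-by columns the downstream operator asks for. It is not Drill code: the function name streaming_sum, the needed_keys parameter, and the dict-per-row representation are all assumptions made for illustration, and Drill's actual operator works on record batches and value vectors rather than Python dicts.

```python
from itertools import groupby


def streaming_sum(rows, group_keys, sum_col, needed_keys):
    """Sum `sum_col` per group over rows already sorted on `group_keys`.

    Hypothetical illustration: only the group-by columns listed in
    `needed_keys` are copied into the output records.
    """
    unknown = set(needed_keys) - set(group_keys)
    if unknown:
        raise ValueError(f"not group-by columns: {sorted(unknown)}")

    def key_of(row):
        # All group-by columns still define the group boundaries.
        return tuple(row[k] for k in group_keys)

    out = []
    for key, group in groupby(rows, key=key_of):
        # Populate only the key columns the parent operator needs.
        record = {k: v for k, v in zip(group_keys, key) if k in needed_keys}
        record[f"sum_{sum_col}"] = sum(r[sum_col] for r in group)
        out.append(record)
    return out


# Input sorted on (a1, b1), as a streaming aggregate requires.
rows = [
    {"a1": 1, "b1": "x", "c1": 10},
    {"a1": 1, "b1": "x", "c1": 5},
    {"a1": 1, "b1": "y", "c1": 7},
    {"a1": 2, "b1": "x", "c1": 3},
]

# SELECT SUM(c1) FROM t1 GROUP BY a1, b1: no key columns are needed downstream.
result = streaming_sum(rows, ["a1", "b1"], "c1", needed_keys=[])
```

In this sketch the grouping still uses both a1 and b1 to detect group boundaries, but neither column is materialized in the output, which is the saving the issue describes. Passing needed_keys=["a1"] would keep a1 and drop only b1.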



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)