Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2015/07/26 09:34:04 UTC

[jira] [Created] (SPARK-9357) Remove JoinedRow

Reynold Xin created SPARK-9357:
----------------------------------

             Summary: Remove JoinedRow
                 Key: SPARK-9357
                 URL: https://issues.apache.org/jira/browse/SPARK-9357
             Project: Spark
          Issue Type: Umbrella
          Components: SQL
            Reporter: Reynold Xin


JoinedRow was introduced to join two rows together, in aggregation (join key and value), joins (left, right), window functions, etc.

It aims to reduce the amount of data copied, but it incurs a branch on each field access when the row is actually read. Given that nearly all the fields will be read almost all the time (otherwise they get pruned out by the optimizer), the branch predictor cannot do anything about those branches.

I think a better way is just to remove this thing and materialize the row data directly.
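
To make the trade-off concrete, here is a minimal sketch in Python rather than Spark's Scala, and not the actual Catalyst code. The names JoinedRowSketch and materialize are hypothetical, made up for illustration. The wrapper avoids a copy but has to decide on every read which underlying row holds the field; the materialized version copies once and then reads by plain index.

```python
# Illustrative sketch only; NOT Spark's actual JoinedRow implementation.
# JoinedRowSketch and materialize are hypothetical names.
from typing import Any, Sequence, Tuple


class JoinedRowSketch:
    """Presents two rows as a single row without copying either one."""

    def __init__(self, left: Sequence[Any], right: Sequence[Any]) -> None:
        self.left = left
        self.right = right

    def get(self, i: int) -> Any:
        # Every field read pays for this branch: which side holds field i?
        if i < len(self.left):
            return self.left[i]
        return self.right[i - len(self.left)]


def materialize(left: Sequence[Any], right: Sequence[Any]) -> Tuple[Any, ...]:
    # Copy both sides once up front; later reads are plain indexed lookups.
    return tuple(left) + tuple(right)


# Example: an aggregation-style pairing of a join key row and a value row.
key = ("k1",)
value = (10, 20)
joined = JoinedRowSketch(key, value)
flat = materialize(key, value)
```

Both forms expose the same fields in the same order; the difference is only where the cost is paid (per read versus once at construction).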



