Posted to issues@spark.apache.org by "Ohad Raviv (Jira)" <ji...@apache.org> on 2022/11/27 13:39:00 UTC
[jira] [Created] (SPARK-41277) Save and leverage shuffle key in tblproperties
Ohad Raviv created SPARK-41277:
----------------------------------
Summary: Save and leverage shuffle key in tblproperties
Key: SPARK-41277
URL: https://issues.apache.org/jira/browse/SPARK-41277
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.3.1
Reporter: Ohad Raviv
I may be missing something trivial here.
In a typical pipeline, many datasets get materialized, and many of them are written right after a shuffle (e.g. a join). They are then involved in further actions, which often use the same key.
Wouldn't it make sense to save the shuffle key along with the table to avoid unnecessary shuffles?
Also, the implementation seems quite straightforward: just leverage the bucketing mechanism.
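To make the idea concrete, here is a minimal sketch in plain Python. It is not Spark code and not a proposed patch. It models a table as a list of partitions plus a properties map, records the shuffle key at save time, and lets a toy join skip the shuffle when the recorded key and partition count already match. The property names `shuffle.key` and `shuffle.numPartitions`, and all function names, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class Table:
    """A materialized dataset: one list of rows per partition, plus properties."""
    partitions: List[List[Row]]
    properties: Dict[str, str] = field(default_factory=dict)


def shuffle(rows: List[Row], key: str, num_partitions: int) -> List[List[Row]]:
    """Hash-partition rows by `key` (a stand-in for a real shuffle)."""
    parts: List[List[Row]] = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return parts


def save_with_shuffle_key(rows: List[Row], key: str, num_partitions: int) -> Table:
    """Materialize rows and record how they were partitioned (hypothetical property names)."""
    return Table(
        partitions=shuffle(rows, key, num_partitions),
        properties={
            "shuffle.key": key,
            "shuffle.numPartitions": str(num_partitions),
        },
    )


def needs_shuffle(table: Table, key: str, num_partitions: int) -> bool:
    """A shuffle is needed unless the saved layout already matches the request."""
    props = table.properties
    return not (
        props.get("shuffle.key") == key
        and props.get("shuffle.numPartitions") == str(num_partitions)
    )


def join(left: Table, right: Table, key: str, num_partitions: int) -> Tuple[List[Row], int]:
    """Partition-wise equi-join; returns (joined rows, number of shuffles performed)."""
    shuffles = 0
    sides = []
    for table in (left, right):
        if needs_shuffle(table, key, num_partitions):
            rows = [row for part in table.partitions for row in part]
            sides.append(shuffle(rows, key, num_partitions))
            shuffles += 1
        else:
            sides.append(table.partitions)  # reuse the saved layout as-is

    joined: List[Row] = []
    for left_part, right_part in zip(*sides):
        index: Dict[object, List[Row]] = {}
        for row in right_part:
            index.setdefault(row[key], []).append(row)
        for row in left_part:
            for match in index.get(row[key], []):
                joined.append({**row, **match})
    return joined, shuffles
```

In this toy model, joining two tables that were both saved with the same key and partition count performs no shuffle, while a table without the properties is shuffled once. In Spark itself the closest existing mechanism is bucketing on write (`DataFrameWriter.bucketBy`), which is what the issue suggests building on.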