Posted to issues@spark.apache.org by "Zhen Li (Jira)" <ji...@apache.org> on 2023/10/26 21:21:00 UTC

[jira] [Created] (SPARK-45679) Add clusterBy in DataFrame API

Zhen Li created SPARK-45679:
-------------------------------

             Summary: Add clusterBy in DataFrame API
                 Key: SPARK-45679
                 URL: https://issues.apache.org/jira/browse/SPARK-45679
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 3.5.1
            Reporter: Zhen Li


Add clusterBy to the DataFrame API, e.g. in Python:

DataFrameWriterV1
```
# Parentheses allow the method chain to span several lines in Python.
(df.write
  .format("delta")
  .clusterBy("clusteringColumn1", "clusteringColumn2")
  .save(...))  # or .saveAsTable(...)
```

DataFrameWriterV2
```
# Parentheses allow the method chain to span several lines in Python.
(df.writeTo(...).using("delta")
  .clusterBy("clusteringColumn1", "clusteringColumn2")
  .create())  # or .replace() or .createOrReplace()
```
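
For illustration only, the fluent call shape requested above can be sketched without Spark. The class below is a hypothetical stand-in (`SketchWriter` and `describe` are made-up names, not part of Spark or Delta); it only records what the caller asked for. It assumes clusterBy takes one or more column names and returns the writer so the chain can continue.

```python
# Toy stand-in for a fluent writer; NOT Spark's implementation.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SketchWriter:
    """Hypothetical builder that only records the requested options."""

    source: Optional[str] = None
    clustering_columns: Tuple[str, ...] = ()

    def format(self, source: str) -> "SketchWriter":
        self.source = source
        return self

    def clusterBy(self, col: str, *cols: str) -> "SketchWriter":
        # Assumption: at least one column is required, and the order
        # given by the caller is preserved.
        self.clustering_columns = (col, *cols)
        return self

    def describe(self) -> dict:
        # Hypothetical helper: summarize the recorded options.
        return {
            "format": self.source,
            "clusterBy": list(self.clustering_columns),
        }


plan = (
    SketchWriter()
    .format("delta")
    .clusterBy("clusteringColumn1", "clusteringColumn2")
    .describe()
)
print(plan)
```

Returning the writer from clusterBy is what lets it sit anywhere in the chain before the terminal call, as in both snippets above.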


