Posted to commits@spark.apache.org by sr...@apache.org on 2022/09/22 13:28:29 UTC

[spark] branch master updated (f6c4e58b85d -> 08678456d16)

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


    from f6c4e58b85d [SPARK-40407][SQL] Fix the potential data skew caused by df.repartition
     add 08678456d16 [SPARK-40476][ML][SQL] Reduce the shuffle size of ALS

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/ml/recommendation/ALS.scala   |  18 ++--
 .../ml/recommendation/TopByKeyAggregator.scala     |  59 -----------
 .../spark/ml/recommendation/CollectTopKSuite.scala | 111 +++++++++++++++++++++
 .../recommendation/TopByKeyAggregatorSuite.scala   |  73 --------------
 .../catalyst/expressions/aggregate/collect.scala   |  46 ++++++++-
 .../scala/org/apache/spark/sql/functions.scala     |   3 +
 6 files changed, 169 insertions(+), 141 deletions(-)
 delete mode 100644 mllib/src/main/scala/org/apache/spark/ml/recommendation/TopByKeyAggregator.scala
 create mode 100644 mllib/src/test/scala/org/apache/spark/ml/recommendation/CollectTopKSuite.scala
 delete mode 100644 mllib/src/test/scala/org/apache/spark/ml/recommendation/TopByKeyAggregatorSuite.scala

