Posted to commits@spark.apache.org by do...@apache.org on 2022/01/14 04:33:47 UTC
[spark] branch master updated: [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties
This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ef837ca [SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties
ef837ca is described below
commit ef837ca71020950b841f9891c70dc4b29d968bf1
Author: Dongjoon Hyun <do...@apache.org>
AuthorDate: Thu Jan 13 20:32:46 2022 -0800
[SPARK-37905][INFRA] Make `merge_spark_pr.py` set primary author from the first commit in case of ties
### What changes were proposed in this pull request?
This PR aims to make `merge_spark_pr.py` set the primary author from the first commit in case of ties.
### Why are the changes needed?
Currently, `merge_spark_pr.py` chooses the primary author randomly when authors are tied on commit count, e.g. when there are two commits from two authors.
https://github.com/apache/spark/pull/35190
Ideally, the primary author would be chosen based on the number of lines changed, but that seems too hard. So, this PR aims to improve on the current behavior.
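As a rough illustration of the tie-breaking, here is a minimal, self-contained sketch. It is not the script itself: the helper names `rank_authors_old` and `rank_authors_new` and the sample authors are hypothetical, and the input list stands in for the output of `git log --reverse` (oldest commit first). It relies on two Python guarantees: `dict.fromkeys` keeps first-seen order, and `sorted` is stable even with `reverse=True`.

```python
def rank_authors_old(commit_authors):
    # Deduplicating with set() discards commit order, so authors tied on
    # commit count can come out in any order.
    return sorted(set(commit_authors), key=lambda x: commit_authors.count(x), reverse=True)


def rank_authors_new(commit_authors):
    # dict.fromkeys() deduplicates while keeping first-seen order, and
    # sorted() is stable, so tied authors stay in commit order.
    return sorted(
        list(dict.fromkeys(commit_authors)),
        key=lambda x: commit_authors.count(x),
        reverse=True,
    )


# Hypothetical authors, oldest commit first (as with `git log --reverse`).
tied = ["Alice <alice@example.com>", "Bob <bob@example.com>"]
print(rank_authors_new(tied)[0])  # the author of the first commit

# An author with more commits still ranks first, regardless of order.
uneven = ["Alice <alice@example.com>", "Bob <bob@example.com>", "Bob <bob@example.com>"]
print(rank_authors_new(uneven)[0])
```

With the old `set`-based ranking, the first element for `tied` may differ between runs, because string hashing (and therefore set iteration order) is not fixed across Python processes.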
### Does this PR introduce _any_ user-facing change?
No. This is a dev-only change.
### How was this patch tested?
Manually.
Closes #35205 from dongjoon-hyun/SPARK-37905.
Authored-by: Dongjoon Hyun <do...@apache.org>
Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
dev/merge_spark_pr.py | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/dev/merge_spark_pr.py b/dev/merge_spark_pr.py
index 8d09c53..e21a39a 100755
--- a/dev/merge_spark_pr.py
+++ b/dev/merge_spark_pr.py
@@ -135,11 +135,12 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc):
         continue_maybe(msg)
         had_conflicts = True
 
+    # First commit author should be considered as the primary author when the rank is the same
     commit_authors = run_cmd(
-        ["git", "log", "HEAD..%s" % pr_branch_name, "--pretty=format:%an <%ae>"]
+        ["git", "log", "HEAD..%s" % pr_branch_name, "--pretty=format:%an <%ae>", "--reverse"]
     ).split("\n")
     distinct_authors = sorted(
-        set(commit_authors), key=lambda x: commit_authors.count(x), reverse=True
+        list(dict.fromkeys(commit_authors)), key=lambda x: commit_authors.count(x), reverse=True
     )
     primary_author = input(
         'Enter primary author in the format of "name <email>" [%s]: ' % distinct_authors[0]