Posted to commits@arrow.apache.org by al...@apache.org on 2021/11/15 12:02:54 UTC
[arrow-datafusion] branch master updated: python: update release instructions & automation (#1295)
This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/master by this push:
new 05fdb7a python: update release instructions & automation (#1295)
05fdb7a is described below
commit 05fdb7aefb1e3f9c1b170c70d998337c173eb12c
Author: QP Hou <qp...@scribd.com>
AuthorDate: Mon Nov 15 04:02:50 2021 -0800
python: update release instructions & automation (#1295)
* python: update release instructions & automation
* add PMC member note
---
.gitignore | 3 +
dev/release/README.md | 62 +++++++++++++----
dev/release/create-tarball.sh | 18 ++++-
dev/release/download-python-wheels.py | 119 ++++++++++++++++++++++++++++++++
dev/release/verify-release-candidate.sh | 17 +++--
5 files changed, 197 insertions(+), 22 deletions(-)
diff --git a/.gitignore b/.gitignore
index 31bdf49..80a9cb6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -91,3 +91,6 @@ rusty-tags.vi
.vscode
venv/*
+
+# apache release artifacts
+dev/dist
diff --git a/dev/release/README.md b/dev/release/README.md
index 775678a..2127dc2 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -50,7 +50,7 @@ release backport branch.
As part of the Apache governance model, official releases consist of signed
source tarballs approved by the PMC.
-We then use the code in the approved source tarball to release to crates.io and
+We then use the code in the approved artifacts to release to crates.io and
PyPI.
### Change Log
@@ -126,9 +126,9 @@ could change the change log content landed in the `master` branch before you
could merge the PR, you need to rerun the changelog update script to regenerate
the changelog and update the PR accordingly.
-## Prepare release candidate tarball
+## Prepare release candidate artifacts
-After the PR gets merged, you are ready to create a releaes tarball from the
+After the PR gets merged, you are ready to create release artifacts based on the
merged commit.
(Note you need to be a committer to run these scripts as they upload to the apache svn distribution servers)
@@ -139,7 +139,8 @@ Pick numbers in sequential order, with `0` for `rc0`, `1` for `rc1`, etc.
### Create git tag for the release:
-While the official release artifact is a signed tarball, we also tag the commit it was created for convenience and code archaeology.
+While the official release artifacts are signed tarballs and zip files, we also
+tag the commit they were created from for convenience and code archaeology.
Using a string such as `5.1.0` as the `<version>`, create and push the tag thusly:
@@ -150,24 +151,27 @@ git tag <version>-<rc> apache/master
git push apache <version>
```
-### Create, sign, and upload tarball
+This should trigger the `Python Release Build` Github Action workflow for the
+pushed tag. You can monitor the pipeline run status at https://github.com/apache/arrow-datafusion/actions/workflows/python_build.yml.
+
+### Create, sign, and upload artifacts
Run `create-tarball.sh` with the `<version>` tag and `<rc>` you found in previous steps:
```shell
-./dev/release/create-tarball.sh 5.1.0 0
+GH_TOKEN=<TOKEN> ./dev/release/create-tarball.sh 5.1.0 0
```
The `create-tarball.sh` script
-1. creates and uploads a release candidate tarball to the [arrow
+1. creates and uploads all release candidate artifacts to the [arrow
dev](https://dist.apache.org/repos/dist/dev/arrow) location on the
apache distribution svn server
2. provides an email template to
send to dev@arrow.apache.org for release voting.
-### Vote on Release Candidate tarball
+### Vote on Release Candidate artifacts
Send the email output from the script to dev@arrow.apache.org. The email should look like
@@ -181,7 +185,7 @@ I would like to propose a release of Apache Arrow Datafusion Implementation,
version 5.1.0.
This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1]
-The proposed release tarball and signatures are hosted at [2].
+The proposed release artifacts and signatures are hosted at [2].
The changelog is located at [3].
Please download, verify checksums and signatures, run the unit tests,
@@ -215,9 +219,11 @@ changes into master if there is any and try again with the next RC number.
## Finalize the release
+NOTE: steps in this section can only be done by PMC members.
+
### After the release is approved
-Move tarball to the release location in SVN, e.g.
+Move artifacts to the release location in SVN, e.g.
https://dist.apache.org/repos/dist/release/arrow/arrow-datafusion-5.1.0/, using
the `release-tarball.sh` script:
@@ -232,7 +238,7 @@ Congratulations! The release is now official!
Tag the same release candidate commit with the final release tag
```
-git co apache/5.1.0-RC0
+git co apache/5.1.0-rc0
git tag 5.1.0
git push 5.1.0
```
@@ -292,7 +298,20 @@ If there is a ballista release, run
### Publish on PyPI
-TODO
+Only approved releases of the source tarball and wheels should be published to
+PyPI, in order to conform to Apache Software Foundation governance standards.
+
+First, download all official python release artifacts:
+
+```shell
+svn co https://dist.apache.org/repos/dist/release/arrow/apache-arrow-datafusion-5.1.0-rc0/python ./python-artifacts
+```
+
+Use [twine](https://pypi.org/project/twine/) to perform the upload.
+
+```shell
+twine upload ./python-artifacts/*.{tar.gz,whl}
+```
### Call the vote
@@ -300,4 +319,21 @@ Call the vote on the Arrow dev list by replying to the RC voting thread. The
reply should have a new subject constructed by adding `[RESULT]` prefix to the
old subject line.
-TODO: add example mail
+Sample announcement template:
+
+```
+The vote has passed with <NUMBER> +1 votes. Thank you to all who helped
+with the release verification.
+```
+
+You can mention the crates.io and PyPI version URLs in the email if applicable.
+
+```
+We have published new versions of datafusion and ballista to crates.io:
+
+https://crates.io/crates/datafusion/5.0.0
+https://crates.io/crates/ballista/0.5.0
+https://crates.io/crates/ballista-core/0.5.0
+https://crates.io/crates/ballista-executor/0.5.0
+https://crates.io/crates/ballista-scheduler/0.5.0
+```
diff --git a/dev/release/create-tarball.sh b/dev/release/create-tarball.sh
index 94318d0..59214a5 100755
--- a/dev/release/create-tarball.sh
+++ b/dev/release/create-tarball.sh
@@ -36,6 +36,8 @@
# 2. Logged into the apache svn server with the appropriate
# credentials
#
+# 3. Install the requests python package
+#
#
# Based in part on 02-source.sh from apache/arrow
#
@@ -48,7 +50,12 @@ SOURCE_TOP_DIR="$(cd "${SOURCE_DIR}/../../" && pwd)"
if [ "$#" -ne 2 ]; then
echo "Usage: $0 <version> <rc>"
echo "ex. $0 4.1.0 2"
- exit
+ exit
+fi
+
+if [[ -z "${GH_TOKEN}" ]]; then
+  echo "Please set a personal GitHub token in the GH_TOKEN environment variable"
+ exit
fi
version=$1
@@ -118,8 +125,15 @@ gpg --armor --output ${tarball}.asc --detach-sig ${tarball}
(cd ${distdir} && shasum -a 256 ${tarname}) > ${tarball}.sha256
(cd ${distdir} && shasum -a 512 ${tarname}) > ${tarball}.sha512
+# download python binary releases from Github Action
+python_distdir=${distdir}/python
+echo "Preparing python release artifacts"
+test -d ${python_distdir} || mkdir -p ${python_distdir}
+pushd "${python_distdir}"
+ python ${SOURCE_DIR}/download-python-wheels.py "${tag}"
+popd
+
echo "Uploading to apache dist/dev to ${url}"
svn co --depth=empty https://dist.apache.org/repos/dist/dev/arrow ${SOURCE_TOP_DIR}/dev/dist
svn add ${distdir}
svn ci -m "Apache Arrow Datafusion ${version} ${rc}" ${distdir}
-
diff --git a/dev/release/download-python-wheels.py b/dev/release/download-python-wheels.py
new file mode 100644
index 0000000..043cb92
--- /dev/null
+++ b/dev/release/download-python-wheels.py
@@ -0,0 +1,119 @@
+#!/usr/bin/env python
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Script that downloads python release artifacts from Github
+#
+# dependencies:
+# pip install requests
+
+
+import sys
+import os
+import argparse
+import requests
+import zipfile
+import subprocess
+import hashlib
+import io
+
+
+def main():
+ parser = argparse.ArgumentParser(
+ description='Download python binary wheels from release candidate workflow runs.')
+ parser.add_argument('tag', type=str, help='datafusion RC release tag')
+ args = parser.parse_args()
+
+ tag = args.tag
+ ghp_token = os.environ.get("GH_TOKEN")
+ if not ghp_token:
+ print(
+ "ERROR: Personal Github token is required to download workflow artifacts. "
+ "Please specify a token through GH_TOKEN environment variable.")
+ sys.exit(1)
+
+ print(f"Downloading latest python wheels for RC tag {tag}...")
+
+ headers = {
+ "Accept": "application/vnd.github.v3+json",
+ "Authorization": f"token {ghp_token}",
+ }
+ url = f"https://api.github.com/repos/apache/arrow-datafusion/actions/runs?branch={tag}"
+ resp = requests.get(url, headers=headers)
+ resp.raise_for_status()
+
+ artifacts_url = None
+ for run in resp.json()["workflow_runs"]:
+ if run["name"] != "Python Release Build":
+ continue
+ artifacts_url = run["artifacts_url"]
+
+ if artifacts_url is None:
+ print("ERROR: Could not find python wheel binaries from Github Action run")
+ sys.exit(1)
+ print(f"Found artifacts url: {artifacts_url}")
+
+ download_url = None
+ artifacts = requests.get(artifacts_url, headers=headers).json()["artifacts"]
+ for artifact in artifacts:
+ if artifact["name"] != "dist":
+ continue
+ download_url = artifact["archive_download_url"]
+
+ if download_url is None:
+ print(f"ERROR: Could not resolve python wheel download URL from list of artifacts: {artifacts}")
+ sys.exit(1)
+ print(f"Extracting archive from: {download_url}...")
+
+ resp = requests.get(download_url, headers=headers, stream=True)
+ resp.raise_for_status()
+ zf = zipfile.ZipFile(io.BytesIO(resp.content))
+ zf.extractall("./")
+
+ for entry in os.listdir("./"):
+ if entry.endswith(".whl") or entry.endswith(".tar.gz"):
+ print(f"Sign and checksum artifact: {entry}")
+ subprocess.check_output([
+ "gpg", "--armor",
+ "--output", entry+".asc",
+ "--detach-sig", entry,
+ ])
+
+ sha256 = hashlib.sha256()
+ sha512 = hashlib.sha512()
+ with open(entry, "rb") as fd:
+ while True:
+ data = fd.read(65536)
+ if not data:
+ break
+ sha256.update(data)
+ sha512.update(data)
+ with open(entry+".sha256", "w") as fd:
+ fd.write(sha256.hexdigest())
+ fd.write(" ")
+ fd.write(entry)
+ fd.write("\n")
+ with open(entry+".sha512", "w") as fd:
+ fd.write(sha512.hexdigest())
+ fd.write(" ")
+ fd.write(entry)
+ fd.write("\n")
+
+
+if __name__ == "__main__":
+ main()
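The checksum loop in the script above streams each artifact through SHA-256 and SHA-512 in 64 KiB chunks rather than reading the whole file into memory. A minimal standalone sketch of that pattern (the function name and call site are illustrative, not part of this commit):

```python
import hashlib


def file_checksums(path, chunk_size=65536):
    """Stream a file through SHA-256 and SHA-512 without loading it into memory."""
    sha256 = hashlib.sha256()
    sha512 = hashlib.sha512()
    with open(path, "rb") as fd:
        # iter() with a sentinel reads fixed-size chunks until EOF (empty bytes)
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            sha256.update(chunk)
            sha512.update(chunk)
    return sha256.hexdigest(), sha512.hexdigest()
```

Writing the digest and file name into `.sha256`/`.sha512` sidecar files, as the script does, keeps the artifacts verifiable later with the standard `shasum`/`sha256sum` tools.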
diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index a37b6ff..5ac7b23 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -67,9 +67,7 @@ fetch_archive() {
download_rc_file ${dist_name}.tar.gz.asc
download_rc_file ${dist_name}.tar.gz.sha256
download_rc_file ${dist_name}.tar.gz.sha512
- gpg --verify ${dist_name}.tar.gz.asc ${dist_name}.tar.gz
- ${sha256_verify} ${dist_name}.tar.gz.sha256
- ${sha512_verify} ${dist_name}.tar.gz.sha512
+ verify_dir_artifact_signatures
}
verify_dir_artifact_signatures() {
@@ -82,9 +80,7 @@ verify_dir_artifact_signatures() {
# basename of the artifact
pushd $(dirname $artifact)
base_artifact=$(basename $artifact)
- if [ -f $base_artifact.sha256 ]; then
- ${sha256_verify} $base_artifact.sha256 || exit 1
- fi
+ ${sha256_verify} $base_artifact.sha256 || exit 1
${sha512_verify} $base_artifact.sha512 || exit 1
popd
done
@@ -150,7 +146,14 @@ import_gpg_keys
fetch_archive ${dist_name}
tar xf ${dist_name}.tar.gz
pushd ${dist_name}
-test_source_distribution
+ test_source_distribution
+popd
+
+echo "Verifying python artifacts..."
+svn co $ARROW_DIST_URL/apache-arrow-datafusion-${VERSION}-rc${RC_NUMBER}/python python-artifacts
+pushd python-artifacts
+ verify_dir_artifact_signatures
+ twine check *.{whl,tar.gz}
popd
TEST_SUCCESS=yes
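The `${sha256_verify}`/`${sha512_verify}` helpers invoked by `verify_dir_artifact_signatures` are typically `shasum -a 256 -c` style checks against the sidecar files. The equivalent check can be sketched in Python (function name and the `<digest> <name>` sidecar format are assumptions for illustration, not code from this commit):

```python
import hashlib
import os


def verify_sidecar(sidecar_path, algorithm="sha256"):
    """Check an artifact against its '<digest> <name>' checksum sidecar file."""
    with open(sidecar_path) as fd:
        digest, name = fd.read().split()
    hasher = hashlib.new(algorithm)
    # The sidecar names the artifact relative to its own directory
    artifact = os.path.join(os.path.dirname(sidecar_path), name)
    with open(artifact, "rb") as fd:
        for chunk in iter(lambda: fd.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest
```

A mismatched digest returns False, which is the condition the shell script turns into `|| exit 1`.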