Posted to commits@arrow.apache.org by ag...@apache.org on 2022/10/25 19:44:03 UTC

[arrow-datafusion-python] branch master updated: [CI] - Add Release Audit Tool(RAT) in CI (#63)

This is an automated email from the ASF dual-hosted git repository.

agrove pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion-python.git


The following commit(s) were added to refs/heads/master by this push:
     new 8c4877b  [CI] - Add Release Audit Tool(RAT) in CI (#63)
8c4877b is described below

commit 8c4877bb5bd53f1b9d8c0cdea619e35a14069e8f
Author: Francis Du <me...@francis.run>
AuthorDate: Wed Oct 26 03:43:58 2022 +0800

    [CI] - Add Release Audit Tool(RAT) in CI (#63)
    
    * add rat ci
    
    * revert files
    
    * fix lint
---
 .github/ISSUE_TEMPLATE/bug_report.md               | 20 +++++++
 .github/ISSUE_TEMPLATE/feature_request.md          | 21 ++++++++
 .github/pull_request_template.md                   | 27 ++++++++++
 .github/workflows/dev.yml                          | 34 ++++++++++++
 .gitignore                                         |  3 ++
 ci/scripts/rust_clippy.sh                          | 21 ++++++++
 ci/scripts/rust_fmt.sh                             | 21 ++++++++
 ci/scripts/rust_toml_fmt.sh                        | 21 ++++++++
 datafusion/tests/conftest.py                       | 17 ++++++
 dev/build-set-env.sh                               | 20 +++++++
 dev/release/check-rat-report.py                    | 62 ++++++++++++++++++++++
 dev/release/rat_exclude_files.txt                  | 44 +++++++++++++++
 dev/release/run-rat.sh                             | 43 +++++++++++++++
 dev/rust_lint.sh                                   | 31 +++++++++++
 .../python/generated/datafusion.DataFrame.rst      | 17 ++++++
 .../python/generated/datafusion.Expression.rst     | 17 ++++++
 .../python/generated/datafusion.SessionContext.rst | 17 ++++++
 .../python/generated/datafusion.functions.rst      | 19 ++++++-
 18 files changed, 454 insertions(+), 1 deletion(-)

diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
new file mode 100644
index 0000000..5600dab
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,20 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Additional context**
+Add any other context about the problem here.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
new file mode 100644
index 0000000..d9883dd
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,21 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: enhancement
+assignees: ''
+
+---
+
+**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+(This section helps Arrow developers understand the context and *why* for this feature, in addition to the *what*)
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
new file mode 100644
index 0000000..18b9094
--- /dev/null
+++ b/.github/pull_request_template.md
@@ -0,0 +1,27 @@
+# Which issue does this PR close?
+
+<!--
+We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
+-->
+
+Closes #.
+
+# Rationale for this change
+<!--
+ Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
+ Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
+-->
+
+# What changes are included in this PR?
+<!--
+There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
+-->
+
+# Are there any user-facing changes?
+<!--
+If there are user-facing changes then we may require documentation to be updated before approving the PR.
+-->
+
+<!--
+If there are any breaking changes to public APIs, please add the `api change` label.
+-->
\ No newline at end of file
diff --git a/.github/workflows/dev.yml b/.github/workflows/dev.yml
new file mode 100644
index 0000000..05cf8ce
--- /dev/null
+++ b/.github/workflows/dev.yml
@@ -0,0 +1,34 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+name: Dev
+on: [push, pull_request]
+
+jobs:
+
+  rat:
+    name: Release Audit Tool (RAT)
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.10"
+      - name: Audit licenses
+        run: ./dev/release/run-rat.sh .
diff --git a/.gitignore b/.gitignore
index 2e03daf..cbd980e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,3 +16,6 @@ __pycache__/
 #   intended to run in multiple environments; otherwise, check them in:
 .python-version
 venv
+
+apache-rat-*.jar
+*rat.txt
diff --git a/ci/scripts/rust_clippy.sh b/ci/scripts/rust_clippy.sh
new file mode 100755
index 0000000..911330c
--- /dev/null
+++ b/ci/scripts/rust_clippy.sh
@@ -0,0 +1,21 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+cargo clippy --all-targets --workspace --features default -- -D warnings
diff --git a/ci/scripts/rust_fmt.sh b/ci/scripts/rust_fmt.sh
new file mode 100755
index 0000000..9d83258
--- /dev/null
+++ b/ci/scripts/rust_fmt.sh
@@ -0,0 +1,21 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+cargo fmt --all -- --check
diff --git a/ci/scripts/rust_toml_fmt.sh b/ci/scripts/rust_toml_fmt.sh
new file mode 100755
index 0000000..e297ef0
--- /dev/null
+++ b/ci/scripts/rust_toml_fmt.sh
@@ -0,0 +1,21 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -ex
+find . -mindepth 2 -name 'Cargo.toml' -exec cargo tomlfmt -p {} \;
diff --git a/datafusion/tests/conftest.py b/datafusion/tests/conftest.py
index 93662a0..a4eec41 100644
--- a/datafusion/tests/conftest.py
+++ b/datafusion/tests/conftest.py
@@ -1,3 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
 import pytest
 from datafusion import SessionContext
 import pyarrow as pa
diff --git a/dev/build-set-env.sh b/dev/build-set-env.sh
new file mode 100755
index 0000000..1d98471
--- /dev/null
+++ b/dev/build-set-env.sh
@@ -0,0 +1,20 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+export PY_DATAFUSION_VERSION=$(awk -F'[ ="]+' '$1 == "version" { print $2 }' Cargo.toml)
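
Note on the awk one-liner above: it splits each line of Cargo.toml on runs of spaces, `=` and `"`, and prints the second field of every line whose first field is exactly `version`. The sketch below mirrors that splitting in Python to show which lines it picks up. The sample manifest, the crate name, and the version numbers are hypothetical and not taken from this repository.

```python
import re

# Hypothetical manifest used only for illustration.
SAMPLE_CARGO_TOML = """\
[package]
name = "example-crate"
version = "0.7.0"

[dependencies]
tokio = { version = "1.0", features = ["macros"] }
"""


def versions_like_awk(cargo_toml_text):
    """Mimic: awk -F'[ ="]+' '$1 == "version" { print $2 }'."""
    found = []
    for line in cargo_toml_text.splitlines():
        # Same separator set as the awk field separator.
        fields = re.split(r'[ ="]+', line)
        if len(fields) > 1 and fields[0] == "version":
            found.append(fields[1])
    return found


print(versions_like_awk(SAMPLE_CARGO_TOML))
```

Inline dependency tables such as the `tokio` line are skipped because their first field is the dependency name. A dependency written as its own table with a standalone `version = "..."` line would also be matched, so the one-liner relies on the manifest not using that layout.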
diff --git a/dev/release/check-rat-report.py b/dev/release/check-rat-report.py
new file mode 100644
index 0000000..30a0111
--- /dev/null
+++ b/dev/release/check-rat-report.py
@@ -0,0 +1,62 @@
+#!/usr/bin/python
+##############################################################################
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+##############################################################################
+import fnmatch
+import re
+import sys
+import xml.etree.ElementTree as ET
+
+if len(sys.argv) != 3:
+    sys.stderr.write(
+        "Usage: %s exclude_globs.lst rat_report.xml\n" % sys.argv[0]
+    )
+    sys.exit(1)
+
+exclude_globs_filename = sys.argv[1]
+xml_filename = sys.argv[2]
+
+globs = [line.strip() for line in open(exclude_globs_filename, "r")]
+
+tree = ET.parse(xml_filename)
+root = tree.getroot()
+resources = root.findall("resource")
+
+all_ok = True
+for r in resources:
+    approvals = r.findall("license-approval")
+    if not approvals or approvals[0].attrib["name"] == "true":
+        continue
+    clean_name = re.sub("^[^/]+/", "", r.attrib["name"])
+    excluded = False
+    for g in globs:
+        if fnmatch.fnmatch(clean_name, g):
+            excluded = True
+            break
+    if not excluded:
+        sys.stdout.write(
+            "NOT APPROVED: %s (%s): %s\n"
+            % (clean_name, r.attrib["name"], approvals[0].attrib["name"])
+        )
+        all_ok = False
+
+if not all_ok:
+    sys.exit(1)
+
+print("OK")
+sys.exit(0)
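
Note on check-rat-report.py above: the filtering it performs can be exercised without running RAT. The sketch below wraps the same logic in a function and feeds it a small hand-written report. The sample XML is an assumption that contains only the elements and attributes the script reads (`resource`, `license-approval`, `name`); real RAT output carries more, and the `rat-report` root name and the file names are illustrative. The function name `unapproved_resources` is hypothetical.

```python
import fnmatch
import re
import xml.etree.ElementTree as ET

# Hand-written sample; not real RAT output.
SAMPLE_REPORT = """\
<rat-report>
  <resource name="./src/lib.rs"><license-approval name="true"/></resource>
  <resource name="./data/example.csv"><license-approval name="false"/></resource>
  <resource name="./docs/guide.md"><license-approval name="false"/></resource>
  <resource name="./image.bin"/>
</rat-report>
"""


def unapproved_resources(report_xml, exclude_globs):
    """Return cleaned names of resources that are unapproved and not excluded."""
    root = ET.fromstring(report_xml)
    failures = []
    for resource in root.findall("resource"):
        approvals = resource.findall("license-approval")
        # Resources with no approval element, or an approved one, pass.
        if not approvals or approvals[0].attrib["name"] == "true":
            continue
        # Drop the leading path component, as the script does.
        clean_name = re.sub("^[^/]+/", "", resource.attrib["name"])
        if not any(fnmatch.fnmatch(clean_name, g) for g in exclude_globs):
            failures.append(clean_name)
    return failures


print(unapproved_resources(SAMPLE_REPORT, ["*.csv", "**/*.json"]))
```

With these inputs only `docs/guide.md` is reported: the CSV file is unapproved but matches `*.csv`, and the resource without a `license-approval` element is skipped, which matches the `if not approvals` branch in the script.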
diff --git a/dev/release/rat_exclude_files.txt b/dev/release/rat_exclude_files.txt
new file mode 100644
index 0000000..a84ed5d
--- /dev/null
+++ b/dev/release/rat_exclude_files.txt
@@ -0,0 +1,44 @@
+*.npmrc
+*.gitignore
+*.dockerignore
+.gitmodules
+*_generated.h
+*_generated.js
+*_generated.ts
+*.csv
+*.json
+*.snap
+.github/ISSUE_TEMPLATE/*.md
+.github/pull_request_template.md
+CHANGELOG.md
+dev/release/rat_exclude_files.txt
+MANIFEST.in
+__init__.pxd
+__init__.py
+*.html
+*.sgml
+*.css
+*.png
+*.ico
+*.svg
+*.devhelp2
+*.scss
+.gitattributes
+requirements.txt
+*requirements*.txt
+**/testdata/*
+ci/*
+**/*.svg
+**/*.csv
+**/*.json
+**/*.sql
+venv/*
+parquet/*
+testing/*
+target/*
+**/target/*
+Cargo.lock
+**/Cargo.lock
+.history
+*rat.txt
+*/.git
\ No newline at end of file
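
Note on the exclude list above: check-rat-report.py matches these patterns with Python's `fnmatch`, where `*` also matches `/`. The sketch below shows what that means for a few of the patterns. The file paths are hypothetical, and the helper name `matching_paths` is illustrative.

```python
import fnmatch

# Hypothetical paths, already stripped of the leading "./" component.
PATHS = [
    "data.csv",
    "examples/data.csv",
    "ci/scripts/rust_fmt.sh",
    "src/target/debug/out",
]


def matching_paths(pattern, paths=PATHS):
    """Return the paths that fnmatch considers a match for pattern."""
    return [p for p in paths if fnmatch.fnmatch(p, pattern)]


for pattern in ["*.csv", "**/*.csv", "ci/*", "target/*", "**/target/*"]:
    print(pattern, "->", matching_paths(pattern))
```

Under these semantics `*.csv` already covers nested CSV files, so the separate `**/*.csv` entry is redundant for this script, and `ci/*` excludes everything below `ci/`, not only its direct children. `target/*` only matches a top-level `target` directory, which is why `**/target/*` is listed as well.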
diff --git a/dev/release/run-rat.sh b/dev/release/run-rat.sh
new file mode 100755
index 0000000..94fa55f
--- /dev/null
+++ b/dev/release/run-rat.sh
@@ -0,0 +1,43 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+RAT_VERSION=0.13
+
+# download apache rat
+if [ ! -f apache-rat-${RAT_VERSION}.jar ]; then
+  curl -s https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${RAT_VERSION}/apache-rat-${RAT_VERSION}.jar > apache-rat-${RAT_VERSION}.jar
+fi
+
+RAT="java -jar apache-rat-${RAT_VERSION}.jar -x "
+
+RELEASE_DIR=$(cd "$(dirname "$BASH_SOURCE")"; pwd)
+
+# generate the rat report
+$RAT "$1" > rat.txt
+python "$RELEASE_DIR/check-rat-report.py" "$RELEASE_DIR/rat_exclude_files.txt" rat.txt > filtered_rat.txt
+cat filtered_rat.txt
+UNAPPROVED=$(grep -c "NOT APPROVED" filtered_rat.txt)
+
+if [ "0" -eq "${UNAPPROVED}" ]; then
+  echo "No unapproved licenses"
+else
+  echo "${UNAPPROVED} unapproved licenses. Check rat report: rat.txt"
+  exit 1
+fi
diff --git a/dev/rust_lint.sh b/dev/rust_lint.sh
new file mode 100755
index 0000000..b1285cb
--- /dev/null
+++ b/dev/rust_lint.sh
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# This script runs all the Rust lints locally the same way the
+# DataFusion CI does
+
+set -e
+if ! command -v cargo-tomlfmt &> /dev/null; then
+    echo "Installing cargo-tomlfmt using cargo"
+    cargo install cargo-tomlfmt
+fi
+
+ci/scripts/rust_fmt.sh
+ci/scripts/rust_clippy.sh
+ci/scripts/rust_toml_fmt.sh
diff --git a/docs/source/python/generated/datafusion.DataFrame.rst b/docs/source/python/generated/datafusion.DataFrame.rst
index 365f593..ffee788 100644
--- a/docs/source/python/generated/datafusion.DataFrame.rst
+++ b/docs/source/python/generated/datafusion.DataFrame.rst
@@ -1,3 +1,20 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
 datafusion.DataFrame
 ====================
 
diff --git a/docs/source/python/generated/datafusion.Expression.rst b/docs/source/python/generated/datafusion.Expression.rst
index 427fed0..58a5d04 100644
--- a/docs/source/python/generated/datafusion.Expression.rst
+++ b/docs/source/python/generated/datafusion.Expression.rst
@@ -1,3 +1,20 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
 datafusion.Expression
 =====================
 
diff --git a/docs/source/python/generated/datafusion.SessionContext.rst b/docs/source/python/generated/datafusion.SessionContext.rst
index 86b942f..137e231 100644
--- a/docs/source/python/generated/datafusion.SessionContext.rst
+++ b/docs/source/python/generated/datafusion.SessionContext.rst
@@ -1,3 +1,20 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
 datafusion.SessionContext
 =========================
 
diff --git a/docs/source/python/generated/datafusion.functions.rst b/docs/source/python/generated/datafusion.functions.rst
index 4bac3c3..d00e2b4 100644
--- a/docs/source/python/generated/datafusion.functions.rst
+++ b/docs/source/python/generated/datafusion.functions.rst
@@ -1,4 +1,21 @@
-datafusion.functions
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+datafusion.functions
 ====================
 
 .. automodule:: datafusion.functions