Posted to commits@spark.apache.org by do...@apache.org on 2021/07/30 07:05:45 UTC
[spark] branch master updated: [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new abce61f [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
abce61f is described below
commit abce61f3fda73e865a80e9c38bf9ca471a6a5db8
Author: itholic <ha...@databricks.com>
AuthorDate: Fri Jul 30 00:04:48 2021 -0700
[SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
### What changes were proposed in this pull request?
This PR proposes adding the Python packages `mlflow` and `sklearn` to enable the MLflow test in pandas API on Spark.
### Why are the changes needed?
To enable the MLflow test in pandas API on Spark.
### Does this PR introduce _any_ user-facing change?
No, it's test-only.
### How was this patch tested?
Manually tested locally with `python/run-tests --testnames pyspark.pandas.mlflow`.
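Since this change removes the `ImportError` guard from `python/pyspark/pandas/mlflow.py`, the test fails rather than silently passing when a dependency is absent. Below is a minimal sketch of a local pre-check to run before the test command; `missing_modules` is a hypothetical helper for illustration and is not part of Spark or of this patch.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    # find_spec returns None when no importable top-level module has that name.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the PR description; adjust if the test needs more.
    missing = missing_modules(["mlflow", "sklearn"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("mlflow and sklearn were found; the MLflow test can be run.")
```

The check only locates the modules and does not import them, so it stays fast and does not verify the `mlflow>=1.0` version constraint.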
Closes #33567 from itholic/SPARK-36254.
Lead-authored-by: itholic <ha...@databricks.com>
Co-authored-by: Haejoon Lee <44...@users.noreply.github.com>
Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
.github/workflows/build_and_test.yml | 2 ++
dev/requirements.txt | 3 ++-
python/pyspark/pandas/mlflow.py | 8 +-------
3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index f3a6363..17908ff 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -252,6 +252,8 @@ jobs:
# Run the tests.
- name: Run tests
run: |
+ # TODO(SPARK-36345): Install mlflow>=1.0 and sklearn in Python 3.9 of the base image
+ python3.9 -m pip install 'mlflow>=1.0' sklearn
export PATH=$PATH:$HOME/miniconda/bin
./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST"
- name: Upload test results to report
diff --git a/dev/requirements.txt b/dev/requirements.txt
index f5d662b..34f4b88 100644
--- a/dev/requirements.txt
+++ b/dev/requirements.txt
@@ -7,7 +7,8 @@ pyarrow
pandas
scipy
plotly
-mlflow
+mlflow>=1.0
+sklearn
matplotlib<3.3.0
# PySpark test dependencies
diff --git a/python/pyspark/pandas/mlflow.py b/python/pyspark/pandas/mlflow.py
index 719db40..4e48369 100644
--- a/python/pyspark/pandas/mlflow.py
+++ b/python/pyspark/pandas/mlflow.py
@@ -229,10 +229,4 @@ def _test() -> None:
if __name__ == "__main__":
- try:
- import mlflow # noqa: F401
- import sklearn # noqa: F401
-
- _test()
- except ImportError:
- pass
+ _test()