Posted to commits@spark.apache.org by do...@apache.org on 2021/07/30 07:05:45 UTC

[spark] branch master updated: [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new abce61f  [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
abce61f is described below

commit abce61f3fda73e865a80e9c38bf9ca471a6a5db8
Author: itholic <ha...@databricks.com>
AuthorDate: Fri Jul 30 00:04:48 2021 -0700

    [SPARK-36254][INFRA][PYTHON] Install mlflow in Github Actions CI
    
    ### What changes were proposed in this pull request?
    
    This PR proposes adding two Python packages, `mlflow` and `sklearn`, to enable the MLflow test in pandas API on Spark.
    
    ### Why are the changes needed?
    
    To enable the MLflow test in pandas API on Spark.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, it's test-only.
    
    ### How was this patch tested?
    
    Manually tested locally with `python/run-tests --testnames pyspark.pandas.mlflow`.
    
    Closes #33567 from itholic/SPARK-36254.
    
    Lead-authored-by: itholic <ha...@databricks.com>
    Co-authored-by: Haejoon Lee <44...@users.noreply.github.com>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 .github/workflows/build_and_test.yml | 2 ++
 dev/requirements.txt                 | 3 ++-
 python/pyspark/pandas/mlflow.py      | 8 +-------
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index f3a6363..17908ff 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -252,6 +252,8 @@ jobs:
     # Run the tests.
     - name: Run tests
       run: |
+        # TODO(SPARK-36345): Install mlflow>=1.0 and sklearn in Python 3.9 of the base image
+        python3.9 -m pip install 'mlflow>=1.0' sklearn
         export PATH=$PATH:$HOME/miniconda/bin
         ./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST"
     - name: Upload test results to report
diff --git a/dev/requirements.txt b/dev/requirements.txt
index f5d662b..34f4b88 100644
--- a/dev/requirements.txt
+++ b/dev/requirements.txt
@@ -7,7 +7,8 @@ pyarrow
 pandas
 scipy
 plotly
-mlflow
+mlflow>=1.0
+sklearn
 matplotlib<3.3.0
 
 # PySpark test dependencies
diff --git a/python/pyspark/pandas/mlflow.py b/python/pyspark/pandas/mlflow.py
index 719db40..4e48369 100644
--- a/python/pyspark/pandas/mlflow.py
+++ b/python/pyspark/pandas/mlflow.py
@@ -229,10 +229,4 @@ def _test() -> None:
 
 
 if __name__ == "__main__":
-    try:
-        import mlflow  # noqa: F401
-        import sklearn  # noqa: F401
-
-        _test()
-    except ImportError:
-        pass
+    _test()
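
The last hunk above removes the `try`/`except ImportError` guard, so `_test()` now runs unconditionally instead of being skipped silently when `mlflow` or `sklearn` is not installed. As a rough sketch of how one might check for these packages before running the test locally, the snippet below uses the standard-library `importlib.util.find_spec`. The helper name `missing_modules` is hypothetical and is not part of Spark or of this commit.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "mlflow" and "sklearn" are the import names the removed guard checked for.
    missing = missing_modules(["mlflow", "sklearn"])
    if missing:
        print("Missing test dependencies: " + ", ".join(missing))
    else:
        print("All test dependencies are importable.")
```

Unlike the removed guard, a check of this kind reports which package is absent, so a missing dependency is visible rather than hidden behind a skipped test.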
