Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/05/20 08:11:37 UTC
[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #10150: [Feature][MLops] Support MLflow Models to deploy model service (MLflow models serve and Docker)

zhongjiajie commented on code in PR #10150:
URL: https://github.com/apache/dolphinscheduler/pull/10150#discussion_r877861232


##########
docs/docs/en/guide/task/mlflow.md:
##########
@@ -5,77 +5,140 @@
 [MLflow](https://mlflow.org) is an excellent open source platform to manage the ML lifecycle, including experimentation,
 reproducibility, deployment, and a central model registry.
 
-Mlflow task is used to perform mlflow project tasks, which include basic algorithmic and autoML capabilities (
-User-defined MLFlow project task execution will be supported in the near future)
+MLflow task plugin used to execute MLflow tasks，Currently contains Mlflow Projects and MLflow Models.（Model Registry will soon be rewarded for support）
+
+- Mlflow Projects: Package data science code in a format to reproduce runs on any platform.
+- MLflow Models: Deploy machine learning models in diverse serving environments.
+- Model Registry: Store, annotate, discover, and manage models in a central repository.
+
+The Mlflow plugin currently supports and will support the following:
+
+- [ ] MLflow Projects
+    - [x] BasicAlgorithm: contains lr, svm, lightgbm, xgboost
+    - [x] AutoML: AutoML tool，contains autosklean, flaml
+    - [ ] Custom projects: Support for running your own MLflow projects
+- [ ] MLflow Models
+    - [x] MLFLOW: Use `MLflow models serve` to deploy a model service
+    - [x] Docker: Run the container after packaging the docker image
+    - [ ] Docker Compose: Use docker compose to run the container, Will replace the docker run above
+    - [ ] Seldon core: Use Selcon core to deploy model to k8s cluster
+    - [ ] k8s: Deploy containers directly to K8S 
+    - [ ] mlflow deployments: Built-in deployment modules, such as built-in deployment to SageMaker, etc
+- [ ] Model Registry
+    - [ ] Register Model: Allows artifacts (Including model and related parameters, indicators) to be registered directly into the model center

Review Comment:
   This document is for users, so I think we should not list features that we do not support yet; that information is for developers or contributors. Maybe we should list only what is currently supported here, but I do not have a strong opinion on this.



##########
docs/docs/en/guide/task/mlflow.md:
##########
@@ -5,77 +5,140 @@
 [MLflow](https://mlflow.org) is an excellent open source platform to manage the ML lifecycle, including experimentation,
 reproducibility, deployment, and a central model registry.
 
-Mlflow task is used to perform mlflow project tasks, which include basic algorithmic and autoML capabilities (
-User-defined MLFlow project task execution will be supported in the near future)
+MLflow task plugin used to execute MLflow tasks，Currently contains Mlflow Projects and MLflow Models.（Model Registry will soon be rewarded for support）
+
+- Mlflow Projects: Package data science code in a format to reproduce runs on any platform.
+- MLflow Models: Deploy machine learning models in diverse serving environments.
+- Model Registry: Store, annotate, discover, and manage models in a central repository.
+
+The Mlflow plugin currently supports and will support the following:
+
+- [ ] MLflow Projects
+    - [x] BasicAlgorithm: contains lr, svm, lightgbm, xgboost
+    - [x] AutoML: AutoML tool，contains autosklean, flaml
+    - [ ] Custom projects: Support for running your own MLflow projects
+- [ ] MLflow Models
+    - [x] MLFLOW: Use `MLflow models serve` to deploy a model service
+    - [x] Docker: Run the container after packaging the docker image
+    - [ ] Docker Compose: Use docker compose to run the container, Will replace the docker run above
+    - [ ] Seldon core: Use Selcon core to deploy model to k8s cluster
+    - [ ] k8s: Deploy containers directly to K8S 
+    - [ ] mlflow deployments: Built-in deployment modules, such as built-in deployment to SageMaker, etc
+- [ ] Model Registry
+    - [ ] Register Model: Allows artifacts (Including model and related parameters, indicators) to be registered directly into the model center
+
+
 
 ## Create Task
 
 - Click `Project -> Management-Project -> Name-Workflow Definition`, and click the "Create Workflow" button to enter the
   DAG editing page.
 - Drag from the toolbar <img src="/img/tasks/icons/mlflow.png" width="15"/> task node to canvas.
 
-## Task Parameter
-
-- DolphinScheduler common parameters
-    - **Node name**: The node name in a workflow definition is unique.
-    - **Run flag**: Identifies whether this node schedules normally, if it does not need to execute, select
-      the `prohibition execution`.
-    - **Descriptive information**: Describe the function of the node.
-    - **Task priority**: When the number of worker threads is insufficient, execute in the order of priority from high
-      to low, and tasks with the same priority will execute in a first-in first-out order.
-    - **Worker grouping**: Assign tasks to the machines of the worker group to execute. If `Default` is selected,
-      randomly select a worker machine for execution.
-    - **Environment Name**: Configure the environment name in which run the script.
-    - **Times of failed retry attempts**: The number of times the task failed to resubmit.
-    - **Failed retry interval**: The time interval (unit minute) for resubmitting the task after a failed task.
-    - **Delayed execution time**: The time (unit minute) that a task delays in execution.
-    - **Timeout alarm**: Check the timeout alarm and timeout failure. When the task runs exceed the "timeout", an alarm
-      email will send and the task execution will fail.
-    - **Custom parameter**: It is a local user-defined parameter for mlflow, and will replace the content
-      with `${variable}` in the script.
-    - **Predecessor task**: Selecting a predecessor task for the current task, will set the selected predecessor task as
-      upstream of the current task.
-
-- MLflow task specific parameters
-    - **mlflow server tracking uri** ：MLflow server uri, default http://localhost:5000.
-    - **experiment name** ：The experiment in which the task is running, if none, is created.
-    - **register model** ：Register the model or not. If register is selected, the following parameters are expanded.
-        - **model name** : The registered model name is added to the original model version and registered as
-          Production.
-    - **job type** : The type of task to run, currently including the underlying algorithm and AutoML. (User-defined
-      MLFlow project task execution will be supported in the near future)
-        - BasicAlgorithm specific parameters
-            - **algorithm** ：The selected algorithm currently supports `LR`, `SVM`, `LightGBM` and `XGboost` based
-              on [scikit-learn](https://scikit-learn.org/) form.
-            - **Parameter search space** : Parameter search space when running the corresponding algorithm, which can be
-              empty. For example, the parameter `max_depth=[5, 10];n_estimators=[100, 200]` for lightgbm 。The convention
-              will be passed with '; 'shards each parameter, using the name before the equal sign as the parameter name,
-              and using the name after the equal sign to get the corresponding parameter value through `python eval()`.
-        - AutoML specific parameters
-            - **AutoML tool** : The AutoML tool used, currently
-              supports [autosklearn](https://github.com/automl/auto-sklearn)
-              and [flaml](https://github.com/microsoft/FLAML)
-        - Parameters common to BasicAlgorithm and AutoML
-        - **data path** : The absolute path of the file or folder. Ends with .csv for file or contain train.csv and
-          test.csv for folder（In the suggested way, users should build their own test sets for model evaluation）。
-        - **parameters** : Parameter when initializing the algorithm/AutoML model, which can be empty. For example
-          parameters `"time_budget=30;estimator_list=['lgbm']"` for flaml 。The convention will be passed with '; 'shards
-          each parameter, using the name before the equal sign as the parameter name, and using the name after the equal
-          sign to get the corresponding parameter value through `python eval()`.
-            - BasicAlgorithm
-                - [lr](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression)
-                - [SVM](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html?highlight=svc#sklearn.svm.SVC)
-                - [lightgbm](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMClassifier.html#lightgbm.LGBMClassifier)
-                - [xgboost](https://xgboost.readthedocs.io/en/stable/python/python_api.html#xgboost.XGBClassifier)
-            - AutoML
-                - [flaml](https://microsoft.github.io/FLAML/docs/reference/automl#automl-objects)
-                - [autosklearn](https://automl.github.io/auto-sklearn/master/api.html)
-
 ## Task Example
 
-### Preparation
+First, introduce some general parameters of DolphinScheduler
+
+- **Node name**: The node name in a workflow definition is unique.
+- **Run flag**: Identifies whether this node schedules normally, if it does not need to execute, select
+  the `prohibition execution`.
+- **Descriptive information**: Describe the function of the node.
+- **Task priority**: When the number of worker threads is insufficient, execute in the order of priority from high
+  to low, and tasks with the same priority will execute in a first-in first-out order.
+- **Worker grouping**: Assign tasks to the machines of the worker group to execute. If `Default` is selected,
+  randomly select a worker machine for execution.
+- **Environment Name**: Configure the environment name in which run the script.
+- **Times of failed retry attempts**: The number of times the task failed to resubmit.
+- **Failed retry interval**: The time interval (unit minute) for resubmitting the task after a failed task.
+- **Delayed execution time**: The time (unit minute) that a task delays in execution.
+- **Timeout alarm**: Check the timeout alarm and timeout failure. When the task runs exceed the "timeout", an alarm
+  email will send and the task execution will fail.
+- **Predecessor task**: Selecting a predecessor task for the current task, will set the selected predecessor task as
+  upstream of the current task.
+
+### MLflow Projects
+
+#### BasicAlgorithm
+
+![mlflow-conda-env](/img/tasks/demo/mlflow-basic-algorithm.png)
+
+**Task Parameter**
+
+- **mlflow server tracking uri** ：MLflow server uri, default http://localhost:5000.
+- **job type** : The type of task to run, currently including the underlying algorithm and AutoML. (User-defined
+  MLFlow project task execution will be supported in the near future)
+- **experiment name** ：The experiment in which the task is running, if none, is created.
+- **register model** ：Register the model or not. If register is selected, the following parameters are expanded.
+    - **model name** : The registered model name is added to the original model version and registered as
+      Production.
+- **data path** : The absolute path of the file or folder. Ends with .csv for file or contain train.csv and
+  test.csv for folder（In the suggested way, users should build their own test sets for model evaluation）。
+- **parameters** : Parameter when initializing the algorithm/AutoML model, which can be empty. For example
+  parameters `"time_budget=30;estimator_list=['lgbm']"` for flaml 。The convention will be passed with '; 'shards
+  each parameter, using the name before the equal sign as the parameter name, and using the name after the equal
+  sign to get the corresponding parameter value through `python eval()`.
+    - [lr](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression)

Review Comment:
   I would avoid abbreviations in the document; I do not think every reader will know what they mean.
   ```suggestion
       - [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression)
   ```
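
For context, the `parameters` convention quoted in the diff above (entries separated by `;`, the text before the equal sign used as the parameter name, the text after it evaluated as a Python value) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the plugin's actual implementation: `parse_params` is a hypothetical helper name, and `ast.literal_eval` is swapped in for the `eval()` mentioned in the document because it accepts only Python literals.

```python
import ast


def parse_params(spec: str) -> dict:
    """Parse a 'name=value;name=value' string into a dict (hypothetical helper)."""
    params = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate an empty spec or a trailing ';'
        # Split on the first '=' only, so the value may itself contain '='.
        name, _, raw_value = item.partition("=")
        # Assumption: literal_eval stands in for the eval() described in the doc.
        params[name.strip()] = ast.literal_eval(raw_value.strip())
    return params


# The two example strings that appear in the quoted documentation.
flaml_params = parse_params("time_budget=30;estimator_list=['lgbm']")
search_space = parse_params("max_depth=[5, 10];n_estimators=[100, 200]")
```

Note that a plain split on `;` would mis-parse any value that itself contains a semicolon; the examples in the quoted documentation do not, so the sketch does not handle that case.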


