You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@dolphinscheduler.apache.org by GitBox <gi...@apache.org> on 2022/08/09 03:25:12 UTC

[GitHub] [dolphinscheduler] zhongjiajie commented on a diff in pull request #11360: [DSIP-13][Feature]local_resource_plugin

zhongjiajie commented on code in PR #11360:
URL: https://github.com/apache/dolphinscheduler/pull/11360#discussion_r940830292


##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/process_definition.py:
##########
@@ -21,13 +21,15 @@
 from datetime import datetime
 from typing import Any, Dict, List, Optional, Set
 
+
 from pydolphinscheduler import configuration
 from pydolphinscheduler.constants import TaskType
 from pydolphinscheduler.core.resource import Resource
 from pydolphinscheduler.exceptions import PyDSParamException, PyDSTaskNoFoundException
 from pydolphinscheduler.java_gateway import JavaGate
 from pydolphinscheduler.models import Base, Project, Tenant, User
 from pydolphinscheduler.utils.date import MAX_DATETIME, conv_from_str, conv_to_schedule
+from pydolphinscheduler.resources_plugin.__init__ import ResourcePlugin

Review Comment:
   Not sure it work or not, but if we can import ResourcePlugin from module `resources_plugin` will be better
   ```suggestion
   from pydolphinscheduler.resources_plugin import ResourcePlugin
   ```



##########
dolphinscheduler-master/src/main/resources/logback-spring.xml:
##########
@@ -66,11 +66,12 @@
     </appender>
 
     <root level="INFO">
-        <if condition="${DOCKER:-false}">

Review Comment:
   revert



##########
dolphinscheduler-common/src/main/resources/common.properties:
##########
@@ -57,7 +57,6 @@ login.user.keytab.path=/opt/hdfs.headless.keytab
 # kerberos expire time, the unit is hour
 kerberos.expire.time=2
 
-

Review Comment:
   revert this change too



##########
dolphinscheduler-api/src/main/resources/logback-spring.xml:
##########
@@ -53,11 +53,12 @@
     <logger name="org.apache.hadoop" level="WARN"/>
 
     <root level="INFO">
-        <if condition="${DOCKER:-false}">
-            <then>
-                <appender-ref ref="STDOUT"/>
-            </then>
-        </if>
+<!--        <if condition="${DOCKER:-false}">-->
+<!--            <then>-->
+<!--                <appender-ref ref="STDOUT"/>-->
+<!--            </then>-->
+<!--        </if>-->
+        <appender-ref ref="STDOUT"/>

Review Comment:
   Please revert unnecessary changed



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/process_definition.py:
##########
@@ -21,13 +21,15 @@
 from datetime import datetime
 from typing import Any, Dict, List, Optional, Set
 
+

Review Comment:
   revert



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/constants.py:
##########
@@ -103,7 +103,21 @@ class Time(str):
     FMT_NO_COLON_TIME = "%H%M%S"
 
 
+# [start res_plugin_constants_definition]
+class ResourcePluginType(str):
+    """Constants for resources plugin type, it will also show you which kind we support up to now."""
+    LOCAL = "local"
+    GITHUB = "github"
+    GITLAB = "gitlab"
+    S3 = "S3"
+    OSS = "OSS"
+
+
+# [end res_plugin_constants_definition]
+
+
 class ResourceKey(str):
     """Constants for key of resource."""
 
     ID = "id"
+

Review Comment:
   revert



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/task.py:
##########
@@ -64,10 +66,10 @@ class TaskRelation(Base):
     }
 
     def __init__(
-        self,
-        pre_task_code: int,
-        post_task_code: int,
-        name: Optional[str] = None,
+            self,
+            pre_task_code: int,
+            post_task_code: int,
+            name: Optional[str] = None,

Review Comment:
   This change will fail the `black` code style



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py:
##########
@@ -37,6 +37,8 @@
 
 # Import task Shell object cause we would create some shell tasks later
 from pydolphinscheduler.tasks.shell import Shell
+from pydolphinscheduler.constants import ResourcePluginType
+from pydolphinscheduler.resources_plugin.__init__ import ResourcePlugin

Review Comment:
   maybe we should add a new file to tell users how to use resources in module `example`, in `tutorial` we need to out-of-box run without changing anything. With resource we have to add a new file for it



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/task.py:
##########
@@ -28,6 +28,8 @@
     TaskPriority,
     TaskTimeoutFlag,
 )
+from pydolphinscheduler.exceptions import PyResPluginException
+from pydolphinscheduler.resources_plugin.__init__ import ResourcePlugin

Review Comment:
   ```suggestion
   from pydolphinscheduler.resources_plugin import ResourcePlugin
   ```



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/constants.py:
##########
@@ -103,7 +103,21 @@ class Time(str):
     FMT_NO_COLON_TIME = "%H%M%S"
 
 
+# [start res_plugin_constants_definition]
+class ResourcePluginType(str):
+    """Constants for resources plugin type, it will also show you which kind we support up to now."""
+    LOCAL = "local"
+    GITHUB = "github"
+    GITLAB = "gitlab"
+    S3 = "S3"
+    OSS = "OSS"

Review Comment:
   Only keep `LOCAL = "local"` is enough for this PR
   ```suggestion
   ```



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/resources_plugin/github.py:
##########
@@ -0,0 +1,6 @@
+
+class Github:
+    def __init__(self, path: str):
+        self.path = path
+

Review Comment:
   your PR title is **local resource plugin**, so please do not add file `github` `gitlab` in this PR



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/resources_plugin/__init__.py:
##########
@@ -0,0 +1,37 @@
+import importlib
+import importlib.util
+
+from pathlib import Path
+from typing import Any, Generator
+
+from pydolphinscheduler.exceptions import PyDSConfException
+
+path_resources_plugin = Path(__file__).parent
+
+
+# [start resource_plugin_definition]
+class ResourcePlugin:
+    def __init__(self, type: str, prefix: str):

Review Comment:
   Please add docstring for both class and class init parameter



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/constants.py:
##########
@@ -103,7 +103,21 @@ class Time(str):
     FMT_NO_COLON_TIME = "%H%M%S"
 
 
+# [start res_plugin_constants_definition]
+class ResourcePluginType(str):
+    """Constants for resources plugin type, it will also show you which kind we support up to now."""
+    LOCAL = "local"
+    GITHUB = "github"
+    GITLAB = "gitlab"
+    S3 = "S3"
+    OSS = "OSS"
+
+

Review Comment:
   ```suggestion
   ```



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/index.rst:
##########
@@ -0,0 +1,31 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+Resources_plugin
+================
+
+In this section
+
+.. toctree::
+   :maxdepth: 1
+
+   how-to-use
+   how-to-develop
+   resource-plugin
+   local
+
+

Review Comment:
   ```suggestion
   ```



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/resources_plugin/__init__.py:
##########
@@ -0,0 +1,37 @@
+import importlib
+import importlib.util
+
+from pathlib import Path
+from typing import Any, Generator
+
+from pydolphinscheduler.exceptions import PyDSConfException
+
+path_resources_plugin = Path(__file__).parent
+
+
+# [start resource_plugin_definition]
+class ResourcePlugin:
+    def __init__(self, type: str, prefix: str):
+        self.type = type
+        self.prefix = prefix
+
+    def get_all_modules(self) -> Generator[Path, Any, None]:

Review Comment:
   can we directly use `Path`here?
   ```suggestion
       def get_all_modules(self) -> Generator[Path]:
   ```



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/task.py:
##########
@@ -225,6 +233,49 @@ def task_params(self) -> Optional[Dict]:
         custom_attr |= self._task_custom_attr
         return self.get_define_custom(custom_attr=custom_attr)
 
+    def get_plugin(self):
+        if self.resource_plugin is None:
+            if self.process_definition.resource_plugin is not None:
+                return self.process_definition.resource_plugin.resource
+            else:
+                raise PyResPluginException(
+                    "The execution command of this task is a file, but the resource plugin is empty")
+        else:
+            return self.resource_plugin.resource
+
+    def get_plugin(self):
+        if self.resource_plugin is None:
+            if self.process_definition.resource_plugin is not None:
+                return self.process_definition.resource_plugin.resource
+            else:
+                raise ValueError
+        else:
+            return self.resource_plugin.resource

Review Comment:
   you have to function with the same name here, please remove one of them. and you forget to add docstring for it



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/how-to-use.rst:
##########
@@ -0,0 +1,35 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+How to use

Review Comment:
   Please migrate `how to use` into file `resource-plugin.rst`



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/resources_plugin/__init__.py:
##########
@@ -0,0 +1,37 @@
+import importlib
+import importlib.util
+
+from pathlib import Path
+from typing import Any, Generator
+
+from pydolphinscheduler.exceptions import PyDSConfException
+
+path_resources_plugin = Path(__file__).parent
+
+
+# [start resource_plugin_definition]
+class ResourcePlugin:
+    def __init__(self, type: str, prefix: str):
+        self.type = type
+        self.prefix = prefix
+
+    def get_all_modules(self) -> Generator[Path, Any, None]:
+        """Get all res files path in resources_plugin directory."""
+        return (ex for ex in path_resources_plugin.iterdir() if ex.is_file() and not ex.name.startswith("__"))
+
+    def import_module(self, script_name, script_path):
+        """Import module"""
+        spec = importlib.util.spec_from_file_location(script_name, script_path)
+        module = importlib.util.module_from_spec(spec)
+        spec.loader.exec_module(module)
+        plugin = getattr(module, self.type.capitalize())

Review Comment:
   DISCUSS: can we directly import class here instead of import module



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/how-to-develop.rst:
##########
@@ -0,0 +1,38 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+How to develop
+==============
+
+How to develop a plugin

Review Comment:
   ```suggestion
   ```



##########
dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/core/task.py:
##########
@@ -225,6 +233,49 @@ def task_params(self) -> Optional[Dict]:
         custom_attr |= self._task_custom_attr
         return self.get_define_custom(custom_attr=custom_attr)
 
+    def get_plugin(self):
+        if self.resource_plugin is None:
+            if self.process_definition.resource_plugin is not None:
+                return self.process_definition.resource_plugin.resource
+            else:
+                raise PyResPluginException(
+                    "The execution command of this task is a file, but the resource plugin is empty")
+        else:
+            return self.resource_plugin.resource
+
+    def get_plugin(self):
+        if self.resource_plugin is None:
+            if self.process_definition.resource_plugin is not None:
+                return self.process_definition.resource_plugin.resource
+            else:
+                raise ValueError
+        else:
+            return self.resource_plugin.resource
+
+    def get_content(self):
+        """Get the file content according to the resource plugin"""
+        if self.ext_attr is None:
+            raise ValueError('ext_attr is None')
+        if self.ext is None:
+            raise ValueError('ext is None')

Review Comment:
   It seems we have a bug here, with this code, the user can not define a task without task resource plugin, but in fact I think user can do that 



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/how-to-develop.rst:
##########
@@ -0,0 +1,38 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+How to develop
+==============
+
+How to develop a plugin
+You need your plugin class created under dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/resources_plugin folder.

Review Comment:
   ```suggestion
   When you want to create a new resource plugin, you need to add a new class in the module `mod`:`resources_plugin`. All you have to do is named your plugin, implement `__init__` and `read_file` method
   
   - Method `__init__`: Initiation method with `param`:`prefix` and `type`
   
   .. literalinclude:: ../../../src/pydolphinscheduler/resources_plugin/local.py
       :start-after: [start init_method]
       :end-before: [end init_method]
   
   - Method `read_file `: Get content from the given URI, <add your distribution>
   
   .. literalinclude:: ../../../src/pydolphinscheduler/resources_plugin/local.py
       :start-after: [start read_file_method]
       :end-before: [end read_file_method]
       
   Last but not least, you should also add new resource plugin in constants.py
   
   .. literalinclude:: ../../../src/pydolphinscheduler/constants.py
       :start-after: [start class_resource]
       :end-before: [end class_resource]
   ```



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/how-to-use.rst:
##########
@@ -0,0 +1,35 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+How to use
+==========
+This article will show you how to use resource plug-ins.
+
+When you initialize the task subclass, you can add the parameter resource_plugin parameter, specify the type and
+prefix of the resource plugin, the command at this time should be the file path, in other words, the prefix of the
+resource plugin plus the command at this time is the command file in the specified Absolute paths in resource plugins.
+
+When the resource_plugin parameter is defined in both the task subclass and the workflow, the resource_plugin defined in the task subclass is used first. If the task subclass does not define resource_plugin, but the resource_plugin is defined in the workflow, the resource_plugin in the workflow is used.
+
+Of course, if neither the task subclass nor the workflow specifies resource_plugin, the command at this time will be executed as a script, in other words, we are forward compatible.
+
+
+Example
+-------
+.. literalinclude:: ../../../src/pydolphinscheduler/examples/tutorial.py
+   :start-after: [start workflow_declare]
+   :end-before: [end task_relation_declare]

Review Comment:
   You should add reduce the scope of the usage of resource



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/local.rst:
##########
@@ -0,0 +1,32 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+Local
+=====
+
+Local is a local resource plugin for pydolphinscheduler.

Review Comment:
   ```suggestion
   `Local` is a local resource plugin for pydolphinscheduler.
   ```



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/local.rst:
##########
@@ -0,0 +1,32 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+Local
+=====
+
+Local is a local resource plugin for pydolphinscheduler.
+
+When using a local resource plugin, you do not need to use this class explicitly, you only need to add the
+resource_plugin parameter in the task subclass or workflow definition. The resource_plugin parameter type is
+ResourcePlugin.

Review Comment:
   I think we need to tell users how to use in both process definition, and task definition



##########
dolphinscheduler-python/pydolphinscheduler/docs/source/resources_plugin/index.rst:
##########
@@ -0,0 +1,31 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+Resources_plugin
+================
+
+In this section
+
+.. toctree::
+   :maxdepth: 1
+
+   how-to-use
+   how-to-develop

Review Comment:
   plesase change the file name from `how-to-develop.rst` to `develop.rst`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@dolphinscheduler.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org