Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/10/02 22:03:33 UTC

[GitHub] [airflow] mik-laj commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator

mik-laj commented on a change in pull request #6210: [AIRFLOW-5567] [Do not Merge] prototype BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r330792031
 
 

 ##########
 File path: airflow/models/base_async_operator.py
 ##########
 @@ -0,0 +1,96 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Base Asynchronous Operator for kicking off long running
+operations and polling for completion with reschedule mode.
+"""
+from functools import wraps
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.exceptions import AirflowException
+from airflow.models.skipmixin import SkipMixin
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+from airflow.utils.decorators import apply_defaults
+
+class BaseAsyncOperator(BaseSensorOperator, SkipMixin):
+    """
+    AsyncOperators are derived from this class and inherit these attributes.
+
+    AsyncOperators must define a `submit_request` method that fires a request
+    for a long running operation. A `poke` method is then executed at a time
+    interval; the task succeeds when a criterion is met and fails if and when
+    it times out. They are effectively an opinionated way to combine an
+    Operator and a Sensor in order to kick off a long running process without
+    blocking a worker slot while waiting for the long running process to
+    complete, by leveraging reschedule mode.
+
+    :param soft_fail: Set to true to mark the task as SKIPPED on failure
+    :type soft_fail: bool
+    :param poke_interval: Time, in seconds, that the job should wait
+        between tries.
+    :type poke_interval: int
+    :param timeout: Time, in seconds, before the task times out and fails.
+    :type timeout: int
+    :type mode: str
+    """
+    ui_color = '#9933ff'  # type: str
+    valid_modes = ['poke', 'reschedule']  # type: Iterable[str]
+
+    @apply_defaults
+    def __init__(self,
+                 *args,
+                 **kwargs) -> None:
+        super().__init__(mode='reschedule', *args, **kwargs)
+
+    def submit_request(self, context) -> str:
+        """
+        This method should kick off a long running operation.
+        This method should return the ID for the long running operation used
+        for polling.
+        Context is the same dictionary used as when rendering jinja templates.
+
+        Refer to get_template_context for more context.
+
+        :returns: a resource_id for the long running operation.
+        :rtype: str
+        """
+        raise AirflowException('Async Operators must define a `submit_request` method.')
+
+    def process_result(self, context):
+        """
+        This method can optionally be overridden to process the result of a long running operation.
+        Context is the same dictionary used as when rendering jinja templates.
+
+        Refer to get_template_context for more context.
+        """
+        self.log.info('Got result of {}. Done.'.format(
+                      self.get_external_resource_id(context)))
+
+    def pre_execute(self, context) -> None:
+        """
+        Check if we have the XCOM_EXTERNAL_RESOURCE_ID_KEY
+        for this task and call submit_request if it is missing.
+        """
+        if not self.get_external_resource_id(context):
 
 Review comment:
   Only one such object can exist, so this check is always a question about a single, unique object. After thinking about it for a long time, however, I have concerns about performance. Right now we make one query to the database; if we also tried to save the state, there would always be two queries.
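
   One possible reading of this concern, as a toy sketch only: `CountingStore`, `poke_lookup_only` and `poke_lookup_and_save` are hypothetical names, not part of Airflow or of this pull request, and the in-memory dictionary merely stands in for the metadata database. It counts round trips for a lookup-only check versus a lookup followed by a state write.

```python
class CountingStore:
    """Hypothetical stand-in for the metadata database that counts queries."""

    def __init__(self):
        self.data = {}
        self.queries = 0

    def read(self, key):
        # Each read counts as one database round trip.
        self.queries += 1
        return self.data.get(key)

    def write(self, key, value):
        # Each write counts as one database round trip.
        self.queries += 1
        self.data[key] = value


def poke_lookup_only(store, key):
    """Assumed current shape: a single lookup of the external resource ID."""
    return store.read(key)


def poke_lookup_and_save(store, key, state):
    """Assumed alternative: the same lookup, plus persisting some state."""
    resource_id = store.read(key)
    store.write(key + ":state", state)
    return resource_id


if __name__ == "__main__":
    lookup_only = CountingStore()
    lookup_only.data["task"] = "job-1"
    poke_lookup_only(lookup_only, "task")

    lookup_and_save = CountingStore()
    lookup_and_save.data["task"] = "job-1"
    poke_lookup_and_save(lookup_and_save, "task", "running")

    print(lookup_only.queries, lookup_and_save.queries)
```

   Under these assumptions the second variant pays one extra round trip on every reschedule, which is the cost being weighed here.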
