Posted to commits@airflow.apache.org by je...@apache.org on 2021/12/10 19:50:37 UTC

[airflow] 02/05: Deferrable operators doc clarification (#20150)

This is an automated email from the ASF dual-hosted git repository.

jedcunningham pushed a commit to branch v2-2-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit d3733250ed8d0f1c06371a7854b1e69827ce16bd
Author: Daniel Standish <15...@users.noreply.github.com>
AuthorDate: Wed Dec 8 17:01:12 2021 -0800

    Deferrable operators doc clarification (#20150)
    
    The language "when two tasks defer based on the same trigger" is a bit confusing. Many tasks can reuse the same trigger class.  But two tasks can't defer using the same trigger _instance_. I think what's important to call out here, and what I try to make clearer, is that the exact same instance of the trigger may have multiple copies of itself running.
    
    Additionally I clarify cleanup is not _only_ called "when this happens" (that is, when trigger is "suddenly removed"), but called every time the trigger instance exits, no matter the reason.
    
    (cherry picked from commit 9a92a56782ad931c4c5831c35085af3317d529c7)
---
 docs/apache-airflow/concepts/deferring.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/apache-airflow/concepts/deferring.rst b/docs/apache-airflow/concepts/deferring.rst
index c7edced..d9126c4 100644
--- a/docs/apache-airflow/concepts/deferring.rst
+++ b/docs/apache-airflow/concepts/deferring.rst
@@ -109,13 +109,13 @@ There's also some design constraints to be aware of:
 
 * The ``run`` method *must be asynchronous* (using Python's asyncio), and correctly ``await`` whenever it does a blocking operation.
 * ``run`` must ``yield`` its TriggerEvents, not return them. If it returns before yielding at least one event, Airflow will consider this an error and fail any Task Instances waiting on it. If it throws an exception, Airflow will also fail any dependent task instances.
-* A Trigger *must be able to run in parallel* with other copies of itself. This can happen both when two tasks defer based on the same trigger, and also if a network partition happens and Airflow re-launches a trigger on a separated machine.
-* When events are emitted, and if your trigger is designed to emit more than one event, they *must* contain a payload that can be used to deduplicate events if the trigger is being run in multiple places. If you only fire one event, and don't want to pass information in the payload back to the Operator that deferred, you can just set the payload to ``None``.
-* A trigger may be suddenly removed from one process and started on a new one (if partitions are being changed, or a deployment is happening). You may provide an optional ``cleanup`` method that gets called when this happens.
+* You should assume that a trigger instance may run *more than once* (this can happen if a network partition occurs and Airflow re-launches a trigger on a separated machine). So you must be mindful about side effects. For example you might not want to use a trigger to insert database rows.
+* If your trigger is designed to emit more than one event (not currently supported), then each emitted event *must* contain a payload that can be used to deduplicate events if the trigger is being run in multiple places. If you only fire one event and don't need to pass information back to the Operator, you can just set the payload to ``None``.
+* A trigger may be suddenly removed from one triggerer service and started on a new one, for example if subnets are changed and a network partition results, or if there is a deployment. If desired you may implement the ``cleanup`` method, which is always called after ``run`` whether the trigger exits cleanly or otherwise.
 
 .. note::
 
-    Right now, Triggers are only used up to their first event, as they are only used for resuming deferred tasks (which happens on the first event fired). However, we plan to allow DAGs to be launched from triggers in future, which is where multi-event triggers will be more useful.
+    Currently Triggers are only used up to their first event, as they are only used for resuming deferred tasks (which happens on the first event fired). However, we plan to allow DAGs to be launched from triggers in future, which is where multi-event triggers will be more useful.
 
 
 Here's the structure of a basic Trigger::