Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/12/22 15:39:45 UTC

[GitHub] [airflow] jrmccluskey commented on a change in pull request #20386: Add support for BeamGoPipelineOperator

jrmccluskey commented on a change in pull request #20386:
URL: https://github.com/apache/airflow/pull/20386#discussion_r773961221



##########
File path: airflow/providers/apache/beam/operators/beam.py
##########
@@ -470,3 +529,144 @@ def on_kill(self) -> None:
                 job_id=self.dataflow_job_id,
                 project_id=self.dataflow_config.project_id,
             )
+
+
+class BeamRunGoPipelineOperator(BeamBasePipelineOperator):
+    """
+    Launching Apache Beam pipelines written in Go. Note that both
+    ``default_pipeline_options`` and ``pipeline_options`` will be merged to specify pipeline
+    execution parameters, and ``default_pipeline_options`` is expected to hold
+    high-level options, for instance, project and zone information, which
+    apply to all beam operators in the DAG.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:BeamRunGoPipelineOperator`
+
+    .. seealso::
+        For more detail on Apache Beam have a look at the reference:
+        https://beam.apache.org/documentation/
+
+    :param go_file: Reference to the Go Apache Beam pipeline e.g.,
+        /some/local/file/path/to/your/go/pipeline/file.go
+    :type go_file: str
+    :param runner: Runner on which pipeline will be run. By default "DirectRunner" is being used.
+        Other possible options: DataflowRunner, SparkRunner, FlinkRunner.

Review comment:
       The Python Portable Runner (handled as the "universal" runner as far as flags are concerned) is also an option. Same situation as with Spark and Flink, just need to set `--runner` to universal and provide the correct endpoint. 
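
For illustration only (this is not part of the PR diff), a DAG task targeting the portable "universal" runner might look like the sketch below. The ``endpoint`` and ``environment_type`` option names are assumptions based on the Beam Go SDK flags rather than anything stated in this review; the file path, bucket, and endpoint values are placeholders.

```python
from airflow.providers.apache.beam.operators.beam import BeamRunGoPipelineOperator

# Minimal sketch: run a Go pipeline on the Python Portable ("universal") runner.
# default_pipeline_options and pipeline_options are merged, per the operator docstring.
run_go_on_portable_runner = BeamRunGoPipelineOperator(
    task_id="run_go_on_portable_runner",
    go_file="/files/dags/wordcount.go",  # placeholder path
    runner="universal",
    default_pipeline_options={"temp_location": "gs://my-bucket/tmp"},  # illustrative
    pipeline_options={
        "endpoint": "localhost:8099",    # job service endpoint (assumed option name)
        "environment_type": "LOOPBACK",  # SDK harness environment (assumed option name)
    },
)
```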

##########
File path: airflow/providers/apache/beam/operators/beam.py
##########
@@ -470,3 +529,144 @@ def on_kill(self) -> None:
                 job_id=self.dataflow_job_id,
                 project_id=self.dataflow_config.project_id,
             )
+
+
+class BeamRunGoPipelineOperator(BeamBasePipelineOperator):
+    """
+    Launching Apache Beam pipelines written in Go. Note that both
+    ``default_pipeline_options`` and ``pipeline_options`` will be merged to specify pipeline
+    execution parameters, and ``default_pipeline_options`` is expected to hold
+    high-level options, for instance, project and zone information, which
+    apply to all beam operators in the DAG.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:BeamRunGoPipelineOperator`
+
+    .. seealso::
+        For more detail on Apache Beam have a look at the reference:
+        https://beam.apache.org/documentation/
+
+    :param go_file: Reference to the Go Apache Beam pipeline e.g.,
+        /some/local/file/path/to/your/go/pipeline/file.go
+    :type go_file: str
+    :param runner: Runner on which pipeline will be run. By default "DirectRunner" is being used.
+        Other possible options: DataflowRunner, SparkRunner, FlinkRunner.
+        See: :class:`~providers.apache.beam.hooks.beam.BeamRunnerType`
+        See: https://beam.apache.org/documentation/runners/capability-matrix/
+    :type runner: str
+    :param default_pipeline_options: (optional) Map of default pipeline options.
+    :type default_pipeline_options: dict
+    :param pipeline_options: (optional) Map of pipeline options. The argument must be a dictionary
+        whose keys are option names. The value for each key can be of different types:
+
+        * If the value is None, the single option - ``--key`` (without value) will be added.
+        * If the value is False, this option will be skipped.
+        * If the value is True, the single option - ``--key`` (without value) will be added.
+        * If the value is a list, the option will be added once for each element in the list.
+          If the value is ``['A', 'B']`` and the key is ``key`` then the ``--key=A --key-B`` options

Review comment:
       ```suggestion
             If the value is ``['A', 'B']`` and the key is ``key`` then the ``--key=A --key=B`` options
       ```
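
To make the rules quoted above concrete, here is an illustrative mapping (not part of the PR); the option names are invented for the example:

```python
# How a pipeline_options dict is expected to translate into command-line flags,
# following the docstring rules quoted in the diff above (names are invented):
pipeline_options = {
    "no_auth": None,           # -> --no_auth                  (None adds the bare flag)
    "verbose": True,           # -> --verbose                  (True also adds the bare flag)
    "dry_run": False,          # -> (nothing)                  (False skips the option)
    "labels": ["foo", "bar"],  # -> --labels=foo --labels=bar  (one flag per list element)
}
```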



