Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/04/01 22:54:33 UTC
[GitHub] [airflow] mik-laj commented on a change in pull request #4986: [AIRFLOW-4169] Create Google Cloud Vision Detect Operators

mik-laj commented on a change in pull request #4986: [AIRFLOW-4169] Create Google Cloud Vision Detect Operators
URL: https://github.com/apache/airflow/pull/4986#discussion_r271080379
 
 

 ##########
 File path: airflow/contrib/operators/gcp_vision_operator.py
 ##########
 @@ -961,3 +961,275 @@ def execute(self, context):
             timeout=self.timeout,
             metadata=self.metadata,
         )
+
+
+class CloudVisionDetectTextOperator(BaseOperator):
+    """
+    Detects Text in the image
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:CloudVisionDetectTextOperator`
+
+    :param image: (Required) The image to analyze. See more:
+        https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
+    :type image: dict or google.cloud.vision_v1.types.Image
+    :param max_results: (Optional) Number of results to return.
+    :type max_results: int
+    :param retry: (Optional) A retry object used to retry requests. If `None` is
+        specified, requests will not be retried.
+    :type retry: google.api_core.retry.Retry
+    :param timeout: Number of seconds before timing out.
+    :type timeout: float
+    :param language_hints: List of languages to use for TEXT_DETECTION.
+        In most cases, an empty value yields the best results since it enables automatic language detection.
+        For languages based on the Latin alphabet, setting language_hints is not needed.
+    :type language_hints: str, list or google.cloud.vision.v1.ImageContext.language_hints
+    :param web_detection_params: Parameters for web detection.
+    :type web_detection_params: dict or google.cloud.vision.v1.ImageContext.web_detection_params
+    :param additional_properties: Additional properties to be set on the AnnotateImageRequest. See more:
+        :class:`google.cloud.vision_v1.types.AnnotateImageRequest`
+    :type additional_properties: dict
+    """
+
+    # [START vision_detect_text_set_template_fields]
+    template_fields = ("image", "max_results", "timeout", "gcp_conn_id")
+    # [END vision_detect_text_set_template_fields]
+
+    def __init__(
+        self,
+        image,
+        max_results=None,
+        retry=None,
+        timeout=None,
+        language_hints=None,
+        web_detection_params=None,
+        additional_properties=None,
+        gcp_conn_id="google_cloud_default",
+        *args,
+        **kwargs
+    ):
+        super(CloudVisionDetectTextOperator, self).__init__(*args, **kwargs)
+        self.image = image
+        self.max_results = max_results
+        self.retry = retry
+        self.timeout = timeout
+        self.gcp_conn_id = gcp_conn_id
+        self.kwargs = kwargs
+        self.additional_properties = prepare_additional_parameters(
+            additional_properties=additional_properties,
+            language_hints=language_hints,
+            web_detection_params=web_detection_params,
+        )
+
+    def execute(self, context):
+        hook = CloudVisionHook(gcp_conn_id=self.gcp_conn_id)
+        return hook.text_detection(
+            image=self.image,
+            max_results=self.max_results,
+            retry=self.retry,
+            timeout=self.timeout,
+            additional_properties=self.additional_properties,
+        )
+
+
+class CloudVisionDetectDocumentTextOperator(BaseOperator):
+    """
+    Detects Document Text in the image
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:CloudVisionDetectDocumentTextOperator`
+
+    :param image: (Required) The image to analyze. See more:
+        https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
+    :type image: dict or google.cloud.vision_v1.types.Image
+    :param max_results: Number of results to return.
+    :type max_results: int
+    :param retry: (Optional) A retry object used to retry requests. If `None` is
+        specified, requests will not be retried.
+    :type retry: google.api_core.retry.Retry
+    :param timeout: Number of seconds before timing out.
+    :type timeout: float
+    :param language_hints: List of languages to use for TEXT_DETECTION.
+        In most cases, an empty value yields the best results since it enables automatic language detection.
+        For languages based on the Latin alphabet, setting language_hints is not needed.
+    :type language_hints: str, list or google.cloud.vision.v1.ImageContext.language_hints
+    :param web_detection_params: Parameters for web detection.
+    :type web_detection_params: dict or google.cloud.vision.v1.ImageContext.web_detection_params
+    :param additional_properties: Additional properties to be set on the AnnotateImageRequest. See more:
+        https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.AnnotateImageRequest
+    :type additional_properties: dict
+    """
+
+    # [START vision_document_detect_text_set_template_fields]
+    template_fields = ("image", "max_results", "timeout", "gcp_conn_id")
+    # [END vision_document_detect_text_set_template_fields]
+
+    def __init__(
+        self,
+        image,
+        max_results=None,
+        retry=None,
+        timeout=None,
+        language_hints=None,
+        web_detection_params=None,
+        additional_properties=None,
+        gcp_conn_id="google_cloud_default",
+        *args,
+        **kwargs
+    ):
+        super(CloudVisionDetectDocumentTextOperator, self).__init__(*args, **kwargs)
+        self.image = image
+        self.max_results = max_results
+        self.retry = retry
+        self.timeout = timeout
+        self.gcp_conn_id = gcp_conn_id
+        self.additional_properties = prepare_additional_parameters(
 
 Review comment:
   I have concerns about whether modifying the parameter provided by the user is safe. The user may reuse one dictionary in several places, so mutating it can lead to unexpected behavior. Do you think this is a real problem? We had a similar problem with GCP Transfer.
   Look at: https://github.com/PolideaInternal/airflow/blob/e27950a75ce287c094e550fba07d1c8de5dc4143/airflow/contrib/hooks/gcp_transfer_hook.py#L346
   https://github.com/PolideaInternal/airflow/blob/master/airflow/contrib/hooks/gcp_vision_hook.py#L53
   https://github.com/PolideaInternal/airflow/commit/968ae52cdec72ee3ffe4b77e7fcadafa1ffdbd07#diff-aaa401f5fe6537da892ade36c656044bR277
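
   To illustrate the concern, here is a minimal sketch. The helpers `merge_in_place` and `merge_on_copy` are hypothetical names made up for this example; they are not the actual `prepare_additional_parameters` from this PR, and the dictionary contents are placeholders. The sketch only assumes that a helper writes `language_hints` into the dictionary it receives, and shows how a deep copy would avoid leaking state between tasks that share one dictionary.

```python
import copy


def merge_in_place(additional_properties, language_hints=None):
    # Hypothetical helper: writes straight into the caller's dictionary.
    props = additional_properties if additional_properties is not None else {}
    if language_hints is not None:
        props.setdefault("image_context", {})["language_hints"] = language_hints
    return props


def merge_on_copy(additional_properties, language_hints=None):
    # Same merge, applied to a deep copy so nested dicts are not shared either.
    return merge_in_place(copy.deepcopy(additional_properties), language_hints)


# One dictionary reused for two tasks, as a DAG author might do.
shared = {"image_context": {"crop_hints_params": {"aspect_ratios": [1.0]}}}

task_a = merge_in_place(shared, language_hints=["en"])
task_b = merge_in_place(shared)  # no hints requested for this task
leaked = "language_hints" in task_b["image_context"]  # task_a's hints show up here

fresh = {"image_context": {}}
task_c = merge_on_copy(fresh, language_hints=["pl"])
untouched = fresh == {"image_context": {}}  # the caller's dict is unchanged
```

   A shallow `dict.copy()` would not be enough in this sketch, because the nested `image_context` dictionary would still be shared with the caller.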
