Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/21 11:41:07 UTC

[GitHub] [airflow] pankajastro commented on a diff in pull request #24554: Add SqsBatchSensor

pankajastro commented on code in PR #24554:
URL: https://github.com/apache/airflow/pull/24554#discussion_r902513788


##########
airflow/providers/amazon/aws/sensors/sqs.py:
##########
@@ -215,3 +226,65 @@ def __init__(self, *args, **kwargs):
             stacklevel=2,
         )
         super().__init__(*args, **kwargs)
+
+
+class SqsBatchSensor(SqsSensor):
+    """
+    Get messages from an Amazon SQS queue in batches and then delete the retrieved messages from the queue.
+    If deletion of messages fails an AirflowException is thrown. Otherwise, all messages
+    are pushed through XCom with the key ``messages``.
+    The maximum total number of messages retrieved equals the number of messages retrieved per
+    SQS API call multiplied by the number of calls. Each SQS receive_message call returns at most 10 messages.
+    This sensor is identical to SqsSensor, except that SqsSensor performs exactly one SQS call
+    per poke, while SqsBatchSensor performs multiple SQS API calls per poke.
+    .. seealso::
+        For more information on how to use this sensor, take a look at the guide:
+        :ref:`howto/sensor:SqsBatchSensor`
+    :param batch: The number of times the sensor calls SQS to receive messages per poke (default: 1)
+    """
+
+    def __init__(
+        self,
+        *,
+        batch: int = 1,
+        **kwargs,
+    ):
+        super().__init__(**kwargs)
+        self.batch = batch
+
+    def poke(self, context: 'Context'):
+        """
+        Check for message on subscribed queue and write to xcom the message with key ``messages``
+        :param context: the context object
+        :return: ``True`` if message is available or ``False``
+        """
+        sqs_conn = self.get_hook().get_conn()
+        message_batch = []
+        # perform multiple SQS calls in series to retrieve messages
+        for _ in range(self.batch):
+            messages = self.poll_sqs(sqs_conn=sqs_conn)
+
+            if not len(messages):
+                continue
+
+            message_batch.extend(messages)
+
+            if self.delete_message_on_reception:
+
+                self.log.info("Deleting %d messages", len(messages))
+
+                entries = [
+                    {'Id': message['MessageId'], 'ReceiptHandle': message['ReceiptHandle']}
+                    for message in messages
+                ]
+                response = sqs_conn.delete_message_batch(QueueUrl=self.sqs_queue, Entries=entries)
+
+                if 'Successful' not in response:
+                    raise AirflowException(
+                        'Delete SQS Messages failed ' + str(response) + ' for messages ' + str(messages)
+                    )
+        if not len(message_batch):
+            return False
+
+        context['ti'].xcom_push(key='messages', value=message_batch)

Review Comment:
   Just thinking: either we should always push the messages to XCom, or we should push them depending on the value of the `do_xcom_push` param.
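
A minimal sketch of the second option, in case it helps the discussion. It assumes the sensor exposes the `do_xcom_push` flag inherited from `BaseOperator`; `FakeTaskInstance` and `push_messages` are hypothetical names used only so the snippet runs without Airflow installed, and they are not part of the PR.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance; it only records XCom pushes."""

    def __init__(self):
        self.xcoms = {}

    def xcom_push(self, key, value):
        self.xcoms[key] = value


def push_messages(context, message_batch, do_xcom_push=True):
    """Mirror the tail of poke(): report whether messages arrived and
    push them to XCom only when do_xcom_push is set (assumed flag)."""
    if not message_batch:
        return False

    if do_xcom_push:
        context['ti'].xcom_push(key='messages', value=message_batch)

    return True
```

With this shape, the poke result still reflects whether messages were received, and only the XCom side effect is gated by the flag.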


