Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/11 16:52:43 UTC

[GitHub] [airflow] ayushchauhan0811 opened a new pull request #20814: fix: cloudwatch logs fetch logic

ayushchauhan0811 opened a new pull request #20814:
URL: https://github.com/apache/airflow/pull/20814


   CloudWatch `get_log_events` can return empty results even though more log events are available in the stream. In my recent interaction with the AWS team, it was pointed out that the correct way to check for the end of the stream is to compare `nextForwardToken` across subsequent calls: the end has been reached when the returned value is the same as the token that was passed in.
   
   https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_GetLogEvents.html
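
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `iter_log_events` and `FakeLogsClient` are hypothetical names, and the fake client only imitates the `events` / `nextForwardToken` shape of a `get_log_events` response so the loop can run without AWS access.

```python
from typing import Any, Dict, Iterator, Optional


def iter_log_events(client: Any, log_group: str, log_stream: str) -> Iterator[Dict[str, Any]]:
    """Yield events until the forward token stops changing (hypothetical helper)."""
    next_token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {
            "logGroupName": log_group,
            "logStreamName": log_stream,
            "startFromHead": True,
        }
        if next_token is not None:
            kwargs["nextToken"] = next_token
        response = client.get_log_events(**kwargs)

        # An empty page is not treated as the end of the stream.
        yield from response["events"]

        # End of stream: the service hands back the token we passed in.
        if response["nextForwardToken"] == next_token:
            return
        next_token = response["nextForwardToken"]


class FakeLogsClient:
    """Hypothetical stand-in for a CloudWatch Logs client, for illustration only."""

    def __init__(self) -> None:
        self.calls = 0
        # Keyed by the token passed in; the page for "t1" is empty mid-stream.
        self._pages = {
            None: {"events": [{"message": "a"}], "nextForwardToken": "t1"},
            "t1": {"events": [], "nextForwardToken": "t2"},
            "t2": {"events": [{"message": "b"}], "nextForwardToken": "t3"},
            "t3": {"events": [], "nextForwardToken": "t3"},
        }

    def get_log_events(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls += 1
        return self._pages[kwargs.get("nextToken")]
```

With the fake client, a loop that stopped at the first empty page would miss the event `b`; comparing tokens keeps reading until the token repeats.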


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #20814:
URL: https://github.com/apache/airflow/pull/20814#issuecomment-1014156796


   The PR is likely OK to be merged with just a subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk merged pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #20814:
URL: https://github.com/apache/airflow/pull/20814


   





[GitHub] [airflow] uranusjr commented on a change in pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #20814:
URL: https://github.com/apache/airflow/pull/20814#discussion_r782674196



##########
File path: airflow/providers/amazon/aws/hooks/logs.py
##########
@@ -99,7 +97,7 @@ def get_log_events(
 
             yield from events
 
-            if 'nextForwardToken' in response:
+            if next_token != response['nextForwardToken']:

Review comment:
       Probably OK to just fail if the response format changes. An exception raised here (assuming the hook is used in a DAG) would eventually be caught by the task runner and logged to the task logs, which would clearly signal the problem (the API has an unexpected change in format). If we silently fall back to None, it would be more difficult to debug later if the API changed format unexpectedly (missing `nextForwardToken`), or actually returned `"nextForwardToken": null` (which is unexpected and signals an invalid scenario, indicating we or the user may have logical bugs in the workflow).
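
The difference between the two lookups can be illustrated with a small sketch. The helper names `token_strict` and `token_lenient` and the `malformed` response are hypothetical, made up for this example; they are not part of the hook.

```python
from typing import Any, Dict, Optional


def token_strict(response: Dict[str, Any]) -> Optional[str]:
    # Raises KeyError if the key is missing, so a format change fails loudly.
    return response["nextForwardToken"]


def token_lenient(response: Dict[str, Any]) -> Optional[str]:
    # Returns None if the key is missing, hiding the format change.
    return response.get("nextForwardToken")


# Hypothetical response that lacks the token entirely.
malformed: Dict[str, Any] = {"events": []}
```

Assuming the loop starts with `next_token = None`, the lenient lookup on such a response would compare equal to the initial token and look like a normal end of stream, while the strict lookup raises a `KeyError` that points at the missing field.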







[GitHub] [airflow] ayushchauhan0811 commented on a change in pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
ayushchauhan0811 commented on a change in pull request #20814:
URL: https://github.com/apache/airflow/pull/20814#discussion_r784948181



##########
File path: airflow/providers/amazon/aws/hooks/logs.py
##########
@@ -99,7 +97,7 @@ def get_log_events(
 
             yield from events
 
-            if 'nextForwardToken' in response:
+            if next_token != response['nextForwardToken']:

Review comment:
       @ferruzzi @uranusjr Is there any change required in this PR or is it good to go? 







[GitHub] [airflow] ferruzzi commented on a change in pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #20814:
URL: https://github.com/apache/airflow/pull/20814#discussion_r782589340



##########
File path: airflow/providers/amazon/aws/hooks/logs.py
##########
@@ -99,7 +97,7 @@ def get_log_events(
 
             yield from events
 
-            if 'nextForwardToken' in response:
+            if next_token != response['nextForwardToken']:

Review comment:
       Should we use `response.get('nextForwardToken', None)` to avoid any chance of a KeyError here, or do we know for sure that it will always be included in the response?







[GitHub] [airflow] ferruzzi commented on a change in pull request #20814: fix: cloudwatch logs fetch logic

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on a change in pull request #20814:
URL: https://github.com/apache/airflow/pull/20814#discussion_r785094194



##########
File path: airflow/providers/amazon/aws/hooks/logs.py
##########
@@ -99,7 +97,7 @@ def get_log_events(
 
             yield from events
 
-            if 'nextForwardToken' in response:
+            if next_token != response['nextForwardToken']:

Review comment:
       uranusjr covered my concern, but I'm also not a committer so my approval doesn't carry much (any) weight.



