Posted to commits@airflow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/01/29 11:53:00 UTC

[jira] [Commented] (AIRFLOW-6672) AWS DataSync - better logging of error message

    [ https://issues.apache.org/jira/browse/AIRFLOW-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025825#comment-17025825 ] 

ASF GitHub Bot commented on AIRFLOW-6672:
-----------------------------------------

baolsen commented on pull request #7288: [AIRFLOW-6672] AWS DataSync - better logging of error message
URL: https://github.com/apache/airflow/pull/7288
 
 
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> AWS DataSync - better logging of error message
> ----------------------------------------------
>
>                 Key: AIRFLOW-6672
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6672
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: aws
>    Affects Versions: 1.10.7
>            Reporter: Bjorn Olsen
>            Assignee: Bjorn Olsen
>            Priority: Minor
>
> When the AWS DataSync operator fails, it dumps a TaskDescription to the log. The TaskDescription is in JSON format and contains several elements. This makes it hard to see what exactly went wrong.
> Example 1:
> [2020-01-28 17:44:39,495] \{datasync.py:354} INFO - task_execution_description=\{"TaskExecutionArn": "arn:aws:datasync:***:***:task/task-***/execution/exec-***", "Status": "ERROR", "Options": {"VerifyMode": "ONLY_FILES_TRANSFERRED", "OverwriteMode": "ALWAYS", "Atime": "BEST_EFFORT", "Mtime": "PRESERVE", "Uid": "INT_VALUE", "Gid": "INT_VALUE", "PreserveDeletedFiles": "PRESERVE", "PreserveDevices": "NONE", "PosixPermissions": "PRESERVE", "BytesPerSecond": -1, "TaskQueueing": "ENABLED"}, "Excludes": [], "Includes": [\{"FilterType": "SIMPLE_PATTERN", "Value": "***"}], "StartTime": datetime.datetime(2020, 1, 28, 17, 36, 2, 816000, tzinfo=tzlocal()), "EstimatedFilesToTransfer": 7, "EstimatedBytesToTransfer": 4534925, "FilesTransferred": 7, "BytesWritten": 4534925, "BytesTransferred": 4534925, "Result": \{"PrepareDuration": 9795, "PrepareStatus": "SUCCESS", "TotalDuration": 351660, "TransferDuration": 338568, "TransferStatus": "SUCCESS", "VerifyDuration": 7006, "VerifyStatus": "ERROR", "ErrorCode": "OpNotSupp", "ErrorDetail": "Operation not supported"}, "ResponseMetadata": \{"RequestId": "***", "HTTPStatusCode": 200, "HTTPHeaders": {"date": "Tue, 28 Jan 2020 15:44:39 GMT", "content-type": "application/x-amz-json-1.1", "content-length": "994", "connection": "keep-alive", "x-amzn-requestid": "***"}, "RetryAttempts": 0}}
> Example 2:
> [2020-01-28 18:23:23,322] \{datasync.py:354} INFO - task_execution_description=\{"TaskExecutionArn": "arn:aws:datasync:***:***:task/task-***/execution/exec-***", "Status": "ERROR", "Options": {"VerifyMode": "ONLY_FILES_TRANSFERRED", "OverwriteMode": "ALWAYS", "Atime": "BEST_EFFORT", "Mtime": "PRESERVE", "Uid": "INT_VALUE", "Gid": "INT_VALUE", "PreserveDeletedFiles": "PRESERVE", "PreserveDevices": "NONE", "PosixPermissions": "PRESERVE", "BytesPerSecond": -1, "TaskQueueing": "ENABLED"}, "Excludes": [], "Includes": [\{"FilterType": "SIMPLE_PATTERN", "Value": "***"}], "StartTime": datetime.datetime(2020, 1, 28, 17, 45, 57, 212000, tzinfo=tzlocal()), "EstimatedFilesToTransfer": 0, "EstimatedBytesToTransfer": 0, "FilesTransferred": 0, "BytesWritten": 0, "BytesTransferred": 0, "Result": \{"PrepareDuration": 16687, "PrepareStatus": "SUCCESS", "TotalDuration": 2083467, "TransferDuration": 2065744, "TransferStatus": "ERROR", "VerifyDuration": 5251, "VerifyStatus": "SUCCESS", "ErrorCode": "SockTlsHandshakeFailure", "ErrorDetail": "DataSync agent ran into an error connecting to AWS.Please review the DataSync network requirements and ensure required endpoints are accessible from the agent. Please contact AWS support if the error persists."}, "ResponseMetadata": \{"RequestId": "***", "HTTPStatusCode": 200, "HTTPHeaders": {"date": "Tue, 28 Jan 2020 16:23:23 GMT", "content-type": "application/x-amz-json-1.1", "content-length": "1179", "connection": "keep-alive", "x-amzn-requestid": "***"}, "RetryAttempts": 0}}
>  
> Note that the 'Result' element contains the statuses and errors that are of interest; however, these are currently hard to see in the log.
> Example of a successful one:
> 'Result': \{'PrepareDuration': 9663, 'PrepareStatus': 'SUCCESS', 'TotalDuration': 352095, 'TransferDuration': 338358, 'TransferStatus': 'SUCCESS', 'VerifyDuration': 7171, 'VerifyStatus': 'SUCCESS'},
> The suggested output is to keep the previous line(s) but also add:
> [2020-01-28 17:44:39,495] \{datasync.py:354} INFO/ERROR - Status=SUCCESS/ERROR
> [2020-01-28 17:44:39,495] \{datasync.py:354} INFO/ERROR - PrepareStatus=SUCCESS/ERROR PrepareDuration=9795
> [2020-01-28 17:44:39,495] \{datasync.py:354} INFO/ERROR - TransferStatus=SUCCESS/ERROR TransferDuration=338568
> [2020-01-28 17:44:39,495] \{datasync.py:354} INFO/ERROR - VerifyStatus=SUCCESS/ERROR VerifyDuration=7006
> [2020-01-28 17:44:39,495] \{datasync.py:354} ERROR - ErrorCode=OpNotSupp, ErrorDetail=Operation not supported
>  
> This should make the job status and any errors much clearer.
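> For illustration, here is a minimal sketch of the idea (the function name and grouping of fields are assumptions for this example, not the actual change in the PR), assuming task_execution_description is the dict returned by the DataSync describe_task_execution API call:
> 
> import logging
> 
> log = logging.getLogger(__name__)
> 
> def log_task_execution_result(task_execution_description: dict) -> None:
>     """Log the task execution Status and its Result fields on separate lines."""
>     status = task_execution_description.get("Status")
>     result = task_execution_description.get("Result") or {}
> 
>     # Log at ERROR level when the execution did not succeed, otherwise INFO.
>     emit = log.info if status == "SUCCESS" else log.error
>     emit("Status=%s", status)
> 
>     for key in ("PrepareStatus", "PrepareDuration",
>                 "TransferStatus", "TransferDuration",
>                 "VerifyStatus", "VerifyDuration"):
>         if key in result:
>             emit("%s=%s", key, result[key])
> 
>     # Surface ErrorCode / ErrorDetail explicitly so they are not buried in the full dump.
>     if "ErrorCode" in result or "ErrorDetail" in result:
>         log.error("ErrorCode=%s ErrorDetail=%s",
>                   result.get("ErrorCode"), result.get("ErrorDetail"))
> 
> Whether this sits in the hook or the operator is an implementation detail; the point is that the Result statuses, durations and any ErrorCode/ErrorDetail each get their own log line in addition to the full dump.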



--
This message was sent by Atlassian Jira
(v8.3.4#803005)