Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/18 21:24:10 UTC

[GitHub] [airflow] o-nikolas commented on a change in pull request #16264: Add flag to delete local logs after upload

o-nikolas commented on a change in pull request #16264:
URL: https://github.com/apache/airflow/pull/16264#discussion_r653908548



##########
File path: airflow/providers/microsoft/azure/log/wasb_task_handler.py
##########
@@ -100,8 +100,8 @@ def close(self) -> None:
             with open(local_loc) as logfile:
                 log = logfile.read()
             self.wasb_write(log, remote_loc, append=True)
-
-            if self.delete_local_copy:
+            keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS')
+            if self.delete_local_copy or not keep_local:

Review comment:
       I wonder if `delete_local_copy` is still needed now that you have introduced this global behaviour?

##########
File path: airflow/providers/google/cloud/log/gcs_task_handler.py
##########
@@ -132,7 +134,10 @@ def close(self):
             # read log and remove old logs to get just the latest additions
             with open(local_loc) as logfile:
                 log = logfile.read()
-            self.gcs_write(log, remote_loc)
+            success = self.gcs_write(log, remote_loc)
+            keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS')
+            if success and not keep_local:
+                shutil.rmtree(os.path.dirname(local_loc))

Review comment:
       You're implementing the same cleanup recipe several times, all driven by a single global config. Both are signs that this logic belongs in a super class. Doing it ad hoc like this makes it easy for future developers of remote logging classes to forget or mis-implement the cleanup. The individual remote logging classes should only be responsible for uploading to their respective service; they shouldn't have to re-implement the cleanup.
   
   It is possible to teach `FileTaskHandler` to do this, but it would be tricky to make it work in both cases and is a bit smelly. Maybe it's time for a new super class, `RemoteTaskHandler`?
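
   To make the suggestion concrete, here is a rough sketch of what such a super class could look like. It is not Airflow code: `RemoteTaskHandler`, `upload`, `close_log`, the `keep_local_logs` argument (a stand-in for reading `KEEP_LOCAL_LOGS` from the config) and the `InMemoryTaskHandler` subclass are all hypothetical names used only for illustration. The base class owns the read, upload and cleanup sequence, and each subclass only implements the upload.

```python
import os
import shutil
import tempfile
from abc import ABC, abstractmethod


class RemoteTaskHandler(ABC):
    """Hypothetical base class that owns cleanup of local logs after upload."""

    def __init__(self, keep_local_logs: bool = True) -> None:
        # Assumption: in Airflow this would come from something like
        # conf.getboolean('logging', 'KEEP_LOCAL_LOGS'); it is passed in here
        # so the sketch runs without Airflow installed.
        self.keep_local_logs = keep_local_logs

    @abstractmethod
    def upload(self, log: str, remote_loc: str) -> bool:
        """Upload the log text and return True on success."""

    def close_log(self, local_loc: str, remote_loc: str) -> None:
        # Shared recipe: read the local log, upload it, then clean up once.
        with open(local_loc) as logfile:
            log = logfile.read()
        success = self.upload(log, remote_loc)
        # Only delete the local copy when the upload is known to have worked.
        if success and not self.keep_local_logs:
            shutil.rmtree(os.path.dirname(local_loc))


class InMemoryTaskHandler(RemoteTaskHandler):
    """Hypothetical subclass standing in for a WASB/GCS/S3 handler."""

    def __init__(self, keep_local_logs: bool = True, fail: bool = False) -> None:
        super().__init__(keep_local_logs)
        self.fail = fail
        self.store = {}

    def upload(self, log: str, remote_loc: str) -> bool:
        if self.fail:
            return False
        self.store[remote_loc] = log
        return True


def demo(keep_local_logs: bool, fail: bool = False):
    """Write a temporary log, close it, and report whether the file survived."""
    log_dir = tempfile.mkdtemp()
    local_loc = os.path.join(log_dir, "1.log")
    with open(local_loc, "w") as logfile:
        logfile.write("task output\n")
    handler = InMemoryTaskHandler(keep_local_logs, fail)
    handler.close_log(local_loc, "dag/task/1.log")
    still_exists = os.path.exists(local_loc)
    shutil.rmtree(log_dir, ignore_errors=True)  # tidy up if cleanup was skipped
    return still_exists, handler.store
```

   With this shape, a new remote logging class cannot forget the cleanup, because it never writes it. Whether a failed upload should always keep the local copy, as assumed here, is a design choice for the PR to settle.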



