Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/14 00:30:32 UTC

[GitHub] [airflow] uranusjr commented on a change in pull request #22231: S3 list key filter

uranusjr commented on a change in pull request #22231:
URL: https://github.com/apache/airflow/pull/22231#discussion_r825527820



##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -255,6 +256,23 @@ def list_prefixes(
 
         return prefixes
 
+    def _list_key_object_filter(
+        self, keys: list, from_datetime: Optional[DateTime] = None, to_datetime: Optional[DateTime] = None
+    ) -> list:
+        if from_datetime is None and to_datetime is None:
+            return [k['Key'] for k in keys]
+        elif to_datetime is None:
+            return [k['Key'] for k in keys if k['LastModified'] >= from_datetime]
+        elif from_datetime is None:
+            return [k['Key'] for k in keys if k['LastModified'] < to_datetime]
+        else:
+            return [
+                k['Key']
+                for k in keys
+                if k['LastModified'] >= from_datetime and k['LastModified'] < to_datetime
+            ]
+        return [k['Key'] for k in keys]

Review comment:
       How about collapsing the branches into a single helper predicate:
   
   ```suggestion
       def _list_key_object_filter(
           self, keys: list, from_datetime: Optional[DateTime] = None, to_datetime: Optional[DateTime] = None
       ) -> list:
           def _is_in_period(dt: datetime) -> bool:
               if from_datetime is not None and dt < from_datetime:
                   return False
               if to_datetime is not None and dt > to_datetime:
                   return False
               return True
   
           return [k['Key'] for k in keys if _is_in_period(k['LastModified'])]
   ```
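As a rough illustration of how the suggested helper behaves, here is a standalone sketch of the same predicate outside the hook class. The function name `filter_keys_by_last_modified` and the sample entries are hypothetical; only the `Key` and `LastModified` fields come from the diff. Note one behavioural difference from the original diff: the suggestion rejects only `dt > to_datetime`, so an object modified exactly at `to_datetime` is included, whereas the original `< to_datetime` comparison excluded it.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def filter_keys_by_last_modified(
    keys: List[Dict[str, Any]],
    from_datetime: Optional[datetime] = None,
    to_datetime: Optional[datetime] = None,
) -> List[str]:
    """Return the 'Key' of each entry whose 'LastModified' lies within the bounds.

    A bound of None means "unbounded" on that side. Both bounds are inclusive,
    mirroring the comparisons in the review suggestion.
    """

    def _is_in_period(dt: datetime) -> bool:
        if from_datetime is not None and dt < from_datetime:
            return False
        if to_datetime is not None and dt > to_datetime:
            return False
        return True

    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]


# Hypothetical entries shaped like the dicts the diff iterates over.
sample_keys = [
    {"Key": "a.csv", "LastModified": datetime(2022, 3, 1, tzinfo=timezone.utc)},
    {"Key": "b.csv", "LastModified": datetime(2022, 3, 10, tzinfo=timezone.utc)},
    {"Key": "c.csv", "LastModified": datetime(2022, 3, 20, tzinfo=timezone.utc)},
]

march_10 = datetime(2022, 3, 10, tzinfo=timezone.utc)
no_bounds = filter_keys_by_last_modified(sample_keys)
from_only = filter_keys_by_last_modified(sample_keys, from_datetime=march_10)
to_only = filter_keys_by_last_modified(sample_keys, to_datetime=march_10)
```

Because each bound is checked independently, the four-way `if`/`elif` in the diff reduces to two guard clauses, and the unreachable trailing `return` disappears.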

##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -263,6 +281,10 @@ def list_keys(
         delimiter: Optional[str] = None,
         page_size: Optional[int] = None,
         max_items: Optional[int] = None,
+        start_after_key: Optional[str] = None,
+        from_datetime: Optional[DateTime] = None,
+        to_datetime: Optional[DateTime] = None,

Review comment:
       I don’t think this needs to take `pendulum.DateTime`. Normal `datetime.datetime` works equally well (and is compatible with `pendulum.DateTime`).
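A minimal sketch of why the plain annotation is enough: `pendulum.DateTime` subclasses `datetime.datetime`, so a parameter typed as `datetime` accepts either. To keep the example free of third-party imports, `DateTimeSubclass` below is a hypothetical stand-in for such a subclass, and `is_on_or_after` is an illustrative helper, not part of the hook.

```python
from datetime import datetime, timezone
from typing import Optional


class DateTimeSubclass(datetime):
    """Hypothetical stand-in for a datetime subclass such as pendulum.DateTime."""


def is_on_or_after(last_modified: datetime, from_datetime: Optional[datetime] = None) -> bool:
    # Annotated with the standard-library type; subclass instances also satisfy it.
    return from_datetime is None or last_modified >= from_datetime


last_modified = datetime(2022, 3, 10, tzinfo=timezone.utc)
plain_bound = datetime(2022, 3, 1, tzinfo=timezone.utc)
subclass_bound = DateTimeSubclass(2022, 3, 1, tzinfo=timezone.utc)

plain_result = is_on_or_after(last_modified, plain_bound)
subclass_result = is_on_or_after(last_modified, subclass_bound)
```

Both calls compare the same instant, so callers can pass whichever type they already have. The bounds here are timezone-aware on the assumption that the `LastModified` values are too; Python raises `TypeError` when ordering a naive `datetime` against an aware one.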



