Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/14 00:30:32 UTC
[GitHub] [airflow] uranusjr commented on a change in pull request #22231: S3 list key filter
uranusjr commented on a change in pull request #22231:
URL: https://github.com/apache/airflow/pull/22231#discussion_r825527820
##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -255,6 +256,23 @@ def list_prefixes(
         return prefixes
+    def _list_key_object_filter(
+        self, keys: list, from_datetime: Optional[DateTime] = None, to_datetime: Optional[DateTime] = None
+    ) -> list:
+        if from_datetime is None and to_datetime is None:
+            return [k['Key'] for k in keys]
+        elif to_datetime is None:
+            return [k['Key'] for k in keys if k['LastModified'] >= from_datetime]
+        elif from_datetime is None:
+            return [k['Key'] for k in keys if k['LastModified'] < to_datetime]
+        else:
+            return [
+                k['Key']
+                for k in keys
+                if k['LastModified'] >= from_datetime and k['LastModified'] < to_datetime
+            ]
+        return [k['Key'] for k in keys]
Review comment:
How about
```suggestion
    def _list_key_object_filter(
        self, keys: list, from_datetime: Optional[DateTime] = None, to_datetime: Optional[DateTime] = None
    ) -> list:
        def _is_in_period(dt: datetime) -> bool:
            if from_datetime is not None and dt < from_datetime:
                return False
            if to_datetime is not None and dt > to_datetime:
                return False
            return True
        return [k['Key'] for k in keys if _is_in_period(k['LastModified'])]
```
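To make the behaviour of the suggested helper easier to check, here is a minimal standalone sketch of it. It is not the hook's actual code: the free function `list_key_object_filter` and the `sample_keys` data are hypothetical, and the dictionaries only assume the `Key` and `LastModified` fields used in the diff above.

```python
from datetime import datetime, timezone
from typing import Optional


def list_key_object_filter(
    keys: list,
    from_datetime: Optional[datetime] = None,
    to_datetime: Optional[datetime] = None,
) -> list:
    """Standalone copy of the suggested filter, for experimentation only."""

    def _is_in_period(dt: datetime) -> bool:
        # A missing bound means "unbounded" on that side.
        if from_datetime is not None and dt < from_datetime:
            return False
        if to_datetime is not None and dt > to_datetime:
            return False
        return True

    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]


# Hypothetical sample data with the two fields the filter reads.
sample_keys = [
    {"Key": "a.csv", "LastModified": datetime(2022, 3, 1, tzinfo=timezone.utc)},
    {"Key": "b.csv", "LastModified": datetime(2022, 3, 10, tzinfo=timezone.utc)},
    {"Key": "c.csv", "LastModified": datetime(2022, 3, 20, tzinfo=timezone.utc)},
]

start = datetime(2022, 3, 5, tzinfo=timezone.utc)
end = datetime(2022, 3, 20, tzinfo=timezone.utc)

print(list_key_object_filter(sample_keys))              # no bounds
print(list_key_object_filter(sample_keys, start))       # lower bound only
print(list_key_object_filter(sample_keys, None, end))   # upper bound only
print(list_key_object_filter(sample_keys, start, end))  # both bounds
```

One thing the sketch makes visible: because the suggestion tests `dt > to_datetime`, a key modified exactly at `to_datetime` is kept, whereas the code in the diff uses a strict `< to_datetime` and would drop it.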
##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -263,6 +281,10 @@ def list_keys(
         delimiter: Optional[str] = None,
         page_size: Optional[int] = None,
         max_items: Optional[int] = None,
+        start_after_key: Optional[str] = None,
+        from_datetime: Optional[DateTime] = None,
+        to_datetime: Optional[DateTime] = None,
Review comment:
I don’t think this needs to take `pendulum.DateTime`. A plain `datetime.datetime` works equally well, and because `pendulum.DateTime` is a subclass of `datetime.datetime`, callers can still pass pendulum values.
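The compatibility point can be illustrated with a small sketch. To stay self-contained it does not import pendulum; `StandInDateTime` is a hypothetical `datetime.datetime` subclass standing in for `pendulum.DateTime`, and `is_on_or_after` is an illustrative helper, not part of the hook.

```python
from datetime import datetime, timezone


class StandInDateTime(datetime):
    """Hypothetical stand-in for a datetime subclass such as pendulum.DateTime."""


def is_on_or_after(last_modified: datetime, from_datetime: datetime) -> bool:
    # Annotated with plain datetime; subclass instances are accepted as well.
    return last_modified >= from_datetime


# Both values are timezone-aware, so they can be compared directly.
last_modified = datetime(2022, 3, 10, tzinfo=timezone.utc)
boundary = StandInDateTime(2022, 3, 5, tzinfo=timezone.utc)

print(isinstance(boundary, datetime))
print(is_on_or_after(last_modified, boundary))
```

Annotating the parameters as `datetime.datetime` therefore widens what callers may pass without excluding subclass instances.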