Posted to commits@airflow.apache.org by "Lee-W (via GitHub)" <gi...@apache.org> on 2023/09/19 09:28:36 UTC

[GitHub] [airflow] Lee-W commented on a diff in pull request #34466: Add distinct function to MongoHook in apache-airflow-providers-mongo

Lee-W commented on code in PR #34466:
URL: https://github.com/apache/airflow/pull/34466#discussion_r1329834961


##########
airflow/providers/mongo/hooks/mongo.py:
##########
@@ -366,3 +366,22 @@ def delete_many(
         collection = self.get_collection(mongo_collection, mongo_db=mongo_db)
 
         return collection.delete_many(filter_doc, **kwargs)
+
+    def distinct(
+        self, mongo_collection: str, key: str, filter_doc: dict | None, mongo_db: str | None = None, **kwargs

Review Comment:
   Should we rename `key` to `distinct_key`? Or is `key` already intuitive enough for users?



##########
airflow/providers/mongo/hooks/mongo.py:
##########
@@ -366,3 +366,22 @@ def delete_many(
         collection = self.get_collection(mongo_collection, mongo_db=mongo_db)
 
         return collection.delete_many(filter_doc, **kwargs)
+
+    def distinct(
+        self, mongo_collection: str, key: str, filter_doc: dict | None, mongo_db: str | None = None, **kwargs
+    ) -> list[Any]:
+        """
+        Returns a list of distinct values for the given key across a collection.
+
+        https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html#pymongo.collection.Collection.distinct
+
+        :param mongo_collection: The name of the collection to perform distinct on.
+        :param key: The field to return distinct values from.
+        :param filter_doc: A query that matches the documents get distinct values from.
+            Optional. Defaults to {}.

Review Comment:
   This seems inaccurate: the docstring says `filter_doc` is optional and defaults to `{}`, but the signature gives the parameter no default. Should we add a default or change the docstring?
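
   A minimal sketch of the first option (adding a default), not a definitive implementation. `FakeCollection` and the module-level `distinct` helper are hypothetical stand-ins so the snippet runs without a MongoDB server; only the `filter_doc: dict | None = None` default is the actual suggestion, and forwarding it as pymongo's `filter` argument is an assumption about how the hook method would delegate.

```python
from typing import Any, List, Optional


class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo collection (illustration only)."""

    def __init__(self, docs: List[dict]) -> None:
        self.docs = docs

    def distinct(self, key: str, filter: Optional[dict] = None, **kwargs: Any) -> List[Any]:
        # Equality-only matching; real MongoDB filters are far richer.
        query = filter or {}
        seen: List[Any] = []
        for doc in self.docs:
            matches = all(doc.get(k) == v for k, v in query.items())
            if matches and key in doc and doc[key] not in seen:
                seen.append(doc[key])
        return seen


def distinct(
    collection: FakeCollection,
    key: str,
    filter_doc: Optional[dict] = None,  # the suggested default
    **kwargs: Any,
) -> List[Any]:
    """Return the distinct values of ``key``; ``filter_doc`` may be omitted."""
    return collection.distinct(key, filter=filter_doc, **kwargs)


docs = [
    {"team": "a", "lang": "py"},
    {"team": "b", "lang": "py"},
    {"team": "a", "lang": "go"},
]
coll = FakeCollection(docs)

all_langs = distinct(coll, "lang")  # no filter: every document is considered
team_b_langs = distinct(coll, "lang", {"team": "b"})  # only team "b" documents
```

   With a default in place, the docstring could say "Defaults to None" (or whatever value is chosen) instead of `{}`, so the two stay in sync.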


