Posted to commits@airflow.apache.org by "uranusjr (via GitHub)" <gi...@apache.org> on 2023/02/21 05:31:07 UTC

[GitHub] [airflow] uranusjr commented on a diff in pull request #29625: Aggressively cache entry points in process

uranusjr commented on code in PR #29625:
URL: https://github.com/apache/airflow/pull/29625#discussion_r1112560468


##########
airflow/utils/entry_points.py:
##########
@@ -28,26 +30,33 @@
 
 log = logging.getLogger(__name__)
 
+EPnD = Tuple[metadata.EntryPoint, metadata.Distribution]
 
-def entry_points_with_dist(group: str) -> Iterator[tuple[metadata.EntryPoint, metadata.Distribution]]:
-    """Retrieve entry points of the given group.
-
-    This is like the ``entry_points()`` function from importlib.metadata,
-    except it also returns the distribution the entry_point was loaded from.
 
-    :param group: Filter results to only this entrypoint group
-    :return: Generator of (EntryPoint, Distribution) objects for the specified groups
-    """
+@functools.lru_cache(maxsize=None)
+def _get_grouped_entry_points() -> dict[str, list[EPnD]]:
     loaded: set[str] = set()
+    mapping: dict[str, list[EPnD]] = collections.defaultdict(list)
     for dist in metadata.distributions():
         try:
             key = canonicalize_name(dist.metadata["Name"])

Review Comment:
   I checked the callers: this is currently used only to load the `airflow.plugins` and `apache_airflow_provider` entry point groups. Both callers implement their own deduplication logic, so I think it is safe to remove the deduplication here entirely. That would not actually help the latter case, though, since it still accesses `metadata` anyway…
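
   To make the suggestion concrete, here is a minimal sketch of what the cached helper might look like with the deduplication removed. It is an illustration, not the code in the PR: the loop body is an assumption, since the diff above is truncated, and it assumes every caller deduplicates on its own. The typing aliases are spelled with `Dict`/`List`/`Tuple` only to keep the snippet self-contained.

   ```python
   import collections
   import functools
   from importlib import metadata
   from typing import Dict, Iterator, List, Tuple

   EPnD = Tuple[metadata.EntryPoint, metadata.Distribution]


   @functools.lru_cache(maxsize=None)
   def _get_grouped_entry_points() -> Dict[str, List[EPnD]]:
       # Assumed simplification: no `loaded` set and no read of
       # `dist.metadata["Name"]`; callers are expected to deduplicate.
       mapping: Dict[str, List[EPnD]] = collections.defaultdict(list)
       for dist in metadata.distributions():
           for ep in dist.entry_points:
               mapping[ep.group].append((ep, dist))
       return mapping


   def entry_points_with_dist(group: str) -> Iterator[EPnD]:
       # `.get` avoids inserting empty lists into the cached defaultdict.
       return iter(_get_grouped_entry_points().get(group, ()))
   ```

   With this shape the distributions are scanned once per process, and later lookups for any group are plain dictionary reads. The trade-off is that distributions installed after the first call are not seen until the cache is cleared with `_get_grouped_entry_points.cache_clear()`.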


