Posted to commits@airflow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/08/15 05:33:00 UTC

[jira] [Commented] (AIRFLOW-2808) Plugin duplication checking is not working

    [ https://issues.apache.org/jira/browse/AIRFLOW-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580738#comment-16580738 ] 

ASF GitHub Bot commented on AIRFLOW-2808:
-----------------------------------------

XD-DENG closed pull request #3649: [AIRFLOW-2808] Fix Plugin Duplication Checking
URL: https://github.com/apache/incubator-airflow/pull/3649
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/airflow/plugins_manager.py b/airflow/plugins_manager.py
index 735f2de1e8..00200f66f1 100644
--- a/airflow/plugins_manager.py
+++ b/airflow/plugins_manager.py
@@ -28,6 +28,7 @@
 import os
 import re
 import sys
+from collections import Counter
 
 from airflow import configuration
 from airflow.utils.log.logging_mixin import LoggingMixin
@@ -90,8 +91,7 @@ def validate(cls):
                         issubclass(obj, AirflowPlugin) and
                         obj is not AirflowPlugin):
                     obj.validate()
-                    if obj not in plugins:
-                        plugins.append(obj)
+                    plugins.append(obj)
 
         except Exception as e:
             log.exception(e)
@@ -119,7 +119,12 @@ def make_module(name, objects):
 flask_blueprints = []
 menu_links = []
 
+uniq_plugin_modules = []
+
 for p in plugins:
+
+    uniq_plugin_modules.append(p.name)
+
     operators_modules.append(
         make_module('airflow.operators.' + p.name, p.operators + p.sensors))
     sensors_modules.append(
@@ -133,3 +138,9 @@ def make_module(name, objects):
     admin_views.extend(p.admin_views)
     flask_blueprints.extend(p.flask_blueprints)
     menu_links.extend(p.menu_links)
+
+plugins_counter = Counter(uniq_plugin_modules)
+if max(plugins_counter.values()) > 1:
+    log.warn("There are duplicated plugin files for method(s) %s.",
+             [p[0] for p in plugins_counter.items() if p[1] > 1])
+    log.warn("Among duplicated plugins of each method, only one to be loaded.")
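
The counting step added at the end of the diff can be sketched in isolation. This is only a minimal sketch, assuming plugin names are plain strings: the helper `find_duplicate_names` and the sample names are hypothetical and are not part of plugins_manager.py. It filters the counts directly instead of calling max() on them, since max() raises ValueError on an empty sequence, which would be the case when no plugins are loaded.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the sorted names that occur more than once in `names`."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical plugin names, standing in for `p.name` of each loaded plugin.
loaded = ["local_file_sensor_plugin", "s3_plugin", "local_file_sensor_plugin"]
duplicates = find_duplicate_names(loaded)

# With no plugins at all, the result is simply an empty list.
no_plugins = find_duplicate_names([])
```

A caller could then log a warning only when the returned list is non-empty.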


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Plugin duplication checking is not working
> ------------------------------------------
>
>                 Key: AIRFLOW-2808
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2808
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: plugins
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Major
>
> h2. *Background*
> A plugin duplication check was implemented in *plugins_manager.py* [https://github.com/apache/incubator-airflow/blob/master/airflow/plugins_manager.py#L93].
> The corresponding commit was [https://github.com/apache/incubator-airflow/commit/3f38dec9bf44717a275412d1fe155e8252e45ee5].
> However, this check does not actually work. The name of each plugin object is formed from the plugin file path, the plugin file name, and the plugin class name, so it can never be duplicated: there cannot be two files with the same name in the same directory.
> h2. *Issue*
> In my production environment, there are two plugin files whose _AirflowPlugin_ subclasses declare the same plugin name and the same operator names. However, they passed the check without any warning or exception.
> For example, I have a plugin file *file_sensor_1.py* as below:
> {code:python}
> from airflow.plugins_manager import AirflowPlugin
> from airflow.operators.sensors import BaseSensorOperator
> from airflow.utils.decorators import apply_defaults
> import os
> class local_file_sensor(BaseSensorOperator):
>     @apply_defaults
>     def __init__(self, file_path, *args, **kwargs):
>         super(local_file_sensor, self).__init__(*args, **kwargs)
>         self.file_path = file_path
>     def poke(self, context):
>         self.log.info('A-Poking: %s', self.file_path)
>         return os.path.exists(self.file_path)
> class AirflowLocalFileSensorPlugin(AirflowPlugin):
>     name = "local_file_sensor_plugin"
>     operators = [local_file_sensor]
> {code}
>  
> I copied and pasted it into another plugin file, *file_sensor_2.py*, changing only the log message from "_A-Poking_" to "_B-Poking_" (to help me check which one is picked).
> Only one of the two plugins is eventually loaded, because the one loaded earlier is overwritten by the one loaded later [https://github.com/apache/incubator-airflow/blob/master/airflow/plugins_manager.py#L101]. But which one? We don't know; it is indeterminate. So far, the file name seems to be the only factor affecting which one Airflow picks.
> h2. *My proposal*
> Give a WARNING to users when they launch Airflow. (Or should we raise an error and fail the launch?)
>  
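
The failure described in the issue can be reproduced without Airflow. The following is a minimal sketch under stated assumptions: `PluginBase`, `SensorPluginA`, and `SensorPluginB` are hypothetical stand-ins for _AirflowPlugin_ and the two plugin classes, and the dict `modules_by_name` is an assumed simplification of whatever registry maps a plugin name to its module. It illustrates that a membership test on class objects never flags two distinct classes, while anything keyed by the declared name keeps only the later one.

```python
class PluginBase(object):
    """Hypothetical stand-in for AirflowPlugin; only `name` matters here."""
    name = None


class SensorPluginA(PluginBase):  # as if defined in file_sensor_1.py
    name = "local_file_sensor_plugin"


class SensorPluginB(PluginBase):  # as if defined in file_sensor_2.py
    name = "local_file_sensor_plugin"


# Membership test on the class objects: the two classes are different
# objects, so the second one is never treated as a duplicate.
plugins = []
for obj in (SensorPluginA, SensorPluginB):
    if obj not in plugins:
        plugins.append(obj)

# Keyed by the declared name (assumed registry): the entry written later
# silently replaces the one written earlier.
modules_by_name = {}
for p in plugins:
    modules_by_name[p.name] = p
```

Both classes survive the membership check, yet only one remains reachable by name, which matches the behaviour reported above.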



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)