You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/30 00:20:55 UTC

[GitHub] [airflow] dimberman opened a new pull request #8637: Changes deprecated config check rules. Now uses regex to look for an …

dimberman opened a new pull request #8637:
URL: https://github.com/apache/airflow/pull/8637


   …old pattern in the val. Updates 'hostname_callable'.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [ ] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Target Github ISSUE in description if exists
   - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r418391929



##########
File path: tests/test_configuration.py
##########
@@ -440,33 +441,42 @@ def make_config():
             # lookup even if we remove this explicit fallback
             test_conf.deprecated_values = {
                 'core': {
-                    'task_runner': ('BashTaskRunner', 'StandardTaskRunner', '2.0'),
+                    'task_runner': (re.compile(r'\ABashTaskRunner\Z'), r'StandardTaskRunner', '2.0'),
+                    'hostname_callable': (re.compile(r':'), r'.', '2.0'),
                 },
             }
             test_conf.read_dict({
                 'core': {
                     'executor': 'SequentialExecutor',
                     'task_runner': 'BashTaskRunner',
                     'sql_alchemy_conn': 'sqlite://',
+                    'hostname_callable': 'socket:getfqdn',
                 },
             })
             return test_conf
 
         with self.assertWarns(FutureWarning):
             test_conf = make_config()
             self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+            self.assertEqual(test_conf.get('core', 'hostname_callable'), 'socket.getfqdn')
 
         with self.assertWarns(FutureWarning):
             with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='BashTaskRunner'):
                 test_conf = make_config()
 
                 self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+

Review comment:
       ```suggestion
   
   with self.assertWarns(FutureWarning):
   ```
   
   We should have a separate context manager/check for each instance too




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r417852019



##########
File path: airflow/configuration.py
##########
@@ -188,25 +189,37 @@ def _validate(self):
         for section, replacement in self.deprecated_values.items():
             for name, info in replacement.items():
                 old, new, version = info
-                if self.get(section, name, fallback=None) == old:
-                    # Make sure the env var option is removed, otherwise it
-                    # would be read and used instead of the value we set
-                    env_var = self._env_var_name(section, name)
-                    os.environ.pop(env_var, None)
-
-                    self.set(section, name, new)
-                    warnings.warn(
-                        'The {name} setting in [{section}] has the old default value '
-                        'of {old!r}. This value has been changed to {new!r} in the '
-                        'running config, but please update your config before Apache '
-                        'Airflow {version}.'.format(
-                            name=name, section=section, old=old, new=new, version=version
-                        ),
-                        FutureWarning
-                    )
+                current_value = self.get(section, name, fallback=None)
+                if self._using_old_value(old, current_value):
+                    new_value = re.sub(old, new, current_value)
+                    self._update_env_var(section=section, name=name, new_value=new_value)
+                    self._create_future_warning(name=name, section=section, current_value=current_value, new_value=new_value, version=version)
 
         self.is_validated = True
 
+    def _using_old_value(self, old, current_value):
+        pattern = re.compile(old)

Review comment:
       Not that it's big, but lets move the re.compile in to the deprecated_values dict.
   
   ```
       'task_runner': (re.compile(r'^BashTaskRunner\Z'), re.compile(r'StandardTaskRunner'), '2.0'),
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r417851317



##########
File path: airflow/configuration.py
##########
@@ -158,7 +158,8 @@ class AirflowConfigParser(ConfigParser):
     # about. Mapping of section -> setting -> { old, replace, by_version }
     deprecated_values = {
         'core': {
-            'task_runner': ('BashTaskRunner', 'StandardTaskRunner', '2.0'),
+            'task_runner': (r'^BashTaskRunner\Z', r'StandardTaskRunner', '2.0'),

Review comment:
       If we're using `\Z` should we use `\A`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jhtimmins commented on pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
jhtimmins commented on pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#issuecomment-621541510


   @dimberman This addresses an incompatibility between versions 1.10 and 2. For the `hostname_callable` config value, the notation changed from using colons (`socket:getfqdn`) to periods (`socket.getfqdn`). 
   
   This is a problem if you already have an airflow.cfg file (or unittest.cfg) from < 2.0, as airflow will continue to use the out-of-date values and throw an error. This branch does the following:
   1. Adds shim code to replace the colon w/ a period. 
   2. Modifies the way that deprecated config values are swapped out. Previously, they did it with simple string comparison. This uses a regex to search for and replace the values, because it's possible to use a custom `hostname_callable` value that is formatted in the deprecated manner. Using a regex gives more control about what constitutes a deprecated value.
   3. Adds FutureWarning alert when the deprecated style is in use.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#issuecomment-621726818


   (Don't merge this yet, Github Actions haven't run yet)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jhtimmins commented on a change in pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
jhtimmins commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r418110002



##########
File path: airflow/configuration.py
##########
@@ -188,25 +189,37 @@ def _validate(self):
         for section, replacement in self.deprecated_values.items():
             for name, info in replacement.items():
                 old, new, version = info
-                if self.get(section, name, fallback=None) == old:
-                    # Make sure the env var option is removed, otherwise it
-                    # would be read and used instead of the value we set
-                    env_var = self._env_var_name(section, name)
-                    os.environ.pop(env_var, None)
-
-                    self.set(section, name, new)
-                    warnings.warn(
-                        'The {name} setting in [{section}] has the old default value '
-                        'of {old!r}. This value has been changed to {new!r} in the '
-                        'running config, but please update your config before Apache '
-                        'Airflow {version}.'.format(
-                            name=name, section=section, old=old, new=new, version=version
-                        ),
-                        FutureWarning
-                    )
+                current_value = self.get(section, name, fallback=None)
+                if self._using_old_value(old, current_value):
+                    new_value = re.sub(old, new, current_value)
+                    self._update_env_var(section=section, name=name, new_value=new_value)
+                    self._create_future_warning(name=name, section=section, current_value=current_value, new_value=new_value, version=version)
 
         self.is_validated = True
 
+    def _using_old_value(self, old, current_value):
+        pattern = re.compile(old)

Review comment:
       @ashb I'm fine with that, but `re.sub` only supports a compiled pattern object for the old value, not the new value. So it'll need to be.
   `'task_runner': (re.compile(r'^BashTaskRunner\Z'), r'StandardTaskRunner', '2.0'),`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r417852863



##########
File path: airflow/configuration.py
##########
@@ -105,7 +106,6 @@ def default_config_yaml() -> dict:
     with open(file_path) as config_file:
         return yaml.safe_load(config_file)
 
-

Review comment:
       I think two blank lines between methods/classes is required by Flake8, no?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#issuecomment-621698837


   This lets us pull the change back in to 1.10.x, so that by the time 2.0 is around people will have had time and notice to update, without reading (the now quite long) UPDATING.md. Nice


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dimberman commented on pull request #8637: Changes deprecated config check rules.

Posted by GitBox <gi...@apache.org>.
dimberman commented on pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#issuecomment-621538762


   @jhtimmins could you please explain a bit more what this PR does and what bug it addresses?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r418391772



##########
File path: tests/test_configuration.py
##########
@@ -440,33 +441,42 @@ def make_config():
             # lookup even if we remove this explicit fallback
             test_conf.deprecated_values = {
                 'core': {
-                    'task_runner': ('BashTaskRunner', 'StandardTaskRunner', '2.0'),
+                    'task_runner': (re.compile(r'\ABashTaskRunner\Z'), r'StandardTaskRunner', '2.0'),
+                    'hostname_callable': (re.compile(r':'), r'.', '2.0'),
                 },
             }
             test_conf.read_dict({
                 'core': {
                     'executor': 'SequentialExecutor',
                     'task_runner': 'BashTaskRunner',
                     'sql_alchemy_conn': 'sqlite://',
+                    'hostname_callable': 'socket:getfqdn',
                 },
             })
             return test_conf
 
         with self.assertWarns(FutureWarning):
             test_conf = make_config()
             self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+            self.assertEqual(test_conf.get('core', 'hostname_callable'), 'socket.getfqdn')
 
         with self.assertWarns(FutureWarning):
             with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='BashTaskRunner'):
                 test_conf = make_config()
 
                 self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+
+            with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__HOSTNAME_CALLABLE='socket:getfqdn'):
+                test_conf = make_config()
+
+                self.assertEqual(test_conf.get('core', 'hostname_callable'), 'socket.getfqdn')
         with reset_warning_registry():
             with warnings.catch_warnings(record=True) as warning:
-                with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='NotBashTaskRunner'):
+                with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='NotBashTaskRunner', AIRFLOW__CORE__HOSTNAME_CALLABLE='CarrierPigeon'):

Review comment:
       > tests/test_configuration.py:475:0: C0301: Line too long (158/110) (line-too-long)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jhtimmins commented on pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
jhtimmins commented on pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#issuecomment-621927544


   Depends on https://github.com/apache/airflow/pull/8463. @dimberman mind adding that to the description?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8637: Add check for pre-2.0 style hostname_callable config value

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8637:
URL: https://github.com/apache/airflow/pull/8637#discussion_r418566928



##########
File path: airflow/configuration.py
##########
@@ -105,7 +106,6 @@ def default_config_yaml() -> dict:
     with open(file_path) as config_file:
         return yaml.safe_load(config_file)
 

Review comment:
       ```suggestion
   
   
   ```

##########
File path: tests/test_configuration.py
##########
@@ -440,33 +441,42 @@ def make_config():
             # lookup even if we remove this explicit fallback
             test_conf.deprecated_values = {
                 'core': {
-                    'task_runner': ('BashTaskRunner', 'StandardTaskRunner', '2.0'),
+                    'task_runner': (re.compile(r'\ABashTaskRunner\Z'), r'StandardTaskRunner', '2.0'),
+                    'hostname_callable': (re.compile(r':'), r'.', '2.0'),
                 },
             }
             test_conf.read_dict({
                 'core': {
                     'executor': 'SequentialExecutor',
                     'task_runner': 'BashTaskRunner',
                     'sql_alchemy_conn': 'sqlite://',
+                    'hostname_callable': 'socket:getfqdn',
                 },
             })
             return test_conf
 
         with self.assertWarns(FutureWarning):
             test_conf = make_config()
             self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+            self.assertEqual(test_conf.get('core', 'hostname_callable'), 'socket.getfqdn')
 
         with self.assertWarns(FutureWarning):
             with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='BashTaskRunner'):
                 test_conf = make_config()
 
                 self.assertEqual(test_conf.get('core', 'task_runner'), 'StandardTaskRunner')
+
+            with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__HOSTNAME_CALLABLE='socket:getfqdn'):
+                test_conf = make_config()
+
+                self.assertEqual(test_conf.get('core', 'hostname_callable'), 'socket.getfqdn')
         with reset_warning_registry():
             with warnings.catch_warnings(record=True) as warning:
-                with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='NotBashTaskRunner'):
+                with unittest.mock.patch.dict('os.environ', AIRFLOW__CORE__TASK_RUNNER='NotBashTaskRunner', AIRFLOW__CORE__HOSTNAME_CALLABLE='CarrierPigeon'):

Review comment:
       ```suggestion
                   with unittest.mock.patch.dict('os.environ',
                                                 AIRFLOW__CORE__TASK_RUNNER='NotBashTaskRunner',
                                                 AIRFLOW__CORE__HOSTNAME_CALLABLE='CarrierPigeon'):
   ```

##########
File path: airflow/configuration.py
##########
@@ -188,25 +189,36 @@ def _validate(self):
         for section, replacement in self.deprecated_values.items():
             for name, info in replacement.items():
                 old, new, version = info
-                if self.get(section, name, fallback=None) == old:
-                    # Make sure the env var option is removed, otherwise it
-                    # would be read and used instead of the value we set
-                    env_var = self._env_var_name(section, name)
-                    os.environ.pop(env_var, None)
-
-                    self.set(section, name, new)
-                    warnings.warn(
-                        'The {name} setting in [{section}] has the old default value '
-                        'of {old!r}. This value has been changed to {new!r} in the '
-                        'running config, but please update your config before Apache '
-                        'Airflow {version}.'.format(
-                            name=name, section=section, old=old, new=new, version=version
-                        ),
-                        FutureWarning
-                    )
+                current_value = self.get(section, name, fallback=None)
+                if self._using_old_value(old, current_value):
+                    new_value = re.sub(old, new, current_value)
+                    self._update_env_var(section=section, name=name, new_value=new_value)
+                    self._create_future_warning(name=name, section=section, current_value=current_value, new_value=new_value, version=version)

Review comment:
       ```suggestion
                       self._create_future_warning(name=name,
                                                   section=section,
                                                   current_value=current_value,
                                                   new_value=new_value,
                                                   version=version)
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org