You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/05/05 17:05:01 UTC

[jira] [Commented] (AIRFLOW-4364) Integrate Pylint

    [ https://issues.apache.org/jira/browse/AIRFLOW-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833415#comment-16833415 ] 

ASF GitHub Bot commented on AIRFLOW-4364:
-----------------------------------------

BasPH commented on pull request #5238: [AIRFLOW-4364] Add Pylint to CI
URL: https://github.com/apache/airflow/pull/5238
 
 
   This PR adds Pylint to the CI, initially **_only checking on changed lines_**. The default Pylint configuration is quite strict so I ran Pylint with default settings, collected all messages and went over every single message type and decided if it should be kept, changed or dropped.
   
   I started with a Bash script trying to combine changed lines and pylint output, which proved to be quite complex. So, I then added git-lint. Although it worked okay at first, there were several annoying features, such as broken formatting when changing the config and obsolete checkers for other files being applied which cannot be turned off. In the end, I couldn't get it to do everything I wanted and wrote a small Python script to combine the changed lines and Pylint output, which worked nicely.
   
   Note linting on changed lines is almost, but not completely fool-proof. There are some edge cases such as module-level messages which won't be picked up. Ideally Pylint is applied on complete files so I hope to make Airflow completely Pylint-compatible as soon as possible, however this is a lot of work.
   
   The list of all messages with default Pylint configuration on Airflow master:
   
   | Error                                        | Total | Action                         |
   |----------------------------------------------|-------|--------------------------------|
   | C0111 (missing-docstring)                    | 4770  | Valid error, no config change  |
   | C0103 (invalid-name)                         | 3030  | - Constants (1077): Currently all different styles are found in the codebase. Set to any.</br>- Variables (1643): See all different styles, everything seems valid to me, kept at snake_case.</br>- Functions (35): Not too many errors, kept at snake_case.</br>- Arguments (146): Kept snake_case but allowed for 1 char argument names</br>- Methods (36): Kept on snake_case.</br>- Classes (16): Kept on PascalCase. |
   | C0330 (bad-continuation)                     | 793   | Invalid error, ignored |
   | E0401 (import-error)                         | 712   | Ignored to keep Pylint CI task fast. Tests with devel env installed should fail anyways on missing dependencies. |
   | C0301 (line-too-long)                        | 690   | Set line length to 110 |
   | W0212 (protected-access)                     | 525   | Valid error, no change |
   | R0801 (Similar lines in X files)             | 404   | invalid, ignored |
   | R0913 (too-many-arguments)                   | 360   | Default at 5, set to 80% of the error values -> 10 |
   | C0411 (wrong-import-order)                   | 330   | Valid error, no change. More discussion here: https://github.com/apache/airflow/pull/4892#issuecomment-482873803 |
   | W0223 (abstract-method)                      | 320   | Don't think it's necessary, ignored |
   | W1113 (keyword-arg-before-vararg)            | 276   | Unnecessary, ignored |
   | R0201 (no-self-use)                          | 271   | Very common, not really a problem, ignored |
   | W0613 (unused-argument)                      | 268   | Valid error, no change |
   | R0401 (cyclic-import)                        | 148   | Valid error, no change |
   | E1101 (no-member)                            | 141   | Valid error, no change |
   | R1705 (no-else-return)                       | 137   | Don't consider this a problem. Ignored, can always undo in the future. |
   | R0914 (too-many-locals)                      | 132   | Default on 15, set to 80% of the error values -> 24 |
   | R0902 (too-many-instance-attributes)         | 122   | Default on 7, set to 80% of the error values -> 15 |
   | W0703 (broad-except)                         | 107   | Valid error, no change |
   | W0201 (attribute-defined-outside-init)       | 107   | Seen some cases in Airflow where this was indeed confusing, keep config |
   | W0104 (pointless-statement)                  | 91    | Mostly thrown on the bitshift operation in example DAGs (72x), but also saw some valid errors (19x). There's a discussion to ignore specific messages for specific directories for years: More info: https://github.com/PyCQA/pylint/issues/2584. For now, I'll explicitly disable pointless-statement in all example DAGs and keep the message. | 
   | R0903 (too-few-public-methods)               | 88    | Not really an issue, ignored |
   | W1505 (deprecated-method)                    | 79    | Valid, keeping config |
   | E1120 (no-value-for-parameter)               | 74    | Raised with @provide_session and some other (valid) cases. Would like to ignore in case of @provide_session, but currently not possible. Asked on Pylint Github: https://github.com/PyCQA/pylint/issues/2894. Keeping config for now. |
   | C1801 (len-as-condition)                     | 73    | Valid, keep config |
   | W0612 (unused-variable)                      | 65    | Ignored args and kwargs keep checking other variables |
   | R0205 (useless-object-inheritance)           | 65    | Should all be resolved after https://issues.apache.org/jira/browse/AIRFLOW-4206, keep config |
   | W0231 (super-init-not-called)                | 60    | I believe this is a valid error, but there are simply too many hooks subclassing from BaseHook and not calling super() to keep it. Ignoring for now and hopefully cleanup later. |
   | R1720 (no-else-raise)                        | 57    | Not really a problem, ignored |
   | C0412 (ungrouped-imports)                    | 55    | Valid, keep config |
   | W0621 (redefined-outer-name)                 | 54    | Valid, keep config |
   | W0511 (fixme)                                | 53    | There should be a good reason for a TODO, but I don't see a problem adding one. Ignore. |
   | W0622 (redefined-builtin)                    | 52    | Valid, keep config |
   | W0221 (arguments-differ)                     | 45    | Doesn't always seem valid, ignored |
   | R0912 (too-many-branches)                    | 43    | Default 12, set to 80% of error values -> 22 |
   | R1710 (inconsistent-return-statements)       | 39    | Valid, keep config |
   | W0107 (unnecessary-pass)                     | 36    | In case of empty function, docstring acts as statement and pass is superfluous, keep config |
   | E1123 (unexpected-keyword-arg)               | 36    | Valid, keep config |
   | R0915 (too-many-statements)                  | 35    | Default at 50, set to 80% of the error values -> 69 |
   | C0413 (wrong-import-position)                | 31    | Valid, keep config |
   | R0904 (too-many-public-methods)              | 30    | Default at 20, set to 50% of the error values -> 27 |
   | W1202 (logging-format-interpolation)         | 23    | Valid, keep config |
   | W0611 (unused-import)                        | 22    | Valid, keep config |
   | W0603 (global-statement)                     | 20    | Valid, keep config |
   | E1111 (assignment-from-no-return)            | 20    | Valid, keep config |
   | W0106 (expression-not-assigned)              | 18    | Sometimes valid, sometimes not. Ignore manually in code |
   | C0123 (unidiomatic-typecheck)                | 18    | Valid, keep config |
   | W0222 (signature-differs)                    | 17    | Valid, keep config |
   | W0235 (useless-super-delegation)             | 16    | Valid, keep config |
   | C0302 (too-many-lines)                       | 16    | Good indication the file should be split, keep config |
   | C0122 (misplaced-comparison-constant)        | 16    | Valid, keep config |
   | C0325 (superfluous-parens)                   | 15    | Valid, keep config |
   | W0715 (raising-format-tuple)                 | 13    | Valid, keep config |
   | E0611 (no-name-in-module)                    | 11    | Valid, keep config |
   | W0404 (reimported)                           | 10    | Valid, keep config |
   | C0121 (singleton-comparison)                 | 10    | Valid, keep config |
   | E0213 (no-self-argument)                     | 9     | Valid, keep config |
   | R1714 (consider-using-in)                    | 8     | Valid, keep config |
   | W1201 (logging-not-lazy)                     | 7     | Valid, keep config |
   | R1718 (consider-using-set-comprehension)     | 7     | Valid, keep config |
   | R0911 (too-many-return-statements)           | 7     | Indication there might be a nicer way to solve returns, keep config |
   | R1719 (simplifiable-if-expression)           | 6     | Valid, keep config |
   | W0105 (pointless-string-statement)           | 5     | Valid, keep config |
   | R1704 (redefined-argument-from-local)        | 5     | Valid, keep config |
   | E1305 (too-many-format-args)                 | 5     | Pylint fails on multiline "".format(), ignore. |
   | E0602 (undefined-variable)                   | 5     | Seems valid, keep config and ignore inline |
   | W1308 (duplicate-string-formatting-argument) | 4     | Valid, keep config |
   | W0102 (dangerous-default-value)              | 4     | Valid, keep config |
   | E1102 (not-callable)                         | 4     | Valid, keep config |
   | W1509 (subprocess-popen-preexec-fn)          | 3     | Valid, keep config |
   | W0640 (cell-var-from-loop)                   | 3     | Code seems valid, ignore error |
   | W0631 (undefined-loop-variable)              | 3     | Valid, keep config |
   | R1711 (useless-return)                       | 3     | Valid, keep config |
   | R1707 (trailing-comma-tuple)                 | 3     | Valid, keep config |
   | R1703 (simplifiable-if-statement)            | 3     | Valid, keep config |
   | C0200 (consider-using-enumerate)             | 3     | Valid, keep config |
   | W0706 (try-except-raise)                     | 2     | Valid, keep config |
   | W0602 (global-variable-not-assigned)         | 2     | Valid, keep config |
   | W0150 (lost-exception)                       | 2     | Valid, keep config |
   | R1701 (consider-merging-isinstance)          | 2     | Valid, keep config |
   | E1205 (logging-too-many-args)                | 2     | Valid, keep config |
   | E1136 (unsubscriptable-object)               | 2     | Valid, keep config |
   | E1133 (not-an-iterable)                      | 2     | Valid, keep config |
   | E1121 (too-many-function-args)               | 2     | Valid, keep config |
   | E0211 (no-method-argument)                   | 2     | Valid, keep config |
   | E0203 (access-member-before-definition)      | 2     | Valid, keep config |
   | C0303 (trailing-whitespace)                  | 2     | Valid, keep config |
   | C0201 (consider-iterating-dictionary)        | 2     | Valid, keep config |
   | W1503 (redundant-unittest-assert)            | 1     | Valid, keep config |
   | W0702 (bare-except)                          | 1     | Valid, keep config |
   | W0604 (global-at-module-level)               | 1     | Valid, keep config |
   | W0601 (global-variable-undefined)            | 1     | Valid, keep config |
   | W0402 (deprecated-module)                    | 1     | Valid, keep config |
   | W0401 (wildcard-import)                      | 1     | Valid, keep config |
   | W0233 (non-parent-init-called)               | 1     | Valid, keep config |
   | W0125 (using-constant-test)                  | 1     | Valid, keep config |
   | R1717 (consider-using-dict-comprehension)    | 1     | Valid, keep config |
   | R1715 (consider-using-get)                   | 1     | Valid, keep config |
   | R1702 (too-many-nested-blocks)               | 1     | 5 seems sensible default, keep config |
   | R0916 (too-many-boolean-expressions)         | 1     | 5 seems sensible default, keep config |
   | E0303 (invalid-length-returned)              | 1     | Valid, keep config |
   | E0202 (method-hidden)                        | 1     | Valid, keep config |
   | C0113 (unneeded-not)                         | 1     | Valid, keep config |
   | C0102 (blacklisted-name)                     | 1     | Removed foo, bar & baz from the backlisted names. Useful for example DAGs. |
   
   ---------
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-4364
     - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
     - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
     - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
   Check Pylint on changed lines.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes how to use it.
     - All the public functions and the classes in the PR contain docstrings that explain what it does
     - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Integrate Pylint
> ----------------
>
>                 Key: AIRFLOW-4364
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4364
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Bas Harenslak
>            Priority: Major
>
> After a [vote on the mailing list|https://lists.apache.org/thread.html/f4940d36e98ded96a2473bb2ccdfa4cc648faa2c1334b2aa901c0bba@%3Cdev.airflow.apache.org%3E] everybody voted for pylint integration. It involves a big change to the codebase, so let's do it in 2 steps:
>  # Check pylint only on changed code in the CI.
>  # After a while we should have a good pylint config, and the remaining non-checked code should be made compatible with pylint, i.e. enable pylint in the CI on the full project.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)