Posted to commits@airflow.apache.org by "Kamil Bregula (JIRA)" <ji...@apache.org> on 2019/06/20 05:56:00 UTC

[jira] [Comment Edited] (AIRFLOW-4364) Integrate Pylint

    [ https://issues.apache.org/jira/browse/AIRFLOW-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868276#comment-16868276 ] 

Kamil Bregula edited comment on AIRFLOW-4364 at 6/20/19 5:55 AM:
-----------------------------------------------------------------

I put together an interesting statistic that should help us plan our work better. Anything with more than about 100 errors should be divided into smaller items; otherwise, both the work and the review become hard.

 
{code:java}
07:49 $ cat a.txt | sed $'s,\x1b\\[[0-9;]*[a-zA-Z],,g' | grep ".py:" | cut -d ":" -f 1 | sort | sed -e "s|\([a-zA-Z_0-9]*\)\.py$||" | uniq -c | sort -r
 526 tests/contrib/operators/
 458 airflow/contrib/hooks/
 405 airflow/migrations/versions/
 301 airflow/www/
 278 airflow/models/
 239 airflow/contrib/operators/
 234 tests/contrib/hooks/
 168 tests/operators/
 167 tests/models/
 152 airflow/hooks/
 142 airflow/bin/
 131 airflow/utils/
 120 airflow/operators/
 119 scripts/perf/
 102 tests/
 95 airflow/contrib/auth/backends/
 89 tests/jobs/
 75 tests/hooks/
 72 airflow/executors/
 70 airflow/contrib/sensors/
 69 airflow/jobs/
 64 tests/contrib/sensors/
 60 tests/www/
 56 airflow/utils/
 55 tests/executors/
 51 tests/sensors/
 51 airflow/utils/log/
 49 airflow/contrib/example_dags/
 49 airflow/
 44 airflow/ti_deps/deps/
 43 tests/utils/
 40 airflow/kubernetes/
 35 docs/exts/
 35 airflow/kubernetes/kubernetes_request_factory/
 26 airflow/sensors/
 24 airflow/contrib/plugins/metastore_browser/
 21 tests/
 21 airflow/
 20 tests/cli/
 20 tests/api/common/experimental/
 19 airflow/contrib/utils/
 17 tests/ti_deps/deps/
 17 tests/dags/
 14 tests/utils/log/
 14 airflow/lineage/
 13 tests/contrib/utils/
 12 tests/minikube/
 12 airflow/lineage/
 12 airflow/
 11 tests/www/api/experimental/
 10 tests/utils/log/elasticmock/
 10 airflow/security/
 10 airflow/migrations/
 10 airflow/macros/
 10 airflow/
 8 airflow/
 7 tests/task/task_runner/
 7 airflow/task/task_runner/
 7 airflow/contrib/utils/
 6 airflow/www/api/experimental/
 6 airflow/lineage/backend/atlas/
 6 airflow/
 5 tests/macros/
 5 tests/lineage/backend/
 5 tests/kubernetes/kubernetes_request_factory/
 5 airflow/contrib/utils/log/
 5 airflow/contrib/task_runner/
 4 tests/test_utils/
 4 tests/plugins/
 4 scripts/perf/dags/
 4 airflow/config_templates/
 3 tests/security/
 3 tests/lineage/
 3 docs/
 3 airflow/kubernetes/
 2 tests/task/
 2 tests/kubernetes/
 2 dags/
 2 airflow/lineage/backend/
 2 airflow/contrib/example_dags/libs/
 1 tests/api/client/
 1 airflow/ti_deps/
 1 airflow/
{code}
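Some directories (for example airflow/utils/ and airflow/) appear more than once in the listing above. This is likely because the pipeline sorts by full file path before stripping the file name, so uniq -c does not always see entries from the same directory next to each other. A minimal Python sketch of the same count that merges each directory into a single total is given here. It assumes, as the cut in the original pipeline implies, that every message line starts with the file path followed by a colon; the function name and the sample lines are hypothetical.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Same ANSI colour-code pattern that the sed step in the pipeline strips.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[a-zA-Z]")
# Assumption: a message line looks like "<path>.py:<rest of the message>".
MESSAGE_LINE = re.compile(r"^(?P<path>[^:]+\.py):")


def count_messages_per_directory(lines):
    """Return (directory, message count) pairs, highest count first."""
    counts = Counter()
    for raw_line in lines:
        line = ANSI_ESCAPE.sub("", raw_line).strip()
        match = MESSAGE_LINE.match(line)
        if match is None:
            continue  # skip module banners and other non-message lines
        directory = PurePosixPath(match.group("path")).parent.as_posix() + "/"
        counts[directory] += 1
    return counts.most_common()


# Hypothetical sample input, not taken from the real a.txt.
sample = [
    "\x1b[1mairflow/utils/db.py:10:0: C0111: Missing docstring\x1b[0m",
    "airflow/utils/log/file.py:3:4: W0613: Unused argument",
    "airflow/utils/net.py:7:0: C0103: Invalid name",
    "************* Module airflow.utils.db",
]

for directory, count in count_messages_per_directory(sample):
    print(count, directory)
```

Counting in a dictionary keyed by directory makes the result independent of the input order, which is what the sort-then-strip approach in the shell pipeline cannot guarantee. Files at the repository root are reported under "./" in this sketch.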
 


was (Author: kamil.bregula):
I put together an interesting statistic that should help us plan our work better. Anything with more than about 50 errors should be divided into smaller items; otherwise, both the work and the review become hard.

 
 

> Integrate Pylint
> ----------------
>
>                 Key: AIRFLOW-4364
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4364
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: ci
>    Affects Versions: 2.0.0
>            Reporter: Bas Harenslak
>            Priority: Major
>
> In a [vote on the mailing list|https://lists.apache.org/thread.html/f4940d36e98ded96a2473bb2ccdfa4cc648faa2c1334b2aa901c0bba@%3Cdev.airflow.apache.org%3E], everybody voted for Pylint integration.
> Making the whole project Pylint compatible is a lot of work and a big change. Therefore, we split the work into subissues under this issue. The approach is as follows:
> All files are currently blacklisted from Pylint. The blacklist is stored in scripts/ci/pylint_todo.txt. Every subissue relates to one or more files on the blacklist. Once you start on an issue:
>  # (running scripts/ci/ci_pylint.sh on master should produce no messages)
>  # Remove the files mentioned in your issue from the blacklist
>  # Run scripts/ci/ci_pylint.sh to see all messages on the no longer blacklisted files
>  # Fix all messages and create a PR
> *Why a separate blacklist file and not use Pylint's --ignore-patterns to ignore files?*
>  --ignore-patterns only works on base filenames, not paths.
> *Why don't you blacklist patterns, where 1 line relates to 1 JIRA issue?*
>  Creating a list of non-overlapping patterns proved difficult, so the per-file blacklist was the pragmatic solution.
> *Rule X is too strict. Can we disable it?*
> In the first PR ([https://github.com/apache/airflow/pull/5238]) we made a decision on every error that Pylint found on Airflow master at the time. While it may occasionally seem harsh to be strict about the code, Airflow is an open source project with many contributors from all over the world. Others read the code without the thought process you put into it, so descriptive variable names, docstrings, and adherence to Python conventions help. This benefits collaboration between everybody, including your future self. Typically, this question suggests one is trying to lower the bar. If you believe there is a valid reason for doing so, please add it to the PR and explain the reason.
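
The "remove the files mentioned in your issue from the blacklist" step from the issue description could be sketched roughly as follows. This is only an illustration: it assumes scripts/ci/pylint_todo.txt holds one file path per line, and the function name and the sample entries are hypothetical.

```python
def remove_from_blacklist(blacklist_text, fixed_paths):
    """Return the blacklist text without the entries listed in fixed_paths."""
    fixed = {path.strip() for path in fixed_paths}
    kept = [
        line
        for line in blacklist_text.splitlines()
        if line.strip() not in fixed
    ]
    # Keep a trailing newline as long as any entries remain.
    return "\n".join(kept) + ("\n" if kept else "")


# Illustrative blacklist content; the real file may use a different layout.
blacklist = (
    "./airflow/__init__.py\n"
    "./airflow/bin/cli.py\n"
    "./airflow/models/dag.py\n"
)

updated = remove_from_blacklist(blacklist, ["./airflow/bin/cli.py"])
print(updated, end="")
```

After writing the updated text back to the blacklist file, running scripts/ci/ci_pylint.sh would show the messages for the files that are no longer blacklisted.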


