You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/04/28 04:42:46 UTC

[GitHub] [airflow] dstandish opened a new pull request, #23317: WIP - Add dag configuration error model

dstandish opened a new pull request, #23317:
URL: https://github.com/apache/airflow/pull/23317

   We can use this to track (and surface to the user in the web UI) configuration problems that don't rise to the level of failing the dag parse, such tasks that reference nonexistent pools.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r895405933


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def process_dag_warnings(session: Session, dagbag: DagBag) -> None:
+        """
+        For the DAGs in the given DagBag, record any associated configuration warnings and clear
+        warnings for files that no longer have them. These are usually displayed through the
+        Airflow UI so that users know that there are issues parsing DAGs.
+
+        :param session: session for ORM operations
+        :param dagbag: DagBag containing DAGs with configuration warnings
+        """
+        stored_config_warnings = set(session.query(DagWarning).all())

Review Comment:
   Yeah. Agree that optimization here is likely premature.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r894312527


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def process_dag_warnings(session: Session, dagbag: DagBag) -> None:
+        """
+        For the DAGs in the given DagBag, record any associated configuration warnings and clear
+        warnings for files that no longer have them. These are usually displayed through the
+        Airflow UI so that users know that there are issues parsing DAGs.
+
+        :param session: session for ORM operations
+        :param dagbag: DagBag containing DAGs with configuration warnings
+        """
+        stored_config_warnings = set(session.query(DagWarning).all())

Review Comment:
   Instead of having to loop through all the DagWarning rows, perhaps we should store `configuration_warnings` as an id: DagWarning dict, and do
   
   ```python
   to_delete = session.query(DagWarning).filter(
       DagWarning.id.notin_(dagbag.configuration_warnings),
   )
   ```
   
   instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r897614763


##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,93 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, Integer, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    id = Column(
+        Integer, autoincrement=True, server_default='id'
+    )  # this default just signals to SQLA to defer to server

Review Comment:
   I think that if we use `merge` for updating items and `add` for adding new items we won't have issues with the surrogate key. We can still use `merge` for new items, we just need to ensure that the item is not already in `session` (see https://www.bookstack.cn/read/sqlalchemy-1.3/999f5d55791da984.md). Even the proposed solution will have an issue with `merge` if the item already exists in the session.
   
   Another thing is, would this table confuse other SQLAlchemy users? It feels like a hack to have an `id` field with a `server_default` value and an `autoincrement` set. If we want to use it, then we might need to make a note and also add a link for reference purposes why we need it.
   
   Also, it feels like we will also have issues reconciling the ORM and DB for this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish merged pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish merged PR #23317:
URL: https://github.com/apache/airflow/pull/23317


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r897621531


##########
airflow/migrations/versions/0111_2_4_0_add_dagwarning_model.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagWarning model
+
+Revision ID: 424117c37d18
+Revises: 3c94c427fdf6
+Create Date: 2022-04-27 15:57:36.736743
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = '3c94c427fdf6'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagWarning model"""
+    op.create_table(
+        'dag_warning',
+        Column(
+            'id',
+            Integer,
+            nullable=False,
+            autoincrement=True,
+            primary_key=True,
+        ),
+        sa.Column('dag_id', StringID(), nullable=False),
+        sa.Column('warning_type', sa.String(length=50), nullable=False),
+        sa.Column('message', sa.String(1000), nullable=False),
+        sa.Column('timestamp', sa.DateTime(), nullable=False),

Review Comment:
   “Traditionally” Airflow uses a timezone-less column to store UTC time, and do the conversion in with `UtcDateTime`. But I _think_ there’s a loose recognision to use `TIMESTAMP` in new code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r891964399


##########
airflow/migrations/versions/0109_424117c37d18_add_dagconfiguratioerror_model.py:
##########
@@ -0,0 +1,75 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagConfiguratioError model
+
+Revision ID: 424117c37d18
+Revises: b1b348e02d07
+Create Date: 2022-04-27 15:57:36.736743
+
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = 'b1b348e02d07'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagConfiguratioError model"""
+    op.create_table(
+        'dag_configuration_warning',

Review Comment:
   I wonder if we should just call this `dag_warning`; I recall I wanted this for something else on parse-time that isn’t quite `ImportError` (don’t remember exactly what).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r899448817


##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,93 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, Integer, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    id = Column(
+        Integer, autoincrement=True, server_default='id'
+    )  # this default just signals to SQLA to defer to server

Review Comment:
   The whole point of merge is you don't know whether the object is there.  And if it's a composite primary key, it's a pain to build out the query to check for its existence.  That's what's nice about merge.
   
   In any case, we don't really need the surrogate key here so I'll just chop it.  But this will likely come up again at some point.  And it seems this pattern, of using the logical key in ORM while having the surrogate key as the physical key is a good approach for this type of thing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r899456658


##########
tests/always/test_example_dags.py:
##########
@@ -60,7 +60,7 @@ def test_should_be_importable(example):
 
 @pytest.mark.parametrize("example", example_dags_except_db_exception(), ids=relative_path)
 def test_should_not_do_database_queries(example):
-    with assert_queries_count(0):
+    with assert_queries_count(0, ignore_patterns=['dagbag.py:validate_task_pools']):

Review Comment:
   ok @potiuk i moved the logic to dag processor.  so it will not run in every DagBag load, only when dags are loaded by the dag processor 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r899469777


##########
airflow/api_connexion/endpoints/dag_warning_endpoint.py:
##########
@@ -0,0 +1,70 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import Optional
+
+from sqlalchemy import func
+from sqlalchemy.orm import Session
+
+from airflow.api_connexion import security
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import apply_sorting, check_limit, format_parameters
+from airflow.api_connexion.schemas.dag_warning_schema import (
+    DagWarningCollection,
+    dag_warning_collection_schema,
+    dag_warning_schema,
+)
+from airflow.api_connexion.types import APIResponse
+from airflow.models.dagwarning import DagWarning as DagWarningModel
+from airflow.security import permissions
+from airflow.utils.session import NEW_SESSION, provide_session
+
+
+@security.requires_access([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG_WARNING)])
+@provide_session
+def get_dag_warning(*, dag_warning_id: int, session: Session = NEW_SESSION) -> APIResponse:
+    """Get a DAG warning"""
+    error = session.query(DagWarningModel).get(dag_warning_id)
+
+    if error is None:
+        raise NotFound(
+            "Dag warning not found",
+            detail=f"The DagWarning with dag_warning_id: `{dag_warning_id}` was not found",
+        )
+    return dag_warning_schema.dump(error)
+
+
+@security.requires_access([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG_WARNING)])
+@format_parameters({'limit': check_limit})
+@provide_session
+def get_dag_warnings(
+    *,
+    limit: int,
+    offset: Optional[int] = None,
+    order_by: str = "dag_warning_id",
+    session: Session = NEW_SESSION,
+) -> APIResponse:
+    """Get all import errors"""
+    to_replace = {"dag_warning_id": 'id'}
+    allowed_filter_attrs = ['dag_warning_id', "timestamp", "warning_type", "message"]
+    total_entries = session.query(func.count(DagWarningModel.id)).scalar()

Review Comment:
   Looks like this should be `DagWarningModel.dag_id`?



##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,88 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    dag_id = Column(String(ID_LEN), primary_key=True, nullable=False)

Review Comment:
   ```suggestion
       dag_id = Column(StringID(), primary_key=True, nullable=False)
   ```
   nit: We can import it from models.base



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r894312527


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def process_dag_warnings(session: Session, dagbag: DagBag) -> None:
+        """
+        For the DAGs in the given DagBag, record any associated configuration warnings and clear
+        warnings for files that no longer have them. These are usually displayed through the
+        Airflow UI so that users know that there are issues parsing DAGs.
+
+        :param session: session for ORM operations
+        :param dagbag: DagBag containing DAGs with configuration warnings
+        """
+        stored_config_warnings = set(session.query(DagWarning).all())

Review Comment:
   Instead of having to loop through all the DagWarning rows, this should use the database level filter.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r895443690


##########
airflow/security/permissions.py:
##########
@@ -32,6 +32,7 @@
 RESOURCE_DAG_CODE = "DAG Code"
 RESOURCE_DAG_RUN = "DAG Runs"
 RESOURCE_IMPORT_ERROR = "ImportError"
+RESOURCE_DAG_WARNING = "DagWarning"

Review Comment:
   This should be sorted alphabetically. `DAG Warnings` is probably better? (I plan to rename ImportError to DAG Errors after this)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896035280


##########
tests/always/test_example_dags.py:
##########
@@ -60,7 +60,7 @@ def test_should_be_importable(example):
 
 @pytest.mark.parametrize("example", example_dags_except_db_exception(), ids=relative_path)
 def test_should_not_do_database_queries(example):
-    with assert_queries_count(0):
+    with assert_queries_count(0, ignore_patterns=['dagbag.py:validate_task_pools']):

Review Comment:
   wait.... nevermind... as it is, we only make _one_ query at the end after all DAG parsing has completed.  so it's not like it's querying once for every DAG.  this to me seems pretty reasonable? 
   
   but I guess DagBag load doesn't only happen from the dag processor service, but also perhaps when tasks are run....  so maybe there's something here to optimize.
   
   perhaps we could make it so that this kind of check is disabled except when loading dagbag from dag processor service. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r897267396


##########
airflow/migrations/versions/0111_2_4_0_add_dagwarning_model.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagWarning model
+
+Revision ID: 424117c37d18
+Revises: 3c94c427fdf6
+Create Date: 2022-04-27 15:57:36.736743
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = '3c94c427fdf6'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagWarning model"""
+    op.create_table(
+        'dag_warning',
+        Column(
+            'id',
+            Integer,
+            nullable=False,
+            autoincrement=True,
+            primary_key=True,
+        ),
+        sa.Column('dag_id', StringID(), nullable=False),
+        sa.Column('warning_type', sa.String(length=50), nullable=False),
+        sa.Column('message', sa.String(1000), nullable=False),
+        sa.Column('timestamp', sa.DateTime(), nullable=False),

Review Comment:
   I just copied from import_error.   what's the current standard?   we should go with whatever is the current "right way"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896031303


##########
tests/always/test_example_dags.py:
##########
@@ -60,7 +60,7 @@ def test_should_be_importable(example):
 
 @pytest.mark.parametrize("example", example_dags_except_db_exception(), ids=relative_path)
 def test_should_not_do_database_queries(example):
-    with assert_queries_count(0):
+    with assert_queries_count(0, ignore_patterns=['dagbag.py:validate_task_pools']):

Review Comment:
   valid concern.  maybe it's possible to query once at start of dag processing and let all the processes have the information through an attr.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r895446783


##########
airflow/api_connexion/schemas/dag_warning_schema.py:
##########
@@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import List, NamedTuple
+
+from marshmallow import Schema, fields
+from marshmallow_sqlalchemy import SQLAlchemySchema, auto_field
+
+from airflow.models.dagwarning import DagWarning
+
+
+class DagWarningSchema(SQLAlchemySchema):
+    """Import error schema"""
+
+    class Meta:
+        """Meta"""
+
+        model = DagWarning
+
+    dag_warning_id = auto_field("id", dump_only=True)

Review Comment:
   I think we can use `data_key` for this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896432641


##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,93 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, Integer, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    id = Column(
+        Integer, autoincrement=True, server_default='id'
+    )  # this default just signals to SQLA to defer to server

Review Comment:
   Do we need the default? I think `autoincrement` will add the needed default. Also, this table has primary key on `dag_id`, `warning_type` but the migration doesn't have these columns as the primary key. Both should have the same configuration? 
   Only the ID field is a primary key on the migration file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kaxil commented on a diff in pull request #23317: WIP - Add dag configuration error model

Posted by GitBox <gi...@apache.org>.
kaxil commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r860764879


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def update_configuration_errors(session: Session, dagbag: DagBag) -> None:

Review Comment:
   Hmm naming is hard, "configuration" here led me to think this is something around configs in airflow.cfg especially the name of Model in `models/dagconfigurationerror.py`
   
   - DagParsingErrors 
   - AirflowInputError
   
   or we just have a `AirflowUserError` and we can remove `import_error` table and use this instead where `ImportError` type is replaced a ParsingError?
   
   (Just thinking it out loud -- no strong opinion on any of the above)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r892502639


##########
airflow/migrations/versions/0109_424117c37d18_add_dagconfiguratioerror_model.py:
##########
@@ -0,0 +1,75 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagConfiguratioError model
+
+Revision ID: 424117c37d18
+Revises: b1b348e02d07
+Create Date: 2022-04-27 15:57:36.736743
+
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = 'b1b348e02d07'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagConfiguratioError model"""
+    op.create_table(
+        'dag_configuration_warning',

Review Comment:
   I like dag warning 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r895410000


##########
tests/always/test_example_dags.py:
##########
@@ -60,7 +60,7 @@ def test_should_be_importable(example):
 
 @pytest.mark.parametrize("example", example_dags_except_db_exception(), ids=relative_path)
 def test_should_not_do_database_queries(example):
-    with assert_queries_count(0):
+    with assert_queries_count(0, ignore_patterns=['dagbag.py:validate_task_pools']):

Review Comment:
   Are you sure we want to do it @dstandish  ? This basically means that whenever we parse DAGs we make Database queries. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #23317:
URL: https://github.com/apache/airflow/pull/23317#issuecomment-1115406625

   update... got flash working, got migration working and tried on all 4 dbs.  now just have to write some tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #23317:
URL: https://github.com/apache/airflow/pull/23317#issuecomment-1159811711

   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r899455898


##########
airflow/migrations/versions/0111_2_4_0_add_dagwarning_model.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagWarning model
+
+Revision ID: 424117c37d18
+Revises: 3c94c427fdf6
+Create Date: 2022-04-27 15:57:36.736743
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = '3c94c427fdf6'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagWarning model"""
+    op.create_table(
+        'dag_warning',
+        Column(
+            'id',
+            Integer,
+            nullable=False,
+            autoincrement=True,
+            primary_key=True,
+        ),
+        sa.Column('dag_id', StringID(), nullable=False),
+        sa.Column('warning_type', sa.String(length=50), nullable=False),
+        sa.Column('message', sa.String(1000), nullable=False),
+        sa.Column('timestamp', sa.DateTime(), nullable=False),

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #23317:
URL: https://github.com/apache/airflow/pull/23317#issuecomment-1158140956

   not able to get this endpoint to show up in swagger ... and it doesn't work with requests.... if someone has a tip


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896427720


##########
airflow/migrations/versions/0111_2_4_0_add_dagwarning_model.py:
##########
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Add DagWarning model
+
+Revision ID: 424117c37d18
+Revises: 3c94c427fdf6
+Create Date: 2022-04-27 15:57:36.736743
+"""
+
+import sqlalchemy as sa
+from alembic import op
+
+# revision identifiers, used by Alembic.
+from sqlalchemy import Column, Integer
+
+from airflow.migrations.db_types import StringID
+
+revision = '424117c37d18'
+down_revision = '3c94c427fdf6'
+branch_labels = None
+depends_on = None
+airflow_version = '2.4.0'
+
+
+def upgrade():
+    """Apply Add DagWarning model"""
+    op.create_table(
+        'dag_warning',
+        Column(
+            'id',
+            Integer,
+            nullable=False,
+            autoincrement=True,
+            primary_key=True,
+        ),
+        sa.Column('dag_id', StringID(), nullable=False),
+        sa.Column('warning_type', sa.String(length=50), nullable=False),
+        sa.Column('message', sa.String(1000), nullable=False),
+        sa.Column('timestamp', sa.DateTime(), nullable=False),

Review Comment:
   Do we want DateTime here or TIMESTAMP from airflow/migrations/db_types.py?. DateTime looks appropriate though, TIMESTAMP has timezone



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r897273586


##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,93 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, Integer, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    id = Column(
+        Integer, autoincrement=True, server_default='id'
+    )  # this default just signals to SQLA to defer to server

Review Comment:
   basically, if we want an integer surrogate key we have to make it actually the primary key and use a unique index for the logical key. but then to make sqlalchemy work properly, e.g. with function merge , then sqlalchemy has to think that the logical key is the primary key (even though it’s really not).  finally, the server_default value is required otherwise you’ll get a null constraint violation cus it will try to populate the record with a null value.
   
   the alternative is, (1) no surrogate key, or (2) yes surrogate key but bad sqlalchemy behavior (e.g. merge will be unusable).  in this particular case, having a surrogate key doesn’t really make a difference. BUT i think it’s a “best” practice to have a surrogate key in general (for normalization concerns), so i think we should probably figure out an acceptable approach for it somehow or another.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kaxil commented on a diff in pull request #23317: WIP - Add dag configuration error model

Posted by GitBox <gi...@apache.org>.
kaxil commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r860765512


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def update_configuration_errors(session: Session, dagbag: DagBag) -> None:

Review Comment:
   I ❤️ the idea of more helpful errors though so happy to see this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add DagWarning model, and a check for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r899529643


##########
airflow/api_connexion/endpoints/dag_warning_endpoint.py:
##########
@@ -0,0 +1,70 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import Optional
+
+from sqlalchemy import func
+from sqlalchemy.orm import Session
+
+from airflow.api_connexion import security
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import apply_sorting, check_limit, format_parameters
+from airflow.api_connexion.schemas.dag_warning_schema import (
+    DagWarningCollection,
+    dag_warning_collection_schema,
+    dag_warning_schema,
+)
+from airflow.api_connexion.types import APIResponse
+from airflow.models.dagwarning import DagWarning as DagWarningModel
+from airflow.security import permissions
+from airflow.utils.session import NEW_SESSION, provide_session
+
+
+@security.requires_access([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG_WARNING)])
+@provide_session
+def get_dag_warning(*, dag_warning_id: int, session: Session = NEW_SESSION) -> APIResponse:
+    """Get a DAG warning"""
+    error = session.query(DagWarningModel).get(dag_warning_id)
+
+    if error is None:
+        raise NotFound(
+            "Dag warning not found",
+            detail=f"The DagWarning with dag_warning_id: `{dag_warning_id}` was not found",
+        )
+    return dag_warning_schema.dump(error)
+
+
+@security.requires_access([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG_WARNING)])
+@format_parameters({'limit': check_limit})
+@provide_session
+def get_dag_warnings(
+    *,
+    limit: int,
+    offset: Optional[int] = None,
+    order_by: str = "dag_warning_id",
+    session: Session = NEW_SESSION,
+) -> APIResponse:
+    """Get all import errors"""
+    to_replace = {"dag_warning_id": 'id'}
+    allowed_filter_attrs = ['dag_warning_id', "timestamp", "warning_type", "message"]
+    total_entries = session.query(func.count(DagWarningModel.id)).scalar()

Review Comment:
   yup thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r894753175


##########
airflow/dag_processing/processor.py:
##########
@@ -559,6 +560,28 @@ def update_import_errors(session: Session, dagbag: DagBag) -> None:
 
         session.commit()
 
+    @staticmethod
+    def process_dag_warnings(session: Session, dagbag: DagBag) -> None:
+        """
+        For the DAGs in the given DagBag, record any associated configuration warnings and clear
+        warnings for files that no longer have them. These are usually displayed through the
+        Airflow UI so that users know that there are issues parsing DAGs.
+
+        :param session: session for ORM operations
+        :param dagbag: DagBag containing DAGs with configuration warnings
+        """
+        stored_config_warnings = set(session.query(DagWarning).all())

Review Comment:
   so, this is always gonna be a small amount of data.  and, in any case we already always hold all current dag warnings in memory; every warning that was emitted in the course of the dagbag load gets stored on an attr and then we just have to sync that with the db. similar is done for importerror.
   
   which part are you thinking needs to be optimized?  the "delete stale warnings" operation?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896048129


##########
tests/always/test_example_dags.py:
##########
@@ -60,7 +60,7 @@ def test_should_be_importable(example):
 
 @pytest.mark.parametrize("example", example_dags_except_db_exception(), ids=relative_path)
 def test_should_not_do_database_queries(example):
-    with assert_queries_count(0):
+    with assert_queries_count(0, ignore_patterns=['dagbag.py:validate_task_pools']):

Review Comment:
   hmm moving this to dag processor seems like generally the right direction but i think i may need to rethink the sync logic because we won't always have all warnings in memory, since the processor may just run for a specific dag file and not the entire dags folder...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ephraimbuddy commented on a diff in pull request #23317: Add dag configuration warning for missing pools

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on code in PR #23317:
URL: https://github.com/apache/airflow/pull/23317#discussion_r896432641


##########
airflow/models/dagwarning.py:
##########
@@ -0,0 +1,93 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from enum import Enum
+
+from sqlalchemy import Column, ForeignKeyConstraint, Integer, String, Text, false
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import timezone
+from airflow.utils.session import NEW_SESSION, provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class DagWarning(Base):
+    """
+    A table to store DAG warnings.
+
+    DAG warnings are problems that don't rise to the level of failing the DAG parse
+    but which users should nonetheless be warned about.  These warnings are recorded
+    when parsing DAG and displayed on the Webserver in a flash message.
+    """
+
+    id = Column(
+        Integer, autoincrement=True, server_default='id'
+    )  # this default just signals to SQLA to defer to server

Review Comment:
   Do we need the default? I think `autoincrement` will add the needed default. Also, this table has primary key on `dag_id`, `warning_type` but the migration doesn't have these columns as the primary key. Both should have the same configuration. 
   Only the ID field is a primary key on the migration file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org