Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/08/04 20:10:06 UTC

[GitHub] [airflow] jperkelens opened a new issue #10160: Airflow cannot read files in Packaged DAGs folder.

jperkelens opened a new issue #10160:
URL: https://github.com/apache/airflow/issues/10160


   
   **Apache Airflow version**: 1.10.10
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): N/A
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:  ECS on AWS
   - **OS** (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
   - **Kernel** (e.g. `uname -a`):  Linux 025b01778cfa 4.19.76-linuxkit #1 SMP Fri Apr 3 15:53:26 UTC 2020 x86_64 GNU/Linux
   - **Install tools**:
   - **Others**: Python 3.7.8
   
   **What happened**:
   When referring to non-Python files from within a packaged DAG zip, Airflow attempts to access the file in the zip as if the zip file were a regular OS directory, and consequently produces a `Broken DAG: [/usr/local/airflow/dags/dags.zip] [Errno 20] Not a directory: '/usr/local/airflow/dags/dags.zip/test_dag/scripts/query.sql'` error.
   
   **What you expected to happen**:
   I expect any file access that works in an unzipped DAGs directory to work in a packaged file.
   
   This seems related to [AIRFLOW-6853](https://issues.apache.org/jira/browse/AIRFLOW-6853) and [this Stack Overflow question](https://stackoverflow.com/questions/50952568/airflow-composer-template-not-found-in-zip-packaged-dag). The cause seems to be that `__file__` or `path` operations return values similar to `dags/dags.zip/package1/file.sql`, which does not exist as a path on the file system.
   
   Ideally, Airflow would be able to understand how to deal with this by making it transparent to the user or providing a utility to load these files. At the very least this limitation should be documented, as there is [no indication](https://airflow.apache.org/docs/stable/concepts.html#packaged-dags) that this is not possible.
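
The failure mode described above can be reproduced outside Airflow. The following is a minimal sketch, not Airflow code: it builds a throwaway archive (the helper names and the archive layout are hypothetical, chosen to mirror the path in the error message), then compares a plain `open()` on a path that runs through the zip with a read through the standard `zipfile` module.

```python
import os
import tempfile
import zipfile


def build_sample_zip(directory):
    """Create a small archive laid out like a packaged DAG (hypothetical layout)."""
    zip_path = os.path.join(directory, "dags.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("test_dag/scripts/query.sql", "SELECT 1;")
    return zip_path


def read_with_plain_open(path):
    """Treat the archive as a directory, as a naive path join would."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError as error:
        # On Linux this is typically NotADirectoryError (Errno 20).
        return "failed: " + type(error).__name__


def read_with_zipfile(zip_path, member):
    """Read the member through the zip archive instead of the file system."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member).decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    zip_path = build_sample_zip(tmp)
    member = "test_dag/scripts/query.sql"
    plain_result = read_with_plain_open(os.path.join(zip_path, member))
    zip_result = read_with_zipfile(zip_path, member)

print(plain_result)
print(zip_result)
```

The plain `open()` fails because the operating system sees `dags.zip` as a regular file, not a directory, while the archive-aware read succeeds.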
   
   **How to reproduce it**:
   
   Attached is a zip file that exhibits the error when used as a packaged DAG file, but loads correctly when unzipped. The unzipped DAG will return a `psycop` error because the accounts table is not set up. A packaged DAG file will return `jinja2.exceptions.TemplateNotFound: scripts/test_sql.sql`.
   [dags.zip](https://github.com/apache/airflow/files/5024273/dags.zip)
   
    
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-864632921


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] muscovitebob commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
muscovitebob commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-669033914


   I also encounter this a lot when using packaged DAGs. The workaround I use is [this function](https://github.com/apache/airflow/blob/4e3799fec4c23d0f43603a0489c5a6158aeba035/airflow/utils/file.py#L78): I first load the file contents into a string and then pass the string to the relevant operator (it seems most operators that support reading a file also support taking a string instead). The function is quite portable, so you can pluck it from the source code if the version you are running does not yet contain it. It would indeed be nice to be able to access extra files in packaged DAGs without code modifications, however.
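
As a rough illustration of that workaround, here is a minimal stand-in written against the standard library only. It is not the Airflow function linked above; `read_text_maybe_zipped`, the regular expression, and the sample file names are assumptions made for this sketch. The idea is to split the path at the first `.zip` component and read the remainder as an archive member, falling back to a normal file read otherwise.

```python
import os
import re
import tempfile
import zipfile

# Hypothetical pattern: everything up to the first ".zip", then the member path.
_ZIP_MEMBER = re.compile(r"^(.*?\.zip)[/\\](.+)$")


def read_text_maybe_zipped(path):
    """Return the text at `path`, whether or not it points inside a zip archive."""
    match = _ZIP_MEMBER.match(path)
    if match and zipfile.is_zipfile(match.group(1)):
        archive_path, member = match.groups()
        # Zip member names always use forward slashes.
        member = member.replace("\\", "/")
        with zipfile.ZipFile(archive_path) as archive:
            return archive.read(member).decode("utf-8")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    # One SQL file inside an archive, one directly on disk.
    zip_path = os.path.join(tmp, "dags.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("scripts/test_sql.sql", "SELECT * FROM accounts;")
    plain_path = os.path.join(tmp, "plain.sql")
    with open(plain_path, "w", encoding="utf-8") as handle:
        handle.write("SELECT 2;")

    from_zip = read_text_maybe_zipped(os.path.join(zip_path, "scripts", "test_sql.sql"))
    from_disk = read_text_maybe_zipped(plain_path)

print(from_zip)
print(from_disk)
```

The resulting string can then be handed to an operator argument that accepts inline content rather than a file path, which is the pattern described in this comment.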





[GitHub] [airflow] eladkal commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-845947802


   Does the issue still happen in Airflow 2?





[GitHub] [airflow] github-actions[bot] commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-869246016


   This issue has been closed because it has not received a response from the issue author.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-668800687


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] jperkelens commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
jperkelens commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-672009979


   As a follow-up, I was able to resolve several errors by modifying some custom Operators to use `open_maybe_zipped` instead of using file paths directly. However, we also have a configuration system, used by several of our DAGs, that requires loading YAML files. We use this configuration system in contexts other than Airflow, and baking this utility into it in order to accommodate this scenario feels both complicated and like the wrong place to resolve the issue.
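
One Airflow-independent option for this kind of shared configuration code is the standard library's `pkgutil.get_data`, which reads package data through the import system and therefore also works for packages imported from a zip. The sketch below is only an illustration under stated assumptions: the config files ship inside an importable package in the archive, the archive is on `sys.path`, and the names `demo_config_pkg`, `settings.yaml`, and `load_config_text` are hypothetical. Parsing the returned text with a YAML library is left out.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway archive containing a package with a data file (hypothetical names).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "dags.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("demo_config_pkg/__init__.py", "")
    archive.writestr("demo_config_pkg/settings.yaml", "retries: 3\n")

# Make the archive importable; this sketch assumes the zip is on sys.path.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()


def load_config_text(package, resource):
    """Read a data file that ships inside `package`, zipped or not."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise FileNotFoundError(resource)
    return data.decode("utf-8")


config_text = load_config_text("demo_config_pkg", "settings.yaml")
print(config_text)
```

Because the lookup goes through the package loader rather than a file-system path, the same function can be called from code that runs unzipped outside Airflow.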





[GitHub] [airflow] jperkelens commented on issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
jperkelens commented on issue #10160:
URL: https://github.com/apache/airflow/issues/10160#issuecomment-669235156


   @muscovitebob this function does seem quite useful and I'll look into it. I'm still a little apprehensive about switching to packaged DAGs, since we run a multi-tenant Airflow instance and coordinating this change across all teams is going to be quite a lift. Combined with the fact that these errors (when they surface as Broken DAGs) are not reported exhaustively, we could have a broken DAG for quite some time without noticing it.





[GitHub] [airflow] github-actions[bot] closed issue #10160: Airflow cannot read files in Packaged DAGs folder.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #10160:
URL: https://github.com/apache/airflow/issues/10160


   

