Posted to commits@airflow.apache.org by "JDarDagran (via GitHub)" <gi...@apache.org> on 2023/11/09 13:41:01 UTC
[I] openlineage: improve how sql utils parse table schemas [airflow]
JDarDagran opened a new issue, #35552:
URL: https://github.com/apache/airflow/issues/35552
### Apache Airflow version
main (development)
### What happened
SQL-based operators rely on the `airflow.providers.openlineage.utils.sql` module, which is used by the `SQLParser` interface class.
In short, it retrieves table schemas for the input and output datasets parsed from a SQL query.
### What you think should happen instead
It should check whether the database/schema configured on the connection appears in the information schema query result. If a match is found, it should stop adding other tables.
### How to reproduce
The corner case is as follows:
1. Use a database connection with a default database and/or schema set.
2. Refer to the table by name only in the SQL query (e.g. `SELECT * FROM my_table` instead of `SELECT * FROM my_schema.my_table`).
3. If a table with the same name exists in another database/schema (or database+schema combination, depending on the database), the OL integration will produce two datasets for that table name.
For instance, if one uses Postgres with the search path set to the `public` schema, `SELECT * FROM my_table` reads from `public.my_table` even if another table with the same name exists in a different schema. The OL integration will report both `my_schema.my_table` and `public.my_table`.
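To make the corner case concrete, here is a minimal sketch of the behavior described above and of one possible way to narrow the result. It does not use the provider's actual code: `TableRef`, `SchemaRow`, and `resolve_tables` are hypothetical names, and the row shape only approximates an information schema query result.

```python
from typing import NamedTuple, Optional


class TableRef(NamedTuple):
    """A table as referenced in a SQL query (hypothetical type)."""
    schema: Optional[str]  # None when the query uses the bare table name
    name: str


class SchemaRow(NamedTuple):
    """One row of an information schema query result (assumed shape)."""
    schema: str
    table: str
    column: str


def resolve_tables(refs, rows, default_schema=None):
    """Return the (schema, table) pairs that the references resolve to.

    Without a default schema, an unqualified name matches every schema
    that contains a table with that name. With a default schema, a match
    in that schema wins and the other candidates are dropped.
    """
    resolved = set()
    for ref in refs:
        candidates = {(r.schema, r.table) for r in rows if r.table == ref.name}
        if ref.schema is not None:
            # Explicitly qualified reference: keep only that schema.
            candidates = {c for c in candidates if c[0] == ref.schema}
        elif default_schema is not None and (default_schema, ref.name) in candidates:
            # Unqualified reference found in the connection's default
            # schema: stop adding tables from other schemas.
            candidates = {(default_schema, ref.name)}
        resolved |= candidates
    return sorted(resolved)


rows = [
    SchemaRow("public", "my_table", "id"),
    SchemaRow("my_schema", "my_table", "id"),
]
refs = [TableRef(None, "my_table")]  # SELECT * FROM my_table

duplicated = resolve_tables(refs, rows)
narrowed = resolve_tables(refs, rows, default_schema="public")
print(duplicated)
print(narrowed)
```

The first call reproduces the reported duplication (both schemas are returned), while the second returns only the table in the assumed default schema `public`.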
### Operating System
macOS
### Versions of Apache Airflow Providers
apache-airflow-providers-openlineage==1.2.0
### Deployment
Other Docker-based deployment
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] openlineage: improve how sql utils parse table schemas [airflow]
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #35552:
URL: https://github.com/apache/airflow/issues/35552#issuecomment-1803946585
Good ideas!
Re: [I] openlineage: improve how sql utils parse table schemas [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #35552:
URL: https://github.com/apache/airflow/issues/35552#issuecomment-1803853595
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.