You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/10/22 21:42:24 UTC

[GitHub] [airflow] mik-laj opened a new pull request, #27204: Fix system tests for SnowflakeOperator

mik-laj opened a new pull request, #27204:
URL: https://github.com/apache/airflow/pull/27204

   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   I would like to run system tests for Snowflake on CI to make sure there is no regression, but unfortunately running these tests on Github Action was never successful.
   I took a look and ran into a few problems:
   * Missing `schedule="@once"` which means that the DAG were run every day in the range from `start_date` to `current_date`. I think `catchup = False` should fix that, but it doesn't. DAG Runes were still being created, which in effect meant that the DAGs never completed the expected time - <60 seconds.
   * I moved all the other operators to other files because I would like to focus only on `SnowflakeOperator` first. But as a next step, I would also like to fix the tests for `s3_to_snowflake` and `snowflake_to_slack`, but this is an extra effort.
   * I created a missing SQL file that was used by system tests. Unfortunately, this file cannot contain the license header, because Snowflake interprets the license header as a new SQL query, and we can only send one SQL query per request.  This is something that might be worth fixing, but for now, I'd like to run system tests without making any changes to the operator.
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287931077

   > I created a missing SQL file that was used by system tests. Unfortunately, this file cannot contain the license header, because Snowflake interprets the license header as a new SQL query, and we can only send one SQL query per request. This is something that might be worth fixing, but for now, I'd like to run system tests without making any changes to the operator.
   
   @mik-laj you could consider doing something like
   
   ```python
   '\n'.join(x for x in Path(...).read_text().splitlines() if not x.startswith('#'))
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mik-laj commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287934410

   No, because this example shows the use of the template file feature i.e. user can pass file path instead of SQL content.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287945110

   Well doesn't the license have to be there?  perhaps @potiuk can comment... i assume there is a reason that there is no other single file excluded.
   
   it's weird though... i don't remember comments being a problem with snowflake.  it's not syntax-related e.g. `//` vs `--` is it?
   
   if the missing license is an issue i suppose we could just remove the example... let's see what @potiuk says i guess


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mik-laj commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287961484

   > WHAT FILES IN AN APACHE RELEASE DO NOT REQUIRE A LICENSE HEADER?
   > A file without any degree of creativity in either its literal elements or its structure is not protected by copyright law; therefore, such a file does not require a license header. If in doubt about the extent of the file's creativity, add the license header to the file.
   >
   > It may make sense for some other files to have no license header. Three examples are:
   >
   > Short informational text files; for example README, INSTALL files. The expectation is that these files make it obvious which product they relate to.
   > Test data for which the addition of a source header would cause the tests to fail.
   'Snippet' files that are included in a larger file, when the larger file would have duplicate licensing headers.
   > PMCs should use their judgement, err on having a source header and contact legal-discuss@ if unsure.
   
   https://www.apache.org/legal/src-headers.html
   
   This file has a low level of creativity because it is copied from the official  Snowflake documentation, and is a test data for which the addition of a source header would cause the tests to fail. According to Apache's FAQ, we don't need to have a license header in this file.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mik-laj commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287962281

   This may be related to the comment style, but we have the same style in all SQL files, so I preferred not to change that for now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mik-laj merged pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
mik-laj merged PR #27204:
URL: https://github.com/apache/airflow/pull/27204


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287973237

   ok cool 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] mik-laj commented on pull request #27204: Fix system tests for SnowflakeOperator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on PR #27204:
URL: https://github.com/apache/airflow/pull/27204#issuecomment-1287961946

   We have a license header in every file because we have opt-out policy.  Each file has a license header by default, but if necessary, we add exclusions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org