Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/13 18:29:39 UTC

[GitHub] [airflow] edejong opened a new pull request #8273: BigQueryCheckOperator location fix

edejong opened a new pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273
 
 
   There is currently an issue with BigQueryCheckOperator when the BigQuery dataset is not in the US or EU: the DAG fails with a 404 Not Found error. This is because the location parameter is not passed to the `jobs.getQueryResults` BigQuery API endpoint.
   
   This PR fixes the issue, though I have not yet had time to add any unit tests. I could not find many for these classes as is.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [ ] Unit tests coverage for changes (not needed for documentation changes)
   - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [ ] Relevant documentation is updated including usage instructions.
   - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] turbaszek commented on a change in pull request #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#discussion_r407684951
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/bigquery.py
 ##########
 @@ -2442,10 +2442,17 @@ def next(self) -> Union[List, None]:
             if self.all_pages_loaded:
                 return None
 
-            query_results = (self.service.jobs().getQueryResults(
-                projectId=self.project_id,
-                jobId=self.job_id,
-                pageToken=self.page_token).execute(num_retries=self.num_retries))
+            if self.location:
+                query_results = (self.service.jobs().getQueryResults(
+                    projectId=self.project_id,
+                    jobId=self.job_id,
+                    location=self.location,
+                    pageToken=self.page_token).execute(num_retries=self.num_retries))
+            else:
+                query_results = (self.service.jobs().getQueryResults(
+                    projectId=self.project_id,
+                    jobId=self.job_id,
+                    pageToken=self.page_token).execute(num_retries=self.num_retries))
 
 Review comment:
   Would passing `location=None` work?


[GitHub] [airflow] boring-cyborg[bot] commented on issue #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#issuecomment-615932102
 
 
   Awesome work, congrats on your first merged pull request!
   


[GitHub] [airflow] edejong commented on a change in pull request #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
edejong commented on a change in pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#discussion_r407923819
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/bigquery.py
 ##########
 
 Review comment:
   Not sure. I noticed this highly verbose pattern in the rest of the BigQuery-related classes (just look at any other mention of `location` in `big_query_hook.py`), so I just followed its style. I will check it and update if needed.


[GitHub] [airflow] turbaszek merged pull request #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
turbaszek merged pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273
 
 
   


[GitHub] [airflow] turbaszek commented on issue #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#issuecomment-615150993
 
 
   @edejong can you please rebase? In the meantime, we've migrated from Travis to GitHub Actions.


[GitHub] [airflow] turbaszek commented on a change in pull request #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#discussion_r408708954
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/bigquery.py
 ##########
 
 Review comment:
   @edejong thanks!


[GitHub] [airflow] edejong commented on a change in pull request #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
edejong commented on a change in pull request #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#discussion_r407932899
 
 

 ##########
 File path: airflow/providers/google/cloud/hooks/bigquery.py
 ##########
 
 Review comment:
   Yep, that seems to work in this case!
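
The simplification discussed in this thread (one `getQueryResults` call instead of two near-identical branches) could be sketched roughly as below. This is only an illustration, not the code that was merged: the helper name `query_results_kwargs` is hypothetical, and it filters unset values itself rather than relying on how the API client treats `location=None`. Note that it checks `is not None`, whereas the diff above uses a truthiness check on `self.location`.

```python
from typing import Any, Dict, Optional


def query_results_kwargs(
    project_id: str,
    job_id: str,
    page_token: Optional[str] = None,
    location: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a single getQueryResults call.

    Hypothetical helper for illustration; it does not contact BigQuery.
    """
    candidates = {
        "projectId": project_id,
        "jobId": job_id,
        "location": location,
        "pageToken": page_token,
    }
    # Keep only the values that were actually provided, so `location`
    # is sent when set and omitted otherwise.
    return {name: value for name, value in candidates.items() if value is not None}


if __name__ == "__main__":
    # Example values are made up.
    print(query_results_kwargs("my-project", "job_123", location="asia-northeast1"))
```

Under these assumptions, the body of `next()` would reduce to a single `self.service.jobs().getQueryResults(**kwargs).execute(num_retries=self.num_retries)` call, with `kwargs` built from `self.project_id`, `self.job_id`, `self.page_token` and `self.location`.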


[GitHub] [airflow] boring-cyborg[bot] commented on issue #8273: BigQueryCheckOperator location fix

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8273: BigQueryCheckOperator location fix
URL: https://github.com/apache/airflow/pull/8273#issuecomment-613029465
 
 
   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using the [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://apache-airflow-slack.herokuapp.com/
   
