You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/08 17:35:22 UTC

[GitHub] [airflow] david30907d opened a new pull request #19571: :pencil: (providers_google) add a location check

david30907d opened a new pull request #19571:
URL: https://github.com/apache/airflow/pull/19571


   if you didn't, you'll get this `returned "Not found: Job xxx"` exception
   related: <https://github.com/apache/airflow/issues/19570>
   
   Although BigQueryHook says `location` is optional, turns out it's required under some scenario
   [code](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/hooks/bigquery.py#L99)
   
   when we invoke `get_records()` ([here](https://github.com/apache/airflow/blob/main/airflow/hooks/dbapi.py#L155)),  it would end up go to this `run_query()` to get the `job id` ([here](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/hooks/bigquery.py#L2202)).
   
   another alternative is raise a `missing location exception` in `run_query()`
   
   thoughts on this?
   ```bash
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
       result = task_copy.execute(context=context)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 150, in execute
       return_value = self.execute_callable()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 161, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "/opt/airflow/dags/dags/utils/others/subscription_related.py", line 112, in wrapper
       return func(*args, **kwargs)
     File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 75, in wrapper
       return func(*args, **kwargs)
     File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 1019, in add_abandoned
       abandoned_checkouts_of_this_page = _parse_this_page(response_json)
     File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 980, in _parse_this_page
       persons_queried_by_checkout_id = db_hook.get_records(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/hooks/dbapi.py", line 135, in get_records
       return cur.fetchall()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2886, in fetchall
       one = self.fetchone()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2811, in fetchone
       return self.next()
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2827, in next
       self.service.jobs()
     File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
       return wrapped(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
       raise HttpError(resp, content, uri=self.uri)
   googleapiclient.errors.HttpError: <HttpError 404 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/xxxxxx/queries/airflow_xxxxx?alt=json returned "Not found: Job xxxxx:xxxxx
   ```
   
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1003470769


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #19571:
URL: https://github.com/apache/airflow/pull/19571


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038176131


   @eladkal sorry for the late reply, I finally figure out how to rebase šŸ˜… (i'm git noob)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d edited a comment on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d edited a comment on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1004384748


   > @david30907d are you still willing to work on this one?
   
   Hi @turbaszek , probably not šŸ˜… 
   I'm afraid that I've to focus on this [one](https://github.com/apache/airflow/pull/19508#pullrequestreview-842396539) first. Sorry about that
   
   btw, if you get a chance, would you check this problem please? šŸ™ 
   https://github.com/apache/airflow/pull/19508#issuecomment-1003644956


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1030926945


   @david30907d can you resolve conflicts and rebase?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1005484043


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038158578






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on a change in pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on a change in pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#discussion_r749869962



##########
File path: airflow/providers/google/cloud/hooks/bigquery.py
##########
@@ -2630,6 +2630,11 @@ def run_query(self, *args, **kwargs) -> str:
             stacklevel=3,
         )
         return self.hook.run_query(*args, **kwargs)
+    
+    def get_records(self, sql, parameters=None):
+        if self.location is None:
+            raise Exception("Need to specify location when instantiating BigQueryHook, otherwise it would result in Job Not Found error!")

Review comment:
       Sure, will do. Thx for the inputs šŸŽ‰




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#discussion_r749869076



##########
File path: airflow/providers/google/cloud/hooks/bigquery.py
##########
@@ -2630,6 +2630,11 @@ def run_query(self, *args, **kwargs) -> str:
             stacklevel=3,
         )
         return self.hook.run_query(*args, **kwargs)
+    
+    def get_records(self, sql, parameters=None):
+        if self.location is None:
+            raise Exception("Need to specify location when instantiating BigQueryHook, otherwise it would result in Job Not Found error!")

Review comment:
       OK this sounds like a good reason.
   
   Could you change this to raise `AirflowException` (or a subclass `BigQueryLocationUnset`) instead? This pattern is more common in Airflow than raising a barebone `Exception`.
   
   Iā€™d do
   
   ```python
   raise BigQueryLocationUnset("Parameter 'location' is required to fetch records with BigQuery hook")
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1004384748


   > @david30907d are you still willing to work on this one?
   
   Hi @turbaszek , probably not šŸ˜… 
   I'm afraid that I've to focus on this [one](https://github.com/apache/airflow/pull/19508#pullrequestreview-842396539) first. Sorry about that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-969765661


   > Could you please add a unit test for that ?
   
   sure~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #19571:
URL: https://github.com/apache/airflow/pull/19571


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #19571:
URL: https://github.com/apache/airflow/pull/19571


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
turbaszek commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1004322976


   @david30907d are you still willing to work on this one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on a change in pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on a change in pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#discussion_r749867751



##########
File path: airflow/providers/google/cloud/hooks/bigquery.py
##########
@@ -2630,6 +2630,11 @@ def run_query(self, *args, **kwargs) -> str:
             stacklevel=3,
         )
         return self.hook.run_query(*args, **kwargs)
+    
+    def get_records(self, sql, parameters=None):
+        if self.location is None:
+            raise Exception("Need to specify location when instantiating BigQueryHook, otherwise it would result in Job Not Found error!")

Review comment:
       yea `location=None` works for other function and seems to me that it only fails when invoking `get_records()` ...
   btw I might need to postpone this PR, will try to finish it by the end of Nov
   will ping @uranusjr for help šŸ˜‚ 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-968361496


   Could you please add a unit test for that ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] david30907d commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
david30907d commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038158578


   > @david30907d can you resolve conflicts and rebase so we can merge this PR?
   
   oh so sorry I missed this thread, on it!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038351148


   > @eladkal sorry for the late reply, I finally figure out how to rebase sweat_smile (i'm git noob)
   
   As of last week you can EASILY do rebase with the GitHub UI:
   
   ![image](https://user-images.githubusercontent.com/595491/153770441-1183f22c-9703-4182-b15f-49e3dbcd8d50.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#discussion_r749012799



##########
File path: airflow/providers/google/cloud/hooks/bigquery.py
##########
@@ -2630,6 +2630,11 @@ def run_query(self, *args, **kwargs) -> str:
             stacklevel=3,
         )
         return self.hook.run_query(*args, **kwargs)
+    
+    def get_records(self, sql, parameters=None):
+        if self.location is None:
+            raise Exception("Need to specify location when instantiating BigQueryHook, otherwise it would result in Job Not Found error!")

Review comment:
       Instead of failing on runtime, can we instead detect this in `__init__` and fail the entire DAG when `location` is invalid? Or is `location=None` a valid value for certain use cases?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038351148


   > @eladkal sorry for the late reply, I finally figure out how to rebase sweat_smile (i'm git noob)
   
   As of last week you can EASILY do rebase with the UI:
   
   ![image](https://user-images.githubusercontent.com/595491/153770441-1183f22c-9703-4182-b15f-49e3dbcd8d50.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038351148


   > @eladkal sorry for the late reply, I finally figure out how to rebase sweat_smile (i'm git noob)
   
   As of last week you can EASILY do rebase with the GitHub UI:
   
   ![image](https://user-images.githubusercontent.com/595491/153770441-1183f22c-9703-4182-b15f-49e3dbcd8d50.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038138295


   @david30907d can you resolve conflicts and rebase so we can merge this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038138295


   @david30907d can you resolve conflicts and rebase so we can merge this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19571: :pencil: (providers_google) add a location check

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19571:
URL: https://github.com/apache/airflow/pull/19571#issuecomment-1038351148


   > @eladkal sorry for the late reply, I finally figure out how to rebase sweat_smile (i'm git noob)
   
   As of last week you can EASILY do rebase with the UI:
   
   ![image](https://user-images.githubusercontent.com/595491/153770441-1183f22c-9703-4182-b15f-49e3dbcd8d50.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org