Posted to commits@airflow.apache.org by "vanchaxy (via GitHub)" <gi...@apache.org> on 2023/02/07 02:26:43 UTC

[GitHub] [airflow] vanchaxy opened a new issue, #29396: BigQuery Hook list_rows method missing page_token return value

vanchaxy opened a new issue, #29396:
URL: https://github.com/apache/airflow/issues/29396

   ### Apache Airflow Provider(s)
   
   google
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-google==7.0.0
   
   But the problem exists in all newer versions.
   
   ### Apache Airflow version
   
   apache-airflow==2.3.2
   
   ### Operating System
   
   Ubuntu 20.04.4 LTS
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   The `list_rows` method in the BigQuery Hook does not return the `page_token` value, which is necessary for paginating query results. The `get_datasets_list` method has the same problem.
   
   The documentation for the `get_datasets_list` method even states that the page_token parameter can be accessed:
   ```
               :param page_token:  Token representing a cursor into the datasets. If not passed,
               the API will return the first page of datasets. The token marks the beginning of the
               iterator to be returned and the value of the ``page_token`` can be accessed at
               ``next_page_token`` of the :class:`~google.api_core.page_iterator.HTTPIterator`.
   ```
   However, the method does not return an `HTTPIterator`. Instead, it converts the `HTTPIterator` to a `list[DatasetListItem]` using `list(datasets)`, so the original iterator is discarded and its `next_page_token` can no longer be obtained.
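   
   The effect can be illustrated without any Google libraries. The sketch below is only an assumption-laden stand-in: `FakePageIterator` and `get_datasets_list_like` are hypothetical names, not the provider's code, and the real iterator updates its token as pages are fetched rather than taking it in the constructor. It shows only that once an iterator is flattened with `list(...)`, the token attribute is no longer reachable from the return value.

```python
from typing import Iterator, List, Optional


class FakePageIterator:
    """Hypothetical stand-in for an API page iterator (not the google-api-core class)."""

    def __init__(self, items: List[str], next_page_token: Optional[str]) -> None:
        self._items = items
        # Token pointing at the following page, or None when this is the last page.
        self.next_page_token = next_page_token

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)


def get_datasets_list_like() -> List[str]:
    """Mimics the pattern described in the issue: the iterator is flattened to a list."""
    datasets = FakePageIterator(["dataset_a", "dataset_b"], next_page_token="token-2")
    # The iterator object, and with it next_page_token, is dropped here.
    return list(datasets)


result = get_datasets_list_like()
# A plain list has no next_page_token attribute, so the caller cannot ask for page 2.
token_available = hasattr(result, "next_page_token")
print(result)
print(token_available)
```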
   
   ### What you think should happen instead
   
   The `list_rows` and `get_datasets_list` methods should return either an `Iterator`, or both the list of rows/datasets and the `page_token` value, so that users can retrieve multiple pages of results. For backward compatibility, we could add a parameter such as `return_iterator=True` or something similar.
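   
   One possible shape for the opt-in flag is sketched below. Everything in it is hypothetical: `PagedResult`, `list_rows_sketch`, the in-memory `PAGES` table, and the choice of `False` as the default (so that existing callers keep receiving a list) are assumptions made for illustration, not the hook's actual signature or the eventual fix.

```python
from typing import Dict, Iterator, List, Optional, Tuple, Union


class PagedResult:
    """Hypothetical iterator that keeps the token of the following page."""

    def __init__(self, items: List[str], next_page_token: Optional[str]) -> None:
        self._items = items
        self.next_page_token = next_page_token

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)


# Fake backend: maps a page token to (rows on that page, token of the next page).
PAGES: Dict[Optional[str], Tuple[List[str], Optional[str]]] = {
    None: (["row_1", "row_2"], "t2"),
    "t2": (["row_3"], None),
}


def list_rows_sketch(
    page_token: Optional[str] = None, return_iterator: bool = False
) -> Union[List[str], PagedResult]:
    """Return one page; keep the old list result unless the caller opts in."""
    items, next_token = PAGES[page_token]
    page = PagedResult(items, next_token)
    if return_iterator:
        return page  # caller can read page.next_page_token
    return list(page)  # backward-compatible behaviour: plain list, token lost


def fetch_all_rows() -> List[str]:
    """Walk every page by feeding next_page_token back into the call."""
    rows: List[str] = []
    token: Optional[str] = None
    while True:
        page = list_rows_sketch(page_token=token, return_iterator=True)
        rows.extend(page)
        token = page.next_page_token
        if token is None:
            break
    return rows


first_page = list_rows_sketch()
all_rows = fetch_all_rows()
```

   With the flag left at its assumed default, `first_page` is an ordinary list, while `fetch_all_rows` opts in and follows the tokens until none remains.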
   
   ### How to reproduce
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on issue #29396: BigQuery Hook list_rows method missing page_token return value

Posted by "josh-fell (via GitHub)" <gi...@apache.org>.
josh-fell commented on issue #29396:
URL: https://github.com/apache/airflow/issues/29396#issuecomment-1428215516

   Thanks for logging this @vanchaxy. Can you provide a reproducible example or method please?




[GitHub] [airflow] potiuk commented on issue #29396: BigQuery Hook list_rows method missing page_token return value

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on issue #29396:
URL: https://github.com/apache/airflow/issues/29396#issuecomment-1437429652

   I see the problem. `list_rows` returns a `RowIterator` converted to a list, and indeed loses the `next_page_token` property of the iterator. I am assigning this to you, @vanchaxy, since you marked that you are willing to submit a PR.




[GitHub] [airflow] potiuk closed issue #29396: BigQuery Hook list_rows method missing page_token return value

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #29396: BigQuery Hook list_rows method missing page_token return value
URL: https://github.com/apache/airflow/issues/29396

