Posted to dev@zeppelin.apache.org by "Marcus Truscello (Jira)" <ji...@apache.org> on 2022/07/01 01:06:00 UTC
[jira] [Created] (ZEPPELIN-5758) BigQuery hits socket timeout before reaching "wait_time" setting
Marcus Truscello created ZEPPELIN-5758:
------------------------------------------
Summary: BigQuery hits socket timeout before reaching "wait_time" setting
Key: ZEPPELIN-5758
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5758
Project: Zeppelin
Issue Type: Bug
Components: interpreter-setting, Interpreters, zeppelin-interpreter
Affects Versions: 0.10.1
Reporter: Marcus Truscello
Attachments: bigquery-timeout.patch, stacktrace.log
The {{zeppelin.bigquery.wait_time}} BigQuery interpreter parameter is only useful up to a value of 30 seconds. Anything beyond that exceeds the underlying HTTP client's default read timeout and will result in a {{java.net.SocketTimeoutException: Read timed out}} exception being thrown. (A full stack trace is attached.)
Google's Java API guide suggests overriding the {{HttpRequestInitializer}} to set the desired connect and read timeouts: [https://developers.google.com/api-client-library/java/google-api-java-client/errors#timeouts]
This exact approach isn't feasible because the BigQuery interpreter's {{createAuthorizedClient}} method is static. Instead, we can modify the solution to use an approach similar to this StackOverflow answer which uses the builder's {{setHttpRequestInitializer}}: [https://stackoverflow.com/a/32894630]
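A minimal sketch of the wrapping idea behind that approach, not the attached patch itself. To stay compilable without the Google client libraries, {{Request}} and {{RequestInitializer}} below are hypothetical stand-ins for the library's {{HttpRequest}} and {{HttpRequestInitializer}}; the helper name {{withTimeouts}} and the timeout values are likewise assumptions made for illustration.

```java
import java.io.IOException;

public class TimeoutInitializerSketch {

    // Hypothetical stand-in for the library's HttpRequest, reduced to the
    // two timeout setters this sketch needs.
    static class Request {
        int connectTimeoutMs;
        int readTimeoutMs;

        Request setConnectTimeout(int ms) { connectTimeoutMs = ms; return this; }
        Request setReadTimeout(int ms) { readTimeoutMs = ms; return this; }
    }

    // Hypothetical stand-in for the library's HttpRequestInitializer.
    interface RequestInitializer {
        void initialize(Request request) throws IOException;
    }

    // Wraps an existing initializer (for example the credential) so that
    // every request it prepares also gets explicit timeouts.
    static RequestInitializer withTimeouts(RequestInitializer delegate,
                                           int connectTimeoutMs,
                                           int readTimeoutMs) {
        return request -> {
            delegate.initialize(request);            // keep the original setup
            request.setConnectTimeout(connectTimeoutMs);
            request.setReadTimeout(readTimeoutMs);   // applied after the delegate
        };
    }

    public static void main(String[] args) throws IOException {
        RequestInitializer credential = request -> { };  // placeholder, no real auth
        Request request = new Request();
        withTimeouts(credential, 30_000, 210_000).initialize(request);
        System.out.println("read timeout (ms): " + request.readTimeoutMs);
    }
}
```

With the real library, the wrapped initializer would be the object handed to the builder's {{setHttpRequestInitializer}}, which works from a static method because nothing needs to be subclassed.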
It should be noted that setting the read timeout too large likely won't provide any value. Whatever {{timeoutMs}} is set to, BigQuery will always return a response within ~200 seconds, whether or not the job has actually completed:
[https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults#query-parameters]
Given that the BigQuery interpreter doesn't handle {{jobComplete}} being false, there's no reason to set the read timeout much larger than 200 seconds.
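One way to turn that reasoning into a number is to derive the read timeout from the configured wait time and cap it near the ~200 second server-side limit. The sketch below is an illustration only: it assumes {{zeppelin.bigquery.wait_time}} is expressed in milliseconds, and the class name, method name and 10 second margin are hypothetical choices, not taken from the attached patch.

```java
public class ReadTimeoutSketch {

    // Approximate upper bound on how long BigQuery holds a getQueryResults
    // call open, per the documentation linked above (~200 seconds).
    static final long SERVER_MAX_WAIT_MS = 200_000L;

    // Hypothetical slack so the client does not give up just before the
    // server answers.
    static final long MARGIN_MS = 10_000L;

    // Returns a read timeout that covers the requested wait time but never
    // grows much beyond the server-side limit.
    static int readTimeoutMs(long waitTimeMs) {
        if (waitTimeMs < 0) {
            throw new IllegalArgumentException("wait time must be >= 0");
        }
        long capped = Math.min(waitTimeMs, SERVER_MAX_WAIT_MS);
        return (int) (capped + MARGIN_MS);
    }

    public static void main(String[] args) {
        System.out.println(readTimeoutMs(5_000));    // short wait: wait time plus margin
        System.out.println(readTimeoutMs(600_000));  // long wait: capped at limit plus margin
    }
}
```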
I've attached a diff of the changes I applied to fix this issue. It should be noted that I am not a Java developer, so I apologize if the solution is a bit crude. :D
--
This message was sent by Atlassian Jira
(v8.20.10#820010)