Posted to hdfs-dev@hadoop.apache.org by "Mark Mc Keown (Jira)" <ji...@apache.org> on 2022/10/26 15:11:00 UTC
[jira] [Created] (HDFS-16825) hadoop-azure flush timing out and triggering retry
Mark Mc Keown created HDFS-16825:
------------------------------------
Summary: hadoop-azure flush timing out and triggering retry
Key: HDFS-16825
URL: https://issues.apache.org/jira/browse/HDFS-16825
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Mark Mc Keown
In AbfsHttpOperation, the code that creates the HTTP connection to Azure is:
{code}
public AbfsHttpOperation(final URL url, final String method, final List<AbfsHttpHeader> requestHeaders)
    throws IOException {
  this.isTraceEnabled = LOG.isTraceEnabled();
  this.url = url;
  this.method = method;
  this.clientRequestId = UUID.randomUUID().toString();

  this.connection = openConnection();
  if (this.connection instanceof HttpsURLConnection) {
    HttpsURLConnection secureConn = (HttpsURLConnection) this.connection;
    SSLSocketFactory sslSocketFactory = SSLSocketFactoryEx.getDefaultFactory();
    if (sslSocketFactory != null) {
      secureConn.setSSLSocketFactory(sslSocketFactory);
    }
  }

  this.connection.setConnectTimeout(CONNECT_TIMEOUT);
  this.connection.setReadTimeout(READ_TIMEOUT);

  this.connection.setRequestMethod(method);

  for (AbfsHttpHeader header : requestHeaders) {
    this.connection.setRequestProperty(header.getName(), header.getValue());
  }

  this.connection.setRequestProperty(HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID, clientRequestId);
}
{code}
READ_TIMEOUT is hard-coded to 30 seconds. When a file is uploaded to Azure and closed, the close triggers a flush operation. Azure sometimes takes longer than 30 seconds to respond, and this triggers a retry within the hadoop-azure library.
(This can cause issues with DataBricks Autoloader, which monitors EventGrid for triggers to ingest data: multiple flush/close events can confuse it. This is an Autoloader bug, as retries can happen normally.)
Can the READ_TIMEOUT be increased or made configurable?
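To illustrate what "configurable" could look like, here is a minimal standalone sketch and not the hadoop-azure implementation. The class name ConfigurableReadTimeoutSketch, the property key fs.azure.http.read.timeout.ms, and the use of java.util.Properties in place of a Hadoop Configuration are all assumptions made for illustration. The only value taken from this report is the 30 second default.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Properties;

public class ConfigurableReadTimeoutSketch {
  // Hypothetical property key, chosen for this sketch only.
  static final String READ_TIMEOUT_KEY = "fs.azure.http.read.timeout.ms";
  // Matches the hard-coded 30 second value described in this report.
  static final int DEFAULT_READ_TIMEOUT_MS = 30_000;

  /** Returns the configured read timeout, or the default if unset or invalid. */
  static int resolveReadTimeoutMs(Properties conf) {
    String raw = conf.getProperty(READ_TIMEOUT_KEY);
    if (raw == null) {
      return DEFAULT_READ_TIMEOUT_MS;
    }
    try {
      int value = Integer.parseInt(raw.trim());
      // setReadTimeout rejects negative values; fall back to the default.
      return value >= 0 ? value : DEFAULT_READ_TIMEOUT_MS;
    } catch (NumberFormatException e) {
      return DEFAULT_READ_TIMEOUT_MS;
    }
  }

  /** Creates (but does not connect) a connection with the resolved timeout. */
  static HttpURLConnection open(URL url, Properties conf) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setReadTimeout(resolveReadTimeoutMs(conf));
    return connection;
  }

  public static void main(String[] args) throws IOException {
    Properties conf = new Properties();
    conf.setProperty(READ_TIMEOUT_KEY, "90000");
    HttpURLConnection c = open(new URL("https://example.invalid/"), conf);
    System.out.println("read timeout ms = " + c.getReadTimeout());
  }
}
```

Falling back to the default on a missing or malformed value keeps the current behaviour for anyone who does not set the property.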