Posted to common-issues@hadoop.apache.org by "Chen He (JIRA)" <ji...@apache.org> on 2016/06/01 04:56:13 UTC

[jira] [Commented] (HADOOP-13211) Swift driver should have a configurable retry feature when encountering a 5xx error

    [ https://issues.apache.org/jira/browse/HADOOP-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15309263#comment-15309263 ] 

Chen He commented on HADOOP-13211:
----------------------------------

Thank you for the reply, [~stevel@apache.org]. 

IMHO, the hadoop-openstack driver is a bridge between HDFS and the OpenStack object store. MR and other native Hadoop frameworks should be able to use the Hadoop IPC retry mechanism. With the increasing popularity of HDFS, other computing frameworks such as Spark, and in-memory storage systems such as Tachyon, are also using the hadoop-openstack driver. I am not sure whether the Hadoop IPC retry will be triggered when Spark or other frameworks use the hadoop-openstack driver. 

Those frameworks do retry at the task level; however, retrying a whole task can be much more costly than retrying at the driver level. 

Regarding data loss, that is a really good catch. If the server keeps failing and returning 5xx, the upload will eventually fail. The object store is not a file system and may not guarantee file-system-level integrity. I can't come up with a scenario where a retry itself causes data loss. Could you provide a suggestion? 
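
To make the idea concrete, below is a minimal sketch of what a configurable, driver-level retry could look like. SwiftServerException, withRetry, and the retry-count/sleep parameters are hypothetical illustrations, not the actual hadoop-openstack API; the real patch would plug into the driver's HTTP execution path and read these values from the Hadoop Configuration.

{code:java}
import java.util.concurrent.Callable;

/**
 * Minimal sketch of a driver-level retry for transient 5xx responses.
 * The exception type and helper name here are illustrative only.
 */
public class RetryOn5xxSketch {

  /** Hypothetical exception carrying the HTTP status of a failed Swift call. */
  static class SwiftServerException extends RuntimeException {
    final int status;
    SwiftServerException(int status, String msg) {
      super(msg);
      this.status = status;
    }
  }

  /**
   * Run the given operation, retrying only on 5xx responses,
   * up to maxRetries times with a fixed sleep between attempts.
   */
  static <T> T withRetry(Callable<T> op, int maxRetries, long sleepMs)
      throws Exception {
    SwiftServerException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.call();
      } catch (SwiftServerException e) {
        if (e.status < 500 || e.status > 599) {
          throw e;               // only 5xx is treated as transient
        }
        last = e;                // record the exception and retry
        Thread.sleep(sleepMs);
      }
    }
    throw last;                  // retries exhausted: report the failure
  }
}
{code}

For example, the upload path could wrap each request in withRetry(...) so that an intermittent 503 from a busy proxy does not fail the whole job, while non-5xx errors still fail fast.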

> Swift driver should have a configurable retry feature when encountering a 5xx error
> ------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13211
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13211
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/swift
>    Affects Versions: 2.7.2
>            Reporter: Chen He
>            Assignee: Chen He
>
> In the current code, if the Swift driver meets an HTTP 5xx, it throws an exception and stops. As a driver, it would be more robust if it could retry a configurable number of times before reporting failure. There are two reasons I can imagine:
> 1. If the server is really busy, it may drop some requests to protect itself against a DDoS attack.
> 2. If the server is accidentally unavailable for a short period of time and then comes back, we may not need to fail the whole driver. Recording the exception and retrying may be more flexible. 


