Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/03/22 13:08:00 UTC

[jira] [Updated] (HADOOP-17597) Add option to downgrade S3A rejection of Syncable to warning

     [ https://issues.apache.org/jira/browse/HADOOP-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-17597:
------------------------------------
    Summary: Add option to downgrade S3A rejection of Syncable to warning  (was: Add option to downgrade S3A rejection of Syncable to warning + iostatistics)

> Add option to downgrade S3A rejection of Syncable to warning
> ------------------------------------------------------------
>
>                 Key: HADOOP-17597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17597
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>
> The Hadoop Filesystem Syncable API is intended to meet the requirements laid out in [Stonebraker81] _Operating System Support for Database Management_.
> bq.  The service required from an OS buffer manager is a selected force out which would push the intentions list and the commit flag to disk in the proper order. Such a service is not present in any buffer manager known to us.
> It's an expensive operation, so expensive that {{Syncable.hsync()}} isn't even called on {{DFSOutputStream.close()}}.
> Even though S3A does not manifest any data until close() is called, applications coming from HDFS may call Syncable methods and expect them to persist data with the durability guarantees offered by HDFS.
> Since the output stream hardening of HADOOP-13327, S3A throws UnsupportedOperationException to indicate that the synchronization semantics of Syncable absolutely cannot be met. 
> As a result, applications which have been calling the Syncable APIs are finding the call failing. In the absence of exception handling to recognise that the durability semantics are not being met, they fail.
> If the user and the application actually expect data to be persisted, this is the correct behaviour. The data cannot be persisted this way.
> If, however, they were calling this on HDFS more as a {{flush()}} than the full and expensive DBMS-class persistence call, then this failure is unwelcome. The application really needs to catch the UnsupportedOperationException raised by S3A _or any other FS strictly reporting failures_, report the problem and use some other means of safe data storage.
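
As an illustration of the exception handling described above, here is a minimal sketch. It does not use Hadoop itself: the nested Syncable interface is a hypothetical stand-in for org.apache.hadoop.fs.Syncable so that the example compiles on its own, and the names SyncFallbackSketch, SyncResult and trySync are invented for this sketch.

```java
import java.io.IOException;

class SyncFallbackSketch {

    /** Hypothetical stand-in for org.apache.hadoop.fs.Syncable (assumption, not the real class). */
    interface Syncable {
        void hsync() throws IOException;
    }

    /** Outcome of a sync attempt, so the caller can pick a fallback. */
    enum SyncResult { PERSISTED, UNSUPPORTED }

    /** Call hsync() and translate a strict rejection into a result the caller can act on. */
    static SyncResult trySync(Syncable stream) throws IOException {
        try {
            stream.hsync();
            return SyncResult.PERSISTED;
        } catch (UnsupportedOperationException e) {
            // The store has stated that it cannot meet the durability guarantee.
            // Report the problem; the caller must then store the data safely some other way.
            System.err.println("hsync() is not supported by this stream: " + e.getMessage());
            return SyncResult.UNSUPPORTED;
        }
    }

    public static void main(String[] args) throws IOException {
        Syncable durable = () -> { };
        Syncable rejecting = () -> {
            throw new UnsupportedOperationException("data is not persisted until close()");
        };
        System.out.println(trySync(durable));
        System.out.println(trySync(rejecting));
    }
}
```

What the fallback should be (closing the file, writing to a different store, failing the job) depends on the application; the sketch only shows where that decision would be made.
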
> Even better, they can use hasPathCapability() on the FS or hasCapability() on the stream to probe before even opening a file or trying to sync it. The hasCapability() call on a stream was actually implemented in Hadoop 2.x precisely to allow applications to identify when a stream could not meet the guarantees (e.g. some of the encrypted streams, file:// before HADOOP-13...).
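
The probe described above might look roughly like this. Again this is a sketch that compiles without Hadoop: the nested StreamCapabilities interface is a hypothetical stand-in shaped like org.apache.hadoop.fs.StreamCapabilities (a hasCapability(String) method and an "hsync" capability name), and CapabilityProbeSketch and canSync are invented names.

```java
class CapabilityProbeSketch {

    /** Hypothetical stand-in shaped like org.apache.hadoop.fs.StreamCapabilities (assumption). */
    interface StreamCapabilities {
        String HSYNC = "hsync";

        boolean hasCapability(String capability);
    }

    /** Decide up front whether hsync() can be relied upon, instead of finding out from an exception. */
    static boolean canSync(Object stream) {
        return stream instanceof StreamCapabilities
            && ((StreamCapabilities) stream).hasCapability(StreamCapabilities.HSYNC);
    }

    public static void main(String[] args) {
        StreamCapabilities hdfsLike = cap -> StreamCapabilities.HSYNC.equals(cap);
        StreamCapabilities s3aLike = cap -> false;
        System.out.println(canSync(hdfsLike));
        System.out.println(canSync(s3aLike));
        System.out.println(canSync(new Object()));
    }
}
```

A stream that does not implement the capability interface at all is treated as unable to sync, which is the conservative choice for code that needs durability.
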
> Until they can correct their code, I propose adding an option for S3A to downgrade the rejection to a warning:
> fs.s3a.downgrade.syncable.exceptions
> This will
> * Log once per process at WARN.
> * Downgrade the calls to no-ops.
> * Increment counters in S3A stats and IO stats of invocations of the Syncable methods. This will allow for stats gathering to let us identify which applications need fixing in cloud deployments.
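
To make the three proposed behaviours concrete, here is one way they could fit together. This is only a sketch under stated assumptions, not the S3A implementation: the class DowngradeSyncableSketch, its fields and the warning text are all hypothetical, the option is assumed to be a boolean, and a plain AtomicLong stands in for the S3A and IO statistics counters.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class DowngradeSyncableSketch {

    /** Process-wide flag so that the warning is only logged once. */
    private static final AtomicBoolean WARNED = new AtomicBoolean(false);

    /** Hypothetical counter standing in for the S3A/IO statistics. */
    private final AtomicLong syncInvocations = new AtomicLong();

    /** Assumed to mirror the value of fs.s3a.downgrade.syncable.exceptions. */
    private final boolean downgradeSyncableExceptions;

    DowngradeSyncableSketch(boolean downgradeSyncableExceptions) {
        this.downgradeSyncableExceptions = downgradeSyncableExceptions;
    }

    void hsync() {
        // Count every invocation so that the calling applications can be identified later.
        syncInvocations.incrementAndGet();
        if (!downgradeSyncableExceptions) {
            // Strict mode: the durability semantics cannot be met, so say so.
            throw new UnsupportedOperationException(
                "hsync() is not supported: data is not persisted until close()");
        }
        // Downgraded mode: warn once per process, then do nothing.
        if (WARNED.compareAndSet(false, true)) {
            System.err.println("WARN: hsync() called on a stream which cannot persist data before close()");
        }
    }

    long syncInvocations() {
        return syncInvocations.get();
    }
}
```

With the flag set, repeated hsync() calls return normally and are only counted; with it unset, each call is counted and then rejected.
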
> Testing: copy the hsync tests but expect exceptions to be swallowed and stats to be collected.
> Also: the UnsupportedOperationException text will link to this JIRA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
