Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/01/05 20:23:00 UTC
[jira] [Work logged] (HADOOP-17404) ABFS: Piggyback flush on Append calls for short writes

     [ https://issues.apache.org/jira/browse/HADOOP-17404?focusedWorklogId=531468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-531468 ]

ASF GitHub Bot logged work on HADOOP-17404:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Jan/21 20:22
            Start Date: 05/Jan/21 20:22
    Worklog Time Spent: 10m 
      Work Description: DadanielZ commented on a change in pull request #2509:
URL: https://github.com/apache/hadoop/pull/2509#discussion_r552173130



##########
File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java
##########
@@ -55,6 +55,7 @@
   public static final String AZURE_WRITE_MAX_CONCURRENT_REQUESTS = "fs.azure.write.max.concurrent.requests";
   public static final String AZURE_WRITE_MAX_REQUESTS_TO_QUEUE = "fs.azure.write.max.requests.to.queue";
   public static final String AZURE_WRITE_BUFFER_SIZE = "fs.azure.write.request.size";
+  public static final String AZURE_ENABLE_SMALL_WRITE_OPTIMIZATION = "fs.azure.write.enableappendwithflush";

Review comment:
       For a newly added config key, a short comment would be very helpful.
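
A minimal sketch of what such a comment might look like. The constant name and key string come from the diff above; the class name is hypothetical and the Javadoc wording is an assumption drawn from the issue description, not the text that was eventually merged.

```java
// Hypothetical stand-in for ConfigurationKeys, used only to show a documented key.
final class ConfigurationKeysSketch {

  /**
   * Assumed wording: when set to true, a short write (less data than the
   * write buffer size before an hflush/hsync) is sent as a single append
   * request that also carries the flush, instead of an append followed by
   * a separate flush. Per the issue description this is disabled by default.
   */
  public static final String AZURE_ENABLE_SMALL_WRITE_OPTIMIZATION =
      "fs.azure.write.enableappendwithflush";

  private ConfigurationKeysSketch() {
    // Constants holder; not meant to be instantiated.
  }
}
```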




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 531468)
    Time Spent: 3h 10m  (was: 3h)

> ABFS: Piggyback flush on Append calls for short writes
> ------------------------------------------------------
>
>                 Key: HADOOP-17404
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17404
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sneha Vijayarajan
>            Assignee: Sneha Vijayarajan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When the HFlush or HSync APIs are called, a call is made to the store backend to commit the data that was appended.
> If the data size written by the Hadoop app is small, i.e. the data size:
>  * before any HFlush/HSync call is made, or
>  * between 2 HFlush/HSync API calls
> is less than the write buffer size, 2 separate calls are made: one for append and another for flush.
> Apps that do such small writes end up with nearly as many flush calls as append calls.
> This PR enables the flush to be piggybacked onto the append call for such short-write scenarios.
>  
> NOTE: The change is guarded by a config, and is disabled by default until the relevant supporting changes are made available on all store production clusters.
> New config added: fs.azure.write.enableappendwithflush
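
To make the call-count argument in the description concrete, here is a toy model, not ABFS driver code. It assumes each write is followed by one hflush, that full buffers are appended as they fill, and that the piggyback applies only when the whole write is smaller than the write buffer. The class and method names are hypothetical.

```java
// Toy model only: counts backend requests, does no I/O and uses no Hadoop classes.
final class PiggybackFlushModel {

  /** Number of append and flush requests sent to the store. */
  static final class Calls {
    final int appends;
    final int flushes;

    Calls(int appends, int flushes) {
      this.appends = appends;
      this.flushes = flushes;
    }

    int total() {
      return appends + flushes;
    }
  }

  /**
   * Counts requests for a sequence of writes, each followed by one hflush.
   *
   * @param writeSizes bytes written before each hflush; every entry must be positive
   * @param bufferSize write buffer size in bytes; must be positive
   * @param appendWithFlush models fs.azure.write.enableappendwithflush being enabled
   */
  static Calls count(int[] writeSizes, int bufferSize, boolean appendWithFlush) {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    int appends = 0;
    int flushes = 0;
    for (int size : writeSizes) {
      if (size <= 0) {
        throw new IllegalArgumentException("write sizes must be positive");
      }
      if (appendWithFlush && size < bufferSize) {
        // Short write: a single append request also carries the flush.
        appends++;
        continue;
      }
      // Full buffers are appended as they fill up.
      appends += size / bufferSize;
      // Any remaining partial buffer needs its own append at flush time.
      if (size % bufferSize != 0) {
        appends++;
      }
      // The commit is a separate flush request.
      flushes++;
    }
    return new Calls(appends, flushes);
  }
}
```

Under these assumptions, three short writes against a 1024-byte buffer cost three appends plus three flushes with the option off, and three appends with no separate flush when it is on. A write at or above the buffer size costs the same either way.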



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
