Posted to common-issues@hadoop.apache.org by "John Zhuge (JIRA)" <ji...@apache.org> on 2017/09/19 02:59:00 UTC

[jira] [Commented] (HADOOP-14765) AdlFsInputStream should implement unbuffer

    [ https://issues.apache.org/jira/browse/HADOOP-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171041#comment-16171041 ] 

John Zhuge commented on HADOOP-14765:
-------------------------------------

unbuffer is not flush. It does not attempt to write any unwritten data. It just frees the buffer. Besides HBase, another use case is Impala's file handle cache, which calls unbuffer before it caches a file handle. I believe the Impala JIRA is IMPALA-1588 "Cache HDFS file handle to avoid repeated hdfs fopen call".

We just need to set ADLFileInputStream#buffer to null, which saves 4MB by default, or whatever the read buffer size is set to. There is no need to close the socket.

Unfortunately, the current ADLFileInputStream#unbuffer has slightly different semantics. It only forces the next read to fetch from the server. It does not free the buffer.
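
To illustrate the difference, here is a minimal sketch of an unbuffer that actually frees the buffer. This is not the ADL SDK code: RemoteSource, SketchBufferedStream, and all field names are hypothetical, and the remote read is simplified so that it throws no checked exceptions. It assumes the stream tracks its absolute file position, so dropping the buffer loses nothing but the cached bytes.

```java
import java.io.InputStream;

/** Hypothetical stand-in for a remote file; not part of any real SDK. */
interface RemoteSource {
    /** Copies up to len bytes starting at absolute offset pos into dst; returns the count, or -1 at EOF. */
    int readAt(long pos, byte[] dst, int len);
}

/** Illustrative buffered input stream; names are assumptions made for this sketch. */
class SketchBufferedStream extends InputStream {
    private final RemoteSource source;
    private final int blockSize;
    private byte[] buffer;   // allocated lazily, released by unbuffer()
    private int bufPos;      // index of the next byte to return from buffer
    private int bufLimit;    // number of valid bytes currently in buffer
    private long filePos;    // absolute offset of the next byte to return

    SketchBufferedStream(RemoteSource source, int blockSize) {
        this.source = source;
        this.blockSize = blockSize;
    }

    @Override
    public int read() {
        if (bufPos >= bufLimit && !fill()) {
            return -1;
        }
        filePos++;
        return buffer[bufPos++] & 0xff;
    }

    /** Refills the buffer from the current file position, reallocating it if it was freed. */
    private boolean fill() {
        if (buffer == null) {
            buffer = new byte[blockSize];
        }
        int n = source.readAt(filePos, buffer, buffer.length);
        bufPos = 0;
        bufLimit = Math.max(n, 0);
        return n > 0;
    }

    /**
     * Frees the read buffer so an idle, cached stream holds no buffer memory.
     * Resetting bufPos and bufLimit alone would only force a refetch on the
     * next read; dropping the reference is what lets the memory be reclaimed.
     * The connection is left alone.
     */
    public void unbuffer() {
        buffer = null;
        bufPos = 0;
        bufLimit = 0;
    }

    /** Exposed only so the sketch can be checked. */
    boolean hasBuffer() {
        return buffer != null;
    }
}
```

After unbuffer, the next read simply reallocates the buffer and fetches from filePos, so callers such as a file handle cache can keep the stream open without paying for the buffer while it sits idle.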

> AdlFsInputStream should implement unbuffer
> ------------------------------------------
>
>                 Key: HADOOP-14765
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14765
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/adl
>    Affects Versions: 2.8.0
>            Reporter: John Zhuge
>            Priority: Minor
>
> HBase and Impala rely on FileSystems implementing CanUnbuffer.unbuffer() to force input streams to free up remote connections (HBASE-9393). This works for HDFS, but not elsewhere.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org