Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/12/09 15:15:00 UTC

[jira] [Commented] (HDFS-16864) HDFS advisory caching should drop cache behind block when block closed

    [ https://issues.apache.org/jira/browse/HDFS-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645338#comment-17645338 ] 

ASF GitHub Bot commented on HDFS-16864:
---------------------------------------

dlmarion opened a new pull request, #5204:
URL: https://github.com/apache/hadoop/pull/5204

   As blocks are written, posix_fadvise is called when dropCacheBehindWrites is true, but the supplied range does not include the most recent 8MB written. This change modifies BlockReceiver.close so that, when dropCacheBehindWrites is true, posix_fadvise is called on the entire block.
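
   A minimal sketch of that close-time drop, assuming BlockReceiver's
   existing fields (streams, block, dropCacheBehindWrites) and the NativeIO
   cache manipulator the class already uses; the helper name, its placement,
   and the blockLength parameter are illustrative, not the actual patch:

       import java.io.IOException;
       import org.apache.hadoop.io.nativeio.NativeIO;

       // Illustrative helper, called from BlockReceiver.close() once the
       // block is finalized; blockLength is the number of bytes written.
       private void dropCacheBehindBlockOnClose(long blockLength)
           throws IOException {
         if (dropCacheBehindWrites && streams.getOutFd() != null) {
           // Advise the OS to drop the whole range [0, blockLength),
           // including the final 8MB window that manageWriterOsCache
           // never covers.
           NativeIO.POSIX.getCacheManipulator().posixFadviseIfPossible(
               block.getBlockName(), streams.getOutFd(), 0, blockLength,
               NativeIO.POSIX.POSIX_FADV_DONTNEED);
         }
       }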
   




> HDFS advisory caching should drop cache behind block when block closed
> ----------------------------------------------------------------------
>
>                 Key: HDFS-16864
>                 URL: https://issues.apache.org/jira/browse/HDFS-16864
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.3.4
>            Reporter: Dave Marion
>            Priority: Minor
>
> One of the comments in HDFS-4817 describes the behavior in BlockReceiver.manageWriterOsCache:
> "The general idea is that there isn't much point in calling {{sync_file_pages}} twice on the same offsets, since the sync process has presumably already begun. On the other hand, calling {{fadvise(FADV_DONTNEED)}} again and again will tend to purge more and more bytes from the cache. The reason is because dirty pages (those containing un-written-out-data) cannot be purged using {{{}FADV_DONTNEED{}}}. And we can't know exactly when the pages we wrote will be flushed to disk. But we do know that calling {{FADV_DONTNEED}} on very recently written bytes is a waste, since they will almost certainly not have been written out to disk. That is why it purges between 0 and {{{}lastCacheManagementOffset - CACHE_WINDOW_SIZE{}}}, rather than simply 0 to pos."
> Looking at the code, I'm wondering if at least the last 8MB (the size of CACHE_WINDOW_SIZE) of a block might be left without an associated FADV_DONTNEED call. We're having a [discussion|https://the-asf.slack.com/archives/CERNB8NDC/p1669399302264189] in #accumulo about the file caching feature, and I found some interesting [results|https://gist.github.com/dlmarion/1835f387b0fa8fb9dbf849a0c87b6d04] in a test that we wrote. Specifically, for a multi-block file written using setDropBehind with either hsync or CreateFlag.SYNC_BLOCK, parts of every block remained in the cache, rather than only parts of the last block.
> I'm wondering if there is a reason not to call fadvise(FADV_DONTNEED) on the entire block in BlockReceiver.close [here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java#L371] when dropCacheBehindWrites is true.
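
To make the quoted drop-window logic concrete, here is a simplified, self-contained sketch; the names follow the issue text (CACHE_WINDOW_SIZE = 8MB), not necessarily the current BlockReceiver code, and the fadvise helper is a stand-in for NativeIO.POSIX.getCacheManipulator().posixFadviseIfPossible:

    import java.io.FileDescriptor;

    class WriterCacheWindowSketch {
      static final long CACHE_WINDOW_SIZE = 8 * 1024 * 1024; // 8MB
      private final boolean dropCacheBehindWrites = true;
      private long lastCacheManagementOffset = 0;

      void manageWriterOsCache(FileDescriptor fd, long pos) {
        if (dropCacheBehindWrites
            && pos > lastCacheManagementOffset + CACHE_WINDOW_SIZE) {
          // Drop only pages old enough to have plausibly been written back:
          // the range [0, lastCacheManagementOffset - CACHE_WINDOW_SIZE).
          long dropEnd = lastCacheManagementOffset - CACHE_WINDOW_SIZE;
          if (dropEnd > 0) {
            posixFadviseDontNeed(fd, 0, dropEnd);
          }
          lastCacheManagementOffset = pos;
        }
        // dropEnd always trails pos by at least CACHE_WINDOW_SIZE, so the
        // tail of the block is never advised away here; only a close-time
        // fadvise over [0, blockLength) would cover it.
      }

      // Stand-in for the native fadvise(FADV_DONTNEED) call.
      private void posixFadviseDontNeed(FileDescriptor fd, long off, long len) {
        // native call elided in this sketch
      }
    }

Under this logic, the final ~8MB of every finalized block can stay resident, which matches the gist's observation that parts of each block, not just the last one, remained in the page cache.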
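
For reference, the kind of writer the linked test exercises looks roughly like this (the path and sizes are hypothetical; the relevant calls are setDropBehind and hsync):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DropBehindWriteExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/dropbehind-test"); // hypothetical path
        try (FSDataOutputStream out = fs.create(p)) {
          out.setDropBehind(true); // request drop-behind on writes
          byte[] buf = new byte[1024 * 1024];
          for (int i = 0; i < 512; i++) { // ~512MB, spans several blocks
            out.write(buf);
          }
          out.hsync(); // per the gist, the tail of each block can still
                       // remain in the page cache after this
        }
      }
    }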



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org