Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2022/11/14 18:51:42 UTC

[GitHub] [accumulo] dlmarion commented on pull request #3076: Call FSDataOutputStream.setDropBehind for WAL files

dlmarion commented on PR #3076:
URL: https://github.com/apache/accumulo/pull/3076#issuecomment-1314223461

   This change is not in response to a newly reported issue; it addresses something I have seen on busy clusters in the past, where the page cache flush process is very active. I think there is an opportunity here to relieve some of that cache pressure by telling the operating system that it doesn't need to cache the WAL after we write it, because we aren't going to read it back. There is likely a complementary commit I could make for the other side of this: during recovery, we can tell the operating system to drop or not cache the WAL, since we are going to read it once and then be done with it.
   
   IIRC, under the hood this calls [posix_fadvise](https://linux.die.net/man/2/posix_fadvise) and the native code is located [here](https://github.com/apache/hadoop/blame/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/nativeio/NativeIO.c#L167). 
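
As a rough illustration of the call pattern described above, here is a minimal sketch of treating drop-behind as a best-effort hint. To keep it runnable without Hadoop on the classpath, it declares a local `CanSetDropBehind` interface as a stand-in for Hadoop's `org.apache.hadoop.fs.CanSetDropBehind` (which `FSDataOutputStream` implements). The `RecordingStream` class and the `tryDropBehind` helper are hypothetical names made up for this example, and this is not necessarily how the PR itself is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class DropBehindSketch {

  // Assumption: local stand-in for org.apache.hadoop.fs.CanSetDropBehind,
  // declared here only so the sketch compiles without Hadoop.
  interface CanSetDropBehind {
    void setDropBehind(Boolean dropBehind) throws IOException;
  }

  // Hypothetical stream that records the hint instead of reaching
  // posix_fadvise through native code.
  static class RecordingStream extends ByteArrayOutputStream implements CanSetDropBehind {
    Boolean dropBehind = null;

    @Override
    public void setDropBehind(Boolean dropBehind) {
      this.dropBehind = dropBehind;
    }
  }

  // Hypothetical helper: request drop-behind if the stream supports it.
  // The hint is only an optimization, so a failure is reported to the
  // caller as "false" rather than aborting the write path.
  static boolean tryDropBehind(OutputStream out) {
    if (!(out instanceof CanSetDropBehind)) {
      return false;
    }
    try {
      ((CanSetDropBehind) out).setDropBehind(Boolean.TRUE);
      return true;
    } catch (IOException | UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    RecordingStream wal = new RecordingStream();
    boolean applied = tryDropBehind(wal);
    wal.write("mutation".getBytes(StandardCharsets.UTF_8));
    System.out.println("hint applied: " + applied);
    System.out.println("plain stream: " + tryDropBehind(new ByteArrayOutputStream()));
  }
}
```

Catching `UnsupportedOperationException` matters in the real API as well, since `FSDataOutputStream.setDropBehind` throws it when the wrapped stream does not support the setting. Whether the hint has any effect still depends on the underlying file system and on native I/O support.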

