Posted to common-issues@hadoop.apache.org by "Todd Lipcon (Updated) (JIRA)" <ji...@apache.org> on 2011/10/03 19:27:34 UTC

[jira] [Updated] (HADOOP-7714) Add support in native libs for OS buffer cache management

     [ https://issues.apache.org/jira/browse/HADOOP-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-7714:
--------------------------------

    Attachment: hadoop-7714-20s-prelim.txt

Here's a rough patch that I've been playing with over the weekend. With the patch applied and configured, cache churn is noticeably lower for workloads like teragen and teravalidate. (Terasort still churns the cache because of the shuffle, but it wouldn't be too hard to improve the shuffle to drop map output from the cache once a reducer has fetched it.)
                
> Add support in native libs for OS buffer cache management
> ---------------------------------------------------------
>
>                 Key: HADOOP-7714
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7714
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>    Affects Versions: 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-7714-20s-prelim.txt
>
>
> Especially in shared HBase/MR situations, management of the OS buffer cache is important. Currently, running a big MR job evicts all of HBase's hot data from the cache, causing HBase performance to suffer badly. However, caching the MR input/output is rarely useful, since the datasets tend to be larger than the cache and are not re-read often enough for the cache to help. Having access to the native calls {{posix_fadvise}} and {{sync_file_range}} on platforms where they are supported would allow us to do a better job of managing this cache.
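
For illustration only, here is a minimal sketch of the kind of cache management the description has in mind. It is not taken from the attached patch. It uses Python's os.posix_fadvise wrapper rather than Hadoop's native code, the function names write_then_drop_cache and read_and_drop_cache are hypothetical, and os.fsync stands in for the Linux sync_file_range call because Python's os module does not expose the latter. It assumes a Unix platform where posix_fadvise is available. The advice is only a hint, so the kernel is free to ignore it.

```python
import os
import tempfile


def write_then_drop_cache(path, chunks):
    """Write chunks to path, then hint that the cached pages are not needed.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        total = 0
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                written = os.write(fd, view)  # may write fewer bytes than asked
                view = view[written:]
                total += written
        # Dirty pages cannot be dropped, so flush them to disk first.
        # (Assumption: fsync is a stand-in for a ranged sync_file_range.)
        os.fsync(fd)
        # offset=0, len=0 covers the file from the start to its end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return total
    finally:
        os.close(fd)


def read_and_drop_cache(path, block_size=1 << 20):
    """Read path sequentially, dropping each block from cache after use.

    Hypothetical helper for illustration. Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Tell the kernel to expect a sequential scan.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        offset = 0
        while True:
            block = os.read(fd, block_size)
            if not block:
                break
            # A real reader would process the block here before dropping it.
            os.posix_fadvise(fd, offset, len(block), os.POSIX_FADV_DONTNEED)
            offset += len(block)
        return offset
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    print(write_then_drop_cache(demo_path, [b"x" * 8192]))
    print(read_and_drop_cache(demo_path, block_size=4096))
```

The write path syncs before advising because POSIX_FADV_DONTNEED does not discard pages that are still dirty. The read path drops pages block by block so that a large scan never holds much of the cache at once, which is the behaviour that would keep a big MR job from evicting another process's hot data.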
