Posted to dev@storm.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/01/09 00:38:39 UTC

[jira] [Commented] (STORM-1206) Reduce logviewer memory usage

    [ https://issues.apache.org/jira/browse/STORM-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090202#comment-15090202 ] 

ASF GitHub Bot commented on STORM-1206:
---------------------------------------

GitHub user zhuoliu opened a pull request:

    https://github.com/apache/storm/pull/999

    [STORM-1206] Reduce logviewer memory usage through directory stream

    Replacing File.listFiles with DirectoryStream reduces logviewer memory usage because it does not require loading all files' metadata into memory at once. This avoids potential memory problems in extreme cases (e.g., millions of small files in the log directory).
    In addition, DirectoryCleaner now uses a multi-phase, priority-queue-based sorting and cleaning scheme in place of the global in-memory sort.
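
A minimal sketch of what a multi-phase, priority-queue-based cleanup could look like. This is not the code from the pull request: the class name `OldestFilesSketch`, the methods `oldestFiles` and `deleteOldest`, and the `batchSize` and `bytesToFree` parameters are all hypothetical. Each pass streams the directory once and keeps only the `batchSize` oldest files in a bounded max-heap, so memory stays proportional to the batch size rather than to the number of files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; names and structure are assumptions, not Storm's DirectoryCleaner.
class OldestFilesSketch {
    /** A path with its modification time and size, read once during the scan. */
    static final class Entry {
        final Path path;
        final long mtime;
        final long size;

        Entry(Path path, long mtime, long size) {
            this.path = path;
            this.mtime = mtime;
            this.size = size;
        }
    }

    /** One pass: returns up to batchSize (>= 1) oldest regular files in dir, oldest first. */
    static List<Entry> oldestFiles(Path dir, int batchSize) throws IOException {
        // Max-heap on mtime: the head is the newest of the candidates kept so far.
        PriorityQueue<Entry> heap = new PriorityQueue<>(
                batchSize, Comparator.comparingLong((Entry e) -> e.mtime).reversed());
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                Entry e = new Entry(p, Files.getLastModifiedTime(p).toMillis(), Files.size(p));
                if (heap.size() < batchSize) {
                    heap.add(e);
                } else if (e.mtime < heap.peek().mtime) {
                    heap.poll(); // evict the newest candidate to make room for an older file
                    heap.add(e);
                }
            }
        }
        List<Entry> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingLong((Entry e) -> e.mtime));
        return result;
    }

    /** Repeats passes, deleting oldest files first, until bytesToFree is reached or no files remain. */
    static long deleteOldest(Path dir, long bytesToFree, int batchSize) throws IOException {
        long freed = 0;
        while (freed < bytesToFree) {
            List<Entry> batch = oldestFiles(dir, batchSize);
            if (batch.isEmpty()) {
                break; // nothing left to delete
            }
            for (Entry e : batch) {
                if (freed >= bytesToFree) {
                    break;
                }
                if (Files.deleteIfExists(e.path)) {
                    freed += e.size;
                }
            }
        }
        return freed;
    }
}
```

The trade-off in this sketch is extra directory scans in exchange for a bounded heap: a larger `batchSize` means fewer passes but more entries held in memory. It ignores races with files that disappear between listing and deletion.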

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhuoliu/storm 1206

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/999.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #999
    
----
commit 7cc1d05a6c4db0610baccbc7870544d9ea8258af
Author: zhuol <zh...@yahoo-inc.com>
Date:   2016-01-08T23:32:50Z

    [STORM-1206] Reduce logviewer memory usage through directory stream

----


> Reduce logviewer memory usage
> -----------------------------
>
>                 Key: STORM-1206
>                 URL: https://issues.apache.org/jira/browse/STORM-1206
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Zhuo Liu
>            Assignee: Zhuo Liu
>            Priority: Minor
>
> In production, we ran into an issue with logviewers bouncing with out-of-memory errors. This happens very rarely; we hit it in an extreme case where very frequent worker restarts generated a huge number of gc files (~1M files).
> When there are that many log files (~1M) for a particular headless user, so many strings are resident in memory that the logviewer runs out of heap space.
> We were able to work around this by increasing the heap size, but we should consider putting some sort of upper bound on the number of files so that we don't run into this issue, even with the bigger heap.
> Using the Java DirectoryStream avoids holding all file names in memory during file listing. Also, a multi-round directory cleaner can be introduced to delete files while the disk quota is exceeded.
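
As a rough illustration of the streaming listing described above, the following sketch checks a file-count bound and sums file sizes without ever materializing the full list of names. The class `StreamingDirScan` and its methods `exceedsFileLimit` and `totalSize` are hypothetical names chosen for this example, not part of Storm.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of iterating a directory lazily with DirectoryStream.
class StreamingDirScan {
    /** Returns true as soon as more than limit entries have been seen in dir. */
    static boolean exceedsFileLimit(Path dir, long limit) throws IOException {
        long seen = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path ignored : stream) {
                if (++seen > limit) {
                    return true; // stop early; the remaining entries are never read
                }
            }
        }
        return false;
    }

    /** Sums the sizes of regular files in dir whose names match the glob pattern. */
    static long totalSize(Path dir, String glob) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    total += Files.size(p);
                }
            }
        }
        return total;
    }
}
```

By contrast, `File.listFiles` returns an array containing every entry, which is what makes a directory with around a million files expensive to list.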



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)