Posted to dev@mesos.apache.org by "John Sirois (Created) (JIRA)" <ji...@apache.org> on 2012/04/13 08:30:35 UTC

[jira] [Created] (MESOS-184) Log has a space leak

Log has a space leak
--------------------

                 Key: MESOS-184
                 URL: https://issues.apache.org/jira/browse/MESOS-184
             Project: Mesos
          Issue Type: Bug
          Components: c++-api
    Affects Versions: 0.9.0
            Reporter: John Sirois
            Assignee: Benjamin Hindman


In short, the Log's access pattern against its underlying LevelDB storage makes background compactions ineffective, so a long-running Log leaks space on disk even when Log::Writer::truncate is called often enough that it should not.
It seems the right thing to do is to issue a DB::CompactRange(NULL, Slice(truncateToKey)) after a replica learns an Action::TRUNCATE record.  The cost here is a synchronous compaction stall on every truncate, so maybe this should be a configuration option or even an explicit API.
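
A rough sketch of the call proposed above, not the Mesos implementation. leveldb's DB::CompactRange takes pointers to its begin and end slices, and a NULL begin is treated as a key before all keys in the database. The helper name compactTruncated is hypothetical, and DB and Slice are template parameters only so that the snippet compiles without leveldb installed; in real use they would be leveldb::DB and leveldb::Slice.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: ask the store to compact every key up to and
// including truncateToKey, so that space held by truncated entries in the
// higher levels can be reclaimed.
template <typename Slice, typename DB>
void compactTruncated(DB& db, const std::string& truncateToKey) {
  assert(!truncateToKey.empty());
  // A null begin pointer means "from the first key in the database".
  const Slice end(truncateToKey);
  db.CompactRange(nullptr, &end);
}
```

With leveldb this would be invoked as compactTruncated<leveldb::Slice>(*db, key) at the point where the replica has persisted a learned truncation. The call blocks until the compaction finishes, which is the stall mentioned above.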
===

Snip of email explanation:
I spent some time understanding what was going on here, and our usage pattern of leveldb does in fact defeat the background compaction algorithm.

The docs are here: http://leveldb.googlecode.com/svn/trunk/doc/impl.html in the 'Compactions' section. The gist is that a compaction operates on one uncompacted file from a level plus all files overlapping its key range in the next level.  Since we write sequential keys with no randomness at all, by definition the only overlap we can ever get is in level 0, which is the only level where leveldb allows sstables to overlap in the first place.
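
To make the overlap argument concrete, here is a simplified model; it is an illustration under stated assumptions, not leveldb code. An sstable is reduced to the inclusive key range it covers (the FileRange type and countOverlapping function are hypothetical names), and keys are modeled as integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for an sstable: the inclusive key range it covers.
struct FileRange {
  std::uint64_t smallest;
  std::uint64_t largest;
};

// Count the files in the next level whose key range intersects `input`.
// This mirrors, in spirit only, how a compaction gathers its inputs.
std::size_t countOverlapping(const std::vector<FileRange>& nextLevel,
                             const FileRange& input) {
  assert(input.smallest <= input.largest);
  std::size_t count = 0;
  for (const FileRange& f : nextLevel) {
    if (f.smallest <= input.largest && input.smallest <= f.largest) {
      ++count;
    }
  }
  return count;
}
```

With a next level holding the ranges [0, 99] and [100, 199], a sequentially written file covering [200, 299] overlaps nothing, while a file covering [50, 150], as randomly distributed keys might produce, overlaps both.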

That leaves the question of why there is no compaction on open.  Looking here: http://code.google.com/p/leveldb/source/browse/db/db_impl.cc#1376
I see a call to MaybeScheduleCompaction, but following that trail just leads to http://code.google.com/p/leveldb/source/browse/db/version_set.cc?spec=svnbc1ee4d25e09b04e074db330a41f54ef4af0e31b&r=36a5f8ed7f9fb3373236d5eace4f5fea369856ee#1156 which implements the compaction strategy I tried to summarize above. Thus background compactions for our case are limited to level 0 -> level 1 compactions, and level 1 and higher never compact automatically.

This seems to be borne out by the LOG files.  For example, from smf1-prod - restarts after your manual compaction fix in bold:
[jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting /var/lib/mesos/scheduler_db/mesos_log/LOG.old 
2012/04/13-00:24:20.356673 44c1e940 Compacting 3@0 + 4@1 files
2012/04/13-00:24:20.490113 44c1e940 Compacting 5@1 + 281@2 files
2012/04/13-00:24:25.824995 44c1e940 Compacting 1@1 + 0@2 files
2012/04/13-00:24:26.008857 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.196877 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.312465 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.429817 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.533483 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.631044 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.733702 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.832787 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:26.949864 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.052502 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.164623 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.275621 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.376748 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.477728 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:27.611332 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:28.050275 44c1e940 Compacting 50@2 + 242@3 files
2012/04/13-00:24:32.455665 44c1e940 Compacting 1@2 + 0@3 files
2012/04/13-00:24:32.538566 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:32.819205 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.052064 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.198850 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.350893 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.521784 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.693531 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:33.847151 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.034277 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.225582 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.390228 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.554127 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.715242 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:34.852110 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:24:35.046899 44c1e940 Compacting 68@3 + 331@4 files
2012/04/13-00:25:02.582758 44c1e940 Compacting 433@3 + 2159@4 files
2012/04/13-00:26:39.827096 44c1e940 Compacting 1@3 + 0@4 files
2012/04/13-00:26:39.992623 44c1e940 Compacting 72@4 + 354@5 files
2012/04/13-00:27:13.024120 44c1e940 Compacting 9@4 + 51@5 files
2012/04/13-00:27:18.007566 44c1e940 Compacting 9@4 + 48@5 files
2012/04/13-00:27:23.026351 44c1e940 Compacting 8@4 + 41@5 files
2012/04/13-00:27:28.408619 44c1e940 Compacting 6@4 + 33@5 files
2012/04/13-00:27:32.522630 44c1e940 Compacting 6@4 + 32@5 files
2012/04/13-00:27:36.719610 44c1e940 Compacting 6@4 + 31@5 files
2012/04/13-00:27:41.277302 44c1e940 Compacting 6@4 + 33@5 files
2012/04/13-00:27:44.928451 44c1e940 Compacting 6@4 + 32@5 files
2012/04/13-00:27:48.168874 44c1e940 Compacting 6@4 + 34@5 files
2012/04/13-00:27:52.718402 44c1e940 Compacting 6@4 + 32@5 files
2012/04/13-00:27:55.665107 44c1e940 Compacting 6@4 + 33@5 files
2012/04/13-00:27:59.381808 44c1e940 Compacting 6@4 + 34@5 files
2012/04/13-00:28:03.592802 44c1e940 Compacting 6@4 + 33@5 files
2012/04/13-00:28:07.179032 44c1e940 Compacting 664@4 + 3330@5 files
2012/04/13-00:29:58.239662 44c1e940 Compacting 101@4 + 500@5 files
2012/04/13-00:30:22.333750 44c1e940 Compacting 1@4 + 0@5 files

2012/04/13-00:45:28.851715 44c1e940 Compacting 4@0 + 1@1 files
2012/04/13-01:00:31.152105 44c1e940 Compacting 4@0 + 3@1 files
2012/04/13-01:10:33.167940 44c1e940 Compacting 4@0 + 3@1 files
2012/04/13-01:25:35.113416 44c1e940 Compacting 4@0 + 3@1 files
2012/04/13-01:35:37.621499 44c1e940 Compacting 4@0 + 2@1 files

[jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting /var/lib/mesos/scheduler_db/mesos_log/LOG
2012/04/13-01:44:32.533694 44c27940 Compacting 2@0 + 3@1 files
2012/04/13-01:44:32.586958 44c27940 Compacting 2@1 + 6@2 files
2012/04/13-01:44:32.739514 44c27940 Compacting 1@2 + 0@3 files
2012/04/13-01:44:32.768764 44c27940 Compacting 1@2 + 0@3 files
2012/04/13-01:44:32.843866 44c27940 Compacting 1@3 + 0@4 files
2012/04/13-01:44:32.973304 44c27940 Compacting 1@3 + 0@4 files
2012/04/13-01:44:33.009686 44c27940 Compacting 1@4 + 2@5 files
2012/04/13-01:44:33.074056 44c27940 Compacting 1@4 + 0@5 files

2012/04/13-02:01:42.947456 44c27940 Compacting 4@0 + 1@1 files
2012/04/13-02:16:45.326088 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-02:31:48.360851 44c27940 Compacting 4@0 + 1@1 files
2012/04/13-02:41:50.055622 44c27940 Compacting 4@0 + 3@1 files
2012/04/13-02:51:51.889148 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:01:54.345784 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:11:55.987774 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:21:57.701121 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:31:59.373435 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:42:01.047061 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-03:52:03.088683 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:02:05.181165 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:12:06.757773 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:22:08.598259 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:32:10.882913 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:42:12.602192 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-04:52:14.779705 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:02:16.621063 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:12:18.608767 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:22:20.453201 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:32:22.215804 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:42:23.882423 44c27940 Compacting 4@0 + 4@1 files
2012/04/13-05:52:25.553032 44c27940 Compacting 4@0 + 4@1 files

The format is [number of sstables compacted]@[level #]. This shows that all levels are now compacted on startup, but once running we only see level 0 -> level 1 compactions, and this accounts for the observed space leak.
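
For anyone wanting to check their own LOG files the same way, the following is a small sketch that tallies "Compacting" lines by source level. The function name compactionsBySourceLevel is hypothetical, and the line format is assumed to match the grep output in this message.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Tally "Compacting N@L + M@(L+1) files" lines by source level L.
// Returns a map from source level to the number of compactions seen.
std::map<int, int> compactionsBySourceLevel(
    const std::vector<std::string>& lines) {
  std::map<int, int> counts;
  for (const std::string& line : lines) {
    const std::string::size_type pos = line.find("Compacting ");
    if (pos == std::string::npos) continue;  // not a compaction line
    int n = 0, level = 0, m = 0, next = 0;
    if (std::sscanf(line.c_str() + pos, "Compacting %d@%d + %d@%d",
                    &n, &level, &m, &next) == 4) {
      assert(level >= 0);
      ++counts[level];
    }
  }
  return counts;
}
```

A healthy store should show entries for several source levels over time; a result containing only level 0 after startup matches the pattern described here.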


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MESOS-184) Log has a space leak

Posted by "Benjamin Mahler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433271#comment-13433271 ] 

Benjamin Mahler commented on MESOS-184:
---------------------------------------

How long does a scheduler have to be up before the long compaction times happen at the next startup? Days? Weeks?
                

[jira] [Commented] (MESOS-184) Log has a space leak

Posted by "Benjamin Mahler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433292#comment-13433292 ] 

Benjamin Mahler commented on MESOS-184:
---------------------------------------

What if we periodically watch the log dir size, and trigger compactions before it gets out of hand?
That might tie in nicely with the file GC code once we have it.

(I'm guessing we do not want to switch away from leveldb at this point?)
                

[jira] [Commented] (MESOS-184) Log has a space leak

Posted by "John Sirois (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433317#comment-13433317 ] 

John Sirois commented on MESOS-184:
-----------------------------------

This would work with the API suggestion above.  For Twitter's use we might do this; for general use, compacting on truncate or through a new API call seems saner.  And as you point out, this is very specifically a problem of leveldb being a bad fit for the log's write pattern.  The other solution is to write an append-only log implementation that knows how to truncate.
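
To illustrate what such an append-only log might look like, here is an in-memory sketch; it is an assumption about one possible design, not an existing Mesos class. Each segment stands in for one file on disk, and SegmentedLog and its methods are hypothetical names. Truncation drops whole segments, so no compaction is needed to reclaim their space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

class SegmentedLog {
 public:
  explicit SegmentedLog(std::size_t entriesPerSegment)
      : entriesPerSegment_(entriesPerSegment) {
    assert(entriesPerSegment > 0);
  }

  // Append an entry and return its position.
  std::uint64_t append(const std::string& entry) {
    if (segments_.empty() ||
        segments_.back().entries.size() == entriesPerSegment_) {
      segments_.push_back(Segment{next_, {}});  // start a new segment
    }
    segments_.back().entries.push_back(entry);
    return next_++;
  }

  // Drop every segment that lies entirely below position `to`.
  void truncate(std::uint64_t to) {
    while (!segments_.empty() &&
           segments_.front().first + segments_.front().entries.size() <= to) {
      segments_.pop_front();
    }
  }

  std::size_t segmentCount() const { return segments_.size(); }

 private:
  struct Segment {
    std::uint64_t first;               // position of entries[0]
    std::vector<std::string> entries;  // entries in append order
  };

  std::size_t entriesPerSegment_;
  std::uint64_t next_ = 0;
  std::deque<Segment> segments_;
};
```

The trade-off in this sketch is granularity: entries below the truncation point survive until their whole segment falls below it, so the segment size bounds how much dead data can linger.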
                
> 2012/04/13-00:28:07.179032 44c1e940 Compacting 664@4 + 3330@5 files
> 2012/04/13-00:29:58.239662 44c1e940 Compacting 101@4 + 500@5 files
> 2012/04/13-00:30:22.333750 44c1e940 Compacting 1@4 + 0@5 files
> 2012/04/13-00:45:28.851715 44c1e940 Compacting 4@0 + 1@1 files
> 2012/04/13-01:00:31.152105 44c1e940 Compacting 4@0 + 3@1 files
> 2012/04/13-01:10:33.167940 44c1e940 Compacting 4@0 + 3@1 files
> 2012/04/13-01:25:35.113416 44c1e940 Compacting 4@0 + 3@1 files
> 2012/04/13-01:35:37.621499 44c1e940 Compacting 4@0 + 2@1 files
> [jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting /var/lib/mesos/scheduler_db/mesos_log/LOG
> 2012/04/13-01:44:32.533694 44c27940 Compacting 2@0 + 3@1 files
> 2012/04/13-01:44:32.586958 44c27940 Compacting 2@1 + 6@2 files
> 2012/04/13-01:44:32.739514 44c27940 Compacting 1@2 + 0@3 files
> 2012/04/13-01:44:32.768764 44c27940 Compacting 1@2 + 0@3 files
> 2012/04/13-01:44:32.843866 44c27940 Compacting 1@3 + 0@4 files
> 2012/04/13-01:44:32.973304 44c27940 Compacting 1@3 + 0@4 files
> 2012/04/13-01:44:33.009686 44c27940 Compacting 1@4 + 2@5 files
> 2012/04/13-01:44:33.074056 44c27940 Compacting 1@4 + 0@5 files
> 2012/04/13-02:01:42.947456 44c27940 Compacting 4@0 + 1@1 files
> 2012/04/13-02:16:45.326088 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-02:31:48.360851 44c27940 Compacting 4@0 + 1@1 files
> 2012/04/13-02:41:50.055622 44c27940 Compacting 4@0 + 3@1 files
> 2012/04/13-02:51:51.889148 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:01:54.345784 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:11:55.987774 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:21:57.701121 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:31:59.373435 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:42:01.047061 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-03:52:03.088683 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:02:05.181165 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:12:06.757773 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:22:08.598259 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:32:10.882913 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:42:12.602192 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-04:52:14.779705 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:02:16.621063 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:12:18.608767 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:22:20.453201 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:32:22.215804 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:42:23.882423 44c27940 Compacting 4@0 + 4@1 files
> 2012/04/13-05:52:25.553032 44c27940 Compacting 4@0 + 4@1 files
> The format is [number of sstables compacted]@[level #]. This says all levels are now compacted on startup, but once running we only see level 0 -> level 1 compactions, and this accounts for the observed space leak.
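
To check the same pattern on another replica, the grep output can be tallied by source level. The following is a minimal sketch, assuming the LOG lines keep the "Compacting N@L + M@L+1 files" shape shown above; the function name and the sample list are illustrative only and are not part of Mesos or LevelDB.

```python
import re
from collections import Counter

# Matches LevelDB LOG lines such as:
#   2012/04/13-00:24:20.356673 44c1e940 Compacting 3@0 + 4@1 files
# Assumption: the line shape is as quoted in this thread.
COMPACTING = re.compile(r"Compacting (\d+)@(\d+) \+ (\d+)@(\d+) files")


def compactions_by_source_level(lines):
    """Count compaction events per source level (the level after the first '@')."""
    counts = Counter()
    for line in lines:
        match = COMPACTING.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# A few lines copied from the LOG excerpt above, used as sample input.
sample = [
    "2012/04/13-01:44:32.533694 44c27940 Compacting 2@0 + 3@1 files",
    "2012/04/13-01:44:32.586958 44c27940 Compacting 2@1 + 6@2 files",
    "2012/04/13-01:44:33.009686 44c27940 Compacting 1@4 + 2@5 files",
    "2012/04/13-02:01:42.947456 44c27940 Compacting 4@0 + 1@1 files",
    "2012/04/13-02:16:45.326088 44c27940 Compacting 4@0 + 4@1 files",
]
counts = compactions_by_source_level(sample)
print(dict(counts))
```

Splitting the input at the end of the startup burst and tallying each half separately should show whether anything above level 0 is compacted once the process is running.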

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MESOS-184) Log has a space leak

Posted by "John Sirois (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433284#comment-13433284 ] 

John Sirois commented on MESOS-184:
-----------------------------------

This is fully usage dependent.  We see log compactions taking a long time (many minutes) when the size of the log dir is in the tens of GB range, and this is basically bound by disk access times and transfer rates.
                
