You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jonathan Ellis (Created) (JIRA)" <ji...@apache.org> on 2012/02/23 19:14:48 UTC

[jira] [Created] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

avoid quadratic startup time in LeveledManifest
-----------------------------------------------

                 Key: CASSANDRA-3952
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
            Priority: Minor
             Fix For: 1.1.0, 1.1.1


Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:

{code}
.       // ensure all SSTables are in the manifest
        for (SSTableReader ssTableReader : cfs.getSSTables())
        {
            if (manifest.levelOf(ssTableReader) < 0)
                manifest.add(ssTableReader);
        }
{code}

{code}
private int levelOf(SSTableReader sstable)
{
    for (int level = 0; level < generations.length; level++)
    {
        if (generations[level].contains(sstable))
            return level;
    }
    return -1;
}
{code}

Note that the contains call is a linear List.contains.

We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223079#comment-13223079 ] 

Peter Schuller commented on CASSANDRA-3952:
-------------------------------------------

Unable to reproduce even when nuking the .json files and changing back and fourth between compaction strategies.

However, observe the log below. The last 'Compacting ...' line indicates that it's compacting a single SStable (first line in log; rest is un-edited log up to the point of the assertion error).

For the case of a single sstable, the newLevel will be set to 0 first, if the sstable is in level 0 (which I am assuming, or this makes even less sense, but is not "known"):

{code}
        // avoid increasing the level if we had a single source sstable involved.  This prevents
        // cleanup, scrub, and upgradesstables from blowing through the level cap.
        // See CASSANDRA-3989
        int newLevel = Iterables.size(removed) == 1
                     ? maximumLevel
                     : minimumLevel == maximumLevel ? maximumLevel + 1 : maximumLevel;
{code}

We then re-assign newLevel as such:

{code}
        newLevel = skipLevels(newLevel, added);
{code}

But skipLevels() makes no guarantee that it will in fact skip any level:

{code}
    private int skipLevels(int newLevel, Iterable<SSTableReader> added)
    {
        while (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
            && generations[(newLevel + 1)].isEmpty())
        {
            newLevel++;
        }
        return newLevel;
    }
{code}

The other possibility I can see is that the patch somehow introduces a lack of level information that I'm not seeing, and levelOf() returns -1, and then skipLevels() skips one level making the final level 0.


Log:

{code}
 INFO [CompactionExecutor:4] 2012-03-05 23:29:45,291 CompactionTask.java (line 114) Compacting [SSTableReader(path='/var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-1-Data.db')]
 WARN [MutationStage:21] 2012-03-05 23:29:46,181 Memtable.java (line 159) MemoryMeter uninitialized (jamm not specified as java agent); assuming liveRatio of 10.0.  Usually this means cassandra-env.sh disabled jamm because you are using a buggy JRE; upgrade to the Sun JRE instead
 INFO [FlushWriter:2] 2012-03-05 23:29:46,516 Memtable.java (line 290) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-2-Data.db (471933 bytes)
 INFO [OptionalTasks:1] 2012-03-05 23:29:47,178 MeteredFlusher.java (line 58) flushing high-traffic column family CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') (estimated 51525937 bytes)
 INFO [OptionalTasks:1] 2012-03-05 23:29:47,183 ColumnFamilyStore.java (line 645) Enqueuing flush of Memtable-Standard1@114464782(4122075/51525937 serialized/live bytes, 80825 ops)
 INFO [FlushWriter:2] 2012-03-05 23:29:47,184 Memtable.java (line 249) Writing Memtable-Standard1@114464782(4122075/51525937 serialized/live bytes, 80825 ops)
 INFO [FlushWriter:2] 2012-03-05 23:29:48,163 Memtable.java (line 290) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-4-Data.db (583221 bytes)
 INFO [OptionalTasks:1] 2012-03-05 23:29:49,186 MeteredFlusher.java (line 58) flushing high-traffic column family CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') (estimated 46716000 bytes)
 INFO [OptionalTasks:1] 2012-03-05 23:29:49,187 ColumnFamilyStore.java (line 645) Enqueuing flush of Memtable-Standard1@1403886703(3739065/46738312 serialized/live bytes, 73315 ops)
 INFO [FlushWriter:2] 2012-03-05 23:29:49,188 Memtable.java (line 249) Writing Memtable-Standard1@1403886703(3739065/46738312 serialized/live bytes, 73315 ops)
 INFO [FlushWriter:2] 2012-03-05 23:29:49,578 Memtable.java (line 290) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-5-Data.db (535826 bytes)
ERROR [CompactionExecutor:4] 2012-03-05 23:29:49,746 AbstractCassandraDaemon.java (line 133) Exception in thread Thread[CompactionExecutor:4,1,main]
java.lang.AssertionError
{code}

                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3952:
--------------------------------------

    Fix Version/s:     (was: 1.1.1)
                   1.1.0
    
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.0
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223082#comment-13223082 ] 

Peter Schuller commented on CASSANDRA-3952:
-------------------------------------------

I think this might happen whenever there is exactly one sstable in L0 that is large enough for the score for L0 to be > 1, *and* if L1 is full (so that skipLevels doesn't promote).

It's possible the sstables I had lying around from sized-tiered compaction combined with my write load conspired to trigger this case.

I'll go over the patch once again and try to be sure we are never accidentally not adding the level information somewhere. If it looks good I'll file this problem in a separate ticket.
                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Dave Brosius (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221518#comment-13221518 ] 

Dave Brosius commented on CASSANDRA-3952:
-----------------------------------------

Could the SSTableReader just have a field that held what level it was, and is set when LeveledManifest.add and .remove were called?
                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223091#comment-13223091 ] 

Peter Schuller edited comment on CASSANDRA-3952 at 3/6/12 8:41 AM:
-------------------------------------------------------------------

Committed with an additional assertion and the map renamed to {{sstableGenerations}}, and including 1.1.0. It was marked for 1.1.1 but frankly if this *does* introduce some kind of bug, it feels more dangerous to have that crop up in an upgrade to 1.1.1 than to have it in the initial release.
                
      was (Author: scode):
    Committed, including 1.1.0. It was marked for 1.1.1 but frankly if this *does* introduce some kind of bug, it feels more dangerous to have that crop up in an upgrade to 1.1.1 than to have it in the initial release.
                  
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.0
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223060#comment-13223060 ] 

Peter Schuller commented on CASSANDRA-3952:
-------------------------------------------

I tried the patch on a local node with pre-existing data on which I altered the CF for leveled compaction strategy. One to a few compactions into taking stress writes I got the below trace. With the patch applied, line 185 is the "newLevel > 0" assertion.

I am not sure yet whether this has anything to do at all with the patch though; I don't think so. Investigating.

{code}
java.lang.AssertionError
        at org.apache.cassandra.db.compaction.LeveledManifest.promote(LeveledManifest.java:185)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.handleNotification(LeveledCompactionStrategy.java:135)
        at org.apache.cassandra.db.DataTracker.notifySSTablesChanged(DataTracker.java:526)
        at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:249)
        at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:948)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:204)
        at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:46)
        at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:128)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
{code}

                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3952:
--------------------------------------

    Reviewer: scode
    Assignee: Dave Brosius
    
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221525#comment-13221525 ] 

Jonathan Ellis commented on CASSANDRA-3952:
-------------------------------------------

Good idea.  It kind of balances out, since if there's a lot of sstables that *aren't* in the manifest, then the manifest will be smaller and the {{contains}} calls correspondingly faster.

I'd rather avoid adding compactionstrategy-specific fields elsewhere if we can help it, but I think we can get the same effect with a temporary {{Map<sstable, level>}} or even just {{Set<sstable>}} that we build during manifest creation?
                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Dave Brosius (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Brosius updated CASSANDRA-3952:
------------------------------------

    Attachment: speed_up_level_of.diff

simple use of map<SSTableReader, Integer> to improve levelOf calculations.

applied against trunk.
                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3952:
--------------------------------------

    Fix Version/s:     (was: 1.1.0)
    
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 1.1.1
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223085#comment-13223085 ] 

Peter Schuller commented on CASSANDRA-3952:
-------------------------------------------

This is already tracked by CASSANDRA-3989 being re-opened.
                
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Dave Brosius
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>         Attachments: speed_up_level_of.diff
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3952) avoid quadratic startup time in LeveledManifest

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3952:
--------------------------------------

    Assignee:     (was: Jonathan Ellis)
      Labels: lhf  (was: )
    
> avoid quadratic startup time in LeveledManifest
> -----------------------------------------------
>
>                 Key: CASSANDRA-3952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>              Labels: lhf
>             Fix For: 1.1.1
>
>
> Checking that each sstable is in the manifest on startup is O(N**2) in the number of sstables:
> {code}
> .       // ensure all SSTables are in the manifest
>         for (SSTableReader ssTableReader : cfs.getSSTables())
>         {
>             if (manifest.levelOf(ssTableReader) < 0)
>                 manifest.add(ssTableReader);
>         }
> {code}
> {code}
> private int levelOf(SSTableReader sstable)
> {
>     for (int level = 0; level < generations.length; level++)
>     {
>         if (generations[level].contains(sstable))
>             return level;
>     }
>     return -1;
> }
> {code}
> Note that the contains call is a linear List.contains.
> We need to switch to a sorted list and bsearch, or a tree, to support TB-levels of data in LeveledCompactionStrategy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira