Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/07/16 21:28:14 UTC

[jira] Created: (CASSANDRA-299) make table directory creation lazy

make table directory creation lazy
----------------------------------

                 Key: CASSANDRA-299
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Jonathan Ellis
            Priority: Minor


checking that each subdir for each table is present on startup -- _every_ startup -- could be a real pita.

i think that to support 100k tables (not impossible, in a hosted-cassandra-as-a-service scenario) we're going to want to make table dir creation lazy.

then we would want to make scanning for sstables faster by only doing one listdir call per datadir, to see which table subdirs are present, and then checking only those for sstable files.  this would involve some re-org of the onstart code.

(note that we don't want to prune directories if there are no sstables left in them, since we'd end up re-creating them at some point anyway; we just want to allow the lack of a table subdir to imply the same thing as an empty one.)
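
To make the last point concrete, here is a minimal sketch of how a missing table subdir could be read as an empty one, with creation deferred until something is written. The class LazyTableDirs, its method names and the use of java.io.File are assumptions for illustration only; none of this is taken from the Cassandra source.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Cassandra.
final class LazyTableDirs {
    private LazyTableDirs() {}

    // Read path: a missing table subdir yields the same result as an empty one.
    static List<File> listDataFiles(File dataDir, String table) {
        File[] files = new File(dataDir, table).listFiles();
        // listFiles() returns null when the path is not an existing directory.
        return files == null ? Collections.<File>emptyList() : Arrays.asList(files);
    }

    // Write path: create the subdir only when something is about to be written.
    static File dirForWrite(File dataDir, String table) {
        File dir = new File(dataDir, table);
        if (!dir.isDirectory() && !dir.mkdirs() && !dir.isDirectory()) {
            throw new IllegalStateException("could not create " + dir);
        }
        return dir;
    }
}
```

With this shape, startup never has to touch a table that has not been written to, and an emptied directory can simply be left in place.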

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743449#action_12743449 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

I really think we can do better than a factory method that has to be called twice.  Doesn't that feel ... wrong to you deep down inside? :)

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743430#action_12743430 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

this makes me nervous...  requiring Table.open() to be called a second time lazily, at which point the directories are created, is the kind of thing that's guaranteed to bite us eventually when someone forgets and grabs a Table object directly from instances_ (for example).

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Issue Comment Edited: (CASSANDRA-299) make table directory creation lazy

Posted by "Arin Sarkissian (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743446#action_12743446 ] 

Arin Sarkissian edited comment on CASSANDRA-299 at 8/14/09 2:09 PM:
--------------------------------------------------------------------

but instances_ is only used in Table.open() and the constructor's private... should be pretty safe IMO.

Table.open() is the only public interface to the constructor or instances_... aka it's the only public way to get a Table

      was (Author: phatduckk):
    but instances_ is only used in Table.open() and the constructor's private... should be pretty safe IMO.

Table.open() is the only public interface to the constructor or _instances...
  
> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757801#action_12757801 ] 

Chris Goffinet commented on CASSANDRA-299:
------------------------------------------

Taking this over.

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Updated: (CASSANDRA-299) make table directory creation lazy

Posted by "Michael Greene (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Greene updated CASSANDRA-299:
-------------------------------------

    Component/s: Core

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732104#action_12732104 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

(see CASSANDRA-276 for the original put-tables-in-subdirs ticket)

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Priority: Minor
>


[jira] Assigned: (CASSANDRA-299) make table directory creation lazy

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Goffinet reassigned CASSANDRA-299:
----------------------------------------

    Assignee: Chris Goffinet

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Updated: (CASSANDRA-299) make table directory creation lazy

Posted by "Arin Sarkissian (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arin Sarkissian updated CASSANDRA-299:
--------------------------------------

    Attachment: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch

here's a patch that:

doesn't create all database dirs on startup.
Table.open now optionally checks whether a Table needs a dir.
reworked how CassandraDaemon starts tables, so that a Table without a dir in any DataDirectory does not get started at startup.

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Resolved: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-299.
--------------------------------------

    Resolution: Won't Fix
      Assignee:     (was: Chris Goffinet)

This isn't going to be a priority until someone starts running with way more keyspaces than anyone is currently.  Closing for now.

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Arin Sarkissian (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742013#action_12742013 ] 

Arin Sarkissian commented on CASSANDRA-299:
-------------------------------------------

http://github.com/phatduckk/Cassandra/commit/e2b5774db232f1d419e22089444682dbee1d3f51

What's going on in the patch is that tables are no longer open()'ed and onStart()'ed in CassandraDaemon.

The whole process is lazy...
When a table is first opened we will make sure it has a dir in each <DataFileDirectories>
Once that's done we start the table which will start each of its CF
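
As a rough illustration of that flow (a sketch only, not the code in the linked commit): the first open() of a table ensures its dir exists in every data directory and then starts it, while later calls return the cached instance. LazyTable, its fields and the open(name, dataFileDirectories) signature are hypothetical names chosen for this example, and onStart() here is a placeholder for starting each column family.

```java
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the flow described above; not the actual patch.
final class LazyTable {
    private static final Map<String, LazyTable> instances = new HashMap<>();
    private final String name;
    private boolean started;

    private LazyTable(String name) { this.name = name; }

    // First open of a table: ensure its dir exists in every data directory,
    // then start it. Later calls return the cached instance.
    static synchronized LazyTable open(String name, List<File> dataFileDirectories) {
        LazyTable table = instances.get(name);
        if (table == null) {
            for (File dataDir : dataFileDirectories) {
                File dir = new File(dataDir, name);
                if (!dir.isDirectory() && !dir.mkdirs() && !dir.isDirectory()) {
                    throw new IllegalStateException("could not create " + dir);
                }
            }
            table = new LazyTable(name);
            table.onStart();
            instances.put(name, table);
        }
        return table;
    }

    // Placeholder: the real onStart() would start each column family.
    private void onStart() { started = true; }

    boolean isStarted() { return started; }

    String name() { return name; }
}
```

Because the constructor and the instance map are private, open() is the only path to a LazyTable in this sketch, so the directory check cannot be skipped by callers.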

If I understand correctly, what I've done so far is just half of what Jonathan mentioned in his comment. It sounds like Jonathan's saying we should still open() and onStart() Tables in CassandraDaemon, but only those that have subdirs in <DataFileDirectories>.

Combining that with my lazy open()/onStart() refactor will work, but I want to make sure that's what you're getting at before I submit a patch...

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742024#action_12742024 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

btw, I'm not a huge fan of reviewing stuff on github, because i'm not allowed to commit it (with or without minor tweaks) if I approve :)

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742015#action_12742015 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

what i'm saying is, you should still open() all tables (i don't see the benefit in changing that, and it would cause some pain i think), but the scanning for data files should not be per-table but at a higher level to avoid O(N) system calls scanning datafiledirectories, when it really only needs to be O(1).  (of course the subdirs then need to be scanned per-table).

this involves breaking the current encapsulation but it will make a big difference if we want to support 10000s of Tables.  (and i think we do.)
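
A minimal sketch of the higher-level scan, assuming a hypothetical StartupScan class (the names and signature are illustrative, not Cassandra's API): each data directory is listed once, and only the table subdirs actually found there are handed on for per-table sstable scanning. The number of directory listings then depends on the number of data directories, not on the number of configured tables.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical startup scan; names are illustrative, not Cassandra's API.
final class StartupScan {
    private StartupScan() {}

    // Maps each configured table that has an entry in at least one data
    // directory to the subdirs found for it. Tables without any subdir are
    // simply absent from the result and need no further scanning.
    static Map<String, List<File>> presentTableDirs(Collection<File> dataDirs, Set<String> tables) {
        Map<String, List<File>> present = new HashMap<>();
        for (File dataDir : dataDirs) {
            String[] names = dataDir.list(); // one listing per data directory
            if (names == null) {
                continue; // data directory missing or unreadable
            }
            for (String name : names) {
                // Entries are matched by name only; a real scan would also
                // confirm the entry is a directory before descending into it.
                if (tables.contains(name)) {
                    present.computeIfAbsent(name, k -> new ArrayList<>()).add(new File(dataDir, name));
                }
            }
        }
        return present;
    }
}
```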

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846695#action_12846695 ] 

Gary Dusbabek commented on CASSANDRA-299:
-----------------------------------------

We'll get this for free when CASSANDRA-861 goes in.

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Arin Sarkissian (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743446#action_12743446 ] 

Arin Sarkissian commented on CASSANDRA-299:
-------------------------------------------

but instances_ is only used in Table.open() and the constructor's private... should be pretty safe IMO.

Table.open() is the only public interface to the constructor or instances_...

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Updated: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-299:
-------------------------------------

    Fix Version/s:     (was: 0.5)

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>            Priority: Minor
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750593#action_12750593 ] 

Chris Goffinet commented on CASSANDRA-299:
------------------------------------------

Arin, can I take this one over?

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: 0001-lazy-creation-of-Table-dirs.-only-open-tables-that-h.patch
>
>


[jira] Commented: (CASSANDRA-299) make table directory creation lazy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742027#action_12742027 ] 

Jonathan Ellis commented on CASSANDRA-299:
------------------------------------------

(See http://wiki.apache.org/cassandra/GitAndJIRA for how to take some of the suck out of manually uploading patches.)

> make table directory creation lazy
> ----------------------------------
>
>                 Key: CASSANDRA-299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-299
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>