Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/10/29 19:33:22 UTC

[jira] Created: (CASSANDRA-1684) Entity groups

Entity groups
-------------

                 Key: CASSANDRA-1684
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1684
             Project: Cassandra
          Issue Type: New Feature
          Components: Core
            Reporter: Jonathan Ellis
             Fix For: 0.8


Supporting entity groups similar to App Engine's (that is, allow rows to be part of a parent "entity group," whose key is used for routing instead of the row itself) allows several improvements:

 - batches within an EG can be atomic across multiple rows
 - order-by-value queries within an EG only have to touch a single replica even with RandomPartitioner
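
A minimal sketch of the routing idea, not taken from the ticket: the token is computed from the parent group key when a row has one, so every row in the group lands on the same replica. MD5 is used as a stand-in hash, and the names token, owner, route and NODES, as well as the four-node ring, are illustrative assumptions.

```python
import bisect
import hashlib
from typing import List, Optional


def token(routing_key: str) -> int:
    """Hash a key to a 128-bit integer token (MD5, as a stand-in)."""
    digest = hashlib.md5(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def owner(routing_key: str, node_tokens: List[int]) -> int:
    """Return the index of the node owning the key.

    node_tokens must be sorted; the owner is the first node whose token
    is >= the key's token, wrapping around to node 0.
    """
    i = bisect.bisect_left(node_tokens, token(routing_key))
    return i % len(node_tokens)


def route(row_key: str, group_key: Optional[str], node_tokens: List[int]) -> int:
    """Route by the entity group key when there is one, else by the row key."""
    routing_key = group_key if group_key is not None else row_key
    return owner(routing_key, node_tokens)


# Four hypothetical nodes with evenly spaced tokens.
NODES = [i * (2 ** 128 // 4) for i in range(1, 5)]
```

With this, route("user42:profile", "user42", NODES) and route("user42:inbox", "user42", NODES) always return the same node, whatever the row keys hash to on their own.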

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155949#comment-13155949 ] 

Jonathan Ellis commented on CASSANDRA-1684:
-------------------------------------------

Do we really need row groups now that we can have arbitrary nesting within a row via composite columns?  Looked at that way the row key itself becomes the "entity group id."
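
A rough illustration of that view, under the assumption that a row can be modelled as a list of columns sorted by composite name: the first component of each name stands in for what would otherwise be a separate row key, and a prefix slice reads one "nested row". The names slice_by_prefix, ROW_KEY and ROW are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple

Column = Tuple[Tuple[str, ...], str]  # (composite column name, value)


def slice_by_prefix(columns: List[Column], prefix: Tuple[str, ...]) -> List[Column]:
    """Return the columns whose composite name starts with prefix.

    columns must be sorted by name, as columns within a row are.
    """
    names = [name for name, _ in columns]
    start = bisect_left(names, prefix)
    result = []
    for name, value in columns[start:]:
        if name[:len(prefix)] != prefix:
            break
        result.append((name, value))
    return result


# One wide row keyed by a hypothetical group id; the first component of each
# column name plays the role of what would otherwise be a separate row key.
ROW_KEY = "group-1"
ROW = sorted([
    (("alice", "email"), "alice@example.com"),
    (("alice", "name"), "Alice"),
    (("bob", "email"), "bob@example.com"),
])
```

Here slice_by_prefix(ROW, ("alice",)) returns the two "alice" columns, and everything lives under the single row key ROW_KEY.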
                

[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "T Jake Luciani (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156133#comment-13156133 ] 

T Jake Luciani commented on CASSANDRA-1684:
-------------------------------------------

bq. Do we really need row groups now that we can have arbitrary nesting within a row via composite columns?

What about secondary indexes?  Unless we add composite secondary indexes.

                

[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005650#comment-13005650 ] 

Jonathan Ellis commented on CASSANDRA-1684:
-------------------------------------------

By "like App Engine [Megastore]" we only mean "atomic within a group," not "consistent and isolated within a group."  The former is useful even without the latter.


[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Patricio Echague (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018027#comment-13018027 ] 

Patricio Echague commented on CASSANDRA-1684:
---------------------------------------------

+1 for "Row Groups"


[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Ed Anuff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998732#comment-12998732 ] 

Ed Anuff commented on CASSANDRA-1684:
-------------------------------------

This is something I've been thinking about while reducing the number of column families within an application: I ended up with row keys constructed by concatenating an entity id with various other strings (e.g. 9081bd70-3fe4-11e0-9207-0800200c9a66:something).  Is it feasible to have a partitioner that hashes on just the first x bytes of a key?  Do tokens have to be one-to-one with keys, or could multiple keys share the same token? (Apparently that's currently possible with the RandomPartitioner, although only as an extreme edge case.)
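
A small sketch of what hashing on a key prefix could look like, assuming the entity id is a 36-byte textual UUID at the start of the key. MD5 is a stand-in hash, and prefix_token, sort_key and PREFIX_LEN are made-up names, not an existing partitioner API.

```python
import hashlib

# Length of a textual UUID such as the one in the key above (an assumption
# about the key layout; any fixed x would work the same way).
PREFIX_LEN = 36


def prefix_token(row_key: bytes, prefix_len: int = PREFIX_LEN) -> int:
    """Token computed from only the first prefix_len bytes of the key."""
    return int.from_bytes(hashlib.md5(row_key[:prefix_len]).digest(), "big")


def sort_key(row_key: bytes, prefix_len: int = PREFIX_LEN):
    """Order rows by (token, full key) so that rows sharing a token still
    have a well-defined total order."""
    return (prefix_token(row_key, prefix_len), row_key)


ENTITY = b"9081bd70-3fe4-11e0-9207-0800200c9a66"
KEYS = [ENTITY + b":something", ENTITY + b":other"]
```

Both keys in KEYS get the same token, so they would be routed together; the (token, key) pair is one possible way to keep them distinguishable once tokens are no longer unique per key.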


[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179790#comment-13179790 ] 

Jonathan Ellis commented on CASSANDRA-1684:
-------------------------------------------

bq. Should we add a special "row group" api?

I really like how the composite PK model is shaking out over in CASSANDRA-2474.  Feels like that's the right model for this too, conceptually.  Implementation-wise, wide rows still come up short as noted above.

I'm starting to think that the right way to implement this is to start to erase the distinction between row and column lookups.  Which is basically where Stu was going in CASSANDRA-674.
                

[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Edward Ribeiro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982203#action_12982203 ] 

Edward Ribeiro commented on CASSANDRA-1684:
-------------------------------------------

CIDR 2011 Megastore paper: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf

Any development already started on this issue?



[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005665#comment-13005665 ] 

Sylvain Lebresne commented on CASSANDRA-1684:
---------------------------------------------

I'd add that it's pretty clear in my mind that we should end up calling them 'row groups' or something similar to avoid the confusion.


[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Ed Anuff (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155981#comment-13155981 ] 

Ed Anuff commented on CASSANDRA-1684:
-------------------------------------

I agree with Sylvain's points.  This does raise the question, though: if more optimizations were done on rows (allowing them to be even larger, etc.), would that be a better approach?  I'm personally all for that.
                

[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155965#comment-13155965 ] 

Sylvain Lebresne commented on CASSANDRA-1684:
---------------------------------------------

It is a good question, and I suppose it depends on what the motivation for row groups was in the first place (after all, we've always kind of been able to nest arbitrarily; we just have a (slightly) simpler way now).

For instance, if the goal is to make sure rows are collocated, having to do it with composites may not be very convenient, in particular if you want to collocate rows across multiple CFs. Of course it is always possible to redesign the model so that you use the same row key and use composites, but that could be really weird. To "solve" that last part, we could provide the row group API but encode it server side with composites.

However, I think we should be aware that such an encoding has limitations today:
* there is the same problem as with encoding super columns with composites, i.e. we'd need range tombstones.
* rows have a number of subtle limitations that are fine, but may be a bit less fine if you start to push for collocating lots and lots of data under one row:
** There is the 2B columns limit
** If a row is > 2GB, it won't be mmapped
** compaction is slower on big rows
** overall performance can be worse on huge rows
** leveled compaction has at least one row per sstable, which goes a bit against fixed-size sstables.
Don't get me wrong, for most cases this is probably fine and we likely want to improve on all of this, but those are still obstacles to co-locating large amounts of data under the same row.

Now maybe pushing the co-location of data is not a good idea for a distributed store (it obviously raises the question of load balancing in particular), but there are cases where careful co-location is paramount to the best performance, so giving a good tool for that could have value.

Doing row groups 'natively' would avoid the gotchas above, but note that it has at least one drawback: if/once we do CASSANDRA-2893, isolation for row groups encoded with composite types would be a given, whereas with 'native' row groups we would have to work a bit.

So overall, I think row groups could be of interest API-wise, making a number of models more natural. And if we think this is indeed useful, I kind of think doing it natively could be less of a headache overall than an encoding with composites.
                

[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005713#comment-13005713 ] 

Dave Revell commented on CASSANDRA-1684:
----------------------------------------

It sounds like everyone agrees.

+1 on Sylvain's idea to call them something other than "entity groups."


[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926406#action_12926406 ] 

Gary Dusbabek commented on CASSANDRA-1684:
------------------------------------------

Would they be static like App Engine, or would we permit dynamically adding/subtracting existing rows to an entity group, in effect, moving them?  

The G-Store paper explains one approach to this:  www.cs.ucsb.edu/~sudipto/papers/socc10-das.pdf



[jira] Assigned: (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-1684:
-----------------------------------------

    Assignee: Sylvain Lebresne



[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Dave Revell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005536#comment-13005536 ] 

Dave Revell commented on CASSANDRA-1684:
----------------------------------------

As jbellis says in the description, atomic batches and entity group locality are cool and useful. But we should be clear that Cassandra's entity groups would be a different beast than Megastore's entity groups, and wouldn't have the same consistency properties unless some un-Cassandra-like changes were made.

In Megastore, transactions can maintain arbitrary consistency constraints among items in an entity group, since there is a Paxos-agreed total order of transactions. Cassandra has so far avoided fancy distributed agreement like this. For example, imagine running (in Cassandra) two different transactions on two different replicas and imagine what mishmash of the two outcomes you'd get once timestamp-based conflict resolution happened. In Megastore one of the transactions would abort. Are we willing to add Paxos?
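
To make the "mishmash" concrete, here is a toy model of per-column, timestamp-based conflict resolution. It is an illustration under simplifying assumptions (integer timestamps, whole batches sharing one timestamp, hypothetical account columns), not a description of Cassandra's actual reconciliation code.

```python
from typing import Dict, Tuple

Column = Tuple[int, int]  # (value, timestamp)


def lww_merge(*versions: Dict[str, Column]) -> Dict[str, Column]:
    """Per-column last-write-wins: keep the value with the highest timestamp."""
    merged: Dict[str, Column] = {}
    for version in versions:
        for name, (value, ts) in version.items():
            if name not in merged or ts > merged[name][1]:
                merged[name] = (value, ts)
    return merged


# Three hypothetical accounts in one group; the invariant is a total of 300.
initial = {"a": (100, 0), "b": (100, 0), "c": (100, 0)}
# T1 (timestamp 10) moves 10 from a to b; T2 (timestamp 11) moves 50 from a to c.
# Each transaction read the initial state and ran on a different replica.
t1 = {"a": (90, 10), "b": (110, 10)}
t2 = {"a": (50, 11), "c": (150, 11)}

merged = lww_merge(initial, t1, t2)
total = sum(value for value, _ in merged.values())
```

After the merge, column a comes from T2 while b comes from T1 and c from T2, so total is 310 rather than 300: both batches were applied atomically, yet the invariant across the group is broken.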

G-Store's ownership transfer protocol also seems very anti-Cassandra-philosophy with its concept of single-replica item ownership.

I'd be happy to be corrected on any of this. I think Megastore-like entity groups are an exciting idea but perhaps make more sense on top of HBase :)


[jira] Commented: (CASSANDRA-1684) Entity groups

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002155#comment-13002155 ] 

Sylvain Lebresne commented on CASSANDRA-1684:
---------------------------------------------

bq. Do tokens have to be one-to-one unique with keys, or could you have multiple keys share the same token? (apparently that's currently possible, although an extreme edge case, with the RandomPartitioner)

Right now, they do have to be one-to-one. That's the 'raison d'être' of CASSANDRA-1034 (and I won't hide that my interest in the latter is motivated by this ticket, even though we should fix it because of RandomPartitioner anyway).

As for this ticket, I think using part of the key for the token is only the first step (but an important one). The main thing we want here is to apply mutations on an entity group consistently, that is, in one commit log transaction. That in turn is not very complicated in theory, but will be much more work in practice, I believe.
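
A toy sketch of the "one commit log transaction" idea, assuming a log where each record holds a whole batch for one group. The CommitLog class, its JSON record format and the sample keys are invented for illustration and do not reflect the real commit log.

```python
import json
from typing import Dict, List, Tuple

Mutation = Tuple[str, str, str]  # (row key, column name, value)


class CommitLog:
    """Toy append-only log; one record per entity-group batch."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def append_batch(self, group_key: str, mutations: List[Mutation]) -> None:
        # The whole batch is serialized into a single record, so a replay
        # sees either every mutation in the batch or none of them.
        self.records.append(json.dumps({"group": group_key, "mutations": mutations}))

    def replay(self) -> Dict[Tuple[str, str], Dict[str, str]]:
        """Rebuild {(group key, row key): {column: value}} from the log."""
        state: Dict[Tuple[str, str], Dict[str, str]] = {}
        for record in self.records:
            entry = json.loads(record)
            for row_key, column, value in entry["mutations"]:
                state.setdefault((entry["group"], row_key), {})[column] = value
        return state


log = CommitLog()
log.append_batch("user42", [("profile", "name", "Ann"), ("inbox", "unread", "3")])
state = log.replay()
```

The batch touches two rows of the group but produces a single log record, which is what makes it all-or-nothing on replay in this model.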

As a side note, I think it would also be nice to find "a trick" to make this work with the existing partitioners. Otherwise, since we can't change partitioners, this would be useful only for new clusters, which would be sad.



[jira] Updated: (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1684:
--------------------------------------

    Remaining Estimate: 80h
     Original Estimate: 80h



[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Daniel Doubleday (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156596#comment-13156596 ] 

Daniel Doubleday commented on CASSANDRA-1684:
---------------------------------------------

Two use cases where same row does not work for us:

- Read/Write intense CFs where we need row caching but cannot cache all values due to their size (CASSANDRA-1956 in its current form will not help there)
- Heavy update CFs where we use changing (versioned) row keys to avoid multiple-sstable-reads
                

[jira] [Resolved] (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1684.
---------------------------------------

       Resolution: Won't Fix
    Fix Version/s:     (was: 1.2)
         Assignee:     (was: Sylvain Lebresne)
    

[jira] [Commented] (CASSANDRA-1684) Entity groups

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170484#comment-13170484 ] 

Jonathan Ellis commented on CASSANDRA-1684:
-------------------------------------------

bq. if there were more optimizations done on rows (allowed them to be even larger, etc.), would that be a better approach? 

I think it would be.  That's definitely a long-term play, though.  I only have ideas on how to fix some of the problems Sylvain raised.  And then there's others like CASSANDRA-3362.

But we kind of need to fix large rows independent of the entity group idea.

bq. Two use cases where same row does not work for us:

Both of these sound basically like workarounds for weaknesses elsewhere.  Which again makes it feel like the right answer is to fix those weaknesses rather than adding another layer of hack on top.

I guess there's really two questions here:
- Should we add a special "row group" api?
- What should the implementation look like?

In other words, we could add a row group api and implement it in terms of large rows.  Or implement it another way.  But, we want wide rows that work "well" independent of row groups, so it feels like that's the right place to spend our efforts now.
                