Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2015/07/15 17:18:05 UTC

[jira] [Comment Edited] (CASSANDRA-9705) Simplify some of 8099's concrete implementations

    [ https://issues.apache.org/jira/browse/CASSANDRA-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628212#comment-14628212 ] 

Benedict edited comment on CASSANDRA-9705 at 7/15/15 3:17 PM:
--------------------------------------------------------------

bq. But what is currently called ComplexColumnData cannot reasonably implement Cell, not with the current definition/API of Cell at least.

I guess, to clarify, I'm somewhat taking exception to the use of {{Column}} in the name alone, without any reference to the {{Row}} component. A {{ColumnFilter}}, for instance, does not filter {{ColumnData}}, but filters {{ColumnDefinition}}. A {{RowFilter}} filters {{ColumnData}}, however (or the whole {{Row}}). This is inconsistent, and I would rather we made it consistent. Even renaming to {{RowDatum}} would be a little more consistent, but still not very consistent.

So my proposal is that we overload the _logical concept_ of a cell (lowercase) to mean {{(Clustering Prefix, Column) -> Values}}, and special case to simple cells (where {{card(Values) <= 1}}) and complex cells (where {{card(Values) >= 1}}). Another way of formulating this is that a cell is the collection of {{(Clustering Prefix, Column) -> Value}} maps, but a simple cell is one where this map is guaranteed to be of size at most 1. This could be achieved by:

* Renaming {{ComplexColumnData}} to {{ComplexCell}}; and
* either:
** Renaming {{ColumnData}} to {{AbstractCell}} (or, perhaps, {{RowCell}}, or any other {{XCell}}); or
** Renaming {{Cell}} to {{SimpleCell}}, and {{ColumnData}} to {{Cell}}

The latter would also go some way towards normalising the simple/complex distinction across the codebase, I think. However, it may mean typing "Simple" a bit more than we otherwise would.
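
To make the second option a bit more concrete, here is a minimal sketch of what the naming could look like. None of this is the actual 8099 code: the {{String}} stand-ins for the column definition, cell path and value, the method names, and the assumption that the clustering prefix is carried by the enclosing row are all hypothetical, chosen only to illustrate the {{card(Values)}} split described above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch only: these are NOT the real Cassandra types.
// A logical cell maps (clustering prefix, column) to its values; the
// clustering prefix is assumed to live on the enclosing row.
interface Cell
{
    String column();     // stand-in for a ColumnDefinition
    int cardinality();   // card(Values)
    boolean isSimple();
}

// Simple cell: card(Values) <= 1.
final class SimpleCell implements Cell
{
    private final String column;
    private final String value; // null models "no value"

    SimpleCell(String column, String value)
    {
        this.column = column;
        this.value = value;
    }

    public String column() { return column; }
    public int cardinality() { return value == null ? 0 : 1; }
    public boolean isSimple() { return true; }

    Optional<String> value() { return Optional.ofNullable(value); }
}

// Complex cell: card(Values) >= 1, each value addressed by a cell path.
final class ComplexCell implements Cell
{
    private final String column;
    private final Map<String, String> valuesByPath;

    ComplexCell(String column, Map<String, String> valuesByPath)
    {
        if (valuesByPath.isEmpty())
            throw new IllegalArgumentException("a complex cell needs at least one value");
        this.column = column;
        // Defensive, sorted, read-only copy of the (cell path -> value) entries.
        this.valuesByPath = Collections.unmodifiableMap(new TreeMap<>(valuesByPath));
    }

    public String column() { return column; }
    public int cardinality() { return valuesByPath.size(); }
    public boolean isSimple() { return false; }

    Map<String, String> values() { return valuesByPath; }
}
```

With names like these, anything that filters row contents would operate on {{Cell}}, and only code that cares about the distinction would need to mention {{SimpleCell}} or {{ComplexCell}}.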

I'll note that, in my opinion at least, this is more of a clarification of the prior concept of cell that naturally follows from the other internal engine refactoring (previously it was also defined as {{(clustering prefix, column) -> value}}; we've just now introduced an extra "cell path"). This has previously been a bit of a gray zone, nomenclature-wise, and very few people will have used the verbiage around complex data internals, since very few people interact with it (or understand it). I'm hoping both will change for the better post 8099, but making our language a bit clearer will no doubt help.


was (Author: benedict):
bq. But what is currently called ComplexColumnData cannot reasonably implement Cell, not with the current definition/API of Cell at least.

I guess, to clarify, I'm somewhat taking exception to the use of {{Column}} in the name alone, without any reference to the {{Row}} component. A {{ColumnFilter}}, for instance, does not filter {{ColumnData}}, but filters {{ColumnDefinition}}. A {{RowFilter}} filters {{ColumnData}}, however (or the whole {{Row}}). This is inconsistent, and I would rather we made it consistent. Even renaming to {{RowDatum}} would be a little more consistent, but still not very consistent.

So my proposal is that we overload the _logical concept_ of a cell (lowercase) to mean {{(Clustering Prefix, Column) -> Values}}, and special case to simple cells (where {{card(Values) <= 1}}) and complex cells (where {{card(Values) >= 1}}). Another way of formulating this is that a cell is the collection of {{(Clustering Prefix, Column) -> Value}} maps, but a simple cell is one where this map is guaranteed to be of size at most 1. This could be achieved by:

* Renaming {{ComplexColumnData}} to {{ComplexCell}}; and
* either:
** Renaming {{ColumnData}} to {{AbstractCell}} (or, perhaps, {{RowCell}}, or any other {{XCell}}); or
** Renaming {{Cell}} to {{SimpleCell}}, and {{ColumnData}} to {{Cell}}

The latter would also go some way towards normalising the simple/complex distinction across the codebase, I think. However, it may mean typing "Simple" a bit more than we otherwise would.

I'll note that, in my opinion at least, this is more of a clarification of the prior concept of cell that naturally follows from the other internal engine refactoring. This has previously been a bit of a gray zone, nomenclature-wise, and very few people will have used the verbiage around complex data internals, since very few people interact with it (or understand it). I'm hoping both will change for the better post 8099, but making our language a bit clearer will no doubt help.

> Simplify some of 8099's concrete implementations
> ------------------------------------------------
>
>                 Key: CASSANDRA-9705
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9705
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0 beta 1
>
>
> As mentioned in the ticket comments, some of the concrete implementations (for Cell, Row, Clustering, PartitionUpdate, ...) of the initial patch for CASSANDRA-8099 are more complex than they should be (the use of flyweights in particular is probably ill-fitted), which probably has performance consequences. This ticket is to track the refactoring/simplifying of those implementations (mainly by removing the use of flyweights and simplifying accordingly).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)