Posted to commits@cassandra.apache.org by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org> on 2012/01/12 20:13:45 UTC

[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns and wide rows

    [ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185152#comment-13185152 ] 

Sylvain Lebresne commented on CASSANDRA-2474:
---------------------------------------------

The patch is coming along rather well I think, but I still have a few details to fix/clean up.

The patch does involve a number of language changes (which I strongly believe are for the best), and the introduction of a native way to deal with wide rows has a bunch of consequences. So let me list a number of those changes and see which ones you disagree with, so we can discuss them sooner rather than later:

* The '..' notation for slices is removed. It just doesn't work with the new way to handle wide rows (and composites).
* I've made static definitions (i.e., basically those definitions that don't use COMPACT STORAGE) really static. That is, inserting a column that is not defined is not accepted, and there is no way to do something that translates to a column slice on a static CF. The reason is twofold: I think it is good for users to have validation that they haven't made a basic typo when inserting a column (thus inserting the wrong column). But also, I believe that it makes the language much more coherent.
* Column definitions for static (and sparse) CF are all UTF8 strings (as discussed above).
* I'm going to make all column definitions case insensitive, unless they are enclosed in single or double quotes (I've heard that's more or less how PostgreSQL does it). If you have a better idea ...
* My current patch doesn't really allow creating a secondary index on a non-static CF, because it's not clear how that would work from the syntax and I'm not sure I see a good use for it. Does that bother anyone? :)
* Currently, prepared statements allow you to write {{SELECT ? FROM users WHERE ? > 3}}. I.e., you can have a marker in place of a column definition (by 'column definition' I basically mean a name defined in the CREATE TABLE). I think it could make sense to disallow that. The rationale is that this patch makes select more complicated: it has more work to do to validate that the query is correct and to generate the query itself. If we were to disallow markers in place of column definitions, a fair amount of that preprocessing could in theory be done during the preparation phase, which would be faster and would allow a number of errors to be returned more quickly. Besides, column definitions are small strings, so allowing markers for them doesn't seem to be any real win.
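
To make the case-insensitivity proposal above concrete, here is a minimal Python sketch of one way it could behave. It is only an illustration, not the patch's code: the helper name {{normalize_identifier}} is hypothetical, and folding unquoted names to lower case is an assumption borrowed from how PostgreSQL treats unquoted identifiers.

```python
def normalize_identifier(token: str) -> str:
    """Return the column name a parser would use for a raw identifier token.

    Assumption (not taken from the patch): unquoted names are folded to
    lower case, while names wrapped in matching single or double quotes
    keep their case exactly as written.
    """
    quoted = (
        len(token) >= 2
        and token[0] == token[-1]
        and token[0] in ("'", '"')
    )
    if quoted:
        # Strip the surrounding quotes and preserve the original case.
        return token[1:-1]
    # Unquoted identifiers compare case-insensitively, so fold them.
    return token.lower()


if __name__ == "__main__":
    for raw in ("UserName", "username", '"UserName"', "'Age'"):
        print(raw, "->", normalize_identifier(raw))
```

Under these assumptions, {{UserName}} and {{username}} name the same column, while {{"UserName"}} names a different, case-sensitive one.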

Now, I do have a problem I'm not sure how to deal with: the limit for slices. The problem is that a slice on a wide row will now return a number of CqlRows, not one with many columns. So it sounds like we should use the LIMIT. However, when you do, say, {{SELECT * FROM CF;}} (or if you query multiple keys with IN), it's unclear what you do with the LIMIT. I unfortunately don't have a good solution for that yet.
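
The ambiguity can be illustrated with a small Python sketch. Everything in it is hypothetical (the in-memory model, the function names and the sample data); it only contrasts two possible readings of LIMIT once each wide row is returned as several result rows, and neither reading is claimed to be what the patch does.

```python
from typing import Dict, List, Tuple

# Hypothetical model: row key -> ordered list of (column name, value).
WideRows = Dict[str, List[Tuple[str, str]]]
ResultRow = Tuple[str, str, str]


def transpose(rows: WideRows) -> List[ResultRow]:
    """Flatten each wide row into one result row per column."""
    return [(key, name, value)
            for key, columns in rows.items()
            for name, value in columns]


def limit_total(rows: WideRows, limit: int) -> List[ResultRow]:
    """Reading 1: LIMIT caps the total number of result rows returned."""
    return transpose(rows)[:limit]


def limit_per_key(rows: WideRows, limit: int) -> List[ResultRow]:
    """Reading 2: LIMIT caps the slice taken from each row key."""
    return transpose({key: columns[:limit] for key, columns in rows.items()})


if __name__ == "__main__":
    data = {
        "k1": [("a", "1"), ("b", "2"), ("c", "3")],
        "k2": [("a", "4"), ("b", "5")],
    }
    print(limit_total(data, 2))    # only rows from k1 survive
    print(limit_per_key(data, 2))  # two rows from each of k1 and k2
```

With a limit of 2, the first reading never reaches the second key at all, while the second reading returns four rows for a query that asked for a limit of 2, which is why neither is obviously right for a multi-key query.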

I'm also not sure that having REVERSED just after the SELECT is still the right choice after this patch.

I'll also note that this patch (with the things above) makes it so that in the CqlMetadata (in the result), the name_types are not useful anymore (since the names are always UTF8), and neither are the default comparator and default validation types, as we can always set the type of each column in the value_types map. I suppose the best way to proceed is to mark them deprecated for now and remove them in the next version.

In any case, I'm attaching the dtest file I'm using to test this patch (cql_tests.py) because I think it's a great way to see what this patch is about.

Lastly, because of the scale of the changes in this patch, I don't think it is a good candidate for hiding behind a runtime flag. It would be messy, and I think it would make it less clear that both:
* previous behavior is unchanged without the flag turned on
* new behavior works as expected when the flag is on

                
> CQL support for compound columns and wide rows
> ----------------------------------------------
>
>                 Key: CASSANDRA-2474
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Sylvain Lebresne
>            Priority: Critical
>              Labels: cql
>             Fix For: 1.1
>
>         Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, cql_tests.py, raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg
>
>
> For the most part, this boils down to supporting the specification of compound column names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to create structures from the results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira