Posted to commits@cassandra.apache.org by "Andrés de la Peña (JIRA)" <ji...@apache.org> on 2014/08/01 14:51:39 UTC

[jira] [Updated] (CASSANDRA-7575) Custom 2i validation

     [ https://issues.apache.org/jira/browse/CASSANDRA-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrés de la Peña updated CASSANDRA-7575:
-----------------------------------------

    Attachment: 2i_validation_v3.patch

> Custom 2i validation
> --------------------
>
>                 Key: CASSANDRA-7575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7575
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Andrés de la Peña
>            Assignee: Andrés de la Peña
>            Priority: Minor
>              Labels: 2i, cql3, secondaryIndex, secondary_index, select
>             Fix For: 2.1.1
>
>         Attachments: 2i_validation.patch, 2i_validation_v2.patch, 2i_validation_v3.patch
>
>
> There are several projects using custom secondary indexes as an extension point to integrate C* with other systems such as Solr or Lucene. The usual approach is to embed third-party indexing queries in CQL clauses.
> For example, [DSE Search|http://www.datastax.com/what-we-offer/products-services/datastax-enterprise] embeds Solr syntax this way:
> {code}
> SELECT title FROM solr WHERE solr_query='title:natio*';
> {code}
> [Stratio platform|https://github.com/Stratio/stratio-cassandra] embeds custom JSON syntax for searching in Lucene indexes:
> {code}
> SELECT * FROM tweets WHERE lucene='{
>     filter : {
>         type: "range",
>         field: "time",
>         lower: "2014/04/25",
>         upper: "2014/04/1"
>     },
>     query  : {
>         type: "phrase", 
>         field: "body", 
>         values: ["big", "data"]
>     },
>     sort  : {fields: [ {field:"time", reverse:true} ] }
> }';
> {code}
> Tuplejump [Stargate|http://tuplejump.github.io/stargate/] also uses Stratio's open-source JSON syntax:
> {code}
> SELECT name,company FROM PERSON WHERE stargate ='{
>     filter: {
>         type: "range",
>         field: "company",
>         lower: "a",
>         upper: "p"
>     },
>     sort:{
>        fields: [{field:"name",reverse:true}]
>     }
> }';
> {code}
> These syntaxes are validated by the corresponding 2i implementation. This validation happens after StorageProxy has distributed the command, so, as far as I know, there is no way to give CQL users rich feedback about syntax errors.
> I'm uploading a patch that tries to improve this. I propose adding an empty validation method to SecondaryIndexSearcher that custom 2i implementations can override:
> {code}
> public void validate(List<IndexExpression> clause) {}
> {code}
> And call it from SelectStatement#getRangeCommand:
> {code}
> ColumnFamilyStore cfs = Keyspace.open(keyspace()).getColumnFamilyStore(columnFamily());
> for (SecondaryIndexSearcher searcher : cfs.indexManager.getIndexSearchersForQuery(expressions))
> {
>     try
>     {
>         searcher.validate(expressions);
>     }
>     catch (RuntimeException e)
>     {
>         // Pass the index's own message on to the CQL client when it provides one.
>         String exceptionMessage = e.getMessage();
>         if (exceptionMessage != null && !exceptionMessage.trim().isEmpty())
>             throw new InvalidRequestException("Invalid index expression: " + exceptionMessage);
>         else
>             throw new InvalidRequestException("Invalid index expression");
>     }
> }
> {code}
> In this way C* allows custom 2i implementations to give feedback about syntax errors.
> We are currently using these changes in a fork with no problems.
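
For illustration only, the proposal above can be exercised outside Cassandra with a small self-contained sketch. Every type here is a simplified stand-in, not the real Cassandra class: IndexExpression, InvalidRequestException and SecondaryIndexSearcher are reduced to the minimum needed, while JsonQuerySearcher, ValidationSketch and validateAll are hypothetical names that do not come from the patch. The JSON check is deliberately naive and only shows where a custom 2i would reject a malformed query.

```java
import java.util.List;

// Stand-in for Cassandra's IndexExpression (assumption: just a column name and a value).
final class IndexExpression {
    final String column;
    final String value;

    IndexExpression(String column, String value) {
        this.column = column;
        this.value = value;
    }
}

// Stand-in for Cassandra's InvalidRequestException.
class InvalidRequestException extends Exception {
    InvalidRequestException(String message) {
        super(message);
    }
}

// Stand-in for SecondaryIndexSearcher with the proposed no-op validation hook.
class SecondaryIndexSearcher {
    public void validate(List<IndexExpression> clause) {}
}

// Hypothetical custom searcher that rejects values that do not look like a JSON object.
class JsonQuerySearcher extends SecondaryIndexSearcher {
    @Override
    public void validate(List<IndexExpression> clause) {
        for (IndexExpression expr : clause) {
            String v = expr.value == null ? "" : expr.value.trim();
            if (!v.startsWith("{") || !v.endsWith("}"))
                throw new IllegalArgumentException(
                        "query on '" + expr.column + "' must be a JSON object");
        }
    }
}

final class ValidationSketch {
    // Mirrors the loop proposed for SelectStatement#getRangeCommand: any runtime
    // failure raised by a searcher is turned into a client-facing request error.
    static void validateAll(List<? extends SecondaryIndexSearcher> searchers,
                            List<IndexExpression> expressions)
            throws InvalidRequestException {
        for (SecondaryIndexSearcher searcher : searchers) {
            try {
                searcher.validate(expressions);
            } catch (RuntimeException e) {
                String message = e.getMessage();
                if (message != null && !message.trim().isEmpty())
                    throw new InvalidRequestException("Invalid index expression: " + message);
                throw new InvalidRequestException("Invalid index expression");
            }
        }
    }
}
```

Because the default validate does nothing, built-in indexes would be unaffected under this sketch; only searchers that override the hook can reject a query before it is distributed.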



--
This message was sent by Atlassian JIRA
(v6.2#6252)