Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2015/05/02 18:25:06 UTC

[jira] [Updated] (SOLR-7377) SOLR Streaming Expressions

     [ https://issues.apache.org/jira/browse/SOLR-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein updated SOLR-7377:
---------------------------------
    Attachment: SOLR-7377.patch

New patch with the following changes:

1) ParallelStream can use either Object serialization or a streaming expression as its transport mechanism. When the ParallelStream is created from a StreamingExpression, it defaults to streaming expression transport. When it's created through the Streaming API, it defaults to Object serialization.

2) Added parallel tests for merge, unique, group, and top functions. Sample syntax:
{code}
    parallel(collection1,
             merge(
                   search(collection1, q="id:(4 1 8 9)", fl="id,a_s,a_i", sort="a_i desc", partitionKeys="a_i"),
                   search(collection1, q="id:(0 2 3 6)", fl="id,a_s,a_i", sort="a_i desc", partitionKeys="a_i"),
                   on="a_i desc"),
             workers="2",
             sort="a_i desc")
{code}

3) Added the ExpressionRunner class; it still needs tests.
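
To make the role of partitionKeys and workers in the sample from item 2 more concrete, here is a minimal sketch of how tuples might be divided among workers. It assumes tuples are routed by hashing the partition key value modulo the worker count; the class and method names (PartitionSketch, workerFor, partition) are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the code in SOLR-7377.patch.
class PartitionSketch {

    // Assumption: a worker is chosen by hashing the partition key value
    // and taking it modulo the number of workers.
    static int workerFor(Object keyValue, int workers) {
        return Math.floorMod(keyValue.hashCode(), workers);
    }

    // Splits tuples (modelled here as simple maps) into one bucket per worker.
    // Tuples that share a key value always land in the same bucket.
    static List<List<Map<String, Object>>> partition(
            List<Map<String, Object>> tuples, String key, int workers) {
        List<List<Map<String, Object>>> buckets = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (Map<String, Object> tuple : tuples) {
            buckets.get(workerFor(tuple.get(key), workers)).add(tuple);
        }
        return buckets;
    }
}
```

Because equal key values always go to the same worker under this assumption, operations such as unique or group on the partition key can run independently on each worker.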

> SOLR Streaming Expressions
> --------------------------
>
>                 Key: SOLR-7377
>                 URL: https://issues.apache.org/jira/browse/SOLR-7377
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>            Reporter: Dennis Gove
>            Priority: Minor
>             Fix For: Trunk
>
>         Attachments: SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch, SOLR-7377.patch
>
>
> It would be beneficial to add an expression-based interface to the Streaming API described in SOLR-7082. Right now that API requires streaming requests to come in from clients as serialized bytecode of the streaming classes. The suggestion here is to support string expressions that describe the streaming operations the client wishes to perform.
> {code:java}
> search(collection1, q=*:*, fl="id,fieldA,fieldB", sort="fieldA asc")
> {code}
> With this syntax in mind, one can now express arbitrarily complex stream queries with a single string.
> {code:java}
> // merge two distinct searches together on common fields
> merge(
>   search(collection1, q="id:(0 3 4)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   search(collection2, q="id:(1 2)", fl="id,a_s,a_i,a_f", sort="a_f asc, a_s asc"),
>   on="a_f asc, a_s asc")
> // find top 20 unique records of a search
> top(
>   n=20,
>   unique(
>     search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_f desc"),
>     over="a_f desc"),
>   sort="a_f desc")
> {code}
> The syntax would support:
> 1. Configurable expression names (e.g. via solrconfig.xml one can map "unique" to a class implementing a unique stream). This allows users to build their own streams and use them as they wish.
> 2. Named parameters (of both simple and expression types)
> 3. Unnamed, type-matched parameters (to support requiring N streams as arguments to another stream)
> 4. Positional parameters
> The main goal here is to make streaming as accessible as possible and define a syntax for running complex queries across large distributed systems.
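
As a rough illustration of the parameter kinds listed in the description, the following sketch splits an expression string into a function name and its top-level arguments, then tells named parameters apart from unnamed ones. It is not the parser in the patch; ExpressionSketch, parse, and isNamed are hypothetical names, and the sketch assumes double-quoted values with no escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the parser in SOLR-7377.patch.
class ExpressionSketch {
    final String name;                            // function name, e.g. "top"
    final List<String> args = new ArrayList<>();  // raw top-level arguments

    ExpressionSketch(String name) {
        this.name = name;
    }

    // Splits "name(arg1, arg2, ...)" on commas that are outside quotes
    // and outside nested parentheses.
    static ExpressionSketch parse(String text) {
        String s = text.trim();
        int open = s.indexOf('(');
        if (open < 0 || !s.endsWith(")")) {
            throw new IllegalArgumentException("not a function call: " + text);
        }
        ExpressionSketch expr = new ExpressionSketch(s.substring(0, open).trim());
        String body = s.substring(open + 1, s.length() - 1);
        int depth = 0;
        boolean quoted = false;
        StringBuilder current = new StringBuilder();
        for (char c : body.toCharArray()) {
            if (c == '"') {
                quoted = !quoted;
            }
            if (!quoted) {
                if (c == '(') {
                    depth++;
                } else if (c == ')') {
                    depth--;
                } else if (c == ',' && depth == 0) {
                    expr.args.add(current.toString().trim());
                    current.setLength(0);
                    continue; // drop the separating comma
                }
            }
            current.append(c);
        }
        String last = current.toString().trim();
        if (!last.isEmpty()) {
            expr.args.add(last);
        }
        return expr;
    }

    // Treats an argument as named when '=' appears before any '(' or '"'.
    static boolean isNamed(String arg) {
        int eq = arg.indexOf('=');
        if (eq <= 0) {
            return false;
        }
        int paren = arg.indexOf('(');
        int quote = arg.indexOf('"');
        return (paren < 0 || eq < paren) && (quote < 0 || eq < quote);
    }
}
```

An unnamed argument that is itself a function call, such as the unique(...) inside top(...), can be passed back to parse to walk the nested expression.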


