Posted to dev@lucene.apache.org by "Jason Gerlowski (JIRA)" <ji...@apache.org> on 2015/12/23 03:12:46 UTC

[jira] [Commented] (SOLR-7535) Add UpdateStream to Streaming API and Streaming Expression

    [ https://issues.apache.org/jira/browse/SOLR-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069034#comment-15069034 ] 

Jason Gerlowski commented on SOLR-7535:
---------------------------------------

I'm in the process of hacking together a first pass at this.

Going well for the most part, but I did run into one sticking point.  {{UpdateStream.read()}} takes each tuple and sends it along to a SolrCloud collection.  I was planning on converting the tuple into a {{SolrInputDocument}}, and then using {{CloudSolrClient.add(doc)}} to send along the converted tuple.

It's not super hard to take a straw-man approach to the conversion:
{code}
    // Copy every tuple field into the document under the same name.
    final SolrInputDocument doc = new SolrInputDocument();
    for (Object key : tupleFromSource.fields.keySet()) {
      doc.addField((String) key, tupleFromSource.get(key));
    }
{code}

Is this a reasonable approach?  I think this'll work for simple cases, but I wasn't sure how it'd do with more complex tuples.  Do tuples ever have non-String keys?  Is there any special treatment that I should know about for nested docs?  (I wasn't sure how these map to tuples.)
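If non-String keys do turn out to be possible, the cast in the loop above would throw a {{ClassCastException}}.  Below is a slightly more defensive variant, as a rough sketch only.  It is self-contained, so plain {{java.util.Map}} instances stand in for both the tuple's fields and the {{SolrInputDocument}}; the class and method names ({{TupleToDocSketch}}, {{toDocumentFields}}) are hypothetical and not part of Solr or SolrJ.  It does nothing special for nested docs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ: plain Maps stand in for
// both the tuple's fields and the SolrInputDocument being built.
final class TupleToDocSketch {

  private TupleToDocSketch() {}

  /**
   * Copies tuple fields into a String-keyed map, preserving iteration order.
   * Non-String keys are converted with String.valueOf; null keys are rejected.
   */
  static Map<String, Object> toDocumentFields(Map<?, ?> tupleFields) {
    Map<String, Object> docFields = new LinkedHashMap<>();
    for (Map.Entry<?, ?> entry : tupleFields.entrySet()) {
      Object key = entry.getKey();
      if (key == null) {
        throw new IllegalArgumentException("tuple field name must not be null");
      }
      // In a real UpdateStream this line would presumably be
      // doc.addField(name, value) on a SolrInputDocument.
      docFields.put(String.valueOf(key), entry.getValue());
    }
    return docFields;
  }
}
```

One caveat with this approach: two keys that stringify to the same name (say, the Integer 42 and the String "42") would collapse into one field, with the later value winning.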

I'm assuming there must be some code out there that does the reverse conversion (*from* Solr results *to* tuples).  I nosed around a bit in {{StreamHandler.handleRequestBody}} and the various TupleStream implementations, but I didn't find anything too promising.  Does anyone know where that might live?  If I found that code, it'd probably be helpful for doing the opposite conversion for {{UpdateStream}}.

> Add UpdateStream to Streaming API and Streaming Expression
> ----------------------------------------------------------
>
>                 Key: SOLR-7535
>                 URL: https://issues.apache.org/jira/browse/SOLR-7535
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, SolrJ
>            Reporter: Joel Bernstein
>            Priority: Minor
>
> The ticket adds an UpdateStream implementation to the Streaming API and streaming expressions. The UpdateStream will wrap a TupleStream and send the Tuples it reads to a SolrCloud collection to be indexed.
> This will allow users to pull data from different Solr Cloud collections, merge and transform the streams and send the transformed data to another Solr Cloud collection.


