Posted to solr-user@lucene.apache.org by Webster Homer <we...@sial.com> on 2017/09/19 19:33:21 UTC

Solr Streaming Question

Is it possible to use the streaming API to stream documents from a
collection and load them into a new collection? I was thinking that this
would be a great way to get a random sample of data from our main
collections to developer machines. Making it a random sample would be
useful as well. This looks feasible, but I've only scratched the surface of
Solr streaming.

Thanks

-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.

Re: Solr Streaming Question

Posted by Joel Bernstein <jo...@gmail.com>.
Also, random() will work with any type of field. Only the /export handler
limits the field list to docValues fields.

Each time you call random() it will give you a different random sample.
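
For illustration, here is a minimal Python sketch of building a random()
expression and sending it to a collection's /stream handler. The Solr URL, the
collection name ("products"), and the field names are hypothetical, and the
response handling assumes the usual "result-set"/"docs" JSON layout that ends
with an EOF tuple, so check it against your Solr version.

```python
import json
import urllib.parse
import urllib.request


def random_expr(collection, rows, fl, q="*:*"):
    """Build a random() streaming expression (all names are placeholders)."""
    return f'random({collection}, q="{q}", rows="{rows}", fl="{fl}")'


def tuples_from_response(body):
    """Return the document tuples of a /stream JSON response, dropping the EOF marker."""
    docs = json.loads(body)["result-set"]["docs"]
    return [doc for doc in docs if "EOF" not in doc]


def run_expr(solr_url, collection, expr):
    """POST the expression to <solr_url>/<collection>/stream and return its tuples."""
    data = urllib.parse.urlencode({"expr": expr}).encode("utf-8")
    with urllib.request.urlopen(f"{solr_url}/{collection}/stream", data=data) as resp:
        return tuples_from_response(resp.read().decode("utf-8"))


if __name__ == "__main__":
    expr = random_expr("products", rows=100, fl="id,name_s")
    print(expr)
    # Against a live cluster (hypothetical URL) this would fetch one sample:
    # run_expr("http://localhost:8983/solr", "products", expr)
```

Calling run_expr() twice with the same expression should return two different
samples, which is the behavior described above.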

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Sep 19, 2017 at 10:04 PM, Joel Bernstein <jo...@gmail.com> wrote:

> Try this construct:
>
> update(list(random(...), random(...), random(...)))
>
>
>
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/

Re: Solr Streaming Question

Posted by Joel Bernstein <jo...@gmail.com>.
Try this construct:

update(list(random(...), random(...), random(...)))
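
One way to read this construct, sketched in Python. The arguments elided in
the construct above are filled in here with hypothetical collection and field
names, one random() per source collection. The update() form used (destination
collection first, then batchSize) is my assumption based on how the update
decorator is documented; also check that your Solr version provides list().

```python
def random_expr(collection, rows, fl, q="*:*"):
    """Build one random() source; collection and fields are placeholders."""
    return f'random({collection}, q="{q}", rows="{rows}", fl="{fl}")'


def sample_copy_expr(dest, sources, rows, fl, batch_size=500):
    """Wrap one random() per source collection in list(), then in update()."""
    samples = ", ".join(random_expr(src, rows, fl) for src in sources)
    return f"update({dest}, batchSize={batch_size}, list({samples}))"


if __name__ == "__main__":
    # Hypothetical names: two source collections sampled into "dev_sample".
    print(sample_copy_expr("dev_sample", ["products", "orders"], 1000, "id,name_s"))
```

The resulting string is what would be passed as the expr parameter to a
/stream handler in the cluster that hosts the destination collection.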







Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Sep 19, 2017 at 9:02 PM, Susheel Kumar <su...@gmail.com>
wrote:

> You can follow the section "Creating an Alert With the Topic Streaming
> Expression" at http://joelsolr.blogspot.com/ and use random function for
> getting random records and schedule using daemon function to retrieve
> periodically etc.
>
> Thanks,
> Susheel

Re: Solr Streaming Question

Posted by Susheel Kumar <su...@gmail.com>.
You can follow the section "Creating an Alert With the Topic Streaming
Expression" at http://joelsolr.blogspot.com/ and use the random function to
get random records, then schedule it with the daemon function to retrieve
them periodically.
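
As a rough sketch of that idea in Python: the daemon id, the interval, and the
inner expression below are hypothetical, and the daemon() parameters shown (id
and runInterval in milliseconds, followed by the wrapped expression) are my
reading of the daemon function rather than something taken from the blog post.

```python
def daemon_expr(daemon_id, run_interval_ms, inner_expr):
    """Wrap a streaming expression in daemon() so Solr re-runs it on an interval."""
    return f'daemon(id="{daemon_id}", runInterval="{run_interval_ms}", {inner_expr})'


if __name__ == "__main__":
    # Hypothetical inner expression: copy a random sample into "dev_sample".
    inner = 'update(dev_sample, batchSize=500, random(products, q="*:*", rows="1000", fl="id,name_s"))'
    # Re-run once per hour (3,600,000 ms).
    print(daemon_expr("dev_sampler", 3600000, inner))
```

Since each run draws a new sample, the destination collection would grow over
time, with documents that share an id being overwritten.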

Thanks,
Susheel



On Tue, Sep 19, 2017 at 4:56 PM, Erick Erickson <er...@gmail.com>
wrote:

> Webster:
>
> I think you're looking for UpdateStream. Unfortunately the fix version
> wasn't entered so you'll have to look at your particular version but
> going strictly from the dates it appears in 6.0.
>
> David:
>
> Stored is irrelevant. Streaming only works with docValues="true"
> fields and moves the docValues content over.
>
> Best,
> Erick

Re: Solr Streaming Question

Posted by Erick Erickson <er...@gmail.com>.
Webster:

I think you're looking for UpdateStream. Unfortunately, the fix version
wasn't entered, so you'll have to check your particular version, but
going strictly by the dates it appears to be in 6.0.

David:

Stored is irrelevant. Streaming only works with docValues="true"
fields and moves the docValues content over.
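
A minimal Python sketch of the UpdateStream case, copying a whole collection
through the /export handler. Collection and field names are hypothetical.
As Joel's reply in this thread notes, the docValues restriction comes from
/export, so every field named in fl and sort here is assumed to have
docValues="true".

```python
def export_search_expr(collection, fl, sort, q="*:*"):
    """Build a search() source that reads through the /export handler."""
    # /export requires both a field list and a sort.
    if not fl or not sort:
        raise ValueError("/export needs both fl and sort")
    return f'search({collection}, q="{q}", fl="{fl}", sort="{sort}", qt="/export")'


def copy_expr(dest, source, fl, sort, batch_size=500):
    """Wrap the export search in update() to index its tuples into dest."""
    return f"update({dest}, batchSize={batch_size}, {export_search_expr(source, fl, sort)})"


if __name__ == "__main__":
    # Hypothetical names: stream everything from "products" into "dev_copy".
    print(copy_expr("dev_copy", "products", fl="id,name_s", sort="id asc"))
```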

Best,
Erick

On Tue, Sep 19, 2017 at 12:39 PM, David Hastings
<ha...@gmail.com> wrote:
> I am also curious about this, specifically about indexed/non stored fields.

Re: Solr Streaming Question

Posted by David Hastings <ha...@gmail.com>.
I am also curious about this, specifically about indexed/non stored fields.

On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer <we...@sial.com>
wrote:

> Is it possible to use the streaming API to stream documents from a
> collection and load them into a new collection? I was thinking that this
> would be a great way to get a random sample of data from our main
> collections to developer machines. Making it a random sample would be
> useful as well. This looks feasible, but I've only scratched the surface of
> streaming Solr
>
> Thanks