Posted to issues@lucene.apache.org by "Trey Grainger (Jira)" <ji...@apache.org> on 2019/10/11 05:52:00 UTC
[jira] [Updated] (SOLR-13836) Streaming Expression Query Parser
[ https://issues.apache.org/jira/browse/SOLR-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Trey Grainger updated SOLR-13836:
---------------------------------
Description:
It is currently possible to hit the search handler from a streaming expression ("search(...)"), but it is not possible to invoke a streaming expression from a regular search in the search handler. In some cases, it would be useful to leverage the power of streaming expressions to generate a result set and then join that result set with a normal set of search results.
This isn't expected to be particularly efficient for high-cardinality streaming expression results, but it would be a pretty powerful feature that could enable a bunch of use cases that aren't possible today within a normal search.
h2. Example:
*Docs:*
{code:java}
curl -X POST -H "Content-Type: application/json" "http://localhost:8983/solr/food/update?commit=true" --data-binary '
[
{"id": "1", "name_s":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]},
{"id": "2", "name_s":"apple juice","vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]},
{"id": "3", "name_s":"cappuccino","vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]},
{"id": "4", "name_s":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]},
{"id": "5", "name_s":"green tea","vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]},
{"id": "6", "name_s":"latte","vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]},
{"id": "7", "name_s":"soda","vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]},
{"id": "8", "name_s":"cheese bread sticks","vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]},
{"id": "9", "name_s":"water","vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]},
{"id": "10", "name_s":"cinnamon bread sticks","vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]}
]'
{code}
*Query:*
{code:java}
http://localhost:8983/solr/food/select?q=*:*&fq={!streaming_expression}top(select(search(food,%20q=%22*:*%22,%20fl=%22id,vector_fs%22,%20sort=%22id%20asc%22),%20cosineSimilarity(vector_fs,%20array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0))%20as%20cos,%20id),%20n=5,%20sort=%22cos%20desc%22)&fl=id,name_s
{code}
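For clients that would rather not hand-encode that URL, the same request can be assembled programmatically. The following is a minimal Python sketch, assuming the proposed parser is registered under the name streaming_expression and the collection is called food as in the query above; build_select_url is a hypothetical helper for illustration, not part of Solr.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# The streaming expression from the example query, unencoded.
EXPR = (
    'top(select(search(food, q="*:*", fl="id,vector_fs", sort="id asc"), '
    'cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), '
    'n=5, sort="cos desc")'
)


def build_select_url(base, expr, fl="id,name_s"):
    """Hypothetical helper: build a /select URL whose fq runs a streaming expression."""
    params = {
        "q": "*:*",
        # Assumption: the parser is selected with the usual {!name} local-params prefix.
        "fq": "{!streaming_expression}" + expr,
        "fl": fl,
    }
    # urlencode percent-encodes quotes, braces and commas and turns spaces into '+'.
    return base + "/select?" + urlencode(params)


url = build_select_url("http://localhost:8983/solr/food", EXPR)
print(url)

# Round-trip check: decoding the query string gives back the original parameters.
decoded = parse_qs(urlsplit(url).query)
```

Building the URL from a parameter dictionary keeps the expression readable in source and leaves the escaping to the library.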
*Response:*
{code:java}
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":7,
    "params":{
      "q":"*:*",
      "fl":"id,name_s",
      "fq":"{!streaming_expression}top(select(search(food, q=\"*:*\", fl=\"id,vector_fs\", sort=\"id asc\"), cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort=\"cos desc\")"}},
  "response":{"numFound":5,"start":0,"docs":[
      {
        "name_s":"donut",
        "id":"1"},
      {
        "name_s":"apple juice",
        "id":"2"},
      {
        "name_s":"cheese pizza",
        "id":"4"},
      {
        "name_s":"cheese bread sticks",
        "id":"8"},
      {
        "name_s":"cinnamon bread sticks",
        "id":"10"}]
  }}
{code}
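The five ids in this response can be checked by hand. The sketch below is plain Python, not Solr code: it recomputes the cosine similarity between the array in the query and each vector_fs from the sample docs, then keeps the top five. The names cosine_similarity and top_ids are illustrative only.

```python
import math

# vector_fs values from the sample documents, keyed by id.
DOCS = {
    "1": [5.0, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0],   # donut
    "2": [1.0, 5.0, 0.0, 0.0, 0.0, 4.0, 4.0, 3.0],   # apple juice
    "3": [0.0, 5.0, 3.0, 0.0, 4.0, 1.0, 2.0, 3.0],   # cappuccino
    "4": [5.0, 0.0, 4.0, 4.0, 0.0, 1.0, 5.0, 2.0],   # cheese pizza
    "5": [0.0, 5.0, 0.0, 0.0, 2.0, 1.0, 1.0, 5.0],   # green tea
    "6": [0.0, 5.0, 4.0, 0.0, 4.0, 1.0, 3.0, 3.0],   # latte
    "7": [0.0, 5.0, 0.0, 0.0, 3.0, 5.0, 5.0, 0.0],   # soda
    "8": [5.0, 0.0, 4.0, 5.0, 0.0, 1.0, 4.0, 2.0],   # cheese bread sticks
    "9": [0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0],   # water
    "10": [5.0, 0.0, 1.0, 5.0, 0.0, 3.0, 4.0, 2.0],  # cinnamon bread sticks
}
QUERY = [5.1, 0.0, 1.0, 5.0, 0.0, 4.0, 5.0, 1.0]


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_ids(docs, query, n=5):
    """Ids of the n documents most similar to query, best first."""
    ranked = sorted(docs, key=lambda i: cosine_similarity(docs[i], query), reverse=True)
    return ranked[:n]


top = top_ids(DOCS, QUERY)
# The filter only decides which documents match, so compare as an id-ordered list.
matching = sorted(top, key=int)
print(matching)  # ['1', '2', '4', '8', '10']
```

The response lists the documents by id rather than by similarity, which is consistent with the expression acting purely as a filter on a q=*:* search: the cos value selects the documents but does not affect their ranking.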
The current implementation also supports the following additional parameters:
*f*: (optional) The field in the streaming expression's results that contains the document ids to filter on. Defaults to the uniqueKey field name of your documents.
*method*: (optional) Any of termsFilter (default), booleanQuery, automaton, docValuesTermsFilter.
The method parameter may go away, especially if we find a more efficient way to join the stream to the main query doc set.
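How these parameters are passed is not spelled out here. Assuming they follow Solr's usual local-params form ({!parser key=value}), a filter query could be assembled roughly as follows. This is a sketch under that assumption; streaming_fq is a hypothetical helper, and the allowed method names are copied from the list above.

```python
# Method names as listed in the issue description.
ALLOWED_METHODS = {"termsFilter", "booleanQuery", "automaton", "docValuesTermsFilter"}


def streaming_fq(expr, f=None, method=None):
    """Hypothetical helper: prefix a streaming expression with parser local params."""
    if method is not None and method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    # Assumption: f and method are ordinary local params of the proposed parser.
    local = ["!streaming_expression"]
    if f is not None:
        local.append(f"f={f}")
    if method is not None:
        local.append(f"method={method}")
    return "{" + " ".join(local) + "}" + expr


fq = streaming_fq(
    'search(food, q="*:*", fl="id", sort="id asc")',
    f="id",
    method="docValuesTermsFilter",
)
print(fq)
```

Omitting both arguments yields the bare {!streaming_expression} prefix used in the example query, so the defaults described above apply.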
was:
It is currently possible to hit the search handler in a streaming expression ("search(...)"), but it is not currently possible to invoke a streaming expression from within a regular search within the search handler. In some cases, it would be useful to leverage the power of streaming expressions to generate a result set and then join that result set with a normal set of search results.
This isn't expected to be particularly efficient for high cardinality streaming expression results, but it would be a pretty powerful feature that could enable a bunch of use cases that aren't possible today within a normal search.
> Streaming Expression Query Parser
> ---------------------------------
>
> Key: SOLR-13836
> URL: https://issues.apache.org/jira/browse/SOLR-13836
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Trey Grainger
> Priority: Minor