Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/03/15 03:59:38 UTC

[jira] [Commented] (SOLR-7214) JSON Facet API

    [ https://issues.apache.org/jira/browse/SOLR-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362162#comment-14362162 ] 

Yonik Seeley commented on SOLR-7214:
------------------------------------

Here's a message I sent to the Heliosearch forum last year:

{code}
Facet functions and subfacets (nested facets) have added a lot of 
analytic power, but using separate query parameters for nested facets 
has the downside of being very hard to construct / read for complex 
nested facets.  Specifying deeply nested facets with a naturally 
nested structure (JSON) makes a lot of sense and can also make 
programmatic generation of requests easier. 

The kitchen sink that is SimpleFacets has outlived its usefulness, so 
I'm developing a new faceting module for Heliosearch with the 
following goals: 
 - first class JSON support 
 - support a much more canonical response format that is easier for 
clients to parse 
 - first class analytics support 
 - support a cleaner way to do distributed faceting 
 - support better integration with other search features such as 
grouping, joins, cross-core features 

The JSON API: 
Note that the JSON parser we use now supports comments, unquoted 
simple strings, and single quoted strings, making the DSL much more 
suited to hand-typing. 

The top-level "bucket" is implicitly defined by the set of documents 
matching the main query and filters (same as old faceting, but just 
more explicit). 
Given that we start out with a bucket, we can ask for stats at the top-level. 

EXAMPLE: 
json.facet={ x:'avg(price)', y:'unique(manufacturer)' } 

RESPONSE: { count:17, x:37.5, y:7 } 
// The top-level facet bucket is just like any query facet and always 
// includes "count" 

EXAMPLE: simple field facet 
json.facet={genres:{terms:genre_field}} 
// this is a short-form since no other params are desired 

EXAMPLE: field facet with other params 
json.facet={genres:{terms:{ 
   field:genre_field, 
   offset:100, 
   limit:20, 
   mincount:5 
}}} 

Notes  - I switched to using "terms" for a field facet because of the 
awkwardness of having "field" appear twice in a row (i.e. 
mylabel:{field:{field:myfield, offset:... ) 

EXAMPLE: field facet with sub facets and stats 
json.facet={genres:{terms:{ 
   field:genre_field, 
   offset:100, 
   limit:20, 
   mincount:5, 
   facet : {    // these facet commands will be done per-bucket of parent facet 
     x : 'avg(price)' ,   // a stat per-bucket 
     y : { query : 'popularity:[5 TO 10]' } ,  // query sub-facet 
     z : { terms : manufacturer }   // field/terms sub-facet 
   } 
}}} 

The output looks pretty much identical to the existing facet function 
and sub-facet code: 
http://heliosearch.org/solr-subfacets/ 
http://heliosearch.org/solr-facet-functions/ 
You can check out the tests so far too in TestJsonFacets.java 

Implementation Notes: 
- Agg (aggregations) are currently a subclass of ValueSource and 
piggyback off of the ability of users to plug in their own custom 
value source parsers. 
- a FacetRequestParser creates a FacetRequest, and then to execute 
that request, a FacetProcessor is created. 
- Much of the code is in a single file (FacetRequest.java), but this 
is just temporary... it eases changes early on while things are in 
flux. 

Overall, I think this will end up striking a good balance between 
readability, ad-hoc human generated requests, and programmatically 
generated queries. 

You can try it out (but it's early... faceting on multivalued fields is 
not supported yet). 
For convenience, here's the "example" server you can try: 
[dead link]

-Yonik 
{code}
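
As a rough illustration of the point about programmatic generation, here is a minimal Python sketch that builds the "field facet with sub facets and stats" request from the message above as a nested dict and serializes it into a json.facet parameter. The helper names (terms_facet, query_facet) and the q=*:* main query are hypothetical and not part of the API. The sketch also assumes that strict JSON is accepted wherever the relaxed syntax is, and it sends no request.

```python
import json
from urllib.parse import urlencode


def terms_facet(field, sub_facets=None, **params):
    """Build a terms (field) facet command.

    Hypothetical helper: uses the short form {terms: field} when there are
    no other params, and the long form {terms: {field: ..., ...}} otherwise.
    """
    if not params and not sub_facets:
        return {"terms": field}
    body = {"field": field}
    body.update(params)              # e.g. offset, limit, mincount
    if sub_facets:
        body["facet"] = sub_facets   # commands run per bucket of this facet
    return {"terms": body}


def query_facet(q):
    """Build a query facet command (hypothetical helper)."""
    return {"query": q}


# Same structure as the hand-written example in the message.
facet = {
    "genres": terms_facet(
        "genre_field",
        offset=100,
        limit=20,
        mincount=5,
        sub_facets={
            "x": "avg(price)",                         # a stat per bucket
            "y": query_facet("popularity:[5 TO 10]"),  # query sub-facet
            "z": terms_facet("manufacturer"),          # terms sub-facet
        },
    )
}

# Strict JSON, assumed to be accepted since the relaxed syntax is a superset.
json_facet = json.dumps(facet)

# URL-encoded parameters; the main query q=*:* is only an assumption.
query_string = urlencode({"q": "*:*", "json.facet": json_facet})
print(query_string)
```

Because the request is ordinary nested data, adding another level of sub-facets is just another dict passed as sub_facets, which is the main appeal over flat query parameters.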

> JSON Facet API
> --------------
>
>                 Key: SOLR-7214
>                 URL: https://issues.apache.org/jira/browse/SOLR-7214
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Yonik Seeley
>         Attachments: SOLR-7214.patch
>
>
> Overview is here: http://heliosearch.org/json-facet-api/
> The structured nature of nested sub-facets is more naturally expressed in a nested structure like JSON rather than the flat structure that normal query parameters provide.
> Goals:
> - First class JSON support
> - Easier programmatic construction of complex nested facet commands
> - Support a much more canonical response format that is easier for clients to parse
> - First class analytics support
> - Support a cleaner way to do distributed faceting
> - Support better integration with other search features


