Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/03/15 03:59:38 UTC
[jira] [Commented] (SOLR-7214) JSON Facet API
[ https://issues.apache.org/jira/browse/SOLR-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362162#comment-14362162 ]
Yonik Seeley commented on SOLR-7214:
------------------------------------
Here's a message I sent to the Heliosearch forum last year:
{code}
Facet functions and subfacets (nested facets) have added a lot of
analytic power, but using separate query parameters for nested facets
has the downside of being very hard to construct / read for complex
nested facets. Specifying deeply nested facets with a naturally
nested structure (JSON) makes a lot of sense and can also make
programmatic generation of requests easier.
The kitchen sink that is SimpleFacets has outlived its usefulness, so
I'm developing a new faceting module for Heliosearch with the
following goals:
- first class JSON support
- support a much more canonical response format that is easier for
clients to parse
- first class analytics support
- support a cleaner way to do distributed faceting
- support better integration with other search features such as
grouping, joins, cross-core features
The JSON API:
Note that the JSON parser we use now supports comments, unquoted
simple strings, and single quoted strings, making the DSL much more
suited to hand-typing.
The top-level "bucket" is implicitly defined by the set of documents
matching the main query and filters (same as old faceting, but just
more explicit).
Given that we start out with a bucket, we can ask for stats at the top-level.
EXAMPLE:
json.facet={ x:'avg(price)', y:'unique(manufacturer)' }
RESPONSE: { count:17, x:37.5, y:7 }
// The top level facet bucket is just like any query facet and always
includes "count"
EXAMPLE: simple field facet
json.facet={genres:{terms:genre_field}}
// this is a short-form since no other params are desired
EXAMPLE: field facet with other params
json.facet={genres:{terms:{
field:genre_field,
offset:100,
limit:20,
mincount:5
}}}
Notes - I switched to using "terms" for a field facet because of the
awkwardness of having "field" appear twice in a row (i.e.
mylabel:{field:{field:myfield, offset:... )
EXAMPLE: field facet with sub facets and stats
json.facet={genres:{terms:{
field:genre_field,
offset:100,
limit:20,
mincount:5,
facet : { // these facet commands will be done per-bucket of parent facet
x : 'avg(price)' , // a stat per-bucket
y : { query : 'popularity:[5 TO 10]' } , // query sub-facet
z : { terms : manufacturer } // field/terms sub-facet
}
}}}
The output looks pretty much identical to the existing facet function
and sub-facet code:
http://heliosearch.org/solr-subfacets/
http://heliosearch.org/solr-facet-functions/
You can check out the tests so far too in TestJsonFacets.java
Implementation Notes:
- Aggregations (Agg) are currently a subclass of ValueSource and
piggyback off of the ability of users to plug in their own custom
value source parsers.
- a FacetRequestParser creates a FacetRequest, and then to execute
that request, a FacetProcessor is created.
- Much of the code is in a single file (FacetRequest.java), but this
is just temporary... it eases changes early on while things are in
flux.
Overall, I think this will end up striking a good balance between
readability, ad-hoc human generated requests, and programmatically
generated queries.
You can try it out (but it's early... faceting on multivalued fields
is not supported yet).
For convenience, here's the "example" server you can try:
[dead link]
-Yonik
{code}
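To illustrate the point about programmatic generation, here is a minimal Python sketch that builds the nested "genres" request from the message above as plain dictionaries and serializes it into a json.facet parameter. The helper names (terms_facet, query_facet, build_params) are hypothetical and not part of Solr or Heliosearch, the main query "*:*" is only a placeholder, and the sketch assumes that strictly quoted JSON is accepted wherever the relaxed hand-typed syntax is.

```python
import json
from urllib.parse import urlencode


def terms_facet(field, sub_facets=None, **params):
    """Build a terms (field) facet; hypothetical helper, not a Solr API.

    Falls back to the short form {terms: field} when no other
    parameters or sub-facets are requested.
    """
    if not params and not sub_facets:
        return {"terms": field}
    body = {"field": field, **params}
    if sub_facets:
        # Sub-facets and stats are computed per bucket of this facet.
        body["facet"] = sub_facets
    return {"terms": body}


def query_facet(q):
    """Build a query facet; hypothetical helper, not a Solr API."""
    return {"query": q}


def build_params(q, facets):
    """Return request parameters with the facet tree serialized as JSON."""
    return {"q": q, "json.facet": json.dumps(facets, separators=(",", ":"))}


# Same structure as the "field facet with sub facets and stats" example.
facets = {
    "genres": terms_facet(
        "genre_field",
        offset=100,
        limit=20,
        mincount=5,
        sub_facets={
            "x": "avg(price)",                         # a stat per bucket
            "y": query_facet("popularity:[5 TO 10]"),  # query sub-facet
            "z": terms_facet("manufacturer"),          # terms sub-facet
        },
    )
}

params = build_params("*:*", facets)
query_string = urlencode(params)  # URL-encoded form, ready to append to a request
print(params["json.facet"])
```

Because the request is an ordinary nested data structure, adding another level of nesting only means passing another dictionary as sub_facets, with no flat parameter names to construct.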
> JSON Facet API
> --------------
>
> Key: SOLR-7214
> URL: https://issues.apache.org/jira/browse/SOLR-7214
> Project: Solr
> Issue Type: New Feature
> Reporter: Yonik Seeley
> Attachments: SOLR-7214.patch
>
>
> Overview is here: http://heliosearch.org/json-facet-api/
> The structured nature of nested sub-facets is more naturally expressed in a nested structure like JSON than in the flat structure that normal query parameters provide.
> Goals:
> - First class JSON support
> - Easier programmatic construction of complex nested facet commands
> - Support a much more canonical response format that is easier for clients to parse
> - First class analytics support
> - Support a cleaner way to do distributed faceting
> - Support better integration with other search features
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)