Posted to solr-user@lucene.apache.org by Sanja Zivotic <sa...@pixbuffer.com> on 2014/11/21 11:10:13 UTC

Solr Result Grouping and Sorting

Hello,

I am using Solr 4.8.1 with the following fields in schema.xml:

<field name="id" type="string" indexed="true" stored="true" required="true"
multiValued="false"/>
<field name="vehicle_year" type="tint" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="vehicle_maker" type="string" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="vehicle_model" type="string" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="vehicle_trim" type="string" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="vehicle_full_name" type="string" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="city" type="string" indexed="true" stored="true"
required="true" multiValued="false"/>
<field name="price" type="tint" indexed="true" stored="true"
required="true" multiValued="false"/>

where:

- id: unique vehicle id
- vehicle_year: year the vehicle was made (e.g. 2014)
- vehicle_maker: vehicle manufacturer (e.g. BMW)
- vehicle_model: vehicle model (e.g. 320i)
- vehicle_trim: vehicle trim (e.g. Sedan)
- vehicle_full_name: vehicle full name (e.g. 2014 BMW 320i Sedan...)
- city: city where the vehicle can be bought. The value is either an exact
city name such as 'Toronto' or the placeholder value 'generic'. The generic
city price is used when the Solr index has no data for the exact city.
- price: vehicle price in the city (or the generic vehicle price if the city
equals 'generic')

A generic city price always exists in the Solr index for each vehicle. An
exact city price may be missing: for example, I can have a 2014 BMW 320i
Sedan price for Toronto, but not for Calgary.

I need to find all vehicles in a given city and in a given price range.
If there is no price in the Solr index for the exact given city, then I
need to show the generic price.
If there are both a generic city price and a price for the exact city, I
need to show the lower of the two.
I also need to show the number of vehicles per maker in the city and in the
given price range (faceting).
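
To make the requirement concrete, here is a minimal sketch in plain Python
(not Solr) of the result I am after. The function names and the in-memory
list of dicts are hypothetical and only mirror the schema fields above; the
point is the order of operations: pick the cheapest row per vehicle first,
and only then apply the price range and count makers.

```python
from collections import Counter


def cheapest_per_vehicle(rows, city):
    """Per vehicle_full_name, keep the cheaper of the exact-city and generic rows."""
    best = {}
    for row in rows:
        # Ignore prices that belong to other cities.
        if row["city"] not in (city, "generic"):
            continue
        name = row["vehicle_full_name"]
        if name not in best or row["price"] < best[name]["price"]:
            best[name] = row
    return best


def search(rows, city, price_min, price_max):
    """Apply the price range only after the per-vehicle minimum is chosen."""
    best = cheapest_per_vehicle(rows, city)
    hits = sorted(
        (r for r in best.values() if price_min <= r["price"] <= price_max),
        key=lambda r: r["price"],
    )
    # Facet counts: number of matching vehicles per maker.
    facets = Counter(r["vehicle_maker"] for r in hits)
    return hits, facets
```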

I tried something like this (using Solr result grouping,
https://cwiki.apache.org/confluence/display/solr/Result+Grouping):

I group the records with the same "vehicle_full_name" field ('group.field':
'vehicle_full_name'), then sort the items in the group by price
('group.sort': 'price asc') and fetch only the first one ('group.limit':
'1').

http://127.0.0.1:8983/solr/select?sort=price+asc&rows=20&start=0&group=true&group.format=simple&group.ngroups=true&group.limit=1&group.field=vehicle_full_name&group.sort=price+asc&group.facet=true&facet=true&facet.sort=vehicle_maker&facet.field=vehicle_maker&q=city:("Toronto"
OR "generic") AND price:[30000 TO *] AND price:[* TO 36000]


params = {'q': 'city:("Toronto" OR "generic") AND price:[31000 TO *] AND '
               'price:[* TO 34000]',
          'group': 'true',
          'group.field': 'vehicle_full_name',
          'group.limit': '1',
          'group.sort': 'price asc',
          'group.ngroups': 'true',
          'group.format': 'simple',
          'sort': 'price asc',
          'start': start,  # paging offset
          'rows': 20,
          'group.facet': 'true',
          'facet': 'true',
          'facet.sort': 'vehicle_maker',
          'facet.field': 'vehicle_maker',
          'wt': 'json'}

This is almost fine, but in some edge cases I get wrong results.

When filtering by price range, Solr filters the results before grouping, so
the price I end up with for a vehicle depends on the price range.

Example:
- 2014 BMW 320i Sedan generic price: $32,004
- 2014 BMW 320i Sedan Toronto price: $31,443
1) if price range is $31,000 - $34,000 both prices are fetched, grouped and
limited to the minimal one -> I get the correct vehicle price $31,443 in
results
2) if price range is $32,000 - $34,000 Solr filters by price before
grouping, so only the generic price ($32,004) is fetched and I get $32,004
as the vehicle price, which is wrong. In this case (city: Toronto, price:
$32,000 - $34,000) this vehicle shouldn't be in the results at all.
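
The same edge case can be reproduced outside Solr. This is only a sketch in
plain Python using the two prices from the example; the function names are
made up for illustration, and the rows are assumed to be already restricted
to city 'Toronto' or 'generic'. It contrasts what the query does (filter,
then group) with what I need (group, then filter).

```python
rows = [
    {"vehicle_full_name": "2014 BMW 320i Sedan", "city": "generic", "price": 32004},
    {"vehicle_full_name": "2014 BMW 320i Sedan", "city": "Toronto", "price": 31443},
]


def filter_then_group(rows, lo, hi):
    """Current behaviour: range filter first, then minimum price per group."""
    result = {}
    for r in rows:
        if lo <= r["price"] <= hi:
            name = r["vehicle_full_name"]
            result[name] = min(result.get(name, r["price"]), r["price"])
    return result


def group_then_filter(rows, lo, hi):
    """Desired behaviour: minimum price per group first, then the range filter."""
    cheapest = {}
    for r in rows:
        name = r["vehicle_full_name"]
        cheapest[name] = min(cheapest.get(name, r["price"]), r["price"])
    return {n: p for n, p in cheapest.items() if lo <= p <= hi}


print(filter_then_group(rows, 31000, 34000))  # {'2014 BMW 320i Sedan': 31443}
print(filter_then_group(rows, 32000, 34000))  # {'2014 BMW 320i Sedan': 32004} (wrong)
print(group_then_filter(rows, 32000, 34000))  # {} (what I expect)
```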

Any idea how to solve this?