Posted to dev@lucene.apache.org by "Tomás Fernández Löbbe (JIRA)" <ji...@apache.org> on 2015/04/16 01:59:58 UTC

[jira] [Commented] (SOLR-7406) Support DV implementation in range faceting

    [ https://issues.apache.org/jira/browse/SOLR-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497338#comment-14497338 ] 

Tomás Fernández Löbbe commented on SOLR-7406:
---------------------------------------------

I ran the following benchmark on my Mac:
Geonames dataset (added 4 times, for a total of 33.3M docs).
Based on Solr's basic configset, with only the following fields added:
{code:xml}
   <field name="name" type="text_general"/>
   <field name="alternatenames" type="text_general" multiValued="true"/>
   <field name="latitude" type="double" docValues="true"/>
   <field name="longitude" type="double" docValues="true"/>
   <field name="feature_class" type="string"/>
   <field name="feature_code" type="string"/>
   <field name="country_code" type="string"/>
   <field name="cc2" type="string"/>
   <field name="admin1_code" type="string"/>
   <field name="admin2_code" type="string"/>
   <field name="admin3_code" type="string"/>
   <field name="admin4_code" type="string"/>
   <field name="population" type="long" docValues="true"/>
   <field name="elevation" type="int" docValues="true"/>
   <field name="dem" type="int" docValues="true"/>
   <field name="timezone" type="string"/>
   <field name="modification_date" type="string"/>
{code}
AutoSoftCommit every second.
AutoCommit every 15 seconds with openSearcher=false.
Updating one doc per second.
Using the Solr start script without modification to start.ini.sh.
Both "dem" and "population" have docValues=true.
All times are in milliseconds.
Single thread running almost 5k different boolean queries.
On the "dem" field:
{noformat}
facet=true
facet.range=dem
facet.range.start=0
facet.range.end=200
facet.range.gap=1
facet.range.method=filter/dv
{noformat}
||Method||Min||Max||Average||p10||p50||p90||p99||
|Filter|77|3514|1141.5|1040|1128|1263|1374|
|DV|47|1988|166.0|88|151|262|368|
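
For anyone wanting to reproduce a run like this, here is a minimal sketch of how one request could be assembled from the parameters above. The host, the collection name ("geonames") and the sample query are assumptions made for illustration, not the setup actually used, and facet.range.method is the parameter proposed in this issue.

```python
from urllib.parse import urlencode


def range_facet_params(field, start, end, gap, method):
    """Build the range-faceting parameters listed above for one run."""
    if method not in ("filter", "dv"):
        raise ValueError("method must be 'filter' or 'dv'")
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": gap,
        "facet.range.method": method,
    }


def benchmark_url(base_url, query, facet_params):
    """Combine a query string with the facet parameters into a request URL."""
    params = {"q": query}
    params.update(facet_params)
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and query: both are assumptions for this sketch.
    base = "http://localhost:8983/solr/geonames/select"
    for method in ("filter", "dv"):
        params = range_facet_params("dem", 0, 200, 1, method)
        print(benchmark_url(base, "name:paris", params))
```

Running the same query list once per method and recording the elapsed time of each request would give numbers comparable in shape to the table above.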

On the "population" field:
{noformat}
facet=true
facet.range=population
facet.range.start=0
facet.range.end=2000
facet.range.gap=5
facet.range.method=filter/dv
{noformat}

||Method||Min||Max||Average||p10||p50||p90||p99||
|Filter|3|2055|321.1|47|70|891|955|
|DV|10|972|67.7|35|60|102|150|
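
To make the two tables easier to compare, the Filter/DV ratios can be computed directly from the reported numbers. The sketch below only copies the values from the tables above and divides them; nothing here is a new measurement.

```python
# Latencies in milliseconds, copied from the two tables above.
COLUMNS = ["Min", "Max", "Average", "p10", "p50", "p90", "p99"]
RESULTS = {
    "dem": {
        "Filter": [77, 3514, 1141.5, 1040, 1128, 1263, 1374],
        "DV": [47, 1988, 166.0, 88, 151, 262, 368],
    },
    "population": {
        "Filter": [3, 2055, 321.1, 47, 70, 891, 955],
        "DV": [10, 972, 67.7, 35, 60, 102, 150],
    },
}


def filter_over_dv(results):
    """Return Filter/DV latency ratios per field and column (>1 means DV was faster)."""
    return {
        field: {
            col: methods["Filter"][i] / methods["DV"][i]
            for i, col in enumerate(COLUMNS)
        }
        for field, methods in results.items()
    }


if __name__ == "__main__":
    for field, ratios in filter_over_dv(RESULTS).items():
        line = ", ".join(f"{col}={ratio:.2f}x" for col, ratio in ratios.items())
        print(f"{field}: {line}")
```

On these numbers DV is ahead in every column except the minimum for "population" (3 ms for Filter vs 10 ms for DV).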


> Support DV implementation in range faceting
> -------------------------------------------
>
>                 Key: SOLR-7406
>                 URL: https://issues.apache.org/jira/browse/SOLR-7406
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Tomás Fernández Löbbe
>            Assignee: Tomás Fernández Löbbe
>             Fix For: Trunk
>
>         Attachments: SOLR-7406.patch
>
>
> Interval faceting has a different implementation than range faceting, one based on the DocValues API. This is sometimes faster and doesn't rely on filters / the filter cache.
> I'm planning to add a "method" parameter that would allow users to choose between the current implementation ("filter"?) and the DV-based implementation ("dv"?). The result for both methods should be the same, but performance may vary.
> Default should continue to be "filter".
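
The difference between the two approaches described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Solr's code: documents are a plain dict of doc id to integer value, buckets are half-open [lower, upper), and the function names are hypothetical. The "filter" style builds one doc set per bucket and intersects it with the query's matching docs; the "dv" style makes a single pass over the matching docs and reads each doc's value.

```python
def facet_filter_style(values_by_doc, matching_docs, start, end, gap):
    """One set intersection per bucket, loosely mirroring one filter per range."""
    counts = []
    lower = start
    while lower < end:
        upper = min(lower + gap, end)
        # Stand-in for a range filter: every doc whose value is in [lower, upper).
        bucket_docs = {d for d, v in values_by_doc.items() if lower <= v < upper}
        counts.append(len(bucket_docs & matching_docs))
        lower = upper
    return counts


def facet_dv_style(values_by_doc, matching_docs, start, end, gap):
    """Single pass over the matching docs, bucketing each doc's value directly."""
    n_buckets = -(-(end - start) // gap)  # ceiling division
    counts = [0] * n_buckets
    for d in matching_docs:
        v = values_by_doc.get(d)
        if v is not None and start <= v < end:
            counts[(v - start) // gap] += 1
    return counts


if __name__ == "__main__":
    # Made-up toy data: doc id -> integer field value.
    values = {0: 5, 1: 0, 2: 199, 3: 200, 4: 7, 5: 5}
    matching = {0, 2, 3, 5}
    print(facet_filter_style(values, matching, 0, 10, 5))
    print(facet_dv_style(values, matching, 0, 10, 5))
```

Both functions return the same counts, which matches the expectation that the two methods give the same result. In this model the filter-style loop runs once per bucket, so with the benchmark parameters it would run 200 times for "dem" ((200 - 0) / 1) and 400 times for "population" ((2000 - 0) / 5), while the DV-style loop depends only on the number of matching docs.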


