Posted to dev@lucene.apache.org by "Tomás Fernández Löbbe (JIRA)" <ji...@apache.org> on 2015/04/16 01:59:58 UTC
[jira] [Commented] (SOLR-7406) Support DV implementation in range faceting
[ https://issues.apache.org/jira/browse/SOLR-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497338#comment-14497338 ]
Tomás Fernández Löbbe commented on SOLR-7406:
---------------------------------------------
I ran the following benchmark on my Mac:
Geonames dataset (indexed 4 times, for a total of 33.3M docs)
Based on Solr's basic configset, with only the following fields added:
{code:xml}
<field name="name" type="text_general"/>
<field name="alternatenames" type="text_general" multiValued="true"/>
<field name="latitude" type="double" docValues="true"/>
<field name="longitude" type="double" docValues="true"/>
<field name="feature_class" type="string"/>
<field name="feature_code" type="string"/>
<field name="country_code" type="string"/>
<field name="cc2" type="string"/>
<field name="admin1_code" type="string"/>
<field name="admin2_code" type="string"/>
<field name="admin3_code" type="string"/>
<field name="admin4_code" type="string"/>
<field name="population" type="long" docValues="true"/>
<field name="elevation" type="int" docValues="true"/>
<field name="dem" type="int" docValues="true"/>
<field name="timezone" type="string"/>
<field name="modification_date" type="string"/>
{code}
AutoSoftCommit every second.
AutoCommit every 15 seconds with openSearcher=false
Updating one doc per second.
Using Solr start script without modification to start.ini.sh
Both "dem" and "population" have docValues=true.
All times are in milliseconds.
A single thread ran almost 5k different boolean queries.
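For reference, a minimal sketch of how a single-threaded run like this could be timed and reduced to the columns reported in the tables. This is not the script used for the benchmark: `run_query` is a hypothetical callable that sends one query to Solr, and the nearest-rank percentile definition is an assumption, since the comment does not say how the percentiles were computed.

```python
import time


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list; p is an integer in 1..100."""
    n = len(sorted_values)
    rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
    return sorted_values[rank - 1]


def summarize(latencies_ms):
    """Reduce per-query latencies (ms) to the columns used in the result tables."""
    ordered = sorted(latencies_ms)
    return {
        "Min": ordered[0],
        "Max": ordered[-1],
        "Average": sum(ordered) / len(ordered),
        "p10": percentile(ordered, 10),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p99": percentile(ordered, 99),
    }


def time_queries(run_query, queries):
    """Run each query once on the calling thread; return latencies in ms."""
    latencies = []
    for query in queries:
        started = time.perf_counter()
        run_query(query)  # hypothetical: issues the request and waits for the response
        latencies.append((time.perf_counter() - started) * 1000.0)
    return latencies
```

Calling `summarize(time_queries(run_query, queries))` once per method would give one table row per method.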
On "dem" field:
{noformat}
facet=true
facet.range=dem
facet.range.start=0
facet.range.end=200
facet.range.gap=1
facet.range.method=filter/dv
{noformat}
||Method||Min||Max||Average||p10||p50||p90||p99||
|Filter|77|3514|1141.5|1040|1128|1263|1374|
|DV|47|1988|166.0|88|151|262|368|
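As a rough sketch, the range-facet parameters for one run can be assembled as below. It reads `filter/dv` as "one run with `facet.range.method=filter` and one with `facet.range.method=dv`", which is an assumption about the shorthand. The helper names are hypothetical, and the main query and the Solr URL are left out.

```python
from urllib.parse import urlencode


def range_facet_params(field, start, end, gap, method):
    """Range-facet parameters for one benchmark run (method is "filter" or "dv")."""
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": gap,
        "facet.range.method": method,
    }


def bucket_count(start, end, gap):
    """Number of gap-sized buckets needed to cover start..end (ceiling division)."""
    return -(-(end - start) // gap)


dem_dv_query_string = urlencode(range_facet_params("dem", 0, 200, 1, "dv"))
dem_buckets = bucket_count(0, 200, 1)             # 200 buckets on "dem"
population_buckets = bucket_count(0, 2000, 5)     # 400 buckets on "population"
```

So each query in the "dem" test asks for 200 range counts, and each query in the "population" test asks for 400.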
On "population" field:
{noformat}
facet=true
facet.range=population
facet.range.start=0
facet.range.end=2000
facet.range.gap=5
facet.range.method=filter/dv
{noformat}
||Method||Min||Max||Average||p10||p50||p90||p99||
|Filter|3|2055|321.1|47|70|891|955|
|DV|10|972|67.7|35|60|102|150|
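To make the two tables easier to compare, here is a small sketch that computes the filter/DV ratio per column. The numbers are copied from the tables above; the `speedups` helper is only illustrative.

```python
# Times in milliseconds, copied from the two result tables.
results = {
    "dem": {
        "filter": {"Average": 1141.5, "p50": 1128, "p90": 1263, "p99": 1374},
        "dv": {"Average": 166.0, "p50": 151, "p90": 262, "p99": 368},
    },
    "population": {
        "filter": {"Average": 321.1, "p50": 70, "p90": 891, "p99": 955},
        "dv": {"Average": 67.7, "p50": 60, "p90": 102, "p99": 150},
    },
}


def speedups(field_results):
    """Ratio filter / dv per column; values above 1 mean DV was faster."""
    filter_row = field_results["filter"]
    dv_row = field_results["dv"]
    return {column: filter_row[column] / dv_row[column] for column in filter_row}


for field, rows in results.items():
    print(field, {column: round(ratio, 1) for column, ratio in speedups(rows).items()})
```

On these numbers DV is roughly 6.9x faster on average for "dem" and roughly 4.7x for "population". The "population" medians are close (70 vs 60), and the filter method has the lower minimum there (3 vs 10).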
> Support DV implementation in range faceting
> -------------------------------------------
>
> Key: SOLR-7406
> URL: https://issues.apache.org/jira/browse/SOLR-7406
> Project: Solr
> Issue Type: Improvement
> Reporter: Tomás Fernández Löbbe
> Assignee: Tomás Fernández Löbbe
> Fix For: Trunk
>
> Attachments: SOLR-7406.patch
>
>
> Interval faceting has a different implementation than range faceting, based on the DocValues API. This is sometimes faster and doesn't rely on filters / filter cache.
> I'm planning to add a "method" parameter that would allow users to choose between the current implementation ("filter"?) and the DV-based implementation ("dv"?). The result for both methods should be the same, but performance may vary.
> Default should continue to be "filter".
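A conceptual sketch of the two approaches described in the issue, using toy in-memory data. This is not Solr's code: the function names, the list standing in for a DocValues column, and the set standing in for the query's result set are all assumptions made for illustration. It only shows why both methods should return the same counts while doing different work.

```python
def facet_by_filters(doc_values, matching_docs, start, end, gap):
    """ "filter" style: one range filter (a doc-id set) per bucket,
    each intersected with the set of docs matching the query."""
    counts = []
    for low in range(start, end, gap):
        high = low + gap
        bucket_filter = {doc for doc, value in enumerate(doc_values) if low <= value < high}
        counts.append(len(bucket_filter & matching_docs))
    return counts


def facet_by_doc_values(doc_values, matching_docs, start, end, gap):
    """ "dv" style: a single pass over the matching docs, reading each
    doc's value from the column and incrementing its bucket."""
    bucket_total = len(range(start, end, gap))
    upper = start + bucket_total * gap  # last bucket may extend past `end`
    counts = [0] * bucket_total
    for doc in matching_docs:
        value = doc_values[doc]
        if start <= value < upper:
            counts[(value - start) // gap] += 1
    return counts


# Toy data: the list index is the doc id, the entry is the field value.
toy_values = [3, 7, 12, 18, 7, 25]
toy_matches = {0, 1, 3, 4, 5}  # docs matching the main query
filter_counts = facet_by_filters(toy_values, toy_matches, 0, 20, 5)
dv_counts = facet_by_doc_values(toy_values, toy_matches, 0, 20, 5)
```

In this toy version the filter approach does work per bucket (and in a real system would benefit from cached filters), while the DV approach does work per matching document and needs no filters at all.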