Posted to dev@lucene.apache.org by "Ishan Chattopadhyaya (JIRA)" <ji...@apache.org> on 2017/01/22 16:25:27 UTC

[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

    [ https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833574#comment-15833574 ] 

Ishan Chattopadhyaya edited comment on SOLR-5944 at 1/22/17 4:24 PM:
---------------------------------------------------------------------

Hoss did some initial single-threaded benchmarks on the performance impact of in-place updates compared with regular atomic updates.
{code}
The script [0] adds 20K docs containing a stored+indexed text field, a DVO (docValues-only) long field, and a stored+indexed long field. It then does 50 iterations of 5K updates to each of the two long fields (500K updates total: 250K on the DVO field, 250K on the stored+indexed field), recording the cumulative total time for the updates to each type of field.
{code}
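To make the workload shape concrete, here is a minimal sketch of how such update batches could be generated. It only builds payloads and sends nothing to Solr. The field names, the random choice of document ids, and the use of the atomic-update {{set}} operation are assumptions for illustration; the actual script [0] may do any of these differently.

```python
import random

NUM_DOCS = 20_000          # docs added up front (from the description)
ITERATIONS = 50            # update rounds (from the description)
UPDATES_PER_ROUND = 5_000  # updates per field per round (from the description)

# Hypothetical field names; the real script's schema may differ.
DVO_FIELD = "dvo_long"        # docValues-only long field
STORED_FIELD = "stored_long"  # stored+indexed long field


def update_batches(field, seed=0):
    """Yield one list of atomic-update documents per iteration for `field`."""
    rng = random.Random(seed)
    for _ in range(ITERATIONS):
        yield [
            # Atomic-update syntax: {"set": value} replaces the field's value.
            # Picking random ids and values is an assumption of this sketch.
            {"id": str(rng.randrange(NUM_DOCS)),
             field: {"set": rng.randrange(1_000_000)}}
            for _ in range(UPDATES_PER_ROUND)
        ]


def total_updates():
    """Count every update document across both fields and all iterations."""
    return sum(
        len(batch)
        for field in (DVO_FIELD, STORED_FIELD)
        for batch in update_batches(field)
    )
```

Timing each field's batches separately and summing per field would give the two cumulative totals reported in the tables.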

Here are some results (times are in seconds):
Hoss' run, as of c21d8a005387eae4d0f0adde4de7e4e465fb73c8, on his laptop:
||Branch||Adds||DVO Updates||stored+indexed Updates||
|master|52|531|543|
|5944|51|352|503|

My run, as of right now:
||Branch||Adds||DVO Updates||stored+indexed Updates||
|master|40|682|663|
|5944|36|295|694|

On the 5944 branch, Hoss observed about a 30% reduction in update time (in-place updates of the DVO field vs. regular updates of the stored+indexed field), and I observed about 57% for the same use case. More benchmarks need to be done to analyse performance in multi-threaded indexing, and to ensure there is no performance regression for non-DV updates.
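For reference, those percentages can be reproduced from the 5944 rows of the two tables by comparing the DVO update time against the stored+indexed update time. A small sketch of the arithmetic (the helper name is made up for illustration and is not part of the benchmark script):

```python
def time_reduction_pct(regular_s, in_place_s):
    """Percent reduction in total time of in-place vs. regular updates."""
    return 100.0 * (regular_s - in_place_s) / regular_s


# (stored+indexed seconds, DVO seconds) from the 5944 rows of the tables.
runs = {
    "Hoss' run": (503, 352),
    "second run": (694, 295),
}

for name, (regular_s, in_place_s) in runs.items():
    print(f"{name}: {time_reduction_pct(regular_s, in_place_s):.1f}% less time")
```

This gives about 30.0% for Hoss' run and about 57.5% for the second run.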

 [0] - https://gist.github.com/anonymous/ae8bf7db3c713bf9d45937ec0aa1cfae


was (Author: ichattopadhyaya):
Hoss did some initial single-threaded benchmarks on the performance impact of in-place updates compared with regular atomic updates.
{code}
The script [0] adds 20K docs containing a stored+indexed text field, a DVO long field, and a stored+indexed text field. It then does 50 iterations of 5K updates to each of the two long fields (500K updates total: 250K on the DVO field, 250K on the stored+indexed field), recording the cumulative total time for the updates to each type of field.
{code}

Here are some results (times are in seconds):
Hoss' run, as of c21d8a005387eae4d0f0adde4de7e4e465fb73c8, on his laptop:
||Branch||Adds||DVO Updates||stored+indexed Updates||
|master|52|531|543|
|5944|51|352|503|

My run, as of right now:
||Branch||Adds||DVO Updates||stored+indexed Updates||
|master|40|682|663|
|5944|36|295|694|

Seems like Hoss observed a 30% speedup on the 5944 branch (in-place update vs. regular update), and I observed a 57% speedup on the same use case. More benchmarks need to be done to analyse the performance in multi-threaded indexing, and also to ensure there's no performance regression for non-DV updates.

 [0] - https://gist.github.com/anonymous/ae8bf7db3c713bf9d45937ec0aa1cfae

> Support updates of numeric DocValues
> ------------------------------------
>
>                 Key: SOLR-5944
>                 URL: https://issues.apache.org/jira/browse/SOLR-5944
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Shalin Shekhar Mangar
>         Attachments: defensive-checks.log.gz, demo-why-dynamic-fields-cannot-be-inplace-updated-first-time.patch, DUP.patch, hoss.62D328FA1DEA57FD.fail2.txt, hoss.62D328FA1DEA57FD.fail3.txt, hoss.62D328FA1DEA57FD.fail.txt, hoss.D768DD9443A98DC.fail.txt, hoss.D768DD9443A98DC.pass.txt, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, TestStressInPlaceUpdates.eb044ac71.failures.tar.gz
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
