Posted to dev@lucene.apache.org by "STANISLAV LIVOTOV (JIRA)" <ji...@apache.org> on 2018/08/21 15:08:00 UTC

[jira] [Created] (SOLR-12688) LTR Multiple performance fixes + pure DocValues support for FieldValueFeature

STANISLAV LIVOTOV created SOLR-12688:
----------------------------------------

             Summary: LTR Multiple performance fixes + pure DocValues support for FieldValueFeature
                 Key: SOLR-12688
                 URL: https://issues.apache.org/jira/browse/SOLR-12688
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: STANISLAV LIVOTOV
         Attachments: LTRModelHashCodeAfter.png, LTRModelHashCodeBefore.png, LTRSolrFeatureAfter.png, LTRSolrFeatureBefore.png, LTRwithDVOptimisation.png, LTRwithoutDVOptimisation.png, MultiplePerformanceFixes.patch

This ticket covers two performance issues and one functional/performance issue that I found while integrating LTR into our e-commerce search engine:
 # FieldValueFeature doesn't support pure DocValues fields (stored=false). For fields that are both stored and DocValues it also works suboptimally, because it reads the stored document just to extract a single field; DocValues are faster in that situation. Below are screenshots of JFR profiles without and with the new DocValues support, for the case where the value can be read from DocValues. 
 !LTRwithoutDVOptimisation.png! 
 !LTRwithDVOptimisation.png!
 # SolrFeature was not implemented optimally for the case when no fq parameter is passed. I'm not entirely sure why the fq parameter was introduced for SolrFeature in the first place, so I decided not to change the behavior and only optimize the described case. !LTRSolrFeatureBefore.png! !LTRSolrFeatureAfter.png!
 # LTRScoringModel was a mutable object. This led to the hash code being recalculated on each query, which can consume a lot of time when the model is big (in our case we were using LambdaMART with 100 trees and leaves, which took up 3 MB on disk). So I decided to make LTRScoringModel immutable and cache the hashCode calculation. Below are the screenshots before and after. !LTRModelHashCodeBefore.png! !LTRModelHashCodeAfter.png!
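
To illustrate the idea behind the third item, here is a minimal sketch of an immutable object that computes its hash code once, at construction time. CachedHashModel, its fields, and getWeights() are hypothetical names chosen for illustration; this is not the actual LTRScoringModel code and not the content of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a large scoring model (NOT the real LTRScoringModel).
final class CachedHashModel {
    private final String name;
    private final List<Double> weights;
    // Safe to cache only because every field that feeds the hash is immutable.
    private final int hashCode;

    CachedHashModel(String name, List<Double> weights) {
        this.name = Objects.requireNonNull(name);
        // Defensive copy: later changes to the caller's list cannot affect this object.
        this.weights = Collections.unmodifiableList(new ArrayList<>(weights));
        // Computed once here instead of on every hashCode() call.
        this.hashCode = Objects.hash(this.name, this.weights);
    }

    List<Double> getWeights() {
        return weights; // unmodifiable view
    }

    @Override
    public int hashCode() {
        return hashCode; // constant-time, regardless of model size
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashModel)) return false;
        CachedHashModel other = (CachedHashModel) o;
        // Cheap hash comparison first, then the full field comparison.
        return hashCode == other.hashCode
                && name.equals(other.name)
                && weights.equals(other.weights);
    }
}
```

With this shape, using the object as a cache or map key on every query costs a single field read for the hash instead of a walk over the whole model. The trade-off is that the model can no longer be modified after construction.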



