Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2018/05/04 22:05:00 UTC
[jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching

    [ https://issues.apache.org/jira/browse/SOLR-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464442#comment-16464442 ] 

David Smiley commented on SOLR-11865:
-------------------------------------

The patch is still in progress, but I want to mention a few things.
 * New name {{useConfiguredOrderForElevations}}.  Documentation language: "When multiple docs are elevated, should their relative order be the order in the configuration file, or should they be subject to the sort criteria? True by default."
 * Found a way to skip ElevationComparatorSource entirely when the Elevation object has no elevations (for example, when it only has exclusions).
 * I was looking in detail at ElevationComparatorSource, which led me to some observations that I'd like your input on:
 ** BytesRef[] termValues is the "value half of the map": the BytesRef form of the ID values, aligned with the ordSet (doc ID) slots.  But its reader, docVal(), has to do an additional lookup in elevation.priorities to get an int.  I think this could be replaced with an int[] populated with the pertinent int priorities when doSetNextReader is called (which is where ordSet and termValues are initialized right now).  This int[] would simply be named priorities.
 ** Elevation.elevatedIds could be a Map<String,Integer> that maps the uniqueKey value directly to its priority (thus removing the need for a separate "priorities" map), and then in doSetNextReader we can iterate over the Map.Entry items and needn't do another lookup.
 ** I wonder if the String IDs in Elevation, both elevated and excluded, ought to be BytesRefs to clarify that they are raw indexed-form IDs?  (Consider the case where uniqueKey is a long.)  The current String form suggests that they are the surface-form IDs, yet they aren't, since they've already been mapped with FieldType.readableToIndexed.  Alternatively, we could keep the surface-form IDs and translate them at a later time.  I think we might as well translate them eagerly, as it saves work during search, even if it's easy work, and again it clarifies the type.
 *** FieldType.readableToIndexed(String) ought to be deprecated in favor of readableToIndexed(CharSequence, BytesRefBuilder).
 ** I guess it's debatable where to actually apply the key String => indexed form (String or BytesRef) conversion... we're doing it in Elevation's constructor with a passed-in UnaryOperator thingy, but it could just as easily be done very late in, say, ElevationComparatorSource.doSetNextReader, or perhaps very early, right after we read it from the XML. I suppose it's fine as-is.
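
To make the int[] priorities and Map<String,Integer> suggestions above concrete, here is a rough sketch in plain Java with no Lucene or Solr types. Everything in it is an assumption for illustration only: the class and method names, the plain Map standing in for the per-segment uniqueKey lookup, and the small open-addressing table standing in for ordSet. It is not the code in the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the idea; not the actual Solr classes. */
final class ElevationPrioritySketch {

  /** Value returned for docs that are not elevated (assumed convention). */
  static final int NO_PRIORITY = 0;

  /** Maps each elevated ID straight to its priority; first configured ID ranks highest. */
  static Map<String, Integer> buildElevatedIds(String... idsInConfiguredOrder) {
    Map<String, Integer> elevatedIds = new LinkedHashMap<>();
    int priority = idsInConfiguredOrder.length;
    for (String id : idsInConfiguredOrder) {
      elevatedIds.putIfAbsent(id, priority--);
    }
    return elevatedIds;
  }

  /** Per-segment state: doc IDs in a slot table, priorities in a parallel int[]. */
  static final class SegmentPriorities {
    private static final int EMPTY = -1;
    private final int[] slotDocIds; // plays the role of ordSet
    private final int[] priorities; // replaces BytesRef[] termValues

    /** Roughly what doSetNextReader would do: one pass over the Map.Entry items. */
    SegmentPriorities(Map<String, Integer> elevatedIds,
                      Map<String, Integer> segmentDocIdsByKey) {
      // Power-of-two capacity, more than twice the number of elevated IDs.
      int capacity = Integer.highestOneBit(Math.max(1, elevatedIds.size()) * 2) * 2;
      slotDocIds = new int[capacity];
      priorities = new int[capacity];
      Arrays.fill(slotDocIds, EMPTY);
      for (Map.Entry<String, Integer> entry : elevatedIds.entrySet()) {
        Integer docId = segmentDocIdsByKey.get(entry.getKey());
        if (docId == null) {
          continue; // this elevated ID is not in the current segment
        }
        int slot = findSlot(docId);
        slotDocIds[slot] = docId;
        priorities[slot] = entry.getValue(); // priority stored once, no later lookup
      }
    }

    /** Linear probing; returns the slot holding docId, or the empty slot where it would go. */
    private int findSlot(int docId) {
      int mask = slotDocIds.length - 1;
      int slot = docId & mask;
      while (slotDocIds[slot] != EMPTY && slotDocIds[slot] != docId) {
        slot = (slot + 1) & mask;
      }
      return slot;
    }

    /** The read path: a slot lookup plus an array read, no map lookup by ID. */
    int docVal(int docId) {
      int slot = findSlot(docId);
      return slotDocIds[slot] == docId ? priorities[slot] : NO_PRIORITY;
    }
  }
}
```

The point of the sketch is only the shape of the data: the priority is resolved once per segment while iterating the entries, so the per-document read no longer goes through the ID to reach a second map.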

> Refactor QueryElevationComponent to prepare query subset matching
> -----------------------------------------------------------------
>
>                 Key: SOLR-11865
>                 URL: https://issues.apache.org/jira/browse/SOLR-11865
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SearchComponents - other
>    Affects Versions: master (8.0)
>            Reporter: Bruno Roustant
>            Priority: Minor
>              Labels: QueryComponent
>             Fix For: master (8.0)
>
>         Attachments: 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch, 0002-Refactor-QueryElevationComponent-after-review.patch, 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch, SOLR-11865.patch
>
>
> The goal is to prepare a second improvement to support query terms subset matching or query elevation rules.
> Before that, we need to refactor the QueryElevationComponent. We make it extendible. We introduce the ElevationProvider interface which will be implemented later in a second patch to support subset matching. The current full-query match policy becomes a default simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.
