Posted to dev@lucene.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2014/11/05 01:22:34 UTC
[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

    [ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197204#comment-14197204 ] 

ASF subversion and git services commented on SOLR-6248:
-------------------------------------------------------

Commit 1636784 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1636784 ]

SOLR-6248: Fixing an exception in case of missing qf

> MoreLikeThis Query Parser
> -------------------------
>
>                 Key: SOLR-6248
>                 URL: https://issues.apache.org/jira/browse/SOLR-6248
>             Project: Solr
>          Issue Type: New Feature
>          Components: query parsers
>            Reporter: Anshum Gupta
>            Assignee: Anshum Gupta
>             Fix For: 5.0
>
>         Attachments: SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch
>
>
> The MLT component doesn't let people highlight or paginate, and the MLT handler comes with the cost of maintaining another piece in the config. Also, any changes to the defaults of the /select handler (number of results to be fetched, etc.) need to be copied to or synced with this handler too.
> An MLT QParser would let users get back docs based on a query, which they could then paginate, highlight, etc. It would also give them the flexibility to use it anywhere, i.e. in q, fq, bq, etc.
> A bit of history about MLT (thanks to Hoss)
> The MLT handler pre-dates the existence of QParsers and was meant to take an arbitrary query as input, find docs that match that query, club them together to find interesting terms, and then use those terms as if they were the main query to generate a main result set.
> This result would then be used as the set to facet, highlight etc.
> The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)
> The MLT component, on the other hand, serves a very different purpose: augmenting the main result set. It is used to get similar docs for each doc in the main result set.
> DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)
> The new approach:
> All of this can be done better and cleaner (and makes more sense too) using an MLT QParser.
> An important case to handle is when the user doesn't have TermVectors; the parser would then do what happens right now, i.e. parse stored fields.
> Also, if the field to be used for MLT isn't indexed, it would need to be a TextField with an index analyzer defined. That analyzer would then be used to extract terms for MLT.
> In case of SolrCloud mode, '/get-termvectors' can be used after looking at the schema (if TermVectors are enabled for the field). If not, a /get call can be used to fetch the field and parse it.
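
The two flows quoted above (handler: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y); component: DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)) can be illustrated with a toy sketch. This is not Solr code: the in-memory DOCS map and the functions search, interesting_terms, mlt_handler_flow and mlt_component_flow are hypothetical names invented for illustration, and term selection here is a plain frequency count rather than whatever scoring MLT really applies.

```python
from collections import Counter

# Toy in-memory "index": doc id -> text. Illustrative only, not a Solr structure.
DOCS = {
    "1": "solr search query parser",
    "2": "solr query parser plugin",
    "3": "lucene index segment merge",
    "4": "coffee brewing guide",
}


def search(terms, rows=10, exclude=()):
    """Stand-in for running a query: rank docs by number of matching terms."""
    wanted = set(terms)
    scored = []
    for doc_id, text in DOCS.items():
        if doc_id in exclude:
            continue
        score = len(wanted & set(text.split()))
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by doc id so results are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:rows]]


def interesting_terms(doc_ids, max_terms=3):
    """Bag (terms): the most frequent terms across the given docs (assumed heuristic)."""
    bag = Counter()
    for doc_id in doc_ids:
        bag.update(DOCS[doc_id].split())
    ranked = sorted(bag.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


def mlt_handler_flow(query_terms, m=2, y=10):
    """Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)."""
    seed_docs = search(query_terms, rows=m)
    return search(interesting_terms(seed_docs), rows=y)


def mlt_component_flow(main_results, m=2):
    """DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)."""
    return {
        doc_id: search(interesting_terms([doc_id]), rows=m, exclude=(doc_id,))
        for doc_id in main_results
    }
```

In this sketch the handler flow returns a single result list that could be paginated or faceted like any other, while the component flow returns one small list per doc of the main result set, which is the distinction the description draws.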



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
