Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2017/12/08 16:47:00 UTC

[jira] [Commented] (LUCENE-8087) Record per-term max term frequencies

    [ https://issues.apache.org/jira/browse/LUCENE-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283818#comment-16283818 ] 

Robert Muir commented on LUCENE-8087:
-------------------------------------

{quote}
Ideally we'd need something like the maximum term frequency for each norm value.
{quote}

I agree this would be ideal: it would let the similarity defer computing the maximum impact to query time. Maybe it's the right tradeoff to look into, if we can get good performance without bloating the index? For big terms it'd be at worst 256 integers (one per possible norm value). For terms appearing only once or twice, the overhead could be kept smaller if we don't encode zeros, etc.
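
To make the idea concrete, here is a minimal sketch of what "maximum term frequency for each norm value" could look like for a single term. It is only an illustration under stated assumptions, not the attached patch and not an existing Lucene API: the class PerNormMaxFreq, the Scorer interface and the toyBm25 function are all hypothetical names, the norm is assumed to fit in one byte (0..255), and the toy scoring function treats the norm byte directly as the document length, which a real similarity would not do.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: per-term max term frequency, tracked per norm value. */
final class PerNormMaxFreq {

    /** Hypothetical stand-in for the part of a similarity that depends on freq and norm. */
    interface Scorer {
        double score(int freq, int normByte);
    }

    // Sparse table: norm byte -> max term frequency among docs having that norm.
    // Norm values that never occur for this term take no space ("don't encode zeros").
    private final TreeMap<Integer, Integer> maxFreqByNorm = new TreeMap<>();

    /** Index time: record one posting of the term (its freq and the doc's norm byte). */
    void record(int normByte, int freq) {
        if (normByte < 0 || normByte > 255) {
            throw new IllegalArgumentException("norm must fit in one byte: " + normByte);
        }
        if (freq <= 0) {
            return; // nothing to store
        }
        maxFreqByNorm.merge(normByte, freq, Integer::max);
    }

    /** Number of stored entries; at most 256 under the one-byte-norm assumption. */
    int size() {
        return maxFreqByNorm.size();
    }

    /** Query time: score upper bound, taken over the stored (norm, maxFreq) pairs. */
    double maxScore(Scorer scorer) {
        double max = 0;
        for (Map.Entry<Integer, Integer> e : maxFreqByNorm.entrySet()) {
            max = Math.max(max, scorer.score(e.getValue(), e.getKey()));
        }
        return max;
    }

    /** Toy BM25-like tf factor. ASSUMPTION: the norm byte is used as the doc length. */
    static double toyBm25(int freq, int normByte) {
        double k1 = 1.2, b = 0.75, avgLen = 100.0; // arbitrary illustrative constants
        double len = normByte;
        return freq / (freq + k1 * (1 - b + b * len / avgLen));
    }

    public static void main(String[] args) {
        PerNormMaxFreq term = new PerNormMaxFreq();
        term.record(10, 1);
        term.record(10, 3);   // short docs, freq at most 3
        term.record(200, 5);  // long docs, freq at most 5
        // Bound using per-norm max freqs, computed at query time.
        System.out.println("per-norm bound: " + term.maxScore(PerNormMaxFreq::toyBm25));
        // Bound using only the global max freq (5) and the shortest possible length.
        System.out.println("global bound:   " + toyBm25(5, 0));
    }
}
```

Because the table stores raw frequencies rather than precomputed scores, the bound is evaluated with whatever similarity is in use at query time. In this toy setup the per-norm bound is tighter than the one obtained from a single per-term max frequency, since the latter has to assume the shortest possible document.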

> Record per-term max term frequencies
> ------------------------------------
>
>                 Key: LUCENE-8087
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8087
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8087.patch
>
>
> I was mostly interested in doing that in order to get better score upper bounds for LUCENE-4100. However, this doesn't help, at least with the tasks that we have for wikimedium10m. I dug into this a bit, and this is due to the fact that the upper bound is not much better if we can't make assumptions about the value of the length. Ideally we'd need something like the maximum term frequency for each norm value. I'll post the patch in case someone has another use-case for per-term max term frequencies.


