Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/01/14 17:47:54 UTC

[jira] [Commented] (LUCENE-5399) PagingFieldCollector is very slow with String fields

    [ https://issues.apache.org/jira/browse/LUCENE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870886#comment-13870886 ] 

Robert Muir commented on LUCENE-5399:
-------------------------------------

Oh, this is great. This was really frustrating when trying to test the missing-value support in DV.

As far as "my solution" goes, I think it's totally bogus, actually: it caches a reference to a mutable thing (BytesRef). I only did it this way because exactly one thing calls this method (PagingFieldCollector), so I knew I could benchmark it. (I only investigated this because some "page 2" results with searchAfter were about 8x slower than the same search without searchAfter; with the current patch they are about 2x slower.)

As far as reversing the checks goes, that sounds fantastic. I think it explains why my page 2 results are still 2x slower: they do twice the work because they read the ordinal twice, since most hits are neither competitive nor previously visited.
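
To make the "reversing the checks" point concrete, here is a rough, self-contained sketch. It is not Lucene code: the class and method names are hypothetical, hits are reduced to bare ordinals, the bottom of the queue is held fixed, and ties on doc id are ignored. It only counts how often the costly "seen on a previous page" check runs under each ordering.

```java
// Hypothetical illustration only; none of these names exist in Lucene.
final class ReversedChecksSketch {

    /**
     * Simulates collecting hits sorted ascending by ordinal.
     * A hit counts as "previously visited" if ord <= afterOrd (simplified: no doc id tie-break),
     * and as "competitive" if ord < bottomOrd (simplified: bottom never changes).
     *
     * @return {acceptedHits, costlyChecks}
     */
    static int[] collect(int[] ords, int afterOrd, int bottomOrd, boolean competitiveFirst) {
        int accepted = 0;
        int costlyChecks = 0;
        for (int ord : ords) {
            if (competitiveFirst) {
                // Cheap rejection first: most hits stop here.
                if (ord >= bottomOrd) {
                    continue;
                }
                costlyChecks++; // stands in for the per-document compare against "after"
                if (ord <= afterOrd) {
                    continue;
                }
            } else {
                // Original order: every hit pays for the "after" comparison.
                costlyChecks++;
                if (ord <= afterOrd) {
                    continue;
                }
                if (ord >= bottomOrd) {
                    continue;
                }
            }
            accepted++;
        }
        return new int[] {accepted, costlyChecks};
    }
}
```

With ords {5, 9, 2, 7, 3, 8, 1, 6}, afterOrd = 2 and bottomOrd = 4, both orderings accept the same single hit, but the original order runs the costly check on all 8 hits while the reversed order runs it on only the 3 competitive ones.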

> PagingFieldCollector is very slow with String fields
> ----------------------------------------------------
>
>                 Key: LUCENE-5399
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5399
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>            Reporter: Robert Muir
>         Attachments: LUCENE-5399.patch, LUCENE-5399.patch
>
>
> PagingFieldCollector (sort comparator) is significantly slower with string fields because of how its "seen on a previous page" check works: it calls compareDocToValue(int doc, T t) first to check this. (It's the only user of this method.)
> This is very slow with String because no ordinals are used, so each document must look up the ord, then look up the bytes, then compare the bytes.
> I think maybe we should replace this method with an 'after' slot, and just have compareDocToAfter or something.
> Otherwise we could use a hack-patch like the one I will upload (I did this just to test the performance, although tests do pass).
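
As a rough illustration of the 'after' slot idea from the description, the sketch below resolves the "after" value to an ordinal once per segment and then compares ordinals per document. It is an assumption-laden stand-in, not the attached patch and not Lucene's API: the class, the sorted String[] "term dictionary", the docToOrd array, and setAfter are all hypothetical. Only the method names compareDocToValue and compareDocToAfter come from the issue text.

```java
import java.util.Arrays;

// Hypothetical stand-in for one segment; not Lucene code.
final class AfterSlotSketch {
    private final String[] sortedTerms; // sorted, unique values in this segment
    private final int[] docToOrd;       // ordinal of each document's value

    private int afterOrd;        // "after" value mapped into this segment's ordinal space
    private boolean afterExact;  // true if the "after" value occurs in this segment

    AfterSlotSketch(String[] sortedTerms, int[] docToOrd) {
        this.sortedTerms = sortedTerms;
        this.docToOrd = docToOrd;
    }

    /** Slow path from the description: ord lookup, value lookup, value comparison per document. */
    int compareDocToValue(int doc, String value) {
        return sortedTerms[docToOrd[doc]].compareTo(value);
    }

    /** Resolve the "after" value once, instead of once per document. */
    void setAfter(String after) {
        int idx = Arrays.binarySearch(sortedTerms, after);
        if (idx >= 0) {
            afterOrd = idx;
            afterExact = true;
        } else {
            // Not present: remember the largest ordinal whose value sorts before "after".
            afterOrd = -idx - 2;
            afterExact = false;
        }
    }

    /** Fast path: a single ordinal read and an int comparison per document. */
    int compareDocToAfter(int doc) {
        int ord = docToOrd[doc];
        if (afterExact) {
            return Integer.compare(ord, afterOrd);
        }
        return ord <= afterOrd ? -1 : 1;
    }
}
```

The sign of compareDocToAfter(doc) should agree with the sign of compareDocToValue(doc, after) for every document, including when the "after" value does not occur in the segment.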


