Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2016/03/11 21:58:12 UTC

[jira] [Updated] (LUCENE-7098) BKDWriter should write ords as ints when possible during offline sort

     [ https://issues.apache.org/jira/browse/LUCENE-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-7098:
---------------------------------------
    Attachment: LUCENE-7098.patch

Patch.  {{BKDWriter}} figures out up front whether it can use {{int}} or {{long}} to write all ords.  The caller must specify the max number of values it will pass to this instance (hmm, I'll add checks to verify the caller didn't exceed what it had promised).

This gives a nice speed up on the 6.1M London UK test, with the final merge of points going from 192.1 sec down to 171.1 sec (about 11% faster).

I'll make sure {{Test2BPoints}} passes with this change.
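
To illustrate the idea, here is a rough sketch of picking the ord width up front and keeping entries fixed-width so random access still works. It is not the code from the patch: the class {{OrdWidthSketch}} and its methods are hypothetical names, it stores only ords (the real writer keeps other per-point data alongside them), and it uses an in-memory {{ByteBuffer}} in place of the offline sort file.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch only; not the actual BKDWriter implementation. */
final class OrdWidthSketch {
  private final boolean longOrds;    // true when ords need 8 bytes
  private final long maxValueCount;  // upper bound promised by the caller
  private long count;                // ords written so far

  OrdWidthSketch(long maxValueCount) {
    if (maxValueCount < 0) {
      throw new IllegalArgumentException("maxValueCount must be >= 0");
    }
    this.maxValueCount = maxValueCount;
    // Ords run from 0 to maxValueCount-1, so an int is enough whenever
    // the promised count itself fits in an int.
    this.longOrds = maxValueCount > Integer.MAX_VALUE;
  }

  /** Width of every ord entry; fixed for the lifetime of this instance. */
  int bytesPerOrd() {
    return longOrds ? Long.BYTES : Integer.BYTES;
  }

  /** Appends one ord, verifying the caller stays within its promise. */
  void writeOrd(ByteBuffer out, long ord) {
    if (count >= maxValueCount) {
      throw new IllegalStateException(
          "caller promised at most " + maxValueCount + " values");
    }
    if (longOrds) {
      out.putLong(ord);
    } else {
      out.putInt((int) ord); // safe: ord < maxValueCount <= Integer.MAX_VALUE
    }
    count++;
  }

  /** Random access: fixed-width entries make the offset a simple multiply. */
  long readOrd(ByteBuffer in, int index) {
    int offset = index * bytesPerOrd();
    return longOrds ? in.getLong(offset) : in.getInt(offset);
  }
}
```

With a variable-width encoding such as vLong, the multiply in {{readOrd}} would no longer locate an entry, which is why the sketch keeps a single fixed width per instance.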

> BKDWriter should write ords as ints when possible during offline sort
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-7098
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7098
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-7098.patch
>
>
> Today we write all ords as longs, since we support more than 2.1B values in one segment, but the vast majority of the time an int would suffice.
> We could look into vLong, but this quickly gets tricky because {{BKDWriter}} needs random access to the file and we rely on fixed-width entries to do this now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)