You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2015/09/03 18:58:46 UTC

[jira] [Commented] (LUCENE-6777) Switch GeoPointTermsEnum range list to use a reusable BytesRef

    [ https://issues.apache.org/jira/browse/LUCENE-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729394#comment-14729394 ] 

Michael McCandless commented on LUCENE-6777:
--------------------------------------------

Thanks [~nknize]!  I don't think we have to perf test before committing ... but can we try to simplify the code a bit?  E.g. things like:

{noformat}
-    if (term.compareTo(rangeBounds.get(0).cell) < 0) {
+    if (term.compareTo(this.nextSubRange = rangeBounds.get(0).asBytesRef(this.nextSubRange)) < 0) {
{noformat}

break that inner assignment out into its own line before?

Another example, instead of:
{noformat}
    final int result = Short.compare(this.shift, other.shift);
    return (result == 0) ? Long.compare(this.start, other.start) : result;
{noformat}

can we do e.g.:

{noformat}
    final int result = Short.compare(this.shift, other.shift);
    if (result == 0) {
      return result;
    }
    return Long.compare(this.start, other.start);
{noformat}

Instead of cloning the somewhat hairy
{{NumericUtils.longToPrefixCodedBytes}}, is it possible to e.g. reuse
a shared {{BytesRefBuilder}}?

Instead of the lazy alloc of reusable can we change the {{asBytesRef}} to
{{fillBytesRef}}, and it takes the non-null {{BytesRef}} in, and
returns void?  It's confusing to both take a parameter and return a result that
are sort of doing the same thing ...


> Switch GeoPointTermsEnum range list to use a reusable BytesRef 
> ---------------------------------------------------------------
>
>                 Key: LUCENE-6777
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6777
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-6777.patch
>
>
> GeoPointTermsEnum currently constructs a BytesRef for every computed range, then sorts on this BytesRef.  This adds an unnecessary memory overhead since the TermsEnum only requires BytesRef on calls to nextSeekTerm and accept and the ranges only need to be sorted by their long representation. This issue adds the following two improvements:
> 1. Lazily compute the BytesRef on demand only when its needed
> 2. Add a single, transient BytesRef to GeoPointTermsEnum
> This will further cut back on heap usage when constructing ranges across every segment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org