
Posted to issues@lucene.apache.org by "Michael McCandless (Jira)" <ji...@apache.org> on 2021/08/17 13:21:00 UTC

[jira] [Created] (LUCENE-10052) Add LuceneTestCase.newBytesRef methods

Michael McCandless created LUCENE-10052:
-------------------------------------------

             Summary: Add LuceneTestCase.newBytesRef methods
                 Key: LUCENE-10052
                 URL: https://issues.apache.org/jira/browse/LUCENE-10052
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless


{{BytesRef}} is a super useful Lucene utility class, referencing a slice (offset + length) of an underlying possibly larger {{byte[]}}.  We use it all over the place in our APIs.

But the {{offset}} is trappy – we programmers sometimes forget to add the {{offset}} when accessing the underlying bytes, or we accidentally add it twice, as just happened in our (Amazon Product Search) Lucene usage.  Such errors are devious because they usually stay hidden: {{offset}} is typically zero, but when the rare {{BytesRef}} arrives with a non-zero {{offset}}, BOOM.
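To make the trap concrete, here is a minimal sketch of both mistakes. It is illustrative only: {{BytesRefStandIn}} is a hypothetical stand-in that mirrors the {{bytes}}, {{offset}} and {{length}} fields so the example compiles without Lucene, and the three accessor methods are made up for this example.

```java
public class OffsetTrapDemo {

  // Hypothetical stand-in for BytesRef (bytes + offset + length), not Lucene's class.
  static final class BytesRefStandIn {
    final byte[] bytes;
    final int offset;
    final int length;

    BytesRefStandIn(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  // Correct: the i-th byte of the slice lives at offset + i.
  static byte byteAtCorrect(BytesRefStandIn ref, int i) {
    return ref.bytes[ref.offset + i];
  }

  // Bug 1: forgets the offset entirely.
  static byte byteAtForgotOffset(BytesRefStandIn ref, int i) {
    return ref.bytes[i];
  }

  // Bug 2: adds the offset twice.
  static byte byteAtDoubleOffset(BytesRefStandIn ref, int i) {
    return ref.bytes[ref.offset + ref.offset + i];
  }

  public static void main(String[] args) {
    // Common case: offset == 0, so all three variants agree and the bugs stay hidden.
    BytesRefStandIn zero = new BytesRefStandIn(new byte[] {'a', 'b', 'c'}, 0, 3);
    System.out.println((char) byteAtCorrect(zero, 0)
        + " " + (char) byteAtForgotOffset(zero, 0)
        + " " + (char) byteAtDoubleOffset(zero, 0)); // prints "a a a"

    // Rare case: the same content "abc" sits at offset 2 of a larger array.
    byte[] backing = {'x', 'y', 'a', 'b', 'c', 'z', 'z'};
    BytesRefStandIn shifted = new BytesRefStandIn(backing, 2, 3);
    System.out.println((char) byteAtCorrect(shifted, 0)
        + " " + (char) byteAtForgotOffset(shifted, 0)
        + " " + (char) byteAtDoubleOffset(shifted, 0)); // prints "a x c"
  }
}
```

With a zero {{offset}} the buggy accessors are indistinguishable from the correct one, which is why tests that only ever build zero-offset instances cannot catch them.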

I think we should improve our testing here by making it simple to randomize {{BytesRef}} creation: sometimes use a non-zero {{offset}}, and sometimes leave extra padding at the end of the underlying {{byte[]}}.  The padding catches another trappy case, where we use {{bytesRef.bytes.length}} when we were supposed to use {{bytesRef.length}}.
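One possible shape for such a helper is sketched below. This is an assumption about how it could look, not the implementation this issue ends up with: the method name {{newBytesRef}} comes from the issue title, but its signature, the {{Random}} parameter, the padding bound of 8 and the {{BytesRefStandIn}} / {{copyValid}} helpers are all hypothetical, chosen so the example runs without Lucene.

```java
import java.util.Arrays;
import java.util.Random;

public class RandomBytesRefSketch {

  // Hypothetical stand-in for BytesRef (bytes + offset + length), not Lucene's class.
  static final class BytesRefStandIn {
    final byte[] bytes;
    final int offset;
    final int length;

    BytesRefStandIn(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  // Wraps `content` in a slice that sometimes has a non-zero offset and
  // sometimes has extra bytes after the valid region (assumed bounds: 1..8).
  static BytesRefStandIn newBytesRef(Random random, byte[] content) {
    int offset = random.nextBoolean() ? 0 : 1 + random.nextInt(8);
    int padding = random.nextBoolean() ? 0 : 1 + random.nextInt(8);
    byte[] backing = new byte[offset + content.length + padding];
    // Fill with junk first so a stray read outside the slice returns visibly wrong data.
    random.nextBytes(backing);
    System.arraycopy(content, 0, backing, offset, content.length);
    return new BytesRefStandIn(backing, offset, content.length);
  }

  // Copies only the valid region, honoring both offset and length.
  static byte[] copyValid(BytesRefStandIn ref) {
    return Arrays.copyOfRange(ref.bytes, ref.offset, ref.offset + ref.length);
  }

  public static void main(String[] args) {
    Random random = new Random(42);
    byte[] content = {1, 2, 3, 4};
    for (int i = 0; i < 5; i++) {
      BytesRefStandIn ref = newBytesRef(random, content);
      System.out.println("offset=" + ref.offset
          + " length=" + ref.length
          + " bytes.length=" + ref.bytes.length
          + " roundTrips=" + Arrays.equals(content, copyValid(ref)));
    }
  }
}
```

Code that reads {{ref.bytes[i]}} or loops up to {{ref.bytes.length}} would then fail on a good fraction of randomized instances instead of only on the rare production input.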


