Posted to issues@lucene.apache.org by "Arvind Kumar Sahu (Jira)" <ji...@apache.org> on 2021/06/22 14:09:00 UTC

[jira] [Created] (LUCENE-10013) Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766)

Arvind Kumar Sahu created LUCENE-10013:
------------------------------------------

             Summary: Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766)
                 Key: LUCENE-10013
                 URL: https://issues.apache.org/jira/browse/LUCENE-10013
             Project: Lucene - Core
          Issue Type: Task
    Affects Versions: 4.10.4
            Reporter: Arvind Kumar Sahu


Hi Team,

We are currently using Lucene 4.10.4 and are getting the following error:

"Document contains at least one immense term in field (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[-41, -103, -41, -87, -41, -103, -41, -111, -41, -108, 32, 56, 56, 45, -41, -108, -41, -111, -41, -107, -41, -89, -41, -88, 44, 32, 40, 32, 49, 51]...', original message: bytes can be at most 32766 in length; got 35169".
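
For reference, a minimal sketch (plain JDK, no Lucene dependency) of how the byte prefix in the message can be decoded and how a value can be checked against the 32766-byte limit before indexing. The class and method names (ImmenseTermCheck, decodePrefix, utf8Length, isImmense) are hypothetical helpers for illustration, not Lucene APIs. The limit constant is copied from the error message. The sketch assumes the prefix bytes are UTF-8, as the message states.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: illustrates the byte-length limit only.
final class ImmenseTermCheck {
    // Maximum term length in bytes, as quoted in the error message.
    static final int MAX_TERM_BYTES = 32766;

    // The 30-byte prefix printed in the error message, copied verbatim.
    static final byte[] REPORTED_PREFIX = {
        -41, -103, -41, -87, -41, -103, -41, -111, -41, -108, 32, 56, 56, 45,
        -41, -108, -41, -111, -41, -107, -41, -89, -41, -88, 44, 32, 40, 32, 49, 51
    };

    // Decode a UTF-8 byte prefix back into readable text.
    static String decodePrefix(byte[] prefix) {
        return new String(prefix, StandardCharsets.UTF_8);
    }

    // The limit applies to the UTF-8 encoded length, not to String.length().
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the value, indexed as a single term, would exceed the limit.
    static boolean isImmense(String value) {
        return utf8Length(value) > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("Decoded prefix: " + decodePrefix(REPORTED_PREFIX));

        // 20000 Hebrew letters: under 32766 characters, but 2 bytes each in UTF-8.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            sb.append('\u05D9');
        }
        String value = sb.toString();
        System.out.println("chars=" + value.length()
            + " utf8Bytes=" + utf8Length(value)
            + " immense=" + isImmense(value));
    }
}
```

The prefix decodes to Hebrew text mixed with digits and punctuation, so each Hebrew letter counts as two bytes toward the limit. A value can therefore exceed 32766 bytes while its character count is well below that number.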

We understand from the Lucene JIRA ticket LUCENE-5472 ("Long terms should generate a RuntimeException, not just infoStream") that this issue was resolved in 4.8 and 6.0.

Please confirm whether this fix is included in 4.10.4.


