Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2021/06/07 16:41:00 UTC

[jira] [Created] (LUCENE-9994) Can IndexingChain better protect against large documents?

Adrien Grand created LUCENE-9994:
------------------------------------

             Summary: Can IndexingChain better protect against large documents?
                 Key: LUCENE-9994
                 URL: https://issues.apache.org/jira/browse/LUCENE-9994
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand


It's easy for a single document to use several times the amount of RAM configured on IndexWriter, either by having many fields or many terms in a single field. Could we improve IndexingChain to reject such documents before they cause an out-of-memory error? We could mark such documents as born deleted in the new segment, as we already do when consuming a TokenStream raises an exception.
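A minimal sketch of the kind of per-document accounting this suggests. The class and method names (DocumentRamGuard, addField, maxBytesPerDoc) are hypothetical illustrations, not Lucene API; the real change would live inside IndexingChain's existing memory tracking:

```java
/**
 * Hypothetical sketch: track estimated RAM used while indexing one document
 * and reject it once it exceeds a per-document budget, before it can blow
 * past the writer's configured buffer. Not Lucene API.
 */
public class DocumentRamGuard {
    private final long maxBytesPerDoc;
    private long bytesUsed;

    public DocumentRamGuard(long maxBytesPerDoc) {
        this.maxBytesPerDoc = maxBytesPerDoc;
    }

    /** Account for the estimated cost of one field; throw once over budget. */
    public void addField(String fieldName, long estimatedBytes) {
        bytesUsed += estimatedBytes;
        if (bytesUsed > maxBytesPerDoc) {
            // Analogous to how a document becomes "born deleted" when a
            // TokenStream throws: abort this document, keep the writer alive.
            throw new IllegalArgumentException(
                "Document exceeds per-document RAM budget: " + bytesUsed
                + " > " + maxBytesPerDoc + " bytes (while adding field '"
                + fieldName + "')");
        }
    }

    public long bytesUsed() {
        return bytesUsed;
    }
}
```

The design question the sketch leaves open is where the budget comes from: a fixed fraction of the IndexWriter RAM buffer, or a separate configurable limit.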



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
