Posted to dev@lucene.apache.org by "Jack Krupansky (JIRA)" <ji...@apache.org> on 2012/06/03 23:25:22 UTC

[jira] [Commented] (LUCENE-4104) Clearly document the limit for maximum number of documents in a single index

    [ https://issues.apache.org/jira/browse/LUCENE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288263#comment-13288263 ] 

Jack Krupansky commented on LUCENE-4104:
----------------------------------------

I propose to add these constants to IndexWriter:

static final int MAX_DOCUMENT_NUMBER = Integer.MAX_VALUE - 1;
static final int MIN_DOCUMENT_NUMBER = 0;
static final int MAX_DOCUMENT_COUNT = MAX_DOCUMENT_NUMBER - MIN_DOCUMENT_NUMBER + 1;

And add these to IndexReader for convenience:

static final int MAX_DOCUMENT_NUMBER = IndexWriter.MAX_DOCUMENT_NUMBER;
static final int MIN_DOCUMENT_NUMBER = IndexWriter.MIN_DOCUMENT_NUMBER;
static final int MAX_DOCUMENT_COUNT = IndexWriter.MAX_DOCUMENT_COUNT;

Add to the IndexWriter class javadoc at the class level and in the addDocument, addDocuments, and maybe updateDocuments methods:

"NOTE: the maximum number of documents in a single Lucene index is defined by MAX_DOCUMENT_COUNT, which is 2,147,483,647 in the current implementation. In practice that number will be reduced by deleted documents and may not be achievable with the memory available in the JVM."
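
As a quick check of the arithmetic behind the proposed constants, here is a minimal sketch. The holder class DocLimits is hypothetical and exists only so the snippet compiles on its own; the proposal places the constants on IndexWriter and IndexReader.

```java
// Hypothetical holder class, used here only to make the example self-contained.
// The proposal above would declare these constants on IndexWriter instead.
final class DocLimits {
    // Highest document number, leaving room for the count to fit in an int.
    static final int MAX_DOCUMENT_NUMBER = Integer.MAX_VALUE - 1;
    // Document numbers start at zero.
    static final int MIN_DOCUMENT_NUMBER = 0;
    // Number of distinct document numbers in [MIN, MAX], inclusive.
    static final int MAX_DOCUMENT_COUNT = MAX_DOCUMENT_NUMBER - MIN_DOCUMENT_NUMBER + 1;

    private DocLimits() {}

    public static void main(String[] args) {
        System.out.println(MAX_DOCUMENT_NUMBER);                      // 2147483646
        System.out.println(MAX_DOCUMENT_COUNT);                       // 2147483647
        System.out.println(MAX_DOCUMENT_COUNT == Integer.MAX_VALUE);  // true
    }
}
```

The count works out to exactly Integer.MAX_VALUE, so the expression does not overflow an int.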

                
> Clearly document the limit for maximum number of documents in a single index
> ----------------------------------------------------------------------------
>
>                 Key: LUCENE-4104
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4104
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 3.6
>            Reporter: Jack Krupansky
>            Priority: Minor
>
> Although the "int" in a number of APIs strongly suggests the approximate limit to the number of documents that can exist in a single Lucene index, it would be useful to have the specific number more clearly documented.
> My reading suggests that the maximum document number is 2^31-2, so that the count of documents, numbered 0 to 2^31-2, will fit in an int as Integer.MAX_VALUE, which is 2^31-1 or 2,147,483,647.
> Symbolic definitions of the maximum document number and the maximum number of documents, as well as the first document number, should also be provided.
> A subsequent issue will be to detect and throw an exception when that limit is exceeded.
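
One way such a check might look, as a rough sketch only: the class DocCountGuard and the method checkedAdd are hypothetical names, not existing Lucene APIs, and the choice of exception types is an assumption. The sum is computed as a long so the comparison itself cannot overflow.

```java
// Hypothetical helper, not part of Lucene; it only illustrates the overflow-safe check.
final class DocCountGuard {
    // Assumed limit, matching the MAX_DOCUMENT_COUNT value proposed in the comment.
    static final int MAX_DOCUMENT_COUNT = Integer.MAX_VALUE;

    private DocCountGuard() {}

    // Returns the new document count, or throws if it would exceed the limit.
    static int checkedAdd(int currentCount, int toAdd) {
        if (currentCount < 0 || toAdd < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Widen to long before adding so the sum cannot wrap around.
        long total = (long) currentCount + toAdd;
        if (total > MAX_DOCUMENT_COUNT) {
            throw new IllegalStateException(
                "too many documents: " + total + " exceeds " + MAX_DOCUMENT_COUNT);
        }
        return (int) total;
    }
}
```

For example, checkedAdd(Integer.MAX_VALUE - 1, 1) returns Integer.MAX_VALUE, while checkedAdd(Integer.MAX_VALUE, 1) throws.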
