Posted to dev@lucene.apache.org by "Uwe Schindler (Updated) (JIRA)" <ji...@apache.org> on 2012/03/24 18:18:25 UTC

[jira] [Updated] (LUCENE-3659) Improve Javadocs of RAMDirectory to document its limitations

     [ https://issues.apache.org/jira/browse/LUCENE-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3659:
----------------------------------

    Attachment: LUCENE-3659.patch

I started to work on this; here is just a first step (trunk). This patch removes the BUFFER_SIZE constant and moves it up to RAMDirectory (but for now only as a default, see below!). For now, RAMDirectory passes the default buffer size on to its RAMFile children (newRAMFile() method), but this is likely to change (see below).

As every RAMFile has its own buffer size, optimizations are possible:
- When you open an IndexOutput, in trunk we get the IOContext, which may contain a Merge/Flush description holding the complete segment size (unfortunately the *complete* segment size, not the size of the individual file). But this number can be used as an order of magnitude for specifying the buffer size.

The patch does not yet implement that, but one idea would be to allocate 1/32 of the segment size as the buffer size. That way the buffer size does not get too big, while the number of slices has an upper limit (approx. 32 slices per merged segment). Currently a merged segment with a size of, say, 32 gigabytes would have about 32 million byte[] arrays (at 1024 bytes each); after the change it would have only 32 byte[] arrays with a size of 1 gigabyte each. This should make GC happy.
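
To make the arithmetic above concrete, here is a rough standalone sketch of the 1/32 idea. It is not part of the patch and does not use any Lucene classes; the class name, the method names and the 1 GiB cap are assumptions made up for illustration only.

```java
// Hypothetical sketch of the "1/32 of the segment size" heuristic.
// Nothing here is Lucene API; all names are invented for illustration.
public class BufferSizeSketch {

    // Default buffer size mentioned in this comment (raised from 1024 to 8192).
    static final int DEFAULT_BUFFER_SIZE = 8192;

    // Assumed upper bound so the result always fits into a Java byte[] length.
    static final int MAX_BUFFER_SIZE = 1 << 30;

    // Pick a buffer size of roughly 1/32 of the estimated segment size,
    // never below the default and never above the assumed cap.
    static int bufferSizeFor(long estimatedSegmentBytes) {
        if (estimatedSegmentBytes <= 0) {
            return DEFAULT_BUFFER_SIZE; // no size hint available
        }
        long candidate = estimatedSegmentBytes / 32;
        return (int) Math.max(DEFAULT_BUFFER_SIZE, Math.min(MAX_BUFFER_SIZE, candidate));
    }

    // Number of byte[] buffers needed to hold a file of the given size.
    static long bufferCount(long fileBytes, int bufferSize) {
        return (fileBytes + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        long segment = 32L << 30; // 32 GiB merged segment
        int adaptive = bufferSizeFor(segment);
        System.out.println("buffers at 1024 bytes: " + bufferCount(segment, 1024));
        System.out.println("adaptive buffer size:  " + adaptive);
        System.out.println("buffers at adaptive:   " + bufferCount(segment, adaptive));
    }
}
```

The lower bound matters for small flushes: without it, a tiny segment would get a buffer size far below the default and produce needlessly many small arrays.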

When backporting to 3.x, the IOContext is not yet available, so RAMDirectory always uses the default buffer size (maybe randomize it in tests). Raising the default buffer size should still bring improvements here.

We should still add some warnings to the Javadocs that for *large* indexes it is often preferable to use MMapDirectory, especially when the index is stored on disk. We should also tell people that new RAMDirectory(OtherDirectory) may be a bad idea...

The new default buffer size was raised from 1024 to 8192.
                
> Improve Javadocs of RAMDirectory to document its limitations
> ------------------------------------------------------------
>
>                 Key: LUCENE-3659
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3659
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 3.5, 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3659.patch
>
>
> Spinoff from several dev@lao issues:
> - [http://mail-archives.apache.org/mod_mbox/lucene-dev/201112.mbox/%3C001001ccbf1c%2471845830%24548d0890%24%40thetaphi.de%3E]
> - issue LUCENE-3653
> The use cases for RAMDirectory are very limited and to prevent users from using it for e.g. loading a 50 Gigabyte index from a file on disk, we should improve the javadocs.
