Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2011/06/14 01:05:49 UTC

[jira] [Commented] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2

    [ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048858#comment-13048858 ] 

Robert Muir commented on LUCENE-3200:
-------------------------------------

Also, we can fix the issue Shai brought up during the 3.1 VOTE while we are here.

In seek(long pos) I think we should do:
{code}
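try {
  ...
  position()
  ...
} catch (IllegalArgumentException e) {
  // position() rejects both negative and too-large arguments
  if (pos < 0)
    throw e; // negative seek: rethrow the original exception
  else
    throw new IOException("read past EOF");
}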
{code}

This would be more consistent with NIOFS/SimpleFS from an exception perspective.
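
To make the idea concrete, here is a minimal, self-contained sketch of that exception translation. It is not the actual MMapIndexInput code: the class name SeekSketch is hypothetical, it wraps a single plain ByteBuffer instead of a mapped file, and the explicit guard for positions beyond Integer.MAX_VALUE is my own addition so that the int cast cannot wrap around.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a single-buffer input; NOT the real Lucene class.
class SeekSketch {
    private final ByteBuffer buf;

    SeekSketch(ByteBuffer buf) {
        this.buf = buf;
    }

    void seek(long pos) throws IOException {
        // Assumption of this sketch: a position that does not fit in an int
        // is always past the end of a single buffer.
        if (pos > Integer.MAX_VALUE) {
            throw new IOException("read past EOF");
        }
        try {
            // Buffer.position(int) throws IllegalArgumentException when the
            // argument is negative or larger than the buffer's limit.
            buf.position((int) pos);
        } catch (IllegalArgumentException e) {
            if (pos < 0) {
                throw e; // negative seek is a caller bug: keep the original exception
            } else {
                throw new IOException("read past EOF"); // same type NIOFS/SimpleFS callers expect
            }
        }
    }
}
```

With a 16-byte buffer, seek(8) succeeds, seek(17) surfaces as an IOException, and seek(-1) still surfaces as an IllegalArgumentException.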


> Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3200
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3200
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Uwe Schindler
>
> Robert and I discussed, following Mike's investigations, that using SingleMMapIndexInput together with MultiMMapIndexInput sometimes leads to HotSpot slowdowns.
> We had the following ideas:
> - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the switching between buffer boundaries is done in exception catch blocks. So the normal code path is always the same as for Single*.
> - Only the seek method uses strange calculations (the modulo is totally bogus; it could simply be: int bufOffset = (int) (pos % maxBufSize); the original code calculates the modulo in a very strange way).
> - For speed, we suggest no longer using arbitrary buffer sizes. We should pass only the power of 2 to the IndexInput as its size. All calculations in seek and anywhere else would then be simple bit shifts and AND operations (the AND masks for the modulo can be calculated in the ctor, like NumericUtils does when calculating precisionSteps).
> - The maximum buffer size will now be 2^30, not 2^31-1. But that's not an issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, as it no longer fits page boundaries and mmapping gets harder for the O/S.
> We will provide a patch with those cleanups.
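
The shift-and-mask arithmetic proposed in the description can be sketched roughly as follows. This only illustrates the calculation and is not taken from the patch: the class ChunkMath and its field and method names are hypothetical, and the sketch assumes non-negative positions.

```java
// Hypothetical helper illustrating the arithmetic; names are NOT from the Lucene patch.
final class ChunkMath {
    final int chunkSizePower; // log2 of the buffer size, capped at 30 (2^30 bytes)
    final long chunkSizeMask; // bufferSize - 1, computed once in the constructor

    ChunkMath(int chunkSizePower) {
        if (chunkSizePower < 0 || chunkSizePower > 30) {
            throw new IllegalArgumentException("chunkSizePower must be in [0, 30]");
        }
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1L;
    }

    // Index of the buffer holding absolute position pos (replaces pos / bufferSize).
    int bufferIndex(long pos) {
        return (int) (pos >>> chunkSizePower);
    }

    // Offset inside that buffer (replaces pos % bufferSize).
    int bufferOffset(long pos) {
        return (int) (pos & chunkSizeMask);
    }
}
```

For example, with chunkSizePower = 4 (16-byte buffers), position 37 maps to buffer index 2 and offset 5, the same result as 37 / 16 and 37 % 16. Because every buffer size is a power of 2 no larger than 2^30, each buffer also stays aligned to page boundaries.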
