Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2009/07/05 19:12:13 UTC

Memory mapping portability

Greets,

InStream_Buf() and InStream_Advance_Buf(), as described at...

  http://mail-archives.apache.org/mod_mbox/lucene-lucy-dev/200810.mbox/%3C20081003174901.GA8683@rectangular.com%3E

... were implemented for Unixen a while ago in our KS prototype, using mmap().  
A fallback implementation using streamed io was left in place; Windows had
been using that fallback until a few days ago.  However, I've now finally
finished the Windows mapping implementation, which uses CreateFile(),
CreateFileMapping(), and MapViewOfFile() -- and it seems to be working great.

On 64-bit systems, we map the whole file as soon as it's opened.  On 32-bit
systems, Buf() and Advance_Buf() use a windowing technique to conserve
addressable space.
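
To make the windowing idea concrete, here is a minimal sketch of the offset
arithmetic involved.  It is not taken from the KS source: the Window struct
and window_for() are hypothetical names, and the only assumption is that a
mapping must start on a multiple of the system's mapping granularity, so the
requested file position gets rounded down and the caller skips forward inside
the mapped region.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one mapped window; not the KS struct. */
typedef struct {
    uint64_t map_start;  /* file offset where the mapping begins (aligned) */
    uint64_t map_len;    /* number of bytes to map from map_start */
    uint64_t skip;       /* distance from map_start to the requested byte */
} Window;

/* Compute an aligned window covering up to `want` bytes at `file_pos`.
 * `granularity` is the mapping alignment (page size on Unixen, allocation
 * granularity on Windows).  The window never extends past `file_len`. */
static Window
window_for(uint64_t file_pos, uint64_t want, uint64_t file_len,
           uint64_t granularity)
{
    Window w;
    assert(granularity > 0);

    /* Clamp so the arithmetic below cannot underflow. */
    if (file_pos > file_len) { file_pos = file_len; }

    /* Round the start down to the nearest aligned offset. */
    w.map_start = file_pos - (file_pos % granularity);
    w.skip      = file_pos - w.map_start;

    /* Clamp the end of the request to the end of the file. */
    uint64_t end = file_pos + want;
    if (end > file_len) { end = file_len; }

    w.map_len = end - w.map_start;
    return w;
}
```

The resulting map_start and map_len are what would be handed to mmap() or
MapViewOfFile(); the caller's buf pointer would then be the mapped address
plus skip.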

Each InStream can be asked to provide at most one "buf" at a time.  By
default, the size of that buf is sysconf(_SC_PAGESIZE) on Unixen -- typically
4k -- and dwAllocationGranularity from GetSystemInfo() on Windows -- typically
64k.  For certain files, callers may request that Buf() provide them more than
that -- for example, SortReader maps all sort cache files in their entirety.
Nevertheless, since we aren't mapping whole postings files, we shouldn't run
out of addressable space for indexes of any practical size.
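
For illustration, the default buf size described above could be queried
roughly as follows.  This is a sketch, not the KS code: default_buf_size() is
a hypothetical name, and the 4096-byte value used when sysconf() fails is my
own assumption rather than anything the prototype does.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
  #include <windows.h>
#else
  #include <unistd.h>
#endif

/* Hypothetical helper: return the default mapping window size. */
static size_t
default_buf_size(void)
{
    size_t size;
#ifdef _WIN32
    /* File views must start on a multiple of the allocation granularity. */
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    size = (size_t)info.dwAllocationGranularity;
#else
    /* mmap() offsets must be a multiple of the page size. */
    long page = sysconf(_SC_PAGESIZE);
    size = page > 0 ? (size_t)page : 4096;  /* fallback is an assumption */
#endif
    assert(size > 0);
    return size;
}
```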

We can keep the streaming io fallback around for systems which don't provide
either mman.h or windows.h, though I don't imagine there are too many of those
around these days.

Marvin Humphrey