Posted to common-user@hadoop.apache.org by Phantom <gh...@gmail.com> on 2007/06/20 17:50:29 UTC
MapFile inner workings
Hi All
I know this is a tall ask. I am going through the source code, but could
someone please explain the intuition behind the design of the MapFile class?
If I were using a MapFile against the local file system, are there any
limits on the number of items I can store? That is, can I have a MapFile
on the local filesystem holding, say, 10GB of data? I ask because the
documentation says it behooves one to keep the key small, since the index
is kept entirely in memory. Could someone please enlighten me?
Thanks
Avinash
Re: MapFile inner workings
Posted by Doug Cutting <cu...@apache.org>.
Every 128th key is held in memory. So if you've got 1M keys in a
MapFile, opening a MapFile.Reader reads roughly 8k of them (1M / 128)
into memory. Binary search is used on these in-memory keys, so at most
127 on-disk entries must be scanned per random access.
Doug
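(The scheme Doug describes can be sketched in plain Java. This is an
illustrative toy, not Hadoop's actual implementation: the class, field, and
method names below are made up, and a real MapFile index maps sampled keys to
file offsets in the data file rather than array positions. Only the default
sampling interval of 128 is taken from the thread.)

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of MapFile's sparse-index lookup. Names are illustrative only.
public class SparseIndexDemo {
    // Default sampling interval: every 128th key goes into the index.
    static final int INDEX_INTERVAL = 128;

    static String[] keys;                                    // stands in for the sorted on-disk data file
    static List<String> indexKeys = new ArrayList<>();       // every 128th key, kept in memory
    static List<Integer> indexPositions = new ArrayList<>(); // position of each sampled key

    static void init(int n) {
        keys = new String[n];
        for (int i = 0; i < n; i++) keys[i] = String.format("key%07d", i);
        indexKeys.clear();
        indexPositions.clear();
        for (int i = 0; i < n; i += INDEX_INTERVAL) {        // sample every 128th key
            indexKeys.add(keys[i]);
            indexPositions.add(i);
        }
    }

    // Binary-search the in-memory index for the largest sampled key <= target,
    // then scan forward at most INDEX_INTERVAL entries in the "data file".
    static int lookup(String key) {
        int lo = 0, hi = indexKeys.size() - 1, seek = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                seek = indexPositions.get(mid);              // candidate seek point
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        for (int i = seek; i < keys.length && i < seek + INDEX_INTERVAL; i++) {
            if (keys[i].equals(key)) return i;
        }
        return -1;                                           // key not present
    }

    public static void main(String[] args) {
        init(1_000_000);
        // 1,000,000 keys sampled every 128 -> 7,813 index entries in memory
        System.out.println("index entries: " + indexKeys.size());
        System.out.println("key0543210 found at: " + lookup("key0543210"));
    }
}
```

This is why only the keys (and one entry per 128 of them) need to fit in
memory: the 10GB of values stay on disk, and each random access costs one
seek plus a scan of at most 127 entries.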