You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Dragon Fly <dr...@hotmail.com> on 2009/07/23 15:37:42 UTC

Loading an index into memory

Hi,

I have a question regarding RAMDirectory.  I have a 5 GB index on disk and it is opened like the following:

  searcher = new IndexSearcher (new RAMDirectory (indexDirectory));

Approximately how much memory is needed to load the index? 5GB of memory or 10GB because of Unicode? Does the entire index get loaded into memory or only parts of it? Thank you.


_________________________________________________________________
Windows Live™ Hotmail®: Celebrate the moment with your favorite sports pics. Check it out.
http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM_WL_QA_HM_sports_photos_072009&cat=sports

Re: Loading an index into memory

Posted by Thomas Becker <th...@net-m.de>.
We've a centralized lucene index running on a nfs share. This index gets
an update per 30 min. The LuceneServer nodes will notice the update and
copy the index (about 2,5gig) to a local tmpfs directory. Searching is
way faster in our case compared to a local disk.
However eks' concerns are valid and in the beginning we had some issues
with the OS starting to swap out things without a noticeable reason.
However this swapping occured rarely and did not affect performance
severly. Now the boxes got some more main memory and the problem is
gone. Two LuceneServer instances are sufficient to handle up to 60 heavy
search requests per second, which involves also some object generation,
etc.

I've tried RAMDirectory first, but we went the tmpfs way quite fast.
Sadly I can't remember what the main reason of our decision has been.

eks dev wrote:
> I do not know much about RAM FS, but I know for sure if you have enough memory for RAMDirectory, you should go for it. That gives you the fastest and the most stable performance, no OS swaps, no sudden performance drops... Uwe's tip is very good, if you/OS occasionally need RAM for other things, so that OS can borrow some from your index. This swapping comes with price, which can or cannot be ok for you. 
>
>   
>
> ----- Original Message ----
>   
>> From: Otis Gospodnetic <ot...@yahoo.com>
>> To: java-user@lucene.apache.org
>> Sent: Thursday, 23 July, 2009 18:55:57
>> Subject: Re: Loading an index into memory
>>
>> I haven't verified this myself, but I remember talking to somebody who tried 
>> MMapDirectory and compared it to simply using tmpfs (RAM FS).  The result was 
>> that MMapDirectory had some memory overhead, so putting the index on tmpfs was 
>> more memory-efficient.  I guess this person had read-only indices, so tmpfs was 
>> an option.
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>>
>> ----- Original Message ----
>>     
>>> From: Uwe Schindler 
>>> To: java-user@lucene.apache.org
>>> Sent: Thursday, July 23, 2009 9:47:24 AM
>>> Subject: RE: Loading an index into memory
>>>
>>> The size is in bytes and the RAMDirectory stores the bytes in bytes, so size
>>> is equal. I would suggest to not copy the dir into a RAMdirectory. It is
>>> better to use MMapDirectory in this case, as it "swaps" the files into
>>> address space like a normal OS swap file. The OS kernel will automatically
>>> swap needed parts into physical RAM. In this case the Java Heap is not
>>> wasted and only needed parts are swapped into RAM.
>>>
>>> -----
>>> UWE SCHINDLER
>>> Webserver/Middleware Development
>>> PANGAEA - Publishing Network for Geoscientific and Environmental Data
>>> MARUM - University of Bremen
>>> Room 2500, Leobener Str., D-28359 Bremen
>>> Tel.: +49 421 218 65595
>>> Fax:  +49 421 218 65505
>>> http://www.pangaea.de/
>>> E-mail: uschindler@pangaea.de
>>>
>>>       
>>>> -----Original Message-----
>>>> From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
>>>> Sent: Thursday, July 23, 2009 3:38 PM
>>>> To: java-user@lucene.apache.org
>>>> Subject: Loading an index into memory
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I have a question regarding RAMDirectory.  I have a 5 GB index on disk and
>>>> it is opened like the following:
>>>>
>>>>   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
>>>>
>>>> Approximately how much memory is needed to load the index? 5GB of memory
>>>> or 10GB because of Unicode? Does the entire index get loaded into memory
>>>> or only parts of it? Thank you.
>>>>
>>>>
>>>> _________________________________________________________________
>>>> Windows LiveT HotmailR: Celebrate the moment with your favorite sports
>>>> pics. Check it out.
>>>> http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
>>>> _WL_QA_HM_sports_photos_072009&cat=sports
>>>>         
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>       
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>     
>
>
>
>       
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   

-- 
Thomas Becker
Senior JEE Developer

net mobile AG
Zollhof 17
40221 Düsseldorf
GERMANY

Phone:    +49 211 97020-195
Fax:      +49 211 97020-949
Mobile:   +49 173 5146567 (private)
E-Mail:   mailto:thomas.becker@net-m.de
Internet: http://www.net-m.de

Registergericht:  Amtsgericht Düsseldorf, HRB 48022
Vorstand:         Theodor Niehues (Vorsitzender), Frank Hartmann,
                 Kai Markus Kulas, Dieter Plassmann
Vorsitzender des
Aufsichtsrates:   Dr. Michael Briem 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Loading an index into memory

Posted by eks dev <ek...@yahoo.co.uk>.
I do not know much about RAM FS, but I know for sure if you have enough memory for RAMDirectory, you should go for it. That gives you the fastest and the most stable performance, no OS swaps, no sudden performance drops... Uwe's tip is very good, if you/OS occasionally need RAM for other things, so that OS can borrow some from your index. This swapping comes with price, which can or cannot be ok for you. 

  

----- Original Message ----
> From: Otis Gospodnetic <ot...@yahoo.com>
> To: java-user@lucene.apache.org
> Sent: Thursday, 23 July, 2009 18:55:57
> Subject: Re: Loading an index into memory
> 
> I haven't verified this myself, but I remember talking to somebody who tried 
> MMapDirectory and compared it to simply using tmpfs (RAM FS).  The result was 
> that MMapDirectory had some memory overhead, so putting the index on tmpfs was 
> more memory-efficient.  I guess this person had read-only indices, so tmpfs was 
> an option.
> 
> Otis
> --
> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> 
> 
> 
> ----- Original Message ----
> > From: Uwe Schindler 
> > To: java-user@lucene.apache.org
> > Sent: Thursday, July 23, 2009 9:47:24 AM
> > Subject: RE: Loading an index into memory
> > 
> > The size is in bytes and the RAMDirectory stores the bytes in bytes, so size
> > is equal. I would suggest to not copy the dir into a RAMdirectory. It is
> > better to use MMapDirectory in this case, as it "swaps" the files into
> > address space like a normal OS swap file. The OS kernel will automatically
> > swap needed parts into physical RAM. In this case the Java Heap is not
> > wasted and only needed parts are swapped into RAM.
> > 
> > -----
> > UWE SCHINDLER
> > Webserver/Middleware Development
> > PANGAEA - Publishing Network for Geoscientific and Environmental Data
> > MARUM - University of Bremen
> > Room 2500, Leobener Str., D-28359 Bremen
> > Tel.: +49 421 218 65595
> > Fax:  +49 421 218 65505
> > http://www.pangaea.de/
> > E-mail: uschindler@pangaea.de
> > 
> > > -----Original Message-----
> > > From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
> > > Sent: Thursday, July 23, 2009 3:38 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Loading an index into memory
> > > 
> > > 
> > > Hi,
> > > 
> > > I have a question regarding RAMDirectory.  I have a 5 GB index on disk and
> > > it is opened like the following:
> > > 
> > >   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
> > > 
> > > Approximately how much memory is needed to load the index? 5GB of memory
> > > or 10GB because of Unicode? Does the entire index get loaded into memory
> > > or only parts of it? Thank you.
> > > 
> > > 
> > > _________________________________________________________________
> > > Windows LiveT HotmailR: Celebrate the moment with your favorite sports
> > > pics. Check it out.
> > > http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
> > > _WL_QA_HM_sports_photos_072009&cat=sports
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Loading an index into memory

Posted by Otis Gospodnetic <ot...@yahoo.com>.
I haven't verified this myself, but I remember talking to somebody who tried MMapDirectory and compared it to simply using tmpfs (RAM FS).  The result was that MMapDirectory had some memory overhead, so putting the index on tmpfs was more memory-efficient.  I guess this person had read-only indices, so tmpfs was an option.

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Uwe Schindler <us...@pangaea.de>
> To: java-user@lucene.apache.org
> Sent: Thursday, July 23, 2009 9:47:24 AM
> Subject: RE: Loading an index into memory
> 
> The size is in bytes and the RAMDirectory stores the bytes in bytes, so size
> is equal. I would suggest to not copy the dir into a RAMdirectory. It is
> better to use MMapDirectory in this case, as it "swaps" the files into
> address space like a normal OS swap file. The OS kernel will automatically
> swap needed parts into physical RAM. In this case the Java Heap is not
> wasted and only needed parts are swapped into RAM.
> 
> -----
> UWE SCHINDLER
> Webserver/Middleware Development
> PANGAEA - Publishing Network for Geoscientific and Environmental Data
> MARUM - University of Bremen
> Room 2500, Leobener Str., D-28359 Bremen
> Tel.: +49 421 218 65595
> Fax:  +49 421 218 65505
> http://www.pangaea.de/
> E-mail: uschindler@pangaea.de
> 
> > -----Original Message-----
> > From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
> > Sent: Thursday, July 23, 2009 3:38 PM
> > To: java-user@lucene.apache.org
> > Subject: Loading an index into memory
> > 
> > 
> > Hi,
> > 
> > I have a question regarding RAMDirectory.  I have a 5 GB index on disk and
> > it is opened like the following:
> > 
> >   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
> > 
> > Approximately how much memory is needed to load the index? 5GB of memory
> > or 10GB because of Unicode? Does the entire index get loaded into memory
> > or only parts of it? Thank you.
> > 
> > 
> > _________________________________________________________________
> > Windows LiveT HotmailR: Celebrate the moment with your favorite sports
> > pics. Check it out.
> > http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
> > _WL_QA_HM_sports_photos_072009&cat=sports
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Loading an index into memory

Posted by Dragon Fly <dr...@hotmail.com>.
Thank you both.

> Date: Thu, 23 Jul 2009 11:55:58 -0400
> Subject: Re: Loading an index into memory
> From: erickerickson@gmail.com
> To: java-user@lucene.apache.org
> 
> What are you trying to accomplish? I'd insure that my performance wasa
> problem before doing anything. If you're thinking "it's in RAM so it
> has to be faster" you might be surprised.....
> 
> So gather evidence that you have a problem before you jump to
> providing a solution.
> 
> Erick
> 
> On Thu, Jul 23, 2009 at 9:47 AM, Uwe Schindler <us...@pangaea.de>wrote:
> 
> > The size is in bytes and the RAMDirectory stores the bytes in bytes, so
> > size
> > is equal. I would suggest to not copy the dir into a RAMdirectory. It is
> > better to use MMapDirectory in this case, as it "swaps" the files into
> > address space like a normal OS swap file. The OS kernel will automatically
> > swap needed parts into physical RAM. In this case the Java Heap is not
> > wasted and only needed parts are swapped into RAM.
> >
> > -----
> > UWE SCHINDLER
> > Webserver/Middleware Development
> > PANGAEA - Publishing Network for Geoscientific and Environmental Data
> > MARUM - University of Bremen
> > Room 2500, Leobener Str., D-28359 Bremen
> > Tel.: +49 421 218 65595
> > Fax:  +49 421 218 65505
> > http://www.pangaea.de/
> > E-mail: uschindler@pangaea.de
> >
> > > -----Original Message-----
> > > From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
> > > Sent: Thursday, July 23, 2009 3:38 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Loading an index into memory
> > >
> > >
> > > Hi,
> > >
> > > I have a question regarding RAMDirectory.  I have a 5 GB index on disk
> > and
> > > it is opened like the following:
> > >
> > >   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
> > >
> > > Approximately how much memory is needed to load the index? 5GB of memory
> > > or 10GB because of Unicode? Does the entire index get loaded into memory
> > > or only parts of it? Thank you.
> > >
> > >
> > > _________________________________________________________________
> > > Windows LiveT HotmailR: Celebrate the moment with your favorite sports
> > > pics. Check it out.
> > >
> > http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
> > > _WL_QA_HM_sports_photos_072009&cat=sports
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >

_________________________________________________________________
Bing™ brings you maps, menus, and reviews organized in one place. Try it now.
http://www.bing.com/search?q=restaurants&form=MLOGEN&publ=WLHMTAG&crea=TXT_MLOGEN_Local_Local_Restaurants_1x1

Re: Loading an index into memory

Posted by Erick Erickson <er...@gmail.com>.
What are you trying to accomplish? I'd insure that my performance wasa
problem before doing anything. If you're thinking "it's in RAM so it
has to be faster" you might be surprised.....

So gather evidence that you have a problem before you jump to
providing a solution.

Erick

On Thu, Jul 23, 2009 at 9:47 AM, Uwe Schindler <us...@pangaea.de>wrote:

> The size is in bytes and the RAMDirectory stores the bytes in bytes, so
> size
> is equal. I would suggest to not copy the dir into a RAMdirectory. It is
> better to use MMapDirectory in this case, as it "swaps" the files into
> address space like a normal OS swap file. The OS kernel will automatically
> swap needed parts into physical RAM. In this case the Java Heap is not
> wasted and only needed parts are swapped into RAM.
>
> -----
> UWE SCHINDLER
> Webserver/Middleware Development
> PANGAEA - Publishing Network for Geoscientific and Environmental Data
> MARUM - University of Bremen
> Room 2500, Leobener Str., D-28359 Bremen
> Tel.: +49 421 218 65595
> Fax:  +49 421 218 65505
> http://www.pangaea.de/
> E-mail: uschindler@pangaea.de
>
> > -----Original Message-----
> > From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
> > Sent: Thursday, July 23, 2009 3:38 PM
> > To: java-user@lucene.apache.org
> > Subject: Loading an index into memory
> >
> >
> > Hi,
> >
> > I have a question regarding RAMDirectory.  I have a 5 GB index on disk
> and
> > it is opened like the following:
> >
> >   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
> >
> > Approximately how much memory is needed to load the index? 5GB of memory
> > or 10GB because of Unicode? Does the entire index get loaded into memory
> > or only parts of it? Thank you.
> >
> >
> > _________________________________________________________________
> > Windows LiveT HotmailR: Celebrate the moment with your favorite sports
> > pics. Check it out.
> >
> http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
> > _WL_QA_HM_sports_photos_072009&cat=sports
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

RE: Loading an index into memory

Posted by Uwe Schindler <us...@pangaea.de>.
The size is in bytes and the RAMDirectory stores the bytes in bytes, so size
is equal. I would suggest to not copy the dir into a RAMdirectory. It is
better to use MMapDirectory in this case, as it "swaps" the files into
address space like a normal OS swap file. The OS kernel will automatically
swap needed parts into physical RAM. In this case the Java Heap is not
wasted and only needed parts are swapped into RAM.

-----
UWE SCHINDLER
Webserver/Middleware Development
PANGAEA - Publishing Network for Geoscientific and Environmental Data
MARUM - University of Bremen
Room 2500, Leobener Str., D-28359 Bremen
Tel.: +49 421 218 65595
Fax:  +49 421 218 65505
http://www.pangaea.de/
E-mail: uschindler@pangaea.de

> -----Original Message-----
> From: Dragon Fly [mailto:dragon-fly999@hotmail.com]
> Sent: Thursday, July 23, 2009 3:38 PM
> To: java-user@lucene.apache.org
> Subject: Loading an index into memory
> 
> 
> Hi,
> 
> I have a question regarding RAMDirectory.  I have a 5 GB index on disk and
> it is opened like the following:
> 
>   searcher = new IndexSearcher (new RAMDirectory (indexDirectory));
> 
> Approximately how much memory is needed to load the index? 5GB of memory
> or 10GB because of Unicode? Does the entire index get loaded into memory
> or only parts of it? Thank you.
> 
> 
> _________________________________________________________________
> Windows LiveT HotmailR: Celebrate the moment with your favorite sports
> pics. Check it out.
> http://www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM
> _WL_QA_HM_sports_photos_072009&cat=sports


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org