You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Paul Taylor <pa...@fastmail.fm> on 2009/03/24 09:42:57 UTC

Can you create a RAM index from a file index

Hi

Ive built some file based indexes based on data in a database, and it 
took quite some time.
I am interested in trying to use RAM based indexes instead of file based 
indexes to compare search performance but its going to take some time to 
rebuild the index from the original database, isnt it possible to 
rebuild the index from the file based index ?
How big is a RAM index compared to File based index, I assumming its 
slightly smaller because no files are created ?

thanks Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Can you create a RAM index from a file index

Posted by Anshum <an...@gmail.com>.
Hi Ganesh,

What you are talking about is loading partial index (as per requirement)
into RAM. This is exactly what any other decently designed application would
do. On the other hand, RAM Directory implementation just copies all of the
index into RAM. Also,  tmpfs is nothing but an explicit copy of data into
RAM.
The performance difference is noticeable as the difference in I/O speeds for
RAM Vs HDD.
About syncing, perhaps you're talking about changing an existing index at
runtime (deletions, editions etc.). If you use the RAM Directory
implementation, perhaps reinitialization should help.
On the otherhand, tmpfs implementation would just be the way you use file
system based index. Nothing including the code would need to be changed for
it.
NOTE : You may want to keep a copy of the index always ready @ the HDD as a
reboot wipes off tmpfs (as expected out of a RAM based setup).

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Wed, Mar 25, 2009 at 11:37 AM, Ganesh <em...@yahoo.co.in> wrote:

> FileSystem index reader loads the data to RAM, I have tried with more than
> 6 GB of index (sharded to 20 index) and the response is pretty fast.
>
> What significance gain would be to use RAM directory.
>
> How the modifications done in RAM directory will sync with FileSystem.
>
> Regards
> Ganesh
>
> ----- Original Message ----- From: "Otis Gospodnetic" <
> otis_gospodnetic@yahoo.com>
> To: <ja...@lucene.apache.org>
> Sent: Wednesday, March 25, 2009 9:12 AM
>
> Subject: Re: Can you create a RAM index from a file index
>
>
>
>> That's indeed an alternative.  Moreover, I have heard (not
>> measured/comparered myself) from people who tried both MM and tmpfs approach
>> that the former has some overhead.
>>
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> ----- Original Message ----
>>
>>> From: Anshum <an...@gmail.com>
>>> To: java-user@lucene.apache.org; paul_t100@fastmail.fm
>>> Sent: Tuesday, March 24, 2009 6:42:58 AM
>>> Subject: Re: Can you create a RAM index from a file index
>>>
>>> Hi Paul,
>>>
>>> Going by what you've conveyed here, I'd assume that you have more than
>>> some
>>> data. You could either go ahead with Ian's way which is the suggested
>>> one(as
>>> far as lucene implementation is concerned) but It'd not be possible if
>>> you're index is greater than 2 Gigs and you are not running the 64 bit
>>> version of JVM (as your JVM could not use more than 2Gigs of RAM
>>> otherwise).
>>> The other workaround that I've tried successfully is by creating a tmpfs
>>> partition and copying your index onto that tmpfs partition. This would
>>> also
>>> mean that you'd need almost the same amount of 'free' ram to copy the
>>> index
>>> onto the RAM.
>>> You could then open your reader in the regular fashion straight off the
>>> RAM
>>> based tmpfs.
>>> You could also go through the archives for suggestions.
>>> --
>>> Anshum Gupta
>>> Naukri Labs!
>>> http://ai-cafe.blogspot.com
>>>
>>> The facts expressed here belong to everybody, the opinions to me. The
>>> distinction is yours to draw............
>>>
>>>
>>> On Tue, Mar 24, 2009 at 2:12 PM, Paul Taylor wrote:
>>>
>>> > Hi
>>> >
>>> > Ive built some file based indexes based on data in a database, and it >
>>> took
>>> > quite some time.
>>> > I am interested in trying to use RAM based indexes instead of file >
>>> based
>>> > indexes to compare search performance but its going to take some time >
>>> to
>>> > rebuild the index from the original database, isnt it possible to >
>>> rebuild
>>> > the index from the file based index ?
>>> > How big is a RAM index compared to File based index, I assumming its
>>> > slightly smaller because no files are created ?
>>> >
>>> > thanks Paul
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>>> >
>>> >
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> Send instant messages to your online friends http://in.messenger.yahoo.com
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Can you create a RAM index from a file index

Posted by Ganesh <em...@yahoo.co.in>.
FileSystem index reader loads the data to RAM, I have tried with more than 6 
GB of index (sharded to 20 index) and the response is pretty fast.

What significance gain would be to use RAM directory.

How the modifications done in RAM directory will sync with FileSystem.

Regards
Ganesh

----- Original Message ----- 
From: "Otis Gospodnetic" <ot...@yahoo.com>
To: <ja...@lucene.apache.org>
Sent: Wednesday, March 25, 2009 9:12 AM
Subject: Re: Can you create a RAM index from a file index


>
> That's indeed an alternative.  Moreover, I have heard (not 
> measured/comparered myself) from people who tried both MM and tmpfs 
> approach that the former has some overhead.
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: Anshum <an...@gmail.com>
>> To: java-user@lucene.apache.org; paul_t100@fastmail.fm
>> Sent: Tuesday, March 24, 2009 6:42:58 AM
>> Subject: Re: Can you create a RAM index from a file index
>>
>> Hi Paul,
>>
>> Going by what you've conveyed here, I'd assume that you have more than 
>> some
>> data. You could either go ahead with Ian's way which is the suggested 
>> one(as
>> far as lucene implementation is concerned) but It'd not be possible if
>> you're index is greater than 2 Gigs and you are not running the 64 bit
>> version of JVM (as your JVM could not use more than 2Gigs of RAM 
>> otherwise).
>> The other workaround that I've tried successfully is by creating a tmpfs
>> partition and copying your index onto that tmpfs partition. This would 
>> also
>> mean that you'd need almost the same amount of 'free' ram to copy the 
>> index
>> onto the RAM.
>> You could then open your reader in the regular fashion straight off the 
>> RAM
>> based tmpfs.
>> You could also go through the archives for suggestions.
>> --
>> Anshum Gupta
>> Naukri Labs!
>> http://ai-cafe.blogspot.com
>>
>> The facts expressed here belong to everybody, the opinions to me. The
>> distinction is yours to draw............
>>
>>
>> On Tue, Mar 24, 2009 at 2:12 PM, Paul Taylor wrote:
>>
>> > Hi
>> >
>> > Ive built some file based indexes based on data in a database, and it 
>> > took
>> > quite some time.
>> > I am interested in trying to use RAM based indexes instead of file 
>> > based
>> > indexes to compare search performance but its going to take some time 
>> > to
>> > rebuild the index from the original database, isnt it possible to 
>> > rebuild
>> > the index from the file based index ?
>> > How big is a RAM index compared to File based index, I assumming its
>> > slightly smaller because no files are created ?
>> >
>> > thanks Paul
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

Send instant messages to your online friends http://in.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Can you create a RAM index from a file index

Posted by Otis Gospodnetic <ot...@yahoo.com>.
That's indeed an alternative.  Moreover, I have heard (not measured/comparered myself) from people who tried both MM and tmpfs approach that the former has some overhead.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Anshum <an...@gmail.com>
> To: java-user@lucene.apache.org; paul_t100@fastmail.fm
> Sent: Tuesday, March 24, 2009 6:42:58 AM
> Subject: Re: Can you create a RAM index from a file index
> 
> Hi Paul,
> 
> Going by what you've conveyed here, I'd assume that you have more than some
> data. You could either go ahead with Ian's way which is the suggested one(as
> far as lucene implementation is concerned) but It'd not be possible if
> you're index is greater than 2 Gigs and you are not running the 64 bit
> version of JVM (as your JVM could not use more than 2Gigs of RAM otherwise).
> The other workaround that I've tried successfully is by creating a tmpfs
> partition and copying your index onto that tmpfs partition. This would also
> mean that you'd need almost the same amount of 'free' ram to copy the index
> onto the RAM.
> You could then open your reader in the regular fashion straight off the RAM
> based tmpfs.
> You could also go through the archives for suggestions.
> --
> Anshum Gupta
> Naukri Labs!
> http://ai-cafe.blogspot.com
> 
> The facts expressed here belong to everybody, the opinions to me. The
> distinction is yours to draw............
> 
> 
> On Tue, Mar 24, 2009 at 2:12 PM, Paul Taylor wrote:
> 
> > Hi
> >
> > Ive built some file based indexes based on data in a database, and it took
> > quite some time.
> > I am interested in trying to use RAM based indexes instead of file based
> > indexes to compare search performance but its going to take some time to
> > rebuild the index from the original database, isnt it possible to rebuild
> > the index from the file based index ?
> > How big is a RAM index compared to File based index, I assumming its
> > slightly smaller because no files are created ?
> >
> > thanks Paul
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Can you create a RAM index from a file index

Posted by Anshum <an...@gmail.com>.
Hi Paul,

Going by what you've conveyed here, I'd assume that you have more than some
data. You could either go ahead with Ian's way which is the suggested one(as
far as lucene implementation is concerned) but It'd not be possible if
you're index is greater than 2 Gigs and you are not running the 64 bit
version of JVM (as your JVM could not use more than 2Gigs of RAM otherwise).
The other workaround that I've tried successfully is by creating a tmpfs
partition and copying your index onto that tmpfs partition. This would also
mean that you'd need almost the same amount of 'free' ram to copy the index
onto the RAM.
You could then open your reader in the regular fashion straight off the RAM
based tmpfs.
You could also go through the archives for suggestions.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Tue, Mar 24, 2009 at 2:12 PM, Paul Taylor <pa...@fastmail.fm> wrote:

> Hi
>
> Ive built some file based indexes based on data in a database, and it took
> quite some time.
> I am interested in trying to use RAM based indexes instead of file based
> indexes to compare search performance but its going to take some time to
> rebuild the index from the original database, isnt it possible to rebuild
> the index from the file based index ?
> How big is a RAM index compared to File based index, I assumming its
> slightly smaller because no files are created ?
>
> thanks Paul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Can you create a RAM index from a file index

Posted by Paul Taylor <pa...@fastmail.fm>.
Ian Lea wrote:
> Hi
>
>
> You can load an existing index into a RAMDirectory using one of the
> constructors that takes an existing index.  I believe that a RAM index
> will be the same size as a file based index.
>   
Of course I was looking at IndexSearcher but the constructor is for 
RAMDirectory
> MMapDirectory is another possibility.
>
>   
How does this work, memory mapping.

and if Im using  a 32bit Java I'm limited to potentailly 4GB , (but 
probably more like 3GB) for the JVM , but if I use 64bit there is no 
limit, right ?

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Can you create a RAM index from a file index

Posted by Ian Lea <ia...@gmail.com>.
Hi


You can load an existing index into a RAMDirectory using one of the
constructors that takes an existing index.  I believe that a RAM index
will be the same size as a file based index.

MMapDirectory is another possibility.


--
Ian.



On Tue, Mar 24, 2009 at 8:42 AM, Paul Taylor <pa...@fastmail.fm> wrote:
> Hi
>
> Ive built some file based indexes based on data in a database, and it took
> quite some time.
> I am interested in trying to use RAM based indexes instead of file based
> indexes to compare search performance but its going to take some time to
> rebuild the index from the original database, isnt it possible to rebuild
> the index from the file based index ?
> How big is a RAM index compared to File based index, I assumming its
> slightly smaller because no files are created ?
>
> thanks Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org