Posted to solr-user@lucene.apache.org by William Bell <bi...@gmail.com> on 2012/07/17 06:04:38 UTC

How to setup SimpleFSDirectoryFactory

We all know that MMapDirectory is fastest. However, we cannot always
use it, since you might run out of memory on large indexes, right?

Here is how I got SimpleFSDirectoryFactory to work. Just set
-Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.

Your solrconfig.xml:

<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

You can check it with http://localhost:8983/solr/admin/stats.jsp
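
To make the ${name:default} syntax in that attribute concrete, here is a minimal Java sketch of how such a placeholder can be resolved against a system property. The class PlaceholderDemo and its resolve method are hypothetical names for illustration only; this is not Solr's own substitution code, just the same idea under simplified assumptions (no nesting, no escaping).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {
    // Matches ${name} or ${name:default}. Simplified on purpose: no nested
    // placeholders and no escaping, unlike a real config loader.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Hypothetical helper: replaces each placeholder with the system property
    // of that name, or with the text after the colon if the property is unset.
    static String resolve(String value) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fallback = m.group(2) == null ? "" : m.group(2);
            String resolved = System.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String attr = "${solr.directoryFactory:solr.StandardDirectoryFactory}";
        // Without -Dsolr.directoryFactory=... the default after the colon is used.
        System.out.println(resolve(attr));
        // Equivalent to passing -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
        System.setProperty("solr.directoryFactory", "solr.SimpleFSDirectoryFactory");
        System.out.println(resolve(attr));
    }
}
```

So with the solrconfig.xml line above, leaving the -D flag off keeps the standard factory, and setting it switches the factory without editing the file.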

Note that the default on 64-bit Windows is MMapDirectory; otherwise Windows
gets SimpleFSDirectory, and every other platform gets NIOFSDirectory. It
would be nicer if we could just set it all up with a helper in solrconfig.xml...

// Pick a Directory implementation based on platform and JVM.
if (Constants.WINDOWS) {
    if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT) {
        // 64-bit JVM on Windows that supports unmapping
        return new MMapDirectory(path, lockFactory);
    } else {
        return new SimpleFSDirectory(path, lockFactory);
    }
} else {
    return new NIOFSDirectory(path, lockFactory);
}
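
For anyone who wants to try that branch logic without a Lucene jar on the classpath, here is a self-contained sketch. The class DirectoryChoice, the enum Impl and the method choose are hypothetical names; the three booleans stand in for Constants.WINDOWS, Constants.JRE_IS_64BIT and MMapDirectory.UNMAP_SUPPORTED. The way main() detects the platform from system properties is an assumption for illustration, not a claim about how Lucene computes its constants.

```java
public class DirectoryChoice {
    // Stand-ins for the three Lucene Directory implementations named above.
    enum Impl { MMAP, SIMPLE_FS, NIO_FS }

    // Same branch structure as the snippet above, with the platform facts
    // passed in as parameters so each case can be exercised anywhere.
    static Impl choose(boolean windows, boolean jre64Bit, boolean unmapSupported) {
        if (windows) {
            return (unmapSupported && jre64Bit) ? Impl.MMAP : Impl.SIMPLE_FS;
        }
        return Impl.NIO_FS;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name", "").startsWith("Windows");
        // "sun.arch.data.model" is a Sun/Oracle-specific property; falling back
        // to os.arch is only a heuristic (assumption for this sketch).
        String model = System.getProperty("sun.arch.data.model");
        boolean is64 = model != null
                ? model.equals("64")
                : System.getProperty("os.arch", "").contains("64");
        // Unmap support is simply assumed to be true here.
        System.out.println(choose(windows, is64, true));
    }
}
```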



-- 
Bill Bell
billnbell@gmail.com
cell 720-256-8076

RE: RE: How to setup SimpleFSDirectoryFactory

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

It seems that both of you simply don't understand what's happening in your
operating system kernel. Please read the blog post again!

> It happens in 3.6, for this reasons I thought of moving to solandra.
> If I do a commit, the all documents are persisted with out any issues.
> There is no issues  in terms of any functionality, but only this happens is
> increase in physical RAM, goes higher and higher and stop at maximum and it
> never comes down.

This is perfectly fine in Windows and Linux (and any other operating
system). If an operating system did not use *all* available physical
memory, it would waste costly hardware resources. Why not use resources that
are otherwise unused? As said before:

The O/S kernel uses *all* available physical RAM for caching file system
accesses. The memory used for that is always reported as not free, because
it is in use (very simple, right?). But if some other application wants to use
it, it's free for malloc(), so it is not permanently occupied. That's always
the case, whether you use MMapDirectory or not (same for SimpleFSDirectory or
NIOFSDirectory).
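
To illustrate why the choice of Directory does not change this, here is a small plain-JDK sketch of the three access styles that, as far as I know, these classes are built on in Lucene 3.x: RandomAccessFile seek + read (SimpleFSDirectory), positional FileChannel reads (NIOFSDirectory) and FileChannel.map (MMapDirectory). The class ReadStyles and its method names are hypothetical, and this is not Lucene code. In all three cases the bytes come from the same O/S file system cache; the mapped variant just reads them without copying through a read call.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class ReadStyles {
    // Seek, then read into a heap array (the SimpleFSDirectory style).
    static byte[] readRandomAccess(Path file, long pos, int len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[len];
            raf.seek(pos);
            raf.readFully(buf);
            return buf;
        }
    }

    // Positional channel read; no shared file pointer (the NIOFSDirectory style).
    static byte[] readPositional(Path file, long pos, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            long at = pos;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, at);
                if (n < 0) throw new EOFException("unexpected end of file");
                at += n;
            }
            return buf.array();
        }
    }

    // Map the region and read straight from the mapping (the MMapDirectory style).
    static byte[] readMapped(Path file, long pos, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
            byte[] buf = new byte[len];
            map.get(buf);
            return buf;
        }
    }

    // Writes a small temp file and checks that all three styles see the same bytes.
    static boolean allStylesAgree() {
        try {
            Path file = Files.createTempFile("read-styles", ".bin");
            file.toFile().deleteOnExit();
            byte[] data = new byte[4096];
            for (int i = 0; i < data.length; i++) data[i] = (byte) i;
            Files.write(file, data);
            byte[] expected = Arrays.copyOfRange(data, 1000, 1256);
            return Arrays.equals(expected, readRandomAccess(file, 1000, 256))
                    && Arrays.equals(expected, readPositional(file, 1000, 256))
                    && Arrays.equals(expected, readMapped(file, 1000, 256));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("all three read styles agree: " + allStylesAgree());
    }
}
```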

Of course, when you have freshly booted your machine, it reports free memory,
but definitely not on a server that has been running 24/7 for weeks.

For all people who don't want to understand that, here is the easy
explanation page:
http://www.linuxatemyram.com/

> > > all my physical memory say its 100 percentage used(windows). On deep
> > > investigation found that mmap is not releasing os files handles. Do
> > > you find this behaviour?

One comment: The file handles are not freed as long as the index is open.
Open file handles have nothing to do with memory mapping; the two are
completely unrelated.

Uwe

> On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog <go...@gmail.com> wrote:
> 
> > Interesting. Which version of Solr is this? What happens if you do a
> > commit?
> >
> > On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali <an...@gmail.com> wrote:
> > > Hi uwe,
> > > Great to know. We have files indexing 10000/min. After 30 mins I see
> > > all my physical memory say its 100 percentage used(windows). On deep
> > > investigation found that mmap is not releasing os files handles. Do
> > > you find this behaviour?
> > >
> > > Thanks
> > >
> > > On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
> > >
> > > Hi Bill,
> > >
> > > MMapDirectory uses the file system cache of your operating system, which
> > > has the following consequences: In Linux, top & free should normally
> > > report only *little* free memory, because the O/S uses all memory not
> > > allocated by applications to cache disk I/O (and shows it as allocated,
> > > so having 0% free memory is just normal on Linux and also Windows). If
> > > you have other applications or Lucene/Solr itself that allocate lots of
> > > heap space or malloc() a lot, then you are reducing free physical memory,
> > > and so reducing the FS cache. This also depends on your swappiness
> > > parameter (if swappiness is higher, inactive processes are swapped out
> > > more easily; the default is 60 on Linux, which frees more space for the
> > > FS cache. The downside is of course that in-memory structures of Lucene
> > > and other applications may get paged out).
> > >
> > > You will only see no paging at all if all memory allocated by all
> > > applications + all mmapped files fit into memory. But paging in/out the
> > > mmapped Lucene index is muuuuuch cheaper than using SimpleFSDirectory or
> > > NIOFSDirectory. If you use SimpleFS or NIO and your index is not in the
> > > FS cache, it will also be read from physical disk again, so where is the
> > > difference? Paging is actually cheaper, as no syscalls are involved.
> > >
> > > If you want as much as possible of your index in physical RAM, copy
> > > it to /dev/null regularly and buy more RUM :-)
> > >
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > > eMail: uwe@thetaphi...
> > >
> > >> From: Bill Bell [mailto:billnbell@gmail.com]
> > >> Sent: Friday, July 20, 2012 5:17 AM
> > >> Subject: Re: ...
> > >> stop using it? The least used memory will be removed from the OS
> > >> automatically? I see some paging. Wouldn't paging slow down the querying?
> > >>
> > >> My index is 10gb and every 8 hours we get most of it in shared memory. The
> > >> memory is 99 percent used, and that does not leave any room for other apps.
> > >>
> > >> Other implications?
> > >>
> > >> Sent from my mobile device
> > >> 720-256-8076
> > >>
> > >> On Jul 19, 2012, at 9:49 A...
> > >> Heap space or free system RAM:
> > >> >
> > >> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> > >> >
> > >> > Uwe
> > >> >...
> > >> >> use it since you might run out of memory on large indexes right?
> > >> >>
> > >> >> Here is how I got SimpleFSDirectoryFactory to work. Just set -
> > >> >> Dsolr.directoryFactor...
> > >> >> set it all up with a helper in solrconfig.xml...
> > >> >>
> > >> >> if (Constants.WINDOWS) {
> > >> >> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...
> >
> >
> >
> > --
> > Lance Norskog
> > goksron@gmail.com
> >


Re: How to setup SimpleFSDirectoryFactory

Posted by geetha anjali <an...@gmail.com>.
Thanks a lot Uwe, will check out in the new 3.6.1


On Mon, Jul 23, 2012 at 11:46 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi Geetha Anjali,
>
> Lucene will not use MMapDirectory by default on 32 bit platforms or if you
> are not using an Oracle/Sun JVM. On 64 bit platforms, Lucene will use it, but
> will accept the risks of segfaulting when unmapping the buffers - Lucene
> does try its best to prevent this. It is a risk, but accepted by the Lucene
> developers.
>
> To come back to your issue: It is perfectly fine on Solr/Lucene to not unmap
> all buffers as long as the index is open. The number of open file handles is
> another discussion, but not related at all to MMap. If you are using an old
> Lucene version (like 3.0.2), you should upgrade in any case. The most recent
> one is 3.6.1.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de

RE: How to setup SimpleFSDirectoryFactory

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Geetha Anjali,

Lucene will not use MMapDirectory by default on 32 bit platforms or if you
are not using an Oracle/Sun JVM. On 64 bit platforms, Lucene will use it, but
will accept the risks of segfaulting when unmapping the buffers - Lucene
does try its best to prevent this. It is a risk, but accepted by the Lucene
developers.

To come back to your issue: It is perfectly fine on Solr/Lucene to not unmap
all buffers as long as the index is open. The number of open file handles is
another discussion, but not related at all to MMap. If you are using an old
Lucene version (like 3.0.2), you should upgrade in any case. The most recent
one is 3.6.1.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: geetha anjali [mailto:anjaliprabhu03@gmail.com]
> Sent: Monday, July 23, 2012 4:28 AM
> Subject: Re: How to setup SimpleFSDirectoryFactory
> 
> Hi Uwe,
> Thanks Uwe. Have you checked the bug in the JRE for MMapDirectory? I was
> mentioning this. It is posted on the Oracle site and in the API doc.
> They accept this as a bug, have you seen this?
>
> "MMapDirectory
> <http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/MMapDirectory.html>
> uses memory-mapped IO when reading. This is a good choice if you have plenty
> of virtual memory relative to your index size, eg if you are running on a 64
> bit JRE, or you are running on a 32 bit JRE but your index sizes are small
> enough to fit into the virtual memory space. Java has currently the
> limitation of not being able to unmap files from user code. The files are
> unmapped, when GC releases the byte buffers. *Due to this bug
> <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4724038> in Sun's JRE,
> MMapDirectory's IndexInput.close()
> <http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/IndexInput.html#close%28%29>
> is unable to close the underlying OS file handle. Only when GC finally
> collects the underlying objects, which could be quite some time later, will
> the file handle be closed*. *This will consume additional transient disk
> usage*: on Windows, attempts to delete or overwrite the files will result in
> an exception; on other platforms, which typically have a "delete on last
> close" semantics, while such operations will succeed, the bytes are still
> consuming space on disk. For many applications this limitation is not a
> problem (e.g. if you have plenty of disk space, and you don't rely on
> overwriting files on Windows) but it's still an important limitation to be
> aware of. This class supplies a (possibly dangerous) workaround mentioned in
> the bug report, which may fail on non-Sun JVMs."
> 
> 
> Thanks,


Re: How to setup SimpleFSDirectoryFactory

Posted by geetha anjali <an...@gmail.com>.
Hi Uwe,
Thanks Uwe. Have you checked the bug in the JRE for MMapDirectory? I was
mentioning this. It is posted on the Oracle site and in the API doc.
They accept this as a bug, have you seen this?

"MMapDirectory
<http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/MMapDirectory.html>
uses memory-mapped IO when reading. This is a good choice if you have
plenty of virtual memory relative to your index size, eg if you are running
on a 64 bit JRE, or you are running on a 32 bit JRE but your index sizes
are small enough to fit into the virtual memory space. Java has currently
the limitation of not being able to unmap files from user code. The files
are unmapped, when GC releases the byte buffers. *Due to this bug
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4724038> in Sun's JRE,
MMapDirectory's IndexInput.close()
<http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/IndexInput.html#close%28%29>
is unable to close the underlying OS file handle. Only when GC finally
collects the underlying objects, which could be quite some time later, will
the file handle be closed*. *This will consume additional transient disk
usage*: on Windows, attempts to delete or overwrite the files will result
in an exception; on other platforms, which typically have a "delete on last
close" semantics, while such operations will succeed, the bytes are still
consuming space on disk. For many applications this limitation is not a
problem (e.g. if you have plenty of disk space, and you don't rely on
overwriting files on Windows) but it's still an important limitation to be
aware of. This class supplies a (possibly dangerous) workaround mentioned
in the bug report, which may fail on non-Sun JVMs."


Thanks,


On Mon, Jul 23, 2012 at 4:13 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

> It is hopeless to talk to both of you, you don't understand virtual memory:
>
> > I get a similar situation using Windows 2008 and Solr 3.6. Memory using
> > mmap is never released. Even if I turn off traffic and commit and do a
> > manual gc. If the size of the index is 3gb then memory used will be heap +
> > 3gb of shared used. If I use a 6gb index I get heap + 6gb.
>
> That is expected, but we are not talking about allocated physical memory,
> we are talking about allocated ADDRESS SPACE, and you have 2^47 bytes of
> that on 64bit platforms. There is no physical memory wasted or allocated -
> please read the blog post a third, fourth, fifth... or tenth time, until it
> is obvious. You should also go back to school and take a course on system
> programming and operating system kernels. Every CS student gets that taught
> in his first year (at least in Germany).
>
> Java's GC has nothing to do with that - as long as the index is open,
> ADDRESS SPACE is assigned. We are talking about neither memory nor Java heap
> space.
>
> > If I turn off
> > MMapDirectoryFactory it goes back down. When is the MMap supposed to
> > release memory ? It only does it on JVM restart now.
>
> Can you please stop spreading nonsense about MMapDirectory with no
> knowledge behind? http://www.linuxatemyram.com/ - Also applies to Windows.
>
> Uwe
>
> > Bill Bell
> > Sent from mobile
> >
> >
> > On Jul 22, 2012, at 6:21 AM, geetha anjali <an...@gmail.com>
> > wrote:=
> > > It happens in 3.6, for this reasons I thought of moving to solandra.
> > > If I do a commit, the all documents are persisted with out any issues.
> > > There is no issues  in terms of any functionality, but only this
> > > happens i= increase in physical RAM, goes higher and higher and stop
> > > at maximum and i= never comes down.
> > >
> > > Thanks
> > >
> > > On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog <go...@gmail.com>
> > wrote:
> > >
> > >> Interesting. Which version of Solr is this? What happens if you do a
> > >> commit?
> > >>
> > >> On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali
> > <an...@gmail.com>=>> wrote:
> > >>> Hi uwe,
> > >>> Great to know. We have files indexing 10000/min. After 30 mins I see
> > >>> all=>>> my physical memory say its 100 percentage used(windows). On
> > >>> deep investigation found that mmap is not releasing os files handles.
> Do
> > you find this behaviour?
> > >>>
> > >>> Thanks
> > >>>
> > >>> On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
> > >>>
> > >>> Hi Bill,
> > >>>
> > >>> MMapDirectory uses the file system cache of your operating system,
> > >>> which=>> has following consequences: In Linux, top & free should
> > >>> normally report only=>>> *few* free memory, because the O/S uses all
> > >>> memory not allocated by applications to cache disk I/O (and shows it
> > >>> as allocated, so having 0%
> > >> free
> > >>> memory is just normal on Linux and also Windows). If you have other
> > >>> applications or Lucene/Solr itself that allocate lot's of heap space
> > >>> or
> > >>> malloc() a lot, then you are reducing free physical memory, so
> > >>> reducing
> > >> fs
> > >>> cache. This depends also on your swappiness parameter (if swappiness
> > >>> is higher, inactive processes are swapped out easier, default is 60%
> > >>> on
> > >> linux -
> > >>> freeing more space for FS cache - the backside is of course that
> > >>> maybe in-memory structures of Lucene and other applications get pages
> > out).
> > >>>
> > >>> You will only see no paging at all if all memory allocated all
> > >> applications
> > >>> + all mmapped files fit into memory. But paging in/out the mmapped
> > >>> + Lucen=
> > >>> index is muuuuuch cheaper than using SimpleFSDirectory or
> > >> NIOFSDirectory. If
> > >>> you use SimpleFS or NIO and your index is not in FS cache, it will
> > >>> also
> > >> read
> > >>> it from physical disk again, so where is the difference. Paging is
> > >> actually
> > >>> cheaper as no syscalls are involved.
> > >>>
> > >>> If you want as much as possible of your index in physical RAM, copy
> > >>> it t= /dev/null regularily and buy more RUM :-)
> > >>>
> > >>>
> > >>> -----
> > >>> Uwe Schindler
> > >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >>> eMail: uwe@thetaphi...
> > >>>
> > >>>> From: Bill Bell [mailto:billnbell@gmail.com]
> > >>>> Sent: Friday, July 20, 2012 5:17 AM
> > >>>> Subject: Re: ...
> > >>>> s=op using it? The least used memory will be removed from the OS
> > >>>> automaticall=? Isee some paging. Wouldn't paging slow down the
> > queryi=g?
> > >>>
> > >>>>
> > >>>> My index is 10gb and every 8 hours we get most of it in shared
> memory.
> > >> The
> > >>>> m=mory is 99 percent used, and that does not leave any room for
> > >>>> other=>>> apps. =
> > >>>
> > >>>> Other implications?
> > >>>>
> > >>>> Sent from my mobile device
> > >>>> 720-256-8076
> > >>>>
> > >>>> On Jul 19, 2012, at 9:49 A...
> > >>>> H=ap space or free system RAM:
> > >>>
> > >>>>>
> > >>>>>
> > >>>>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> > >>>>>
> > >>>>> Uwe
> > >>>>> ...
> > >>>>>> use it since you might run out of memory on large indexes right?
> > >>>
> > >>>>>>
> > >>>>>> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
> > >>>>>> Dsolr.directoryFactor...
> > >>>>>> set it all up with a helper in solrconfig.xml...
> > >>>
> > >>>>>>
> > >>>>>> if (Constants.WINDOWS) {
> > >>>>>> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...
> > >>
> > >>
> > >>
> > >> --
> > >> Lance Norskog
> > >> goksron@gmail.com
> > >>
>
>
>

RE: How to setup SimpleFSDirectoryFactory

Posted by Uwe Schindler <uw...@thetaphi.de>.
It is hopeless to talk to both of you, you don't understand virtual memory:

> I get a similar situation using Windows 2008 and Solr 3.6. Memory using
> mmap is never released. Even if I turn off traffic and commit and do a
> manual gc. If the size of the index is 3gb then memory used will be heap
> + 3gb of shared used. If I use a 6gb index I get heap + 6gb.

That is expected, but we are not talking about allocated physical memory, we
are talking about allocated ADDRESS SPACE, and you have 2^47 bytes of that on
64bit platforms. There is no physical memory wasted or allocated - please read
the blog post a third, fourth, fifth... or tenth time, until it is obvious. You
should also go back to school and take a course on system programming and
operating system kernels. Every CS student gets taught that in the first
year (at least in Germany).

Java's GC has nothing to do with that - as long as the index is open,
ADDRESS SPACE is assigned. We are talking about neither physical memory nor
Java heap space.
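
To put the 2^47 figure in perspective, here is a small arithmetic sketch. The 47-bit figure is taken from the paragraph above and the 6 GB index size from the quoted report; the class and method names are made up for illustration, and the calculation ignores everything else that lives in the process's address space.

```java
/** Back-of-the-envelope arithmetic for 64bit address space (illustrative only). */
final class AddressSpaceMath {
    // Assumption from the message above: 47 usable address bits per process.
    static final int ADDRESS_BITS = 47;

    /** Total addressable bytes, i.e. 2^47. */
    static long addressableBytes() {
        return 1L << ADDRESS_BITS;
    }

    /** How many indexes of the given size could be mapped side by side. */
    static long indexesThatFit(long indexBytes) {
        return addressableBytes() / indexBytes;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        System.out.println("address space in TiB: " + (addressableBytes() >> 40));
        System.out.println("6 GiB indexes that fit: " + indexesThatFit(6 * gib));
    }
}
```

Mapping a 3 GB or 6 GB index therefore reserves only a tiny fraction of the available address space, which is a separate resource from physical RAM.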

> If I turn off
> MMapDirectoryFactory it goes back down. When is the MMap supposed to
> release memory ? It only does it on JVM restart now.

Can you please stop spreading nonsense about MMapDirectory without the
knowledge to back it up? http://www.linuxatemyram.com/ - Also applies to Windows.

Uwe

> Bill Bell
> Sent from mobile
> 
> 
> On Jul 22, 2012, at 6:21 AM, geetha anjali <an...@gmail.com> wrote:
> > It happens in 3.6, for this reasons I thought of moving to solandra.
> > If I do a commit, the all documents are persisted with out any issues.
> > There is no issues  in terms of any functionality, but only this
> > happens is increase in physical RAM, goes higher and higher and stop
> > at maximum and it never comes down.
> >
> > Thanks
> >
> > On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog <go...@gmail.com>
> wrote:
> >
> >> Interesting. Which version of Solr is this? What happens if you do a
> >> commit?
> >>
> >> On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali <an...@gmail.com>
> >> wrote:
> >>> Hi uwe,
> >>> Great to know. We have files indexing 10000/min. After 30 mins I see
> >>> all my physical memory say its 100 percentage used(windows). On
> >>> deep investigation found that mmap is not releasing os files handles.
> >>> Do you find this behaviour?
> >>>
> >>> Thanks
> >>>
> >>> On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
> >>>
> >>> Hi Bill,
> >>>
> >>> MMapDirectory uses the file system cache of your operating system,
> >>> which has following consequences: In Linux, top & free should
> >>> normally report only *few* free memory, because the O/S uses all
> >>> memory not allocated by applications to cache disk I/O (and shows it
> >>> as allocated, so having 0%
> >> free
> >>> memory is just normal on Linux and also Windows). If you have other
> >>> applications or Lucene/Solr itself that allocate lot's of heap space
> >>> or
> >>> malloc() a lot, then you are reducing free physical memory, so
> >>> reducing
> >> fs
> >>> cache. This depends also on your swappiness parameter (if swappiness
> >>> is higher, inactive processes are swapped out easier, default is 60%
> >>> on
> >> linux -
> >>> freeing more space for FS cache - the backside is of course that
> >>> maybe in-memory structures of Lucene and other applications get pages
> out).
> >>>
> >>> You will only see no paging at all if all memory allocated all
> >> applications
> >>> + all mmapped files fit into memory. But paging in/out the mmapped
> >>> Lucene
> >>> index is muuuuuch cheaper than using SimpleFSDirectory or
> >> NIOFSDirectory. If
> >>> you use SimpleFS or NIO and your index is not in FS cache, it will
> >>> also
> >> read
> >>> it from physical disk again, so where is the difference. Paging is
> >> actually
> >>> cheaper as no syscalls are involved.
> >>>
> >>> If you want as much as possible of your index in physical RAM, copy
> >>> it to /dev/null regularily and buy more RUM :-)
> >>>
> >>>
> >>> -----
> >>> Uwe Schindler
> >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >>> eMail: uwe@thetaphi...
> >>>
> >>>> From: Bill Bell [mailto:billnbell@gmail.com]
> >>>> Sent: Friday, July 20, 2012 5:17 AM
> >>>> Subject: Re: ...
> >>>> stop using it? The least used memory will be removed from the OS
> >>>> automatically? I see some paging. Wouldn't paging slow down the
> >>>> querying?
> >>>
> >>>>
> >>>> My index is 10gb and every 8 hours we get most of it in shared
> >>>> memory. The memory is 99 percent used, and that does not leave
> >>>> any room for other apps.
> >>>
> >>>> Other implications?
> >>>>
> >>>> Sent from my mobile device
> >>>> 720-256-8076
> >>>>
> >>>> On Jul 19, 2012, at 9:49 A...
> >>>> Heap space or free system RAM:
> >>>
> >>>>>
> >>>>>
> >>>>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >>>>>
> >>>>> Uwe
> >>>>> ...
> >>>>>> use it since you might run out of memory on large indexes right?
> >>>
> >>>>>>
> >>>>>> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
> >>>>>> Dsolr.directoryFactor...
> >>>>>> set it all up with a helper in solrconfig.xml...
> >>>
> >>>>>>
> >>>>>> if (Constants.WINDOWS) {
> >>>>>> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> goksron@gmail.com
> >>



Re: How to setup SimpleFSDirectoryFactory

Posted by Bill Bell <bi...@gmail.com>.
I get a similar situation using Windows 2008 and Solr 3.6. Memory using mmap is never released. Even if I turn off traffic and commit and do a manual gc. If the size of the index is 3gb then memory used will be heap + 3gb of shared used. If I use a 6gb index I get heap + 6gb. If I turn off MMapDirectoryFactory it goes back down. When is the MMap supposed to release memory ? It only does it on JVM restart now.

Bill Bell
Sent from mobile


On Jul 22, 2012, at 6:21 AM, geetha anjali <an...@gmail.com> wrote:

> It happens in 3.6, for this reasons I thought of moving to solandra.
> If I do a commit, the all documents are persisted with out any issues.
> There is no issues  in terms of any functionality, but only this happens is
> increase in physical RAM, goes higher and higher and stop at maximum and it
> never comes down.
> 
> Thanks
> 
> On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog <go...@gmail.com> wrote:
> 
>> Interesting. Which version of Solr is this? What happens if you do a
>> commit?
>> 
>> On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali <an...@gmail.com>
>> wrote:
>>> Hi uwe,
>>> Great to know. We have files indexing 10000/min. After 30 mins I see all
>>> my physical memory say its 100 percentage used(windows). On deep
>>> investigation found that mmap is not releasing os files handles. Do you
>>> find this behaviour?
>>> 
>>> Thanks
>>> 
>>> On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
>>> 
>>> Hi Bill,
>>> 
>>> MMapDirectory uses the file system cache of your operating system, which
>> has
>>> following consequences: In Linux, top & free should normally report only
>>> *few* free memory, because the O/S uses all memory not allocated by
>>> applications to cache disk I/O (and shows it as allocated, so having 0%
>> free
>>> memory is just normal on Linux and also Windows). If you have other
>>> applications or Lucene/Solr itself that allocate lot's of heap space or
>>> malloc() a lot, then you are reducing free physical memory, so reducing
>> fs
>>> cache. This depends also on your swappiness parameter (if swappiness is
>>> higher, inactive processes are swapped out easier, default is 60% on
>> linux -
>>> freeing more space for FS cache - the backside is of course that maybe
>>> in-memory structures of Lucene and other applications get pages out).
>>> 
>>> You will only see no paging at all if all memory allocated all
>> applications
>>> + all mmapped files fit into memory. But paging in/out the mmapped Lucene
>>> index is muuuuuch cheaper than using SimpleFSDirectory or
>> NIOFSDirectory. If
>>> you use SimpleFS or NIO and your index is not in FS cache, it will also
>> read
>>> it from physical disk again, so where is the difference. Paging is
>> actually
>>> cheaper as no syscalls are involved.
>>> 
>>> If you want as much as possible of your index in physical RAM, copy it to
>>> /dev/null regularily and buy more RUM :-)
>>> 
>>> 
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: uwe@thetaphi...
>>> 
>>>> From: Bill Bell [mailto:billnbell@gmail.com]
>>>> Sent: Friday, July 20, 2012 5:17 AM
>>>> Subject: Re: ...
>>>> stop using it? The least used memory will be removed from the OS
>>>> automatically? I see some paging. Wouldn't paging slow down the querying?
>>> 
>>>> 
>>>> My index is 10gb and every 8 hours we get most of it in shared memory.
>>>> The memory is 99 percent used, and that does not leave any room for
>>>> other apps.
>>> 
>>>> Other implications?
>>>> 
>>>> Sent from my mobile device
>>>> 720-256-8076
>>>> 
>>>> On Jul 19, 2012, at 9:49 A...
>>>> Heap space or free system RAM:
>>> 
>>>>> 
>>>>> 
>>>>> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>>>>> 
>>>>> Uwe
>>>>> ...
>>>>>> use it since you might run out of memory on large indexes right?
>>> 
>>>>>> 
>>>>>> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
>>>>>> Dsolr.directoryFactor...
>>>>>> set it all up with a helper in solrconfig.xml...
>>> 
>>>>>> 
>>>>>> if (Constants.WINDOWS) {
>>>>>> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...
>> 
>> 
>> 
>> --
>> Lance Norskog
>> goksron@gmail.com
>> 

Re: RE: How to setup SimpleFSDirectoryFactory

Posted by geetha anjali <an...@gmail.com>.
It happens in 3.6; for this reason I thought of moving to Solandra.
If I do a commit, all the documents are persisted without any issues.
There are no issues in terms of functionality; the only thing that happens is
an increase in physical RAM, which goes higher and higher, stops at the
maximum, and never comes down.

Thanks

On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog <go...@gmail.com> wrote:

> Interesting. Which version of Solr is this? What happens if you do a
> commit?
>
> On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali <an...@gmail.com>
> wrote:
> > Hi uwe,
> > Great to know. We have files indexing 10000/min. After 30 mins I see all
> > my physical memory say its 100 percentage used(windows). On deep
> > investigation found that mmap is not releasing os files handles. Do you
> > find this behaviour?
> >
> > Thanks
> >
> > On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
> >
> > Hi Bill,
> >
> > MMapDirectory uses the file system cache of your operating system, which
> has
> > following consequences: In Linux, top & free should normally report only
> > *few* free memory, because the O/S uses all memory not allocated by
> > applications to cache disk I/O (and shows it as allocated, so having 0%
> free
> > memory is just normal on Linux and also Windows). If you have other
> > applications or Lucene/Solr itself that allocate lot's of heap space or
> > malloc() a lot, then you are reducing free physical memory, so reducing
> fs
> > cache. This depends also on your swappiness parameter (if swappiness is
> > higher, inactive processes are swapped out easier, default is 60% on
> linux -
> > freeing more space for FS cache - the backside is of course that maybe
> > in-memory structures of Lucene and other applications get pages out).
> >
> > You will only see no paging at all if all memory allocated all
> applications
> > + all mmapped files fit into memory. But paging in/out the mmapped Lucene
> > index is muuuuuch cheaper than using SimpleFSDirectory or
> NIOFSDirectory. If
> > you use SimpleFS or NIO and your index is not in FS cache, it will also
> read
> > it from physical disk again, so where is the difference. Paging is
> actually
> > cheaper as no syscalls are involved.
> >
> > If you want as much as possible of your index in physical RAM, copy it to
> > /dev/null regularily and buy more RUM :-)
> >
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi...
> >
> >> From: Bill Bell [mailto:billnbell@gmail.com]
> >> Sent: Friday, July 20, 2012 5:17 AM
> >> Subject: Re: ...
> >> stop using it? The least used memory will be removed from the OS
> >> automatically? I see some paging. Wouldn't paging slow down the querying?
> >
> >>
> >> My index is 10gb and every 8 hours we get most of it in shared memory.
> >> The memory is 99 percent used, and that does not leave any room for
> >> other apps.
> >
> >> Other implications?
> >>
> >> Sent from my mobile device
> >> 720-256-8076
> >>
> >> On Jul 19, 2012, at 9:49 A...
> >> Heap space or free system RAM:
> >
> >> >
> >> >
> >> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >> >
> >> > Uwe
> >> >...
> >> >> use it since you might run out of memory on large indexes right?
> >
> >> >>
> >> >> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
> >> >> Dsolr.directoryFactor...
> >> >> set it all up with a helper in solrconfig.xml...
> >
> >> >>
> >> >> if (Constants.WINDOWS) {
> >> >> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>

Re: RE: How to setup SimpleFSDirectoryFactory

Posted by Lance Norskog <go...@gmail.com>.
Interesting. Which version of Solr is this? What happens if you do a commit?

On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali <an...@gmail.com> wrote:
> Hi uwe,
> Great to know. We have files indexing 10000/min. After 30 mins I see all
> my physical memory say its 100 percentage used(windows). On deep
> investigation found that mmap is not releasing os files handles. Do you
> find this behaviour?
>
> Thanks
>
> On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:
>
> Hi Bill,
>
> MMapDirectory uses the file system cache of your operating system, which has
> following consequences: In Linux, top & free should normally report only
> *few* free memory, because the O/S uses all memory not allocated by
> applications to cache disk I/O (and shows it as allocated, so having 0% free
> memory is just normal on Linux and also Windows). If you have other
> applications or Lucene/Solr itself that allocate lot's of heap space or
> malloc() a lot, then you are reducing free physical memory, so reducing fs
> cache. This depends also on your swappiness parameter (if swappiness is
> higher, inactive processes are swapped out easier, default is 60% on linux -
> freeing more space for FS cache - the backside is of course that maybe
> in-memory structures of Lucene and other applications get pages out).
>
> You will only see no paging at all if all memory allocated all applications
> + all mmapped files fit into memory. But paging in/out the mmapped Lucene
> index is muuuuuch cheaper than using SimpleFSDirectory or NIOFSDirectory. If
> you use SimpleFS or NIO and your index is not in FS cache, it will also read
> it from physical disk again, so where is the difference. Paging is actually
> cheaper as no syscalls are involved.
>
> If you want as much as possible of your index in physical RAM, copy it to
> /dev/null regularily and buy more RUM :-)
>
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi...
>
>> From: Bill Bell [mailto:billnbell@gmail.com]
>> Sent: Friday, July 20, 2012 5:17 AM
>> Subject: Re: ...
>> stop using it? The least used memory will be removed from the OS
>> automatically? I see some paging. Wouldn't paging slow down the querying?
>
>>
>> My index is 10gb and every 8 hours we get most of it in shared memory. The
>> memory is 99 percent used, and that does not leave any room for other
>> apps.
>
>> Other implications?
>>
>> Sent from my mobile device
>> 720-256-8076
>>
>> On Jul 19, 2012, at 9:49 A...
>> Heap space or free system RAM:
>
>> >
>> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> >
>> > Uwe
>> >...
>> >> use it since you might run out of memory on large indexes right?
>
>> >>
>> >> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
>> >> Dsolr.directoryFactor...
>> >> set it all up with a helper in solrconfig.xml...
>
>> >>
>> >> if (Constants.WINDOWS) {
>> >> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...



-- 
Lance Norskog
goksron@gmail.com

Re: RE: How to setup SimpleFSDirectoryFactory

Posted by geetha anjali <an...@gmail.com>.
Hi Uwe,
Great to know. We are indexing files at 10000/min. After 30 mins I see all
my physical memory reported as 100 percent used (Windows). On deeper
investigation I found that mmap is not releasing OS file handles. Do you
see this behaviour?

Thanks

On 20 Jul 2012 14:04, "Uwe Schindler" <uw...@thetaphi.de> wrote:

Hi Bill,

MMapDirectory uses the file system cache of your operating system, which has
following consequences: In Linux, top & free should normally report only
*few* free memory, because the O/S uses all memory not allocated by
applications to cache disk I/O (and shows it as allocated, so having 0% free
memory is just normal on Linux and also Windows). If you have other
applications or Lucene/Solr itself that allocate lot's of heap space or
malloc() a lot, then you are reducing free physical memory, so reducing fs
cache. This depends also on your swappiness parameter (if swappiness is
higher, inactive processes are swapped out easier, default is 60% on linux -
freeing more space for FS cache - the backside is of course that maybe
in-memory structures of Lucene and other applications get pages out).

You will only see no paging at all if all memory allocated all applications
+ all mmapped files fit into memory. But paging in/out the mmapped Lucene
index is muuuuuch cheaper than using SimpleFSDirectory or NIOFSDirectory. If
you use SimpleFS or NIO and your index is not in FS cache, it will also read
it from physical disk again, so where is the difference. Paging is actually
cheaper as no syscalls are involved.

If you want as much as possible of your index in physical RAM, copy it to
/dev/null regularily and buy more RUM :-)


-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi...

> From: Bill Bell [mailto:billnbell@gmail.com]
> Sent: Friday, July 20, 2012 5:17 AM
> Subject: Re: ...
> stop using it? The least used memory will be removed from the OS
> automatically? I see some paging. Wouldn't paging slow down the querying?

>
> My index is 10gb and every 8 hours we get most of it in shared memory. The
> memory is 99 percent used, and that does not leave any room for other
> apps.

> Other implications?
>
> Sent from my mobile device
> 720-256-8076
>
> On Jul 19, 2012, at 9:49 A...
> Heap space or free system RAM:

> >
> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >
> > Uwe
> >...
> >> use it since you might run out of memory on large indexes right?

> >>
> >> Here is how I got iSimpleFSDirectoryFactory to work. Just set -
> >> Dsolr.directoryFactor...
> >> set it all up with a helper in solrconfig.xml...

> >>
> >> if (Constants.WINDOWS) {
> >> if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...

RE: How to setup SimpleFSDirectoryFactory

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Bill,

MMapDirectory uses the file system cache of your operating system, which has
the following consequences: In Linux, top & free should normally report only
*little* free memory, because the O/S uses all memory not allocated by
applications to cache disk I/O (and shows it as allocated, so having 0% free
memory is just normal on Linux and also Windows). If you have other
applications or Lucene/Solr itself that allocate lots of heap space or
malloc() a lot, then you are reducing free physical memory, so reducing fs
cache. This depends also on your swappiness parameter (if swappiness is
higher, inactive processes are swapped out more easily, default is 60 on Linux -
freeing more space for FS cache - the downside is of course that maybe
in-memory structures of Lucene and other applications get paged out).

You will only see no paging at all if all memory allocated by all applications
+ all mmapped files fit into memory. But paging in/out the mmapped Lucene
index is muuuuuch cheaper than using SimpleFSDirectory or NIOFSDirectory. If
you use SimpleFS or NIO and your index is not in FS cache, it will also read
it from physical disk again, so where is the difference? Paging is actually
cheaper as no syscalls are involved.
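
The difference between the two access styles can be illustrated with plain java.nio. This is only a sketch of the general technique, not Lucene's actual implementation: the class and method names are hypothetical, and it reads an arbitrary file rather than an index. The mapped variant touches the file through ordinary memory reads after one map() call, while the other variant issues a read() call for every 8 KB chunk and copies the data into a heap buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Contrasts memory-mapped access with explicit read() calls (illustrative only). */
final class MmapVersusRead {

    /** Sums all bytes through a mapping; pages are faulted in by the O/S on access. */
    static long sumMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF; // plain memory access, no read() call
            }
            return sum;
        }
    }

    /** Sums all bytes via read(); each call copies from the FS cache into the heap. */
    static long sumRead(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            long sum = 0;
            while (ch.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;
                }
                buf.clear();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to any non-empty file
        System.out.println("mapped: " + sumMapped(file));
        System.out.println("read:   " + sumRead(file));
    }
}
```

Both methods return the same value; what differs is how the bytes reach the process. In either case the data is served from the file system cache when it is there and from disk when it is not.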

If you want as much as possible of your index in physical RAM, copy it to
/dev/null regularly and buy more RUM :-)
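
For setups without a shell at hand (for example on Windows), the "copy it to /dev/null" trick can be approximated in Java by reading every index file once and discarding the bytes. This is a minimal sketch under assumptions: the class name is hypothetical, it only walks the top level of the given directory, and whether the pages stay cached afterwards is entirely up to the operating system.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Reads every file in a directory once so the O/S can pull it into its FS cache. */
final class IndexCacheWarmer {

    /** Returns the number of bytes read and discarded. */
    static long warm(Path indexDir) throws IOException {
        byte[] scratch = new byte[1 << 16];
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) {
                    continue; // skip subdirectories and special files
                }
                try (InputStream in = Files.newInputStream(f)) {
                    int n;
                    while ((n = in.read(scratch)) != -1) {
                        total += n; // the data itself is thrown away
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of the index directory.
        System.out.println("bytes read: " + warm(Paths.get(args[0])));
    }
}
```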

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Bill Bell [mailto:billnbell@gmail.com]
> Sent: Friday, July 20, 2012 5:17 AM
> Subject: Re: How to setup SimpleFSDirectoryFactory
> 
> Thanks. Are you saying that if we run low on memory, the MMapDirectory
> will stop using it? The least used memory will be removed from the OS
> automatically? I see some paging. Wouldn't paging slow down the querying?
> 
> My index is 10gb and every 8 hours we get most of it in shared memory. The
> memory is 99 percent used, and that does not leave any room for other
> apps.
>
> Other implications?
> 
> Sent from my mobile device
> 720-256-8076
> 
> On Jul 19, 2012, at 9:49 AM, "Uwe Schindler" <uw...@thetaphi.de> wrote:
> 
> > Read this, then you will see that MMapDirectory will use 0% of your Java
> > Heap space or free system RAM:
> >
> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: William Bell [mailto:billnbell@gmail.com]
> >> Sent: Tuesday, July 17, 2012 6:05 AM
> >> Subject: How to setup SimpleFSDirectoryFactory
> >>
> >> We all know that MMapDirectory is fastest. However we cannot always
> >> use it since you might run out of memory on large indexes right?
> >>
> >> Here is how I got SimpleFSDirectoryFactory to work. Just set
> >> -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
> >>
> >> Your solrconfig.xml:
> >>
> >> <directoryFactory name="DirectoryFactory"
> >> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
> >>
> >> You can check it with http://localhost:8983/solr/admin/stats.jsp
> >>
> >> Notice that the default for Windows 64bit is MMapDirectory. Else
> >> NIOFSDirectory except for Windows.... It would be nicer if we just
> >> set it all up with a helper in solrconfig.xml...
> >>
> >> if (Constants.WINDOWS) {
> >>     if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
> >>         return new MMapDirectory(path, lockFactory);
> >>     else
> >>         return new SimpleFSDirectory(path, lockFactory);
> >> } else {
> >>     return new NIOFSDirectory(path, lockFactory);
> >> }
> >>
> >>
> >>
> >> --
> >> Bill Bell
> >> billnbell@gmail.com
> >> cell 720-256-8076
> >
> >



Re: How to setup SimpleFSDirectoryFactory

Posted by Bill Bell <bi...@gmail.com>.
Thanks. Are you saying that if we run low on memory, the MMapDirectory will stop using it? The least used memory will be removed from the OS automatically? I see some paging. Wouldn't paging slow down the querying?

My index is 10gb and every 8 hours we get most of it in shared memory. The memory is 99 percent used, and that does not leave any room for other apps. 

Other implications?

Sent from my mobile device
720-256-8076

On Jul 19, 2012, at 9:49 AM, "Uwe Schindler" <uw...@thetaphi.de> wrote:

> Read this, then you will see that MMapDirectory will use 0% of your Java Heap space or free system RAM:
> 
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
>> -----Original Message-----
>> From: William Bell [mailto:billnbell@gmail.com]
>> Sent: Tuesday, July 17, 2012 6:05 AM
>> Subject: How to setup SimpleFSDirectoryFactory
>> 
>> We all know that MMapDirectory is fastest. However we cannot always use it
>> since you might run out of memory on large indexes right?
>> 
>> Here is how I got SimpleFSDirectoryFactory to work. Just set
>> -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
>> 
>> Your solrconfig.xml:
>> 
>> <directoryFactory name="DirectoryFactory"
>> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
>> 
>> You can check it with http://localhost:8983/solr/admin/stats.jsp
>> 
>> Notice that the default for Windows 64bit is MMapDirectory. Else
>> NIOFSDirectory except for Windows.... It would be nicer if we just set it all up
>> with a helper in solrconfig.xml...
>> 
>> if (Constants.WINDOWS) {
>>     if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
>>         return new MMapDirectory(path, lockFactory);
>>     else
>>         return new SimpleFSDirectory(path, lockFactory);
>> } else {
>>     return new NIOFSDirectory(path, lockFactory);
>> }
>> 
>> 
>> 
>> --
>> Bill Bell
>> billnbell@gmail.com
>> cell 720-256-8076
> 
> 

RE: How to setup SimpleFSDirectoryFactory

Posted by Uwe Schindler <uw...@thetaphi.de>.
Read this, then you will see that MMapDirectory will use 0% of your Java Heap space or free system RAM:

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: William Bell [mailto:billnbell@gmail.com]
> Sent: Tuesday, July 17, 2012 6:05 AM
> Subject: How to setup SimpleFSDirectoryFactory
> 
> We all know that MMapDirectory is fastest. However we cannot always use it
> since you might run out of memory on large indexes right?
> 
> Here is how I got SimpleFSDirectoryFactory to work. Just set
> -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
> 
> Your solrconfig.xml:
> 
> <directoryFactory name="DirectoryFactory"
> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
> 
> You can check it with http://localhost:8983/solr/admin/stats.jsp
> 
> Notice that the default for Windows 64bit is MMapDirectory. Else
> NIOFSDirectory except for Windows.... It would be nicer if we just set it all up
> with a helper in solrconfig.xml...
> 
> if (Constants.WINDOWS) {
>     if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
>         return new MMapDirectory(path, lockFactory);
>     else
>         return new SimpleFSDirectory(path, lockFactory);
> } else {
>     return new NIOFSDirectory(path, lockFactory);
> }
> 
> 
> 
> --
> Bill Bell
> billnbell@gmail.com
> cell 720-256-8076
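
The "helper" idea from the original question could be sketched outside of Lucene as a small function that mirrors the quoted branch logic and prints the matching -Dsolr.directoryFactory value. Everything here is an assumption for illustration: the class and method names are hypothetical, the 64bit check relies on the sun.arch.data.model system property (present on HotSpot JVMs, possibly absent elsewhere), unmap support is simply passed in as a flag, and the factory names should be checked against the Solr version in use.

```java
import java.util.Locale;

/** Mirrors the quoted if/else to pick a directory factory name (illustrative only). */
final class DirectoryFactoryChooser {

    /**
     * @param osName         value of the "os.name" system property
     * @param is64Bit        whether the JVM is a 64bit JVM
     * @param unmapSupported stand-in for MMapDirectory.UNMAP_SUPPORTED
     */
    static String choose(String osName, boolean is64Bit, boolean unmapSupported) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        if (windows) {
            return (unmapSupported && is64Bit)
                    ? "solr.MMapDirectoryFactory"
                    : "solr.SimpleFSDirectoryFactory";
        }
        return "solr.NIOFSDirectoryFactory";
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name", "");
        boolean is64 = "64".equals(System.getProperty("sun.arch.data.model"));
        // Unmap support is assumed to be available in this sketch.
        System.out.println("-Dsolr.directoryFactory=" + choose(os, is64, true));
    }
}
```

The printed value can be passed on the command line and is picked up by the ${solr.directoryFactory:...} placeholder in the solrconfig.xml shown in the original message.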