Posted to solr-user@lucene.apache.org by Brian Whitman <br...@echonest.com> on 2009/01/02 16:52:59 UTC

cannot allocate memory for snapshooter

I have an indexing machine on a test server (a mid-level EC2 instance, 8GB
of RAM) and I run jetty like:

java -server -Xms5g -Xmx5g -XX:MaxPermSize=128m
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heap
-Dsolr.solr.home=/vol/solr -Djava.awt.headless=true -jar start.jar

The indexing master is set to snapshoot on commit. Sometimes (not always)
the snapshot fails with

SEVERE: java.io.IOException: Cannot run program "/vol/solr/bin/snapshooter":
java.io.IOException: error=12, Cannot allocate memory
at java.lang.ProcessBuilder.start(Unknown Source)

Why would snapshooter need more than 2GB ram?  /proc/meminfo says (with solr
running & nothing else)

MemTotal:      7872040 kB
MemFree:       2018404 kB
Buffers:         67704 kB
Cached:        2161880 kB
SwapCached:          0 kB
Active:        3446348 kB
Inactive:      2186964 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:               8 kB
Writeback:           0 kB
AnonPages:     3403728 kB
Mapped:          12016 kB
Slab:            37804 kB
SReclaimable:    20048 kB
SUnreclaim:      17756 kB
PageTables:       7476 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   3936020 kB
Committed_AS:  5383624 kB
VmallocTotal: 34359738367 kB
VmallocUsed:       340 kB
VmallocChunk: 34359738027 kB

Re: understanding queryNorm

Posted by Chris Hostetter <ho...@fucit.org>.
:    i wanted to understand how the queryNorm is calculated. i did read 
: similarity documentation of lucene it says it is
	...
: what would be default q.getBoost() ?  ( as i am not giving any value 
: specifically any where in solr). t.getBoost() is 1 in my case as i am 

all queries have a boost value, even if you don't specify one they have a 
default -- i believe it's "1" for every stock query, but a custom Impl 
could have an alternate default if it really wanted to.

the easiest way to visualize a lot of this is with debugQuery=true
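
As a rough illustration of the formula from the Similarity docs, here is a minimal Python sketch (not Lucene code). The idf values 3.0 and 4.0 are made up for the example, the function names are hypothetical, and all boosts are left at the default of 1:

```python
import math

def sum_of_squared_weights(query_boost, terms):
    """terms is a list of (idf, term_boost) pairs, one per query term."""
    return query_boost ** 2 * sum((idf * boost) ** 2 for idf, boost in terms)

def query_norm(query_boost, terms):
    """queryNorm = 1 / sqrt(sumOfSquaredWeights)."""
    return 1.0 / math.sqrt(sum_of_squared_weights(query_boost, terms))

# Hypothetical two-term query: made-up idf values, default boosts of 1.0.
terms = [(3.0, 1.0), (4.0, 1.0)]
print(query_norm(1.0, terms))  # 1 / sqrt(9 + 16) = 0.2
```

Raising the query boost to 2.0 in this sketch halves the result, since the boost is squared inside the square root.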


-Hoss


understanding queryNorm

Posted by vinay kumar kaku <vk...@hotmail.com>.
Hi,
   i wanted to understand how the queryNorm is calculated. i did read the Similarity documentation of lucene; it says it is


\[ \mathrm{queryNorm}(q) = \frac{1}{\sqrt{\mathrm{sumOfSquaredWeights}}} \]

\[ \mathrm{sumOfSquaredWeights} = q.\mathrm{getBoost}()^2 \cdot \sum_{t \in q} \bigl( \mathrm{idf}(t) \cdot t.\mathrm{getBoost}() \bigr)^2 \]

what would be the default q.getBoost()? (as i am not giving any value specifically anywhere in solr). t.getBoost() is 1 in my case as i am not boosting any term at query time. can someone explain with an example if you could?

thank you,
vinay


Re: cannot allocate memory for snapshooter

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Here is another one that I just saw on hadoop's core-user list:

"If you have overcommit_mem turned on, Java 1.5 will lock *all* of
its maximum heap size into RAM (ignores swap!) upon startup.  Earlier
versions of 1.5 also allocate 1GB of RAM for code compilation.  I've
seen situations where there was enough RAM to start up, but not enough
to fork other processes."

 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Brian Whitman <br...@echonest.com>
> To: solr-user@lucene.apache.org
> Sent: Friday, January 2, 2009 2:47:16 PM
> Subject: Re: cannot allocate memory for snapshooter
> 
> Thanks for the pointer. (It seems really weird to alloc 5GB of swap just
> because the JVM needs to run a shell script.. but I get hoss's explanation
> in the following post)
> 
> On Fri, Jan 2, 2009 at 2:37 PM, Bill Au wrote:
> 
> > add more swap space:
> > http://www.nabble.com/Not-enough-space-to11423199.html#a11424938
> >
> > Bill


Re: cannot allocate memory for snapshooter

Posted by Mark Miller <ma...@gmail.com>.
Brian Whitman wrote:
> On Sun, Jan 4, 2009 at 9:47 PM, Mark Miller <ma...@gmail.com> wrote:
>
>   
>> Hey Brian, I didn't catch what OS you are using on EC2 by the way. I
>> thought most UNIX OS's were using memory overcommit - A quick search brings
>> up Linux, AIX, and HP-UX, and maybe even OSX?
>>
>> What are you running over there? EC2, so Linux I assume?
>>
>>     
>
> This is on debian, a 2.6.21 x86_64 kernel
Interesting. Well it must not be overcommitting then. I think that's your 
only hope without a serious pain in the butt.

Check your settings - you should be able to do it through proc. Here is 
a bit of info cut and pasted from the web. I am *guessing* that you are 
in mode 0, and the heuristic is worried you are going to try and use 
that RAM. Perhaps try 1:

echo 1 > /proc/sys/vm/overcommit_memory

    0 - Heuristic overcommit handling. Obvious overcommits of address
        space are refused. Used for a typical system. It ensures a seriously
        wild allocation fails while allowing overcommit to reduce swap usage.
        root is allowed to allocate slightly more memory in this mode. This is
        the default.
    1 - Always overcommit.
    2 - Don't overcommit. The total address space commit for the system
        is not permitted to exceed swap plus a configurable percentage (default
        is 50) of physical RAM. Depending on the percentage you use, in most
        situations this means a process will not be killed while attempting to
        use already-allocated memory but will receive errors on memory
        allocation as appropriate.
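
For what it's worth, the mode 2 limit described above can be checked against the numbers in the original post. A minimal Python sketch, assuming the default percentage of 50 and ignoring details such as huge pages; commit_limit_kb is a hypothetical helper name, not a kernel or library function:

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50):
    """Mode 2 limit: swap plus a percentage of physical RAM (simplified)."""
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100

# MemTotal and SwapTotal as posted in /proc/meminfo.
limit = commit_limit_kb(mem_total_kb=7872040, swap_total_kb=0)
committed_as_kb = 5383624  # Committed_AS from the same output

print(limit)                    # 3936020
print(committed_as_kb - limit)  # 1447604
```

The result matches the CommitLimit line in the posted /proc/meminfo, and Committed_AS is already about 1.4 GB above it. The kernel only enforces this limit in mode 2, so on its own this does not tell you which mode the machine is in.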

Re: cannot allocate memory for snapshooter

Posted by Brian Whitman <br...@echonest.com>.
On Sun, Jan 4, 2009 at 9:47 PM, Mark Miller <ma...@gmail.com> wrote:

> Hey Brian, I didn't catch what OS you are using on EC2 by the way. I
> thought most UNIX OS's were using memory overcommit - A quick search brings
> up Linux, AIX, and HP-UX, and maybe even OSX?
>
> What are you running over there? EC2, so Linux I assume?
>

This is on debian, a 2.6.21 x86_64 kernel.

Re: cannot allocate memory for snapshooter

Posted by Mark Miller <ma...@gmail.com>.
Hey Brian, I didn't catch what OS you are using on EC2 by the way. I 
thought most UNIX OS's were using memory overcommit - A quick search 
brings up Linux, AIX, and HP-UX, and maybe even OSX?

What are you running over there? EC2, so Linux I assume?

Yonik: I take it, now that Linux uses copy on write, they stopped with 
the memory overcommit? Or perhaps Brian is not on Linux...

I love Unix stuff :) So much Java and Windows in my past, I still find 
this info really cool. Windows command line was never so interesting.


- Mark

Re: cannot allocate memory for snapshooter

Posted by Mark Miller <ma...@gmail.com>.
Yonik Seeley wrote:
> On Sun, Jan 4, 2009 at 8:07 PM, Mark Miller <ma...@gmail.com> wrote:
>   
>> Forking for a small script on something that can have such a large memory
>> footprint is just a huge waste of resources. Ideally you might have a tiny
>> program running, listening on a socket or something, and it can be alerted
>> and do the actual fork (being small itself). Or some other such workaround,
>> other than copying a few gig into RAM or swap :)
>>     
>
> Well, fork doesn't actually copy anymore (for a long time now) - it's
> really only the page tables that get copied and set to copy-on-write
> so the fork is actually pretty lightweight.
>   
Right, copying was the wrong word. It depends. Depending on your Unix 
variant, it will actually use vfork, or sometimes..., or sometimes 
you have no option to share. (Because you can screw with the 
parent, I've seen warnings in the doc for vfork that it's not recommended 
even for use - but this could be old now, and was for a particular 
version of UNIX that I don't remember)
> The issue is that the OS is being conservative and checking that there
> would be enough RAM+SWAP available if all of the process address space
> did have to be copied/allocated (older versions of linux didn't do
> this check and allowed memory overcommit).  The OS doesn't know that
> the fork will be followed by an exec.
>   
I don't think you can just count on that in a unix environment. Maybe 
Linux took care of it, but is that common on all versions of Unix? And 
if you have an older version of Linux?
> So the workaround of creating more swap is just so that this OS memory
> overcommit check passes.  The swap won't actually be used by the fork
> + exec.
>   
Again, only if you're lucky. It depends on the many implementations of 
fork. A lot of times fork is actually vfork or something, but solr can't 
count on it for everybody I wouldn't think.
> The real fix would be for the JVM to use something like vfork when available.
>   
Which kind of happens behind the scenes if you're lucky already. Some unix 
guys don't like it, and I assume that's why it's not the standard (overly 
concerned with the child process mucking up the parent process).

I shouldn't have said copy - the issue is that we are looking for way too 
much RAM. A JVM using 5 gig will look for another 5 - that's terrible. I 
don't think we can solve it in a universal way for Unix by relying on 
forking the JVM though. It's hit or miss. The real fix can't depend on 
your OS variant and its version I wouldn't think.

Re: cannot allocate memory for snapshooter

Posted by Yonik Seeley <ys...@gmail.com>.
On Sun, Jan 4, 2009 at 8:07 PM, Mark Miller <ma...@gmail.com> wrote:
> Forking for a small script on something that can have such a large memory
> footprint is just a huge waste of resources. Ideally you might have a tiny
> program running, listening on a socket or something, and it can be alerted
> and do the actual fork (being small itself). Or some other such workaround,
> other than copying a few gig into RAM or swap :)

Well, fork doesn't actually copy anymore (for a long time now) - it's
really only the page tables that get copied and set to copy-on-write
so the fork is actually pretty lightweight.
The issue is that the OS is being conservative and checking that there
would be enough RAM+SWAP available if all of the process address space
did have to be copied/allocated (older versions of linux didn't do
this check and allowed memory overcommit).  The OS doesn't know that
the fork will be followed by an exec.
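
To put rough numbers on that check, here is a minimal Python sketch using the /proc/meminfo values from the original post. It assumes the check counts roughly free memory, buffers, page cache, free swap and reclaimable slab as available; the real kernel logic is more involved, so treat this as an approximation. The helper names are hypothetical.

```python
# Relevant lines copied from the /proc/meminfo output in the original post.
MEMINFO = """\
MemFree:       2018404 kB
Buffers:         67704 kB
Cached:        2161880 kB
SwapFree:            0 kB
SReclaimable:    20048 kB
"""

def parse_meminfo(text):
    """Turn 'Name:   123 kB' lines into a {name: kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields

def rough_available_kb(info):
    """Approximation of what a conservative check might count as available."""
    keys = ("MemFree", "Buffers", "Cached", "SwapFree", "SReclaimable")
    return sum(info[key] for key in keys)

info = parse_meminfo(MEMINFO)
available_kb = rough_available_kb(info)
heap_kb = 5 * 1024 * 1024  # -Xmx5g expressed in kB

print(available_kb)             # 4268036
print(available_kb >= heap_kb)  # False
```

With these numbers roughly 4 GB counts as available, which is less than the 5 GB heap, so under this approximation the check fails even though the fork would barely touch that memory. How much is free or cached changes over time, which would fit the "sometimes (not always)" in the original report.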

So the workaround of creating more swap is just so that this OS memory
overcommit check passes.  The swap won't actually be used by the fork
+ exec.

The real fix would be for the JVM to use something like vfork when available.

-Yonik

Re: cannot allocate memory for snapshooter

Posted by Mark Miller <ma...@gmail.com>.
You're right, it's nasty, but it's how fork works. I would say it's something 
that should be fixed, it's so nasty, but with the new all Java 
replication, it's probably a moot point.

Forking for a small script on something that can have such a large 
memory footprint is just a huge waste of resources. Ideally you might 
have a tiny program running, listening on a socket or something, and it 
can be alerted and do the actual fork (being small itself). Or some 
other such workaround, other than copying a few gig into RAM or swap :)
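
A minimal Python sketch of that idea, just to make the shape concrete. Everything here is an assumption for illustration: the serve and trigger names, the one-message protocol, and the placeholder command standing in for snapshooter. In a real setup the helper would run as its own small process, started separately from the JVM; a thread stands in for it here so the example is self-contained.

```python
import socket
import subprocess
import sys
import threading

def serve(listener, command, max_requests=None):
    """Small helper: run `command` once per incoming connection."""
    handled = 0
    while max_requests is None or handled < max_requests:
        conn, _ = listener.accept()
        with conn:
            conn.recv(64)  # any message counts as a trigger
            returncode = subprocess.run(command).returncode
            conn.sendall(str(returncode).encode())
        handled += 1

def trigger(address):
    """Client side: ask the helper to run its command, return the exit code."""
    with socket.create_connection(address) as conn:
        conn.sendall(b"go")
        return int(conn.recv(64).decode())

# Placeholder command; a real helper would launch the snapshot script here.
command = [sys.executable, "-c", "print('snapshot taken')"]

listener = socket.create_server(("127.0.0.1", 0))  # port 0: pick a free port
helper = threading.Thread(target=serve, args=(listener, command, 1))
helper.start()

exit_code = trigger(listener.getsockname())
helper.join()
listener.close()
print(exit_code)  # exit code reported by the helper
```

The large process only opens a socket and writes a few bytes, so it never has to fork itself; the fork happens in the small helper.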

The new all Java replication looks a little nicer in the face of this 
(someone was asking about the differences earlier).

- Mark

Brian Whitman wrote:
> Thanks for the pointer. (It seems really weird to alloc 5GB of swap just
> because the JVM needs to run a shell script.. but I get hoss's explanation
> in the following post)
>
> On Fri, Jan 2, 2009 at 2:37 PM, Bill Au <bi...@gmail.com> wrote:
>
>   
>> add more swap space:
>> http://www.nabble.com/Not-enough-space-to11423199.html#a11424938
>>
>> Bill


Re: cannot allocate memory for snapshooter

Posted by Brian Whitman <br...@echonest.com>.
Thanks for the pointer. (It seems really weird to alloc 5GB of swap just
because the JVM needs to run a shell script.. but I get hoss's explanation
in the following post)

On Fri, Jan 2, 2009 at 2:37 PM, Bill Au <bi...@gmail.com> wrote:

> add more swap space:
> http://www.nabble.com/Not-enough-space-to11423199.html#a11424938
>
> Bill

Re: cannot allocate memory for snapshooter

Posted by Bill Au <bi...@gmail.com>.
add more swap space:
http://www.nabble.com/Not-enough-space-to11423199.html#a11424938

Bill


Re: cannot allocate memory for snapshooter

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Brian,

Could it be your OS that's running out of some resources, and not the JVM?
How many processes/tasks do you see in top?  Do you see any swap there?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


