Posted to solr-user@lucene.apache.org by James <lj...@163.com> on 2012/01/09 06:29:39 UTC

how to avoid OOM while merge index

I am building a Solr index on Hadoop, and in the reduce step I run a task that merges the indexes. Each partial index is about 1 GB, and I have 10 of them to merge together. I always get a Java heap memory exhausted error, even though the heap size is about 2 GB. I wonder which part uses so much memory, and how to avoid the OOM during the merge process.

Re: how to avoid OOM while merge index

Posted by Ralf Matulat <ra...@bundestag.de>.
A quick guess:
If you are using Tomcat, for example, be sure to grant unlimited virtual
memory to that process, e.g. by putting
"ulimit -v unlimited"
in your Tomcat init script (if you're using Linux).
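
To see which limits are actually in effect before changing anything, a small check like the following may help. It is only a sketch: it assumes a Unix-like system where Python's standard `resource` module is available, and the helper names `describe` and `current_limits` are made up for illustration. It reports the limits of the process that runs it, so it would need to be started from the same environment as the init script to say anything about Tomcat.

```python
import resource

def describe(limit):
    """Render an rlimit value; RLIM_INFINITY means no limit is set."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"

def current_limits():
    """Return soft/hard limits for the two memory-related rlimits."""
    # RLIMIT_AS is what "ulimit -v" controls, RLIMIT_DATA is "ulimit -d".
    wanted = {
        "virtual memory (ulimit -v)": resource.RLIMIT_AS,
        "data segment (ulimit -d)": resource.RLIMIT_DATA,
    }
    result = {}
    for label, res in wanted.items():
        soft, hard = resource.getrlimit(res)
        result[label] = (describe(soft), describe(hard))
    return result

if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={soft}, hard={hard}")
```

Note that `getrlimit` reports bytes, while the shell's `ulimit -v` works in kilobytes.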


-- 
Ralf Matulat
Deutscher Bundestag
Platz der Republik 1
11011 Berlin
Referat IT 1 - Anwendungsadministration
ralf.matulat@bundestag.de
Tel.: 030 - 227 34260


Re:Re: how to avoid OOM while merge index

Posted by James <lj...@163.com>.
Since the Hadoop task monitor checks each task and kills any task it finds consuming too much memory, I am currently looking for a way to decrease the memory usage on the Solr side. Any ideas?
At 2012-01-09 17:07:09, "Tomas Zerolo" <to...@axelspringer.de> wrote:
>[...]


Re: how to avoid OOM while merge index

Posted by Tomas Zerolo <to...@axelspringer.de>.
There are three possible issues here. You should first try to find out which
one it is (it's not clear to me from your question):

  - Java heap memory: you can set that as a start option of the JVM.
    You set the maximum with the -Xmx<size> start option (for example
    -Xmx2g). You get an OutOfMemoryError if you reach that (no idea
    whether the Solr code bubbles this up, but there are experts on
    that here).
  - Operating system limit: you can set the limit for a process's
    use of resources (memory, among others). Typically, Linux-based
    systems are shipped with the memory setting unlimited; Ralf already
    posted how to check/set that.
    The situation here is a bit complicated, because there are
    different limits (memory size vs. virtual memory size, mainly)
    and they are exercised differently depending on the allocation
    pattern. Anyway, I'd expect malloc() to return NULL in this
    case and the Java runtime to translate it (again) into an
    OutOfMemoryError.
  - Now the OOM killer is quite another kettle of fish. AFAIK, it's
    Linux-specific. Once the global system memory is more or less
    exhausted, the kernel kills some applications to try to improve
    the situation. There's a heuristic for deciding which application
    to kill, and there are some knobs to help the kernel in this
    decision. I'd recommend [1]; after reading *that* you'll know it all :-)
    You can tell you've run into that by looking at the system log.
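
As a rough aid for telling these cases apart, the sketch below scans log lines for two kinds of messages: a JVM-side OutOfMemoryError and a kernel OOM killer entry. The matched fragments are assumptions based on commonly seen wording and vary with the JVM and kernel version, and the function names are hypothetical. Since an operating system limit is expected to surface as an OutOfMemoryError as well, the first two cases cannot be separated from the logs alone; checking the process limits is still needed for that.

```python
import re

# Assumed message fragments; the exact wording depends on the JVM and
# kernel version, so adjust these patterns to what your logs contain.
SIGNATURES = [
    ("jvm_out_of_memory", re.compile(r"java\.lang\.OutOfMemoryError")),
    ("kernel_oom_killer", re.compile(r"invoked oom-killer|Out of memory: Kill")),
]

def classify_line(line):
    """Return the name of the first signature found in the line, or None."""
    for name, pattern in SIGNATURES:
        if pattern.search(line):
            return name
    return None

def count_signatures(lines):
    """Count how many lines match each signature."""
    counts = {name: 0 for name, _ in SIGNATURES}
    for line in lines:
        name = classify_line(line)
        if name is not None:
            counts[name] += 1
    return counts
```

One way to use it is to pass an open file object for the task log and another for the system log to `count_signatures` and compare the two results.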


[1] <https://lwn.net/Articles/317814/>
-- 
Tomás Zerolo
Axel Springer AG
Axel Springer media Systems
BILD Produktionssysteme
Axel-Springer-Straße 65
10888 Berlin
Tel.: +49 (30) 2591-72875
tomas.zerolo@axelspringer.de
www.axelspringer.de
