Posted to user@hbase.apache.org by Andrey Stepachev <oc...@gmail.com> on 2011/01/11 15:57:12 UTC

Re: Java Committed Virtual Memory significantly larger than Heap Memory

After starting HBase on JRockit, I found the same memory leakage.

After the launch:

 Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head          Tue Jan 11 16:49:31 2011

   Tue Jan 11 16:49:31 MSK 2011
    PID     RSS     VSZ %CPU
   7863 2547760 5576744 78.7



JR dumps:

 Total mapped                  5576740KB (reserved=2676404KB)
 -              Java heap      2048000KB (reserved=1472176KB)
 -              GC tables        68512KB
 -          Thread stacks        37236KB (#threads=111)
 -          Compiled code      1048576KB (used=2599KB)
 -               Internal         1224KB
 -                     OS       549688KB
 -                  Other      1800976KB
 -            Classblocks         1280KB (malloced=1110KB #3285)
 -        Java class data        20224KB (malloced=20002KB #15134 in 3285 classes)
 - Native memory tracking         1024KB (malloced=325KB +10KB #20)



After running a MapReduce job that generates a high write load (~1 hour):

 Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head          Tue Jan 11 17:08:56 2011

   Tue Jan 11 17:08:56 MSK 2011
    PID     RSS     VSZ %CPU
   7863 4072396 5459572  100



JR dump (most of it is not important; below I point out the relevant part and explain why):

http://paste.ubuntu.com/552820/


7863:
Total mapped                  5742628KB +165888KB (reserved=1144000KB -1532404KB)
-              Java heap      2048000KB           (reserved=0KB -1472176KB)
-              GC tables        68512KB
-          Thread stacks        38028KB    +792KB (#threads=114 +3)
-          Compiled code      1048576KB           (used=3376KB +776KB)
-               Internal         1480KB    +256KB
-                     OS       517944KB  -31744KB
-                  Other      1996792KB +195816KB
-            Classblocks         1280KB           (malloced=1156KB +45KB #3421 +136)
-        Java class data        20992KB    +768KB (malloced=20843KB +840KB #15774 +640 in 3421 classes)
- Native memory tracking         1024KB           (malloced=325KB +10KB #20)


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    OS                          *java    r x 0x0000000000400000 (     76KB)
    OS                          *java    rw  0x0000000000612000 (      4KB)
    OS                        *[heap]    rw  0x0000000000613000 ( 478712KB)
   INT                           Poll    r   0x000000007fffe000 (      4KB)
   INT                         Membar    rw  0x000000007ffff000 (      4KB)
   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (    768KB)
   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (    512KB)
  HEAP                      Java heap    rw  0x0000000083000000 (2048000KB)
                                         rw  0x00007f2574000000 (  65500KB)
                                             0x00007f2577ff7000 (     36KB)
                                         rw  0x00007f2584000000 (  65492KB)
                                             0x00007f2587ff5000 (     44KB)
                                         rw  0x00007f258c000000 (  65500KB)
                                             0x00007f258fff7000 (     36KB)
                                         rw  0x00007f2590000000 (  65500KB)
                                             0x00007f2593ff7000 (     36KB)
                                         rw  0x00007f2594000000 (  65500KB)
                                             0x00007f2597ff7000 (     36KB)
                                         rw  0x00007f2598000000 ( 131036KB)
                                             0x00007f259fff7000 (     36KB)
                                         rw  0x00007f25a0000000 (  65528KB)
                                             0x00007f25a3ffe000 (      8KB)
                                         rw  0x00007f25a4000000 (  65496KB)
                                             0x00007f25a7ff6000 (     40KB)
                                         rw  0x00007f25a8000000 (  65496KB)
                                             0x00007f25abff6000 (     40KB)
                                         rw  0x00007f25ac000000 (  65504KB)



So, the difference was in the pieces of memory like this:

 rw 0x00007f2590000000 (65500KB)
     0x00007f2593ff7000 (36KB)


Looks like HLog is allocating this memory (I suspect HLog because the size is
very similar).

If we count these blocks, we get the amount of lost memory:

65M * 32 + 132M = 2212M

So it looks like HLog allocates too much memory, and the question is: how can
we restrict it?

2010/12/30 Andrey Stepachev <oc...@gmail.com>

> Hi All.
>
> After a heavy load into HBase (single-node, non-distributed test system), I got
> a 4GB process size for my HBase java process.
> On a 6GB machine there was no room for anything else (disk cache and so on).
>
> Does anybody know what is going on, and how you solve this? What heap size is
> set on your hosts, and how much RSS does the HBase process actually use?
>
> I haven't seen this before; Tomcat and other Java apps don't eat
> significantly more memory than -Xmx.
>
>  Connection name:  pid: 23476 org.apache.hadoop.hbase.master.HMaster start
>  Virtual Machine:  Java HotSpot(TM) 64-Bit Server VM version 17.1-b03
>  Vendor:  Sun Microsystems Inc.
>  Name:  23476@mars
>  Uptime:  12 hours 4 minutes
>  Process CPU time:  5 hours 45 minutes
>  JIT compiler:  HotSpot 64-Bit Server Compiler
>  Total compile time:  19,223 seconds
> ------------------------------
>  Current heap size:  703 903 kbytes
>  Maximum heap size:  2 030 976 kbytes
>  Committed memory:  2 030 976 kbytes
>  Pending finalization:  0 objects
>  Garbage collector:  Name = 'ParNew', Collections = 9 990, Total time spent = 5 minutes
>  Garbage collector:  Name = 'ConcurrentMarkSweep', Collections = 20, Total time spent = 35,754 seconds
> ------------------------------
>  Operating System:  Linux 2.6.34.7-0.5-xen
>  Architecture:  amd64
>  Number of processors:  8
>  Committed virtual memory:  4 403 512 kbytes
>  Total physical memory:  6 815 744 kbytes
>  Free physical memory:  82 720 kbytes
>  Total swap space:  8 393 924 kbytes
>  Free swap space:  8 050 880 kbytes
>
>
>
>

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Inline...

> Hi Friso and everyone, 
> 
> OK. We don't have to spend time to juggle hadoop-core jars anymore since Todd is working hard on enhancing hadoop-lzo behavior. 
> 
> I think your assumption is correct, but what I was trying to say was HBase doesn't change the way to use Hadoop compressors since HBase 0.20 release while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch) and CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of them don't have reinit().

Ah, so my mistake was thinking that using reinit() is something HBase-specific, when it actually just depends on the Hadoop jar that you drop into the lib folder. It's just that I never saw these problems in mappers and reducers, only in the RS.

@Stack, to answer your question once more then: I don't think it's a problem with the way that HBase uses the compressors, but it's a problem with the (LZO) compressor implementation in combination with the usage pattern that you get when using HBase and particular types of workloads.

> 
> HBase tries to create an output compression stream on each compression block, and one HFile flush will contain roughly 1000 compression blocks. I think reinit() could get called 1000 times on one flush, and if hadoop-lzo allocates 64MB block on reinit() (HBase's compression blocks is about 64KB though), it will become pretty much something you're observing now. 
> 
> Thanks, 
> 
> --
> Tatsuya Kawano (Mr.)
> Tokyo, Japan
> 
> 
> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
> 
>> Can someone who is having this issue try checking out the following git
>> branch and rebuilding LZO?
>> 
>> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
>> 
>> This definitely stems one leak of a 64KB directbuffer on every reinit.
>> 
>> -Todd
>> 
>> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:
>> 
>>> Yea, you're definitely on the right track. Have you considered systems
>>> programming, Friso? :)
>>> 
>>> Hopefully should have a candidate patch to LZO later today.
>>> 
>>> -Todd
>>> 
>>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>>> fvanvollenhoven@xebia.com> wrote:
>>> 
>>>> Hi,
>>>> My guess is indeed that it has to do with using the reinit() method on
>>>> compressors and making them long lived instead of throwaway together with
>>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>>> objects not to be finalized and as a result not release their native
>>>> allocations. It's just theory and I haven't had the time to properly verify
>>>> this (unfortunately, I spend most of my time writing application code), but
>>>> Todd said he will be looking into it further. I browsed the LZO code to see
>>>> what was going on there, but with my limited knowledge of the HBase code it
>>>> would be bald to say that this is for sure the case. It would be my first
>>>> direction of investigation. I would add some logging to the LZO code where
>>>> new direct byte buffers are created to log how often that happens and what
>>>> size they are and then redo the workload that shows the leak. Together with
>>>> some profiling you should be able to see how long it takes for these get
>>>> finalized.
>>>> 
>>>> Cheers,
>>>> Friso
>>>> 
>>>> 
>>>> 
>>>> On 12 jan 2011, at 20:08, Stack wrote:
>>>> 
>>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>>>> problem. Compressing the map output using LZO works just fine. The problem
>>>> is HBase LZO compression. The region server process is the one with the
>>>> memory leak...
>>>>>> 
>>>>> 
>>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>>> use of the Compression API in a manner that produces leaks?
>>>>> Thanks,
>>>>> St.Ack
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>> 
>> 
>> 
>> 
>> -- 
>> Todd Lipcon
>> Software Engineer, Cloudera


Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Jan 13, 2011 at 12:25 AM, Friso van Vollenhoven <
fvanvollenhoven@xebia.com> wrote:

> Hey Todd,
>
> I saw the patch. On what JVM (versions) have you tested this?
>

I tested on Sun JVM 1.6u22, but the undocumented calls I used have
definitely been around for a long time, so it ought to work on any Sun or
OpenJDK as far as I know.


>
> (Probably the wrong list for this, but: is there a officially supported JVM
> version for CDH3?)
>
>
We recommend Sun 1.6 >= u16, but not u18.

-Todd

>
>
> On 13 jan 2011, at 07:42, Todd Lipcon wrote:
>
> > On Wed, Jan 12, 2011 at 5:01 PM, Tatsuya Kawano <tatsuya6502@gmail.com
> >wrote:
> >
> >>> And
> >>> in some circumstances (like all the rigged tests I've attempted to do)
> >> these
> >>> get cleaned up nicely by the JVM. It seems only in pretty large heaps
> in
> >>> real workloads does the leak actually end up running away.
> >>
> >> This issue should be circumstance dependent as we don't have direct
> control
> >> on deallocating those buffers. We need them GCed but they never occupy
> the
> >> Java heap to encourage the GC to run.
> >>
> >
> > Thanks to reflection and use of undocumented APIs, you can actually
> free() a
> > direct buffer - check out the patch referenced earlier in this thread.
> >
> > Of course it probably doesn't work on other JVMs... oh well.
> >
> > -Todd
> >
> >>
> >>
> >> On Jan 13, 2011, at 8:30 AM, Todd Lipcon <to...@cloudera.com> wrote:
> >>
> >>> On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <tatsuya6502@gmail.com
> >>> wrote:
> >>>
> >>>> Hi Friso and everyone,
> >>>>
> >>>> OK. We don't have to spend time to juggle hadoop-core jars anymore
> since
> >>>> Todd is working hard on enhancing hadoop-lzo behavior.
> >>>>
> >>>> I think your assumption is correct, but what I was trying to say was
> >> HBase
> >>>> doesn't change the way to use Hadoop compressors since HBase 0.20
> >> release
> >>>> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21
> and
> >>>> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append
> branch)
> >> and
> >>>> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2,
> so
> >> I
> >>>> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both
> of
> >>>> them don't have reinit().
> >>>>
> >>>>
> >>> Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if
> you
> >>> have a CDH3b3 cluster for one of the other features included, you need
> to
> >>> use a 3b3 client jar as well, which includes the reinit stuff.
> >>>
> >>>
> >>>> HBase tries to create an output compression stream on each compression
> >>>> block, and one HFile flush will contain roughly 1000 compression
> blocks.
> >> I
> >>>> think reinit() could get called 1000 times on one flush, and if
> >> hadoop-lzo
> >>>> allocates 64MB block on reinit() (HBase's compression blocks is about
> >> 64KB
> >>>> though), it will become pretty much something you're observing now.
> >>>>
> >>>>
> >>> Yep - though I think it's only leaking a 64K buffer for each in 0.4.8.
> >> And
> >>> in some circumstances (like all the rigged tests I've attempted to do)
> >> these
> >>> get cleaned up nicely by the JVM. It seems only in pretty large heaps
> in
> >>> real workloads does the leak actually end up running away.
> >>>
> >>> -Todd
> >>>
> >>>>
> >>>> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
> >>>>
> >>>>> Can someone who is having this issue try checking out the following
> git
> >>>>> branch and rebuilding LZO?
> >>>>>
> >>>>> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> >>>>>
> >>>>> This definitely stems one leak of a 64KB directbuffer on every
> reinit.
> >>>>>
> >>>>> -Todd
> >>>>>
> >>>>> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com>
> >> wrote:
> >>>>>
> >>>>>> Yea, you're definitely on the right track. Have you considered
> systems
> >>>>>> programming, Friso? :)
> >>>>>>
> >>>>>> Hopefully should have a candidate patch to LZO later today.
> >>>>>>
> >>>>>> -Todd
> >>>>>>
> >>>>>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
> >>>>>> fvanvollenhoven@xebia.com> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>> My guess is indeed that it has to do with using the reinit() method
> >> on
> >>>>>>> compressors and making them long lived instead of throwaway
> together
> >>>> with
> >>>>>>> the LZO implementation of reinit(), which magically causes NIO
> buffer
> >>>>>>> objects not to be finalized and as a result not release their
> native
> >>>>>>> allocations. It's just theory and I haven't had the time to
> properly
> >>>> verify
> >>>>>>> this (unfortunately, I spend most of my time writing application
> >> code),
> >>>> but
> >>>>>>> Todd said he will be looking into it further. I browsed the LZO
> code
> >> to
> >>>> see
> >>>>>>> what was going on there, but with my limited knowledge of the HBase
> >>>> code it
> >>>>>>> would be bald to say that this is for sure the case. It would be my
> >>>> first
> >>>>>>> direction of investigation. I would add some logging to the LZO
> code
> >>>> where
> >>>>>>> new direct byte buffers are created to log how often that happens
> and
> >>>> what
> >>>>>>> size they are and then redo the workload that shows the leak.
> >> Together
> >>>> with
> >>>>>>> some profiling you should be able to see how long it takes for
> these
> >>>> get
> >>>>>>> finalized.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Friso
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 12 jan 2011, at 20:08, Stack wrote:
> >>>>>>>
> >>>>>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> >>>>>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not
> >> the
> >>>>>>> problem. Compressing the map output using LZO works just fine. The
> >>>> problem
> >>>>>>> is HBase LZO compression. The region server process is the one with
> >> the
> >>>>>>> memory leak...
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> (Sorry for dumb question Friso) But HBase is leaking because we
> make
> >>>>>>>> use of the Compression API in a manner that produces leaks?
> >>>>>>>> Thanks,
> >>>>>>>> St.Ack
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Todd Lipcon
> >>>>>> Software Engineer, Cloudera
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Todd Lipcon
> >>>>> Software Engineer, Cloudera
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Todd Lipcon
> >>> Software Engineer, Cloudera
> >>
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Hey Todd,

I saw the patch. On what JVM (versions) have you tested this?

(Probably the wrong list for this, but: is there an officially supported JVM version for CDH3?)


Thanks,
Friso


On 13 jan 2011, at 07:42, Todd Lipcon wrote:

> On Wed, Jan 12, 2011 at 5:01 PM, Tatsuya Kawano <ta...@gmail.com>wrote:
> 
>>> And
>>> in some circumstances (like all the rigged tests I've attempted to do)
>> these
>>> get cleaned up nicely by the JVM. It seems only in pretty large heaps in
>>> real workloads does the leak actually end up running away.
>> 
>> This issue should be circumstance dependent as we don't have direct control
>> on deallocating those buffers. We need them GCed but they never occupy the
>> Java heap to encourage the GC to run.
>> 
> 
> Thanks to reflection and use of undocumented APIs, you can actually free() a
> direct buffer - check out the patch referenced earlier in this thread.
> 
> Of course it probably doesn't work on other JVMs... oh well.
> 
> -Todd
> 
>> 
>> 
>> On Jan 13, 2011, at 8:30 AM, Todd Lipcon <to...@cloudera.com> wrote:
>> 
>>> On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <tatsuya6502@gmail.com
>>> wrote:
>>> 
>>>> Hi Friso and everyone,
>>>> 
>>>> OK. We don't have to spend time to juggle hadoop-core jars anymore since
>>>> Todd is working hard on enhancing hadoop-lzo behavior.
>>>> 
>>>> I think your assumption is correct, but what I was trying to say was
>> HBase
>>>> doesn't change the way to use Hadoop compressors since HBase 0.20
>> release
>>>> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and
>>>> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch)
>> and
>>>> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so
>> I
>>>> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of
>>>> them don't have reinit().
>>>> 
>>>> 
>>> Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
>>> have a CDH3b3 cluster for one of the other features included, you need to
>>> use a 3b3 client jar as well, which includes the reinit stuff.
>>> 
>>> 
>>>> HBase tries to create an output compression stream on each compression
>>>> block, and one HFile flush will contain roughly 1000 compression blocks.
>> I
>>>> think reinit() could get called 1000 times on one flush, and if
>> hadoop-lzo
>>>> allocates 64MB block on reinit() (HBase's compression blocks is about
>> 64KB
>>>> though), it will become pretty much something you're observing now.
>>>> 
>>>> 
>>> Yep - though I think it's only leaking a 64K buffer for each in 0.4.8.
>> And
>>> in some circumstances (like all the rigged tests I've attempted to do)
>> these
>>> get cleaned up nicely by the JVM. It seems only in pretty large heaps in
>>> real workloads does the leak actually end up running away.
>>> 
>>> -Todd
>>> 
>>>> 
>>>> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
>>>> 
>>>>> Can someone who is having this issue try checking out the following git
>>>>> branch and rebuilding LZO?
>>>>> 
>>>>> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
>>>>> 
>>>>> This definitely stems one leak of a 64KB directbuffer on every reinit.
>>>>> 
>>>>> -Todd
>>>>> 
>>>>> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com>
>> wrote:
>>>>> 
>>>>>> Yea, you're definitely on the right track. Have you considered systems
>>>>>> programming, Friso? :)
>>>>>> 
>>>>>> Hopefully should have a candidate patch to LZO later today.
>>>>>> 
>>>>>> -Todd
>>>>>> 
>>>>>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>>>>>> fvanvollenhoven@xebia.com> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> My guess is indeed that it has to do with using the reinit() method
>> on
>>>>>>> compressors and making them long lived instead of throwaway together
>>>> with
>>>>>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>>>>>> objects not to be finalized and as a result not release their native
>>>>>>> allocations. It's just theory and I haven't had the time to properly
>>>> verify
>>>>>>> this (unfortunately, I spend most of my time writing application
>> code),
>>>> but
>>>>>>> Todd said he will be looking into it further. I browsed the LZO code
>> to
>>>> see
>>>>>>> what was going on there, but with my limited knowledge of the HBase
>>>> code it
>>>>>>> would be bald to say that this is for sure the case. It would be my
>>>> first
>>>>>>> direction of investigation. I would add some logging to the LZO code
>>>> where
>>>>>>> new direct byte buffers are created to log how often that happens and
>>>> what
>>>>>>> size they are and then redo the workload that shows the leak.
>> Together
>>>> with
>>>>>>> some profiling you should be able to see how long it takes for these
>>>> get
>>>>>>> finalized.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Friso
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 12 jan 2011, at 20:08, Stack wrote:
>>>>>>> 
>>>>>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>>>>>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not
>> the
>>>>>>> problem. Compressing the map output using LZO works just fine. The
>>>> problem
>>>>>>> is HBase LZO compression. The region server process is the one with
>> the
>>>>>>> memory leak...
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>>>>>> use of the Compression API in a manner that produces leaks?
>>>>>>>> Thanks,
>>>>>>>> St.Ack
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>> 
>> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera


Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Jan 12, 2011 at 5:01 PM, Tatsuya Kawano <ta...@gmail.com>wrote:

> > And
> > in some circumstances (like all the rigged tests I've attempted to do)
> these
> > get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> > real workloads does the leak actually end up running away.
>
> This issue should be circumstance dependent as we don't have direct control
> on deallocating those buffers. We need them GCed but they never occupy the
> Java heap to encourage the GC to run.
>

Thanks to reflection and use of undocumented APIs, you can actually free() a
direct buffer - check out the patch referenced earlier in this thread.

Of course it probably doesn't work on other JVMs... oh well.

-Todd
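
For illustration, a minimal sketch of the reflection trick being referred to, assuming a Sun/OpenJDK DirectByteBuffer (hypothetical helper class and names; not the actual code from the patch above):

    import java.lang.reflect.Method;
    import java.nio.ByteBuffer;

    public class DirectBufferFree {
        // Best-effort release of a direct buffer's native memory on Sun/OpenJDK.
        // Relies on the undocumented DirectByteBuffer.cleaner() accessor.
        static void free(ByteBuffer buf) {
            if (buf == null || !buf.isDirect()) {
                return;
            }
            try {
                Method cleanerMethod = buf.getClass().getMethod("cleaner");
                cleanerMethod.setAccessible(true);
                Object cleaner = cleanerMethod.invoke(buf);           // sun.misc.Cleaner
                Method clean = cleaner.getClass().getMethod("clean");
                clean.setAccessible(true);
                clean.invoke(cleaner);                                // frees the native allocation now
            } catch (Exception e) {
                // Not a Sun/OpenJDK direct buffer (or reflection blocked):
                // fall back to waiting for GC/finalization as before.
            }
        }

        public static void main(String[] args) {
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
            free(buf);  // the 64KB native chunk is released immediately instead of at finalization
        }
    }

On JVMs without that accessor the call simply falls through and the buffer is reclaimed at finalization time, as before.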

>
>
> On Jan 13, 2011, at 8:30 AM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <tatsuya6502@gmail.com
> >wrote:
> >
> >> Hi Friso and everyone,
> >>
> >> OK. We don't have to spend time to juggle hadoop-core jars anymore since
> >> Todd is working hard on enhancing hadoop-lzo behavior.
> >>
> >> I think your assumption is correct, but what I was trying to say was
> HBase
> >> doesn't change the way to use Hadoop compressors since HBase 0.20
> release
> >> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and
> >> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch)
> and
> >> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so
> I
> >> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of
> >> them don't have reinit().
> >>
> >>
> > Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
> > have a CDH3b3 cluster for one of the other features included, you need to
> > use a 3b3 client jar as well, which includes the reinit stuff.
> >
> >
> >> HBase tries to create an output compression stream on each compression
> >> block, and one HFile flush will contain roughly 1000 compression blocks.
> I
> >> think reinit() could get called 1000 times on one flush, and if
> hadoop-lzo
> >> allocates 64MB block on reinit() (HBase's compression blocks is about
> 64KB
> >> though), it will become pretty much something you're observing now.
> >>
> >>
> > Yep - though I think it's only leaking a 64K buffer for each in 0.4.8.
> And
> > in some circumstances (like all the rigged tests I've attempted to do)
> these
> > get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> > real workloads does the leak actually end up running away.
> >
> > -Todd
> >
> >>
> >> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
> >>
> >>> Can someone who is having this issue try checking out the following git
> >>> branch and rebuilding LZO?
> >>>
> >>> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> >>>
> >>> This definitely stems one leak of a 64KB directbuffer on every reinit.
> >>>
> >>> -Todd
> >>>
> >>> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com>
> wrote:
> >>>
> >>>> Yea, you're definitely on the right track. Have you considered systems
> >>>> programming, Friso? :)
> >>>>
> >>>> Hopefully should have a candidate patch to LZO later today.
> >>>>
> >>>> -Todd
> >>>>
> >>>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
> >>>> fvanvollenhoven@xebia.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>> My guess is indeed that it has to do with using the reinit() method
> on
> >>>>> compressors and making them long lived instead of throwaway together
> >> with
> >>>>> the LZO implementation of reinit(), which magically causes NIO buffer
> >>>>> objects not to be finalized and as a result not release their native
> >>>>> allocations. It's just theory and I haven't had the time to properly
> >> verify
> >>>>> this (unfortunately, I spend most of my time writing application
> code),
> >> but
> >>>>> Todd said he will be looking into it further. I browsed the LZO code
> to
> >> see
> >>>>> what was going on there, but with my limited knowledge of the HBase
> >> code it
> >>>>> would be bald to say that this is for sure the case. It would be my
> >> first
> >>>>> direction of investigation. I would add some logging to the LZO code
> >> where
> >>>>> new direct byte buffers are created to log how often that happens and
> >> what
> >>>>> size they are and then redo the workload that shows the leak.
> Together
> >> with
> >>>>> some profiling you should be able to see how long it takes for these
> >> get
> >>>>> finalized.
> >>>>>
> >>>>> Cheers,
> >>>>> Friso
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 12 jan 2011, at 20:08, Stack wrote:
> >>>>>
> >>>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> >>>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not
> the
> >>>>> problem. Compressing the map output using LZO works just fine. The
> >> problem
> >>>>> is HBase LZO compression. The region server process is the one with
> the
> >>>>> memory leak...
> >>>>>>>
> >>>>>>
> >>>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
> >>>>>> use of the Compression API in a manner that produces leaks?
> >>>>>> Thanks,
> >>>>>> St.Ack
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Todd Lipcon
> >>>> Software Engineer, Cloudera
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Todd Lipcon
> >>> Software Engineer, Cloudera
> >>
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Tatsuya Kawano <ta...@gmail.com>.
Hi Todd, 

> Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
> have a CDH3b3 cluster for one of the other features included, you need to
> use a 3b3 client jar as well, 

Yeah, I saw the number "+737" after the version number. Thanks for clarifying it. (and sorry for the bad suggestion.)


> And
> in some circumstances (like all the rigged tests I've attempted to do) these
> get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> real workloads does the leak actually end up running away.

This issue should be circumstance-dependent, as we have no direct control over deallocating those buffers. We need them to be GCed, but they barely occupy the Java heap, so there is little to encourage the GC to run.

-Tatsuya


On Jan 13, 2011, at 8:30 AM, Todd Lipcon <to...@cloudera.com> wrote:

> On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <ta...@gmail.com>wrote:
> 
>> Hi Friso and everyone,
>> 
>> OK. We don't have to spend time to juggle hadoop-core jars anymore since
>> Todd is working hard on enhancing hadoop-lzo behavior.
>> 
>> I think your assumption is correct, but what I was trying to say was HBase
>> doesn't change the way to use Hadoop compressors since HBase 0.20 release
>> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and
>> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch) and
>> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I
>> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of
>> them don't have reinit().
>> 
>> 
> Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
> have a CDH3b3 cluster for one of the other features included, you need to
> use a 3b3 client jar as well, which includes the reinit stuff.
> 
> 
>> HBase tries to create an output compression stream on each compression
>> block, and one HFile flush will contain roughly 1000 compression blocks. I
>> think reinit() could get called 1000 times on one flush, and if hadoop-lzo
>> allocates 64MB block on reinit() (HBase's compression blocks is about 64KB
>> though), it will become pretty much something you're observing now.
>> 
>> 
> Yep - though I think it's only leaking a 64K buffer for each in 0.4.8. And
> in some circumstances (like all the rigged tests I've attempted to do) these
> get cleaned up nicely by the JVM. It seems only in pretty large heaps in
> real workloads does the leak actually end up running away.
> 
> -Todd
> 
>> 
>> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
>> 
>>> Can someone who is having this issue try checking out the following git
>>> branch and rebuilding LZO?
>>> 
>>> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
>>> 
>>> This definitely stems one leak of a 64KB directbuffer on every reinit.
>>> 
>>> -Todd
>>> 
>>> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:
>>> 
>>>> Yea, you're definitely on the right track. Have you considered systems
>>>> programming, Friso? :)
>>>> 
>>>> Hopefully should have a candidate patch to LZO later today.
>>>> 
>>>> -Todd
>>>> 
>>>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>>>> fvanvollenhoven@xebia.com> wrote:
>>>> 
>>>>> Hi,
>>>>> My guess is indeed that it has to do with using the reinit() method on
>>>>> compressors and making them long lived instead of throwaway together
>> with
>>>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>>>> objects not to be finalized and as a result not release their native
>>>>> allocations. It's just theory and I haven't had the time to properly
>> verify
>>>>> this (unfortunately, I spend most of my time writing application code),
>> but
>>>>> Todd said he will be looking into it further. I browsed the LZO code to
>> see
>>>>> what was going on there, but with my limited knowledge of the HBase
>> code it
>>>>> would be bald to say that this is for sure the case. It would be my
>> first
>>>>> direction of investigation. I would add some logging to the LZO code
>> where
>>>>> new direct byte buffers are created to log how often that happens and
>> what
>>>>> size they are and then redo the workload that shows the leak. Together
>> with
>>>>> some profiling you should be able to see how long it takes for these
>> get
>>>>> finalized.
>>>>> 
>>>>> Cheers,
>>>>> Friso
>>>>> 
>>>>> 
>>>>> 
>>>>> On 12 jan 2011, at 20:08, Stack wrote:
>>>>> 
>>>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>>>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>>>>> problem. Compressing the map output using LZO works just fine. The
>> problem
>>>>> is HBase LZO compression. The region server process is the one with the
>>>>> memory leak...
>>>>>>> 
>>>>>> 
>>>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>>>> use of the Compression API in a manner that produces leaks?
>>>>>> Thanks,
>>>>>> St.Ack
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera


Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Jan 12, 2011 at 3:25 PM, Tatsuya Kawano <ta...@gmail.com>wrote:

> Hi Friso and everyone,
>
> OK. We don't have to spend time to juggle hadoop-core jars anymore since
> Todd is working hard on enhancing hadoop-lzo behavior.
>
> I think your assumption is correct, but what I was trying to say was HBase
> doesn't change the way to use Hadoop compressors since HBase 0.20 release
> while Hadoop added reinit() on 0.21. I verified that ASF Hadoop 0.21 and
> CDH3b3 have reinit() and ASF Hadoop 0.20.2 (including its append branch) and
> CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I
> thought HBase 0.90 would work fine on ASF Hadoop 0.20.2. Because both of
> them don't have reinit().
>
>
Yep - but that jar isn't wire-compatible with a CDH3b3 cluster. So if you
have a CDH3b3 cluster for one of the other features included, you need to
use a 3b3 client jar as well, which includes the reinit stuff.


> HBase tries to create an output compression stream on each compression
> block, and one HFile flush will contain roughly 1000 compression blocks. I
> think reinit() could get called 1000 times on one flush, and if hadoop-lzo
> allocates 64MB block on reinit() (HBase's compression blocks is about 64KB
> though), it will become pretty much something you're observing now.
>
>
Yep - though I think it's only leaking a 64K buffer for each in 0.4.8. And
in some circumstances (like all the rigged tests I've attempted to do) these
get cleaned up nicely by the JVM. It seems only in pretty large heaps in
real workloads does the leak actually end up running away.

-Todd

>
> On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > Can someone who is having this issue try checking out the following git
> > branch and rebuilding LZO?
> >
> > https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> >
> > This definitely stems one leak of a 64KB directbuffer on every reinit.
> >
> > -Todd
> >
> > On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:
> >
> >> Yea, you're definitely on the right track. Have you considered systems
> >> programming, Friso? :)
> >>
> >> Hopefully should have a candidate patch to LZO later today.
> >>
> >> -Todd
> >>
> >> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
> >> fvanvollenhoven@xebia.com> wrote:
> >>
> >>> Hi,
> >>> My guess is indeed that it has to do with using the reinit() method on
> >>> compressors and making them long lived instead of throwaway together
> with
> >>> the LZO implementation of reinit(), which magically causes NIO buffer
> >>> objects not to be finalized and as a result not release their native
> >>> allocations. It's just theory and I haven't had the time to properly
> verify
> >>> this (unfortunately, I spend most of my time writing application code),
> but
> >>> Todd said he will be looking into it further. I browsed the LZO code to
> see
> >>> what was going on there, but with my limited knowledge of the HBase
> code it
> >>> would be bald to say that this is for sure the case. It would be my
> first
> >>> direction of investigation. I would add some logging to the LZO code
> where
> >>> new direct byte buffers are created to log how often that happens and
> what
> >>> size they are and then redo the workload that shows the leak. Together
> with
> >>> some profiling you should be able to see how long it takes for these
> get
> >>> finalized.
> >>>
> >>> Cheers,
> >>> Friso
> >>>
> >>>
> >>>
> >>> On 12 jan 2011, at 20:08, Stack wrote:
> >>>
> >>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> >>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
> >>> problem. Compressing the map output using LZO works just fine. The
> problem
> >>> is HBase LZO compression. The region server process is the one with the
> >>> memory leak...
> >>>>>
> >>>>
> >>>> (Sorry for dumb question Friso) But HBase is leaking because we make
> >>>> use of the Compression API in a manner that produces leaks?
> >>>> Thanks,
> >>>> St.Ack
> >>>
> >>>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >>
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Tatsuya Kawano <ta...@gmail.com>.
Hi Friso and everyone, 

OK. We don't have to spend time to juggle hadoop-core jars anymore since Todd is working hard on enhancing hadoop-lzo behavior. 

I think your assumption is correct, but what I was trying to say was that HBase hasn't changed the way it uses Hadoop compressors since the HBase 0.20 release, while Hadoop added reinit() in 0.21. I verified that ASF Hadoop 0.21 and CDH3b3 have reinit(), and that ASF Hadoop 0.20.2 (including its append branch) and CDH3b2 don't. I saw you had no problem running HBase 0.89 on CDH3b2, so I thought HBase 0.90 would work fine on ASF Hadoop 0.20.2, because neither of them has reinit().

HBase creates an output compression stream for each compression block, and one HFile flush will contain roughly 1000 compression blocks. So reinit() could get called 1000 times on one flush, and if hadoop-lzo allocates a 64MB block on reinit() (even though HBase's compression blocks are about 64KB), it adds up to pretty much what you're observing now.

Thanks, 

--
Tatsuya Kawano (Mr.)
Tokyo, Japan
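
A rough sketch of the usage pattern and arithmetic described above, using hypothetical class and field names rather than the real hadoop-lzo code, and assuming the figure Todd gives elsewhere in the thread of one 64KB direct buffer stranded per reinit():

    import java.nio.ByteBuffer;

    // Hypothetical stand-in for a long-lived compressor whose reinit() grabs a fresh
    // direct buffer without releasing the previous one (the suspected leak pattern).
    class LeakyCompressor {
        private ByteBuffer workingBuf;

        void reinit(int bufferSize) {
            // The old buffer's native allocation lingers until its tiny Java-side
            // wrapper object happens to be finalized.
            workingBuf = ByteBuffer.allocateDirect(bufferSize);
        }
    }

    public class FlushLeakEstimate {
        public static void main(String[] args) {
            LeakyCompressor compressor = new LeakyCompressor();
            int blocksPerFlush = 1000;   // roughly 1000 compression blocks per HFile flush
            int bufKB = 64;              // one 64KB direct buffer per reinit()
            for (int i = 0; i < blocksPerFlush; i++) {
                compressor.reinit(bufKB * 1024);
            }
            System.out.println("Native memory stranded per flush: about "
                    + (blocksPerFlush * bufKB / 1024) + " MB");
        }
    }

Under a steady write load with many flushes, tens of megabytes stranded per flush would add up to the kind of growth being reported.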


On Jan 13, 2011, at 7:50 AM, Todd Lipcon <to...@cloudera.com> wrote:

> Can someone who is having this issue try checking out the following git
> branch and rebuilding LZO?
> 
> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> 
> This definitely stems one leak of a 64KB directbuffer on every reinit.
> 
> -Todd
> 
> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:
> 
>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)
>> 
>> Hopefully should have a candidate patch to LZO later today.
>> 
>> -Todd
>> 
>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>> fvanvollenhoven@xebia.com> wrote:
>> 
>>> Hi,
>>> My guess is indeed that it has to do with using the reinit() method on
>>> compressors and making them long lived instead of throwaway together with
>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>> objects not to be finalized and as a result not release their native
>>> allocations. It's just theory and I haven't had the time to properly verify
>>> this (unfortunately, I spend most of my time writing application code), but
>>> Todd said he will be looking into it further. I browsed the LZO code to see
>>> what was going on there, but with my limited knowledge of the HBase code it
>>> would be bald to say that this is for sure the case. It would be my first
>>> direction of investigation. I would add some logging to the LZO code where
>>> new direct byte buffers are created to log how often that happens and what
>>> size they are and then redo the workload that shows the leak. Together with
>>> some profiling you should be able to see how long it takes for these get
>>> finalized.
>>> 
>>> Cheers,
>>> Friso
>>> 
>>> 
>>> 
>>> On 12 jan 2011, at 20:08, Stack wrote:
>>> 
>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>>> problem. Compressing the map output using LZO works just fine. The problem
>>> is HBase LZO compression. The region server process is the one with the
>>> memory leak...
>>>>> 
>>>> 
>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>> use of the Compression API in a manner that produces leaks?
>>>> Thanks,
>>>> St.Ack
>>> 
>>> 
>> 
>> 
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Hey Todd,

Hopefully I can get to this somewhere next week. Our NN got corrupted, so we are rebuilding the prod cluster, which means dev is backing our apps for now and I have no environment to give it a go. Stay tuned...

>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)

> 

Well, at least then you get to do your own memory management most of the time...


Friso



> Can someone who is having this issue try checking out the following git
> branch and rebuilding LZO?
> 
> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> 
> This definitely stems one leak of a 64KB directbuffer on every reinit.
> 
> -Todd
> 
> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:
> 
>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)
>> 
>> Hopefully should have a candidate patch to LZO later today.
>> 
>> -Todd
>> 
>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>> fvanvollenhoven@xebia.com> wrote:
>> 
>>> Hi,
>>> My guess is indeed that it has to do with using the reinit() method on
>>> compressors and making them long lived instead of throwaway together with
>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>> objects not to be finalized and as a result not release their native
>>> allocations. It's just theory and I haven't had the time to properly verify
>>> this (unfortunately, I spend most of my time writing application code), but
>>> Todd said he will be looking into it further. I browsed the LZO code to see
>>> what was going on there, but with my limited knowledge of the HBase code it
>>> would be bald to say that this is for sure the case. It would be my first
>>> direction of investigation. I would add some logging to the LZO code where
>>> new direct byte buffers are created to log how often that happens and what
>>> size they are and then redo the workload that shows the leak. Together with
>>> some profiling you should be able to see how long it takes for these get
>>> finalized.
>>> 
>>> Cheers,
>>> Friso
>>> 
>>> 
>>> 
>>> On 12 jan 2011, at 20:08, Stack wrote:
>>> 
>>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>>> problem. Compressing the map output using LZO works just fine. The problem
>>> is HBase LZO compression. The region server process is the one with the
>>> memory leak...
>>>>> 
>>>> 
>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>> use of the Compression API in a manner that produces leaks?
>>>> Thanks,
>>>> St.Ack
>>> 
>>> 
>> 
>> 
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera


Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
Can someone who is having this issue try checking out the following git
branch and rebuilding LZO?

https://github.com/toddlipcon/hadoop-lzo/tree/realloc

This definitely stems one leak of a 64KB directbuffer on every reinit.

-Todd

On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <to...@cloudera.com> wrote:

> Yea, you're definitely on the right track. Have you considered systems
> programming, Friso? :)
>
> Hopefully should have a candidate patch to LZO later today.
>
> -Todd
>
> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
> fvanvollenhoven@xebia.com> wrote:
>
>> Hi,
>> My guess is indeed that it has to do with using the reinit() method on
>> compressors and making them long lived instead of throwaway together with
>> the LZO implementation of reinit(), which magically causes NIO buffer
>> objects not to be finalized and as a result not release their native
>> allocations. It's just theory and I haven't had the time to properly verify
>> this (unfortunately, I spend most of my time writing application code), but
>> Todd said he will be looking into it further. I browsed the LZO code to see
>> what was going on there, but with my limited knowledge of the HBase code it
>> would be bald to say that this is for sure the case. It would be my first
>> direction of investigation. I would add some logging to the LZO code where
>> new direct byte buffers are created to log how often that happens and what
>> size they are and then redo the workload that shows the leak. Together with
>> some profiling you should be able to see how long it takes for these get
>> finalized.
>>
>> Cheers,
>> Friso
>>
>>
>>
>> On 12 jan 2011, at 20:08, Stack wrote:
>>
>> > 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>> >> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>> problem. Compressing the map output using LZO works just fine. The problem
>> is HBase LZO compression. The region server process is the one with the
>> memory leak...
>> >>
>> >
>> > (Sorry for dumb question Friso) But HBase is leaking because we make
>> > use of the Compression API in a manner that produces leaks?
>> > Thanks,
>> > St.Ack
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
Yea, you're definitely on the right track. Have you considered systems
programming, Friso? :)

Hopefully should have a candidate patch to LZO later today.

-Todd

On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
fvanvollenhoven@xebia.com> wrote:

> Hi,
> My guess is indeed that it has to do with using the reinit() method on
> compressors and making them long lived instead of throwaway together with
> the LZO implementation of reinit(), which magically causes NIO buffer
> objects not to be finalized and as a result not release their native
> allocations. It's just theory and I haven't had the time to properly verify
> this (unfortunately, I spend most of my time writing application code), but
> Todd said he will be looking into it further. I browsed the LZO code to see
> what was going on there, but with my limited knowledge of the HBase code it
> would be bald to say that this is for sure the case. It would be my first
> direction of investigation. I would add some logging to the LZO code where
> new direct byte buffers are created to log how often that happens and what
> size they are and then redo the workload that shows the leak. Together with
> some profiling you should be able to see how long it takes for these get
> finalized.
>
> Cheers,
> Friso
>
>
>
> On 12 jan 2011, at 20:08, Stack wrote:
>
> > 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> >> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
> problem. Compressing the map output using LZO works just fine. The problem
> is HBase LZO compression. The region server process is the one with the
> memory leak...
> >>
> >
> > (Sorry for dumb question Friso) But HBase is leaking because we make
> > use of the Compression API in a manner that produces leaks?
> > Thanks,
> > St.Ack
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Hi,
My guess is indeed that it has to do with using the reinit() method on compressors and making them long-lived instead of throwaway, together with the LZO implementation of reinit(), which magically causes NIO buffer objects not to be finalized and as a result not to release their native allocations. It's just a theory and I haven't had the time to properly verify this (unfortunately, I spend most of my time writing application code), but Todd said he will be looking into it further. I browsed the LZO code to see what was going on there, but with my limited knowledge of the HBase code it would be bold to say that this is for sure the case. It would be my first direction of investigation. I would add some logging to the LZO code where new direct byte buffers are created, to log how often that happens and what size they are, and then redo the workload that shows the leak. Together with some profiling you should be able to see how long it takes for these to get finalized.

Cheers,
Friso
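
A minimal sketch of the kind of allocation logging described above (a hypothetical helper, not actual LZO code): route direct-buffer allocations through one method so their frequency and sizes show up in the logs while the leaking workload runs.

    import java.nio.ByteBuffer;
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical instrumentation point: call this instead of ByteBuffer.allocateDirect()
    // wherever the compressor code creates its direct buffers.
    final class TrackedBuffers {
        private static final AtomicLong count = new AtomicLong();
        private static final AtomicLong totalBytes = new AtomicLong();

        static ByteBuffer allocateDirect(int size) {
            long n = count.incrementAndGet();
            long total = totalBytes.addAndGet(size);
            System.err.println("direct buffer #" + n + ": " + size
                    + " bytes (running total " + total + " bytes)");
            return ByteBuffer.allocateDirect(size);
        }
    }

Comparing the running total against the process RSS (or a profiler's view of finalization) then shows whether the buffers are being reclaimed or piling up, as suggested above.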



On 12 jan 2011, at 20:08, Stack wrote:

> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the problem. Compressing the map output using LZO works just fine. The problem is HBase LZO compression. The region server process is the one with the memory leak...
>> 
> 
> (Sorry for dumb question Friso) But HBase is leaking because we make
> use of the Compression API in a manner that produces leaks?
> Thanks,
> St.Ack


Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Todd Lipcon <to...@cloudera.com>.
Hey all,

I will be looking into this today :)

-Todd

On Wed, Jan 12, 2011 at 11:08 AM, Stack <st...@duboce.net> wrote:

> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> > No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
> problem. Compressing the map output using LZO works just fine. The problem
> is HBase LZO compression. The region server process is the one with the
> memory leak...
> >
>
> (Sorry for dumb question Friso) But HBase is leaking because we make
> use of the Compression API in a manner that produces leaks?
> Thanks,
> St.Ack
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Stack <st...@duboce.net>.
2011/1/12 Friso van Vollenhoven <fv...@xebia.com>:
> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the problem. Compressing the map output using LZO works just fine. The problem is HBase LZO compression. The region server process is the one with the memory leak...
>

(Sorry for dumb question Friso) But HBase is leaking because we make
use of the Compression API in a manner that produces leaks?
Thanks,
St.Ack

Re: Java Committed Virtual Memory significantly larger than Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
No, I haven't. But the Hadoop (mapreduce) LZO compression is not the problem. Compressing the map output using LZO works just fine. The problem is HBase LZO compression. The region server process is the one with the memory leak...


Friso


On 12 jan 2011, at 12:44, Tatsuya Kawano wrote:

> 
> Hi,
> 
> Have you tried the ASF version of hadoop-core? (The one distributed with HBase 0.90RC.)
> 
> It doesn't call reinit() so I'm hoping it will just work fine with the latest hadoop-lzo and other compressors.
> 
> Thanks,
> 
> --
> Tatsuya Kawano (Mr.)
> Tokyo, Japan
> 
> 
> On Jan 12, 2011, at 7:51 PM, Friso van Vollenhoven <fv...@xebia.com> wrote:
> 
>> Thanks.
>> 
>> I went back to hbase 0.89 with 0.1 LZO, which works fine and does not show this issue.
>> 
>> I tried with a newer Hbase and LZO version, also with the MALLOC... setting but without max direct memory set, so I was wondering whether you need a combination of the two to fix things (apparently not).
>> 
>> Now i am wondering whether I did something wrong setting the env var. It should just be picked up when it's in hbase-env.sh, right?
>> 
>> 
>> Friso
>> 
>> 
>> 
>> On 12 jan 2011, at 10:59, Andrey Stepachev wrote:
>> 
>>> with MALLOC_ARENA_MAX=2
>>> 
>>> I check -XX:MaxDirectMemorySize=256m, before, but it doesn't affect anything
>>> (even no OOM
>>> exceptions or so on).
>>> 
>>> But it looks like i have exactly the same issue (it looks like). I have many
>>> 64Mb anon memory blocks.
>>> (sometimes they 132MB). And on heavy load i have rapidly growing rss size of
>>> jvm process.
>>> 
>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
>>> 
>>>> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
>>>> hbase-env.sh?
>>>> 
>>>> Did you also use the -XX:MaxDirectMemorySize=256m ?
>>>> 
>>>> It would be nice to check that this is a different than the leakage with
>>>> LZO...
>>>> 
>>>> 
>>>> Thanks,
>>>> Friso
>>>> 
>>>> 
>>>> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
>>>> 
>>>>> My bad. All things work. Thanks for  Todd Lipcon :)
>>>>> 
>>>>> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
>>>>> 
>>>>>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
>>>>>> problem thread. All those 65M blocks here. And JVM continues to eat
>>>> memory
>>>>>> on heavy write load. And yes, I use "improved" kernel
>>>>>> Linux 2.6.34.7-0.5.
>>>>>> 
>>>>>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>>>>>> 
>>>>>> Are you using a newer linux kernel with the new and "improved" memory
>>>>>>> allocator?
>>>>>>> 
>>>>>>> If so try setting this in hadoop-env.sh:
>>>>>>> 
>>>>>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>>>>>> 
>>>>>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
>>>> works
>>>>>>> for you.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> 
>>>>>>> -Xavier
>>>>>>> 
>>>>>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>>>>>>>> No. I don't use LZO. I tried even remove any native support (i.e. all
>>>>>>> .so
>>>>>>>> from class path)
>>>>>>>> and use java gzip. But nothing.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>>>>>>>> 
>>>>>>>>> Are you using LZO by any chance? If so, which version?
>>>>>>>>> 
>>>>>>>>> Friso
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>>>>>>>> 
>>>>>>>>>> After starting the hbase in jroсkit found the same memory leakage.
>>>>>>>>>> 
>>>>>>>>>> After the launch
>>>>>>>>>> 
>>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>>>> Tue Jan 11 16:49:31 2011
>>>>>>>>>> 
>>>>>>>>>> 11 16:49:31 MSK 2011
>>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>>> 7863 2547760 5576744 78.7
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> JR dumps:
>>>>>>>>>> 
>>>>>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>>>>>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
>>>> (#
>>>>>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>>>>>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
>>>> (malloced
>>>>>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
>>>> 15134
>>>>>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
>>>> +10
>>>>>>>>>> KB # 20)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> After running the mr which make high write load (~1hour)
>>>>>>>>>> 
>>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>>>> Tue Jan 11 17:08:56 2011
>>>>>>>>>> 
>>>>>>>>>> 11 17:08:56 MSK 2011
>>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>>> 7863 4072396 5459572 100
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> JR said not important below specify why)
>>>>>>>>>> 
>>>>>>>>>> http://paste.ubuntu.com/552820/
>>>>>>>>>> <http://paste.ubuntu.com/552820/>
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 7863:
>>>>>>>>>> Total mapped                  5742628KB +165888KB
>>>> (reserved=1144000KB
>>>>>>>>>> -1532404KB)
>>>>>>>>>> -              Java heap      2048000KB           (reserved=0KB
>>>>>>>>> -1472176KB)
>>>>>>>>>> -              GC tables        68512KB
>>>>>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>>>>>>>>>> -          Compiled code      1048576KB           (used=3376KB
>>>> +776KB)
>>>>>>>>>> -               Internal         1480KB    +256KB
>>>>>>>>>> -                     OS       517944KB  -31744KB
>>>>>>>>>> -                  Other      1996792KB +195816KB
>>>>>>>>>> -            Classblocks         1280KB           (malloced=1156KB
>>>>>>>>>> +45KB #3421 +136)
>>>>>>>>>> -        Java class data        20992KB    +768KB (malloced=20843KB
>>>>>>>>>> +840KB #15774 +640 in 3421 classes)
>>>>>>>>>> - Native memory tracking         1024KB           (malloced=325KB
>>>>>>> +10KB
>>>>>>>>> #20)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> OS                          *java    r x 0x0000000000400000.(
>>>>>>>>> 76KB)
>>>>>>>>>> OS                          *java    rw  0x0000000000612000 (
>>>>>>>>> 4KB)
>>>>>>>>>> OS                        *[heap]    rw  0x0000000000613000.(
>>>>>>>>> 478712KB)
>>>>>>>>>> INT                           Poll    r   0x000000007fffe000 (
>>>>>>>>> 4KB)
>>>>>>>>>> INT                         Membar    rw  0x000000007ffff000.(
>>>>>>>>> 4KB)
>>>>>>>>>> MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>>>>>>>> 768KB)
>>>>>>>>>> MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>>>>>>>> 512KB)
>>>>>>>>>> HEAP                      Java heap    rw
>>>>>>>>> 0x0000000083000000.(2048000KB)
>>>>>>>>>>                                     rw  0x00007f2574000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                         0x00007f2577ff7000.(
>>>>>>>>> 36KB)
>>>>>>>>>>                                     rw  0x00007f2584000000 (
>>>>>>>>> 65492KB)
>>>>>>>>>>                                         0x00007f2587ff5000.(
>>>>>>>>> 44KB)
>>>>>>>>>>                                     rw  0x00007f258c000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                         0x00007f258fff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                     rw  0x00007f2590000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                         0x00007f2593ff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                     rw  0x00007f2594000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                         0x00007f2597ff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                     rw  0x00007f2598000000 (
>>>>>>>>> 131036KB)
>>>>>>>>>>                                         0x00007f259fff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                     rw  0x00007f25a0000000 (
>>>>>>>>> 65528KB)
>>>>>>>>>>                                         0x00007f25a3ffe000 (
>>>>>>>>> 8KB)
>>>>>>>>>>                                     rw  0x00007f25a4000000 (
>>>>>>>>> 65496KB)
>>>>>>>>>>                                         0x00007f25a7ff6000 (
>>>>>>>>> 40KB)
>>>>>>>>>>                                     rw  0x00007f25a8000000 (
>>>>>>>>> 65496KB)
>>>>>>>>>>                                         0x00007f25abff6000 (
>>>>>>>>> 40KB)
>>>>>>>>>>                                     rw  0x00007f25ac000000 (
>>>>>>>>> 65504KB)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> So, the difference was in the pieces of memory like this:
>>>>>>>>>> 
>>>>>>>>>> rw 0x00007f2590000000 (65500KB)
>>>>>>>>>> 0x00007f2593ff7000 (36KB)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is very
>>>>>>>>> similar
>>>>>>>>>> size)
>>>>>>>>>> 
>>>>>>>>>> If we count this blocks we get amount of lost memory:
>>>>>>>>>> 
>>>>>>>>>> 65M * 32 + 132M = 2212M
>>>>>>>>>> 
>>>>>>>>>> So, it looks like HLog allcates to many memory, and question is: how
>>>>>>> to
>>>>>>>>>> restrict it?
>>>>>>>>>> 
>>>>>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>>>>>>>> 
>>>>>>>>>>> Hi All.
>>>>>>>>>>> 
>>>>>>>>>>> After heavy load into hbase (single node, nondistributed test
>>>> system)
>>>>>>> I
>>>>>>>>> got
>>>>>>>>>>> 4Gb process size of my HBase java process.
>>>>>>>>>>> On 6GB machine there was no room for anything else (disk cache and
>>>> so
>>>>>>>>> on).
>>>>>>>>>>> Does anybody knows, what is going on, and how you solve this. What
>>>>>>> heap
>>>>>>>>>>> memory is set on you hosts
>>>>>>>>>>> and how much of RSS hbase process actually use.
>>>>>>>>>>> 
>>>>>>>>>>> I don't see such things before, all tomcat and other java apps
>>>> don't
>>>>>>>>> eats
>>>>>>>>>>> significally more memory then -Xmx.
>>>>>>>>>>> 
>>>>>>>>>>> Connection name:   pid: 23476
>>>> org.apache.hadoop.hbase.master.HMaster
>>>>>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM
>>>> version
>>>>>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>>>>>>>> Uptime:   12
>>>>>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>>>>>>> compiler:
>>>>>>>>> HotSpot
>>>>>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>>>>>>>>> ------------------------------
>>>>>>>>>>> Current heap size:     703 903 kbytes   Maximum heap size:   2
>>>> 030
>>>>>>>>> 976kbytes    Committed memory:
>>>>>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>>>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent
>>>> =
>>>>>>> 5
>>>>>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>>>>>>> Collections
>>>>>>>>> =
>>>>>>>>>>> 20, Total time spent = 35,754 seconds
>>>>>>>>>>> ------------------------------
>>>>>>>>>>> Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:
>>>> amd64
>>>>>>>>> Number of processors:
>>>>>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>>>>>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720
>>>> kbytes
>>>>>>>>> Total swap space:
>>>>>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 


Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Tatsuya Kawano <ta...@gmail.com>.
Hi, 

Have you tried the ASF version of hadoop-core? (The one distributed with HBase 0.90RC.)

It doesn't call reinit(), so I'm hoping it will just work fine with the latest hadoop-lzo and other compressors.
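
To swap it in, something along these lines should do -- a rough sketch only; the paths and jar names are assumptions, not taken from this thread:

    # see which hadoop-core jar HBase currently loads
    ls $HBASE_HOME/lib/hadoop-core-*.jar
    # replace it with the jar bundled with the 0.90 RC, keeping only one hadoop-core jar in lib/
    rm $HBASE_HOME/lib/hadoop-core-*.jar
    cp /path/to/hbase-0.90.0-rc/lib/hadoop-core-*.jar $HBASE_HOME/lib/
    # then restart the daemons so the new jar is picked up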

Thanks, 

--
Tatsuya Kawano (Mr.)
Tokyo, Japan


On Jan 12, 2011, at 7:51 PM, Friso van Vollenhoven <fv...@xebia.com> wrote:

> Thanks.
> 
> I went back to hbase 0.89 with 0.1 LZO, which works fine and does not show this issue.
> 
> I tried with a newer Hbase and LZO version, also with the MALLOC... setting but without max direct memory set, so I was wondering whether you need a combination of the two to fix things (apparently not).
> 
> Now i am wondering whether I did something wrong setting the env var. It should just be picked up when it's in hbase-env.sh, right?
> 
> 
> Friso
> 
> 
> 
> On 12 jan 2011, at 10:59, Andrey Stepachev wrote:
> 
>> with MALLOC_ARENA_MAX=2
>> 
>> I check -XX:MaxDirectMemorySize=256m, before, but it doesn't affect anything
>> (even no OOM
>> exceptions or so on).
>> 
>> But it looks like i have exactly the same issue (it looks like). I have many
>> 64Mb anon memory blocks.
>> (sometimes they 132MB). And on heavy load i have rapidly growing rss size of
>> jvm process.
>> 
>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
>> 
>>> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
>>> hbase-env.sh?
>>> 
>>> Did you also use the -XX:MaxDirectMemorySize=256m ?
>>> 
>>> It would be nice to check that this is a different than the leakage with
>>> LZO...
>>> 
>>> 
>>> Thanks,
>>> Friso
>>> 
>>> 
>>> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
>>> 
>>>> My bad. All things work. Thanks for  Todd Lipcon :)
>>>> 
>>>> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
>>>> 
>>>>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
>>>>> problem thread. All those 65M blocks here. And JVM continues to eat
>>> memory
>>>>> on heavy write load. And yes, I use "improved" kernel
>>>>> Linux 2.6.34.7-0.5.
>>>>> 
>>>>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>>>>> 
>>>>> Are you using a newer linux kernel with the new and "improved" memory
>>>>>> allocator?
>>>>>> 
>>>>>> If so try setting this in hadoop-env.sh:
>>>>>> 
>>>>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>>>>> 
>>>>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
>>> works
>>>>>> for you.
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> 
>>>>>> -Xavier
>>>>>> 
>>>>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>>>>>>> No. I don't use LZO. I tried even remove any native support (i.e. all
>>>>>> .so
>>>>>>> from class path)
>>>>>>> and use java gzip. But nothing.
>>>>>>> 
>>>>>>> 
>>>>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>>>>>>> 
>>>>>>>> Are you using LZO by any chance? If so, which version?
>>>>>>>> 
>>>>>>>> Friso
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>>>>>>> 
>>>>>>>>> After starting the hbase in jroсkit found the same memory leakage.
>>>>>>>>> 
>>>>>>>>> After the launch
>>>>>>>>> 
>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>>> Tue Jan 11 16:49:31 2011
>>>>>>>>> 
>>>>>>>>> 11 16:49:31 MSK 2011
>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>> 7863 2547760 5576744 78.7
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> JR dumps:
>>>>>>>>> 
>>>>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>>>>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
>>> (#
>>>>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>>>>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
>>> (malloced
>>>>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
>>> 15134
>>>>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
>>> +10
>>>>>>>>> KB # 20)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> After running the mr which make high write load (~1hour)
>>>>>>>>> 
>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>>> Tue Jan 11 17:08:56 2011
>>>>>>>>> 
>>>>>>>>> 11 17:08:56 MSK 2011
>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>> 7863 4072396 5459572 100
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> JR said not important below specify why)
>>>>>>>>> 
>>>>>>>>> http://paste.ubuntu.com/552820/
>>>>>>>>> <http://paste.ubuntu.com/552820/>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 7863:
>>>>>>>>> Total mapped                  5742628KB +165888KB
>>> (reserved=1144000KB
>>>>>>>>> -1532404KB)
>>>>>>>>> -              Java heap      2048000KB           (reserved=0KB
>>>>>>>> -1472176KB)
>>>>>>>>> -              GC tables        68512KB
>>>>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>>>>>>>>> -          Compiled code      1048576KB           (used=3376KB
>>> +776KB)
>>>>>>>>> -               Internal         1480KB    +256KB
>>>>>>>>> -                     OS       517944KB  -31744KB
>>>>>>>>> -                  Other      1996792KB +195816KB
>>>>>>>>> -            Classblocks         1280KB           (malloced=1156KB
>>>>>>>>> +45KB #3421 +136)
>>>>>>>>> -        Java class data        20992KB    +768KB (malloced=20843KB
>>>>>>>>> +840KB #15774 +640 in 3421 classes)
>>>>>>>>> - Native memory tracking         1024KB           (malloced=325KB
>>>>>> +10KB
>>>>>>>> #20)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>> OS                          *java    r x 0x0000000000400000.(
>>>>>>>> 76KB)
>>>>>>>>> OS                          *java    rw  0x0000000000612000 (
>>>>>>>> 4KB)
>>>>>>>>> OS                        *[heap]    rw  0x0000000000613000.(
>>>>>>>> 478712KB)
>>>>>>>>> INT                           Poll    r   0x000000007fffe000 (
>>>>>>>> 4KB)
>>>>>>>>> INT                         Membar    rw  0x000000007ffff000.(
>>>>>>>> 4KB)
>>>>>>>>> MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>>>>>>> 768KB)
>>>>>>>>> MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>>>>>>> 512KB)
>>>>>>>>> HEAP                      Java heap    rw
>>>>>>>> 0x0000000083000000.(2048000KB)
>>>>>>>>>                                      rw  0x00007f2574000000 (
>>>>>>>> 65500KB)
>>>>>>>>>                                          0x00007f2577ff7000.(
>>>>>>>> 36KB)
>>>>>>>>>                                      rw  0x00007f2584000000 (
>>>>>>>> 65492KB)
>>>>>>>>>                                          0x00007f2587ff5000.(
>>>>>>>> 44KB)
>>>>>>>>>                                      rw  0x00007f258c000000 (
>>>>>>>> 65500KB)
>>>>>>>>>                                          0x00007f258fff7000 (
>>>>>>>> 36KB)
>>>>>>>>>                                      rw  0x00007f2590000000 (
>>>>>>>> 65500KB)
>>>>>>>>>                                          0x00007f2593ff7000 (
>>>>>>>> 36KB)
>>>>>>>>>                                      rw  0x00007f2594000000 (
>>>>>>>> 65500KB)
>>>>>>>>>                                          0x00007f2597ff7000 (
>>>>>>>> 36KB)
>>>>>>>>>                                      rw  0x00007f2598000000 (
>>>>>>>> 131036KB)
>>>>>>>>>                                          0x00007f259fff7000 (
>>>>>>>> 36KB)
>>>>>>>>>                                      rw  0x00007f25a0000000 (
>>>>>>>> 65528KB)
>>>>>>>>>                                          0x00007f25a3ffe000 (
>>>>>>>> 8KB)
>>>>>>>>>                                      rw  0x00007f25a4000000 (
>>>>>>>> 65496KB)
>>>>>>>>>                                          0x00007f25a7ff6000 (
>>>>>>>> 40KB)
>>>>>>>>>                                      rw  0x00007f25a8000000 (
>>>>>>>> 65496KB)
>>>>>>>>>                                          0x00007f25abff6000 (
>>>>>>>> 40KB)
>>>>>>>>>                                      rw  0x00007f25ac000000 (
>>>>>>>> 65504KB)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So, the difference was in the pieces of memory like this:
>>>>>>>>> 
>>>>>>>>> rw 0x00007f2590000000 (65500KB)
>>>>>>>>>  0x00007f2593ff7000 (36KB)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is very
>>>>>>>> similar
>>>>>>>>> size)
>>>>>>>>> 
>>>>>>>>> If we count this blocks we get amount of lost memory:
>>>>>>>>> 
>>>>>>>>> 65M * 32 + 132M = 2212M
>>>>>>>>> 
>>>>>>>>> So, it looks like HLog allcates to many memory, and question is: how
>>>>>> to
>>>>>>>>> restrict it?
>>>>>>>>> 
>>>>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>>>>>>> 
>>>>>>>>>> Hi All.
>>>>>>>>>> 
>>>>>>>>>> After heavy load into hbase (single node, nondistributed test
>>> system)
>>>>>> I
>>>>>>>> got
>>>>>>>>>> 4Gb process size of my HBase java process.
>>>>>>>>>> On 6GB machine there was no room for anything else (disk cache and
>>> so
>>>>>>>> on).
>>>>>>>>>> Does anybody knows, what is going on, and how you solve this. What
>>>>>> heap
>>>>>>>>>> memory is set on you hosts
>>>>>>>>>> and how much of RSS hbase process actually use.
>>>>>>>>>> 
>>>>>>>>>> I don't see such things before, all tomcat and other java apps
>>> don't
>>>>>>>> eats
>>>>>>>>>> significally more memory then -Xmx.
>>>>>>>>>> 
>>>>>>>>>> Connection name:   pid: 23476
>>> org.apache.hadoop.hbase.master.HMaster
>>>>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM
>>> version
>>>>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>>>>>>> Uptime:   12
>>>>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>>>>>> compiler:
>>>>>>>> HotSpot
>>>>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>>>>>>>> ------------------------------
>>>>>>>>>> Current heap size:     703 903 kbytes   Maximum heap size:   2
>>> 030
>>>>>>>> 976kbytes    Committed memory:
>>>>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent
>>> =
>>>>>> 5
>>>>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>>>>>> Collections
>>>>>>>> =
>>>>>>>>>> 20, Total time spent = 35,754 seconds
>>>>>>>>>> ------------------------------
>>>>>>>>>> Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:
>>> amd64
>>>>>>>> Number of processors:
>>>>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>>>>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720
>>> kbytes
>>>>>>>> Total swap space:
>>>>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
> 

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Once I have a moment to play with our dev cluster, I will give this another go.

Thanks,
Friso


On 12 jan 2011, at 12:05, Andrey Stepachev wrote:

> No, I use only malloc env var, and I set it (as suggested before) into
> hbase-env.sh, and it looks like it eats more less memory (in my case 4.7G vs
> 3.3G with 2Gheap)
> 
> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
> 
>> Thanks.
>> 
>> I went back to hbase 0.89 with 0.1 LZO, which works fine and does not show
>> this issue.
>> 
>> I tried with a newer Hbase and LZO version, also with the MALLOC... setting
>> but without max direct memory set, so I was wondering whether you need a
>> combination of the two to fix things (apparently not).
>> 
>> Now i am wondering whether I did something wrong setting the env var. It
>> should just be picked up when it's in hbase-env.sh, right?
>> 
>> 
>> Friso
>> 
>> 
>> 
>> On 12 jan 2011, at 10:59, Andrey Stepachev wrote:
>> 
>>> with MALLOC_ARENA_MAX=2
>>> 
>>> I check -XX:MaxDirectMemorySize=256m, before, but it doesn't affect
>> anything
>>> (even no OOM
>>> exceptions or so on).
>>> 
>>> But it looks like i have exactly the same issue (it looks like). I have
>> many
>>> 64Mb anon memory blocks.
>>> (sometimes they 132MB). And on heavy load i have rapidly growing rss size
>> of
>>> jvm process.
>>> 
>>> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
>>> 
>>>> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
>>>> hbase-env.sh?
>>>> 
>>>> Did you also use the -XX:MaxDirectMemorySize=256m ?
>>>> 
>>>> It would be nice to check that this is a different than the leakage with
>>>> LZO...
>>>> 
>>>> 
>>>> Thanks,
>>>> Friso
>>>> 
>>>> 
>>>> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
>>>> 
>>>>> My bad. All things work. Thanks for  Todd Lipcon :)
>>>>> 
>>>>> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
>>>>> 
>>>>>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in
>> LZO
>>>>>> problem thread. All those 65M blocks here. And JVM continues to eat
>>>> memory
>>>>>> on heavy write load. And yes, I use "improved" kernel
>>>>>> Linux 2.6.34.7-0.5.
>>>>>> 
>>>>>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>>>>>> 
>>>>>> Are you using a newer linux kernel with the new and "improved" memory
>>>>>>> allocator?
>>>>>>> 
>>>>>>> If so try setting this in hadoop-env.sh:
>>>>>>> 
>>>>>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>>>>>> 
>>>>>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
>>>> works
>>>>>>> for you.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> 
>>>>>>> 
>>>>>>> -Xavier
>>>>>>> 
>>>>>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>>>>>>>> No. I don't use LZO. I tried even remove any native support (i.e.
>> all
>>>>>>> .so
>>>>>>>> from class path)
>>>>>>>> and use java gzip. But nothing.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>>>>>>>> 
>>>>>>>>> Are you using LZO by any chance? If so, which version?
>>>>>>>>> 
>>>>>>>>> Friso
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>>>>>>>> 
>>>>>>>>>> After starting the hbase in jroсkit found the same memory leakage.
>>>>>>>>>> 
>>>>>>>>>> After the launch
>>>>>>>>>> 
>>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu |
>> head
>>>>>>>>>> Tue Jan 11 16:49:31 2011
>>>>>>>>>> 
>>>>>>>>>> 11 16:49:31 MSK 2011
>>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>>> 7863 2547760 5576744 78.7
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> JR dumps:
>>>>>>>>>> 
>>>>>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap
>> 2048000KB
>>>>>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
>>>> (#
>>>>>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) -
>> Internal
>>>>>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
>>>> (malloced
>>>>>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
>>>> 15134
>>>>>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
>>>> +10
>>>>>>>>>> KB # 20)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> After running the mr which make high write load (~1hour)
>>>>>>>>>> 
>>>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu |
>> head
>>>>>>>>>> Tue Jan 11 17:08:56 2011
>>>>>>>>>> 
>>>>>>>>>> 11 17:08:56 MSK 2011
>>>>>>>>>> PID RSS VSZ% CPU
>>>>>>>>>> 7863 4072396 5459572 100
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> JR said not important below specify why)
>>>>>>>>>> 
>>>>>>>>>> http://paste.ubuntu.com/552820/
>>>>>>>>>> <http://paste.ubuntu.com/552820/>
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 7863:
>>>>>>>>>> Total mapped                  5742628KB +165888KB
>>>> (reserved=1144000KB
>>>>>>>>>> -1532404KB)
>>>>>>>>>> -              Java heap      2048000KB           (reserved=0KB
>>>>>>>>> -1472176KB)
>>>>>>>>>> -              GC tables        68512KB
>>>>>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114
>> +3)
>>>>>>>>>> -          Compiled code      1048576KB           (used=3376KB
>>>> +776KB)
>>>>>>>>>> -               Internal         1480KB    +256KB
>>>>>>>>>> -                     OS       517944KB  -31744KB
>>>>>>>>>> -                  Other      1996792KB +195816KB
>>>>>>>>>> -            Classblocks         1280KB           (malloced=1156KB
>>>>>>>>>> +45KB #3421 +136)
>>>>>>>>>> -        Java class data        20992KB    +768KB
>> (malloced=20843KB
>>>>>>>>>> +840KB #15774 +640 in 3421 classes)
>>>>>>>>>> - Native memory tracking         1024KB           (malloced=325KB
>>>>>>> +10KB
>>>>>>>>> #20)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>> 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> OS                          *java    r x 0x0000000000400000.(
>>>>>>>>> 76KB)
>>>>>>>>>> OS                          *java    rw  0x0000000000612000 (
>>>>>>>>> 4KB)
>>>>>>>>>> OS                        *[heap]    rw  0x0000000000613000.(
>>>>>>>>> 478712KB)
>>>>>>>>>> INT                           Poll    r   0x000000007fffe000 (
>>>>>>>>> 4KB)
>>>>>>>>>> INT                         Membar    rw  0x000000007ffff000.(
>>>>>>>>> 4KB)
>>>>>>>>>> MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>>>>>>>> 768KB)
>>>>>>>>>> MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>>>>>>>> 512KB)
>>>>>>>>>> HEAP                      Java heap    rw
>>>>>>>>> 0x0000000083000000.(2048000KB)
>>>>>>>>>>                                      rw  0x00007f2574000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                          0x00007f2577ff7000.(
>>>>>>>>> 36KB)
>>>>>>>>>>                                      rw  0x00007f2584000000 (
>>>>>>>>> 65492KB)
>>>>>>>>>>                                          0x00007f2587ff5000.(
>>>>>>>>> 44KB)
>>>>>>>>>>                                      rw  0x00007f258c000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                          0x00007f258fff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                      rw  0x00007f2590000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                          0x00007f2593ff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                      rw  0x00007f2594000000 (
>>>>>>>>> 65500KB)
>>>>>>>>>>                                          0x00007f2597ff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                      rw  0x00007f2598000000 (
>>>>>>>>> 131036KB)
>>>>>>>>>>                                          0x00007f259fff7000 (
>>>>>>>>> 36KB)
>>>>>>>>>>                                      rw  0x00007f25a0000000 (
>>>>>>>>> 65528KB)
>>>>>>>>>>                                          0x00007f25a3ffe000 (
>>>>>>>>> 8KB)
>>>>>>>>>>                                      rw  0x00007f25a4000000 (
>>>>>>>>> 65496KB)
>>>>>>>>>>                                          0x00007f25a7ff6000 (
>>>>>>>>> 40KB)
>>>>>>>>>>                                      rw  0x00007f25a8000000 (
>>>>>>>>> 65496KB)
>>>>>>>>>>                                          0x00007f25abff6000 (
>>>>>>>>> 40KB)
>>>>>>>>>>                                      rw  0x00007f25ac000000 (
>>>>>>>>> 65504KB)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> So, the difference was in the pieces of memory like this:
>>>>>>>>>> 
>>>>>>>>>> rw 0x00007f2590000000 (65500KB)
>>>>>>>>>>  0x00007f2593ff7000 (36KB)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is
>> very
>>>>>>>>> similar
>>>>>>>>>> size)
>>>>>>>>>> 
>>>>>>>>>> If we count this blocks we get amount of lost memory:
>>>>>>>>>> 
>>>>>>>>>> 65M * 32 + 132M = 2212M
>>>>>>>>>> 
>>>>>>>>>> So, it looks like HLog allcates to many memory, and question is:
>> how
>>>>>>> to
>>>>>>>>>> restrict it?
>>>>>>>>>> 
>>>>>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>>>>>>>> 
>>>>>>>>>>> Hi All.
>>>>>>>>>>> 
>>>>>>>>>>> After heavy load into hbase (single node, nondistributed test
>>>> system)
>>>>>>> I
>>>>>>>>> got
>>>>>>>>>>> 4Gb process size of my HBase java process.
>>>>>>>>>>> On 6GB machine there was no room for anything else (disk cache
>> and
>>>> so
>>>>>>>>> on).
>>>>>>>>>>> Does anybody knows, what is going on, and how you solve this.
>> What
>>>>>>> heap
>>>>>>>>>>> memory is set on you hosts
>>>>>>>>>>> and how much of RSS hbase process actually use.
>>>>>>>>>>> 
>>>>>>>>>>> I don't see such things before, all tomcat and other java apps
>>>> don't
>>>>>>>>> eats
>>>>>>>>>>> significally more memory then -Xmx.
>>>>>>>>>>> 
>>>>>>>>>>> Connection name:   pid: 23476
>>>> org.apache.hadoop.hbase.master.HMaster
>>>>>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM
>>>> version
>>>>>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>>>>>>>> Uptime:   12
>>>>>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>>>>>>> compiler:
>>>>>>>>> HotSpot
>>>>>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>>>>>>>>> ------------------------------
>>>>>>>>>>> Current heap size:     703 903 kbytes   Maximum heap size:   2
>>>> 030
>>>>>>>>> 976kbytes    Committed memory:
>>>>>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>>>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time
>> spent
>>>> =
>>>>>>> 5
>>>>>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>>>>>>> Collections
>>>>>>>>> =
>>>>>>>>>>> 20, Total time spent = 35,754 seconds
>>>>>>>>>>> ------------------------------
>>>>>>>>>>> Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:
>>>> amd64
>>>>>>>>> Number of processors:
>>>>>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total
>> physical
>>>>>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720
>>>> kbytes
>>>>>>>>> Total swap space:
>>>>>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Andrey Stepachev <oc...@gmail.com>.
No, I use only the malloc env var, and I set it (as suggested before) in
hbase-env.sh, and it looks like it eats noticeably less memory (in my case 4.7G vs
3.3G with a 2G heap).
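
For reference, this is roughly how it is set up here (a sketch; the pid below is a placeholder):

    # conf/hbase-env.sh is sourced by the start scripts, so an exported
    # variable ends up in the environment of the HBase daemons
    export MALLOC_ARENA_MAX=2

    # verify a running daemon actually picked it up
    tr '\0' '\n' < /proc/<pid>/environ | grep MALLOC_ARENA_MAX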

2011/1/12 Friso van Vollenhoven <fv...@xebia.com>

> Thanks.
>
> I went back to hbase 0.89 with 0.1 LZO, which works fine and does not show
> this issue.
>
> I tried with a newer Hbase and LZO version, also with the MALLOC... setting
> but without max direct memory set, so I was wondering whether you need a
> combination of the two to fix things (apparently not).
>
> Now i am wondering whether I did something wrong setting the env var. It
> should just be picked up when it's in hbase-env.sh, right?
>
>
> Friso
>
>
>
> On 12 jan 2011, at 10:59, Andrey Stepachev wrote:
>
> > with MALLOC_ARENA_MAX=2
> >
> > I check -XX:MaxDirectMemorySize=256m, before, but it doesn't affect
> anything
> > (even no OOM
> > exceptions or so on).
> >
> > But it looks like i have exactly the same issue (it looks like). I have
> many
> > 64Mb anon memory blocks.
> > (sometimes they 132MB). And on heavy load i have rapidly growing rss size
> of
> > jvm process.
> >
> > 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
> >
> >> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
> >> hbase-env.sh?
> >>
> >> Did you also use the -XX:MaxDirectMemorySize=256m ?
> >>
> >> It would be nice to check that this is a different than the leakage with
> >> LZO...
> >>
> >>
> >> Thanks,
> >> Friso
> >>
> >>
> >> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
> >>
> >>> My bad. All things work. Thanks for  Todd Lipcon :)
> >>>
> >>> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
> >>>
> >>>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in
> LZO
> >>>> problem thread. All those 65M blocks here. And JVM continues to eat
> >> memory
> >>>> on heavy write load. And yes, I use "improved" kernel
> >>>> Linux 2.6.34.7-0.5.
> >>>>
> >>>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
> >>>>
> >>>> Are you using a newer linux kernel with the new and "improved" memory
> >>>>> allocator?
> >>>>>
> >>>>> If so try setting this in hadoop-env.sh:
> >>>>>
> >>>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
> >>>>>
> >>>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
> >> works
> >>>>> for you.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>>
> >>>>> -Xavier
> >>>>>
> >>>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
> >>>>>> No. I don't use LZO. I tried even remove any native support (i.e.
> all
> >>>>> .so
> >>>>>> from class path)
> >>>>>> and use java gzip. But nothing.
> >>>>>>
> >>>>>>
> >>>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
> >>>>>>
> >>>>>>> Are you using LZO by any chance? If so, which version?
> >>>>>>>
> >>>>>>> Friso
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
> >>>>>>>
> >>>>>>>> After starting the hbase in jroсkit found the same memory leakage.
> >>>>>>>>
> >>>>>>>> After the launch
> >>>>>>>>
> >>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu |
> head
> >>>>>>>> Tue Jan 11 16:49:31 2011
> >>>>>>>>
> >>>>>>>> 11 16:49:31 MSK 2011
> >>>>>>>> PID RSS VSZ% CPU
> >>>>>>>> 7863 2547760 5576744 78.7
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> JR dumps:
> >>>>>>>>
> >>>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap
> 2048000KB
> >>>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
> >> (#
> >>>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) -
> Internal
> >>>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
> >> (malloced
> >>>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
> >> 15134
> >>>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
> >> +10
> >>>>>>>> KB # 20)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> After running the mr which make high write load (~1hour)
> >>>>>>>>
> >>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu |
> head
> >>>>>>>> Tue Jan 11 17:08:56 2011
> >>>>>>>>
> >>>>>>>> 11 17:08:56 MSK 2011
> >>>>>>>> PID RSS VSZ% CPU
> >>>>>>>> 7863 4072396 5459572 100
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> JR said not important below specify why)
> >>>>>>>>
> >>>>>>>> http://paste.ubuntu.com/552820/
> >>>>>>>> <http://paste.ubuntu.com/552820/>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 7863:
> >>>>>>>> Total mapped                  5742628KB +165888KB
> >> (reserved=1144000KB
> >>>>>>>> -1532404KB)
> >>>>>>>> -              Java heap      2048000KB           (reserved=0KB
> >>>>>>> -1472176KB)
> >>>>>>>> -              GC tables        68512KB
> >>>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114
> +3)
> >>>>>>>> -          Compiled code      1048576KB           (used=3376KB
> >> +776KB)
> >>>>>>>> -               Internal         1480KB    +256KB
> >>>>>>>> -                     OS       517944KB  -31744KB
> >>>>>>>> -                  Other      1996792KB +195816KB
> >>>>>>>> -            Classblocks         1280KB           (malloced=1156KB
> >>>>>>>> +45KB #3421 +136)
> >>>>>>>> -        Java class data        20992KB    +768KB
> (malloced=20843KB
> >>>>>>>> +840KB #15774 +640 in 3421 classes)
> >>>>>>>> - Native memory tracking         1024KB           (malloced=325KB
> >>>>> +10KB
> >>>>>>> #20)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>  OS                          *java    r x 0x0000000000400000.(
> >>>>>>> 76KB)
> >>>>>>>>  OS                          *java    rw  0x0000000000612000 (
> >>>>>>> 4KB)
> >>>>>>>>  OS                        *[heap]    rw  0x0000000000613000.(
> >>>>>>> 478712KB)
> >>>>>>>> INT                           Poll    r   0x000000007fffe000 (
> >>>>>>> 4KB)
> >>>>>>>> INT                         Membar    rw  0x000000007ffff000.(
> >>>>>>> 4KB)
> >>>>>>>> MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
> >>>>>>> 768KB)
> >>>>>>>> MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
> >>>>>>> 512KB)
> >>>>>>>> HEAP                      Java heap    rw
> >>>>>>> 0x0000000083000000.(2048000KB)
> >>>>>>>>                                       rw  0x00007f2574000000 (
> >>>>>>> 65500KB)
> >>>>>>>>                                           0x00007f2577ff7000.(
> >>>>>>> 36KB)
> >>>>>>>>                                       rw  0x00007f2584000000 (
> >>>>>>> 65492KB)
> >>>>>>>>                                           0x00007f2587ff5000.(
> >>>>>>> 44KB)
> >>>>>>>>                                       rw  0x00007f258c000000 (
> >>>>>>> 65500KB)
> >>>>>>>>                                           0x00007f258fff7000 (
> >>>>>>> 36KB)
> >>>>>>>>                                       rw  0x00007f2590000000 (
> >>>>>>> 65500KB)
> >>>>>>>>                                           0x00007f2593ff7000 (
> >>>>>>> 36KB)
> >>>>>>>>                                       rw  0x00007f2594000000 (
> >>>>>>> 65500KB)
> >>>>>>>>                                           0x00007f2597ff7000 (
> >>>>>>> 36KB)
> >>>>>>>>                                       rw  0x00007f2598000000 (
> >>>>>>> 131036KB)
> >>>>>>>>                                           0x00007f259fff7000 (
> >>>>>>> 36KB)
> >>>>>>>>                                       rw  0x00007f25a0000000 (
> >>>>>>> 65528KB)
> >>>>>>>>                                           0x00007f25a3ffe000 (
> >>>>>>> 8KB)
> >>>>>>>>                                       rw  0x00007f25a4000000 (
> >>>>>>> 65496KB)
> >>>>>>>>                                           0x00007f25a7ff6000 (
> >>>>>>> 40KB)
> >>>>>>>>                                       rw  0x00007f25a8000000 (
> >>>>>>> 65496KB)
> >>>>>>>>                                           0x00007f25abff6000 (
> >>>>>>> 40KB)
> >>>>>>>>                                       rw  0x00007f25ac000000 (
> >>>>>>> 65504KB)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> So, the difference was in the pieces of memory like this:
> >>>>>>>>
> >>>>>>>> rw 0x00007f2590000000 (65500KB)
> >>>>>>>>   0x00007f2593ff7000 (36KB)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is
> very
> >>>>>>> similar
> >>>>>>>> size)
> >>>>>>>>
> >>>>>>>> If we count this blocks we get amount of lost memory:
> >>>>>>>>
> >>>>>>>> 65M * 32 + 132M = 2212M
> >>>>>>>>
> >>>>>>>> So, it looks like HLog allcates to many memory, and question is:
> how
> >>>>> to
> >>>>>>>> restrict it?
> >>>>>>>>
> >>>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
> >>>>>>>>
> >>>>>>>>> Hi All.
> >>>>>>>>>
> >>>>>>>>> After heavy load into hbase (single node, nondistributed test
> >> system)
> >>>>> I
> >>>>>>> got
> >>>>>>>>> 4Gb process size of my HBase java process.
> >>>>>>>>> On 6GB machine there was no room for anything else (disk cache
> and
> >> so
> >>>>>>> on).
> >>>>>>>>> Does anybody knows, what is going on, and how you solve this.
> What
> >>>>> heap
> >>>>>>>>> memory is set on you hosts
> >>>>>>>>> and how much of RSS hbase process actually use.
> >>>>>>>>>
> >>>>>>>>> I don't see such things before, all tomcat and other java apps
> >> don't
> >>>>>>> eats
> >>>>>>>>> significally more memory then -Xmx.
> >>>>>>>>>
> >>>>>>>>> Connection name:   pid: 23476
> >> org.apache.hadoop.hbase.master.HMaster
> >>>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM
> >> version
> >>>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
> >>>>>>> Uptime:   12
> >>>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
> >>>>> compiler:
> >>>>>>> HotSpot
> >>>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
> >>>>>>>>> ------------------------------
> >>>>>>>>>  Current heap size:     703 903 kbytes   Maximum heap size:   2
> >> 030
> >>>>>>> 976kbytes    Committed memory:
> >>>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
> >>>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time
> spent
> >> =
> >>>>> 5
> >>>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
> >>>>> Collections
> >>>>>>> =
> >>>>>>>>> 20, Total time spent = 35,754 seconds
> >>>>>>>>> ------------------------------
> >>>>>>>>>  Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:
> >> amd64
> >>>>>>> Number of processors:
> >>>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total
> physical
> >>>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720
> >> kbytes
> >>>>>>> Total swap space:
> >>>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Thanks.

I went back to HBase 0.89 with 0.1 LZO, which works fine and does not show this issue.

I tried with a newer HBase and LZO version, also with the MALLOC... setting but without max direct memory set, so I was wondering whether you need a combination of the two to fix things (apparently not).

Now I am wondering whether I did something wrong setting the env var. It should just be picked up when it's in hbase-env.sh, right?


Friso



On 12 jan 2011, at 10:59, Andrey Stepachev wrote:

> with MALLOC_ARENA_MAX=2
> 
> I check -XX:MaxDirectMemorySize=256m, before, but it doesn't affect anything
> (even no OOM
> exceptions or so on).
> 
> But it looks like i have exactly the same issue (it looks like). I have many
> 64Mb anon memory blocks.
> (sometimes they 132MB). And on heavy load i have rapidly growing rss size of
> jvm process.
> 
> 2011/1/12 Friso van Vollenhoven <fv...@xebia.com>
> 
>> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
>> hbase-env.sh?
>> 
>> Did you also use the -XX:MaxDirectMemorySize=256m ?
>> 
>> It would be nice to check that this is a different than the leakage with
>> LZO...
>> 
>> 
>> Thanks,
>> Friso
>> 
>> 
>> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
>> 
>>> My bad. All things work. Thanks for  Todd Lipcon :)
>>> 
>>> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
>>> 
>>>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
>>>> problem thread. All those 65M blocks here. And JVM continues to eat
>> memory
>>>> on heavy write load. And yes, I use "improved" kernel
>>>> Linux 2.6.34.7-0.5.
>>>> 
>>>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>>>> 
>>>> Are you using a newer linux kernel with the new and "improved" memory
>>>>> allocator?
>>>>> 
>>>>> If so try setting this in hadoop-env.sh:
>>>>> 
>>>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>>>> 
>>>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
>> works
>>>>> for you.
>>>>> 
>>>>> Cheers,
>>>>> 
>>>>> 
>>>>> -Xavier
>>>>> 
>>>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>>>>>> No. I don't use LZO. I tried even remove any native support (i.e. all
>>>>> .so
>>>>>> from class path)
>>>>>> and use java gzip. But nothing.
>>>>>> 
>>>>>> 
>>>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>>>>>> 
>>>>>>> Are you using LZO by any chance? If so, which version?
>>>>>>> 
>>>>>>> Friso
>>>>>>> 
>>>>>>> 
>>>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>>>>>> 
>>>>>>>> After starting the hbase in jroсkit found the same memory leakage.
>>>>>>>> 
>>>>>>>> After the launch
>>>>>>>> 
>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>> Tue Jan 11 16:49:31 2011
>>>>>>>> 
>>>>>>>> 11 16:49:31 MSK 2011
>>>>>>>> PID RSS VSZ% CPU
>>>>>>>> 7863 2547760 5576744 78.7
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> JR dumps:
>>>>>>>> 
>>>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>>>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
>> (#
>>>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>>>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
>> (malloced
>>>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
>> 15134
>>>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
>> +10
>>>>>>>> KB # 20)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> After running the mr which make high write load (~1hour)
>>>>>>>> 
>>>>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
>>>>>>>> Tue Jan 11 17:08:56 2011
>>>>>>>> 
>>>>>>>> 11 17:08:56 MSK 2011
>>>>>>>> PID RSS VSZ% CPU
>>>>>>>> 7863 4072396 5459572 100
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> JR said not important below specify why)
>>>>>>>> 
>>>>>>>> http://paste.ubuntu.com/552820/
>>>>>>>> <http://paste.ubuntu.com/552820/>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 7863:
>>>>>>>> Total mapped                  5742628KB +165888KB
>> (reserved=1144000KB
>>>>>>>> -1532404KB)
>>>>>>>> -              Java heap      2048000KB           (reserved=0KB
>>>>>>> -1472176KB)
>>>>>>>> -              GC tables        68512KB
>>>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>>>>>>>> -          Compiled code      1048576KB           (used=3376KB
>> +776KB)
>>>>>>>> -               Internal         1480KB    +256KB
>>>>>>>> -                     OS       517944KB  -31744KB
>>>>>>>> -                  Other      1996792KB +195816KB
>>>>>>>> -            Classblocks         1280KB           (malloced=1156KB
>>>>>>>> +45KB #3421 +136)
>>>>>>>> -        Java class data        20992KB    +768KB (malloced=20843KB
>>>>>>>> +840KB #15774 +640 in 3421 classes)
>>>>>>>> - Native memory tracking         1024KB           (malloced=325KB
>>>>> +10KB
>>>>>>> #20)
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>  OS                          *java    r x 0x0000000000400000.(
>>>>>>> 76KB)
>>>>>>>>  OS                          *java    rw  0x0000000000612000 (
>>>>>>> 4KB)
>>>>>>>>  OS                        *[heap]    rw  0x0000000000613000.(
>>>>>>> 478712KB)
>>>>>>>> INT                           Poll    r   0x000000007fffe000 (
>>>>>>> 4KB)
>>>>>>>> INT                         Membar    rw  0x000000007ffff000.(
>>>>>>> 4KB)
>>>>>>>> MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>>>>>> 768KB)
>>>>>>>> MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>>>>>> 512KB)
>>>>>>>> HEAP                      Java heap    rw
>>>>>>> 0x0000000083000000.(2048000KB)
>>>>>>>>                                       rw  0x00007f2574000000 (
>>>>>>> 65500KB)
>>>>>>>>                                           0x00007f2577ff7000.(
>>>>>>> 36KB)
>>>>>>>>                                       rw  0x00007f2584000000 (
>>>>>>> 65492KB)
>>>>>>>>                                           0x00007f2587ff5000.(
>>>>>>> 44KB)
>>>>>>>>                                       rw  0x00007f258c000000 (
>>>>>>> 65500KB)
>>>>>>>>                                           0x00007f258fff7000 (
>>>>>>> 36KB)
>>>>>>>>                                       rw  0x00007f2590000000 (
>>>>>>> 65500KB)
>>>>>>>>                                           0x00007f2593ff7000 (
>>>>>>> 36KB)
>>>>>>>>                                       rw  0x00007f2594000000 (
>>>>>>> 65500KB)
>>>>>>>>                                           0x00007f2597ff7000 (
>>>>>>> 36KB)
>>>>>>>>                                       rw  0x00007f2598000000 (
>>>>>>> 131036KB)
>>>>>>>>                                           0x00007f259fff7000 (
>>>>>>> 36KB)
>>>>>>>>                                       rw  0x00007f25a0000000 (
>>>>>>> 65528KB)
>>>>>>>>                                           0x00007f25a3ffe000 (
>>>>>>> 8KB)
>>>>>>>>                                       rw  0x00007f25a4000000 (
>>>>>>> 65496KB)
>>>>>>>>                                           0x00007f25a7ff6000 (
>>>>>>> 40KB)
>>>>>>>>                                       rw  0x00007f25a8000000 (
>>>>>>> 65496KB)
>>>>>>>>                                           0x00007f25abff6000 (
>>>>>>> 40KB)
>>>>>>>>                                       rw  0x00007f25ac000000 (
>>>>>>> 65504KB)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> So, the difference was in the pieces of memory like this:
>>>>>>>> 
>>>>>>>> rw 0x00007f2590000000 (65500KB)
>>>>>>>>   0x00007f2593ff7000 (36KB)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is very
>>>>>>> similar
>>>>>>>> size)
>>>>>>>> 
>>>>>>>> If we count this blocks we get amount of lost memory:
>>>>>>>> 
>>>>>>>> 65M * 32 + 132M = 2212M
>>>>>>>> 
>>>>>>>> So, it looks like HLog allcates to many memory, and question is: how
>>>>> to
>>>>>>>> restrict it?
>>>>>>>> 
>>>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>>>>>> 
>>>>>>>>> Hi All.
>>>>>>>>> 
>>>>>>>>> After heavy load into hbase (single node, nondistributed test
>> system)
>>>>> I
>>>>>>> got
>>>>>>>>> 4Gb process size of my HBase java process.
>>>>>>>>> On 6GB machine there was no room for anything else (disk cache and
>> so
>>>>>>> on).
>>>>>>>>> Does anybody knows, what is going on, and how you solve this. What
>>>>> heap
>>>>>>>>> memory is set on you hosts
>>>>>>>>> and how much of RSS hbase process actually use.
>>>>>>>>> 
>>>>>>>>> I don't see such things before, all tomcat and other java apps
>> don't
>>>>>>> eats
>>>>>>>>> significally more memory then -Xmx.
>>>>>>>>> 
>>>>>>>>> Connection name:   pid: 23476
>> org.apache.hadoop.hbase.master.HMaster
>>>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM
>> version
>>>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>>>>>> Uptime:   12
>>>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>>>>> compiler:
>>>>>>> HotSpot
>>>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>>>>>>> ------------------------------
>>>>>>>>>  Current heap size:     703 903 kbytes   Maximum heap size:   2
>> 030
>>>>>>> 976kbytes    Committed memory:
>>>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent
>> =
>>>>> 5
>>>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>>>>> Collections
>>>>>>> =
>>>>>>>>> 20, Total time spent = 35,754 seconds
>>>>>>>>> ------------------------------
>>>>>>>>>  Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:
>> amd64
>>>>>>> Number of processors:
>>>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>>>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720
>> kbytes
>>>>>>> Total swap space:
>>>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Andrey Stepachev <oc...@gmail.com>.
with MALLOC_ARENA_MAX=2

I checked -XX:MaxDirectMemorySize=256m before, but it doesn't affect anything
(not even OOM exceptions or the like).

But it looks like I have exactly the same issue: I have many
64MB anon memory blocks (sometimes 132MB), and on heavy load the RSS
of the JVM process grows rapidly.
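
A quick way to count those blocks on the live process (a sketch; the 60000-70000 KB bounds are just a rough filter for the ~64MB arenas, and <hbase_pid> is a placeholder):

    pmap -x <hbase_pid> | awk '$2+0 >= 60000 && $2+0 <= 70000 {n++; kb += $2} END {print n, "blocks,", kb/1024, "MB"}'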

2011/1/12 Friso van Vollenhoven <fv...@xebia.com>

> Just to clarify: you fixed it by setting the MALLOC_MAX_ARENA=? in
> hbase-env.sh?
>
> Did you also use the -XX:MaxDirectMemorySize=256m ?
>
> It would be nice to check that this is a different than the leakage with
> LZO...
>
>
> Thanks,
> Friso
>
>
> On 12 jan 2011, at 07:46, Andrey Stepachev wrote:
>
> > My bad. All things work. Thanks for  Todd Lipcon :)
> >
> > 2011/1/11 Andrey Stepachev <oc...@gmail.com>
> >
> >> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
> >> problem thread. All those 65M blocks here. And JVM continues to eat
> memory
> >> on heavy write load. And yes, I use "improved" kernel
> >> Linux 2.6.34.7-0.5.
> >>
> >> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
> >>
> >> Are you using a newer linux kernel with the new and "improved" memory
> >>> allocator?
> >>>
> >>> If so try setting this in hadoop-env.sh:
> >>>
> >>> export MALLOC_ARENA_MAX=<number of cores you want to use>
> >>>
> >>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this
> works
> >>> for you.
> >>>
> >>> Cheers,
> >>>
> >>>
> >>> -Xavier
> >>>
> >>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
> >>>> No. I don't use LZO. I tried even remove any native support (i.e. all
> >>> .so
> >>>> from class path)
> >>>> and use java gzip. But nothing.
> >>>>
> >>>>
> >>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
> >>>>
> >>>>> Are you using LZO by any chance? If so, which version?
> >>>>>
> >>>>> Friso
> >>>>>
> >>>>>
> >>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
> >>>>>
> >>>>>> After starting the hbase in jroсkit found the same memory leakage.
> >>>>>>
> >>>>>> After the launch
> >>>>>>
> >>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
> >>>>>> Tue Jan 11 16:49:31 2011
> >>>>>>
> >>>>>>  11 16:49:31 MSK 2011
> >>>>>>  PID RSS VSZ% CPU
> >>>>>> 7863 2547760 5576744 78.7
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> JR dumps:
> >>>>>>
> >>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
> >>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB
> (#
> >>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
> >>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB
> (malloced
> >>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB #
> 15134
> >>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB
> +10
> >>>>>> KB # 20)
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> After running the mr which make high write load (~1hour)
> >>>>>>
> >>>>>> Every 2,0 s: date & & ps - sort =- rss-eopid, rss, vsz, pcpu | head
> >>>>>> Tue Jan 11 17:08:56 2011
> >>>>>>
> >>>>>>  11 17:08:56 MSK 2011
> >>>>>>  PID RSS VSZ% CPU
> >>>>>> 7863 4072396 5459572 100
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> JR said not important below specify why)
> >>>>>>
> >>>>>> http://paste.ubuntu.com/552820/
> >>>>>> <http://paste.ubuntu.com/552820/>
> >>>>>>
> >>>>>>
> >>>>>> 7863:
> >>>>>> Total mapped                  5742628KB +165888KB
> (reserved=1144000KB
> >>>>>> -1532404KB)
> >>>>>> -              Java heap      2048000KB           (reserved=0KB
> >>>>> -1472176KB)
> >>>>>> -              GC tables        68512KB
> >>>>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
> >>>>>> -          Compiled code      1048576KB           (used=3376KB
> +776KB)
> >>>>>> -               Internal         1480KB    +256KB
> >>>>>> -                     OS       517944KB  -31744KB
> >>>>>> -                  Other      1996792KB +195816KB
> >>>>>> -            Classblocks         1280KB           (malloced=1156KB
> >>>>>> +45KB #3421 +136)
> >>>>>> -        Java class data        20992KB    +768KB (malloced=20843KB
> >>>>>> +840KB #15774 +640 in 3421 classes)
> >>>>>> - Native memory tracking         1024KB           (malloced=325KB
> >>> +10KB
> >>>>> #20)
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>   OS                          *java    r x 0x0000000000400000.(
> >>>>> 76KB)
> >>>>>>   OS                          *java    rw  0x0000000000612000 (
> >>>>> 4KB)
> >>>>>>   OS                        *[heap]    rw  0x0000000000613000.(
> >>>>> 478712KB)
> >>>>>>  INT                           Poll    r   0x000000007fffe000 (
> >>>>> 4KB)
> >>>>>>  INT                         Membar    rw  0x000000007ffff000.(
> >>>>> 4KB)
> >>>>>>  MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
> >>>>> 768KB)
> >>>>>>  MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
> >>>>> 512KB)
> >>>>>> HEAP                      Java heap    rw
> >>>>> 0x0000000083000000.(2048000KB)
> >>>>>>                                        rw  0x00007f2574000000 (
> >>>>> 65500KB)
> >>>>>>                                            0x00007f2577ff7000.(
> >>>>> 36KB)
> >>>>>>                                        rw  0x00007f2584000000 (
> >>>>> 65492KB)
> >>>>>>                                            0x00007f2587ff5000.(
> >>>>> 44KB)
> >>>>>>                                        rw  0x00007f258c000000 (
> >>>>> 65500KB)
> >>>>>>                                            0x00007f258fff7000 (
> >>>>> 36KB)
> >>>>>>                                        rw  0x00007f2590000000 (
> >>>>> 65500KB)
> >>>>>>                                            0x00007f2593ff7000 (
> >>>>> 36KB)
> >>>>>>                                        rw  0x00007f2594000000 (
> >>>>> 65500KB)
> >>>>>>                                            0x00007f2597ff7000 (
> >>>>> 36KB)
> >>>>>>                                        rw  0x00007f2598000000 (
> >>>>> 131036KB)
> >>>>>>                                            0x00007f259fff7000 (
> >>>>> 36KB)
> >>>>>>                                        rw  0x00007f25a0000000 (
> >>>>> 65528KB)
> >>>>>>                                            0x00007f25a3ffe000 (
> >>>>> 8KB)
> >>>>>>                                        rw  0x00007f25a4000000 (
> >>>>> 65496KB)
> >>>>>>                                            0x00007f25a7ff6000 (
> >>>>> 40KB)
> >>>>>>                                        rw  0x00007f25a8000000 (
> >>>>> 65496KB)
> >>>>>>                                            0x00007f25abff6000 (
> >>>>> 40KB)
> >>>>>>                                        rw  0x00007f25ac000000 (
> >>>>> 65504KB)
> >>>>>>
> >>>>>>
> >>>>>> So, the difference was in the pieces of memory like this:
> >>>>>>
> >>>>>> rw 0x00007f2590000000 (65500KB)
> >>>>>>    0x00007f2593ff7000 (36KB)
> >>>>>>
> >>>>>>
> >>>>>> Looks like HLog allocates memory (looks like HLog, becase it is very
> >>>>> similar
> >>>>>> size)
> >>>>>>
> >>>>>> If we count this blocks we get amount of lost memory:
> >>>>>>
> >>>>>> 65M * 32 + 132M = 2212M
> >>>>>>
> >>>>>> So, it looks like HLog allcates to many memory, and question is: how
> >>> to
> >>>>>> restrict it?
> >>>>>>
> >>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
> >>>>>>
> >>>>>>> Hi All.
> >>>>>>>
> >>>>>>> After heavy load into hbase (single node, nondistributed test
> system)
> >>> I
> >>>>> got
> >>>>>>> 4Gb process size of my HBase java process.
> >>>>>>> On 6GB machine there was no room for anything else (disk cache and
> so
> >>>>> on).
> >>>>>>> Does anybody knows, what is going on, and how you solve this. What
> >>> heap
> >>>>>>> memory is set on you hosts
> >>>>>>> and how much of RSS hbase process actually use.
> >>>>>>>
> >>>>>>> I don't see such things before, all tomcat and other java apps
> don't
> >>>>> eats
> >>>>>>> significally more memory then -Xmx.
> >>>>>>>
> >>>>>>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster start
> >>>>>>> Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version 17.1-b03
> >>>>>>> Vendor:   Sun Microsystems Inc.
> >>>>>>> Name:   23476@mars
> >>>>>>> Uptime:   12 hours 4 minutes
> >>>>>>> Process CPU time:   5 hours 45 minutes
> >>>>>>> JIT compiler:   HotSpot 64-Bit Server Compiler
> >>>>>>> Total compile time:   19,223 seconds
> >>>>>>> ------------------------------
> >>>>>>> Current heap size:   703 903 kbytes
> >>>>>>> Maximum heap size:   2 030 976 kbytes
> >>>>>>> Committed memory:   2 030 976 kbytes
> >>>>>>> Pending finalization:   0 objects
> >>>>>>> Garbage collector:   Name = 'ParNew', Collections = 9 990, Total time spent = 5 minutes
> >>>>>>> Garbage collector:   Name = 'ConcurrentMarkSweep', Collections = 20, Total time spent = 35,754 seconds
> >>>>>>> ------------------------------
> >>>>>>> Operating System:   Linux 2.6.34.7-0.5-xen
> >>>>>>> Architecture:   amd64
> >>>>>>> Number of processors:   8
> >>>>>>> Committed virtual memory:   4 403 512 kbytes
> >>>>>>> Total physical memory:   6 815 744 kbytes
> >>>>>>> Free physical memory:   82 720 kbytes
> >>>>>>> Total swap space:   8 393 924 kbytes
> >>>>>>> Free swap space:   8 050 880 kbytes
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >>
> >>
>
>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Just to clarify: you fixed it by setting MALLOC_ARENA_MAX=? in hbase-env.sh?

Did you also use the -XX:MaxDirectMemorySize=256m ?

It would be nice to check whether this is a different issue than the leakage with LZO...


Thanks,
Friso
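
For reference, a minimal sketch of the settings being discussed here, assuming
they go into conf/hbase-env.sh and that HBASE_OPTS is used for extra JVM flags
(the values are illustrative only, not a recommendation from this thread):

# conf/hbase-env.sh (sketch)
export MALLOC_ARENA_MAX=4
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=256m"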


On 12 jan 2011, at 07:46, Andrey Stepachev wrote:

> My bad. All things work. Thanks for  Todd Lipcon :)
> 
> 2011/1/11 Andrey Stepachev <oc...@gmail.com>
> 
>> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
>> problem thread. All those 65M blocks here. And JVM continues to eat memory
>> on heavy write load. And yes, I use "improved" kernel
>> Linux 2.6.34.7-0.5.
>> 
>> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>> 
>> Are you using a newer linux kernel with the new and "improved" memory
>>> allocator?
>>> 
>>> If so try setting this in hadoop-env.sh:
>>> 
>>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>> 
>>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this works
>>> for you.
>>> 
>>> Cheers,
>>> 
>>> 
>>> -Xavier
>>> 
>>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>>>> No. I don't use LZO. I tried even remove any native support (i.e. all
>>> .so
>>>> from class path)
>>>> and use java gzip. But nothing.
>>>> 
>>>> 
>>>> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>>>> 
>>>>> Are you using LZO by any chance? If so, which version?
>>>>> 
>>>>> Friso
>>>>> 
>>>>> 
>>>>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>>>> 
>>>>>> After starting hbase under JRockit, I found the same memory leakage.
>>>>>> 
>>>>>> After the launch
>>>>>> 
>>>>>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>>>>>> Tue Jan 11 16:49:31 2011
>>>>>> 
>>>>>>  11 16:49:31 MSK 2011
>>>>>>  PID RSS VSZ% CPU
>>>>>> 7863 2547760 5576744 78.7
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> JR dumps:
>>>>>> 
>>>>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>>>>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
>>>>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>>>>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
>>>>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
>>>>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
>>>>>> KB # 20)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> After running the mr which make high write load (~1hour)
>>>>>> 
>>>>>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>>>>>> Tue Jan 11 17:08:56 2011
>>>>>> 
>>>>>>  11 17:08:56 MSK 2011
>>>>>>  PID RSS VSZ% CPU
>>>>>> 7863 4072396 5459572 100
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> JR said not important below specify why)
>>>>>> 
>>>>>> http://paste.ubuntu.com/552820/
>>>>>> <http://paste.ubuntu.com/552820/>
>>>>>> 
>>>>>> 
>>>>>> 7863:
>>>>>> Total mapped                  5742628KB +165888KB (reserved=1144000KB
>>>>>> -1532404KB)
>>>>>> -              Java heap      2048000KB           (reserved=0KB
>>>>> -1472176KB)
>>>>>> -              GC tables        68512KB
>>>>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>>>>>> -          Compiled code      1048576KB           (used=3376KB +776KB)
>>>>>> -               Internal         1480KB    +256KB
>>>>>> -                     OS       517944KB  -31744KB
>>>>>> -                  Other      1996792KB +195816KB
>>>>>> -            Classblocks         1280KB           (malloced=1156KB
>>>>>> +45KB #3421 +136)
>>>>>> -        Java class data        20992KB    +768KB (malloced=20843KB
>>>>>> +840KB #15774 +640 in 3421 classes)
>>>>>> - Native memory tracking         1024KB           (malloced=325KB
>>> +10KB
>>>>> #20)
>>>>>> 
>>>>>> 
>>>>> 
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>   OS                          *java    r x 0x0000000000400000.(
>>>>> 76KB)
>>>>>>   OS                          *java    rw  0x0000000000612000 (
>>>>> 4KB)
>>>>>>   OS                        *[heap]    rw  0x0000000000613000.(
>>>>> 478712KB)
>>>>>>  INT                           Poll    r   0x000000007fffe000 (
>>>>> 4KB)
>>>>>>  INT                         Membar    rw  0x000000007ffff000.(
>>>>> 4KB)
>>>>>>  MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>>>> 768KB)
>>>>>>  MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>>>> 512KB)
>>>>>> HEAP                      Java heap    rw
>>>>> 0x0000000083000000.(2048000KB)
>>>>>>                                        rw  0x00007f2574000000 (
>>>>> 65500KB)
>>>>>>                                            0x00007f2577ff7000.(
>>>>> 36KB)
>>>>>>                                        rw  0x00007f2584000000 (
>>>>> 65492KB)
>>>>>>                                            0x00007f2587ff5000.(
>>>>> 44KB)
>>>>>>                                        rw  0x00007f258c000000 (
>>>>> 65500KB)
>>>>>>                                            0x00007f258fff7000 (
>>>>> 36KB)
>>>>>>                                        rw  0x00007f2590000000 (
>>>>> 65500KB)
>>>>>>                                            0x00007f2593ff7000 (
>>>>> 36KB)
>>>>>>                                        rw  0x00007f2594000000 (
>>>>> 65500KB)
>>>>>>                                            0x00007f2597ff7000 (
>>>>> 36KB)
>>>>>>                                        rw  0x00007f2598000000 (
>>>>> 131036KB)
>>>>>>                                            0x00007f259fff7000 (
>>>>> 36KB)
>>>>>>                                        rw  0x00007f25a0000000 (
>>>>> 65528KB)
>>>>>>                                            0x00007f25a3ffe000 (
>>>>> 8KB)
>>>>>>                                        rw  0x00007f25a4000000 (
>>>>> 65496KB)
>>>>>>                                            0x00007f25a7ff6000 (
>>>>> 40KB)
>>>>>>                                        rw  0x00007f25a8000000 (
>>>>> 65496KB)
>>>>>>                                            0x00007f25abff6000 (
>>>>> 40KB)
>>>>>>                                        rw  0x00007f25ac000000 (
>>>>> 65504KB)
>>>>>> 
>>>>>> 
>>>>>> So, the difference was in the pieces of memory like this:
>>>>>> 
>>>>>> rw 0x00007f2590000000 (65500KB)
>>>>>>    0x00007f2593ff7000 (36KB)
>>>>>> 
>>>>>> 
>>>>>> Looks like HLog allocates memory (looks like HLog, becase it is very
>>>>> similar
>>>>>> size)
>>>>>> 
>>>>>> If we count this blocks we get amount of lost memory:
>>>>>> 
>>>>>> 65M * 32 + 132M = 2212M
>>>>>> 
>>>>>> So, it looks like HLog allcates to many memory, and question is: how
>>> to
>>>>>> restrict it?
>>>>>> 
>>>>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>>>> 
>>>>>>> Hi All.
>>>>>>> 
>>>>>>> After heavy load into hbase (single node, nondistributed test system)
>>> I
>>>>> got
>>>>>>> 4Gb process size of my HBase java process.
>>>>>>> On 6GB machine there was no room for anything else (disk cache and so
>>>>> on).
>>>>>>> Does anybody knows, what is going on, and how you solve this. What
>>> heap
>>>>>>> memory is set on you hosts
>>>>>>> and how much of RSS hbase process actually use.
>>>>>>> 
>>>>>>> I don't see such things before, all tomcat and other java apps don't
>>>>> eats
>>>>>>> significally more memory then -Xmx.
>>>>>>> 
>>>>>>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster
>>>>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version
>>>>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>>>> Uptime:   12
>>>>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>>> compiler:
>>>>>  HotSpot
>>>>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>>>>> ------------------------------
>>>>>>>   Current heap size:     703 903 kbytes   Maximum heap size:   2 030
>>>>> 976kbytes    Committed memory:
>>>>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent =
>>> 5
>>>>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>>> Collections
>>>>> =
>>>>>>> 20, Total time spent = 35,754 seconds
>>>>>>> ------------------------------
>>>>>>>   Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:   amd64
>>>>> Number of processors:
>>>>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>>>>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720 kbytes
>>>>> Total swap space:
>>>>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
>> 
>> 


Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Andrey Stepachev <oc...@gmail.com>.
My bad. All things work. Thanks to Todd Lipcon :)
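
(For anyone following along: a cleaned-up sketch of the monitoring command used
throughout this thread, handy for confirming that RSS stays bounded after the
MALLOC_ARENA_MAX change.)

# watch the top memory consumers, sorted by RSS, every 2 seconds
watch -n 2 'date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head'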

2011/1/11 Andrey Stepachev <oc...@gmail.com>

> I tried to set MALLOC_ARENA_MAX=2. But still the same issue like in LZO
> problem thread. All those 65M blocks here. And JVM continues to eat memory
> on heavy write load. And yes, I use "improved" kernel
> Linux 2.6.34.7-0.5.
>
> 2011/1/11 Xavier Stevens <xs...@mozilla.com>
>
> Are you using a newer linux kernel with the new and "improved" memory
>> allocator?
>>
>> If so try setting this in hadoop-env.sh:
>>
>> export MALLOC_ARENA_MAX=<number of cores you want to use>
>>
>> Maybe start by setting it to 4.  You can thank Todd Lipcon if this works
>> for you.
>>
>> Cheers,
>>
>>
>> -Xavier
>>
>> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
>> > No. I don't use LZO. I tried even remove any native support (i.e. all
>> .so
>> > from class path)
>> > and use java gzip. But nothing.
>> >
>> >
>> > 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>> >
>> >> Are you using LZO by any chance? If so, which version?
>> >>
>> >> Friso
>> >>
>> >>
>> >> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>> >>
>> >>> After starting hbase under JRockit, I found the same memory leakage.
>> >>>
>> >>> After the launch
>> >>>
>> >>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>> >>> Tue Jan 11 16:49:31 2011
>> >>>
>> >>>   11 16:49:31 MSK 2011
>> >>>   PID RSS VSZ% CPU
>> >>>  7863 2547760 5576744 78.7
>> >>>
>> >>>
>> >>>
>> >>> JR dumps:
>> >>>
>> >>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>> >>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
>> >>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>> >>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
>> >>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
>> >>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
>> >>> KB # 20)
>> >>>
>> >>>
>> >>>
>> >>> After running the mr which make high write load (~1hour)
>> >>>
>> >>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>> >>> Tue Jan 11 17:08:56 2011
>> >>>
>> >>>   11 17:08:56 MSK 2011
>> >>>   PID RSS VSZ% CPU
>> >>>  7863 4072396 5459572 100
>> >>>
>> >>>
>> >>>
>> >>> JR said not important below specify why)
>> >>>
>> >>> http://paste.ubuntu.com/552820/
>> >>> <http://paste.ubuntu.com/552820/>
>> >>>
>> >>>
>> >>> 7863:
>> >>> Total mapped                  5742628KB +165888KB (reserved=1144000KB
>> >>> -1532404KB)
>> >>> -              Java heap      2048000KB           (reserved=0KB
>> >> -1472176KB)
>> >>> -              GC tables        68512KB
>> >>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>> >>> -          Compiled code      1048576KB           (used=3376KB +776KB)
>> >>> -               Internal         1480KB    +256KB
>> >>> -                     OS       517944KB  -31744KB
>> >>> -                  Other      1996792KB +195816KB
>> >>> -            Classblocks         1280KB           (malloced=1156KB
>> >>> +45KB #3421 +136)
>> >>> -        Java class data        20992KB    +768KB (malloced=20843KB
>> >>> +840KB #15774 +640 in 3421 classes)
>> >>> - Native memory tracking         1024KB           (malloced=325KB
>> +10KB
>> >> #20)
>> >>>
>> >>>
>> >>
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>    OS                          *java    r x 0x0000000000400000.(
>> >> 76KB)
>> >>>    OS                          *java    rw  0x0000000000612000 (
>> >>  4KB)
>> >>>    OS                        *[heap]    rw  0x0000000000613000.(
>> >> 478712KB)
>> >>>   INT                           Poll    r   0x000000007fffe000 (
>> >>  4KB)
>> >>>   INT                         Membar    rw  0x000000007ffff000.(
>> >>  4KB)
>> >>>   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>> >>  768KB)
>> >>>   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>> >>  512KB)
>> >>>  HEAP                      Java heap    rw
>> >>  0x0000000083000000.(2048000KB)
>> >>>                                         rw  0x00007f2574000000 (
>> >>  65500KB)
>> >>>                                             0x00007f2577ff7000.(
>> >> 36KB)
>> >>>                                         rw  0x00007f2584000000 (
>> >>  65492KB)
>> >>>                                             0x00007f2587ff5000.(
>> >> 44KB)
>> >>>                                         rw  0x00007f258c000000 (
>> >>  65500KB)
>> >>>                                             0x00007f258fff7000 (
>> >> 36KB)
>> >>>                                         rw  0x00007f2590000000 (
>> >>  65500KB)
>> >>>                                             0x00007f2593ff7000 (
>> >> 36KB)
>> >>>                                         rw  0x00007f2594000000 (
>> >>  65500KB)
>> >>>                                             0x00007f2597ff7000 (
>> >> 36KB)
>> >>>                                         rw  0x00007f2598000000 (
>> >> 131036KB)
>> >>>                                             0x00007f259fff7000 (
>> >> 36KB)
>> >>>                                         rw  0x00007f25a0000000 (
>> >>  65528KB)
>> >>>                                             0x00007f25a3ffe000 (
>> >>  8KB)
>> >>>                                         rw  0x00007f25a4000000 (
>> >>  65496KB)
>> >>>                                             0x00007f25a7ff6000 (
>> >> 40KB)
>> >>>                                         rw  0x00007f25a8000000 (
>> >>  65496KB)
>> >>>                                             0x00007f25abff6000 (
>> >> 40KB)
>> >>>                                         rw  0x00007f25ac000000 (
>> >>  65504KB)
>> >>>
>> >>>
>> >>> So, the difference was in the pieces of memory like this:
>> >>>
>> >>> rw 0x00007f2590000000 (65500KB)
>> >>>     0x00007f2593ff7000 (36KB)
>> >>>
>> >>>
>> >>> Looks like HLog allocates memory (looks like HLog, becase it is very
>> >> similar
>> >>> size)
>> >>>
>> >>> If we count this blocks we get amount of lost memory:
>> >>>
>> >>> 65M * 32 + 132M = 2212M
>> >>>
>> >>> So, it looks like HLog allcates to many memory, and question is: how
>> to
>> >>> restrict it?
>> >>>
>> >>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>> >>>
>> >>>> Hi All.
>> >>>>
>> >>>> After heavy load into hbase (single node, nondistributed test system)
>> I
>> >> got
>> >>>> 4Gb process size of my HBase java process.
>> >>>> On 6GB machine there was no room for anything else (disk cache and so
>> >> on).
>> >>>> Does anybody knows, what is going on, and how you solve this. What
>> heap
>> >>>> memory is set on you hosts
>> >>>> and how much of RSS hbase process actually use.
>> >>>>
>> >>>> I don't see such things before, all tomcat and other java apps don't
>> >> eats
>> >>>> significally more memory then -Xmx.
>> >>>>
>> >>>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster
>> >>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version
>> >>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>> >>  Uptime:   12
>> >>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
>> compiler:
>> >>   HotSpot
>> >>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>> >>>> ------------------------------
>> >>>>    Current heap size:     703 903 kbytes   Maximum heap size:   2 030
>> >> 976kbytes    Committed memory:
>> >>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>> >>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent =
>> 5
>> >>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
>> Collections
>> >> =
>> >>>> 20, Total time spent = 35,754 seconds
>> >>>> ------------------------------
>> >>>>    Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:   amd64
>> >>  Number of processors:
>> >>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>> >>>> memory:   6 815 744 kbytes   Free physical memory:      82 720 kbytes
>> >>  Total swap space:
>> >>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>
>>
>
>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Andrey Stepachev <oc...@gmail.com>.
I tried to set MALLOC_ARENA_MAX=2, but I still see the same issue as in the LZO
problem thread. All those 65M blocks are here, and the JVM continues to eat
memory under heavy write load. And yes, I use the "improved" kernel,
Linux 2.6.34.7-0.5.
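
(A sketch of a sanity check, not something reported in this thread: glibc only
reads MALLOC_ARENA_MAX when the process starts, so it is worth confirming the
variable actually reached the environment of the running JVM.)

# hbase-env.sh, or whatever shell launches the daemon, must export the variable
export MALLOC_ARENA_MAX=2

# after a restart, check the running process' environment (<pid> is a placeholder)
tr '\0' '\n' < /proc/<pid>/environ | grep MALLOC_ARENA_MAX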

2011/1/11 Xavier Stevens <xs...@mozilla.com>

> Are you using a newer linux kernel with the new and "improved" memory
> allocator?
>
> If so try setting this in hadoop-env.sh:
>
> export MALLOC_ARENA_MAX=<number of cores you want to use>
>
> Maybe start by setting it to 4.  You can thank Todd Lipcon if this works
> for you.
>
> Cheers,
>
>
> -Xavier
>
> On 1/11/11 7:24 AM, Andrey Stepachev wrote:
> > No. I don't use LZO. I tried even remove any native support (i.e. all .so
> > from class path)
> > and use java gzip. But nothing.
> >
> >
> > 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
> >
> >> Are you using LZO by any chance? If so, which version?
> >>
> >> Friso
> >>
> >>
> >> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
> >>
> >>> After starting hbase under JRockit, I found the same memory leakage.
> >>>
> >>> After the launch
> >>>
> >>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> >>> Tue Jan 11 16:49:31 2011
> >>>
> >>>   11 16:49:31 MSK 2011
> >>>   PID RSS VSZ% CPU
> >>>  7863 2547760 5576744 78.7
> >>>
> >>>
> >>>
> >>> JR dumps:
> >>>
> >>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
> >>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
> >>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
> >>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
> >>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
> >>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
> >>> KB # 20)
> >>>
> >>>
> >>>
> >>> After running the mr which make high write load (~1hour)
> >>>
> >>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> >>> Tue Jan 11 17:08:56 2011
> >>>
> >>>   11 17:08:56 MSK 2011
> >>>   PID RSS VSZ% CPU
> >>>  7863 4072396 5459572 100
> >>>
> >>>
> >>>
> >>> JR said not important below specify why)
> >>>
> >>> http://paste.ubuntu.com/552820/
> >>> <http://paste.ubuntu.com/552820/>
> >>>
> >>>
> >>> 7863:
> >>> Total mapped                  5742628KB +165888KB (reserved=1144000KB
> >>> -1532404KB)
> >>> -              Java heap      2048000KB           (reserved=0KB
> >> -1472176KB)
> >>> -              GC tables        68512KB
> >>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
> >>> -          Compiled code      1048576KB           (used=3376KB +776KB)
> >>> -               Internal         1480KB    +256KB
> >>> -                     OS       517944KB  -31744KB
> >>> -                  Other      1996792KB +195816KB
> >>> -            Classblocks         1280KB           (malloced=1156KB
> >>> +45KB #3421 +136)
> >>> -        Java class data        20992KB    +768KB (malloced=20843KB
> >>> +840KB #15774 +640 in 3421 classes)
> >>> - Native memory tracking         1024KB           (malloced=325KB +10KB
> >> #20)
> >>>
> >>>
> >>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>    OS                          *java    r x 0x0000000000400000.(
> >> 76KB)
> >>>    OS                          *java    rw  0x0000000000612000 (
> >>  4KB)
> >>>    OS                        *[heap]    rw  0x0000000000613000.(
> >> 478712KB)
> >>>   INT                           Poll    r   0x000000007fffe000 (
> >>  4KB)
> >>>   INT                         Membar    rw  0x000000007ffff000.(
> >>  4KB)
> >>>   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
> >>  768KB)
> >>>   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
> >>  512KB)
> >>>  HEAP                      Java heap    rw
> >>  0x0000000083000000.(2048000KB)
> >>>                                         rw  0x00007f2574000000 (
> >>  65500KB)
> >>>                                             0x00007f2577ff7000.(
> >> 36KB)
> >>>                                         rw  0x00007f2584000000 (
> >>  65492KB)
> >>>                                             0x00007f2587ff5000.(
> >> 44KB)
> >>>                                         rw  0x00007f258c000000 (
> >>  65500KB)
> >>>                                             0x00007f258fff7000 (
> >> 36KB)
> >>>                                         rw  0x00007f2590000000 (
> >>  65500KB)
> >>>                                             0x00007f2593ff7000 (
> >> 36KB)
> >>>                                         rw  0x00007f2594000000 (
> >>  65500KB)
> >>>                                             0x00007f2597ff7000 (
> >> 36KB)
> >>>                                         rw  0x00007f2598000000 (
> >> 131036KB)
> >>>                                             0x00007f259fff7000 (
> >> 36KB)
> >>>                                         rw  0x00007f25a0000000 (
> >>  65528KB)
> >>>                                             0x00007f25a3ffe000 (
> >>  8KB)
> >>>                                         rw  0x00007f25a4000000 (
> >>  65496KB)
> >>>                                             0x00007f25a7ff6000 (
> >> 40KB)
> >>>                                         rw  0x00007f25a8000000 (
> >>  65496KB)
> >>>                                             0x00007f25abff6000 (
> >> 40KB)
> >>>                                         rw  0x00007f25ac000000 (
> >>  65504KB)
> >>>
> >>>
> >>> So, the difference was in the pieces of memory like this:
> >>>
> >>> rw 0x00007f2590000000 (65500KB)
> >>>     0x00007f2593ff7000 (36KB)
> >>>
> >>>
> >>> Looks like HLog allocates memory (looks like HLog, becase it is very
> >> similar
> >>> size)
> >>>
> >>> If we count this blocks we get amount of lost memory:
> >>>
> >>> 65M * 32 + 132M = 2212M
> >>>
> >>> So, it looks like HLog allcates to many memory, and question is: how to
> >>> restrict it?
> >>>
> >>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
> >>>
> >>>> Hi All.
> >>>>
> >>>> After heavy load into hbase (single node, nondistributed test system)
> I
> >> got
> >>>> 4Gb process size of my HBase java process.
> >>>> On 6GB machine there was no room for anything else (disk cache and so
> >> on).
> >>>> Does anybody knows, what is going on, and how you solve this. What
> heap
> >>>> memory is set on you hosts
> >>>> and how much of RSS hbase process actually use.
> >>>>
> >>>> I don't see such things before, all tomcat and other java apps don't
> >> eats
> >>>> significally more memory then -Xmx.
> >>>>
> >>>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster
> >>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version
> >>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
> >>  Uptime:   12
> >>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT
> compiler:
> >>   HotSpot
> >>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
> >>>> ------------------------------
> >>>>    Current heap size:     703 903 kbytes   Maximum heap size:   2 030
> >> 976kbytes    Committed memory:
> >>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
> >>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent =
> 5
> >>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep',
> Collections
> >> =
> >>>> 20, Total time spent = 35,754 seconds
> >>>> ------------------------------
> >>>>    Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:   amd64
> >>  Number of processors:
> >>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
> >>>> memory:   6 815 744 kbytes   Free physical memory:      82 720 kbytes
> >>  Total swap space:
> >>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
> >>>>
> >>>>
> >>>>
> >>>>
> >>
>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Xavier Stevens <xs...@mozilla.com>.
Are you using a newer linux kernel with the new and "improved" memory
allocator?

If so, try setting this in hadoop-env.sh:

export MALLOC_ARENA_MAX=<number of cores you want to use>

Maybe start by setting it to 4.  You can thank Todd Lipcon if this works
for you.

Cheers,


-Xavier
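
(A rough diagnostic sketch to go with this suggestion: counting the ~64 MB
anonymous mappings that the glibc arena allocator creates. The size pattern is
an assumption based on the 65xxxKB blocks reported earlier in the thread, and
<pid> is a placeholder.)

# count mappings in the 64000K-65999K range for the HBase process
pmap <pid> | grep -c '6[45][0-9][0-9][0-9]K'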

On 1/11/11 7:24 AM, Andrey Stepachev wrote:
> No. I don't use LZO. I tried even remove any native support (i.e. all .so
> from class path)
> and use java gzip. But nothing.
>
>
> 2011/1/11 Friso van Vollenhoven <fv...@xebia.com>
>
>> Are you using LZO by any chance? If so, which version?
>>
>> Friso
>>
>>
>> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>>
>>> After starting hbase under JRockit, I found the same memory leakage.
>>>
>>> After the launch
>>>
>>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>>> Tue Jan 11 16:49:31 2011
>>>
>>>   11 16:49:31 MSK 2011
>>>   PID RSS VSZ% CPU
>>>  7863 2547760 5576744 78.7
>>>
>>>
>>>
>>> JR dumps:
>>>
>>> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
>>> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
>>> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
>>> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
>>> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
>>> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
>>> KB # 20)
>>>
>>>
>>>
>>> After running the mr which make high write load (~1hour)
>>>
>>> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
>>> Tue Jan 11 17:08:56 2011
>>>
>>>   11 17:08:56 MSK 2011
>>>   PID RSS VSZ% CPU
>>>  7863 4072396 5459572 100
>>>
>>>
>>>
>>> JR said not important below specify why)
>>>
>>> http://paste.ubuntu.com/552820/
>>> <http://paste.ubuntu.com/552820/>
>>>
>>>
>>> 7863:
>>> Total mapped                  5742628KB +165888KB (reserved=1144000KB
>>> -1532404KB)
>>> -              Java heap      2048000KB           (reserved=0KB
>> -1472176KB)
>>> -              GC tables        68512KB
>>> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
>>> -          Compiled code      1048576KB           (used=3376KB +776KB)
>>> -               Internal         1480KB    +256KB
>>> -                     OS       517944KB  -31744KB
>>> -                  Other      1996792KB +195816KB
>>> -            Classblocks         1280KB           (malloced=1156KB
>>> +45KB #3421 +136)
>>> -        Java class data        20992KB    +768KB (malloced=20843KB
>>> +840KB #15774 +640 in 3421 classes)
>>> - Native memory tracking         1024KB           (malloced=325KB +10KB
>> #20)
>>>
>>>
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    OS                          *java    r x 0x0000000000400000.(
>> 76KB)
>>>    OS                          *java    rw  0x0000000000612000 (
>>  4KB)
>>>    OS                        *[heap]    rw  0x0000000000613000.(
>> 478712KB)
>>>   INT                           Poll    r   0x000000007fffe000 (
>>  4KB)
>>>   INT                         Membar    rw  0x000000007ffff000.(
>>  4KB)
>>>   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>>  768KB)
>>>   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>>  512KB)
>>>  HEAP                      Java heap    rw
>>  0x0000000083000000.(2048000KB)
>>>                                         rw  0x00007f2574000000 (
>>  65500KB)
>>>                                             0x00007f2577ff7000.(
>> 36KB)
>>>                                         rw  0x00007f2584000000 (
>>  65492KB)
>>>                                             0x00007f2587ff5000.(
>> 44KB)
>>>                                         rw  0x00007f258c000000 (
>>  65500KB)
>>>                                             0x00007f258fff7000 (
>> 36KB)
>>>                                         rw  0x00007f2590000000 (
>>  65500KB)
>>>                                             0x00007f2593ff7000 (
>> 36KB)
>>>                                         rw  0x00007f2594000000 (
>>  65500KB)
>>>                                             0x00007f2597ff7000 (
>> 36KB)
>>>                                         rw  0x00007f2598000000 (
>> 131036KB)
>>>                                             0x00007f259fff7000 (
>> 36KB)
>>>                                         rw  0x00007f25a0000000 (
>>  65528KB)
>>>                                             0x00007f25a3ffe000 (
>>  8KB)
>>>                                         rw  0x00007f25a4000000 (
>>  65496KB)
>>>                                             0x00007f25a7ff6000 (
>> 40KB)
>>>                                         rw  0x00007f25a8000000 (
>>  65496KB)
>>>                                             0x00007f25abff6000 (
>> 40KB)
>>>                                         rw  0x00007f25ac000000 (
>>  65504KB)
>>>
>>>
>>> So, the difference was in the pieces of memory like this:
>>>
>>> rw 0x00007f2590000000 (65500KB)
>>>     0x00007f2593ff7000 (36KB)
>>>
>>>
>>> Looks like HLog allocates memory (looks like HLog, becase it is very
>> similar
>>> size)
>>>
>>> If we count this blocks we get amount of lost memory:
>>>
>>> 65M * 32 + 132M = 2212M
>>>
>>> So, it looks like HLog allcates to many memory, and question is: how to
>>> restrict it?
>>>
>>> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
>>>
>>>> Hi All.
>>>>
>>>> After heavy load into hbase (single node, nondistributed test system) I
>> got
>>>> 4Gb process size of my HBase java process.
>>>> On 6GB machine there was no room for anything else (disk cache and so
>> on).
>>>> Does anybody knows, what is going on, and how you solve this. What heap
>>>> memory is set on you hosts
>>>> and how much of RSS hbase process actually use.
>>>>
>>>> I don't see such things before, all tomcat and other java apps don't
>> eats
>>>> significally more memory then -Xmx.
>>>>
>>>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster
>>>> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version
>>>> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>>  Uptime:   12
>>>> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT compiler:
>>   HotSpot
>>>> 64-Bit Server Compiler   Total compile time:   19,223 seconds
>>>> ------------------------------
>>>>    Current heap size:     703 903 kbytes   Maximum heap size:   2 030
>> 976kbytes    Committed memory:
>>>> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
>>>> collector:   Name = 'ParNew', Collections = 9 990, Total time spent = 5
>>>> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep', Collections
>> =
>>>> 20, Total time spent = 35,754 seconds
>>>> ------------------------------
>>>>    Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:   amd64
>>  Number of processors:
>>>> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
>>>> memory:   6 815 744 kbytes   Free physical memory:      82 720 kbytes
>>  Total swap space:
>>>> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
>>>>
>>>>
>>>>
>>>>
>>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Andrey Stepachev <oc...@gmail.com>.
No, I don't use LZO. I even tried removing all native support (i.e. all .so
files from the class path)
and using java gzip. But nothing changed.


2011/1/11 Friso van Vollenhoven <fv...@xebia.com>

> Are you using LZO by any chance? If so, which version?
>
> Friso
>
>
> On 11 jan 2011, at 15:57, Andrey Stepachev wrote:
>
> > After starting hbase under JRockit, I found the same memory leakage.
> >
> > After the launch
> >
> > Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> > Tue Jan 11 16:49:31 2011
> >
> >   11 16:49:31 MSK 2011
> >   PID RSS VSZ% CPU
> >  7863 2547760 5576744 78.7
> >
> >
> >
> > JR dumps:
> >
> > Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
> > (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
> > threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
> > 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
> > = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
> > in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
> > KB # 20)
> >
> >
> >
> > After running the mr which make high write load (~1hour)
> >
> > Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> > Tue Jan 11 17:08:56 2011
> >
> >   11 17:08:56 MSK 2011
> >   PID RSS VSZ% CPU
> >  7863 4072396 5459572 100
> >
> >
> >
> > JR said not important below specify why)
> >
> > http://paste.ubuntu.com/552820/
> > <http://paste.ubuntu.com/552820/>
> >
> >
> > 7863:
> > Total mapped                  5742628KB +165888KB (reserved=1144000KB
> > -1532404KB)
> > -              Java heap      2048000KB           (reserved=0KB
> -1472176KB)
> > -              GC tables        68512KB
> > -          Thread stacks        38028KB    +792KB (#threads=114 +3)
> > -          Compiled code      1048576KB           (used=3376KB +776KB)
> > -               Internal         1480KB    +256KB
> > -                     OS       517944KB  -31744KB
> > -                  Other      1996792KB +195816KB
> > -            Classblocks         1280KB           (malloced=1156KB
> > +45KB #3421 +136)
> > -        Java class data        20992KB    +768KB (malloced=20843KB
> > +840KB #15774 +640 in 3421 classes)
> > - Native memory tracking         1024KB           (malloced=325KB +10KB
> #20)
> >
> >
> >
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >    OS                          *java    r x 0x0000000000400000.(
> 76KB)
> >    OS                          *java    rw  0x0000000000612000 (
>  4KB)
> >    OS                        *[heap]    rw  0x0000000000613000.(
> 478712KB)
> >   INT                           Poll    r   0x000000007fffe000 (
>  4KB)
> >   INT                         Membar    rw  0x000000007ffff000.(
>  4KB)
> >   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (
>  768KB)
> >   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (
>  512KB)
> >  HEAP                      Java heap    rw
>  0x0000000083000000.(2048000KB)
> >                                         rw  0x00007f2574000000 (
>  65500KB)
> >                                             0x00007f2577ff7000.(
> 36KB)
> >                                         rw  0x00007f2584000000 (
>  65492KB)
> >                                             0x00007f2587ff5000.(
> 44KB)
> >                                         rw  0x00007f258c000000 (
>  65500KB)
> >                                             0x00007f258fff7000 (
> 36KB)
> >                                         rw  0x00007f2590000000 (
>  65500KB)
> >                                             0x00007f2593ff7000 (
> 36KB)
> >                                         rw  0x00007f2594000000 (
>  65500KB)
> >                                             0x00007f2597ff7000 (
> 36KB)
> >                                         rw  0x00007f2598000000 (
> 131036KB)
> >                                             0x00007f259fff7000 (
> 36KB)
> >                                         rw  0x00007f25a0000000 (
>  65528KB)
> >                                             0x00007f25a3ffe000 (
>  8KB)
> >                                         rw  0x00007f25a4000000 (
>  65496KB)
> >                                             0x00007f25a7ff6000 (
> 40KB)
> >                                         rw  0x00007f25a8000000 (
>  65496KB)
> >                                             0x00007f25abff6000 (
> 40KB)
> >                                         rw  0x00007f25ac000000 (
>  65504KB)
> >
> >
> >
> > So, the difference was in the pieces of memory like this:
> >
> > rw 0x00007f2590000000 (65500KB)
> >     0x00007f2593ff7000 (36KB)
> >
> >
> > Looks like HLog allocates memory (looks like HLog, becase it is very
> similar
> > size)
> >
> > If we count this blocks we get amount of lost memory:
> >
> > 65M * 32 + 132M = 2212M
> >
> > So, it looks like HLog allcates to many memory, and question is: how to
> > restrict it?
> >
> > 2010/12/30 Andrey Stepachev <oc...@gmail.com>
> >
> >> Hi All.
> >>
> >> After heavy load into hbase (single node, nondistributed test system) I
> got
> >> 4Gb process size of my HBase java process.
> >> On 6GB machine there was no room for anything else (disk cache and so
> on).
> >>
> >> Does anybody knows, what is going on, and how you solve this. What heap
> >> memory is set on you hosts
> >> and how much of RSS hbase process actually use.
> >>
> >> I don't see such things before, all tomcat and other java apps don't
> eats
> >> significally more memory then -Xmx.
> >>
> >> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster
> >> start   Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version
> >> 17.1-b03   Vendor:   Sun Microsystems Inc.   Name:   23476@mars
>  Uptime:   12
> >> hours 4 minutes   Process CPU time:   5 hours 45 minutes   JIT compiler:
>   HotSpot
> >> 64-Bit Server Compiler   Total compile time:   19,223 seconds
> >> ------------------------------
> >>    Current heap size:     703 903 kbytes   Maximum heap size:   2 030
> 976kbytes    Committed memory:
> >> 2 030 976 kbytes   Pending finalization:   0 objects      Garbage
> >> collector:   Name = 'ParNew', Collections = 9 990, Total time spent = 5
> >> minutes   Garbage collector:   Name = 'ConcurrentMarkSweep', Collections
> =
> >> 20, Total time spent = 35,754 seconds
> >> ------------------------------
> >>    Operating System:   Linux 2.6.34.7-0.5-xen   Architecture:   amd64
>  Number of processors:
> >> 8   Committed virtual memory:   4 403 512 kbytes     Total physical
> >> memory:   6 815 744 kbytes   Free physical memory:      82 720 kbytes
>  Total swap space:
> >> 8 393 924 kbytes   Free swap space:   8 050 880 kbytes
> >>
> >>
> >>
> >>
>
>

Re: Java Commited Virtual Memory significally larged then Heap Memory

Posted by Friso van Vollenhoven <fv...@xebia.com>.
Are you using LZO by any chance? If so, which version?

Friso


On 11 jan 2011, at 15:57, Andrey Stepachev wrote:

> After starting hbase under JRockit, I found the same memory leakage.
> 
> After the launch
> 
> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> Tue Jan 11 16:49:31 2011
> 
>   11 16:49:31 MSK 2011
>   PID RSS VSZ% CPU
>  7863 2547760 5576744 78.7
> 
> 
> 
> JR dumps:
> 
> Total mapped 5576740KB (reserved = 2676404KB) - Java heap 2048000KB
> (reserved = 1472176KB) - GC tables 68512KB - Thread stacks 37236KB (#
> threads = 111) - Compiled code 1048576KB (used = 2599KB) - Internal
> 1224KB - OS 549688KB - Other 1800976KB - Classblocks 1280KB (malloced
> = 1110KB # 3285) - Java class data 20224KB (malloced = 20002KB # 15134
> in 3285 classes) - Native memory tracking 1024KB (malloced = 325KB +10
> KB # 20)
> 
> 
> 
> After running the mr which make high write load (~1hour)
> 
> Every 2,0s: date && ps --sort=-rss -eo pid,rss,vsz,pcpu | head
> Tue Jan 11 17:08:56 2011
> 
>   11 17:08:56 MSK 2011
>   PID RSS VSZ% CPU
>  7863 4072396 5459572 100
> 
> 
> 
> JR said (not important, below I specify why):
> 
> http://paste.ubuntu.com/552820/
> <http://paste.ubuntu.com/552820/>
> 
> 
> 7863:
> Total mapped                  5742628KB +165888KB (reserved=1144000KB
> -1532404KB)
> -              Java heap      2048000KB           (reserved=0KB -1472176KB)
> -              GC tables        68512KB
> -          Thread stacks        38028KB    +792KB (#threads=114 +3)
> -          Compiled code      1048576KB           (used=3376KB +776KB)
> -               Internal         1480KB    +256KB
> -                     OS       517944KB  -31744KB
> -                  Other      1996792KB +195816KB
> -            Classblocks         1280KB           (malloced=1156KB
> +45KB #3421 +136)
> -        Java class data        20992KB    +768KB (malloced=20843KB
> +840KB #15774 +640 in 3421 classes)
> - Native memory tracking         1024KB           (malloced=325KB +10KB #20)
> 
> 
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>    OS                          *java    r x 0x0000000000400000.(     76KB)
>    OS                          *java    rw  0x0000000000612000 (      4KB)
>    OS                        *[heap]    rw  0x0000000000613000.( 478712KB)
>   INT                           Poll    r   0x000000007fffe000 (      4KB)
>   INT                         Membar    rw  0x000000007ffff000.(      4KB)
>   MSP              Classblocks (1/2)    rw  0x0000000082ec0000 (    768KB)
>   MSP              Classblocks (2/2)    rw  0x0000000082f80000 (    512KB)
>  HEAP                      Java heap    rw  0x0000000083000000.(2048000KB)
>                                         rw  0x00007f2574000000 (  65500KB)
>                                             0x00007f2577ff7000.(     36KB)
>                                         rw  0x00007f2584000000 (  65492KB)
>                                             0x00007f2587ff5000.(     44KB)
>                                         rw  0x00007f258c000000 (  65500KB)
>                                             0x00007f258fff7000 (     36KB)
>                                         rw  0x00007f2590000000 (  65500KB)
>                                             0x00007f2593ff7000 (     36KB)
>                                         rw  0x00007f2594000000 (  65500KB)
>                                             0x00007f2597ff7000 (     36KB)
>                                         rw  0x00007f2598000000 ( 131036KB)
>                                             0x00007f259fff7000 (     36KB)
>                                         rw  0x00007f25a0000000 (  65528KB)
>                                             0x00007f25a3ffe000 (      8KB)
>                                         rw  0x00007f25a4000000 (  65496KB)
>                                             0x00007f25a7ff6000 (     40KB)
>                                         rw  0x00007f25a8000000 (  65496KB)
>                                             0x00007f25abff6000 (     40KB)
>                                         rw  0x00007f25ac000000 (  65504KB)
> 
> 
> 
> So, the difference was in the pieces of memory like this:
> 
> rw 0x00007f2590000000 (65500KB)
>     0x00007f2593ff7000 (36KB)
> 
> 
> It looks like HLog allocates this memory (it looks like HLog because the size
> is very similar).
> 
> If we count these blocks, we get the amount of lost memory:
> 
> 65M * 32 + 132M = 2212M
> 
> So it looks like HLog allocates too much memory, and the question is: how to
> restrict it?
> 
> 2010/12/30 Andrey Stepachev <oc...@gmail.com>
> 
>> Hi All.
>> 
>> After a heavy load into hbase (single node, nondistributed test system) I got
>> a 4Gb process size for my HBase java process.
>> On a 6GB machine there was no room for anything else (disk cache and so on).
>> 
>> Does anybody know what is going on, and how did you solve this? What heap
>> memory is set on your hosts,
>> and how much RSS does the hbase process actually use?
>> 
>> I haven't seen such things before; all tomcat and other java apps don't eat
>> significantly more memory than -Xmx.
>> 
>> Connection name:   pid: 23476 org.apache.hadoop.hbase.master.HMaster start
>> Virtual Machine:   Java HotSpot(TM) 64-Bit Server VM version 17.1-b03
>> Vendor:   Sun Microsystems Inc.
>> Name:   23476@mars
>> Uptime:   12 hours 4 minutes
>> Process CPU time:   5 hours 45 minutes
>> JIT compiler:   HotSpot 64-Bit Server Compiler
>> Total compile time:   19,223 seconds
>> ------------------------------
>> Current heap size:   703 903 kbytes
>> Maximum heap size:   2 030 976 kbytes
>> Committed memory:   2 030 976 kbytes
>> Pending finalization:   0 objects
>> Garbage collector:   Name = 'ParNew', Collections = 9 990, Total time spent = 5 minutes
>> Garbage collector:   Name = 'ConcurrentMarkSweep', Collections = 20, Total time spent = 35,754 seconds
>> ------------------------------
>> Operating System:   Linux 2.6.34.7-0.5-xen
>> Architecture:   amd64
>> Number of processors:   8
>> Committed virtual memory:   4 403 512 kbytes
>> Total physical memory:   6 815 744 kbytes
>> Free physical memory:   82 720 kbytes
>> Total swap space:   8 393 924 kbytes
>> Free swap space:   8 050 880 kbytes
>> 
>> 
>> 
>>