Posted to users@jackrabbit.apache.org by Vijay Pandey <VP...@mdes.ms.gov> on 2009/01/05 17:48:43 UTC

Large DMS - 43 GB - Out Of Memory Errors

Hi,

 

We have been using Jackrabbit 1.0.1 for more than 2 years. Currently we are
facing a lot of issues related to Out Of Memory exceptions; the server goes
down around 3 times a day. These are the relevant details of the DMS.

 

a)       Jackrabbit 1.0.1 - set up as a standalone RMI server

b)       It's primarily used to store PDF documents (sizes vary from 50 KB to
100 KB per document) in binary format

c)       We have nightly Java batch programs that insert around 5000 to
6000 of these documents every day (a rough sketch of this kind of insert
appears after this list)

d)       During the day these documents are accessed by users from a
web application - we connect through RMI to the Jackrabbit server and fetch
the document

e)       The database used is DB2 and the documents (blobs) are stored on
the file system (the setting for externalBlobs is true)

f)         The total size of the repository has reached around 43 GB
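
For reference, the nightly insert is roughly of this shape. This is only a
simplified sketch, not our actual batch code; the node types, property names
and the flat parent node here are illustrative assumptions.

import java.io.FileInputStream;
import java.util.Calendar;
import javax.jcr.Node;
import javax.jcr.Session;

public class NightlyInsert {

    /** Stores one PDF as an nt:file child of the given parent node. */
    public static void storePdf(Session session, Node parent,
                                String name, String pdfPath) throws Exception {
        Node file = parent.addNode(name, "nt:file");
        Node content = file.addNode("jcr:content", "nt:resource");
        content.setProperty("jcr:mimeType", "application/pdf");
        content.setProperty("jcr:lastModified", Calendar.getInstance());
        FileInputStream in = new FileInputStream(pdfPath);
        try {
            content.setProperty("jcr:data", in);  // blob lands on the file system (externalBlobs)
        } finally {
            in.close();
        }
        session.save();
    }
}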

 

I would really appreciate it if you could throw some light on this and on
what can be done to resolve this issue.

 

I am also thinking of migrating to Jackrabbit 1.5. From what I have read,
other than various jar file changes it should be compatible with the 1.x
releases? I also read about the DataStore, but to use it do I have to do a
full export/import?

 

Thanks

Vijay


Re: Large DMS - 43 GB - Out Of Memory Errors

Posted by Alexander Klimetschek <ak...@day.com>.
On Thu, Jan 8, 2009 at 7:08 PM, Vijay Pandey <VP...@mdes.ms.gov> wrote:
> a) Root Node
>        OurContentMainNode
>                Node 1
>                        This node has around 80000 flat (non-hierarchical) child nodes

[...]

>                sun.rmi.transport.WeakRef --- has the size of around 19 MB
> with the following structure.
>
> sun.rmi.transport.WeakRef
>        org.apache.jackrabbit.rmi.server.ServerNode
>                org.apache.jackrabbit.core.NodeImpl
>                        org.apache.jackrabbit.core.NodeData
>                                org.apache.jackrabbit.core.state.NodeState
>
> org.apache.jackrabbit.core.state.ChildNodeEntries ----- this one occupied 19 MB

As Todd already mentioned, this one occupies 19 MB because it contains the
references to all those 80000 child nodes you have. This is a basic design
decision inside Jackrabbit that would take a lot of effort to change; it
optimizes performance for small numbers of child nodes. You can solve this
by introducing some extra levels into your content model.

Another benefit is that the repository becomes more easily explorable by
humans. Typical solutions are to use categories from the context of your
data, or date folders such as "2009/01/09".
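
As a minimal sketch of that idea (the parent node name, the nt:folder node
type and the yyyy/MM/dd layout are only assumptions, adjust them to your
content model), the batch job could resolve a date folder before adding each
document:

import java.text.SimpleDateFormat;
import java.util.Calendar;
import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;

public class DateFolders {

    /** Returns (creating it if necessary) the yyyy/MM/dd folder for the given date. */
    public static Node getDateFolder(Session session, Calendar date)
            throws RepositoryException {
        String path = new SimpleDateFormat("yyyy/MM/dd").format(date.getTime());
        Node parent = session.getRootNode().getNode("OurContentMainNode");
        for (String name : path.split("/")) {
            parent = parent.hasNode(name)
                    ? parent.getNode(name)
                    : parent.addNode(name, "nt:folder");
        }
        return parent;  // caller adds the document node here and then calls session.save()
    }
}

With roughly 5000-6000 documents per day, each day folder stays far below the
10K child node range that Todd mentioned.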

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

RE: Large DMS - 43 GB - Out Of Memory Errors

Posted by Vijay Pandey <VP...@mdes.ms.gov>.
Yes, it is more than that --- we do have performance issues, but currently
the system is running out of memory every 2-3 hours.

-----Original Message-----
From: Todd Seiber [mailto:todd.seiber@gmail.com] 
Sent: Thursday, January 08, 2009 12:15 PM
To: users@jackrabbit.apache.org
Subject: Re: Large DMS - 43 GB - Out Of Memory Errors

The Jackrabbit main site and wiki mention performance problems with over 10K
child nodes. Are you exceeding this number?



Re: Large DMS - 43 GB - Out Of Memory Errors

Posted by Todd Seiber <to...@gmail.com>.
The Jackrabbit main site and wiki mention performance problems with over 10K
child nodes. Are you exceeding this number?


-- 
Todd Seiber
830 Fishing Creek Rd.
New Cumberland, PA 17070

h. 717-938-5778
c. 717-497-1742
e. todd.seiber@gmail.com

RE: Large DMS - 43 GB - Out Of Memory Errors

Posted by Vijay Pandey <VP...@mdes.ms.gov>.
We upgraded to Jackrabbit 1.5 without any issue and tested 1.5 on a smaller
repository of around 20 GB.

This is how our content is set up:

a) Root Node
	OurContentMainNode
		Node 1
			This node has around 80000 flat (non-hierarchical)
			child nodes, and each of these child nodes has around
			5 to 20 PDF docs (sizes vary from 15 KB to 70 KB)
			along with several other properties (around 12
			string-type properties)
		Node 2
		Node 3
		Node 4
		Node 5

We are using Jackrabbit through RMI -- we have 2 separate applications
accessing Jackrabbit.

This is what we found out through the YourKit Java Profiler. We ran this in a
test environment with only 1 user connected to Jackrabbit.

a) When we execute this piece of code on the client side through RMI:
	
session.getRootNode().getNode("OurContentMainNode").getNode("Node1");
	
	Just on the above operation we saw a spike of around 37 MB of memory on
the heap.

After forcing garbage collection (through the YourKit profiler) we saw around
18 MB of memory being reclaimed, but there was still around 19 MB of memory
being used on the heap. We took a snapshot to see which objects had claimed so
much of the heap. This is what we found out.

		sun.rmi.transport.WeakRef --- has the size of around 19 MB,
with the following structure:

sun.rmi.transport.WeakRef
	org.apache.jackrabbit.rmi.server.ServerNode
		org.apache.jackrabbit.core.NodeImpl
			org.apache.jackrabbit.core.NodeData
				org.apache.jackrabbit.core.state.NodeState

org.apache.jackrabbit.core.state.ChildNodeEntries ----- this one occupied
19 MB
	org.apache.commons.collections.map.LinkedMap
	java.util.HashMap

After some more forced GCs, around 10 minutes later this memory was
reclaimed.


Is this issue just because we have too many (non-hierarchical) nodes, or is
it a combination of too many non-hierarchical nodes and the Jackrabbit RMI
usage?

Any hint on how we can get over this problem would be great.

Thanks
Vijay



Re: Large DMS - 43 GB - Out Of Memory Errors

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Jan 6, 2009 at 9:32 AM, Thomas Müller <th...@day.com> wrote:
> Upgrading is a good idea, you will have to do a full export/import however.

Only if you decide to start using the DataStore or bundle persistence
features, both of which should provide nice performance benefits in
this case.

But you should be able to upgrade to Jackrabbit 1.5 simply by
replacing all the relevant jar files (note that 1.5 uses quite a few
more jars than 1.0.1). Your existing configuration files and
repository data should work fine with Jackrabbit 1.5.
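
If you do decide on the export/import route later, the plain JCR API is
enough for it. Here is a minimal sketch (the paths and file names are
placeholders; for a 43 GB repository you would want to export one subtree at
a time rather than a single file):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import javax.jcr.ImportUUIDBehavior;
import javax.jcr.Session;

public class Migration {

    /** Dumps the subtree below /OurContentMainNode to a system view XML file. */
    public static void exportContent(Session oldSession) throws Exception {
        FileOutputStream out = new FileOutputStream("content.xml");
        try {
            // skipBinary=false keeps the PDF blobs, noRecurse=false exports the whole subtree
            oldSession.exportSystemView("/OurContentMainNode", out, false, false);
        } finally {
            out.close();
        }
    }

    /** Re-imports the dump into a repository configured with the DataStore. */
    public static void importContent(Session newSession) throws Exception {
        FileInputStream in = new FileInputStream("content.xml");
        try {
            newSession.importXML("/", in,
                    ImportUUIDBehavior.IMPORT_UUID_COLLISION_THROW);
            newSession.save();
        } finally {
            in.close();
        }
    }
}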

BR,

Jukka Zitting

Re: Large DMS - 43 GB - Out Of Memory Errors

Posted by Thomas Müller <th...@day.com>.
Hi,


> Currently we are facing a lot of issues related to Out Of Memory
> exceptions; the server goes down around 3 times a day.


Did you already analyze what uses so much memory? For example using jmap (if
you are using Java 1.5 or 1.6), or using the YourKit Java Profiler.

> I am also thinking of migrating to Jackrabbit 1.5. From what I have read,
> other than various jar file changes it should be compatible with the 1.x
> releases? I also read about the DataStore, but to use it do I have to do a
> full export/import?


Upgrading is a good idea, you will have to do a full export/import however.

Regards,
Thomas