Posted to users@jackrabbit.apache.org by Daniel Hobi <d....@gmx.ch> on 2013/01/15 14:51:17 UTC

JVM hangs during startup (indexing)

Hi there

We are currently facing a problem using JCR 2.2.13: The JVM suddenly hangs during its startup. No exception / stacktrace is provided. It just hangs while (re)indexing our repository. 

repository information:
ca. 100 GB of data, consisting of PDF files of about 100 KB each

jvm options: (we only have 2GB ram available :-() 
-Xms512m -Xmx512m -XX:PermSize=64m -XX:MaxPermSize=64m 

repository configuration: (coming from older jcr versions) 
http://pastebin.com/Q9Jdqdai

jStack dump (-F was needed):
http://pastebin.com/REhk00RU

As you may have noticed we use yourkit to profile our application. Unfortunately, removing the agent does not change a thing, but one thing we noticed is that in every "JVM hang" the org.apache.commons.dbcp.PoolableConnection.close() is part of the stacktrace.

Any insights?

Thanks!
Daniel



RE: JVM hangs during startup (indexing)

Posted by Daniel Hobi <d....@gmx.ch>.
Hi Jukka

> Are you running an up-to-date JVM? The thread dump doesn't show any
> deadlocks so this sounds rather like a JVM hang, possibly triggered by
> some native font/graphics library issues we've encountered on a few
> occasions with PDF processing.

We are currently running 1.6.0_26... 

> Alternatively, increasing the amount of memory available to the process
> might help. Some PDFs may require lots of memory, which could cause
> unexpected OOM issues to bubble up.

The strange thing is that we don't get OOM exceptions.

> Finally, if you upgrade to Jackrabbit 2.4, you might want to try the
> forkJavaCommand option we added in JCR-2864. That puts the full text
> extraction tasks to separate background processes where they have no
> chance of breaking the main repository process in ways described above.

Good to know. We switched to 2.4.x once but immediately went back to 2.2.x
because of very (very) slow query results. We are using XPATH as query
language. Could that be the reason?

However, I'm still investigating this issue. At the moment, it seems that
our postgresql (8.4.14) could be the bad boy in the whole story. After
upgrading to 8.4.15, the indexing process is still running (since
yesterday).

Daniel 



-----Original Message-----
From: Jukka Zitting [mailto:jukka.zitting@gmail.com] 
Sent: Wednesday, 16 January 2013 13:12
To: Jackrabbit Users
Subject: Re: JVM hangs during startup (indexing)

Hi,

On Tue, Jan 15, 2013 at 4:35 PM, Daniel Hobi <d....@gmx.ch> wrote:
> ...and then suddenly (after about 61100)...
> 1. no more log output is created
> 2. yourkit cannot talk to the jvm process anymore (not responding)

Are you running an up-to-date JVM? The thread dump doesn't show any
deadlocks so this sounds rather like a JVM hang, possibly triggered by some
native font/graphics library issues we've encountered on a few occasions
with PDF processing.

Alternatively, increasing the amount of memory available to the process
might help. Some PDFs may require lots of memory, which could cause
unexpected OOM issues to bubble up.

Finally, if you upgrade to Jackrabbit 2.4, you might want to try the
forkJavaCommand option we added in JCR-2864. That puts the full text
extraction tasks to separate background processes where they have no chance
of breaking the main repository process in ways described above.

[1] https://issues.apache.org/jira/browse/JCR-2864

BR,

Jukka Zitting


Re: JVM hangs during startup (indexing)

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Jan 15, 2013 at 4:35 PM, Daniel Hobi <d....@gmx.ch> wrote:
> ...and then suddenly (after about 61100)...
> 1. no more log output is created
> 2. yourkit cannot talk to the jvm process anymore (not responding)

Are you running an up-to-date JVM? The thread dump doesn't show any
deadlocks so this sounds rather like a JVM hang, possibly triggered by
some native font/graphics library issues we've encountered on a few
occasions with PDF processing.

Alternatively, increasing the amount of memory available to the
process might help. Some PDFs may require lots of memory, which could
cause unexpected OOM issues to bubble up.

Finally, if you upgrade to Jackrabbit 2.4, you might want to try the
forkJavaCommand option we added in JCR-2864. That puts the full text
extraction tasks to separate background processes where they have no
chance of breaking the main repository process in ways described
above.

[1] https://issues.apache.org/jira/browse/JCR-2864

BR,

Jukka Zitting

Re: JVM hangs during startup (indexing)

Posted by Daniel Hobi <d....@gmx.ch>.
Hi Jukka

Thanks for having a look at this.

> It looks like the repository is simply busy reindexing (thread 14523
> is executing the createInitialIndex method).

So it is expected behaviour during Jackrabbit startup that...
1. log messages are periodically (every 1s) showing up:
INFO  o.a.j.core.query.lucene.MultiIndex - indexing... pathto/jcr:content (1500)
INFO  o.a.j.core.query.lucene.MultiIndex - indexing... pathto/jcr:content (1600)

2. Yourkit is able to periodically profile the CPU and memory usage.

...and then suddenly (after about 61100)...
1. no more log output is created
2. yourkit cannot talk to the jvm process anymore (not responding)

> That will take a long time to reindex... Assuming one second per PDF,
> that's roughly twelve days. Normally PDF parsing should be faster
> than that, but you're still probably looking at at least a few days'
> worth of reindexing.
Yes, I thought it would take a while :-) Honestly, this is not the part I am worried about. What worries me is that I cannot tell whether any progress is being made.
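
One way to tell a stalled reindex from a slow one is to watch the counter at the end of the MultiIndex log lines quoted above. The following is only a minimal sketch: the line format is assumed from the two log lines in this message, and the class name IndexProgress, its methods, and the idea of a fixed time gap are hypothetical helpers, not part of Jackrabbit.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IndexProgress {
    // Assumed format, taken from the log lines quoted in this thread:
    // INFO  o.a.j.core.query.lucene.MultiIndex - indexing... pathto/jcr:content (1500)
    private static final Pattern PROGRESS =
            Pattern.compile("MultiIndex - indexing\\.\\.\\. .*\\((\\d+)\\)\\s*$");

    /** Returns the node counter from a progress line, or empty for any other line. */
    public static OptionalLong parseCount(String line) {
        Matcher m = PROGRESS.matcher(line);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    /** True if the last progress line is older than the allowed gap (all values in milliseconds). */
    public static boolean isStalled(long lastProgressMillis, long nowMillis, long maxGapMillis) {
        return nowMillis - lastProgressMillis > maxGapMillis;
    }
}
```

A small watcher could feed each new log line to parseCount, remember the time of the last line that returned a value, and raise an alert once isStalled reports true. Since the messages normally appear about once per second, a gap of a few minutes would already be suspicious, although a single very large PDF might legitimately take longer.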

> Unfortunately the reindexing process currently doesn't run
> concurrently in the background, though there's been some discussion of
> the need to do something like that.
That would be nice to have :-)

Thanks!
Daniel

Re: JVM hangs during startup (indexing)

Posted by Daniel Hobi <d....@gmx.ch>.
Hi Grégory,

Thanks for your suggestion.

Unfortunately, increasing the heap is not an option for us. We have not hit an out-of-memory exception (yet?), and according to yourkit the memory usage still leaves some (not to say plenty of) space free.

Daniel

-------- Original Message --------
> Date: Tue, 15 Jan 2013 15:11:36 +0100
> From: "oliver.gregory@gmail.com" <ol...@gmail.com>
> To: users@jackrabbit.apache.org
> Subject: Re: JVM hangs during startup (indexing)

> Hello,
> 
> You have 2GB of ram available. Why don't you increase your jvm heap :
> -Xms512m -Xmx2000m ?
> 
> Grégory

Re: JVM hangs during startup (indexing)

Posted by "oliver.gregory@gmail.com" <ol...@gmail.com>.
Hello,

You have 2GB of ram available. Why don't you increase your jvm heap :
-Xms512m -Xmx2000m ?

Grégory

Re: JVM hangs during startup (indexing)

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Jan 15, 2013 at 3:51 PM, Daniel Hobi <d....@gmx.ch> wrote:
> We are currently facing a problem using JCR 2.2.13: The JVM suddenly hangs during its startup.
> No exception / stacktrace is provided. It just hangs while (re)indexing our repository.

It looks like the repository is simply busy reindexing (thread 14523
is executing the createInitialIndex method).

> repository information:
> ca. 100 GB of data, consisting of PDF files of about 100 KB each

That will take a long time to reindex... Assuming one second per PDF,
that's roughly twelve days. Normally PDF parsing should be faster
than that, but you're still probably looking at at least a few days'
worth of reindexing.
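
For reference, the arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope calculation: it assumes decimal units (100 GB = 10^11 bytes, 100 KB = 10^5 bytes), strictly sequential extraction, and the one-second-per-PDF figure from above. The class and method names are made up for illustration.

```java
import java.time.Duration;

public class ReindexEstimate {
    /** Rough number of files, given total repository size and average file size in bytes. */
    public static long estimateFileCount(long totalBytes, long avgFileBytes) {
        return totalBytes / avgFileBytes;
    }

    /** Total time if files are processed one after another at a fixed cost per file. */
    public static Duration estimateDuration(long fileCount, Duration perFile) {
        return perFile.multipliedBy(fileCount);
    }

    public static void main(String[] args) {
        long totalBytes = 100L * 1000 * 1000 * 1000; // ca. 100 GB (decimal units assumed)
        long avgFileBytes = 100L * 1000;             // ca. 100 KB per PDF
        long files = estimateFileCount(totalBytes, avgFileBytes);
        Duration total = estimateDuration(files, Duration.ofSeconds(1)); // assumed 1 s per PDF
        System.out.printf("%d files, about %.1f days%n", files, total.getSeconds() / 86400.0);
    }
}
```

With these inputs the sketch arrives at one million files and a little under twelve days, which is where the "roughly twelve days" comes from. At a tenth of a second per PDF the same calculation gives a bit more than one day.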

Unfortunately the reindexing process currently doesn't run
concurrently in the background, though there's been some discussion of
the need to do something like that.

BR,

Jukka Zitting