Posted to dev@pdfbox.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2015/07/20 18:12:49 UTC

help debugging integration of PDFBox 2.0.0 trunk

All,
  While integrating 2.0.0 trunk into Tika and running against govdocs1, I'm finding two issues that are difficult to reproduce.

Background:
Tika-batch has a parent process that kicks off a Tika processor in a child process; if the child dies unexpectedly, the parent kicks it off again.  I'm running with 10 consumer/parser threads and -Xmx5g on an 8 cpu/8GB vm (RHEL 7, Linux cloud-server-02 3.10.0-123.20.1.el7.x86_64 #1 SMP Wed Jan 21 09:45:55 EST 2015 x86_64 x86_64 x86_64 GNU/Linux).

Two problems:

1)      The child process exits with value 1. I'm catching Throwable around the primary execution call in the child process and logging it; nothing shows up in the log files from that part of the code. From the parser log files (at trace), I can tell which 10 files were being processed at the time, but I'm not seeing any other information about what caused the exit.  When I run against just those 10 files, all is ok.

2)      The OS is killing the child far more often than it does with 1.8.9 (exit code 137).

For the second problem, I'll wait until the optimizations to the caching are completed before I start worrying about that.  However, do you have any recommendations on how to figure out what's going on with 1)?

Thank you!

             Cheers,

                   Tim


Re: help debugging integration of PDFBox 2.0.0 trunk

Posted by John Hewson <jo...@jahewson.com>.
> On 20 Jul 2015, at 13:45, Allison, Timothy B. <ta...@mitre.org> wrote:
> 
> 
>>> Xmx doesn't limit native memory, so if there's a leak associated with AWT, ImageIO C libraries, or some other JNI library, the process can grow without limit. Such a leak could be due to a bug, or us not calling close() somewhere.
> 
> Got it.  Ok.  Is there anything I can do to help figure out what's going on?

Try generating a heap dump with -XX:+HeapDumpOnOutOfMemoryError and attaching it
to a JIRA issue.

You might need to reduce your heap size if it’s huge, to make the dump smaller.
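
A minimal sketch of how a parent process might pass that flag to the child JVM, in case it helps. The class name, dump directory, and main class below are made up for illustration and are not Tika-batch's actual launcher code; -XX:HeapDumpPath is a standard HotSpot option. Note that the dump is only written when a java.lang.OutOfMemoryError is thrown inside the JVM, so it may never fire if the process is killed from outside.

```java
import java.util.ArrayList;
import java.util.List;

class ChildJvmCommand {

    // Builds the command line for a child JVM with heap dumps enabled.
    // All argument values are supplied by the caller; nothing here is
    // specific to Tika or PDFBox.
    static List<String> build(String javaBin, String maxHeap,
                              String dumpDir, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap);
        cmd.add("-XX:+HeapDumpOnOutOfMemoryError");
        cmd.add("-XX:HeapDumpPath=" + dumpDir);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        List<String> cmd = build("java", "1g", "/tmp/dumps", "com.example.ChildMain");
        System.out.println(String.join(" ", cmd));
        // To actually start the child:
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

A smaller -Xmx, as suggested above, keeps the resulting .hprof file closer to a size that can be attached to an issue.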

— John


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


RE: help debugging integration of PDFBox 2.0.0 trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
>>Xmx doesn't limit native memory, so if there's a leak associated with AWT, ImageIO C libraries, or some other JNI library, the process can grow without limit. Such a leak could be due to a bug, or us not calling close() somewhere.

Got it.  Ok.  Is there anything I can do to help figure out what's going on?



Re: help debugging integration of PDFBox 2.0.0 trunk

Posted by John Hewson <jo...@jahewson.com>.
> On 20 Jul 2015, at 10:27, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> I'm also having some problem with that system... with my test software, I have observed that java uses more and more space, despite it being told not to use more than a certain amount with -Xmx. After some time, the "process killer" kills the application.

Xmx doesn’t limit native memory, so if there’s a leak associated with AWT, ImageIO C libraries, or some other JNI library, the process can grow without limit. Such a leak could be due to a bug, or us not calling close() somewhere.
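
To illustrate the close() point: a minimal sketch, not PDFBox code. NativeBackedResource is a hypothetical stand-in for anything that holds native memory or file handles, and the counter is only there to make the leak visible. With try-with-resources the handle is released even if the work in between throws.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CloseSketch {

    // Number of resources currently open (hypothetical bookkeeping).
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for an object backed by native memory.
    static class NativeBackedResource implements AutoCloseable {
        NativeBackedResource() {
            OPEN.incrementAndGet();
        }

        void use() {
            // parsing or rendering work would happen here
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Forgets to call close(): the resource stays open.
    static void leaky() {
        NativeBackedResource r = new NativeBackedResource();
        r.use();
    }

    // try-with-resources calls close() on every exit path.
    static void safe() {
        try (NativeBackedResource r = new NativeBackedResource()) {
            r.use();
        }
    }

    public static void main(String[] args) {
        safe();
        System.out.println("open after safe():  " + OPEN.get());
        leaky();
        System.out.println("open after leaky(): " + OPEN.get());
    }
}
```

Memory held this way lives outside the Java heap, so -Xmx would not cap it.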

— John

> Seems something changed in java memory management:
> http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/
> 
> I did some investigation on this a few months ago, but gave up out of frustration.
> 
> Tilman
> 

RE: help debugging integration of PDFBox 2.0.0 trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
With ~125k files, there were 10 restarts: 7 with exit code=137, 2 with exit code=1, and 1 with exit code=253, which was a timeout for 111126.pdf.

A restart happens roughly every 8-10 minutes.

502907 2015-07-20 17:13:24,420 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=0 receivedRestartMessage=false)
986787 2015-07-20 17:21:28,300 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=253 numRestarts=1 receivedRestartMessage=false)
1574818 2015-07-20 17:31:16,331 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=2 receivedRestartMessage=false)
2040741 2015-07-20 17:39:02,254 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=3 receivedRestartMessage=false)
2545702 2015-07-20 17:47:27,215 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=4 receivedRestartMessage=false)
3084672 2015-07-20 17:56:26,185 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=5 receivedRestartMessage=false)
3571616 2015-07-20 18:04:33,129 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=1 numRestarts=6 receivedRestartMessage=false)
4021342 2015-07-20 18:12:02,855 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=1 numRestarts=7 receivedRestartMessage=false)
4503161 2015-07-20 18:20:04,674 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=8 receivedRestartMessage=false)
4958976 2015-07-20 18:27:40,489 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=9 receivedRestartMessage=false)
5437962 2015-07-20 18:35:39,475 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Hit the maximum number of process restarts. Driver is shutting down now.
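
The counts above can be pulled out of the driver log mechanically. A minimal sketch, assuming the log lines keep the "exitValue=N" form shown; the class name is made up. It also decodes values using the usual Unix convention that a status of 128+N means the process was terminated by signal N, so 137 corresponds to signal 9 (SIGKILL), which fits the kernel killing the child. The 253 value is not a signal under that convention; per the note above it is the timeout code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExitValueTally {

    private static final Pattern EXIT = Pattern.compile("exitValue=(\\d+)");

    // Counts how often each exit value appears in the given log lines.
    static Map<Integer, Integer> tally(List<String> lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = EXIT.matcher(line);
            if (m.find()) {
                counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Interprets an exit value using the 128+signal convention.
    // Linux signal numbers run from 1 to 64, so larger values are
    // treated as ordinary exit statuses.
    static String describe(int exitValue) {
        if (exitValue > 128 && exitValue <= 128 + 64) {
            int signal = exitValue - 128;
            return "terminated by signal " + signal + (signal == 9 ? " (SIGKILL)" : "");
        }
        return "exited with status " + exitValue;
    }

    public static void main(String[] args) {
        // Three lines copied from the log above.
        List<String> lines = Arrays.asList(
            "502907 2015-07-20 17:13:24,420 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=137 numRestarts=0 receivedRestartMessage=false)",
            "986787 2015-07-20 17:21:28,300 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=253 numRestarts=1 receivedRestartMessage=false)",
            "3571616 2015-07-20 18:04:33,129 [main] WARN  org.apache.tika.batch.BatchProcessDriverCLI  - Must restart process (exitValue=1 numRestarts=6 receivedRestartMessage=false)");
        for (Map.Entry<Integer, Integer> e : tally(lines).entrySet()) {
            System.out.println(e.getKey() + " x" + e.getValue() + ": " + describe(e.getKey()));
        }
    }
}
```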


RE: help debugging integration of PDFBox 2.0.0 trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Yes, sorry, Tilman.  I'm not running into problems with 1.8.9 and straight text extraction, though.

Following Timo's recommendation, it looks like a memory issue.  Let me know if I should post the full file or move to a more recent version of Java. :)

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 403177472 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
...
#  Out of Memory Error (os_linux.cpp:2798), pid=14958, tid=140419564971776
...
vm_info: OpenJDK 64-Bit Server VM (24.75-b04) for linux-amd64 JRE (1.7.0_75-b13), built on Jan 16 2015 09:15:47 by "mockbuild" with gcc 4.8.2 20140120 (Red Hat 4.8.2-16)




Re: help debugging integration of PDFBox 2.0.0 trunk

Posted by Tilman Hausherr <TH...@t-online.de>.
I'm also having some problems with that system. With my test software, 
I have observed that Java uses more and more memory, despite being 
told via -Xmx not to use more than a certain amount. After some time, 
the "process killer" kills the application.

It seems something changed in Java memory management:
http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/

I did some investigation on this a few months ago, but gave up out of 
frustration.

Tilman



RE: help debugging integration of PDFBox 2.0.0 trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Will modify.  Thank you.



Re: help debugging integration of PDFBox 2.0.0 trunk

Posted by Timo Boehme <ti...@ontochem.com>.
Hi Tim,

Is your Java configured to write crash dumps? With 1.x, I sometimes 
saw Java crash in a native font library. With 2.x and Java 1.7, 
I have also had crashes in a native Java library.
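
For reference, HotSpot writes a fatal error log named hs_err_pid<pid>.log into the JVM's working directory by default (the location can be changed with -XX:ErrorFile). A minimal sketch for collecting those files after a batch run, assuming the child was started in the directory passed in; the class and method names are only illustrative.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CrashLogFinder {

    // Returns the HotSpot fatal error logs found directly in workingDir,
    // sorted by path. Subdirectories are not searched.
    static List<Path> find(Path workingDir) throws IOException {
        List<Path> logs = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(workingDir, "hs_err_pid*.log")) {
            for (Path p : stream) {
                logs.add(p);
            }
        }
        Collections.sort(logs);
        return logs;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : find(dir)) {
            System.out.println(p);
        }
    }
}
```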

Best,
Timo


-- 
Timo Boehme
OntoChem IT Solutions GmbH
Blücherstraße 24
06120 Halle (Saale)
Germany

phone: +49 345 478 047 4      | fax: +49 345 478 047 1
email: ulf.laube@ontochem.com | web: www.ontochem.com
HRB 21962 Amtsgericht Stendal | USt-IdNr.: DE815563824
managing director : Lutz Weber

