Posted to dev@pdfbox.apache.org by "Florent Guillaume (JIRA)" <ji...@apache.org> on 2013/07/19 17:32:49 UTC
[jira] [Commented] (PDFBOX-1622) TextNormalize init not thread-safe, may lead to infinite loop
[ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713771#comment-13713771 ]
Florent Guillaume commented on PDFBOX-1622:
-------------------------------------------
Could a committer please commit this? The fix is trivial and a patch is attached.
> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
> Key: PDFBOX-1622
> URL: https://issues.apache.org/jira/browse/PDFBOX-1622
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.0.0
> Reporter: Florent Guillaume
> Assignee: Andreas Lehmkühler
> Fix For: 1.8.3, 2.0.0
>
> Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two threads at the same time, the unsynchronized HashMap may be written concurrently, which can (and in our case does) cause infinite loops.
> We see the CPU at 100% and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
> java.lang.Thread.State: RUNNABLE
> at java.util.HashMap.put(HashMap.java:391)
> at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
> at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
> at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached; it just moves the initialization to a static block.
> Please apply to the 1.8.3 and 2.0.0 branches.
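
For readers without the attachment at hand, the idea of the fix can be sketched roughly as follows. This is not the contents of PDFBOX-1622.patch.txt and not the real TextNormalize source: the class name DiacriticTable, the method combiningFormOf, and the three map entries are made up for illustration. Only the pattern is the point: the shared map is filled in a static initializer, which the JVM runs once under its class-initialization lock, instead of from every constructor call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TextNormalize; not the actual PDFBox source.
final class DiacriticTable {

    // Filled exactly once during class initialization. The JVM serializes
    // class initialization, so no two threads can run this block at once,
    // and the map is never written again afterwards.
    private static final Map<Integer, String> DIACHASH;

    static {
        Map<Integer, String> m = new HashMap<Integer, String>();
        // Illustrative entries only: spacing diacritic -> combining form.
        m.put(0x0060, "\u0300"); // grave accent
        m.put(0x00B4, "\u0301"); // acute accent
        m.put(0x00A8, "\u0308"); // diaeresis
        DIACHASH = Collections.unmodifiableMap(m);
    }

    // The constructor no longer touches the shared map.
    DiacriticTable() {
    }

    // Read-only lookup; returns null when the code point has no entry.
    String combiningFormOf(int codePoint) {
        return DIACHASH.get(codePoint);
    }

    public static void main(String[] args) throws InterruptedException {
        // Construct instances from several threads at once, as in the
        // jstack scenario above; with static initialization none of them
        // writes to the map.
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    new DiacriticTable().combiningFormOf(0x00B4);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("all threads finished");
    }
}
```

Wrapping the map in Collections.unmodifiableMap is an extra precaution in this sketch, not something the issue asks for; it makes any later accidental write fail fast rather than reintroduce the race.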