Posted to dev@pdfbox.apache.org by "Florent Guillaume (JIRA)" <ji...@apache.org> on 2013/07/19 17:32:49 UTC

[jira] [Commented] (PDFBOX-1622) TextNormalize init not thread-safe, may lead to infinite loop

    [ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713771#comment-13713771 ] 

Florent Guillaume commented on PDFBOX-1622:
-------------------------------------------

Could a committer please commit this? The fix is trivial and a patch is attached.

                
> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
>                 Key: PDFBOX-1622
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1622
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.0.0
>            Reporter: Florent Guillaume
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two threads at the same time, the HashMap may be written concurrently. HashMap is not thread-safe, so this can corrupt its internal structure and cause infinite loops.
> We see the CPU at 100% and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
>    java.lang.Thread.State: RUNNABLE
> 	at java.util.HashMap.put(HashMap.java:391)
> 	at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
> 	at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
> 	at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached; it just moves the initialization to a static block.
> Please apply to the 1.8.3 and 2.0.0 branches.
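
For readers without the attachment at hand, the idea of moving the initialization to a static block can be sketched roughly as follows. This is not the attached patch: the class name DiacriticTable, the lookup method, and the two map entries are hypothetical and only illustrate the pattern. The point is that the JVM runs a static initializer once, under its class-initialization lock, so the map is fully built before any constructor can run.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TextNormalize; the class name, method name and
// map entries are assumptions made for illustration only.
class DiacriticTable {
    // Shared table, assigned exactly once during class initialization.
    private static final Map<Integer, String> DIACHASH;

    static {
        // The JVM runs this block once, before any instance can be created,
        // so no two threads ever write to the map concurrently.
        Map<Integer, String> map = new HashMap<Integer, String>();
        map.put(0x0060, "\u0300"); // example entry: grave accent
        map.put(0x00B4, "\u0301"); // example entry: acute accent
        // Wrapping the map makes any later write fail fast instead of racing.
        DIACHASH = Collections.unmodifiableMap(map);
    }

    // The constructor no longer touches the shared map.
    DiacriticTable() {
    }

    // Read-only lookup; returns null when the code point has no entry.
    String lookup(int codePoint) {
        return DIACHASH.get(codePoint);
    }
}
```

With this shape, any number of threads can call new DiacriticTable() at the same time, because construction only reads a map that was completed during class loading.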

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira