Posted to users@pdfbox.apache.org by "Stahle, Patrick" <pa...@te.com> on 2016/06/27 18:55:41 UTC

threads using PDFBox getting stuck infinite loop

Hi,

We have a heavily multithreaded application that calls PDFBox to stamp certain PDF files. We have been in production for a little over a week and have run into a few threads getting stuck. The stack trace of a stuck thread is the following:

### Thread id=34, name="dispatch_2_20160626211454_1064"
# ThreadInfo: "dispatch_2_20160626211454_1064" Id=34 RUNNABLE
# CPU: threadCpuTime=49,541,548.824 ms, threadUserTime=49,538,533.066 ms
# Contention: blockedCount=33 , blockedTime=123 ms
# Contention: lockName=null , lockOwnerId=-1, lockOwnerName=null
java.util.HashMap.put(HashMap.java:473)
java.util.HashSet.add(HashSet.java:217)
java.util.AbstractCollection.addAll(AbstractCollection.java:334)
org.apache.pdfbox.pdmodel.font.encoding.Encoding.contains(Encoding.java:109)
org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:343)
org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:286)
org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:315)
com.tycoelectronics.emcs.stamppdf.StampEnginePDFBox.getLongestTextWidth(StampEnginePDFBox.java:1369)
:
:

We seem to be getting stuck in a HashMap used by the PDFont classes, and HashMap is an unsynchronized class.

Any suggestions?

We are using PDFBox 2.0.0 release...

Thanks,
Patrick

Re: threads using PDFBox getting stuck infinite loop

Posted by Tilman Hausherr <TH...@t-online.de>.

On 27.06.2016 at 22:01, Stahle, Patrick wrote:
> Hi Tilman,
>
> How would I call contains?

I meant putting this in your software before multithreading starts:

PDType1Font.HELVETICA.getEncoding().contains("space");
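
A minimal sketch of where such a warm-up call could sit, using only JDK classes so that it runs without PDFBox. LazyEncoding, SHARED and lookUpOnWorkers are hypothetical stand-ins for illustration. In the real application the warm-up line would be the PDType1Font call shown above, and the workers would be the stamping threads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WarmUpBeforeThreads {
    /** Hypothetical stand-in for an encoding with a lazily built, unsynchronized name cache. */
    static final class LazyEncoding {
        private final Map<Integer, String> codeToName = new TreeMap<>();
        private Set<String> names; // built on the first contains() call, not thread-safe

        LazyEncoding() {
            codeToName.put(32, "space");
            codeToName.put(65, "A");
        }

        boolean contains(String name) {
            if (names == null) {
                names = new HashSet<>(codeToName.values());
            }
            return names.contains(name);
        }
    }

    /** Shared instance, comparable to a shared standard 14 font constant. */
    static final LazyEncoding SHARED = new LazyEncoding();

    static List<Boolean> lookUpOnWorkers(int workers, String name) {
        // Warm-up on the calling thread: the cache is completely built here,
        // before any worker exists, so the workers only ever read it.
        SHARED.contains("space");

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                tasks.add(() -> SHARED.contains(name));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookUpOnWorkers(4, "A")); // prints [true, true, true, true]
    }
}
```

The warm-up only helps if it completes before the worker threads are started or handed their tasks. Submitting work to an executor after the call, as above, gives that ordering.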



>
>
> Right now I am just doing the following font wise:
> font = PDType1Font.HELVETICA;

Yes this is OK.

>
> Should I simply grab the fonts we use during single thread initialization code?

Can't follow you there.

What I also noticed is that each standard 14 font is allocating a 
WinAnsiEncoding object despite WinAnsiEncoding.INSTANCE already being 
assigned.

Anyway, can you test that workaround? I'm thinking about a code change 
like this:

     public boolean contains(String name)
     {
         // we have to wait until all add() calls are done before building the name cache
         // otherwise /Differences won't be accounted for
         if (names == null)
         {
             synchronized (this)
             {
                 if (names == null)
                 {
                     names = new HashSet<String>(codeToName.size());
                     names.addAll(codeToName.values());
                 }
             }
         }
         return names.contains(name);
     }


This way, there is no performance loss except the very first time.
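
One detail worth checking in a change like this: double-checked locking in Java is only reliable if the field is declared volatile and the set is completely filled before it is assigned to the field. Otherwise a thread that skips the synchronized block can see a non-null set that is still incomplete. A minimal sketch under those assumptions follows. SafeNameCache and its add() method are hypothetical stand-ins, not the PDFBox Encoding class, and all add() calls are assumed to be finished before the first contains() call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SafeNameCache {
    // Assumption: filled through add() before any thread calls contains().
    private final Map<Integer, String> codeToName = new HashMap<>();

    // volatile, so a fully built set is safely visible to threads
    // that never enter the synchronized block.
    private volatile Set<String> names;

    void add(int code, String name) {
        codeToName.put(code, name);
    }

    boolean contains(String name) {
        Set<String> local = names; // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = names;
                if (local == null) {
                    // Build the set completely in a local variable first ...
                    local = new HashSet<>(codeToName.values());
                    // ... and publish it only once it is fully populated.
                    names = local;
                }
            }
        }
        return local.contains(name);
    }

    public static void main(String[] args) {
        SafeNameCache cache = new SafeNameCache();
        cache.add(32, "space");
        cache.add(65, "A");
        System.out.println(cache.contains("space")); // prints true
        System.out.println(cache.contains("bogus")); // prints false
    }
}
```

As with the code above, the lock is only taken while the cache is still unset. Later calls cost a single volatile read.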

Tilman

>
> Thanks,
> Patrick


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org

RE: threads using PDFBox getting stuck infinite loop

Posted by "Stahle, Patrick" <pa...@te.com>.

Hi Tilman,

How would I call contains?


Right now I am just doing the following font wise:
font = PDType1Font.HELVETICA;

Should I simply grab the fonts we use during single thread initialization code?

Thanks,
Patrick




RE: threads using PDFBox getting stuck infinite loop

Posted by "Stahle, Patrick" <pa...@te.com>.

Hi Tilman,

We are using  "PDType1Font.HELVETICA".

Thanks,
Patrick



Re: threads using PDFBox getting stuck infinite loop

Posted by Tilman Hausherr <TH...@t-online.de>.

On 27.06.2016 at 20:55, Stahle, Patrick wrote:
> We are using PDFBox 2.0.0 release...
Update to 2.0.2. What PDFont are you using? One of the predefined 
"standard 14"? If yes, then this code

     public boolean contains(String name)
     {
         // we have to wait until all add() calls are done before building the name cache
         // otherwise /Differences won't be accounted for
         if (names == null)
         {
             names = new HashSet<String>(codeToName.size());
             names.addAll(codeToName.values());
         }
         return names.contains(name);
     }

might be risky :-( Workaround idea: call contains("space") for the fonts 
you're using before the real business starts.
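
Apart from two threads filling the same HashSet at once, the unsynchronized version has a second window: the field is assigned while the set is still empty, so another thread can get a wrong answer during the fill. The sketch below makes that window deterministic with latches. It is an illustration only and does not reproduce the hang from the stack trace. EarlyPublicationDemo and PausingValues are hypothetical names, and PausingValues stands in for codeToName.values().

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

class EarlyPublicationDemo {
    final CountDownLatch fillStarted = new CountDownLatch(1);
    final CountDownLatch mayContinue = new CountDownLatch(1);
    Set<String> names; // same unsynchronized lazy cache as in the quoted method

    /** Hypothetical value collection that pauses before yielding its first element. */
    final class PausingValues extends AbstractCollection<String> {
        @Override
        public int size() {
            return 2;
        }

        @Override
        public Iterator<String> iterator() {
            final Iterator<String> it = Arrays.asList("space", "A").iterator();
            return new Iterator<String>() {
                private boolean paused;

                @Override
                public boolean hasNext() {
                    return it.hasNext();
                }

                @Override
                public String next() {
                    if (!paused) {
                        paused = true;
                        fillStarted.countDown(); // the empty set is already assigned
                        try {
                            mayContinue.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                    return it.next();
                }
            };
        }
    }

    final PausingValues values = new PausingValues();

    boolean contains(String name) {
        if (names == null) {
            names = new HashSet<String>(values.size());
            names.addAll(values);
        }
        return names.contains(name);
    }

    /** Returns { answer seen while the fill is paused, answer after the fill is done }. */
    static boolean[] run() {
        EarlyPublicationDemo demo = new EarlyPublicationDemo();
        Thread filler = new Thread(() -> demo.contains("space"));
        filler.start();
        try {
            demo.fillStarted.await();
            boolean during = demo.contains("space"); // sees the non-null, empty set
            demo.mayContinue.countDown();
            filler.join();
            boolean after = demo.contains("space");
            return new boolean[] { during, after };
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        // prints "during fill: false, after fill: true"
        System.out.println("during fill: " + r[0] + ", after fill: " + r[1]);
    }
}
```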

Tilman
