Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2011/01/13 21:28:50 UTC

[jira] Resolved: (PDFBOX-274) PDFDocument.save is really slow

     [ https://issues.apache.org/jira/browse/PDFBOX-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-274.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.5.0
         Assignee: Andreas Lehmkühler

... and it works fine. I applied the patch as proposed in revision 1058731.

I ran a test that added text to each page of the PDF reference (>1300 pages) using AddMessageToEachPage. It took 7 minutes running 1.4.0 on the command line and 2:15 running the trunk within Eclipse.


> PDFDocument.save is really slow
> -------------------------------
>
>                 Key: PDFBOX-274
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-274
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Writing
>            Reporter: Jukka Zitting
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>             Fix For: 1.5.0
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1714655
> Originally submitted by wasabii on 2007-05-07 17:01.
> It's really slow; it takes way too long. I think I solved the issue. You keep a list called objectsToWrite, which is an ArrayList, but when processing it you constantly remove its first element. Because the list is array-based, each removal shifts all the remaining elements. That is slow, and doing it for every element is very slow. A LinkedList is more appropriate for these operations.
> Additionally, in this code:
>         if( !writtenObjects.contains( object ) &&
>             !objectsToWrite.contains( object ) &&
>             !actualsAdded.contains( actual ) )
> You are searching the lists for an object, and this method runs a great many times. Since you were using a List, each .contains call causes a complete list scan.
> I replaced objectsToWrite with a LinkedList. This solved the first problem. I added a separate variable called objectsToWriteSet of type HashSet. I then maintain them side by side, using the HashSet for searching. Java probably has a single class that can accomplish both of these. I'd also question the need in the first place to verify that the object does not already exist.
> Patch is attached.
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1714655&file_id=228258
> patch.txt (text/plain), 2538 bytes
> patch
> [comment on SourceForge]
> Originally sent by nobody.
> Logged In: NO 
> I guess you are looking for java.util.LinkedHashSet.
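
For readers following the discussion above, here is a minimal sketch of the two approaches mentioned in the report and in the comment: a LinkedList kept side by side with a HashSet, and a single java.util.LinkedHashSet. It is not the attached patch and not the PDFBox code. The class names (WriteQueueSketch, PairedQueue, LinkedSetQueue) and the methods offer/poll are hypothetical, chosen only for illustration, and the sketch assumes the queued objects have consistent equals/hashCode.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Set;

// Hypothetical illustration only; none of these names come from PDFBox.
public class WriteQueueSketch {

    /** Variant from the report: a LinkedList for order, a HashSet for lookups. */
    static final class PairedQueue<T> {
        private final LinkedList<T> order = new LinkedList<>();
        private final Set<T> members = new HashSet<>();

        /** Queues the item unless it is already pending; returns true if added. */
        boolean offer(T item) {
            if (!members.add(item)) {
                return false; // already queued, keep both structures unchanged
            }
            order.addLast(item);
            return true;
        }

        /** Removes and returns the oldest item, or null if the queue is empty. */
        T poll() {
            T head = order.pollFirst(); // constant time, no element shifting
            if (head != null) {
                members.remove(head);
            }
            return head;
        }

        /** Hash lookup instead of a full list scan. */
        boolean contains(T item) {
            return members.contains(item);
        }
    }

    /** Variant from the comment: one LinkedHashSet gives both order and lookup. */
    static final class LinkedSetQueue<T> {
        private final LinkedHashSet<T> items = new LinkedHashSet<>();

        boolean offer(T item) {
            return items.add(item); // false if already present, order unchanged
        }

        T poll() {
            Iterator<T> it = items.iterator();
            if (!it.hasNext()) {
                return null;
            }
            T head = it.next(); // first element in insertion order
            it.remove();
            return head;
        }

        boolean contains(T item) {
            return items.contains(item);
        }
    }

    public static void main(String[] args) {
        PairedQueue<String> paired = new PairedQueue<>();
        LinkedSetQueue<String> linked = new LinkedSetQueue<>();
        for (String s : new String[] {"a", "b", "a", "c"}) {
            paired.offer(s);
            linked.offer(s);
        }
        // Both variants drop the duplicate "a" and hand items back in FIFO order.
        for (String s = paired.poll(); s != null; s = paired.poll()) {
            System.out.println("paired: " + s);
        }
        for (String s = linked.poll(); s != null; s = linked.poll()) {
            System.out.println("linked: " + s);
        }
    }
}
```

The LinkedHashSet variant has only one structure to keep consistent, which is presumably why the comment suggests it. The paired variant stays closer to what the report describes.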
