Posted to dev@pdfbox.apache.org by "Mel Martinez (JIRA)" <ji...@apache.org> on 2010/01/14 21:29:55 UTC

[jira] Updated: (PDFBOX-599) PDFBox performance issue: TextPosition performance tweak

     [ https://issues.apache.org/jira/browse/PDFBOX-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mel Martinez updated PDFBOX-599:
--------------------------------

    Attachment: TextPosition.java

A tweaked version of TextPosition that speeds up the getX() and getY() methods.

> PDFBox performance issue:  TextPosition performance tweak
> ---------------------------------------------------------
>
>                 Key: PDFBOX-599
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-599
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator, 1.0.0
>         Environment: All
>            Reporter: Mel Martinez
>         Attachments: TextPosition.java
>
>
> During text extraction, the TextPosition.getX() and TextPosition.getY() methods are invoked multiple times on each TextPosition object.
> The current code recalculates these values each time the accessor is invoked, even though the underlying state from which the values are derived has not changed.
> This repeated work slows down text extraction.
> The getters (getX() and getY()) should be changed to retain the X and Y values in instance fields and calculate them only once.
> Specifically, the following two fields should be added:
>     private float x = Float.NEGATIVE_INFINITY;
>     private float y = Float.NEGATIVE_INFINITY;
> And the two methods should be changed as follows:
>     public float getX()
>     {
>         if (x == Float.NEGATIVE_INFINITY)
>         {
>             x = getXRot(rot);
>         }
>         return x;
>     }
>     public float getY()
>     {
>         if (y == Float.NEGATIVE_INFINITY)
>         {
>             if ((rot == 0) || (rot == 180))
>             {
>                 y = pageHeight - getYLowerLeftRot(rot);
>             }
>             else
>             {
>                 y = pageWidth - getYLowerLeftRot(rot);
>             }
>         }
>         return y;
>     }
> This provides a very noticeable speedup in the text extraction.
> I'll attach a version of the TextPosition.java class that includes this mod.
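
The caching pattern described in the issue can be tried in isolation with a small stand-alone sketch. This is not the PDFBox TextPosition class: CachedPosition, rawX, rawY, computeX(), computeY() and getComputations() are hypothetical names invented for illustration, and the coordinate math is only a placeholder for getXRot() and getYLowerLeftRot(). It assumes, as the proposal does, that a real coordinate is never negative infinity.

```java
// Hypothetical stand-in for TextPosition; NOT the actual PDFBox class.
class CachedPosition
{
    // Sentinel meaning "not computed yet". Assumes no real coordinate
    // is ever negative infinity.
    private static final float UNSET = Float.NEGATIVE_INFINITY;

    private final float rawX;
    private final float rawY;
    private final float pageWidth;
    private final float pageHeight;
    private final int rot;

    private float x = UNSET;
    private float y = UNSET;

    // Counts how often the derived values are actually calculated.
    private int computations = 0;

    CachedPosition(float rawX, float rawY, float pageWidth, float pageHeight, int rot)
    {
        this.rawX = rawX;
        this.rawY = rawY;
        this.pageWidth = pageWidth;
        this.pageHeight = pageHeight;
        this.rot = rot;
    }

    // Placeholder for the real derivation (getXRot(rot) in the issue).
    private float computeX()
    {
        computations++;
        return rawX;
    }

    // Placeholder mirroring the shape of the getY() logic in the issue.
    private float computeY()
    {
        computations++;
        if ((rot == 0) || (rot == 180))
        {
            return pageHeight - rawY;
        }
        return pageWidth - rawY;
    }

    public float getX()
    {
        if (x == UNSET)
        {
            x = computeX(); // calculated on first access only
        }
        return x;
    }

    public float getY()
    {
        if (y == UNSET)
        {
            y = computeY(); // calculated on first access only
        }
        return y;
    }

    public int getComputations()
    {
        return computations;
    }
}
```

Calling getX() and getY() repeatedly on one CachedPosition leaves getComputations() at 2, since each value is derived once and then served from its field. The == comparison against the sentinel works because infinity compares equal to itself; Float.NaN would not work as a sentinel this way, since NaN == NaN is false in Java. The cache is only valid while the underlying state stays unchanged, which is the premise stated in the issue.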
