Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2011/09/14 20:56:09 UTC

[jira] [Resolved] (PDFBOX-1080) Improve TextPosition.isDiacritic and ICU4JImpl normalizeDiac performance

     [ https://issues.apache.org/jira/browse/PDFBOX-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-1080.
----------------------------------------

    Resolution: Fixed
      Assignee: Andreas Lehmkühler

I added the proposed improvements in revision 1170768.

Thanks for the contribution.

> Improve TextPosition.isDiacritic and ICU4JImpl normalizeDiac performance
> ------------------------------------------------------------------------
>
>                 Key: PDFBOX-1080
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1080
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Text extraction
>    Affects Versions: 1.6.0
>            Reporter: Lars Torunski
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>             Fix For: 1.7.0
>
>
> Character.getType(cText.charAt(0)), including the index range check inside charAt, is invoked up to three times where a single call would suffice.
> Current 1.6.0 implementation:
>     public boolean isDiacritic()
>     {
>         String cText = this.getCharacter();
>         return (cText.length() == 1 &&  (Character.getType(cText.charAt(0)) == Character.NON_SPACING_MARK
>                 || Character.getType(cText.charAt(0)) == Character.MODIFIER_SYMBOL
>                 || Character.getType(cText.charAt(0)) == Character.MODIFIER_LETTER));
>     }
> Please use something like this:
>     public boolean isDiacritic()
>     {
>         final String cText = this.getCharacter();
>         if (cText.length() != 1) return false;
>         final int type = Character.getType(cText.charAt(0));
>         return (type == Character.NON_SPACING_MARK
>                 || type == Character.MODIFIER_SYMBOL
>                 || type == Character.MODIFIER_LETTER);
>     }
> Please also check the ICU4JImpl.normalizeDiac method.
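
For readers who want to try the proposed check outside of PDFBox, here is a minimal standalone sketch. The class name DiacriticCheck and the static method taking a String argument are assumptions made for illustration only; in PDFBox the check is an instance method on TextPosition that reads the text via getCharacter(). The sketch only shows the single-lookup pattern from the proposal and is not the committed code.

```java
// Hypothetical standalone helper; not part of PDFBox.
final class DiacriticCheck
{
    private DiacriticCheck()
    {
    }

    // Returns true if text is exactly one char whose Unicode general
    // category is a non-spacing mark, modifier symbol or modifier letter.
    static boolean isDiacritic(String text)
    {
        if (text.length() != 1)
        {
            return false;
        }
        // Look up the category once and reuse it for all three comparisons.
        final int type = Character.getType(text.charAt(0));
        return type == Character.NON_SPACING_MARK
                || type == Character.MODIFIER_SYMBOL
                || type == Character.MODIFIER_LETTER;
    }

    public static void main(String[] args)
    {
        System.out.println(isDiacritic("\u0301")); // COMBINING ACUTE ACCENT
        System.out.println(isDiacritic("\u00B4")); // ACUTE ACCENT
        System.out.println(isDiacritic("\u02B0")); // MODIFIER LETTER SMALL H
        System.out.println(isDiacritic("a"));      // plain lowercase letter
    }
}
```

The early return keeps the length test separate from the category test, so charAt(0) and Character.getType are each evaluated at most once per call.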
