Posted to dev@pdfbox.apache.org by "Michael Doswald (JIRA)" <ji...@apache.org> on 2016/07/21 14:49:20 UTC

[jira] [Comment Edited] (PDFBOX-3432) Optimize CID to GlyphId mapping (TTF)

    [ https://issues.apache.org/jira/browse/PDFBOX-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387812#comment-15387812 ] 

Michael Doswald edited comment on PDFBOX-3432 at 7/21/16 2:48 PM:
------------------------------------------------------------------

commons-primitives doesn't seem to have any Map implementations for primitives. 

I've tried the FastUtil and Goldman Sachs (GS) libraries. I've implemented a few simple benchmarks which I hope cover the use cases in fontbox:

* Insert key/value pairs in sequential key order
* Insert key/value pairs with random keys
* Get value of key with randomly selected keys

The FastUtil Int2IntMap seems to be slower than my simple implementation. The IntIntHashMap from GS seems to be faster in the last two benchmarks; only sequential insertion is faster with my simple implementation.
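
For reference, the three benchmark cases listed above could be sketched roughly as follows. This is only an illustration, not the benchmark code from the attached zip: the IntIntStore interface, the BoxedStore baseline and all method names are hypothetical, and the timing uses plain System.nanoTime() rather than a proper harness such as JMH, so the numbers it prints are rough at best.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

final class IntMapBenchmarkSketch {

    /** Hypothetical minimal interface so that different map implementations can be plugged in. */
    interface IntIntStore {
        void put(int key, int value);
        int get(int key);
    }

    /** Baseline backed by the boxed JDK map; -1 stands for "no mapping" in this sketch. */
    static final class BoxedStore implements IntIntStore {
        private final Map<Integer, Integer> map = new HashMap<>();

        @Override
        public void put(int key, int value) {
            map.put(key, value);
        }

        @Override
        public int get(int key) {
            Integer v = map.get(key);
            return v == null ? -1 : v;
        }
    }

    /** Written after each lookup run so the JIT cannot discard the loop as dead code. */
    static volatile long sink;

    /** Reproducible random keys in [0, bound). */
    static int[] randomKeys(int count, int bound, long seed) {
        return new Random(seed).ints(count, 0, bound).toArray();
    }

    /** Case 1: insert key/value pairs in sequential key order. Returns elapsed nanoseconds. */
    static long insertSequential(IntIntStore store, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            store.put(i, i);
        }
        return System.nanoTime() - start;
    }

    /** Case 2: insert key/value pairs with random keys. Returns elapsed nanoseconds. */
    static long insertRandom(IntIntStore store, int[] keys) {
        long start = System.nanoTime();
        for (int i = 0; i < keys.length; i++) {
            store.put(keys[i], i);
        }
        return System.nanoTime() - start;
    }

    /** Case 3: get values for randomly selected keys. Returns elapsed nanoseconds. */
    static long lookupRandom(IntIntStore store, int[] keys) {
        long start = System.nanoTime();
        long sum = 0;
        for (int key : keys) {
            sum += store.get(key);
        }
        long elapsed = System.nanoTime() - start;
        sink = sum;
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] keys = randomKeys(n, 1 << 20, 42L);
        // Several rounds, so that the later ones run on JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long seq = insertSequential(new BoxedStore(), n);
            IntIntStore store = new BoxedStore();
            long ins = insertRandom(store, keys);
            long get = lookupRandom(store, keys);
            System.out.printf("round %d: sequential=%d ms, random=%d ms, get=%d ms%n",
                    round, seq / 1_000_000, ins / 1_000_000, get / 1_000_000);
        }
    }
}
```

Any other candidate (FastUtil, GS, or a hand-written map) would be wrapped in its own IntIntStore adapter and passed to the same three methods.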

So I guess we could use the GS libraries (they seem to have an Apache License too), but I can't really find the source code for them. It seems the Map implementations for primitives are generated and not hand-written. Also, the implementations seem to include more features / code than we would really need in fontbox.

Which solution would you prefer:

* Trying to integrate the GS IntIntHashMap (and possibly strip out all unneeded features?)
* Using the simple implementation provided in the patch 'rev1'
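
To give an idea of how little code the second option needs, here is a minimal sketch of an int-to-int map using open addressing with linear probing. It is not the code from the 'rev1' patch; the class name, the NO_VALUE sentinel and the hash mixing step are all assumptions made for illustration, and removal of keys is deliberately left out.

```java
import java.util.Arrays;

/** Hypothetical sketch of a primitive int-to-int hash map (open addressing, linear probing). */
final class SimpleIntIntMap {

    /** Returned by get() when the key is absent (an assumption of this sketch). */
    static final int NO_VALUE = -1;

    private int[] keys;
    private int[] values;
    private boolean[] used;
    private int size;

    SimpleIntIntMap(int expectedSize) {
        // Power-of-two capacity, at least twice the expected size, so the load factor stays <= 0.5.
        int capacity = 16;
        while (capacity < expectedSize * 2L && capacity < (1 << 30)) {
            capacity <<= 1;
        }
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    /** Spreads the bits of the key so that sequential keys do not cluster. */
    private static int mix(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    /** Returns the slot holding the key, or the first free slot on its probe path. */
    private int indexOf(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(int key, int value) {
        int i = indexOf(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
        if (size * 2 > keys.length) {
            resize();
        }
    }

    int get(int key) {
        int i = indexOf(key);
        return used[i] ? values[i] : NO_VALUE;
    }

    int size() {
        return size;
    }

    /** Doubles the tables and re-inserts every entry. */
    private void resize() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                put(oldKeys[j], oldValues[j]);
            }
        }
    }

    public static void main(String[] args) {
        SimpleIntIntMap cidToGid = new SimpleIntIntMap(4);
        cidToGid.put(65, 36);
        cidToGid.put(66, 37);
        System.out.println(cidToGid.get(65) + " " + cidToGid.get(67));
    }
}
```

Keys and values live in flat int arrays, so there is no boxing and no per-entry object, which is where the memory and time savings over Map<Integer,Integer> would come from.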




was (Author: michaeldoswald):
commons-primitives doesn't seem to have any Map implementations for primitives. 

I've tried the FastUtil and Goldman-Sachs libraries. I've implemented a few simple benchmarks which I hope would cover the use cases in fontbox:

* Insert key/value pairs in sequential key order
* Insert key/value pairs with random keys
* Get value of key with randomly selected keys

The FastUtil Int2IntMap seems to be slower than my simple implementation. The IntIntHashMap from GS seems to be faster. For the last two benchmarks, only the sequential insertion is faster with my simple implementation.

So I guess we could use the GS libraries (they seem to have an Apache License too), but I can't really find the source code for them. It seems the Map implementations for primitives are generated and not hand-written. Also, the implementations seem to include more features / code than we would really need in fontbox.

Which solution would you prefer:

* Trying to integrate the GS IntIntHashMap (and possibly strip out all unneeded features?)
* Using the simple implementation provided in the patch 'rev1'



> Optimize CID to GlyphId mapping (TTF)
> -------------------------------------
>
>                 Key: PDFBOX-3432
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3432
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: FontBox
>    Affects Versions: 2.0.2
>         Environment: Ubuntu 14.04.4 LTS
>            Reporter: Michael Doswald
>            Priority: Trivial
>              Labels: optimization, performance
>         Attachments: PDFBOX-3432_Optimize_CID_to_GlyphId_mapping_rev1.patch, pdfbox-performance-PDFBOX-3432.zip
>
>
> TTF fonts map code-points (Code IDs) to glyphs. These are mappings from int to int. Because the JDK lacks map classes for primitive types, the code (e.g. in CmapSubtable) currently uses Map<Integer,Integer> for those mappings. This is inefficient in different ways:
> * Autoboxing/unboxing introduces a performance penalty
> * Boxing to Integer objects has a memory overhead
> * The JDK Map implementation has a big memory overhead for such simple objects
> For efficiency (execution time and memory consumption) I would propose to introduce a simple IntIntMap implementation which works with primitive integers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
