Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/05/22 15:39:41 UTC

[jira] [Reopened] (PDFBOX-1132) Add Tika parsers for PDF and TTF

     [ https://issues.apache.org/jira/browse/PDFBOX-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting reopened PDFBOX-1132:
-----------------------------------


I didn't have enough time to drive this, which led to some divergence between the code in PDFBox and Tika. To resolve that, I've for now moved the Tika PDF parser and all related changes back into Tika and removed the copy I had earlier put in o.a.pdfbox.tika. I'll resolve this issue as Won't Fix, at least for the 1.7.0 time frame.
                
> Add Tika parsers for PDF and TTF
> --------------------------------
>
>                 Key: PDFBOX-1132
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1132
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox, Parsing
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>         Attachments: 0001-PDFBOX-1132-Add-Tika-parser-classes.patch, 0002-PDFBOX-1132-Add-Tika-parser-classes.patch
>
>
> The PDF and TTF parsers in Apache Tika rely more on improvements in PDFBox than on those in Tika, so it would make more sense for that code to reside inside Apache PDFBox.
> Having the code inside PDFBox would allow for tighter integration with PDFBox internals and avoid the need to wait for an official PDFBox release before new features can be used inside the PDF and TTF parsers.
> To do this, I'd migrate the PDF and TTF parser classes and the related test cases and files from Tika to the PDFBox and FontBox components. We'd add an optional dependency on tika-core to these components, so people who don't use or need Tika wouldn't be affected.
> I'll attach a patch with the proposed changes.
