
Posted to dev@pdfbox.apache.org by Maruan Sahyoun <sa...@fileaffairs.de> on 2013/05/14 18:33:10 UTC

PDFBox 2.0 - Parsing

Hi,

I'd like to look at the parsing part for PDFBox 2.0. As Timo suggested, there are also ideas about changing or refactoring the COS model. Since this might affect the parsing, how can we agree on an enhanced model? What ideas are floating around?

BR
Maruan Sahyoun

Re: PDFBox 2.0 - Parsing

Posted by Timo Boehme <ti...@ontochem.com>.

Hi,

the main point for me would be that the object model should support
on-demand loading and parsing of objects. This would speed up parsing
and reduce memory use when only specific information is needed from a
PDF (a specific page, only form data, etc.).
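
To make the on-demand idea concrete, here is a minimal Java sketch of one
way it could look. It is only an illustration under stated assumptions:
LazyObjectSketch, ParsedObject, LazyRef and buildIndex are hypothetical
names and are not part of the PDFBox API, and the parser callback merely
stands in for whatever would read an object at a given byte offset.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

public class LazyObjectSketch {

    /** Hypothetical stand-in for a parsed object; not a PDFBox class. */
    public static final class ParsedObject {
        public final String content;

        public ParsedObject(String content) {
            this.content = content;
        }
    }

    /** Reference that parses its target only on first access, then caches it. */
    public static final class LazyRef {
        private final long offset;
        private final LongFunction<ParsedObject> parser;
        private ParsedObject cached;

        public LazyRef(long offset, LongFunction<ParsedObject> parser) {
            this.offset = offset;
            this.parser = parser;
        }

        public ParsedObject get() {
            if (cached == null) {
                // Parsing is deferred until a caller actually needs the object.
                cached = parser.apply(offset);
            }
            return cached;
        }

        public boolean isLoaded() {
            return cached != null;
        }
    }

    /**
     * Builds an index of lazy references from a table mapping object numbers
     * to byte offsets (assumed here to come from a cross-reference table).
     * No object is parsed at this point.
     */
    public static Map<Integer, LazyRef> buildIndex(Map<Integer, Long> offsets,
                                                   LongFunction<ParsedObject> parser) {
        Map<Integer, LazyRef> index = new HashMap<>();
        for (Map.Entry<Integer, Long> e : offsets.entrySet()) {
            index.put(e.getKey(), new LazyRef(e.getValue(), parser));
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, Long> offsets = new HashMap<>();
        offsets.put(1, 15L);
        offsets.put(2, 120L);
        offsets.put(3, 480L);

        int[] parseCalls = {0};
        Map<Integer, LazyRef> index = buildIndex(offsets, off -> {
            parseCalls[0]++; // count how often real parsing work happens
            return new ParsedObject("object at offset " + off);
        });

        // Only object 2 is requested, so only object 2 is parsed.
        ParsedObject wanted = index.get(2).get();
        System.out.println(wanted.content + ", parse calls: " + parseCalls[0]);
    }
}
```

In this sketch, building the index costs one small entry per object, and
the expensive parsing happens only for objects that are actually requested.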


Best regards,
Timo





-- 

  Timo Boehme
  OntoChem GmbH
  H.-Damerow-Str. 4
  06120 Halle/Saale
  T: +49 345 4780474
  F: +49 345 4780471
  timo.boehme@ontochem.com
