Posted to dev@pdfbox.apache.org by Andrea Vacondio <an...@gmail.com> on 2020/01/28 09:25:08 UTC

Ondemand Parser

Hi,
Not sure if this is the right channel, apologies if it's not.
I came across the on-demand parser
https://issues.apache.org/jira/browse/PDFBOX-4569 and saw the last comment
regarding memory-mapped files and the scratch files. I just wanted to chip
in and point to our lib https://github.com/torakiki/sejda-io
It's a small I/O layer that we use in our PDFBox fork. It supports
memory-mapped files and provides the concept of a View, which we used to
drop the scratch files and have the COSStream read directly from the
underlying PDF source. We have been using it for 3 years now and have
processed a gazillion PDF files in PDFsam Basic, PDFsam Visual and Sejda,
so it's definitely battle-tested. Feel free to use it or take inspiration
from it.
Andrea
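
For readers unfamiliar with the idea described above, here is a minimal sketch of a memory-mapped file with a read-only "view" over one byte range, so that a stream's bytes can be read from the source file without first copying them to a scratch file. It uses only the standard java.nio API. It is not sejda-io's or PDFBox's actual API: the class name MappedViewSketch, the methods map and view, and the sample file contents are hypothetical and chosen for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not the sejda-io or PDFBox API.
class MappedViewSketch {

    /** Maps the whole file read-only. A single mapping is limited to 2 GB. */
    public static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Returns a window over [offset, offset + length) that shares the source's bytes. */
    public static ByteBuffer view(ByteBuffer source, int offset, int length) {
        // duplicate() gives an independent position/limit over the same content,
        // so creating a view never disturbs other readers of the source.
        ByteBuffer window = source.duplicate();
        window.position(offset);
        window.limit(offset + length);
        return window.slice();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a PDF: some bytes before and after the range of interest.
        Path file = Files.createTempFile("view-sketch", ".bin");
        Files.write(file, "header|stream-data|trailer".getBytes(StandardCharsets.US_ASCII));

        ByteBuffer source = map(file);
        ByteBuffer stream = view(source, 7, 11);

        // prints "stream-data"
        System.out.println(StandardCharsets.US_ASCII.decode(stream));
    }
}
```

The point of the design is that a view is only an offset and a length over the mapped source, so many streams can be read from the same file without intermediate copies.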

Re: Ondemand Parser

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 28.01.20 at 10:25, Andrea Vacondio wrote:
> Hi,
> Not sure if this is the right channel, apologies if it's not.
This is exactly the correct channel!

> I came across the on-demand parser
> https://issues.apache.org/jira/browse/PDFBOX-4569 and saw the last comment
> regarding memory-mapped files and the scratch files. I just wanted to chip
> in and point to our lib https://github.com/torakiki/sejda-io
> It's a small I/O layer that we use in our PDFBox fork. It supports
> memory-mapped files and provides the concept of a View, which we used to
> drop the scratch files and have the COSStream read directly from the
> underlying PDF source. We have been using it for 3 years now and have
> processed a gazillion PDF files in PDFsam Basic, PDFsam Visual and Sejda,
> so it's definitely battle-tested. Feel free to use it or take inspiration
> from it.
Thanks for the valuable pointer and the offer. I'll definitely have a look at 
your code.

> Andrea

BR
Andreas
