Posted to users@pdfbox.apache.org by "Harper, Brad" <Br...@fiserv.com> on 2011/08/24 00:01:40 UTC
InputStream Being Closed by PDFParser
Hello:
Is there a way to load/parse input and retrieve a PDDocument *without*
having the input stream closed "automatically"? Alternatively, is there
a way to hook into the parser's processing of the end of the document
*before* the stream is closed?
I need to process a 'print stream', which is a set of valid PDF
documents concatenated into a single large file.
I'd like to record the byte offsets into the large file for each of the
'sub-files' ... and was hoping to get this info in a single pass using
the position reported by the input stream's channel, but now I find that
the PDFParser closes its input stream when finished.
I don't see a way to get visibility on a [still-opened] stream, unless I
sub-class PDFParser or write one-off code to scan the input file for
beginning-/ending-of-document markers during a separate pass.
Any thoughts?
Regards,
Brad Harper
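
The separate-pass fallback mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions: the class and method names (MarkerScanSketch, findOffsets) are hypothetical, the whole input is held in memory for brevity (a large print stream would need chunked reading), and it assumes the "%PDF-" header marker appears only at the start of each sub-document. That assumption can fail, since the same bytes may occur inside a document's content, and "%%EOF" may occur more than once in a document that has been incrementally updated.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class MarkerScanSketch {

    /** Returns every offset in data at which marker occurs (naive byte comparison). */
    static List<Integer> findOffsets(byte[] data, byte[] marker) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + marker.length <= data.length; i++) {
            int j = 0;
            // Compare the marker against the bytes starting at position i.
            while (j < marker.length && data[i + j] == marker[j]) {
                j++;
            }
            if (j == marker.length) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Toy stand-in for two concatenated documents; not valid PDF content.
        byte[] sample = "%PDF-1.4\nbody\n%%EOF\n%PDF-1.4\nbody\n%%EOF\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] header = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        System.out.println(findOffsets(sample, header));
    }
}
```

Each reported offset is a candidate start of a sub-document; the end of one sub-document would then be taken as the byte before the next candidate start.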
Re: InputStream Being Closed by PDFParser
Posted by Jukka Zitting <ju...@gmail.com>.
Hi,
On Wed, Aug 24, 2011 at 12:01 AM, Harper, Brad <Br...@fiserv.com> wrote:
> Is there a way to load/parse input and retrieve a PDDocument *without*
> having the input stream closed "automatically"?
One solution would be to use the CloseShieldInputStream decorator
class [1] from Commons IO.
[1] http://commons.apache.org/io/api-release/org/apache/commons/io/input/CloseShieldInputStream.html
BR,
Jukka Zitting
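
The close-shield idea suggested above can be illustrated without any library. The following is a minimal sketch, not the Commons IO implementation: NonClosingInputStream is a hypothetical stand-in for CloseShieldInputStream, and consumeAndClose is a hypothetical stand-in for a parser that closes its input when finished. It shows that the underlying FileInputStream stays open, so its channel position can still be queried afterwards.

```java
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CloseShieldSketch {

    /** Hypothetical stand-in for a close-shield decorator: close() is a no-op. */
    static class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() {
            // Deliberately leave the wrapped stream open.
        }
    }

    /** Hypothetical stand-in for a parser: reads up to n bytes, then closes its input. */
    static void consumeAndClose(InputStream in, int n) throws IOException {
        try {
            for (int i = 0; i < n && in.read() >= 0; i++) {
                // Bytes are discarded; a real parser would interpret them.
            }
        } finally {
            in.close();
        }
    }

    /** Returns the channel position of the underlying file stream after the consumer "closes" it. */
    static long positionAfterConsume(Path file, int n) throws IOException {
        try (FileInputStream fis = new FileInputStream(file.toFile())) {
            consumeAndClose(new NonClosingInputStream(fis), n);
            // Without the shield, the channel would be closed and this call would fail.
            return fis.getChannel().position();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("printstream", ".bin");
        try {
            Files.write(tmp, "AAAABBBB".getBytes(StandardCharsets.US_ASCII));
            System.out.println(positionAfterConsume(tmp, 4));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

One caveat: if the parser reads ahead or buffers its input, the position observed after parsing may lie beyond the true end of the sub-document, so the shield alone may not yield exact offsets.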