Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2010/06/13 19:29:15 UTC
[jira] Resolved: (PDFBOX-742) [patch] Please don't print logging statements to System.err

     [ https://issues.apache.org/jira/browse/PDFBOX-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-742.
---------------------------------------

    Fix Version/s: 1.2.0
       Resolution: Fixed

I've replaced the System.out/err calls as proposed (no patch was attached) in revision 954269.

getUnfilteredStream returns the decoded stream of a COSStream, as described in its javadoc.


> [patch] Please don't print logging statements to System.err
> -----------------------------------------------------------
>
>                 Key: PDFBOX-742
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-742
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel
>    Affects Versions: 1.1.0
>            Reporter: Antoni Mylka
>             Fix For: 1.2.0
>
>
> There are three org.apache.pdfbox.filter.Filter implementations which are not implemented yet. These are:
> CCITTFaxDecodeFilter
> DCTFilter
> RunLengthDecodeFilter
> They all contain calls to System.err with messages like:
> Warning: DCTFilter.decode is not implemented yet, skipping this stream.
> In my code I iterate over all images in a PDF and try to obtain their raw, undecoded content. I use code like this:
> private byte [] getUnDecodedImageBytes(COSStream st) throws IOException {
> 	ByteArrayOutputStream baos = new ByteArrayOutputStream();
> 	IOUtil.writeStream(st.getUnfilteredStream(), baos);
> 	return baos.toByteArray();
> }
> The getUnfilteredStream() method, when called on embedded JPG images, seems to try to invoke the DCTFilter. If I have a large ebook file with lots of JPG images, this yields LOTS of text on standard error, which can't be suppressed.
> PDFBox uses commons-logging all over the place. Why not push those warnings to the log? They are non-critical. In my particular case, when I use the above method I get an empty array. If I do, I resort to another method:
> private byte [] getDecodedImageBytes(COSStream st) throws IOException {
> 	ByteArrayOutputStream baos = new ByteArrayOutputStream();
> 	PDXObjectImage ximage = (PDXObjectImage)PDXObject.createXObject( st );
> 	ximage.write2OutputStream(baos);
> 	return baos.toByteArray();
> }
> This seems to work, even for those images where getUnfilteredStream returns an empty stream.
> I don't quite understand what the difference is, since I would expect a method labelled 'getUnfilteredStream' to return the stream as it is in the PDF file, without using any Filters. Moreover, such a failure would imply that the library simply cannot process JPG images in PDF files, which is not the case because write2OutputStream works OK. So I don't know where the real problem lies. Maybe someone with more PDFBox knowledge could take a look.
> Still, my patch only moves those warnings to the log, where I can suppress them. This is simple and fixes the immediate problem in my application.
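
For illustration, here is a minimal sketch of the kind of change requested above: an unimplemented filter reports through a logger instead of System.err, so callers can suppress the message. This is not the actual PDFBox code or the committed change. The class names UnimplementedFilterSketch and FilterLoggingDemo are hypothetical, and java.util.logging is used as a stand-in for commons-logging so the example runs without extra libraries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for an unimplemented filter; not PDFBox's real class.
class UnimplementedFilterSketch {
    // Assumption: java.util.logging replaces commons-logging in this sketch.
    static final Logger LOG =
            Logger.getLogger(UnimplementedFilterSketch.class.getName());

    // Mirrors the reported behavior: nothing is decoded, only a warning is emitted.
    void decode(InputStream compressedData, OutputStream result) {
        // Previously a System.err.println(...) call, which callers could not silence.
        LOG.warning("DCTFilter.decode is not implemented yet, skipping this stream.");
    }
}

class FilterLoggingDemo {
    public static void main(String[] args) {
        // Raising the threshold suppresses the warning for this logger only.
        UnimplementedFilterSketch.LOG.setLevel(Level.SEVERE);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new UnimplementedFilterSketch()
                .decode(new ByteArrayInputStream(new byte[] {1, 2, 3}), out);

        // The stream stays empty, matching the empty array described in the report.
        System.out.println("decoded bytes: " + out.size());
    }
}
```

With commons-logging the same idea applies: the message goes through a Log instance, and the application's logging configuration decides whether it is shown.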

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.