Posted to users@pdfbox.apache.org by Tilman Hausherr <TH...@t-online.de> on 2019/03/03 14:14:22 UTC

Re: resize inline images

There wasn't anything attached. You can't attach PDFs anyway; these must 
be uploaded to a file-sharing site.

Tilman

On 28.02.2019 at 17:41, Matteo Gamboz wrote:
> ...
>> Please retry the strategy where you had the closing error (fix the
>> bug I mentioned), and post enough code and upload the source and
>> result PDFs.
> Hi Tilman, thank you for the comments and suggestions.
>
> I think I'm reaching a good point; my approach is now as follows:
> . extend PDFStreamEngine and process all operators,
> . when I find an Inline Image, save away its transformation matrix
> . after finishing with the operators, I parse the page into tokens
> . and scan through the tokens
> . when I find the BI operator
> . I use the transformation matrix (saved before)
> . and generate a new image,
> . then I substitute the BI token with the new image
>
> I must yet fix the image parameters to correctly represent the new
> image data, but it seems feasible.
>
> If anyone is interested, I'm attaching my last code and an example pdf
> that I'm trying to modify. If you spot any obvious mistake I would be
> grateful for suggestions :-)
>
> m
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org



Re: resize images (XObjects, Inline Images and stencils)

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

There is at least one bug:

         if (image.isStencil()) {
             stencil = " (stencil)";
             bufferedImageType = BufferedImage.TYPE_BYTE_BINARY;
         }
************
         if (image.getColorSpace() == PDDeviceGray.INSTANCE){
             bufferedImageType = BufferedImage.TYPE_BYTE_GRAY;
         }

The problem is that stencils report the colorspace DeviceGray despite 
being bitonal. So you need to put an "else" at the "************" place; 
otherwise stencils end up with TYPE_BYTE_GRAY.
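
A minimal sketch of the corrected selection order, written against the plain JDK so it compiles without PDFBox. The class ImageTypeChooser and its two boolean parameters are hypothetical; the caller would derive them from image.isStencil() and from the image.getColorSpace() == PDDeviceGray.INSTANCE comparison in the snippet above. TYPE_INT_RGB as the fallback is an assumption; the thread only asks for RGB rather than ARGB.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
final class ImageTypeChooser {
    private ImageTypeChooser() {}

    /**
     * @param isStencil    value of image.isStencil()
     * @param isDeviceGray true if the image colorspace is DeviceGray
     * @return a BufferedImage.TYPE_* constant
     */
    static int choose(boolean isStencil, boolean isDeviceGray) {
        if (isStencil) {
            // Checked first, because stencils also report DeviceGray.
            return BufferedImage.TYPE_BYTE_BINARY;
        } else if (isDeviceGray) {
            return BufferedImage.TYPE_BYTE_GRAY;
        }
        // Everything else: RGB without an alpha channel (assumed fallback).
        return BufferedImage.TYPE_INT_RGB;
    }
}
```

Because the stencil test comes first and the gray test hangs off the "else", a stencil can no longer be overwritten with TYPE_BYTE_GRAY.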

And this:

             if (image.isStencil()) {
                 log.warn("Is stencil; painting red.");
                 bImage = image.getStencilImage(Color.red);
                 // ↑ this is problematic...
                 // it appears to paint the whole stencil
                 // like it's filling the stencil's "holes" also

is definitely wrong: you are "applying" the stencil, which is what 
PDFBox itself does when rendering. Just get the image itself and set 
the stencil flag later.

This is just a "dry" analysis, I didn't run your code.

Tilman

On 27.03.2019 at 14:28, Matteo Gamboz wrote:
> Hi Tilman,
>
> thank you for the corrections (and sorry for the delay in answering, I
> was working on other projects...).
>
> I've applied your suggestions and I think they fixed most of my
> problems.  Here is the code as I have it now:
> https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A
>
> One can call the script with some command line options to play with
> different resolutions.
>
> In the source of the java file, one can find how to call the script
> https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A?path=%2Fsrc%2Fmain%2Fjava%2Fit%2Fsissa%2Fmedialab%2Fpdfimages
>
> I don't have much experience with java, please be patient if I was
> supposed to "deploy" the code in some other way :-)
>
>
> As you pointed out, I still have some problems with stencils:
> my process transforms the stencil into an image, but sometimes the
> background of the stencil is black instead of white. You can see an
> example in the file screenshot.png. The screenshot refers to
> test-files/d.pdf.
>
> I will work on this and keep the list posted.
> m
>
>
> On Fri, 08 Mar 2019 15:59:39 +0100,
> Tilman Hausherr wrote:
>> InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();
>> ...
>> newBIoperator.setImageParameters(helper_img.getCOSObject());
>>
>>
>> Another thing you should fix yourself. If the image is a stencil,
>> create a TYPE_BYTE_BINARY BufferedImage (and later call
>> setStencil()). If the image is of colorspace DeviceGray, create a
>> TYPE_BYTE_GRAY BufferedImage. If not, then RGB, NOT ARGB.
>>
>> Please fix that, and also try finding a PDF that has an inline image where
>> something can be seen, the current one is white.


Re: resize images (XObjects, Inline Images and stencils)

Posted by Matteo Gamboz <ga...@medialab.sissa.it>.
Hi Tilman,

thank you for the corrections (and sorry for the delay in answering, I
was working on other projects...).

I've applied your suggestions and I think they fixed most of my
problems.  Here is the code as I have it now:
https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A

One can call the script with some command line options to play with
different resolutions.

In the source of the java file, one can find how to call the script
https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A?path=%2Fsrc%2Fmain%2Fjava%2Fit%2Fsissa%2Fmedialab%2Fpdfimages

I don't have much experience with java, please be patient if I was
supposed to "deploy" the code in some other way :-)


As you pointed out, I still have some problems with stencils:
my process transforms the stencil into an image, but sometimes the
background of the stencil is black instead of white. You can see an
example in the file screenshot.png. The screenshot refers to
test-files/d.pdf.

I will work on this and keep the list posted.
m
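
One general Java2D point that may be relevant to the black background, though nothing in this thread confirms it is the cause in d.pdf: a freshly created TYPE_BYTE_BINARY BufferedImage has all samples set to 0, which its default palette shows as black. The sketch below fills the target with white before drawing the scaled source. BinaryScaler and scaleToBinary are hypothetical names, and the code uses only the JDK, not PDFBox.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
final class BinaryScaler {
    private BinaryScaler() {}

    /** Draws src scaled to width x height onto a white 1-bit image. */
    static BufferedImage scaleToBinary(BufferedImage src, int width, int height) {
        BufferedImage dst =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = dst.createGraphics();
        try {
            // A new image starts with every sample at 0 (black), so paint
            // the background white before drawing the source on top.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

Any transparent area of the source then stays white in the result instead of keeping the initial black.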


On Fri, 08 Mar 2019 15:59:39 +0100,
Tilman Hausherr wrote:
> 
> InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();
> ...
> newBIoperator.setImageParameters(helper_img.getCOSObject());
> 
> 
> Another thing you should fix yourself. If the image is a stencil,
> create a TYPE_BYTE_BINARY BufferedImage (and later call
> setStencil()). If the image is of colorspace DeviceGray, create a
> TYPE_BYTE_GRAY BufferedImage. If not, then RGB, NOT ARGB.
> 
> Please fix that, and also try finding a PDF that has an inline image where
> something can be seen, the current one is white.



Re: resize inline images

Posted by Tilman Hausherr <TH...@t-online.de>.
On 05.03.2019 at 09:54, Matteo Gamboz wrote:
> I must yet fix the image parameters to correctly represent the new
> image data, but it seems feasible.

operator.getImageParameters(): 
COSDictionary{COSName{IM}:true;COSName{W}:COSInt{114};COSName{H}:COSInt{73};COSName{BPC}:COSInt{1};COSName{F}:COSName{CCF};COSName{DP}:COSDictionary{COSName{K}:COSInt{-1};COSName{Columns}:114;};}

helper_img: 
COSDictionary{COSName{Length}:COSInt{23};COSName{Type}:COSName{XObject};COSName{Subtype}:COSName{Image};COSName{Filter}:COSName{FlateDecode};COSName{BitsPerComponent}:COSInt{8};COSName{Width}:COSInt{31};COSName{Height}:COSInt{20};COSName{ColorSpace}:COSName{DeviceRGB};COSName{DecodeParms}:COSDictionary{COSName{BitsPerComponent}:8;COSName{Predictor}:COSInt{15};COSName{Columns}:31;COSName{Colors}:COSInt{3};};COSName{SMask}:COSDictionary{COSName{Length}:COSInt{26};COSName{Type}:-1148588617;COSName{Subtype}:70760763;COSName{Filter}:1578622202;COSName{BitsPerComponent}:8;COSName{Width}:31;COSName{Height}:20;COSName{ColorSpace}:COSName{DeviceGray};}COSStream{-856922952};}COSStream{-84829761}

You are running this:

newBIoperator.setImageParameters(operator.getImageParameters());

So you're copying the parameters of the old image to the new image. This 
is problematic (may be my fault!): the size isn't the same, the filter 
isn't the same, and the bits per component aren't the same. There's also 
the problem that your original image is a mask, so the new one should be 
a mask too, i.e. a bitonal image, but it isn't.
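
To make the mismatch concrete, here is a small sketch that compares the two dumps above entry by entry. It is an illustration only: plain Java maps stand in for the COS dictionaries, the class ImageParamDiff is hypothetical, and the inline-image abbreviations W, H, BPC, F and CCF are spelled out as Width, Height, BitsPerComponent, Filter and CCITTFaxDecode so the keys line up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain maps stand in for the two COS dictionaries.
final class ImageParamDiff {
    private ImageParamDiff() {}

    // Values copied from the operator.getImageParameters() dump.
    static Map<String, String> oldInlineImage() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("Width", "114");
        m.put("Height", "73");
        m.put("BitsPerComponent", "1");
        m.put("Filter", "CCITTFaxDecode");
        return m;
    }

    // Values copied from the helper_img dump.
    static Map<String, String> newImage() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("Width", "31");
        m.put("Height", "20");
        m.put("BitsPerComponent", "8");
        m.put("Filter", "FlateDecode");
        return m;
    }

    // Lists every key of oldParams whose value differs in newParams.
    static List<String> mismatches(Map<String, String> oldParams,
                                   Map<String, String> newParams) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : oldParams.entrySet()) {
            String replacement = newParams.get(e.getKey());
            if (!e.getValue().equals(replacement)) {
                out.add(e.getKey() + ": " + e.getValue() + " -> " + replacement);
            }
        }
        return out;
    }
}
```

All four entries differ, so image data taken from helper_img cannot be decoded with the old dictionary.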

I made two changes:

    //InputStream img_data_stream = helper_img.createInputStream();
    InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();

This is because the raw stream is the data with the filters already 
applied, i.e. still encoded.

...

//newBIoperator.setImageParameters(operator.getImageParameters());
newBIoperator.setImageParameters(helper_img.getCOSObject());


Another thing you should fix yourself. If the image is a stencil, create 
a TYPE_BYTE_BINARY BufferedImage (and later call setStencil()). If the 
image is of colorspace DeviceGray, create a TYPE_BYTE_GRAY 
BufferedImage. If not, then RGB, NOT ARGB.

Please fix that, and also try finding a PDF that has an inline image where 
something can be seen, the current one is white.

After all these are done, please upload your code and source PDF again.

Tilman





Re: resize inline images

Posted by Matteo Gamboz <ga...@medialab.sissa.it>.
sorry, here they are
https://medialab.sissa.it/nextcloud/index.php/s/Pdn5CYFcbjoAWsp


On Sun, 03 Mar 2019 15:14:22 +0100,
Tilman Hausherr wrote:
> 
> There wasn't anything attached. You can't attach PDFs anyway; these
> must be uploaded to a file-sharing site.
> 
> Tilman
> 
> On 28.02.2019 at 17:41, Matteo Gamboz wrote:
> > ...
> >> Please retry the strategy where you had the closing error (fix the
> >> bug I mentioned), and post enough code and upload the source and
> >> result PDFs.
> > Hi Tilman, thank you for the comments and suggestions.
> > 
> > I think I'm reaching a good point; my approach is now as follows:
> > . extend PDFStreamEngine and process all operators,
> > . when I find an Inline Image, save away its transformation matrix
> > . after finishing with the operators, I parse the page into tokens
> > . and scan through the tokens
> > . when I find the BI operator
> > . I use the transformation matrix (saved before)
> > . and generate a new image,
> > . then I substitute the BI token with the new image
> > 
> > I must yet fix the image parameters to correctly represent the new
> > image data, but it seems feasible.
> > 
> > If anyone is interested, I'm attaching my last code and an example pdf
> > that I'm trying to modify. If you spot any obvious mistake I would be
> > grateful for suggestions :-)
> > 
> > m
> > 
> > 
> > 