Posted to users@pdfbox.apache.org by Tilman Hausherr <TH...@t-online.de> on 2019/03/03 14:14:22 UTC
Re: resize inline images
There wasn't anything attached. You can't attach PDFs anyway, these must
be uploaded to a sharehoster.
Tilman
On 28.02.2019 at 17:41, Matteo Gamboz wrote:
> ...
>> Please retry the strategy where you had the closing error (fix the
>> bug I mentioned), and post enough code and upload the source and
>> result PDFs.
> Hi Tilman, thank you for the comments and suggestions.
>
> I think I'm reaching a good point; my approach is now as follows:
> . extend PDFStreamEngine and process all operators,
> . when I find an Inline Image, save away its transformation matrix
> . after finishing with the operators, I parse the page into tokens
> . and scan through the tokens
> . when I find the BI operator
> . I use the transformation matrix (saved before)
> . and generate a new image,
> . then I substitute the BI token with the new image
>
> I must yet fix the image parameters to correctly represent the new
> image data, but it seems feasible.
>
> If anyone is interested, I'm attaching my last code and an example pdf
> that I'm trying to modify. If you spot any obvious mistake I would be
> grateful for suggestions :-)
>
> m
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org
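The two-pass approach quoted above can be sketched without PDFBox as follows. This is only an illustration under stated assumptions: tokens and matrices are plain strings, and the class InlineImageTokenPass, the method replaceInlineImages and the makeImage callback are hypothetical names, not PDFBox API. The point is the pairing: matrices are saved in the order the inline images are met in the first pass, so the second pass can consume them in the same order at each BI token.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of PDFBox: models the second pass only.
final class InlineImageTokenPass {

    // tokens:        the page content as a flat token list (assumed representation)
    // savedMatrices: one entry per inline image, in the order pass one met them
    // makeImage:     builds the replacement token from the saved matrix
    static List<String> replaceInlineImages(List<String> tokens,
                                            Deque<String> savedMatrices,
                                            Function<String, String> makeImage) {
        List<String> result = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            if ("BI".equals(token)) {
                // Throws NoSuchElementException if the two passes disagree
                // about how many inline images the page contains.
                String matrix = savedMatrices.removeFirst();
                result.add(makeImage.apply(matrix));
            } else {
                // Every other token is copied through unchanged.
                result.add(token);
            }
        }
        return result;
    }
}
```

If the deque is not empty after the pass, the first pass saw more inline images than the token scan did, which is worth checking before writing the page back.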
Re: resize images (XObjects, Inline Images and stencils)
Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
There is at least one bug:
    if (image.isStencil()) {
        stencil = " (stencil)";
        bufferedImageType = BufferedImage.TYPE_BYTE_BINARY;
    }
    ************
    if (image.getColorSpace() == PDDeviceGray.INSTANCE){
        bufferedImageType = BufferedImage.TYPE_BYTE_GRAY;
    }
The problem is that stencils are of colorspace DeviceGray despite being
bitonal, so the second "if" overwrites the type chosen by the first. You
need to put an "else" at the "************" place so that stencils don't
end up with TYPE_BYTE_GRAY.
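A minimal sketch of the corrected selection, kept free of PDFBox so it runs on its own. The two boolean parameters stand in for image.isStencil() and image.getColorSpace() == PDDeviceGray.INSTANCE from the snippet above; the class and method names are made up for illustration, and TYPE_INT_RGB as the fallback is an assumption.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
final class ImageTypeChooser {

    // isStencil:    stands in for image.isStencil()
    // isDeviceGray: stands in for image.getColorSpace() == PDDeviceGray.INSTANCE
    static int bufferedImageType(boolean isStencil, boolean isDeviceGray) {
        if (isStencil) {
            // Stencils are bitonal, so this branch must win even though
            // the colorspace test below would also be true for them.
            return BufferedImage.TYPE_BYTE_BINARY;
        } else if (isDeviceGray) {
            return BufferedImage.TYPE_BYTE_GRAY;
        }
        // Assumed fallback for everything else: RGB without alpha.
        return BufferedImage.TYPE_INT_RGB;
    }
}
```

The order of the tests is the whole fix: a stencil satisfies both conditions, so the stencil check has to come first and the gray check has to sit behind the "else".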
And this:
    if (image.isStencil()) {
        log.warn("Is stencil; painting red.");
        bImage = image.getStencilImage(Color.red);
        // ↑ this is problematic...
        // it appears to paint the whole stencil
        // like it's filling the stencil's "holes" also
is definitely wrong: you are "applying" the stencil, which is what
PDFBox itself does when rendering. Just get the image itself and set the
stencil flag later.
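One way to keep a stencil a stencil while resizing is to treat it as an ordinary 1-bit image and scale it into another TYPE_BYTE_BINARY image, without painting any color into it. The sketch below does that with plain java.awt.image and nearest-neighbor sampling; BitonalResize and resizeBitonal are hypothetical names, and how the source image is obtained from PDFBox and how the stencil flag is set afterwards are deliberately left out.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
final class BitonalResize {

    // Scales any image into a new 1-bit image using nearest-neighbor sampling.
    static BufferedImage resizeBitonal(BufferedImage source, int newWidth, int newHeight) {
        BufferedImage target =
                new BufferedImage(newWidth, newHeight, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < newHeight; y++) {
            // Map each target row back to the source row it samples from.
            int sourceY = y * source.getHeight() / newHeight;
            for (int x = 0; x < newWidth; x++) {
                int sourceX = x * source.getWidth() / newWidth;
                // setRGB on a 1-bit image snaps the value to black or white.
                target.setRGB(x, y, source.getRGB(sourceX, sourceY));
            }
        }
        return target;
    }
}
```

Nearest-neighbor is chosen here only because it never produces intermediate gray values; a smoother filter would need a thresholding step to get back to two levels.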
This is just a "dry" analysis, I didn't run your code.
Tilman
On 27.03.2019 at 14:28, Matteo Gamboz wrote:
> Hi Tilman,
>
> thank you for the corrections (and sorry for the delay in answering, I
> was working on other projects...).
>
> I've applied your suggestions and I think they fixed most of my
> problems. Here is the code as I have it now:
> https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A
>
> One can call the script with some command line options to play with
> different resolutions.
>
> In the source of the java file, one can find how to call the script
> https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A?path=%2Fsrc%2Fmain%2Fjava%2Fit%2Fsissa%2Fmedialab%2Fpdfimages
>
> I don't have much experience with java, please be patient if I was
> supposed to "deploy" the code in some other way :-)
>
>
> As you pointed out, I still have some problems with stencils:
> my process transforms the stencil into an image, but sometimes the
> background of the stencil is black instead of white. You can see an
> example in the file screenshot.png. The screenshot refers to
> test-files/d.pdf.
>
> I will work on this and keep the list posted.
> m
>
>
> On Fri, 08 Mar 2019 15:59:39 +0100,
> Tilman Hausherr wrote:
>> InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();
>> ...
>> newBIoperator.setImageParameters(helper_img.getCOSObject());
>>
>>
>> Another thing you should fix yourself. If the image is a stencil,
>> create a TYPE_BYTE_BINARY BufferedImage (and later call
>> setStencil()). If the image is of colorspace DeviceGray, create a
>> TYPE_BYTE_GRAY BufferedImage. If not, then RGB, NOT ARGB.
>>
>> Please fix that, and also try to find a PDF that has an inline image
>> where something can be seen; the current one is white.
Re: resize images (XObjects, Inline Images and stencils)
Posted by Matteo Gamboz <ga...@medialab.sissa.it>.
Hi Tilman,
thank you for the corrections (and sorry for the delay in answering, I
was working on other projects...).
I've applied your suggestions and I think they fixed most of my
problems. Here is the code as I have it now:
https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A
One can call the script with some command line options to play with
different resolutions.
In the source of the java file, one can find how to call the script
https://medialab.sissa.it/nextcloud/index.php/s/FiWX3F7TJDbCT5A?path=%2Fsrc%2Fmain%2Fjava%2Fit%2Fsissa%2Fmedialab%2Fpdfimages
I don't have much experience with java, please be patient if I was
supposed to "deploy" the code in some other way :-)
As you pointed out, I still have some problems with stencils:
my process transforms the stencil into an image, but sometimes the
background of the stencil is black instead of white. You can see an
example in the file screenshot.png. The screenshot refers to
test-files/d.pdf.
I will work on this and keep the list posted.
m
On Fri, 08 Mar 2019 15:59:39 +0100,
Tilman Hausherr wrote:
>
> InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();
> ...
> newBIoperator.setImageParameters(helper_img.getCOSObject());
>
>
> Another thing you should fix yourself. If the image is a stencil,
> create a TYPE_BYTE_BINARY BufferedImage (and later call
> setStencil()). If the image is of colorspace DeviceGray, create a
> TYPE_BYTE_GRAY BufferedImage. If not, then RGB, NOT ARGB.
>
> Please fix that, and also try to find a PDF that has an inline image
> where something can be seen; the current one is white.
Re: resize inline images
Posted by Tilman Hausherr <TH...@t-online.de>.
On 05.03.2019 at 09:54, Matteo Gamboz wrote:
> I must yet fix the image parameters to correctly represent the new
> image data, but it seems feasible.
operator.getImageParameters():
COSDictionary{COSName{IM}:true;COSName{W}:COSInt{114};COSName{H}:COSInt{73};COSName{BPC}:COSInt{1};COSName{F}:COSName{CCF};COSName{DP}:COSDictionary{COSName{K}:COSInt{-1};COSName{Columns}:114;};}
helper_img:
COSDictionary{COSName{Length}:COSInt{23};COSName{Type}:COSName{XObject};COSName{Subtype}:COSName{Image};COSName{Filter}:COSName{FlateDecode};COSName{BitsPerComponent}:COSInt{8};COSName{Width}:COSInt{31};COSName{Height}:COSInt{20};COSName{ColorSpace}:COSName{DeviceRGB};COSName{DecodeParms}:COSDictionary{COSName{BitsPerComponent}:8;COSName{Predictor}:COSInt{15};COSName{Columns}:31;COSName{Colors}:COSInt{3};};COSName{SMask}:COSDictionary{COSName{Length}:COSInt{26};COSName{Type}:-1148588617;COSName{Subtype}:70760763;COSName{Filter}:1578622202;COSName{BitsPerComponent}:8;COSName{Width}:31;COSName{Height}:20;COSName{ColorSpace}:COSName{DeviceGray};}COSStream{-856922952};}COSStream{-84829761}
You are running this:
newBIoperator.setImageParameters(operator.getImageParameters());
So you're applying the parameters of the old image to the new image.
This is problematic (may be my fault!): the size isn't the same, the
filter isn't the same, and the bits per component aren't the same. There
is also the problem that your original image is a mask, so the new one
should be a mask too, i.e. a bitonal image, but it isn't.
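To see why the old parameters cannot describe the new data, it helps to compute how many bytes of unfiltered sample data each dictionary implies. The sketch assumes each row is padded to a whole number of bytes, as PDF image data is; InlineImageSize and unfilteredLength are hypothetical names, not PDFBox API.

```java
// Hypothetical helper, not part of PDFBox.
final class InlineImageSize {

    // Number of bytes of decoded sample data implied by the image parameters,
    // assuming every row is padded up to a byte boundary.
    static long unfilteredLength(int width, int height,
                                 int bitsPerComponent, int components) {
        long bitsPerRow = (long) width * bitsPerComponent * components;
        long bytesPerRow = (bitsPerRow + 7) / 8; // round up to whole bytes
        return bytesPerRow * height;
    }
}
```

With the values from the two dumps above, the old parameters (W 114, H 73, BPC 1, one component) imply 1095 bytes, while the helper image (31 x 20, 8 bits per component, DeviceRGB) implies 1860 bytes, so a reader following the old dictionary would interpret the new data wrongly.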
I made two changes:
//InputStream img_data_stream = helper_img.createInputStream();
InputStream img_data_stream = helper_img.getCOSObject().createRawInputStream();
This is because the raw stream is the data after the filter has been
applied, i.e. still encoded, which matches the Filter entry in the image
parameters.
...
//newBIoperator.setImageParameters(operator.getImageParameters());
newBIoperator.setImageParameters(helper_img.getCOSObject());
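The difference between the two streams in the first change can be illustrated with java.util.zip standing in for a Flate filter. In this sketch, encode() plays the role of what ends up in the raw stream and decode() what a decoding input stream would hand back; FlateRoundTrip and both method names are hypothetical, and PDFBox's own filter code is not involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, not part of PDFBox.
final class FlateRoundTrip {

    // Compresses sample data; the result corresponds to "raw" stream bytes.
    static byte[] encode(byte[] samples) {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(raw)) {
            deflater.write(samples);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.toByteArray();
    }

    // Decompresses raw bytes back into the original sample data.
    static byte[] decode(byte[] raw) {
        ByteArrayOutputStream samples = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(raw))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                samples.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return samples.toByteArray();
    }
}
```

If the parameters declare a filter, the bytes written next to them have to be the encode() side; writing the decode() side while still declaring the filter would make a reader try to decompress data that is not compressed.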
Another thing you should fix yourself. If the image is a stencil, create
a TYPE_BYTE_BINARY BufferedImage (and later call setStencil()). If the
image is of colorspace DeviceGray, create a TYPE_BYTE_GRAY
BufferedImage. If not, then RGB, NOT ARGB.
Please fix that, and also try to find a PDF that has an inline image
where something can be seen; the current one is white.
Once all of this is done, please upload your code and source PDF again.
Tilman
Re: resize inline images
Posted by Matteo Gamboz <ga...@medialab.sissa.it>.
sorry, here they are
https://medialab.sissa.it/nextcloud/index.php/s/Pdn5CYFcbjoAWsp
On Sun, 03 Mar 2019 15:14:22 +0100,
Tilman Hausherr wrote:
>
> There wasn't anything attached. You can't attach PDFs anyway, these
> must be uploaded to a sharehoster.
>
> Tilman