Posted to users@pdfbox.apache.org by Peter Pinnau <pe...@unterbrecher.de> on 2018/04/11 14:17:51 UTC

Get display size or resolution of PDImageXObject

I want to determine if the first page of a document contains a 
PDImageXObject which fills the whole page.

I found this example, which uses PDFStreamEngine:
https://www.tutorialkart.com/pdfbox/how-to-get-location-and-size-of-images-in-pdf/

My implementation works so far and finds the PDImageXObject:

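if( xobject instanceof PDImageXObject)
{
        PDImageXObject image = (PDImageXObject) xobject;

        // CTM in effect when the image is drawn; it maps the image's unit square to user space
        Matrix ctmNew = getGraphicsState().getCurrentTransformationMatrix();

        // pixel dimensions of the image
        System.out.println(image.getWidth() + " * " + image.getHeight());

        // displayed rectangle in user space (assumes no rotation or skew)
        imageDim = new PDRectangle();
        imageDim.setLowerLeftX(ctmNew.getTranslateX());
        imageDim.setLowerLeftY(ctmNew.getTranslateY());
        imageDim.setUpperRightX(ctmNew.getTranslateX() + ctmNew.getScalingFactorX());
        imageDim.setUpperRightY(ctmNew.getTranslateY() + ctmNew.getScalingFactorY());
}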


I want to compare the imageDim with the CropBox of the page. The CropBox 
for my A4 test doc is 595.2 x 841.92 as expected.

Since it is a 300 dpi scan, the pixel size of the image is 2480 * 3508.

My problem is that ctmNew.getScalingFactorX() and 
ctmNew.getScalingFactorY() both return 1.0.

How can I find out the display size of the image in user space units, or 
at least the resolution at which the image is rendered within the PDF?
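
For reference, here is a minimal sketch of the arithmetic involved, written in plain Java with no PDFBox dependency. In PDF, an image is painted into the unit square, and the CTM in effect at that moment maps the square to user space, so the displayed width and height are the lengths of the transformed x and y axes. The class name ImageDisplaySize, its method names, and the sample CTM values are hypothetical and chosen only for illustration; the sketch also assumes the default of 72 user space units per inch. A scaling factor of 1.0 would correspond to an image displayed at 1 x 1 units.

```java
// Plain-Java sketch. The values a, b, c, d are the upper-left entries of the
// CTM [a b 0; c d 0; e f 1] in effect when the image is drawn.
final class ImageDisplaySize {

    /** Displayed width in user space units: length of the image's x-axis after the CTM. */
    static double displayWidth(double a, double b) {
        return Math.hypot(a, b);
    }

    /** Displayed height in user space units: length of the image's y-axis after the CTM. */
    static double displayHeight(double c, double d) {
        return Math.hypot(c, d);
    }

    /** Effective resolution in dpi, assuming 72 user space units per inch. */
    static double dpi(int pixels, double displayLength) {
        return pixels * 72.0 / displayLength;
    }

    public static void main(String[] args) {
        // Hypothetical CTM for a full-page A4 scan: "595.2 0 0 841.92 0 0 cm"
        double w = displayWidth(595.2, 0);
        double h = displayHeight(0, 841.92);
        System.out.println("display size: " + w + " x " + h);
        System.out.println("resolution: " + dpi(2480, w) + " x " + dpi(3508, h) + " dpi");
    }
}
```

Using the axis lengths rather than the raw a and d entries keeps the result meaningful when the image is rotated by 90 degrees, where a and d are both zero.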


The final goal is a routine which guesses whether it is a scanned document 
(which may contain text data from OCR) or a purely digital PDF (not scanned).
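
One possible shape for such a guess, sketched in plain Java: compute how much of the CropBox the image rectangle covers and compare it with a threshold. The class FullPageImageCheck, its methods, and the 0.95 threshold are hypothetical and only illustrate the idea; the rectangles are assumed to be axis-aligned and given in user space units as lower-left corner plus width and height.

```java
// Plain-Java sketch of a coverage heuristic; no PDFBox dependency.
final class FullPageImageCheck {

    /** Fraction of the page area covered by the image rectangle, clipped to the page. */
    static double coverage(double imgX, double imgY, double imgW, double imgH,
                           double pageX, double pageY, double pageW, double pageH) {
        // Width and height of the intersection of the two rectangles
        double overlapW = Math.min(imgX + imgW, pageX + pageW) - Math.max(imgX, pageX);
        double overlapH = Math.min(imgY + imgH, pageY + pageH) - Math.max(imgY, pageY);
        if (overlapW <= 0 || overlapH <= 0) {
            return 0.0; // the image lies entirely outside the page
        }
        return (overlapW * overlapH) / (pageW * pageH);
    }

    /** Treats the page as probably scanned when the image covers at least the given fraction. */
    static boolean looksScanned(double coverage, double threshold) {
        return coverage >= threshold;
    }

    public static void main(String[] args) {
        // A4 CropBox of 595.2 x 841.92 with an image of the same displayed size
        double c = coverage(0, 0, 595.2, 841.92, 0, 0, 595.2, 841.92);
        System.out.println("scanned? " + looksScanned(c, 0.95)); // 0.95 is an arbitrary threshold
    }
}
```

A threshold below 1.0 leaves room for scans that are placed with a small margin or slightly cropped.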

Thanks

-- 
Best regards
Peter Pinnau

Diplom Wirtschaftsinformatiker (FH)
----------------------------------------------
http://www.unterbrecher.de - MZ ES 250 & mehr
----------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Get display size or resolution of PDImageXObject

Posted by Tilman Hausherr <TH...@t-online.de>.
https://stackoverflow.com/questions/5472711/dpi-of-image-extracted-from-pdf-with-pdfbox

If it doesn't work, please share your PDF

Tilman
