Posted to dev@pdfbox.apache.org by "susheel (Commented) (JIRA)" <ji...@apache.org> on 2011/11/14 11:18:52 UTC
[jira] [Commented] (PDFBOX-1169) Images extracted from PDF are
loosing color (are shown in blackcolor)
[ https://issues.apache.org/jira/browse/PDFBOX-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149529#comment-13149529 ]
susheel commented on PDFBOX-1169:
---------------------------------
Code used to extract the images:
private void processImages(PDResources resources, String destinationFolder) throws IOException {
    Map images = resources.getImages();
    if (images != null) {
        Iterator imageIter = images.keySet().iterator();
        while (imageIter.hasNext()) {
            String key = (String) imageIter.next();
            PDXObjectImage image = (PDXObjectImage) images.get(key);
            String name = destinationFolder + "image-" + imageCounter++ + "." + image.getSuffix();
            // image.write2file(name); - Tried image.write2file as well, but retrieved images were similar
            BufferedImage bufferedImage = image.getRGBImage();
            File outputfile = new File(name);
            ImageIO.write(bufferedImage, image.getSuffix(), outputfile);
            System.out.println("szaveri - using imageio to write files " + name + " suffix =" + image.getSuffix());
        }
    }
}
Please note that, of the 200-odd images in the PDF, only two were extracted correctly; all the others came out with a black background.
I am sure I am missing some configuration or other parameter, but I have been unable to find it.
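The report does not establish why the images come out black. One possibility, offered purely as an assumption, is that some images carry an alpha channel or mask, which formats such as JPEG cannot store and which often ends up rendered as black. A minimal sketch for testing that guess uses only standard java.awt and javax.imageio calls: it prints the image type and whether it has alpha, then paints the image onto an opaque white canvas before writing. The class and method names (ImageFlattenSketch, flattenOntoWhite, writeFlattened) are hypothetical and not part of PDFBox.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for diagnosis only; not part of PDFBox.
class ImageFlattenSketch {

    /** Returns an opaque RGB copy of the source, painted over a white background. */
    static BufferedImage flattenOntoWhite(BufferedImage source) {
        BufferedImage flat = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = flat.createGraphics();
        try {
            // Fill with white first so transparent pixels do not default to black.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, flat.getWidth(), flat.getHeight());
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return flat;
    }

    /** Logs basic facts about the image, then writes the flattened copy. */
    static void writeFlattened(BufferedImage source, String format, File target)
            throws IOException {
        System.out.println("type=" + source.getType()
                + " hasAlpha=" + source.getColorModel().hasAlpha());
        // ImageIO.write returns false when no writer exists for the format.
        if (!ImageIO.write(flattenOntoWhite(source), format, target)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
    }

    public static void main(String[] args) {
        // A fully transparent 2x2 image stands in for an extracted image with alpha.
        BufferedImage transparent = new BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB);
        BufferedImage flat = flattenOntoWhite(transparent);
        System.out.println(Integer.toHexString(flat.getRGB(0, 0)));
    }
}
```

In the loop above, the ImageIO.write call could be swapped for writeFlattened(bufferedImage, image.getSuffix(), outputfile). If the output is still black, transparency is probably not the cause and the logged image type may point elsewhere, for example to the color space.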
As an update, I have also added the following JAI jars to my project:
jai_codec
jai_core
mlibwrapper_jai
> Images extracted from PDF are loosing color (are shown in blackcolor)
> ---------------------------------------------------------------------
>
> Key: PDFBOX-1169
> URL: https://issues.apache.org/jira/browse/PDFBOX-1169
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.6.0
> Environment: Windows
> Reporter: susheel
> Attachments: eBook-Mini.pdf, image-1.jpg, image-2.jpg
>
>
> Using PDFBox, I tried to read a file (eBook-Mini.pdf, which is attached).
> When the images are extracted using the code mentioned below, they do not match the ones in the PDF; they have lost their color.
> I checked by extracting the images with other tools, and they were extracted correctly.
> The images extracted using PDFBox are attached as well.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira