
Posted to dev@pdfbox.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/04/11 09:11:00 UTC

[jira] [Closed] (PDFBOX-5140) Can't change PDF including some Chinese font to JPG correctly

     [ https://issues.apache.org/jira/browse/PDFBOX-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr closed PDFBOX-5140.
-----------------------------------
    Resolution: Duplicate

Closing as a duplicate of PDFBOX-3293. I've added you as a watcher there.

> Can't change PDF including some Chinese font to JPG correctly
> -------------------------------------------------------------
>
>                 Key: PDFBOX-5140
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5140
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.22
>         Environment: Windows 10
>            Reporter: Shigeru Okada
>            Priority: Major
>         Attachments: TC_DFKaiShuSB.pdf, TC_DFKaiShuSB_001.jpg, TC_MingLiu.pdf, TC_MingLiu_001.jpg, TC_PMingLiU.pdf, TC_PMingLiU_001.jpg
>
>
> I tried to convert a PDF file that uses Chinese fonts to JPG files.
> The source code is below.
> {code}
> 	private List<String> convertPdf2Jpg(File pdfFile) {
> 		List<String> jpgList = new ArrayList<String>();
> 		// try-with-resources closes the document exactly once on every path
> 		try (PDDocument document = PDDocument.load(pdfFile)) {
> 			PDFRenderer pdfRenderer = new PDFRenderer(document);
> 			// Strip only a trailing ".pdf"; split(".pdf") would treat "." as a regex wildcard
> 			String basePath = pdfFile.getPath().replaceFirst("(?i)\\.pdf$", "");
> 			for (int i = 0; i < document.getNumberOfPages(); i++) {
> 				try {
> 					// Render page i at 300 DPI and write it next to the source PDF
> 					BufferedImage image = pdfRenderer.renderImageWithDPI(i, 300, ImageType.RGB);
> 					String jpgName = basePath + "_" + String.format("%03d", i + 1) + ".jpg";
> 					ImageIOUtil.writeImage(image, jpgName, 300);
> 					jpgList.add(jpgName);
> 				}
> 				catch (Exception e) {
> 					LOG.error(pdfFile + "(" + i + " page) " + " can't convert pdf to jpg file (convertPdf2Jpg())." + e.toString());
> 					throwPdfBoxException("convertPdf2Jpg():" + pdfFile + "(" + i + " page) " + " can't convert pdf to jpg file." + e.toString());
> 				}
> 			}
> 		}
> 		catch (Exception e) {
> 			// Reached for load failures and for exceptions rethrown from the page loop
> 			LOG.error(pdfFile + ": can't load or convert PDF file (convertPdf2Jpg()). " + e.toString());
> 		}
> 		return jpgList;
> 	}
> {code}
> I attached example PDF and JPG files. The Chinese characters are rendered incorrectly.
> The problem seems to depend on the font.
> If you need more information, please let me know.
> Thanks
> //Okada
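
A side note on the output file name in the reporter's snippet, unrelated to the font problem: deriving the name by splitting the path on ".pdf" is fragile, because String.split takes a regular expression and an unescaped dot matches any character. The sketch below is only an illustration under that assumption. The class JpgNames, the helper jpgNameFor, and the sample path are hypothetical and do not come from the issue.

```java
public class JpgNames {
    // Hypothetical helper: build the per-page JPG name from the PDF path.
    static String jpgNameFor(String pdfPath, int pageIndex) {
        // Remove only a trailing ".pdf" (case-insensitive); the dot is escaped so it is literal.
        String base = pdfPath.replaceFirst("(?i)\\.pdf$", "");
        // Page numbers in the file name are one-based and zero-padded to three digits.
        return base + "_" + String.format("%03d", pageIndex + 1) + ".jpg";
    }

    public static void main(String[] args) {
        String path = "reports/mypdf.pdf"; // made-up sample path

        // The regex ".pdf" also matches "ypdf", so the base name is cut short.
        System.out.println(path.split(".pdf")[0]);   // prints "reports/m"

        // Anchoring on an escaped, trailing ".pdf" keeps the full base name.
        System.out.println(jpgNameFor(path, 0));     // prints "reports/mypdf_001.jpg"
    }
}
```

With the attached file names (for example TC_MingLiu.pdf) the original expression happens to work, so this does not explain the broken glyphs; it only matters for paths that contain "pdf" somewhere before the extension.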


