Posted to dev@tika.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/03/24 17:04:01 UTC

[jira] [Created] (TIKA-3337) Imaging classes inflexibility in PDFParser.checkInitialization

Tilman Hausherr created TIKA-3337:
-------------------------------------

             Summary: Imaging classes inflexibility in PDFParser.checkInitialization
                 Key: TIKA-3337
                 URL: https://issues.apache.org/jira/browse/TIKA-3337
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.25
            Reporter: Tilman Hausherr


While looking for something else, I came across this code in PDFParser.checkInitialization():
{code}

            StringBuilder sb = new StringBuilder();
            try {
                Class.forName("com.github.jaiimageio.impl.plugins.tiff.TIFFImageWriter");
            } catch (ClassNotFoundException e) {
                sb.append("TIFFImageWriter not loaded. tiff files will not be processed\n");
                sb.append("See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io\n");
                sb.append("for optional dependencies.\n");

            }

            try {
                Class.forName("com.github.jaiimageio.jpeg2000.impl.J2KImageReader");
            } catch (ClassNotFoundException e) {
                sb.append("J2KImageReader not loaded. JPEG2000 files will not be processed.\n");
                sb.append("See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io\n");
                sb.append("for optional dependencies.\n");
            }

{code}

This requires specific classes, i.e. it wouldn't work with TwelveMonkeys or the old JAI. Proposed change, which asks ImageIO whether any reader is registered for the format instead:

{code}

            if (!ImageIO.getImageReadersByFormatName("tif").hasNext()) {
                sb.append("TIFFImageWriter not loaded. tiff files will not be processed\n");
                sb.append("See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io\n");
                sb.append("for optional dependencies.\n");
            }

            if (!ImageIO.getImageReadersByFormatName("jpeg2000").hasNext()) {
                sb.append("J2KImageReader not loaded. JPEG2000 files will not be processed.\n");
                sb.append("See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io\n");
                sb.append("for optional dependencies.\n");
            }

{code}
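
For illustration only, the same lookup can be tried outside of Tika. The sketch below is not the PDFParser code: the class name ImageReaderCheck, the helper names hasReader and missingReaderWarnings, and the warning text are all hypothetical. It relies only on the standard javax.imageio.ImageIO.getImageReadersByFormatName call used in the proposal above.

```java
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of Tika or PDFBox.
class ImageReaderCheck {

    // True if any ImageIO plugin on the classpath registers a reader
    // under the given format name, regardless of which library provides it.
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    // Builds one warning line per format name that has no registered reader.
    // Returns an empty string when every requested format has a reader.
    static String missingReaderWarnings(String... formatNames) {
        StringBuilder sb = new StringBuilder();
        for (String name : formatNames) {
            if (!hasReader(name)) {
                sb.append("No ImageIO reader for '").append(name).append("'.\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output depends on the JDK version and on which imaging
        // plugins happen to be on the classpath.
        System.out.print(missingReaderWarnings("tif", "jpeg2000"));
    }
}
```

Because the check goes through the ImageIO registry, it should not matter which library supplies the reader, as long as that library registers itself under the queried format name.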




