Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/05 21:53:00 UTC

[jira] [Commented] (TIKA-2518) tika app outputs warnings by default

    [ https://issues.apache.org/jira/browse/TIKA-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279240#comment-16279240 ] 

Nick Burch commented on TIKA-2518:
----------------------------------

The warnings all report features that are unavailable because of missing libraries (ones that are unsuitably licensed or packaged) or missing native libraries/programs.

However, all of those warnings should go only to STDERR. If you check only STDOUT, you should get just the info you want, with no warnings.

What happens if you discard STDERR and check just STDOUT?
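
One way to try this from a POSIX shell is to redirect STDERR to /dev/null. The sketch below is only an illustration: fake_tika is a hypothetical stand-in for the tika-app command so that the snippet runs without the jar, and it assumes the warnings really are written to STDERR rather than STDOUT.

```shell
# Hypothetical stand-in for `java -jar tika-app-1.16.jar --list-detectors`:
# it writes one warning to STDERR and one result line to STDOUT.
fake_tika() {
  echo "WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored" >&2
  echo "org.apache.tika.detect.DefaultDetector (Composite Detector):"
}

# Discard STDERR (file descriptor 2) and capture only STDOUT.
clean_output=$(fake_tika 2>/dev/null)
echo "$clean_output"

# With the real jar, the same redirect would look like:
#   java -jar tika-app-1.16.jar --list-detectors 2>/dev/null
```

If the warnings still show up with the redirect in place, they are being written to STDOUT, and that would be worth reporting back on this issue.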

> tika app outputs warnings by default
> ------------------------------------
>
>                 Key: TIKA-2518
>                 URL: https://issues.apache.org/jira/browse/TIKA-2518
>             Project: Tika
>          Issue Type: Bug
>          Components: app
>    Affects Versions: 1.16
>            Reporter: Ryan Brueske
>
> Upon downloading the latest Tika and trying basic commands, it spews unwanted warnings, which makes it necessary to parse the output.
> Example 1:
> {code}
> java -jar tika-app-1.16.jar --list-detectors
> Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> TIFFImageWriter not loaded. tiff files will not be processed
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Dec 05, 2017 3:16:13 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> org.apache.tika.detect.DefaultDetector (Composite Detector):
>   org.apache.tika.parser.microsoft.POIFSContainerDetector
>   org.apache.tika.parser.pkg.ZipContainerDetector
>   org.gagravarr.tika.OggDetector
>   org.apache.tika.mime.MimeTypes
> {code}
> Example 2:
> {code}
> java -jar tika-app-1.16.jar --text my.xlsx
> Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> TIFFImageWriter not loaded. tiff files will not be processed
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Dec 05, 2017 3:00:22 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> INFO  As a convenience, TikaCLI has turned on extraction of
> inline images for the PDFParser (TIKA-2374).
> This is not the default option in Tika generally or in tika-server.
> As a convenience, TikaCLI has turned on extraction of
> inline images for the PDFParser (TIKA-2374).
> This is not the default option in Tika generally or in tika-server.
> {code}
> The expected behavior is to return only the requested information. I do not see a switch to turn off or control unrequested warnings. 
> I can't imagine this is the correct behavior. It is not documented, nor could I find why such output exists.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)