Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/05/12 17:42:00 UTC

[jira] [Resolved] (TIKA-3392) Apache Tika V1.26 doesn't work on Android anymore. Issue with org.xml dependencies.

     [ https://issues.apache.org/jira/browse/TIKA-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3392.
-------------------------------
    Fix Version/s: 1.27
       Resolution: Fixed

This should now be fixed.  What I can't remember is why I bothered with a pool of SAXParsers in the MimeTypesReader.  Shouldn't the mime types be read and parsed only once per JVM/Tika load?  Even if you call tika-app from the command line on every individual file, you still read the mime types just once per file (each call starts its own JVM), right?  In a multithreaded server environment, you should also read the mime types only once.
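
For illustration only, and not necessarily the change that went into 1.27: one way to survive a parser that rejects the secure-processing feature (as the Android/Harmony SAXParserFactoryImpl in the trace below does) is to treat the feature as optional and carry on when it is refused.  The class and method names here (TolerantSaxFactory, trySetFeature, newFactory) are hypothetical and do not exist in Tika.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical helper for illustration; this is NOT Tika's actual code.
final class TolerantSaxFactory {

    // Tries to set a feature and reports whether the parser accepted it,
    // instead of letting the exception escape (e.g. from a static initializer).
    static boolean trySetFeature(SAXParserFactory factory, String feature, boolean value) {
        try {
            factory.setFeature(feature, value);
            return true;
        } catch (SAXNotRecognizedException
                | SAXNotSupportedException
                | ParserConfigurationException e) {
            // The parser does not know or support this feature; continue without it.
            return false;
        }
    }

    // Builds a namespace-aware factory and enables secure processing when available.
    static SAXParserFactory newFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        trySetFeature(factory, XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    public static void main(String[] args) {
        SAXParserFactory factory = newFactory();
        System.out.println("Created factory: " + factory.getClass().getName());
    }
}
```

Whether silently skipping the feature is acceptable depends on how much the caller trusts the XML being parsed; for the bundled mime types file that seems like a reasonable trade-off, but that is my assumption.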

Onwards...

> Apache Tika V1.26 doesn't work on Android anymore. Issue with org.xml dependencies.
> -----------------------------------------------------------------------------------
>
>                 Key: TIKA-3392
>                 URL: https://issues.apache.org/jira/browse/TIKA-3392
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.26
>         Environment: Android 11
>            Reporter: Andrei Dobrescu
>            Priority: Major
>              Labels: android
>             Fix For: 1.27
>
>         Attachments: image-2021-05-11-17-53-58-291.png, image-2021-05-11-18-10-40-949.png, image-2021-05-11-18-12-15-300.png
>
>
> I use Apache Tika on Android to detect the MIME type of various files.
> Apache Tika V1.10 works fine on Android:
> {code:java}
> implementation 'org.apache.tika:tika-core:1.10'
> {code}
> {code:java}
> val mimeType = file.inputStream().buffered().use { inputStream ->
>     AutoDetectParser().detector.detect(inputStream, Metadata()).toString()
> }
> {code}
> However, Tika V1.26 will crash when trying to detect the mime type:
> {code:java}
> implementation 'org.apache.tika:tika-core:1.26'
> {code}
> {noformat}
> java.lang.ExceptionInInitializerError
>     at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:69)
>     at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:100)
>     at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:189)
>     at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:604)
>     at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:83)
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:257)
>     at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:422)
>     at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:55){noformat}
> {noformat}
> CAUSE:
> java.lang.RuntimeException: problem initializing SAXParser pool
>         at org.apache.tika.mime.MimeTypesReader.<clinit>(MimeTypesReader.java:119)
>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:69)
>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:100)
>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:189)
>         at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:604)
>         at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:83)
>         at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:257)
>         at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:422)
>         at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:55){noformat}
> {noformat}
> CAUSE OF CAUSE:
>  org.apache.tika.exception.TikaException: problem creating SAX parser factory
>      at org.apache.tika.mime.MimeTypesReader.newSAXParser(MimeTypesReader.java:433)
>      at org.apache.tika.mime.MimeTypesReader.setPoolSize(MimeTypesReader.java:417)
>      at org.apache.tika.mime.MimeTypesReader.<clinit>(MimeTypesReader.java:117)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:69)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:100)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:189)
>      at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:604)
>      at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:83)
>      at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:257)
>      at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:422)
>      at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:55){noformat}
> {noformat}
> CAUSE OF CAUSE OF CAUSE:
> org.xml.sax.SAXNotRecognizedException: http://javax.xml.XMLConstants/feature/secure-processing
>      at org.apache.harmony.xml.parsers.SAXParserFactoryImpl.setFeature(SAXParserFactoryImpl.java:93)
>      at org.apache.tika.mime.MimeTypesReader.newSAXParser(MimeTypesReader.java:429)
>      at org.apache.tika.mime.MimeTypesReader.setPoolSize(MimeTypesReader.java:417)
>      at org.apache.tika.mime.MimeTypesReader.<clinit>(MimeTypesReader.java:117)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:69)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:100)
>      at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:189)
>      at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:604)
>      at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:83)
>      at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:257)
>      at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:422)
>      at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:55)
> {noformat}
>  
>  


