You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tika.apache.org by GitBox <gi...@apache.org> on 2021/05/26 15:31:08 UTC

[GitHub] [tika] simon-shlomo opened a new pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

simon-shlomo opened a new pull request #446:
URL: https://github.com/apache/tika/pull/446


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for your contribution to [Apache Tika](https://tika.apache.org/)! Your help is appreciated!
   
   Before opening the pull request, please verify that
   * there is an open issue on the [Tika issue tracker](https://issues.apache.org/jira/projects/TIKA) which describes the problem or the improvement. We cannot accept pull requests without an issue because the change wouldn't be listed in the release notes.
   * the issue ID (`TIKA-XXXX`)
     - is referenced in the title of the pull request
     - and placed in front of your commit messages surrounded by square brackets (`[TIKA-XXXX] Issue or pull request title`)
   * commits are squashed into a single one (or few commits for larger changes)
   * Tika is successfully built and unit tests pass by running `mvn clean test`
   * there should be no conflicts when merging the pull request branch into the *recent* `main` branch. If there are conflicts, please try to rebase the pull request branch on top of a freshly pulled `main` branch.
   
   We will be able to faster integrate your pull request if these conditions are met. If you have any questions how to fix your problem or about using Tika in general, please sign up for the [Tika mailing list](http://tika.apache.org/mail-lists.html). Thanks!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tika] tballison closed pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

Posted by GitBox <gi...@apache.org>.
tballison closed pull request #446:
URL: https://github.com/apache/tika/pull/446


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tika.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tika] tballison commented on a change in pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

Posted by GitBox <gi...@apache.org>.
tballison commented on a change in pull request #446:
URL: https://github.com/apache/tika/pull/446#discussion_r663133878



##########
File path: tika-core/src/main/java/org/apache/tika/config/TikaActivator.java
##########
@@ -51,14 +53,17 @@ public void start(final BundleContext context) throws Exception {
 
         detectorTracker = new ServiceTracker(context, Detector.class.getName(), this);
         parserTracker = new ServiceTracker(context, Parser.class.getName(), this);
+        zipDetectorTracker = new ServiceTracker(context, "org.apache.tika.detect.zip.ZipContainerDetector", this);

Review comment:
       Do we need to do this for EncodingDetector as well to ensure that all encoding detectors are dynamically loaded?  Obv, separate pull request...but I'm curious if that is failing in OSGi.

##########
File path: tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-zip-commons/src/main/java/org/apache/tika/detect/zip/DefaultZipContainerDetector.java
##########
@@ -262,12 +266,21 @@ private MediaType detect(ZipArchiveEntry zae, ZipArchiveInputStream zis,
     }
 
     private MediaType finalDetect(StreamingDetectContext detectContext) {
-        for (ZipContainerDetector d : zipDetectors) {
+        for (ZipContainerDetector d : getDetectors()) {
             MediaType mt = d.streamingDetectFinal(detectContext);
             if (mt != null) {
                 return mt;
             }
         }
         return MediaType.APPLICATION_ZIP;
     }
+    
+    private List<ZipContainerDetector> getDetectors() {
+        if (loader != null) {
+            List<ZipContainerDetector> zipDetectors = new ArrayList<>(staticZipDetectors);

Review comment:
       In DefaultDetector, the order is dynamic detectors and then we add the statically loaded detectors with `super.getDetectors()`.  Here the order is reversed.  Why the difference, will it make any difference?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tika.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tika] simon-shlomo commented on pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

Posted by GitBox <gi...@apache.org>.
simon-shlomo commented on pull request #446:
URL: https://github.com/apache/tika/pull/446#issuecomment-859580041


   Hi
   Could someone please approve the "workflow"?
   Have a great weekend
   Simon


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tika] tballison commented on a change in pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

Posted by GitBox <gi...@apache.org>.
tballison commented on a change in pull request #446:
URL: https://github.com/apache/tika/pull/446#discussion_r663130515



##########
File path: tika-core/src/main/java/org/apache/tika/config/ServiceLoader.java
##########
@@ -76,6 +76,11 @@ public ServiceLoader(ClassLoader loader, LoadErrorHandler handler) {
         this(loader, handler, false);
     }
 
+    public ServiceLoader(ClassLoader loader, boolean dynamic) {
+        this(loader, Boolean.getBoolean("org.apache.tika.service.error.warn") ? LoadErrorHandler.WARN :
+                        LoadErrorHandler.IGNORE, true);

Review comment:
       Should `true` be changed to `dynamic`?  Do we need this constructor?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tika.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tika] tballison commented on pull request #446: [TIKA-3418] DefaultZipContainerDetector does not support loading of ZipContainerDetectors in an OSGi enviroment

Posted by GitBox <gi...@apache.org>.
tballison commented on pull request #446:
URL: https://github.com/apache/tika/pull/446#issuecomment-1023626019


   I merged this and made a few modifications.  I need to figure out how to do a better job of passing tikaconfig info into DefaultZipDetector.
   
   Thank you @simon-shlomo for opening this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@tika.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org