Posted to dev@sling.apache.org by "Anjan (JIRA)" <ji...@apache.org> on 2013/06/21 14:18:20 UTC

[jira] [Created] (SLING-2924) Full text extraction issue with Tika v1.0 under OSGi environment

Anjan created SLING-2924:
----------------------------

             Summary: Full text extraction issue with Tika v1.0 under OSGi environment
                 Key: SLING-2924
                 URL: https://issues.apache.org/jira/browse/SLING-2924
             Project: Sling
          Issue Type: Bug
          Components: JCR
            Reporter: Anjan


The latest stable build of Sling (I checked out revision 1487628) uses Jackrabbit version 2.4.2, with Tika version 1.0 for extracting metadata and text for indexing purposes.  Jackrabbit v2.4.2 deployed as a separate web application extracts metadata and text from uploaded documents without problems, but when it is deployed in Sling (an OSGi environment), full text extraction doesn't work.

Updating the Tika dependency in Sling to version 1.2 resolved this issue.

Secondly, if the indexes are deleted from the repository and the server is restarted, the indexes are not rebuilt for the existing documents.  The Tika bundles are not yet ready when Jackrabbit starts rebuilding the indexes during Sling server startup.  Changing the start level of the Tika bundles from 15 to 10 helps to resolve this issue.

The changes for both fixes are in the <sling>/launchpad/builder/src/main/bundles/list.xml file.

Currently the Tika bundles are at start level 15, as shown below:

<startLevel level="15">
    ..........
    <bundle>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-core</artifactId>
        <version>1.0</version>
    </bundle>
    <bundle>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-bundle</artifactId>
        <version>1.0</version>
    </bundle>
    ..........
</startLevel>

I moved the above bundles to start level 10 and changed their version to 1.2:

<startLevel level="10">
    ..........
    <bundle>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-core</artifactId>
        <version>1.2</version>
    </bundle>
    <bundle>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-bundle</artifactId>
        <version>1.2</version>
    </bundle>
    ..........
</startLevel>
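
As a quick sanity check after editing list.xml, something like the following sketch could report which start level each Tika bundle ends up in. It is only an illustration: the class name TikaStartLevelCheck, the method tikaStartLevels, the enclosing <bundles> root element, and the inline sample XML are assumptions made for this example and are not taken from the Sling code base. It uses only the JDK's built-in XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Sling: lists the start level of every
// org.apache.tika bundle found in a list.xml-style document.
final class TikaStartLevelCheck {

    // Returns "artifactId:version" -> value of the enclosing startLevel's
    // "level" attribute. The level is kept as a string so that non-numeric
    // level names do not break parsing.
    static Map<String, String> tikaStartLevels(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> result = new LinkedHashMap<>();
        NodeList levels = doc.getElementsByTagName("startLevel");
        for (int i = 0; i < levels.getLength(); i++) {
            Element level = (Element) levels.item(i);
            NodeList bundles = level.getElementsByTagName("bundle");
            for (int j = 0; j < bundles.getLength(); j++) {
                Element bundle = (Element) bundles.item(j);
                if ("org.apache.tika".equals(text(bundle, "groupId"))) {
                    String key = text(bundle, "artifactId") + ":" + text(bundle, "version");
                    result.put(key, level.getAttribute("level"));
                }
            }
        }
        return result;
    }

    // Text of the first descendant element with the given tag, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Sample input mirroring the modified fragment above; the <bundles>
        // wrapper is an assumption made so the fragment is a complete document.
        String xml =
              "<bundles><startLevel level=\"10\">"
            + "<bundle><groupId>org.apache.tika</groupId>"
            + "<artifactId>tika-core</artifactId><version>1.2</version></bundle>"
            + "<bundle><groupId>org.apache.tika</groupId>"
            + "<artifactId>tika-bundle</artifactId><version>1.2</version></bundle>"
            + "</startLevel></bundles>";
        tikaStartLevels(xml).forEach((bundle, level) ->
                System.out.println(bundle + " -> start level " + level));
    }
}
```

This only inspects the bundle list; it does not confirm that the Tika bundles are actually active before Jackrabbit starts re-indexing at runtime.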

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira