Posted to users@jackrabbit.apache.org by sa...@oocl.com on 2014/01/23 02:00:25 UTC

FW: Help needed - Text search not working in Jackrabbit 2.6.5


From: SARA KRISHNAN (IRIS-ISD-OOCLL/SNT)
Sent: Wednesday, January 22, 2014 4:40 PM
To: 'user@jackrabbit.org'; 'jukka@apache.org'
Cc: ORLANDO PALIS (IRIS-ISD-OOCLL/SNT)
Subject: Help needed - Text search not working in Jackrabbit 2.6.5

Hello,

We have been working with Jackrabbit for a year now, starting with version 2.4.3, and are now in the process of upgrading to the latest stable version, 2.6.5. Since moving to this version, we have not been able to get the content search function to work. Some background on our setup and requirements follows.

We reference the Jackrabbit jars in WebLogic 10.3.6 through the server classpath setting shown below, alongside our application. Users need to be able to search the content of the HTML files stored in the repository using JCR-SQL2.


1.       CLASSPATH=D:\oracle\MIDDLE~1\modules\javax.persistence_1.1.0.0_2-0.jar;D:\oracle\MIDDLE~1\modules\com.oracle.jpa2support_1.0.0.0_2-0.jar;D:\oracle\MIDDLE~1\patch_wls1036\profiles\default\sys_manifest_classpath\weblogic_patch.jar;D:\oracle\MIDDLE~1\patch_ocp371\profiles\default\sys_manifest_classpath\weblogic_patch.jar;D:\Oracle\Middleware\coherence_3.7.0.2\eclipselink.jar;D:\Oracle\Middleware\coherence_3.7.0.2\coherence.jar;D:\Oracle\Middleware\coherence_3.7.0.2\toplink-grid.jar;D:\oracle\MIDDLE~1\JROCKI~1.1-3\lib\tools.jar;D:\oracle\MIDDLE~1\WLSERV~1.3\server\lib\weblogic_sp.jar;D:\oracle\MIDDLE~1\WLSERV~1.3\server\lib\weblogic.jar;D:\oracle\MIDDLE~1\modules\features\weblogic.server.modules_10.3.6.0.jar;D:\oracle\MIDDLE~1\WLSERV~1.3\server\lib\webservices.jar;D:\oracle\MIDDLE~1\modules\ORGAPA~1.1/lib/ant-all.jar;D:\oracle\MIDDLE~1\modules\NETSFA~1.0_1/lib/ant-contrib.jar;D:\oracle\MIDDLE~1\WLSERV~1.3\common\derby\lib\derbyclient.jar;D:\oracle\MIDDLE~1\WLSERV~1.3\server\lib\xqrl.jar;D:/Oracle/Middleware/modules/javax.persistence_1.1.0.0_2-0.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-api-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-core-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-jcr-commons-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-jcr-rmi-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-jcr-server-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-jcr-servlet-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-spi-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-spi-commons-2.6.5.jar;D:/oracle/Middleware/jackrabbit/jackrabbit-webdav-2.6.5.jar;D:/oracle/Middleware/jackrabbit/commons-io-2.2.jar;D:/oracle/Middleware/jackrabbit/concurrent-1.3.4.jar;D:/oracle/Middleware/jackrabbit/jcr-2.0.jar;D:/oracle/Middleware/jackrabbit/lucene-core-3.6.0.jar;D:/oracle/Middleware/jackrabbit/slf4j-api-1.6.4.jar;D:/oracle/Middleware/jackrabbit/tika-core-1.3.jar;D:/oracle/Middleware/jackrabbit/tika-parsers-1.3.jar;D:/oracle/Middleware/jackrabbit/tagsoup-1.2.1.jar;%CLASSPATH%;

2.       Attached repository.xml, with the workspace section changed to load an externalized tika-config.xml

3.       Attached tika-config.xml with the HTMLParser configuration.

4.       JCR-SQL2 query used to search the content

SELECT rt.*, file.*, resource.* FROM [rt:RuleTariff] AS rt INNER JOIN [rt:file] AS file ON ISCHILDNODE(file, rt) INNER JOIN [nt:resource] AS resource ON ISCHILDNODE(resource, file) WHERE resource.[jcr:mimeType] = 'text/html' AND CONTAINS(resource.*, 'two')  //'two' is the text being searched
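
For reference, a minimal sketch of how a statement like the one above could be assembled in Java with the search term passed in as a parameter. The class and method names (RuleTariffQuery, build) are hypothetical and not taken from our application, and the sketch assumes that doubling single quotes is enough to keep the term inside the string literal; it does not handle the special characters of the full-text search syntax itself.

```java
public class RuleTariffQuery {

    // Hypothetical helper: builds the JCR-SQL2 statement for a given search term.
    static String build(String term) {
        // Double any single quote so the term cannot terminate the string literal early.
        String escaped = term.replace("'", "''");
        return "SELECT rt.*, file.*, resource.* FROM [rt:RuleTariff] AS rt"
                + " INNER JOIN [rt:file] AS file ON ISCHILDNODE(file, rt)"
                + " INNER JOIN [nt:resource] AS resource ON ISCHILDNODE(resource, file)"
                + " WHERE resource.[jcr:mimeType] = 'text/html'"
                + " AND CONTAINS(resource.*, '" + escaped + "')";
    }

    public static void main(String[] args) {
        // Print the statement for the term used in this report.
        System.out.println(build("two"));
    }
}
```

The resulting string would then be handed to QueryManager.createQuery(statement, Query.JCR_SQL2). The trailing //'two' remark in the statement above is only an annotation for this email and is not part of the query.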


The content search works correctly with the SQL above when we switch back to Jackrabbit 2.4.3, which uses Lucene.

1. Are we missing any other configuration?
2. Can we use Lucene for full-text search in 2.6.5 instead of Tika? If yes, how should this be configured?
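
In case it helps narrow down the first question, one check that could be run is whether the search and parser classes are actually visible to the class loader. The following is only a sketch: the class ClasspathCheck and its isLoadable method are hypothetical, the three class names are the ones we would expect from jackrabbit-core, tika-parsers and tagsoup, and the check would have to run inside the server (for example from a servlet) to reflect the WebLogic class loader rather than a standalone JVM.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized; only its presence is checked.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its dependencies could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.jackrabbit.core.query.lucene.SearchIndex",
            "org.apache.tika.parser.html.HtmlParser",
            "org.ccil.cowan.tagsoup.Parser"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```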

Thanks in advance.
-Sara

