Posted to users@jackrabbit.apache.org by Jukka Zitting <ju...@gmail.com> on 2009/02/14 13:22:12 UTC
Re: Is jackrabbit textFilterClasses able to handle office 2007 documents.
Hi,
On Sat, Feb 14, 2009 at 12:59 PM, Akil Ali <Ak...@cognizant.com> wrote:
> I can see that there are a number of filters available in the latest version.
> But will it be able to extract the contents of Office 2007 documents? Has
> anyone tested indexing the contents of Office 2007 documents?
See JCR-1887 [1] for a patch that adds support for indexing Office
2007 documents.
Alternatively, the latest trunk of Apache Tika [2] also supports
Office 2007, and the jackrabbit-tika sandbox component [3]
allows you to set up Tika as a text extractor in Jackrabbit.
We will most likely have Office 2007 support built in when Jackrabbit
1.6 is released.
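For background on why Office 2007 files need their own extractor, here is a
minimal sketch using only the JDK. It is not taken from the JCR-1887 patch or
from Tika, and the class name DocxTextSketch and its methods are hypothetical.
It relies on the fact that a .docx file is a ZIP package whose main text lives
in the word/document.xml part, and it ignores headers, footnotes, and the
other Office 2007 formats (.xlsx, .pptx).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration class; not part of Jackrabbit or Tika.
final class DocxTextSketch {

    // Name of the main document part inside a .docx ZIP package.
    private static final String MAIN_PART = "word/document.xml";

    // WordprocessingML namespace used by the w:p and w:t elements.
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of the main document part, or "" if the part is missing. */
    static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!MAIN_PART.equals(e.getName())) {
                    continue;
                }
                // Buffer the entry so the XML parser never touches the ZIP stream.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                for (int n = zip.read(chunk); n != -1; n = zip.read(chunk)) {
                    buf.write(chunk, 0, n);
                }
                return textOf(buf.toByteArray());
            }
        }
        return "";
    }

    // Concatenates the w:t text runs of each w:p paragraph, one paragraph per line.
    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot pull in external entities.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        StringBuilder out = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                out.append(runs.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DocxTextSketch some-file.docx
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.print(extractText(in));
        }
    }
}
```

The older binary Office formats cannot be read this way, which is why the
existing filters do not cover Office 2007 files. For real indexing, the patch
or the Tika-based extractor mentioned above is the better route.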
[1] https://issues.apache.org/jira/browse/JCR-1887
[2] http://lucene.apache.org/tika/
[3] http://svn.apache.org/repos/asf/jackrabbit/sandbox/jackrabbit-tika/
BR,
Jukka Zitting