Posted to users@jackrabbit.apache.org by Marc Speck <ma...@gmail.com> on 2010/02/24 16:15:12 UTC
SearchIndex.textFilterClasses obsolete with JR2?, typo in JackrabbitParser
Hi,
1. Does the use of Tika replace the configuration of
SearchIndex.textFilterClasses? If I read the code correctly, all formats in
tika-config.xml are configured by default. So SearchIndex.textFilterClasses
is only used for types not declared in tika-config.xml, right?
2. According to [a] (and my own repository.xml ;), the extractor class is called
org.apache.jackrabbit.extractor.MsPowerPointTextExtractor, not
org.apache.jackrabbit.extractor.MsPowerPointExtractor as JackrabbitParser has it.
Marc
[a] http://jackrabbit.apache.org/jackrabbit-text-extractors.html
Index: src/main/java/org/apache/jackrabbit/core/query/lucene/JackrabbitParser.java
===================================================================
--- src/main/java/org/apache/jackrabbit/core/query/lucene/JackrabbitParser.java (revision 915798)
+++ src/main/java/org/apache/jackrabbit/core/query/lucene/JackrabbitParser.java (working copy)
@@ -114,7 +114,7 @@
                 "org.apache.jackrabbit.extractor.MsOutlookTextExtractor")) {
             parsers.put("application/vnd.ms-outlook", new OfficeParser());
         } else if (name.equals(
-                "org.apache.jackrabbit.extractor.MsPowerPointExtractor")) {
+                "org.apache.jackrabbit.extractor.MsPowerPointTextExtractor")) {
             Parser parser = new OfficeParser();
             parsers.put("application/vnd.ms-powerpoint", parser);
             parsers.put("application/mspowerpoint", parser);
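
To illustrate why the name matters, here is a minimal, self-contained sketch of the string comparison in the patched branch. It does not use Jackrabbit or Tika: the class ExtractorNameCheck and the method registeredTypes are hypothetical stand-ins, and only the two class-name strings and the two MIME types are taken from the patch above.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractorNameCheck {
    // Name currently compared against in JackrabbitParser (per the patch above).
    static final String TYPO =
            "org.apache.jackrabbit.extractor.MsPowerPointExtractor";
    // Name documented in [a] and used in repository.xml.
    static final String DOCUMENTED =
            "org.apache.jackrabbit.extractor.MsPowerPointTextExtractor";

    // Hypothetical stand-in for the branch in the patch: returns the MIME
    // types that would be registered when the configured class name equals
    // the name the code compares against, and an empty list otherwise.
    static List<String> registeredTypes(String configuredName, String expectedName) {
        List<String> types = new ArrayList<>();
        if (configuredName.equals(expectedName)) {
            types.add("application/vnd.ms-powerpoint");
            types.add("application/mspowerpoint");
        }
        return types;
    }

    public static void main(String[] args) {
        // Comparing against the misspelled name: nothing is registered.
        System.out.println(registeredTypes(DOCUMENTED, TYPO));
        // Comparing against the documented name: both MIME types are registered.
        System.out.println(registeredTypes(DOCUMENTED, DOCUMENTED));
    }
}
```

If the real code behaves like this sketch, a repository.xml that lists the documented class name would silently get no PowerPoint parser from this branch until the string is corrected.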