Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2015/01/31 09:34:34 UTC
[jira] [Created] (OAK-2468) Index binary only if some Tika parser
can support the binary's mimeType
Chetan Mehrotra created OAK-2468:
------------------------------------
Summary: Index binary only if some Tika parser can support the binary's mimeType
Key: OAK-2468
URL: https://issues.apache.org/jira/browse/OAK-2468
Project: Jackrabbit Oak
Issue Type: Improvement
Components: oak-lucene
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
Fix For: 1.2
Currently all binaries are passed to Tika for text extraction. However, Tika can only parse binaries for which a supporting parser is present. The extraction logic should therefore parse a binary only if its mimeType is supported by Tika.
With this change, {{jcr:mimeType}} would become a mandatory property.
JR2 had a similar check [1].
[1] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/NodeIndexer.java#L932
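The proposed check could look roughly like the sketch below. It is not the Oak or JR2 implementation: the class name BinaryIndexFilter and the method shouldExtract are hypothetical, and the set of supported types is passed in as plain strings so the example stays self-contained. It also assumes that a binary without a mimeType is simply skipped, which is one possible reading of "mandatory" above.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Oak: decides whether a binary
// should be handed to the text extractor at all.
final class BinaryIndexFilter {

    // Base media types (e.g. "application/pdf") that some parser supports.
    private final Set<String> supportedTypes = new HashSet<>();

    BinaryIndexFilter(Set<String> supportedTypes) {
        for (String type : supportedTypes) {
            this.supportedTypes.add(type.trim().toLowerCase(Locale.ROOT));
        }
    }

    // Returns true only if the mimeType is present and supported.
    // Assumption: a missing or empty mimeType means "do not extract".
    boolean shouldExtract(String mimeType) {
        if (mimeType == null || mimeType.trim().isEmpty()) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before the lookup.
        String base = mimeType;
        int semicolon = base.indexOf(';');
        if (semicolon >= 0) {
            base = base.substring(0, semicolon);
        }
        base = base.trim().toLowerCase(Locale.ROOT);
        return supportedTypes.contains(base);
    }
}
```

In a real integration the set would presumably be built once from the configured Tika parser, for example from the media types returned by Parser.getSupportedTypes(ParseContext), rather than hard-coded.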
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)