Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2015/01/30 13:54:34 UTC
[jira] [Commented] (OAK-2463) Provide support for providing custom Tika config
[ https://issues.apache.org/jira/browse/OAK-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298593#comment-14298593 ]
Chetan Mehrotra commented on OAK-2463:
--------------------------------------
Applied the patch to trunk for now in http://svn.apache.org/r1655996. Once the review is done, I will merge it to the branch.
A custom Tika config XML can now be provided as part of the index definition node by creating an {{nt:file}} node named {{tikaConfig}} under the index definition:
{noformat}
/oak:index/assetType
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - compatVersion = 2
  - type = "lucene"
  - async = "async"
  + tikaConfig (nt:file)
    + jcr:content
      - jcr:data = //config xml binary content
  + indexRules
{noformat}
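As a rough illustration of what the {{jcr:data}} binary might hold, the sketch below builds a small Tika config as bytes and checks that it parses as XML before it would be stored. The class name {{TikaConfigSketch}}, its methods, and the sample config are hypothetical and not part of the patch. The sample intends to skip text extraction for PDFs by mapping them to {{EmptyParser}}; whether this exact XML has that effect depends on the Tika version in use, so treat it as an assumption to verify.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class TikaConfigSketch {

    // Hypothetical sample config (an assumption, not taken from the patch):
    // default parsing for everything, EmptyParser for application/pdf.
    static final String CONFIG_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<properties>\n"
      + "  <parsers>\n"
      + "    <parser class=\"org.apache.tika.parser.DefaultParser\"/>\n"
      + "    <parser class=\"org.apache.tika.parser.EmptyParser\">\n"
      + "      <mime>application/pdf</mime>\n"
      + "    </parser>\n"
      + "  </parsers>\n"
      + "</properties>\n";

    /** Returns the bytes intended for tikaConfig/jcr:content/jcr:data. */
    static byte[] configBytes() {
        return CONFIG_XML.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Parses the bytes as XML and counts the parser elements.
     * Throws if the XML is not well-formed.
     */
    static int countParsers(byte[] xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml));
        NodeList parsers = doc.getElementsByTagName("parser");
        return parsers.getLength();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = configBytes();
        System.out.println("parser entries: " + countParsers(data));
    }
}
```

The resulting bytes would then be written as the {{jcr:data}} property of the {{jcr:content}} child of {{tikaConfig}}, using whatever repository API or content package tooling is already in place.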
> Provide support for providing custom Tika config
> ------------------------------------------------
>
> Key: OAK-2463
> URL: https://issues.apache.org/jira/browse/OAK-2463
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: oak-lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.1.6, 1.0.12
>
> Attachments: OAK-2463.patch
>
>
> Currently Oak Lucene uses the default Tika config when extracting text content from binary properties. To provide better control, the Tika config should be made configurable.