Posted to dev@daffodil.apache.org by "Michael Beckerle (JIRA)" <ji...@apache.org> on 2018/10/23 16:45:00 UTC
[jira] [Closed] (DAFFODIL-1710) Apache Tika integration
[ https://issues.apache.org/jira/browse/DAFFODIL-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Beckerle closed DAFFODIL-1710.
--------------------------------------
Resolution: Not A Problem
Not a Daffodil bug, improvement, or feature. JIRA tickets are not for keeping track of "good ideas".
> Apache Tika integration
> -----------------------
>
> Key: DAFFODIL-1710
> URL: https://issues.apache.org/jira/browse/DAFFODIL-1710
> Project: Daffodil
> Issue Type: New Feature
> Components: API, Integrations
> Reporter: Michael Beckerle
> Priority: Major
>
> Daffodil's parser could be wrapped behind the Apache Tika APIs, allowing any DFDL-described format to be mined for text content in the Tika way.
> This would probably need to be schema-aware, so that Tika events are reported only for text content and not for numeric content.
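
As a rough illustration of the schema-aware filtering described in the issue, the sketch below keeps only string-typed leaf values from a parsed infoset and drops numeric ones. It uses neither the Daffodil nor the Tika API; the `InfosetLeaf` shape, the `TEXT_TYPES` set, and the `text_events` helper are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Assumption: only xs:string counts as "text" for extraction purposes.
TEXT_TYPES = frozenset({"xs:string"})


@dataclass(frozen=True)
class InfosetLeaf:
    """A simple-typed element from a parsed infoset (hypothetical shape)."""
    name: str
    xsd_type: str
    value: str


def text_events(leaves: Iterable[InfosetLeaf]) -> Iterator[str]:
    """Yield only the values whose schema type is textual."""
    for leaf in leaves:
        if leaf.xsd_type in TEXT_TYPES:
            yield leaf.value


# Made-up sample data: two string elements and one numeric element.
leaves = [
    InfosetLeaf("title", "xs:string", "Quarterly report"),
    InfosetLeaf("recordCount", "xs:int", "42"),
    InfosetLeaf("comment", "xs:string", "reviewed"),
]
extracted = list(text_events(leaves))
```

In an actual integration, the values that pass the filter would presumably be forwarded to whatever content handler the text-extraction framework supplies; that wiring is left out here because it depends on APIs this message does not describe.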
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)