Posted to dev@daffodil.apache.org by "Michael Beckerle (JIRA)" <ji...@apache.org> on 2018/10/23 16:45:00 UTC

[jira] [Closed] (DAFFODIL-1710) Apache Tika integration

     [ https://issues.apache.org/jira/browse/DAFFODIL-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Beckerle closed DAFFODIL-1710.
--------------------------------------
    Resolution: Not A Problem

Not a Daffodil bug, improvement, or feature. JIRA tickets are not for keeping track of "good ideas".

> Apache Tika integration
> -----------------------
>
>                 Key: DAFFODIL-1710
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-1710
>             Project: Daffodil
>          Issue Type: New Feature
>          Components: API, Integrations
>            Reporter: Michael Beckerle
>            Priority: Major
>
> Daffodil's parser could be wrapped behind the Apache Tika APIs, allowing any DFDL-described format to be mined for text content in the Tika way.
> This would probably need to be schema-aware: Tika events should be reported only for text content, not for numeric content.
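
The schema-aware filtering described in the ticket can be sketched as follows. This is only an illustration of the idea in Python and uses neither the Daffodil nor the Tika API: `InfosetElement`, `TEXT_TYPES`, and `text_events` are hypothetical names, and treating `xs:string` as the only text type is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


# Hypothetical stand-in for one parsed element; not a real Daffodil type.
@dataclass(frozen=True)
class InfosetElement:
    name: str
    schema_type: str  # simple type declared in the schema, e.g. "xs:string"
    value: str


# Assumption: only these schema types count as text content.
TEXT_TYPES = frozenset({"xs:string"})


def text_events(elements: Iterable[InfosetElement]) -> Iterator[str]:
    """Yield the values of text-typed elements, skipping numeric content."""
    for element in elements:
        if element.schema_type in TEXT_TYPES:
            yield element.value


# Made-up sample data, for illustration only.
infoset = [
    InfosetElement("title", "xs:string", "Quarterly report"),
    InfosetElement("pageCount", "xs:int", "12"),
    InfosetElement("author", "xs:string", "J. Smith"),
]

extracted = list(text_events(infoset))
```

Here `extracted` holds only the two string values; the `pageCount` element is skipped because its declared type is numeric. A real integration would have to take the type information from the compiled DFDL schema instead of a hand-written field.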



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)