Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2011/04/06 13:56:05 UTC

[jira] [Created] (TIKA-634) Command Line Parser for Metadata Extraction

Command Line Parser for Metadata Extraction
-------------------------------------------

                 Key: TIKA-634
                 URL: https://issues.apache.org/jira/browse/TIKA-634
             Project: Tika
          Issue Type: Improvement
          Components: parser
    Affects Versions: 0.9
            Reporter: Nick Burch
            Assignee: Nick Burch
            Priority: Minor


As discussed on the mailing list:
http://mail-archives.apache.org/mod_mbox/tika-dev/201104.mbox/%3Calpine.DEB.2.00.1104052028380.29085@urchin.earth.li%3E

This issue tracks improvements to the ExternalParser support so that it can handle metadata extraction, and probably easier configuration of an external parser too.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (TIKA-634) Command Line Parser for Metadata Extraction

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016459#comment-13016459 ] 

Nick Burch commented on TIKA-634:
---------------------------------

I've done some work on this. We can now use XML files to define which command to run, which arguments to give it, and how to get the metadata.
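
To illustrate the general idea (run a command, then pull metadata out of what it prints), here is a minimal sketch in Java. It is not the ExternalParser code and does not show the XML format: the class name ExternalMetadataSketch, the methods run and extract, the metadata keys, the regular expressions and the sample output lines are all assumptions made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: names here are hypothetical and are not Tika API.
public class ExternalMetadataSketch {

    // Runs the command and returns its combined stdout/stderr as lines.
    public static List<String> run(List<String> command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        p.waitFor();
        return lines;
    }

    // For each rule, the first capture group of the first matching line
    // becomes the value stored under that rule's metadata key.
    public static Map<String, String> extract(List<String> outputLines,
                                              Map<String, Pattern> rules) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            for (String line : outputLines) {
                Matcher m = rule.getValue().matcher(line);
                if (m.find()) {
                    metadata.put(rule.getKey(), m.group(1));
                    break;
                }
            }
        }
        return metadata;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample output; pass a real command as arguments to run it instead.
        List<String> output = Arrays.asList(
                "Duration: 00:01:23.45, start: 0.000000",
                "Stream #0.0: Audio: mp3, 44100 Hz, stereo");
        if (args.length > 0) {
            output = run(Arrays.asList(args));
        }
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("duration", Pattern.compile("Duration: ([0-9:.]+)"));
        rules.put("sampleRate", Pattern.compile("(\\d+) Hz"));
        System.out.println(extract(output, rules));
    }
}
```

Keeping the rules as a key-to-regex map is one way such a mapping could be loaded from a configuration file rather than hard-coded, which is the point of moving the definition out of Java.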

An initial example file that uses FFMpeg has been committed. We may want to replace it with a different example in the longer term, especially as I hope to write a dedicated FFMpeg ExternalParser that can dynamically build its list of supported mime types based on which libraries are available to ffmpeg.

There are no unit tests yet, but my plan is to write them using a couple of perl scripts, with the tests being skipped if perl isn't available.
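
One possible way to decide whether to skip such a test is to try launching the program and treat a failure to start as "not installed". The sketch below assumes that approach; the class name CommandCheckSketch and the method isAvailable are hypothetical, and using "perl -v" as the probe command is an assumption for the example rather than anything already in the code base.

```java
import java.io.IOException;
import java.io.InputStream;

// Illustrative only: a hypothetical helper, not part of Tika.
public class CommandCheckSketch {

    // Returns true if the command starts and exits with status 0.
    // A program that cannot be found makes start() throw IOException.
    public static boolean isAvailable(String... command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            // Drain the output so the child cannot block on a full pipe.
            try (InputStream in = p.getInputStream()) {
                while (in.read() != -1) {
                    // discard
                }
            }
            return p.waitFor() == 0;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        if (isAvailable("perl", "-v")) {
            System.out.println("perl found, tests would run");
        } else {
            System.out.println("perl not found, tests would be skipped");
        }
    }
}
```

A test could call the helper first and return early (or use the test framework's own skip mechanism) when it reports false.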

