Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/10/26 13:50:00 UTC

[jira] [Updated] (TIKA-3901) Consider adding a wrapper for Siegfried

     [ https://issues.apache.org/jira/browse/TIKA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-3901:
------------------------------
    Description: 
Digital preservation folks really like PRONOM. One option for getting PRONOM definitions via Tika would be to run DROID "in process" (h/t Andy Jackson: https://github.com/openpreserve/nanite/).

Given the risks of OOMs, timeouts, etc., and the jar-hell nightmares of running DROID in process, it might make sense to integrate Richard Lehane's siegfried (https://github.com/richardlehane/siegfried) and run it as an external process. Users would be required to have it installed, or they could pre-install it in a Docker container.

  was:
Digital preservation folks really like PRONOM.  One option to get pronom definitions via Tika would be to run DROID "in process" (h/t Andy Jackson: https://github.com/openpreserve/nanite/).

Given the risks of oom/timeouts etc and the jar hell nightmares of running DROID in process, it might make sense to integrate Richard Lehane's siegfried and run it as an external process.  Users would be required to have it installed, or they could pre-install it in a Docker container.


> Consider adding a wrapper for Siegfried
> ---------------------------------------
>
>                 Key: TIKA-3901
>                 URL: https://issues.apache.org/jira/browse/TIKA-3901
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> Digital preservation folks really like PRONOM. One option for getting PRONOM definitions via Tika would be to run DROID "in process" (h/t Andy Jackson: https://github.com/openpreserve/nanite/).
> Given the risks of OOMs, timeouts, etc., and the jar-hell nightmares of running DROID in process, it might make sense to integrate Richard Lehane's siegfried (https://github.com/richardlehane/siegfried) and run it as an external process. Users would be required to have it installed, or they could pre-install it in a Docker container.
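
To make the "external process" idea concrete, here is a rough sketch of what such a wrapper might look like. It is not Tika code and not a proposed design: the class name SiegfriedSketch, the helper names, and the 30-second timeout are all made up for illustration, and it assumes the siegfried binary is on the PATH as "sf" and is asked for JSON output with "-json". The point is only that a hung or memory-hungry run can be killed without taking the JVM down with it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch only; none of these names exist in Tika.
class SiegfriedSketch {

    // Builds the command line. Assumes "-json" selects JSON output.
    static List<String> buildCommand(String sfPath, Path file) {
        return List.of(sfPath, "-json", file.toString());
    }

    // Runs the command, killing it if it exceeds the timeout.
    // stdout goes to a temp file so a full pipe buffer cannot block the child.
    static String run(List<String> command, long timeoutMillis)
            throws IOException, InterruptedException {
        Path out = Files.createTempFile("sf-out", ".json");
        try {
            Process p = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!p.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
                // Timed out: kill the child and wait for it to exit.
                p.destroyForcibly().waitFor();
                throw new IOException("timed out after " + timeoutMillis + " ms");
            }
            if (p.exitValue() != 0) {
                throw new IOException("exit code " + p.exitValue());
            }
            return Files.readString(out, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(out);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: SiegfriedSketch <file>");
            return;
        }
        // Prints the raw JSON; parsing it into Tika metadata is left out here.
        System.out.println(run(buildCommand("sf", Path.of(args[0])), 30_000));
    }
}
```

A real integration would presumably need a configurable path to the binary and timeout, and would have to map the reported identifiers onto Tika metadata keys; none of that is decided in this issue.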


