Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2020/03/06 19:31:00 UTC

[jira] [Created] (TIKA-3062) Improve attachment alignment in tika-eval

Tim Allison created TIKA-3062:
---------------------------------

             Summary: Improve attachment alignment in tika-eval
                 Key: TIKA-3062
                 URL: https://issues.apache.org/jira/browse/TIKA-3062
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


In the last few runs, we noticed room for improvement in how tika-eval aligns attachments. Different extractors, or different versions of the same extractor, can extract different numbers of attachments, with different names, in different orders, and sometimes even with different digests.

We should place more trust in digests when they exist and rely less on matching embedded file names.
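
As a rough illustration of the digest-first idea (not tika-eval's actual code), the sketch below pairs attachments from two extraction runs by digest first and falls back to the embedded file name only for what is left over. The Attachment record, its field names, and the align_attachments function are all hypothetical, and the sketch is in Python purely for brevity.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Attachment(NamedTuple):
    # Hypothetical record; field names are illustrative, not tika-eval's schema.
    name: str
    digest: Optional[str] = None


def align_attachments(left, right):
    """Pair attachments from two runs: digest first, then embedded file name.

    Returns (pairs, unmatched_left, unmatched_right), where each pair is
    (left_index, right_index, matched_on).
    """
    pairs = []
    used_right = set()
    pending_left = []

    # Pass 1: match on digest, which survives renaming and reordering.
    by_digest = defaultdict(list)
    for j, att in enumerate(right):
        if att.digest:
            by_digest[att.digest].append(j)
    for i, att in enumerate(left):
        candidates = by_digest.get(att.digest) if att.digest else None
        if candidates:
            j = candidates.pop(0)
            used_right.add(j)
            pairs.append((i, j, "digest"))
        else:
            pending_left.append(i)

    # Pass 2: fall back to the embedded file name for whatever is left.
    by_name = defaultdict(list)
    for j, att in enumerate(right):
        if j not in used_right:
            by_name[att.name].append(j)
    unmatched_left = []
    for i in pending_left:
        candidates = by_name.get(left[i].name)
        if candidates:
            j = candidates.pop(0)
            used_right.add(j)
            pairs.append((i, j, "name"))
        else:
            unmatched_left.append(i)

    unmatched_right = [j for j in range(len(right)) if j not in used_right]
    return pairs, unmatched_left, unmatched_right
```

With this ordering, an attachment that one extractor names a.png and another names image1.png is still paired as long as the digests agree; name matching only decides the pairs that digests cannot.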



--
This message was sent by Atlassian Jira
(v8.3.4#803005)