Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2023/02/10 17:40:00 UTC

[jira] [Updated] (TIKA-3971) Distinguish eps-based Adobe Illustrator files from pdf-based Illustrator files

     [ https://issues.apache.org/jira/browse/TIKA-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-3971:
------------------------------
    Description: 
On TIKA-2689, we plan to add parse-time detection of Illustrator files that are based on (wrapped in) PDF.  Older Illustrator files were EPS or plain PostScript (PS).  We should figure out how we want to distinguish between these two or three formats.

TIKA-2689 has some great resource links to help with this.

PRONOM has several IDs for "Illustrator", summarized: http://justsolve.archiveteam.org/wiki/Adobe_Illustrator_Artwork

One example: https://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=1350

See also: https://bugs.ghostscript.com/show_bug.cgi?id=689926

  was:
On TIKA-2689, we plan to add parse-time detection of Illustrator files that are based on (wrapped in) PDF.  Older Illustrator files were EPS or plain PostScript (PS).  We should figure out how we want to distinguish between these two or three formats.

TIKA-2689 has some great resource links to help with this.

PRONOM has several IDs for "Illustrator":

https://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=1350

See also: https://bugs.ghostscript.com/show_bug.cgi?id=689926


> Distinguish eps-based Adobe Illustrator files from pdf-based Illustrator files
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-3971
>                 URL: https://issues.apache.org/jira/browse/TIKA-3971
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>
> On TIKA-2689, we plan to add parse-time detection of Illustrator files that are based on (wrapped in) PDF.  Older Illustrator files were EPS or plain PostScript (PS).  We should figure out how we want to distinguish between these two or three formats.
> TIKA-2689 has some great resource links to help with this.
> PRONOM has several IDs for "Illustrator", summarized: http://justsolve.archiveteam.org/wiki/Adobe_Illustrator_Artwork
> One example: https://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=1350
> See also: https://bugs.ghostscript.com/show_bug.cgi?id=689926
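
To make the "two or three formats" concrete, here is a rough sketch of how the container of an .ai file might be told apart from its leading bytes. This is not Tika code and not a proposed patch: the class AiContainerSniffer, its Container enum, and both method names are hypothetical and exist only for illustration. It relies on the usual file signatures ("%PDF-" for PDF, "%!PS-Adobe-" with an "EPSF-" marker on the first line for EPS, and the C5 D0 D3 C6 binary header for DOS EPS). The check for a "%%Creator: Adobe Illustrator" comment is an assumption about what PostScript-based Illustrator files typically carry and would need to be verified against real samples.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Tika.
final class AiContainerSniffer {

    /** Hypothetical labels; these are not Tika media types. */
    enum Container { PDF, EPS, POSTSCRIPT, UNKNOWN }

    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PS_ADOBE_MAGIC = "%!PS-Adobe-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PS_MAGIC = "%!".getBytes(StandardCharsets.US_ASCII);
    // Binary header of a DOS EPS file (EPS with an embedded preview).
    private static final byte[] DOS_EPS_MAGIC =
            {(byte) 0xC5, (byte) 0xD0, (byte) 0xD3, (byte) 0xC6};

    private AiContainerSniffer() {
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Classifies the container from the first bytes of a file. */
    static Container sniff(byte[] head) {
        if (startsWith(head, PDF_MAGIC)) {
            return Container.PDF;
        }
        if (startsWith(head, DOS_EPS_MAGIC)) {
            return Container.EPS;
        }
        if (startsWith(head, PS_ADOBE_MAGIC)) {
            // Only the first line decides EPS vs. plain PS here.
            String text = new String(head, StandardCharsets.ISO_8859_1);
            String firstLine = text.split("[\\r\\n]", 2)[0];
            return firstLine.contains(" EPSF-") ? Container.EPS : Container.POSTSCRIPT;
        }
        if (startsWith(head, PS_MAGIC)) {
            return Container.POSTSCRIPT;
        }
        return Container.UNKNOWN;
    }

    /**
     * Assumption: PostScript-based Illustrator files carry a
     * "%%Creator: Adobe Illustrator" comment near the top of the file.
     * This says nothing about PDF-based files.
     */
    static boolean hasIllustratorCreatorComment(byte[] head) {
        String text = new String(head, StandardCharsets.ISO_8859_1);
        return text.contains("%%Creator: Adobe Illustrator");
    }
}
```

A caller would read the first few hundred bytes of the file and pass them to sniff(). A PDF result would still need the parse-time check discussed on TIKA-2689 to decide whether the PDF is actually an Illustrator file; the sketch only separates the containers.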



--
This message was sent by Atlassian Jira
(v8.20.10#820010)