Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2018/09/05 20:17:00 UTC

[jira] [Commented] (TIKA-2689) *.ai type (Adobe illustrator ) files are not detected correctly.

    [ https://issues.apache.org/jira/browse/TIKA-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604890#comment-16604890 ] 

Nick Burch commented on TIKA-2689:
----------------------------------

I believe that Adobe Illustrator files are normally just a special kind of PDF file.

If that's right, then if you give Tika just the bytes, all it'll be able to say is "that's a PDF". If you give Tika the bytes and the filename, it should say "that's a PDF, no wait, going by the file extension it's actually the PDF subtype Illustrator".
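
To make the two cases concrete, here is a rough sketch of that two-step logic in plain Java. It is an illustration only, not Tika's actual code: the class and method names are made up, and the only things taken from the issue are the two media type strings and the .ai extension.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical illustration of "bytes first, then filename"; not Tika's implementation.
final class TwoStageDetectSketch {

    // Every PDF starts with this signature.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Step 1: bytes only. The best answer available here is "that's a PDF". */
    static String detectFromBytes(byte[] head) {
        if (head.length < PDF_MAGIC.length) {
            return "application/octet-stream";
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return "application/octet-stream";
            }
        }
        return "application/pdf";
    }

    /** Step 2: bytes plus filename. The extension may narrow a PDF down to Illustrator. */
    static String detect(byte[] head, String filename) {
        String type = detectFromBytes(head);
        boolean looksLikeAi = filename != null
                && filename.toLowerCase(Locale.ROOT).endsWith(".ai");
        if ("application/pdf".equals(type) && looksLikeAi) {
            return "application/illustrator";
        }
        return type;
    }
}
```

With only the bytes (filename null) the sketch can never get past application/pdf, which is the limit described above. The extension is only trusted once the bytes have already said "PDF", so a non-PDF file renamed to .ai is not reported as Illustrator.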

Unless someone can find something near the start of the PDF that identifies the subtype of PDF as Illustrator, I think that's the best we can manage.
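
If someone does find such an identifier, the check itself would be simple. The sketch below scans only the first few bytes of a file for a marker. Both the marker and the scan limit are placeholders supplied by the caller; I don't know of a reliable marker, so none is hard-coded, and the class name is made up.

```java
// Hypothetical helper: does a given marker occur within the first `limit` bytes?
// The marker is supplied by the caller; no real Illustrator marker is assumed here.
final class HeadMarkerSketch {

    static boolean headContains(byte[] data, int limit, byte[] marker) {
        // Never look past the requested prefix or the end of the data.
        int end = Math.min(data.length, Math.max(limit, 0));
        for (int i = 0; i + marker.length <= end; i++) {
            int j = 0;
            while (j < marker.length && data[i + j] == marker[j]) {
                j++;
            }
            if (j == marker.length) {
                return true; // full marker matched starting at offset i
            }
        }
        return false;
    }
}
```

Keeping the scan bounded matters because detection normally only looks at the head of a stream; a marker that appears only deep inside the file would not help here.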

(Wikipedia suggests they can also be EPS files instead, not sure how common/true that is today? [https://en.wikipedia.org/wiki/Adobe_Illustrator_Artwork] )

> *.ai type (Adobe illustrator ) files are not detected correctly.
> ----------------------------------------------------------------
>
>                 Key: TIKA-2689
>                 URL: https://issues.apache.org/jira/browse/TIKA-2689
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.16, 1.17, 1.18
>            Reporter: Amit Pandey
>            Priority: Major
>         Attachments: example.ai
>
>
> There is an inconsistency in detecting *.ai files across the overloaded detect methods. When I use _detect(String filename)_, it gives the correct file type, "*application/illustrator*". If I use _detect(InputStream is, String filename)_ or _detect(File fileObj)_, it gives the file type "*application/pdf*".
> Here is the sample code I used.
>   [https://stackoverflow.com/questions/51359351/tika-detect-method-not-giving-same-exact-file-type|http://example.com/]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)