Posted to dev@tika.apache.org by "Anas Hammani (Jira)" <ji...@apache.org> on 2021/03/02 16:37:00 UTC
[jira] [Created] (TIKA-3308) SVG file without xml declaration tag is detected as text/plain
Anas Hammani created TIKA-3308:
----------------------------------
Summary: SVG file without xml declaration tag is detected as text/plain
Key: TIKA-3308
URL: https://issues.apache.org/jira/browse/TIKA-3308
Project: Tika
Issue Type: Bug
Components: mime
Affects Versions: 1.25
Reporter: Anas Hammani
Attachments: logo-luma.svg
The SVG file attached to the issue is interpreted as *text/plain* by
{code:java}
tika.detect(filePath){code}
If I add
{code:java}
<?xml version="1.0" standalone="no"?> {code}
at the beginning of the file, then Tika detects it as "image/svg+xml".
When I read the XML specification, I see that the XML declaration is not required for a document to be well-formed:
[https://www.w3.org/TR/REC-xml/#sec-prolog-dtd]
It would be great if Tika could detect a file as SVG even without the prolog.
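As a rough illustration of what prolog-independent detection could look like, here is a minimal sketch that uses only the JDK's StAX parser. It is not Tika's detection logic. The class name SvgRootSniffer, the method sniff, and the fallback return values are hypothetical and chosen only for this example. The idea is to skip whatever precedes the root element and decide from the root element's name and namespace alone.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Tika: classifies content by its XML root element.
final class SvgRootSniffer {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    static String sniff(byte[] content) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities while sniffing untrusted input.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try {
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new ByteArrayInputStream(content));
            try {
                while (reader.hasNext()) {
                    // The first START_ELEMENT event is the root element,
                    // whether or not an XML declaration came before it.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        boolean isSvg = "svg".equals(reader.getLocalName())
                                && SVG_NS.equals(reader.getNamespaceURI());
                        return isSvg ? "image/svg+xml" : "application/xml";
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed XML: fall through to the assumed default below.
        }
        return "text/plain"; // assumed fallback for this sketch only
    }

    public static void main(String[] args) {
        String noProlog = "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"10\" height=\"10\"></svg>";
        System.out.println(sniff(noProlog.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The sketch requires the SVG namespace on the root element, so a bare svg element without a namespace would be reported as generic XML. Whether that is the right trade-off for Tika is an open question.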
--
This message was sent by Atlassian Jira
(v8.3.4#803005)