Posted to commits@tika.apache.org by ni...@apache.org on 2020/02/04 10:31:53 UTC

[tika] branch master updated: TIKA-3034 Mathematica files don't have a unique magic, but try to detect based on the file starting with a Mathematica-style comment as all we can do. Also add the newer Wolfram Language mimetype, which extends mathematica, with a unix detection

This is an automated email from the ASF dual-hosted git repository.

nick pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/tika.git


The following commit(s) were added to refs/heads/master by this push:
     new f5571fa  TIKA-3034 Mathematica files don't have a unique magic, but try to detect based on the file starting with a Mathematica-style comment as all we can do. Also add the newer Wolfram Language mimetype, which extends mathematica, with a unix detection
f5571fa is described below

commit f5571fa99ef6f178a16bd1bd3a3cded83c7b0013
Author: Nick Burch <ni...@gagravarr.org>
AuthorDate: Tue Feb 4 10:31:31 2020 +0000

    TIKA-3034 Mathematica files don't have a unique magic, but try to detect based on the file starting with a Mathematica-style comment as all we can do. Also add the newer Wolfram Language mimetype, which extends mathematica, with a unix detection
---
 .../resources/org/apache/tika/mime/tika-mimetypes.xml  | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml b/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
index 34e8d98..174dad0 100644
--- a/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
+++ b/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
@@ -409,11 +409,29 @@
   <mime-type type="application/marc">
     <glob pattern="*.mrc"/>
   </mime-type>
+
   <mime-type type="application/mathematica">
+    <_comment>Wolfram Mathematica</_comment>
     <glob pattern="*.ma"/>
     <glob pattern="*.nb"/>
     <glob pattern="*.mb"/>
+    <!-- Note - there is no Unique Magic for Mathematica files! -->
+    <!-- Check for a Mathematica-style opening comment as our best hope... -->
+    <magic priority="50">
+      <match value="(**" type="string" offset="0"/>
+      <match value="(* " type="string" offset="0"/>
+    </magic>
+    <sub-class-of type="text/plain"/>
   </mime-type>
+  <mime-type type="application/vnd.wolfram.wl">
+    <_comment>Wolfram Language</_comment>
+    <glob pattern="*.wl"/>
+    <magic priority="50">
+      <match value="#!/usr/bin/env wolframscript" type="string" offset="0"/>
+    </magic>
+    <sub-class-of type="application/mathematica"/>
+  </mime-type>
+
   <mime-type type="application/mathml+xml">
     <glob pattern="*.mathml"/>
   </mime-type>
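
For readers unfamiliar with offset-0 string magic, the following is a minimal sketch of what the rules added in this hunk amount to: a prefix check on the first bytes of the file, plus the sub-class-of chain. It is an illustration only and not Tika's implementation; the names MAGIC_RULES, PARENT, detect and is_a are hypothetical, and priorities and glob matching are left out.

```python
# Hypothetical illustration of the magic rules in the hunk above.
# This is NOT Tika's detector; it only mirrors the three offset-0 string matches.

# (mime type, byte prefix expected at offset 0)
MAGIC_RULES = [
    ("application/vnd.wolfram.wl", b"#!/usr/bin/env wolframscript"),
    ("application/mathematica", b"(**"),
    ("application/mathematica", b"(* "),
]

# sub-class-of relationships declared in the hunk
PARENT = {
    "application/vnd.wolfram.wl": "application/mathematica",
    "application/mathematica": "text/plain",
}


def detect(data: bytes):
    """Return the first type whose prefix matches the start of data, else None."""
    for mime, prefix in MAGIC_RULES:
        if data.startswith(prefix):
            return mime
    return None


def is_a(mime, ancestor):
    """Walk the sub-class-of chain to see whether mime specialises ancestor."""
    while mime is not None:
        if mime == ancestor:
            return True
        mime = PARENT.get(mime)
    return False


if __name__ == "__main__":
    print(detect(b"(* Content-type: application/mathematica *)"))
    print(detect(b"#!/usr/bin/env wolframscript\nPrint[1]"))
    print(is_a("application/vnd.wolfram.wl", "text/plain"))
```

Because other languages (OCaml and Pascal, for example) also open comments with "(*", a prefix check like this can match non-Mathematica text, which is consistent with the commit message's point that no unique magic exists.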