Posted to commits@tika.apache.org by ni...@apache.org on 2015/04/24 05:01:02 UTC

svn commit: r1675755 - in /tika/trunk: tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java tika-parsers/src/test/resources/test-documents/NUTCH-1997.cbor

Author: nick
Date: Fri Apr 24 03:01:01 2015
New Revision: 1675755

URL: http://svn.apache.org/r1675755
Log:
TIKA-1610 Bump the CBOR mime magic priority to 60, to be more specific than (x)html, which is what CBOR often contains, and add a detection unit test

Added:
    tika/trunk/tika-parsers/src/test/resources/test-documents/NUTCH-1997.cbor   (with props)
Modified:
    tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
    tika/trunk/tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java

Modified: tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
URL: http://svn.apache.org/viewvc/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml?rev=1675755&r1=1675754&r2=1675755&view=diff
==============================================================================
--- tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml (original)
+++ tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml Fri Apr 24 03:01:01 2015
@@ -76,7 +76,7 @@
     <acronym>CBOR</acronym>
     <_comment>Concise Binary Object Representation container</_comment>
     <tika:link>http://tools.ietf.org/html/rfc7049</tika:link>
-    <magic priority="40">
+    <magic priority="60">
       <match value="0xd9d9f7" type="string" offset="0" />
     </magic>
     <glob pattern="*.cbor"/>

Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java
URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java?rev=1675755&r1=1675754&r2=1675755&view=diff
==============================================================================
--- tika/trunk/tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java (original)
+++ tika/trunk/tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java Fri Apr 24 03:01:01 2015
@@ -905,6 +905,15 @@ public class TestMimeTypes {
                 "application/x-berkeley-db; format=hash; version=5", 
                 "testBDB_hash_5.db");
     }
+    
+    /**
+     * CBOR typically contains HTML
+     */
+    @Test
+    public void testCBOR() throws IOException {
+        assertTypeByNameAndData("application/cbor", "NUTCH-1997.cbor");
+        assertTypeByData("application/cbor", "NUTCH-1997.cbor");
+    }
 
     private void assertText(byte[] prefix) throws IOException {
         assertMagic("text/plain", prefix);

Added: tika/trunk/tika-parsers/src/test/resources/test-documents/NUTCH-1997.cbor
URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/src/test/resources/test-documents/NUTCH-1997.cbor?rev=1675755&view=auto
==============================================================================
Binary file - no diff available.

Propchange: tika/trunk/tika-parsers/src/test/resources/test-documents/NUTCH-1997.cbor
------------------------------------------------------------------------------
    svn:mime-type = application/cbor
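
Note on the priority change: the log message above says the CBOR magic must outrank the (x)html magic, because the HTML markers also appear inside a CBOR-wrapped document. The following is a minimal, self-contained sketch of that idea and is not Tika's actual detector code. The class MagicPrioritySketch, the Rule type, the detect and detectWithCborPriority methods, the HTML rule's priority of 50, and its 0..64 search window are all hypothetical values chosen for illustration; only the 0xd9d9f7 signature at offset 0 and the CBOR priorities 40 and 60 come from the diff.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MagicPrioritySketch {

    /** Hypothetical magic rule: the pattern must start at some offset in [from, to]. */
    static final class Rule {
        final String type;
        final int priority;
        final int from;
        final int to;
        final byte[] pattern;

        Rule(String type, int priority, int from, int to, byte[] pattern) {
            this.type = type;
            this.priority = priority;
            this.from = from;
            this.to = to;
            this.pattern = pattern;
        }

        boolean matches(byte[] data) {
            for (int start = from; start <= to && start + pattern.length <= data.length; start++) {
                boolean hit = true;
                for (int i = 0; i < pattern.length; i++) {
                    if (data[start + i] != pattern[i]) {
                        hit = false;
                        break;
                    }
                }
                if (hit) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Tries rules from highest to lowest priority; the first matching rule wins. */
    static String detect(List<Rule> rules, byte[] data) {
        List<Rule> ordered = new ArrayList<>(rules);
        ordered.sort(Comparator.comparingInt((Rule r) -> r.priority).reversed());
        for (Rule rule : ordered) {
            if (rule.matches(data)) {
                return rule.type;
            }
        }
        return "application/octet-stream";
    }

    /** Builds a CBOR-tagged buffer that contains HTML and detects it with the given CBOR priority. */
    static String detectWithCborPriority(int cborPriority) {
        byte[] tag = {(byte) 0xd9, (byte) 0xd9, (byte) 0xf7};
        byte[] html = "<html><body>hi</body></html>".getBytes(StandardCharsets.US_ASCII);
        byte[] data = new byte[tag.length + html.length];
        System.arraycopy(tag, 0, data, 0, tag.length);
        System.arraycopy(html, 0, data, tag.length, html.length);

        List<Rule> rules = new ArrayList<>();
        // Assumed HTML rule: priority 50 and the 0..64 window are illustrative only.
        rules.add(new Rule("text/html", 50, 0, 64, "<html".getBytes(StandardCharsets.US_ASCII)));
        // CBOR rule from the diff: 0xd9d9f7 at offset 0.
        rules.add(new Rule("application/cbor", cborPriority, 0, 0, tag));
        return detect(rules, data);
    }

    public static void main(String[] args) {
        System.out.println(detectWithCborPriority(40)); // HTML rule outranks CBOR in this sketch
        System.out.println(detectWithCborPriority(60)); // CBOR rule is tried first
    }
}
```

In this sketch both rules match the same bytes, so the outcome depends only on which rule is consulted first. That is the situation the new testCBOR case guards against with assertTypeByData.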