Posted to commits@tika.apache.org by ta...@apache.org on 2016/03/22 02:19:18 UTC

[06/13] tika git commit: TIKA-1855 -- first pass. Need to turn back on the forbidden-apis testCheck. More clean up remains.

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/main/appended-resources/META-INF/LICENSE
----------------------------------------------------------------------
diff --git a/tika-parsers/src/main/appended-resources/META-INF/LICENSE b/tika-parsers/src/main/appended-resources/META-INF/LICENSE
deleted file mode 100644
index bd54624..0000000
--- a/tika-parsers/src/main/appended-resources/META-INF/LICENSE
+++ /dev/null
@@ -1,94 +0,0 @@
-APACHE TIKA SUBCOMPONENTS
-
-Apache Tika includes a number of subcomponents with separate copyright notices
-and license terms. Your use of these subcomponents is subject to the terms and
-conditions of the following licenses.
-
-Charset detection code from ICU4J (http://site.icu-project.org/)
-
-    Copyright (c) 1995-2009 International Business Machines Corporation
-    and others
-
-    All rights reserved.
-
-    Permission is hereby granted, free of charge, to any person obtaining
-    a copy of this software and associated documentation files (the
-    "Software"), to deal in the Software without restriction, including
-    without limitation the rights to use, copy, modify, merge, publish,
-    distribute, and/or sell copies of the Software, and to permit persons
-    to whom the Software is furnished to do so, provided that the above
-    copyright notice(s) and this permission notice appear in all copies
-    of the Software and that both the above copyright notice(s) and this
-    permission notice appear in supporting documentation.
-
-    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
-    OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS.
-    IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE
-    BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES,
-    OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
-    WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
-    ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
-    SOFTWARE.
-
-    Except as contained in this notice, the name of a copyright holder shall
-    not be used in advertising or otherwise to promote the sale, use or other
-    dealings in this Software without prior written authorization of the
-    copyright holder.
-
-
-JUnRAR (https://github.com/edmund-wagner/junrar/)
-
-      JUnRAR is based on the UnRAR tool, and covered by the same license
-      It was formerly available from http://java-unrar.svn.sourceforge.net/
-
- ******    *****   ******   UnRAR - free utility for RAR archives
- **   **  **   **  **   **  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ******   *******  ******    License for use and distribution of
- **   **  **   **  **   **   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- **   **  **   **  **   **         FREE portable version
-                                   ~~~~~~~~~~~~~~~~~~~~~
-
-      The source code of UnRAR utility is freeware. This means:
-
-   1. All copyrights to RAR and the utility UnRAR are exclusively
-      owned by the author - Alexander Roshal.
-
-   2. The UnRAR sources may be used in any software to handle RAR
-      archives without limitations free of charge, but cannot be used
-      to re-create the RAR compression algorithm, which is proprietary.
-      Distribution of modified UnRAR sources in separate form or as a
-      part of other software is permitted, provided that it is clearly
-      stated in the documentation and source comments that the code may
-      not be used to develop a RAR (WinRAR) compatible archiver.
-
-   3. The UnRAR utility may be freely distributed. It is allowed
-      to distribute UnRAR inside of other software packages.
-
-   4. THE RAR ARCHIVER AND THE UnRAR UTILITY ARE DISTRIBUTED "AS IS".
-      NO WARRANTY OF ANY KIND IS EXPRESSED OR IMPLIED.  YOU USE AT 
-      YOUR OWN RISK. THE AUTHOR WILL NOT BE LIABLE FOR DATA LOSS, 
-      DAMAGES, LOSS OF PROFITS OR ANY OTHER KIND OF LOSS WHILE USING
-      OR MISUSING THIS SOFTWARE.
-
-   5. Installing and using the UnRAR utility signifies acceptance of
-      these terms and conditions of the license.
-
-   6. If you don't agree with terms of the license you must remove
-      UnRAR files from your storage devices and cease to use the
-      utility.
-
-      Thank you for your interest in RAR and UnRAR.  Alexander L. Roshal
-
-Sqlite (included in the "provided" org.xerial's sqlite-jdbc)
-    Sqlite is in the Public Domain.  For details
-    see: https://www.sqlite.org/copyright.html
-
-Two photos in test-documents (testWebp_Alpha_Lossy.webp and testWebp_Alpha_Lossless.webp)
-    are in the public domain.  These files were retrieved from:
-    https://github.com/drewnoakes/metadata-extractor-images/tree/master/webp
-    These photos are also available here:
-    https://developers.google.com/speed/webp/gallery2#webp_links
-    Credits for the photo:
-    "Free Stock Photo in High Resolution - Yellow Rose 3 - Flowers"
-    Image Author: Jon Sullivan

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/main/java/org/apache/tika/parser/internal/Activator.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/main/java/org/apache/tika/parser/internal/Activator.java b/tika-parsers/src/main/java/org/apache/tika/parser/internal/Activator.java
deleted file mode 100644
index a884d3a..0000000
--- a/tika-parsers/src/main/java/org/apache/tika/parser/internal/Activator.java
+++ /dev/null
@@ -1,54 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.parser.internal;
-
-import java.util.Properties;
-
-import org.apache.tika.detect.DefaultDetector;
-import org.apache.tika.detect.Detector;
-import org.apache.tika.parser.DefaultParser;
-import org.apache.tika.parser.Parser;
-import org.osgi.framework.BundleActivator;
-import org.osgi.framework.BundleContext;
-import org.osgi.framework.ServiceRegistration;
-
-public class Activator implements BundleActivator {
-
-    private ServiceRegistration detectorService;
-
-    private ServiceRegistration parserService;
-
-    @Override
-    public void start(BundleContext context) throws Exception {
-        detectorService = context.registerService(
-                Detector.class.getName(),
-                new DefaultDetector(Activator.class.getClassLoader()),
-                new Properties());
-        Parser parser = new DefaultParser(Activator.class.getClassLoader());
-        parserService = context.registerService(
-                Parser.class.getName(),
-                parser,
-                new Properties());
-    }
-
-    @Override
-    public void stop(BundleContext context) throws Exception {
-        parserService.unregister();
-        detectorService.unregister();
-    }
-
-}
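
For reference, the Activator removed above registered DefaultDetector and DefaultParser
as OSGi services. A client bundle could look those services up roughly as follows; this
is a minimal sketch, not part of this commit, and the class name TikaServiceConsumer is
purely illustrative:

    import org.apache.tika.parser.Parser;
    import org.osgi.framework.BundleContext;
    import org.osgi.framework.ServiceReference;

    public class TikaServiceConsumer {

        // Look up the Parser service registered by Activator.start();
        // returns null if no such service is currently registered.
        public Parser lookupParser(BundleContext context) {
            ServiceReference ref = context.getServiceReference(Parser.class.getName());
            return ref == null ? null : (Parser) context.getService(ref);
        }
    }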

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/main/java/org/apache/tika/parser/utils/CommonsDigester.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/main/java/org/apache/tika/parser/utils/CommonsDigester.java b/tika-parsers/src/main/java/org/apache/tika/parser/utils/CommonsDigester.java
deleted file mode 100644
index a064156..0000000
--- a/tika-parsers/src/main/java/org/apache/tika/parser/utils/CommonsDigester.java
+++ /dev/null
@@ -1,299 +0,0 @@
-package org.apache.tika.parser.utils;
-
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-import java.io.File;
-import java.io.FileInputStream;
-import java.io.IOException;
-import java.io.InputStream;
-import java.util.ArrayList;
-import java.util.Collections;
-import java.util.List;
-import java.util.Locale;
-
-import org.apache.commons.codec.digest.DigestUtils;
-import org.apache.commons.io.IOUtils;
-import org.apache.tika.io.TikaInputStream;
-import org.apache.tika.metadata.Metadata;
-import org.apache.tika.metadata.TikaCoreProperties;
-import org.apache.tika.parser.DigestingParser;
-import org.apache.tika.parser.ParseContext;
-
-/**
- * Implementation of {@link org.apache.tika.parser.DigestingParser.Digester}
- * that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
- * <p>
- * This digester tries to use the regular mark/reset protocol on the InputStream.
- * However, it wraps the stream in an internal BoundedInputStream, and if that
- * bound is hit before the stream is fully read, it resets the stream and
- * spools the InputStream to disk (via TikaInputStream) and then digests the file.
- * <p>
- * If a TikaInputStream is passed in and it has an underlying file that is longer
- * than the {@link #markLimit}, then this digester digests the file directly.
- *
- */
-public class CommonsDigester implements DigestingParser.Digester {
-
-    public enum DigestAlgorithm {
-        //those currently available in commons.digest
-        MD2,
-        MD5,
-        SHA1,
-        SHA256,
-        SHA384,
-        SHA512;
-
-        String getMetadataKey() {
-            return TikaCoreProperties.TIKA_META_PREFIX+
-                    "digest"+Metadata.NAMESPACE_PREFIX_DELIMITER+this.toString();
-        }
-    }
-
-    private final List<DigestAlgorithm> algorithms = new ArrayList<DigestAlgorithm>();
-    private final int markLimit;
-
-    public CommonsDigester(int markLimit, DigestAlgorithm... algorithms) {
-        Collections.addAll(this.algorithms, algorithms);
-        if (markLimit < 0) {
-            throw new IllegalArgumentException("markLimit must be >= 0");
-        }
-        this.markLimit = markLimit;
-    }
-
-    @Override
-    public void digest(InputStream is, Metadata m, ParseContext parseContext) throws IOException {
-        InputStream tis = TikaInputStream.get(is);
-        long sz = -1;
-        if (((TikaInputStream)tis).hasFile()) {
-            sz = ((TikaInputStream)tis).getLength();
-        }
-        //if the stream is definitely backed by a file,
-        //and its size is greater than its mark limit,
-        //just digest the underlying file.
-        if (sz > markLimit) {
-            digestFile(((TikaInputStream)tis).getFile(), m);
-            return;
-        }
-
-        //try the usual mark/reset stuff.
-        //however, if you actually hit the bound,
-        //then stop and spool to file via TikaInputStream
-        SimpleBoundedInputStream bis = new SimpleBoundedInputStream(markLimit, tis);
-        boolean finishedStream = false;
-        for (DigestAlgorithm algorithm : algorithms) {
-            bis.mark(markLimit + 1);
-            finishedStream = digestEach(algorithm, bis, m);
-            bis.reset();
-            if (!finishedStream) {
-                break;
-            }
-        }
-        if (!finishedStream) {
-            digestFile(((TikaInputStream)tis).getFile(), m);
-        }
-    }
-
-    private void digestFile(File f, Metadata m) throws IOException {
-        for (DigestAlgorithm algorithm : algorithms) {
-            InputStream is = new FileInputStream(f);
-            try {
-                digestEach(algorithm, is, m);
-            } finally {
-                IOUtils.closeQuietly(is);
-            }
-        }
-    }
-
-    /**
-     *
-     * @param algorithm algo to use
-     * @param is input stream to read from
-     * @param metadata metadata for reporting the digest
-     * @return whether or not this finished the input stream
-     * @throws IOException
-     */
-    private boolean digestEach(DigestAlgorithm algorithm,
-                            InputStream is, Metadata metadata) throws IOException {
-        String digest = null;
-        try {
-            switch (algorithm) {
-                case MD2:
-                    digest = DigestUtils.md2Hex(is);
-                    break;
-                case MD5:
-                    digest = DigestUtils.md5Hex(is);
-                    break;
-                case SHA1:
-                    digest = DigestUtils.sha1Hex(is);
-                    break;
-                case SHA256:
-                    digest = DigestUtils.sha256Hex(is);
-                    break;
-                case SHA384:
-                    digest = DigestUtils.sha384Hex(is);
-                    break;
-                case SHA512:
-                    digest = DigestUtils.sha512Hex(is);
-                    break;
-                default:
-                    throw new IllegalArgumentException("Sorry, not aware of algorithm: " + algorithm.toString());
-            }
-        } catch (IOException e) {
-            e.printStackTrace();
-            //swallow, or should we throw this?
-        }
-        if (is instanceof SimpleBoundedInputStream) {
-            if (((SimpleBoundedInputStream)is).hasHitBound()) {
-                return false;
-            }
-        }
-        metadata.set(algorithm.getMetadataKey(), digest);
-        return true;
-    }
-
-    /**
-     *
-     * @param s comma-delimited (no space) list of algorithms to use: md5,sha256
-     * @return the algorithms parsed from the string
-     */
-    public static DigestAlgorithm[] parse(String s) {
-        assert(s != null);
-
-        List<DigestAlgorithm> ret = new ArrayList<DigestAlgorithm>();
-        for (String algoString : s.split(",")) {
-            String uc = algoString.toUpperCase(Locale.ROOT);
-            if (uc.equals(DigestAlgorithm.MD2.toString())) {
-                ret.add(DigestAlgorithm.MD2);
-            } else if (uc.equals(DigestAlgorithm.MD5.toString())) {
-                ret.add(DigestAlgorithm.MD5);
-            } else if (uc.equals(DigestAlgorithm.SHA1.toString())) {
-                ret.add(DigestAlgorithm.SHA1);
-            } else if (uc.equals(DigestAlgorithm.SHA256.toString())) {
-                ret.add(DigestAlgorithm.SHA256);
-            } else if (uc.equals(DigestAlgorithm.SHA384.toString())) {
-                ret.add(DigestAlgorithm.SHA384);
-            } else if (uc.equals(DigestAlgorithm.SHA512.toString())) {
-                ret.add(DigestAlgorithm.SHA512);
-            } else {
-                StringBuilder sb = new StringBuilder();
-                int i = 0;
-                for (DigestAlgorithm algo : DigestAlgorithm.values()) {
-                    if (i++ > 0) {
-                        sb.append(", ");
-                    }
-                    sb.append(algo.toString());
-                }
-                throw new IllegalArgumentException("Couldn't match " + s + " with any of: " + sb.toString());
-            }
-        }
-        return ret.toArray(new DigestAlgorithm[ret.size()]);
-    }
-
-    /**
-     * Very slight modification of Commons' BoundedInputStream
-     * so that we can figure out if this hit the bound or not.
-     */
-    private class SimpleBoundedInputStream extends InputStream {
-        private final static int EOF = -1;
-        private final long max;
-        private final InputStream in;
-        private long pos;
-        boolean hitBound = false;
-
-        private SimpleBoundedInputStream(long max, InputStream in) {
-            this.max = max;
-            this.in = in;
-        }
-
-        @Override
-        public int read() throws IOException {
-            if (max >= 0 && pos >= max) {
-                hitBound = true;
-                return EOF;
-            }
-            final int result = in.read();
-            pos++;
-            return result;
-        }
-
-        /**
-         * Invokes the delegate's <code>read(byte[])</code> method.
-         * @param b the buffer to read the bytes into
-         * @return the number of bytes read or -1 if the end of stream or
-         * the limit has been reached.
-         * @throws IOException if an I/O error occurs
-         */
-        @Override
-        public int read(final byte[] b) throws IOException {
-            return this.read(b, 0, b.length);
-        }
-
-        /**
-         * Invokes the delegate's <code>read(byte[], int, int)</code> method.
-         * @param b the buffer to read the bytes into
-         * @param off The start offset
-         * @param len The number of bytes to read
-         * @return the number of bytes read or -1 if the end of stream or
-         * the limit has been reached.
-         * @throws IOException if an I/O error occurs
-         */
-        @Override
-        public int read(final byte[] b, final int off, final int len) throws IOException {
-            if (max>=0 && pos>=max) {
-                return EOF;
-            }
-            final long maxRead = max>=0 ? Math.min(len, max-pos) : len;
-            final int bytesRead = in.read(b, off, (int)maxRead);
-
-            if (bytesRead==EOF) {
-                return EOF;
-            }
-
-            pos+=bytesRead;
-            return bytesRead;
-        }
-
-        /**
-         * Invokes the delegate's <code>skip(long)</code> method.
-         * @param n the number of bytes to skip
-         * @return the actual number of bytes skipped
-         * @throws IOException if an I/O error occurs
-         */
-        @Override
-        public long skip(final long n) throws IOException {
-            final long toSkip = max>=0 ? Math.min(n, max-pos) : n;
-            final long skippedBytes = in.skip(toSkip);
-            pos+=skippedBytes;
-            return skippedBytes;
-        }
-
-        @Override
-        public void reset() throws IOException {
-            in.reset();
-        }
-
-        @Override
-        public void mark(int readLimit) {
-            in.mark(readLimit);
-        }
-
-        public boolean hasHitBound() {
-            return hitBound;
-        }
-    }
-}
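
The class javadoc above describes the digest strategy (mark/reset within markLimit,
otherwise spool to disk via TikaInputStream). A minimal usage sketch, not part of this
commit -- the file name "example.pdf" and the 10 MB markLimit are illustrative values
only:

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.utils.CommonsDigester;

    public class DigestExample {
        public static void main(String[] args) throws Exception {
            CommonsDigester digester = new CommonsDigester(
                    10 * 1024 * 1024,    // markLimit before spooling to disk
                    CommonsDigester.DigestAlgorithm.MD5,
                    CommonsDigester.DigestAlgorithm.SHA256);
            Metadata metadata = new Metadata();
            try (InputStream is = Files.newInputStream(Paths.get("example.pdf"))) {
                digester.digest(is, metadata, new ParseContext());
            }
            // Keys follow getMetadataKey() above, e.g. "X-TIKA:digest:MD5"
            for (String name : metadata.names()) {
                System.out.println(name + " = " + metadata.get(name));
            }
        }
    }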

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/TestParsers.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/TestParsers.java b/tika-parsers/src/test/java/org/apache/tika/TestParsers.java
deleted file mode 100644
index ddd671d..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/TestParsers.java
+++ /dev/null
@@ -1,109 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika;
-
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertTrue;
-
-import java.io.File;
-import java.io.FileInputStream;
-import java.io.InputStream;
-
-import org.apache.tika.config.TikaConfig;
-import org.apache.tika.metadata.Metadata;
-import org.apache.tika.metadata.TikaCoreProperties;
-import org.apache.tika.parser.ParseContext;
-import org.apache.tika.parser.Parser;
-import org.junit.Before;
-import org.junit.Test;
-import org.xml.sax.helpers.DefaultHandler;
-
-/**
- * Junit test class for Tika {@link Parser}s.
- */
-public class TestParsers extends TikaTest {
-
-    private TikaConfig tc;
-
-    private Tika tika;
-
-    @Before
-    public void setUp() throws Exception {
-        tc = TikaConfig.getDefaultConfig();
-        tika = new Tika(tc);
-    }
-
-    @Test
-    public void testWORDExtraction() throws Exception {
-        File file = getResourceAsFile("/test-documents/testWORD.doc");
-        Parser parser = tika.getParser();
-        Metadata metadata = new Metadata();
-        try (InputStream stream = new FileInputStream(file)) {
-            parser.parse(stream, new DefaultHandler(), metadata, new ParseContext());
-        }
-        assertEquals("Sample Word Document", metadata.get(TikaCoreProperties.TITLE));
-    }
-
-    @Test
-    public void testEXCELExtraction() throws Exception {
-        final String expected = "Numbers and their Squares";
-        File file = getResourceAsFile("/test-documents/testEXCEL.xls");
-        String s1 = tika.parseToString(file);
-        assertTrue("Text does not contain '" + expected + "'", s1
-                .contains(expected));
-        Parser parser = tika.getParser();
-        Metadata metadata = new Metadata();
-        try (InputStream stream = new FileInputStream(file)) {
-            parser.parse(stream, new DefaultHandler(), metadata, new ParseContext());
-        }
-        assertEquals("Simple Excel document", metadata.get(TikaCoreProperties.TITLE));
-    }
-
-    @Test
-    public void testOptionalHyphen() throws Exception {
-        String[] extensions =
-                new String[] { "ppt", "pptx", "doc", "docx", "rtf", "pdf"};
-        for (String extension : extensions) {
-            File file = getResourceAsFile("/test-documents/testOptionalHyphen." + extension);
-            String content = tika.parseToString(file);
-            assertTrue("optional hyphen was not handled for '" + extension + "' file type: " + content,
-                       content.contains("optionalhyphen") ||
-                       content.contains("optional\u00adhyphen") ||   // soft hyphen
-                       content.contains("optional\u200bhyphen") ||   // zero width space
-                       content.contains("optional\u2027"));          // hyphenation point
-            
-        }
-    }
-
-    private void verifyComment(String extension, String fileName) throws Exception {
-        File file = getResourceAsFile("/test-documents/" + fileName + "." + extension);
-        String content = tika.parseToString(file);
-        assertTrue(extension + ": content=" + content + " did not extract text",
-                   content.contains("Here is some text"));
-        assertTrue(extension + ": content=" + content + " did not extract comment",
-                   content.contains("Here is a comment"));
-    }
-
-    @Test
-    public void testComment() throws Exception {
-        final String[] extensions = new String[] {"ppt", "pptx", "doc", 
-            "docx", "xls", "xlsx", "pdf", "rtf"};
-        for(String extension : extensions) {
-            verifyComment(extension, "testComment");
-        }
-    }
-}

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/config/TikaDetectorConfigTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/config/TikaDetectorConfigTest.java b/tika-parsers/src/test/java/org/apache/tika/config/TikaDetectorConfigTest.java
deleted file mode 100644
index 2125888..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/config/TikaDetectorConfigTest.java
+++ /dev/null
@@ -1,143 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.config;
-
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import org.apache.tika.detect.CompositeDetector;
-import org.apache.tika.detect.DefaultDetector;
-import org.apache.tika.detect.Detector;
-import org.apache.tika.detect.EmptyDetector;
-import org.apache.tika.io.TikaInputStream;
-import org.apache.tika.metadata.Metadata;
-import org.apache.tika.parser.mbox.OutlookPSTParser;
-import org.apache.tika.parser.microsoft.POIFSContainerDetector;
-import org.apache.tika.parser.pkg.ZipContainerDetector;
-import org.junit.Test;
-
-/**
- * JUnit test class for {@link TikaConfig}, which covers things
- *  that {@link TikaConfigTest} can't do due to a need for the
- *  full set of detectors
- */
-public class TikaDetectorConfigTest extends AbstractTikaConfigTest {
-    @Test
-    public void testDetectorExcludeFromDefault() throws Exception {
-        TikaConfig config = getConfig("TIKA-1702-detector-blacklist.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        CompositeDetector detector = (CompositeDetector)config.getDetector();
-        
-        // Should be wrapping two detectors
-        assertEquals(2, detector.getDetectors().size());
-
-        
-        // First should be DefaultDetector, second Empty, that order
-        assertEquals(DefaultDetector.class, detector.getDetectors().get(0).getClass());
-        assertEquals(EmptyDetector.class,   detector.getDetectors().get(1).getClass());
-        
-        
-        // Get the DefaultDetector from the config
-        DefaultDetector confDetector = (DefaultDetector)detector.getDetectors().get(0);
-        
-        // Get a fresh "default" DefaultDetector
-        DefaultDetector normDetector = new DefaultDetector(config.getMimeRepository());
-        
-        
-        // The default one will offer the Zip and POIFS detectors
-        assertDetectors(normDetector, true, true);
-        
-        
-        // The one from the config won't, as we excluded those
-        assertDetectors(confDetector, false, false);
-    }
-    
-    /**
-     * TIKA-1708 - If the Zip detector is disabled, either explicitly,
-     *  or by supplying a list of detectors that doesn't include it, ensure
-     *  that detection of PST files still works
-     */
-    @Test
-    public void testPSTDetectionWithoutZipDetector() throws Exception {
-        // Check the one with an exclude
-        TikaConfig configWX = getConfig("TIKA-1708-detector-default.xml");
-        assertNotNull(configWX.getParser());
-        assertNotNull(configWX.getDetector());
-        CompositeDetector detectorWX = (CompositeDetector)configWX.getDetector();
-
-        // Check it has the POIFS one, but not the zip one
-        assertDetectors(detectorWX, true, false);
-        
-        
-        // Check the one with an explicit list
-        TikaConfig configCL = getConfig("TIKA-1708-detector-composite.xml");
-        assertNotNull(configCL.getParser());
-        assertNotNull(configCL.getDetector());
-        CompositeDetector detectorCL = (CompositeDetector)configCL.getDetector();
-        assertEquals(2, detectorCL.getDetectors().size());
-        
-        // Check it also has the POIFS one, but not the zip one
-        assertDetectors(detectorCL, true, false);
-        
-        
-        // Check that both configs have a media type registry with entries
-        assertTrue("Not enough mime types: " + configWX.getMediaTypeRegistry().getTypes().size(),
-                   configWX.getMediaTypeRegistry().getTypes().size() > 100);
-        assertTrue("Not enough mime types: " + configCL.getMediaTypeRegistry().getTypes().size(),
-                   configCL.getMediaTypeRegistry().getTypes().size() > 100);
-        
-        
-        // Now check they detect PST files correctly
-        TikaInputStream stream = TikaInputStream.get(
-                getResourceAsFile("/test-documents/testPST.pst"));
-        assertEquals(
-                OutlookPSTParser.MS_OUTLOOK_PST_MIMETYPE, 
-                detectorWX.detect(stream, new Metadata())
-        );
-        assertEquals(
-                OutlookPSTParser.MS_OUTLOOK_PST_MIMETYPE, 
-                detectorCL.detect(stream, new Metadata())
-        );
-    }
-    
-    private void assertDetectors(CompositeDetector detector, boolean shouldHavePOIFS,
-                                 boolean shouldHaveZip) {
-        boolean hasZip = false;
-        boolean hasPOIFS = false;
-        for (Detector d : detector.getDetectors()) {
-            if (d instanceof ZipContainerDetector) {
-                if (shouldHaveZip) {
-                    hasZip = true;
-                } else {
-                    fail("Shouldn't have the ZipContainerDetector from config");
-                }
-            }
-            if (d instanceof POIFSContainerDetector) {
-                if (shouldHavePOIFS) {
-                    hasPOIFS = true;
-                } else {
-                    fail("Shouldn't have the POIFSContainerDetector from config");
-                }
-            }
-        }
-        if (shouldHavePOIFS) assertTrue("Should have the POIFSContainerDetector", hasPOIFS);
-        if (shouldHaveZip)   assertTrue("Should have the ZipContainerDetector", hasZip);
-    }
-}
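
The tests above exercise detector exclusion through custom config files
(TIKA-1702-detector-blacklist.xml etc.) that are not shown in this diff. For context,
loading such a config and running detection looks roughly like the sketch below; the
config and document paths are placeholders:

    import java.io.File;

    import org.apache.tika.config.TikaConfig;
    import org.apache.tika.detect.Detector;
    import org.apache.tika.io.TikaInputStream;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;

    public class DetectWithConfig {
        public static void main(String[] args) throws Exception {
            // Build a Detector from a custom tika-config.xml (path is a placeholder)
            TikaConfig config = new TikaConfig(new File("tika-config.xml"));
            Detector detector = config.getDetector();

            try (TikaInputStream stream = TikaInputStream.get(new File("testPST.pst"))) {
                MediaType type = detector.detect(stream, new Metadata());
                System.out.println(type);
            }
        }
    }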

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/config/TikaParserConfigTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/config/TikaParserConfigTest.java b/tika-parsers/src/test/java/org/apache/tika/config/TikaParserConfigTest.java
deleted file mode 100644
index 2acd358..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/config/TikaParserConfigTest.java
+++ /dev/null
@@ -1,157 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.config;
-
-import static org.apache.tika.TikaTest.assertContains;
-import static org.apache.tika.TikaTest.assertNotContained;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import java.util.List;
-
-import org.apache.tika.mime.MediaType;
-import org.apache.tika.parser.CompositeParser;
-import org.apache.tika.parser.DefaultParser;
-import org.apache.tika.parser.EmptyParser;
-import org.apache.tika.parser.Parser;
-import org.apache.tika.parser.ParserDecorator;
-import org.apache.tika.parser.executable.ExecutableParser;
-import org.apache.tika.parser.xml.XMLParser;
-import org.junit.Test;
-
-/**
- * JUnit test class for {@link TikaConfig}, which covers things
- *  that {@link TikaConfigTest} can't do due to a need for the
- *  full set of parsers
- */
-public class TikaParserConfigTest extends AbstractTikaConfigTest {
-    @Test
-    public void testMimeExcludeInclude() throws Exception {
-        TikaConfig config = getConfig("TIKA-1558-blacklist.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        Parser parser = config.getParser();
-        
-        MediaType PDF = MediaType.application("pdf");
-        MediaType JPEG = MediaType.image("jpeg");
-        
-        
-        // Has two parsers
-        assertEquals(CompositeParser.class, parser.getClass());
-        CompositeParser cParser = (CompositeParser)parser;
-        assertEquals(2, cParser.getAllComponentParsers().size());
-        
-        // Both are decorated
-        assertTrue(cParser.getAllComponentParsers().get(0) instanceof ParserDecorator);
-        assertTrue(cParser.getAllComponentParsers().get(1) instanceof ParserDecorator);
-        ParserDecorator p0 = (ParserDecorator)cParser.getAllComponentParsers().get(0);
-        ParserDecorator p1 = (ParserDecorator)cParser.getAllComponentParsers().get(1);
-        
-        
-        // DefaultParser will be wrapped with excludes
-        assertEquals(DefaultParser.class, p0.getWrappedParser().getClass());
-        
-        assertNotContained(PDF, p0.getSupportedTypes(context));
-        assertContains(PDF, p0.getWrappedParser().getSupportedTypes(context));
-        assertNotContained(JPEG, p0.getSupportedTypes(context));
-        assertContains(JPEG, p0.getWrappedParser().getSupportedTypes(context));
-        
-        
-        // Will have an empty parser for PDF
-        assertEquals(EmptyParser.class, p1.getWrappedParser().getClass());
-        assertEquals(1, p1.getSupportedTypes(context).size());
-        assertContains(PDF, p1.getSupportedTypes(context));
-        assertNotContained(PDF, p1.getWrappedParser().getSupportedTypes(context));
-    }
-    
-    @Test
-    public void testParserExcludeFromDefault() throws Exception {
-        TikaConfig config = getConfig("TIKA-1558-blacklist.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        CompositeParser parser = (CompositeParser)config.getParser();
-        
-        MediaType PE_EXE = MediaType.application("x-msdownload");
-        MediaType ELF = MediaType.application("x-elf");
-        
-        
-        // Get the DefaultParser from the config
-        ParserDecorator confWrappedParser = (ParserDecorator)parser.getParsers().get(MediaType.APPLICATION_XML);
-        assertNotNull(confWrappedParser);
-        DefaultParser confParser = (DefaultParser)confWrappedParser.getWrappedParser();
-        
-        // Get a fresh "default" DefaultParser
-        DefaultParser normParser = new DefaultParser(config.getMediaTypeRegistry());
-        
-        
-        // The default one will offer the Executable Parser
-        assertContains(PE_EXE, normParser.getSupportedTypes(context));
-        assertContains(ELF, normParser.getSupportedTypes(context));
-        
-        boolean hasExec = false;
-        for (Parser p : normParser.getParsers().values()) {
-            if (p instanceof ExecutableParser) {
-                hasExec = true;
-                break;
-            }
-        }
-        assertTrue(hasExec);
-        
-        
-        // The one from the config won't
-        assertNotContained(PE_EXE, confParser.getSupportedTypes(context));
-        assertNotContained(ELF, confParser.getSupportedTypes(context));
-        
-        for (Parser p : confParser.getParsers().values()) {
-            if (p instanceof ExecutableParser)
-                fail("Shouldn't have the Executable Parser from config");
-        }
-    }
-    /**
-     * TIKA-1558 It should be possible to exclude Parsers from being picked up by
-     * DefaultParser.
-     */
-    @Test
-    public void defaultParserBlacklist() throws Exception {
-        TikaConfig config = new TikaConfig();
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        CompositeParser cp = (CompositeParser) config.getParser();
-        List<Parser> parsers = cp.getAllComponentParsers();
-
-        boolean hasXML = false;
-        for (Parser p : parsers) {
-            if (p instanceof XMLParser) {
-                hasXML = true;
-                break;
-            }
-        }
-        assertTrue("Default config should include an XMLParser.", hasXML);
-
-        // This custom TikaConfig should exclude XMLParser and all of its subclasses.
-        config = getConfig("TIKA-1558-blacklistsub.xml");
-        cp = (CompositeParser) config.getParser();
-        parsers = cp.getAllComponentParsers();
-
-        for (Parser p : parsers) {
-            if (p instanceof XMLParser)
-                fail("Custom config should not include an XMLParser (" + p.getClass() + ").");
-        }
-    }
-}
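
TIKA-1558 (tested above) is about excluding parsers from DefaultParser through
configuration. A quick way to see which parsers a given config actually ends up with is
to walk the composite, as sketched below; the config path is a placeholder:

    import java.io.File;

    import org.apache.tika.config.TikaConfig;
    import org.apache.tika.parser.CompositeParser;
    import org.apache.tika.parser.Parser;

    public class ListConfiguredParsers {
        public static void main(String[] args) throws Exception {
            TikaConfig config = new TikaConfig(new File("tika-config.xml"));
            CompositeParser composite = (CompositeParser) config.getParser();

            // Print the component parsers that survived the exclude rules
            for (Parser p : composite.getAllComponentParsers()) {
                System.out.println(p.getClass().getName());
            }
        }
    }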

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/config/TikaTranslatorConfigTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/config/TikaTranslatorConfigTest.java b/tika-parsers/src/test/java/org/apache/tika/config/TikaTranslatorConfigTest.java
deleted file mode 100644
index 71af206..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/config/TikaTranslatorConfigTest.java
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.config;
-
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotNull;
-
-import org.apache.tika.language.translate.DefaultTranslator;
-import org.apache.tika.language.translate.EmptyTranslator;
-import org.junit.Test;
-
-/**
- * JUnit test class for {@link TikaConfig}, which covers things
- *  that {@link TikaConfigTest} can't do due to a need for the
- *  full set of translators
- */
-public class TikaTranslatorConfigTest extends AbstractTikaConfigTest {
-    @Test
-    public void testDefaultBehaviour() throws Exception {
-        TikaConfig config = TikaConfig.getDefaultConfig();
-        assertNotNull(config.getTranslator());
-        assertEquals(DefaultTranslator.class, config.getTranslator().getClass());
-    }
-    
-    @Test
-    public void testRequestsDefault() throws Exception {
-        TikaConfig config = getConfig("TIKA-1702-translator-default.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        assertNotNull(config.getTranslator());
-        
-        assertEquals(DefaultTranslator.class, config.getTranslator().getClass());
-    }
-    
-    @Test
-    public void testRequestsEmpty() throws Exception {
-        TikaConfig config = getConfig("TIKA-1702-translator-empty.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        assertNotNull(config.getTranslator());
-        
-        assertEquals(EmptyTranslator.class, config.getTranslator().getClass());
-    }
-    
-    /**
-     * Currently, Translators don't support Composites, so
-     *  if multiple translators are given, only the first wins
-     */
-    @Test
-    public void testRequestsMultiple() throws Exception {
-        TikaConfig config = getConfig("TIKA-1702-translator-empty-default.xml");
-        assertNotNull(config.getParser());
-        assertNotNull(config.getDetector());
-        assertNotNull(config.getTranslator());
-        
-        assertEquals(EmptyTranslator.class, config.getTranslator().getClass());
-    }
-}
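
The tests above only assert which Translator implementation a config selects. Using the
configured translator is roughly as below; this is a sketch, the text and language codes
are placeholders, and DefaultTranslator needs a working backing service for
isAvailable() to return true:

    import org.apache.tika.config.TikaConfig;
    import org.apache.tika.language.translate.Translator;

    public class TranslateExample {
        public static void main(String[] args) throws Exception {
            TikaConfig config = TikaConfig.getDefaultConfig();
            Translator translator = config.getTranslator();

            // Only attempt translation if the underlying service is usable
            if (translator.isAvailable()) {
                String result = translator.translate("Bonjour tout le monde", "fr", "en");
                System.out.println(result);
            }
        }
    }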

http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java b/tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java
deleted file mode 100644
index 5787408..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java
+++ /dev/null
@@ -1,410 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.detect;
-
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNull;
-import static org.junit.Assert.assertTrue;
-
-import java.io.File;
-import java.io.FilenameFilter;
-import java.io.IOException;
-import java.io.InputStream;
-
-import org.apache.poi.poifs.filesystem.NPOIFSFileSystem;
-import org.apache.tika.config.TikaConfig;
-import org.apache.tika.io.TikaInputStream;
-import org.apache.tika.metadata.Metadata;
-import org.apache.tika.mime.MediaType;
-import org.apache.tika.mime.MimeTypes;
-import org.junit.Test;
-
-/**
- * Junit test class for {@link ContainerAwareDetector}
- */
-public class TestContainerAwareDetector {
-    private final TikaConfig tikaConfig = TikaConfig.getDefaultConfig();
-    private final MimeTypes mimeTypes = tikaConfig.getMimeRepository();
-    private final Detector detector = new DefaultDetector(mimeTypes);
-
-    private void assertTypeByData(String file, String type) throws Exception {
-       assertTypeByNameAndData(file, null, type);
-    }
-    private void assertTypeByNameAndData(String file, String type) throws Exception {
-       assertTypeByNameAndData(file, file, type);
-    }
-    private void assertType(String file, String byData, String byNameAndData) throws Exception {
-       assertTypeByData(file, byData);
-       assertTypeByNameAndData(file, byNameAndData);
-    }
-    private void assertTypeByNameAndData(String dataFile, String name, String type) throws Exception {
-        assertTypeByNameAndData(dataFile, name, type, null);
-    }
-    private void assertTypeByNameAndData(String dataFile, String name, String typeFromDetector, String typeFromMagic) throws Exception {
-        try (TikaInputStream stream = TikaInputStream.get(
-                TestContainerAwareDetector.class.getResource("/test-documents/" + dataFile))) {
-            Metadata m = new Metadata();
-            if (name != null)
-                m.add(Metadata.RESOURCE_NAME_KEY, name);
-
-            // Mime Magic version is likely to be less precise
-            if (typeFromMagic != null) {
-                assertEquals(
-                        MediaType.parse(typeFromMagic),
-                        mimeTypes.detect(stream, m));
-            }
-
-            // All being well, the detector should get it perfect
-            assertEquals(
-                    MediaType.parse(typeFromDetector),
-                    detector.detect(stream, m));
-        }
-    }
-
-    @Test
-    public void testDetectOLE2() throws Exception {
-        // Microsoft office types known by POI
-        assertTypeByData("testEXCEL.xls", "application/vnd.ms-excel");
-        assertTypeByData("testWORD.doc", "application/msword");
-        assertTypeByData("testPPT.ppt", "application/vnd.ms-powerpoint");
-        
-        assertTypeByData("test-outlook.msg", "application/vnd.ms-outlook");
-        assertTypeByData("test-outlook2003.msg", "application/vnd.ms-outlook");
-        assertTypeByData("testVISIO.vsd", "application/vnd.visio");
-        assertTypeByData("testPUBLISHER.pub", "application/x-mspublisher");
-        assertTypeByData("testWORKS.wps", "application/vnd.ms-works");
-        assertTypeByData("testWORKS2000.wps", "application/vnd.ms-works");
-        
-        // older Works Word Processor files can't be recognized
-        // they were created with Works Word Processor 7.0 (hence the text inside)
-        // and exported to the older formats with the "Save As" feature
-        assertTypeByData("testWORKSWordProcessor3.0.wps","application/vnd.ms-works");
-        assertTypeByData("testWORKSWordProcessor4.0.wps","application/vnd.ms-works");
-        assertTypeByData("testWORKSSpreadsheet7.0.xlr", "application/x-tika-msworks-spreadsheet");
-        assertTypeByData("testPROJECT2003.mpp", "application/vnd.ms-project");
-        assertTypeByData("testPROJECT2007.mpp", "application/vnd.ms-project");
-        
-        // Excel95 can be detected but not parsed
-        assertTypeByData("testEXCEL_95.xls", "application/vnd.ms-excel");
-
-        // Try some ones that POI doesn't handle, that are still OLE2 based
-        assertTypeByData("testCOREL.shw", "application/x-corelpresentations");
-        assertTypeByData("testQUATTRO.qpw", "application/x-quattro-pro");
-        assertTypeByData("testQUATTRO.wb3", "application/x-quattro-pro");
-        
-        assertTypeByData("testHWP_5.0.hwp", "application/x-hwp-v5");
-        
-        
-        // With the filename and data
-        assertTypeByNameAndData("testEXCEL.xls", "application/vnd.ms-excel");
-        assertTypeByNameAndData("testWORD.doc", "application/msword");
-        assertTypeByNameAndData("testPPT.ppt", "application/vnd.ms-powerpoint");
-        
-        // With the wrong filename supplied, data will trump filename
-        assertTypeByNameAndData("testEXCEL.xls", "notWord.doc",  "application/vnd.ms-excel");
-        assertTypeByNameAndData("testWORD.doc",  "notExcel.xls", "application/msword");
-        assertTypeByNameAndData("testPPT.ppt",   "notWord.doc",  "application/vnd.ms-powerpoint");
-        
-        // With a filename of a totally different type, data will trump filename
-        assertTypeByNameAndData("testEXCEL.xls", "notPDF.pdf",  "application/vnd.ms-excel");
-        assertTypeByNameAndData("testEXCEL.xls", "notPNG.png",  "application/vnd.ms-excel");
-    }
-    
-    /**
-     * There is no way to distinguish "proper" StarOffice files from templates.
-     * All templates have the same extension but their actual type depends on
-     * the magic. Our current MimeTypes class doesn't allow us to use the same
-     * glob pattern in more than one mimetype.
-     * 
-     * @throws Exception
-     */
-    @Test
-    public void testDetectStarOfficeFiles() throws Exception {
-        assertType("testStarOffice-5.2-calc.sdc",
-                "application/vnd.stardivision.calc",
-                "application/vnd.stardivision.calc");
-        assertType("testVORCalcTemplate.vor",
-                "application/vnd.stardivision.calc",
-                "application/vnd.stardivision.calc");
-        assertType("testStarOffice-5.2-draw.sda",
-                "application/vnd.stardivision.draw",
-                "application/vnd.stardivision.draw");
-        assertType("testVORDrawTemplate.vor",
-                "application/vnd.stardivision.draw",
-                "application/vnd.stardivision.draw");
-        assertType("testStarOffice-5.2-impress.sdd",
-                "application/vnd.stardivision.impress",
-                "application/vnd.stardivision.impress");
-        assertType("testVORImpressTemplate.vor",
-                "application/vnd.stardivision.impress",
-                "application/vnd.stardivision.impress");
-        assertType("testStarOffice-5.2-writer.sdw",
-                "application/vnd.stardivision.writer",
-                "application/vnd.stardivision.writer");
-        assertType("testVORWriterTemplate.vor",
-                "application/vnd.stardivision.writer",
-                "application/vnd.stardivision.writer");
-
-    }
-
-    @Test
-    public void testOpenContainer() throws Exception {
-        try (TikaInputStream stream = TikaInputStream.get(
-                TestContainerAwareDetector.class.getResource("/test-documents/testPPT.ppt"))) {
-            assertNull(stream.getOpenContainer());
-            assertEquals(
-                    MediaType.parse("application/vnd.ms-powerpoint"),
-                    detector.detect(stream, new Metadata()));
-            assertTrue(stream.getOpenContainer() instanceof NPOIFSFileSystem);
-        }
-    }
-
-    /**
-     * EPub uses a similar mimetype entry to OpenDocument for storing
-     *  the mimetype within the parent zip file
-     */
-    @Test
-    public void testDetectEPub() throws Exception {
-       assertTypeByData("testEPUB.epub", "application/epub+zip");
-       assertTypeByData("testiBooks.ibooks", "application/x-ibooks+zip");
-    }
-    
-    @Test
-    public void testDetectLotusNotesEml() throws Exception {
-        // Lotus .eml files aren't guaranteed to have any of the magic 
-        // matches as the first line, but should have X-Notes-Item and Message-ID
-        assertTypeByData("testLotusEml.eml", "message/rfc822");
-     }
-
-    @Test
-    public void testDetectODF() throws Exception {
-        assertTypeByData("testODFwithOOo3.odt", "application/vnd.oasis.opendocument.text");
-        assertTypeByData("testOpenOffice2.odf", "application/vnd.oasis.opendocument.formula");
-    }
-
-    @Test
-    public void testDetectOOXML() throws Exception {
-        assertTypeByData("testEXCEL.xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
-        assertTypeByData("testWORD.docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
-        assertTypeByData("testPPT.pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
-
-        // Check some of the less common OOXML types
-        assertTypeByData("testPPT.pptm", "application/vnd.ms-powerpoint.presentation.macroenabled.12");
-        assertTypeByData("testPPT.ppsx", "application/vnd.openxmlformats-officedocument.presentationml.slideshow");
-        assertTypeByData("testPPT.ppsm", "application/vnd.ms-powerpoint.slideshow.macroEnabled.12");
-        assertTypeByData("testDOTM.dotm", "application/vnd.ms-word.template.macroEnabled.12");
-        assertTypeByData("testEXCEL.strict.xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
-        assertTypeByData("testPPT.xps", "application/vnd.ms-xpsdocument");
-
-        assertTypeByData("testVISIO.vsdm", "application/vnd.ms-visio.drawing.macroenabled.12");
-        assertTypeByData("testVISIO.vsdx", "application/vnd.ms-visio.drawing");
-        assertTypeByData("testVISIO.vssm", "application/vnd.ms-visio.stencil.macroenabled.12");
-        assertTypeByData("testVISIO.vssx", "application/vnd.ms-visio.stencil");
-        assertTypeByData("testVISIO.vstm", "application/vnd.ms-visio.template.macroenabled.12");
-        assertTypeByData("testVISIO.vstx", "application/vnd.ms-visio.template");
-        
-        // .xlsb is an OOXML file containing the binary parts, and not
-        //  an OLE2 file as you might initially expect!
-        assertTypeByData("testEXCEL.xlsb", "application/vnd.ms-excel.sheet.binary.macroEnabled.12");
-
-        // With the filename and data
-        assertTypeByNameAndData("testEXCEL.xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
-        assertTypeByNameAndData("testWORD.docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
-        assertTypeByNameAndData("testPPT.pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
-        
-        // With the wrong filename supplied, data will trump filename
-        assertTypeByNameAndData("testEXCEL.xlsx", "notWord.docx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
-        assertTypeByNameAndData("testWORD.docx",  "notExcel.xlsx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
-        assertTypeByNameAndData("testPPT.pptx",   "notWord.docx", "application/vnd.openxmlformats-officedocument.presentationml.presentation");
-        
-        // With an incorrect filename of a different container type, data trumps filename
-        assertTypeByNameAndData("testEXCEL.xlsx", "notOldExcel.xls", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
-    }
-    
-    /**
-     * Password Protected OLE2 files are fairly straightforward to detect, as they
-     *  have the same structure as regular OLE2 files. (Core streams may be encrypted
-     *  however)
-     */
-    @Test
-    public void testDetectProtectedOLE2() throws Exception {
-        assertTypeByData("testEXCEL_protected_passtika.xls", "application/vnd.ms-excel");
-        assertTypeByData("testWORD_protected_passtika.doc", "application/msword");
-        assertTypeByData("testPPT_protected_passtika.ppt", "application/vnd.ms-powerpoint");
-        assertTypeByNameAndData("testEXCEL_protected_passtika.xls", "application/vnd.ms-excel");
-        assertTypeByNameAndData("testWORD_protected_passtika.doc", "application/msword");
-        assertTypeByNameAndData("testPPT_protected_passtika.ppt", "application/vnd.ms-powerpoint");
-    }
-
-    /**
-     * Password Protected OOXML files are much more tricky beasts to work with.
-     * They have a very different structure to regular OOXML files, and instead
-     *  of being ZIP based they are actually an OLE2 file which contains the
-     *  OOXML structure within an encrypted stream.
-     * This makes detecting them much harder...
-     */
-    @Test
-    public void testDetectProtectedOOXML() throws Exception {
-        // Encrypted Microsoft Office OOXML files have OLE magic but
-        //  special streams, so we can tell they're Protected OOXML
-        assertTypeByData("testEXCEL_protected_passtika.xlsx", 
-                "application/x-tika-ooxml-protected");
-        assertTypeByData("testWORD_protected_passtika.docx", 
-                "application/x-tika-ooxml-protected");
-        assertTypeByData("testPPT_protected_passtika.pptx", 
-                "application/x-tika-ooxml-protected");
-        
-        // At the moment, we can't use the name to specialise
-        // See discussions on TIKA-790 for details
-        assertTypeByNameAndData("testEXCEL_protected_passtika.xlsx", 
-                "application/x-tika-ooxml-protected");
-        assertTypeByNameAndData("testWORD_protected_passtika.docx", 
-                "application/x-tika-ooxml-protected");
-        assertTypeByNameAndData("testPPT_protected_passtika.pptx", 
-                "application/x-tika-ooxml-protected");
-    }
-
-    /**
-     * Check that temporary files created by Tika are removed after
-     * closing TikaInputStream.
-     */
-    @Test
-    public void testRemovalTempfiles() throws Exception {
-        assertRemovalTempfiles("testWORD.docx");
-        assertRemovalTempfiles("test-documents.zip");
-    }
-
-    private int countTemporaryFiles() {
-        return new File(System.getProperty("java.io.tmpdir")).listFiles(
-                new FilenameFilter() {
-                    public boolean accept(File dir, String name) {
-                        return name.startsWith("apache-tika-");
-                    }
-                }).length;
-    }
-
-    private void assertRemovalTempfiles(String fileName) throws Exception {
-        int numberOfTempFiles = countTemporaryFiles();
-
-        try (TikaInputStream stream = TikaInputStream.get(
-                TestContainerAwareDetector.class.getResource("/test-documents/" + fileName))) {
-            detector.detect(stream, new Metadata());
-        }
-
-        assertEquals(numberOfTempFiles, countTemporaryFiles());
-    }
-
-    @Test
-    public void testDetectIWork() throws Exception {
-        assertTypeByData("testKeynote.key", "application/vnd.apple.keynote");
-        assertTypeByData("testNumbers.numbers", "application/vnd.apple.numbers");
-        assertTypeByData("testPages.pages", "application/vnd.apple.pages");
-    }
-
-    @Test
-    public void testDetectKMZ() throws Exception {
-       assertTypeByData("testKMZ.kmz", "application/vnd.google-earth.kmz");
-    }
-    
-    @Test
-    public void testDetectIPA() throws Exception {
-        assertTypeByNameAndData("testIPA.ipa", "application/x-itunes-ipa");
-        assertTypeByData("testIPA.ipa", "application/x-itunes-ipa");
-    }
-    
-    @Test
-    public void testASiC() throws Exception {
-        assertTypeByData("testASiCE.asice", "application/vnd.etsi.asic-e+zip");
-        assertTypeByData("testASiCS.asics", "application/vnd.etsi.asic-s+zip");
-        assertTypeByNameAndData("testASiCE.asice", "application/vnd.etsi.asic-e+zip");
-        assertTypeByNameAndData("testASiCS.asics", "application/vnd.etsi.asic-s+zip");
-    }
-     
-    @Test
-    public void testDetectZip() throws Exception {
-        assertTypeByData("test-documents.zip", "application/zip");
-        assertTypeByData("test-zip-of-zip.zip", "application/zip");
-        
-        // JAR based formats
-        assertTypeByData("testJAR.jar", "application/java-archive");
-        assertTypeByData("testWAR.war", "application/x-tika-java-web-archive");
-        assertTypeByData("testEAR.ear", "application/x-tika-java-enterprise-archive");
-        assertTypeByData("testAPK.apk", "application/vnd.android.package-archive");
-        
-        // JAR with HTML files in it
-        assertTypeByNameAndData("testJAR_with_HTML.jar", "testJAR_with_HTML.jar",
-                                "application/java-archive", "application/java-archive");
-    }
-
-    private TikaInputStream getTruncatedFile(String name, int n)
-            throws IOException {
-        try (InputStream input = TestContainerAwareDetector.class.getResourceAsStream(
-                "/test-documents/" + name)) {
-            byte[] bytes = new byte[n];
-            int m = 0;
-            while (m < bytes.length) {
-                int i = input.read(bytes, m, bytes.length - m);
-                if (i != -1) {
-                    m += i;
-                } else {
-                    throw new IOException("Unexpected end of stream");
-                }
-            }
-            return TikaInputStream.get(bytes);
-        }
-    }
-
-    @Test
-    public void testTruncatedFiles() throws Exception {
-        // First up a truncated OOXML (zip) file
-       
-        // With only the data supplied, the best we can do is the container
-        Metadata m = new Metadata();
-        try (TikaInputStream xlsx = getTruncatedFile("testEXCEL.xlsx", 300)) {
-            assertEquals(
-                    MediaType.application("x-tika-ooxml"),
-                    detector.detect(xlsx, m));
-        }
-        
-        // With truncated data + filename, we can use the filename to specialise
-        m = new Metadata();
-        m.add(Metadata.RESOURCE_NAME_KEY, "testEXCEL.xlsx");
-        try (TikaInputStream xlsx = getTruncatedFile("testEXCEL.xlsx", 300)) {
-            assertEquals(
-                    MediaType.application("vnd.openxmlformats-officedocument.spreadsheetml.sheet"),
-                    detector.detect(xlsx, m));
-        }
-
-        // Now a truncated OLE2 file 
-        m = new Metadata();
-        try (TikaInputStream xls = getTruncatedFile("testEXCEL.xls", 400)) {
-            assertEquals(
-                    MediaType.application("x-tika-msoffice"),
-                    detector.detect(xls, m));
-        }
-        
-        // Finally a truncated OLE2 file, with a filename available
-        m = new Metadata();
-        m.add(Metadata.RESOURCE_NAME_KEY, "testEXCEL.xls");
-        try (TikaInputStream xls = getTruncatedFile("testEXCEL.xls", 400)) {
-            assertEquals(
-                    MediaType.application("vnd.ms-excel"),
-                    detector.detect(xls, m));
-        }
-   }
-
-}

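For context on the container-aware detection the deleted test above exercises: the filename hint travels in the Metadata object, and the detector uses it to specialise a generic container type (for example a truncated OOXML file whose bytes only identify it as a ZIP-based container). Below is a minimal sketch of that flow, assuming tika-core and tika-parsers are on the classpath and using a hypothetical local file name:

    import java.io.File;

    import org.apache.tika.detect.DefaultDetector;
    import org.apache.tika.detect.Detector;
    import org.apache.tika.io.TikaInputStream;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;

    public class DetectWithFilenameHint {
        public static void main(String[] args) throws Exception {
            Detector detector = new DefaultDetector();

            // The filename hint is carried in the Metadata; with only raw bytes
            // (e.g. a truncated file) detection may stop at the generic container type.
            Metadata metadata = new Metadata();
            metadata.add(Metadata.RESOURCE_NAME_KEY, "testEXCEL.xlsx"); // hypothetical file name

            try (TikaInputStream stream = TikaInputStream.get(new File("testEXCEL.xlsx"))) {
                MediaType type = detector.detect(stream, metadata);
                // With tika-parsers present this should specialise to the OOXML spreadsheet type
                System.out.println(type);
            }
        }
    }

Without the RESOURCE_NAME_KEY hint the same truncated bytes may only be reported as application/x-tika-ooxml, which is exactly what the truncated-file assertions above check.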
http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/embedder/ExternalEmbedderTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/embedder/ExternalEmbedderTest.java b/tika-parsers/src/test/java/org/apache/tika/embedder/ExternalEmbedderTest.java
deleted file mode 100644
index e988aff..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/embedder/ExternalEmbedderTest.java
+++ /dev/null
@@ -1,292 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.embedder;
-
-import static java.nio.charset.StandardCharsets.UTF_8;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import java.io.ByteArrayOutputStream;
-import java.io.File;
-import java.io.FileInputStream;
-import java.io.FileNotFoundException;
-import java.io.FileOutputStream;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStreamWriter;
-import java.net.URISyntaxException;
-import java.net.URL;
-import java.text.DateFormat;
-import java.text.SimpleDateFormat;
-import java.util.Date;
-import java.util.HashMap;
-import java.util.Locale;
-import java.util.Map;
-
-import org.apache.tika.exception.TikaException;
-import org.apache.tika.io.TemporaryResources;
-import org.apache.tika.io.TikaInputStream;
-import org.apache.tika.metadata.Metadata;
-import org.apache.tika.metadata.Property;
-import org.apache.tika.metadata.TikaCoreProperties;
-import org.apache.tika.parser.ParseContext;
-import org.apache.tika.parser.Parser;
-import org.apache.tika.parser.txt.TXTParser;
-import org.apache.tika.sax.BodyContentHandler;
-import org.junit.Test;
-import org.xml.sax.ContentHandler;
-import org.xml.sax.SAXException;
-
-/**
- * Unit test for {@link ExternalEmbedder}s.
- */
-public class ExternalEmbedderTest {
-
-    protected static final DateFormat EXPECTED_METADATA_DATE_FORMATTER =
-            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.ROOT);
-    protected static final String DEFAULT_CHARSET = UTF_8.name();
-    private static final String COMMAND_METADATA_ARGUMENT_DESCRIPTION = "dc:description";
-    private static final String TEST_TXT_PATH = "/test-documents/testTXT.txt";
-
-    private TemporaryResources tmp = new TemporaryResources();
-
-    /**
-     * Gets the expected metadata value for the given field.
-     *
-     * @param fieldName the name of the metadata field
-     * @param timestamp the timestamp appended to the value
-     * @return the field name prefixed with the test class name and suffixed with the formatted timestamp
-     */
-    protected String getExpectedMetadataValueString(String fieldName, Date timestamp) {
-        return this.getClass().getSimpleName() + " embedded " + fieldName +
-                " on " + EXPECTED_METADATA_DATE_FORMATTER.format(timestamp);
-    }
-
-    /**
-     * Gets the tika <code>Metadata</code> object containing data to be
-     * embedded.
-     *
-     * @return the populated tika metadata object
-     */
-    protected Metadata getMetadataToEmbed(Date timestamp) {
-        Metadata metadata = new Metadata();
-        metadata.add(TikaCoreProperties.DESCRIPTION,
-                getExpectedMetadataValueString(TikaCoreProperties.DESCRIPTION.toString(), timestamp));
-        return metadata;
-    }
-
-    /**
-     * Gets the <code>Embedder</code> to test.
-     *
-     * @return the embedder under test
-     */
-    protected Embedder getEmbedder() {
-        ExternalEmbedder embedder = new ExternalEmbedder();
-        Map<Property, String[]> metadataCommandArguments = new HashMap<Property, String[]>(1);
-        metadataCommandArguments.put(TikaCoreProperties.DESCRIPTION,
-                new String[] { COMMAND_METADATA_ARGUMENT_DESCRIPTION });
-        embedder.setMetadataCommandArguments(metadataCommandArguments);
-        return embedder;
-    }
-
-    /**
-     * Gets the source input stream through standard Java resource loaders 
-     * before metadata has been embedded.
-     *
-     * @return a fresh input stream
-     */
-    protected InputStream getSourceStandardInputStream() {
-        return this.getClass().getResourceAsStream(TEST_TXT_PATH);
-    }
-
-    /**
-     * Gets the source input stream via {@link TikaInputStream}
-     * before metadata has been embedded.
-     *
-     * @return a fresh input stream
-     * @throws FileNotFoundException if the test resource cannot be located
-     */
-    protected InputStream getSourceTikaInputStream() throws FileNotFoundException {
-        return TikaInputStream.get(getSourceInputFile());
-    }
-    
-    /**
-     * Gets the source input file through standard Java resource loaders
-     * before metadata has been embedded.
-     *
-     * @return the source file
-     * @throws FileNotFoundException if the test resource cannot be located
-     */
-    protected File getSourceInputFile() throws FileNotFoundException {
-        URL origUrl = this.getClass().getResource(TEST_TXT_PATH);
-        if (origUrl == null) {
-            throw new FileNotFoundException("could not load " + TEST_TXT_PATH);
-        }
-        try {
-            return new File(origUrl.toURI());
-        } catch (URISyntaxException e) {
-            throw new FileNotFoundException(e.getMessage());
-        }
-    }
-
-    /**
-     * Gets the parser to use to verify the result of the embed operation.
-     *
-     * @return the parser to read embedded metadata
-     */
-    protected Parser getParser() {
-        return new TXTParser();
-    }
-
-    /**
-     * Whether or not the final result of reading the now embedded metadata is
-     * expected in the output of the external tool
-     *
-     * @return whether or not results are expected in command line output
-     */
-    protected boolean getIsMetadataExpectedInOutput() {
-        return true;
-    }
-
-    /**
-     * Embeds metadata into a copy of the source stream, then re-parses the
-     * result to verify that the expected values were embedded.
-     *
-     * @param sourceInputStream the stream whose copy receives the embedded metadata
-     * @param isResultExpectedInOutput whether or not results are expected in command line output
-     */
-    protected void embedInTempFile(InputStream sourceInputStream, boolean isResultExpectedInOutput) {
-        Embedder embedder = getEmbedder();
-        
-        // TODO Move this check to ExternalEmbedder
-        String os = System.getProperty("os.name", "");
-        if (os.contains("Windows")) {
-            // Skip test on Windows
-            return;
-        }
-        
-        Date timestamp = new Date();
-        Metadata metadataToEmbed = getMetadataToEmbed(timestamp);
-
-        try {
-            File tempOutputFile = tmp.createTemporaryFile();
-            FileOutputStream tempFileOutputStream = new FileOutputStream(tempOutputFile);
-
-            // Embed the metadata into a copy of the original output stream
-            embedder.embed(metadataToEmbed, sourceInputStream, tempFileOutputStream, null);
-
-            ParseContext context = new ParseContext();
-            Parser parser = getParser();
-            context.set(Parser.class, parser);
-
-            // Set up the extracting content handler
-            ByteArrayOutputStream result = new ByteArrayOutputStream();
-            OutputStreamWriter outputWriter = new OutputStreamWriter(result, DEFAULT_CHARSET);
-            ContentHandler handler = new BodyContentHandler(outputWriter);
-
-            // Create a new metadata object to read the new metadata into
-            Metadata embeddedMetadata = new Metadata();
-
-            // Set up a re-read of the now embedded temp file
-            FileInputStream embeddedFileInputStream = new FileInputStream(tempOutputFile);
-
-            parser.parse(embeddedFileInputStream, handler, embeddedMetadata,
-                    context);
-
-            tmp.dispose();
-
-            String outputString = null;
-            if (isResultExpectedInOutput) {
-                outputString = result.toString(DEFAULT_CHARSET);
-            } else {
-                assertTrue("no metadata found", embeddedMetadata.size() > 0);
-            }
-
-            // Check each metadata property for the expected value
-            for (String metadataName : metadataToEmbed.names()) {
-                if (metadataToEmbed.get(metadataName) != null) {
-                    String expectedValue = metadataToEmbed.get(metadataName);
-                    boolean foundExpectedValue = false;
-                    if (isResultExpectedInOutput) {
-                        // just check that the entire output contains the expected string
-                        foundExpectedValue = outputString.contains(expectedValue);
-                    } else {
-                        if (embeddedMetadata.isMultiValued(metadataName)) {
-                            for (String embeddedValue : embeddedMetadata.getValues(metadataName)) {
-                                if (embeddedValue != null) {
-                                    if (embeddedValue.contains(expectedValue)) {
-                                        foundExpectedValue = true;
-                                        break;
-                                    }
-                                }
-                            }
-                        } else {
-                            String embeddedValue = embeddedMetadata.get(metadataName);
-                            assertNotNull("expected metadata for "
-                                    + metadataName + " not found",
-                                    embeddedValue);
-                            foundExpectedValue = embeddedValue.contains(expectedValue);
-                        }
-                    }
-                    assertTrue(
-                            "result did not contain expected appended metadata "
-                                    + metadataName + "="
-                                    + expectedValue,
-                            foundExpectedValue);
-                }
-            }
-        } catch (IOException e) {
-            fail(e.getMessage());
-        } catch (TikaException e) {
-            fail(e.getMessage());
-        } catch (SAXException e) {
-            fail(e.getMessage());
-        }
-    }
-    
-    protected void checkSourceFileExists() {
-        String message = "the original input file was deleted";
-        try {
-            File origInputFile = getSourceInputFile();
-            assertNotNull(message, origInputFile);
-            assertTrue(message, origInputFile.exists());
-        } catch (FileNotFoundException e) {
-            fail(message + ": " + e.getMessage());
-        }
-    }
-
-    /**
-     * Tests embedding using an input stream obtained via {@link ExternalEmbedderTest#getSourceStandardInputStream()}
-     * 
-     * @throws IOException
-     */
-    @Test
-    public void testEmbedStandardInputStream() throws IOException {
-        embedInTempFile(getSourceStandardInputStream(), getIsMetadataExpectedInOutput());
-        checkSourceFileExists();
-    }
-    
-    /**
-     * Tests embedding using an input stream obtained via {@link ExternalEmbedderTest#getSourceTikaInputStream()}
-     * 
-     * @throws IOException
-     */
-    @Test
-    public void testEmbedTikaInputStream() throws IOException {
-        embedInTempFile(getSourceTikaInputStream(), getIsMetadataExpectedInOutput());
-        checkSourceFileExists();
-    }
-
-}

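The deleted embedder test above drives ExternalEmbedder through the same three calls shown below. This is a minimal sketch using only those calls, assuming the embedder's default external command is available on the PATH (the test skips Windows for this reason) and using hypothetical input/output paths:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.tika.embedder.ExternalEmbedder;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.metadata.Property;
    import org.apache.tika.metadata.TikaCoreProperties;

    public class EmbedDescription {
        public static void main(String[] args) throws Exception {
            ExternalEmbedder embedder = new ExternalEmbedder();

            // Map the Tika property onto the argument the external command expects,
            // exactly as the deleted test does for dc:description.
            Map<Property, String[]> arguments = new HashMap<Property, String[]>();
            arguments.put(TikaCoreProperties.DESCRIPTION, new String[] { "dc:description" });
            embedder.setMetadataCommandArguments(arguments);

            Metadata metadata = new Metadata();
            metadata.add(TikaCoreProperties.DESCRIPTION, "embedded by ExternalEmbedder");

            // in.txt and out.txt are hypothetical paths
            try (InputStream in = new FileInputStream("in.txt");
                 OutputStream out = new FileOutputStream("out.txt")) {
                embedder.embed(metadata, in, out, null);
            }
        }
    }

The test then re-parses the output file with a plain TXTParser to confirm the description value round-trips; any parser appropriate to the target format could be used for that verification step.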
http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypeTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypeTest.java b/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypeTest.java
deleted file mode 100644
index 7987630..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypeTest.java
+++ /dev/null
@@ -1,105 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.mime;
-
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import org.junit.Before;
-import org.junit.Test;
-
-public class MimeTypeTest {
-
-    private MimeTypes types;
-    private MimeType text;
-
-    @Before
-    public void setUp() throws MimeTypeException {
-        types = new MimeTypes();
-        text = types.forName("text/plain");
-    }
-
-    /** Test MimeType constructor */
-    @Test
-    public void testConstructor() {
-        // Missing name
-        try {
-            new MimeType(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-    }
-
-    @Test
-    public void testIsValidName() {
-        assertTrue(MimeType.isValid("application/octet-stream"));
-        assertTrue(MimeType.isValid("text/plain"));
-        assertTrue(MimeType.isValid("foo/bar"));
-        assertTrue(MimeType.isValid("a/b"));
-
-        assertFalse(MimeType.isValid("application"));
-        assertFalse(MimeType.isValid("application/"));
-        assertFalse(MimeType.isValid("/"));
-        assertFalse(MimeType.isValid("/octet-stream"));
-        assertFalse(MimeType.isValid("application//octet-stream"));
-        assertFalse(MimeType.isValid("application/octet=stream"));
-        assertFalse(MimeType.isValid("application/\u00f6ctet-stream"));
-        assertFalse(MimeType.isValid("text/plain;"));
-        assertFalse(MimeType.isValid("text/plain; charset=UTF-8"));
-        try {
-            MimeType.isValid(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-    }
-
-    /** Test MimeType setDescription() */
-    @Test
-    public void testSetEmptyValues() {
-        try {
-            text.setDescription(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-        
-        try {
-            text.setAcronym(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-        
-        try {
-            text.addLink(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-
-        try {
-            text.setUniformTypeIdentifier(null);
-            fail("Expected IllegalArgumentException");
-        } catch (IllegalArgumentException e) {
-            // expected result
-        }
-    }
-
-}

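The deleted MimeTypeTest above covers name validation and null-argument handling. A small sketch of the same MimeType.isValid / MimeTypes.forName behaviour, using only the calls the test itself exercises:

    import org.apache.tika.mime.MimeType;
    import org.apache.tika.mime.MimeTypeException;
    import org.apache.tika.mime.MimeTypes;

    public class MimeTypeNames {
        public static void main(String[] args) throws MimeTypeException {
            // Syntactic validation only: parameters such as "; charset=UTF-8" are rejected
            System.out.println(MimeType.isValid("text/plain"));                // true
            System.out.println(MimeType.isValid("text/plain; charset=UTF-8")); // false

            MimeTypes types = new MimeTypes();
            MimeType text = types.forName("text/plain"); // looks up (or registers) the type
            System.out.println(text.getType());          // text/plain

            try {
                types.forName("invalid");                // no slash, so not a valid type name
            } catch (MimeTypeException expected) {
                System.out.println("rejected: " + expected.getMessage());
            }
        }
    }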
http://git-wip-us.apache.org/repos/asf/tika/blob/aa5f60d7/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypesTest.java
----------------------------------------------------------------------
diff --git a/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypesTest.java b/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypesTest.java
deleted file mode 100644
index be8a575..0000000
--- a/tika-parsers/src/test/java/org/apache/tika/mime/MimeTypesTest.java
+++ /dev/null
@@ -1,122 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.tika.mime;
-
-import static org.apache.tika.mime.MediaType.OCTET_STREAM;
-import static org.apache.tika.mime.MediaType.TEXT_PLAIN;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertFalse;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertNull;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import org.junit.Before;
-import org.junit.Test;
-
-public class MimeTypesTest {
-
-    private MimeTypes types;
-
-    private MediaTypeRegistry registry;
-
-    private MimeType binary;
-
-    private MimeType text;
-
-    private MimeType html;
-
-    @Before
-    public void setUp() throws MimeTypeException {
-        types = new MimeTypes();
-        registry = types.getMediaTypeRegistry();
-        binary = types.forName("application/octet-stream");
-        text = types.forName("text/plain");
-        types.addAlias(text, MediaType.parse("text/x-plain"));
-        html = types.forName("text/html");
-        types.setSuperType(html, TEXT_PLAIN);
-    }
-
-    @Test
-    public void testForName() throws MimeTypeException {
-        assertEquals(text, types.forName("text/plain"));
-        assertEquals(text, types.forName("TEXT/PLAIN"));
-
-        try {
-            types.forName("invalid");
-            fail("MimeTypeException not thrown on invalid type name");
-        } catch (MimeTypeException e) {
-            // expected
-        }
-    }
-
-    @Test
-    public void testRegisteredMimes() throws MimeTypeException {
-        String dummy = "text/xxxxx";
-        assertEquals(text, types.getRegisteredMimeType("text/plain"));
-        assertNull(types.getRegisteredMimeType(dummy));
-        assertNotNull(types.forName(dummy));
-        assertEquals(dummy, types.forName("text/xxxxx").getType().toString());
-        assertEquals(dummy, types.getRegisteredMimeType("text/xxxxx").getType().toString());
-        
-        try {
-            types.forName("invalid");
-            fail("MimeTypeException not thrown on invalid type name");
-        } catch (MimeTypeException e) {
-            // expected
-        }
-    }
-
-    @Test
-    public void testSuperType() throws MimeTypeException {
-        assertNull(registry.getSupertype(OCTET_STREAM));
-        assertEquals(OCTET_STREAM, registry.getSupertype(TEXT_PLAIN));
-        assertEquals(TEXT_PLAIN, registry.getSupertype(html.getType()));
-   }
-
-    @Test
-    public void testIsDescendantOf() {
-        assertFalse(registry.isSpecializationOf(OCTET_STREAM, OCTET_STREAM));
-        assertFalse(registry.isSpecializationOf(TEXT_PLAIN, TEXT_PLAIN));
-        assertFalse(registry.isSpecializationOf(html.getType(), html.getType()));
-
-        assertTrue(registry.isSpecializationOf(html.getType(), OCTET_STREAM));
-        assertFalse(registry.isSpecializationOf(OCTET_STREAM, html.getType()));
-
-        assertTrue(registry.isSpecializationOf(html.getType(), TEXT_PLAIN));
-        assertFalse(registry.isSpecializationOf(TEXT_PLAIN, html.getType()));
-
-        assertTrue(registry.isSpecializationOf(TEXT_PLAIN, OCTET_STREAM));
-        assertFalse(registry.isSpecializationOf(OCTET_STREAM, TEXT_PLAIN));
-    }
-
-    @Test
-    public void testCompareTo() {
-        assertTrue(binary.compareTo(binary) == 0);
-        assertTrue(binary.compareTo(text) != 0);
-        assertTrue(binary.compareTo(html) != 0);
-
-        assertTrue(text.compareTo(binary) != 0);
-        assertTrue(text.compareTo(text) == 0);
-        assertTrue(text.compareTo(html) != 0);
-
-        assertTrue(html.compareTo(binary) != 0);
-        assertTrue(html.compareTo(text) != 0);
-        assertTrue(html.compareTo(html) == 0);
-    }
-
-}
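The deleted MimeTypesTest above wires text/html under text/plain and then queries the MediaTypeRegistry for supertypes and specialisations. A compact sketch of those registry calls, mirroring the test's own setUp():

    import org.apache.tika.mime.MediaType;
    import org.apache.tika.mime.MediaTypeRegistry;
    import org.apache.tika.mime.MimeType;
    import org.apache.tika.mime.MimeTypeException;
    import org.apache.tika.mime.MimeTypes;

    public class RegistryHierarchy {
        public static void main(String[] args) throws MimeTypeException {
            MimeTypes types = new MimeTypes();
            MediaTypeRegistry registry = types.getMediaTypeRegistry();

            MimeType html = types.forName("text/html");
            // Declare text/html as a specialisation of text/plain, as the test's setUp() does
            types.setSuperType(html, MediaType.TEXT_PLAIN);

            System.out.println(registry.getSupertype(MediaType.TEXT_PLAIN)); // application/octet-stream
            System.out.println(registry.getSupertype(html.getType()));       // text/plain
            // Specialisation is transitive but never reflexive
            System.out.println(registry.isSpecializationOf(html.getType(), MediaType.OCTET_STREAM)); // true
        }
    }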