You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tika.apache.org by bo...@apache.org on 2015/12/30 18:36:58 UTC
svn commit: r1722385 [5/27] - in /tika/branches/2.x: tika-parser-bundles/
tika-parser-modules/ tika-parsers/ tika-test-resources/
tika-test-resources/src/ tika-test-resources/src/test/
tika-test-resources/src/test/java/ tika-test-resources/src/test/res...
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/complex.mbox
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/complex.mbox?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/complex.mbox (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/complex.mbox Wed Dec 30 17:36:48 2015
@@ -0,0 +1,291 @@
+From core-user-return-14700-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 04:28:28 2009
+Return-Path: <co...@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 19921 invoked from network); 1 Jun 2009 04:28:28 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+ by minotaur.apache.org with SMTP; 1 Jun 2009 04:28:28 -0000
+Received: (qmail 84995 invoked by uid 500); 1 Jun 2009 04:28:38 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 84895 invoked by uid 500); 1 Jun 2009 04:28:38 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <ma...@hadoop.apache.org>
+List-Unsubscribe: <ma...@hadoop.apache.org>
+List-Post: <ma...@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 84885 invoked by uid 99); 1 Jun 2009 04:28:38 -0000
+Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
+ by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 04:28:38 +0000
+X-ASF-Spam-Status: No, hits=1.2 required=10.0
+ tests=SPF_NEUTRAL
+X-Spam-Check-By: apache.org
+Received-SPF: neutral (athena.apache.org: local policy)
+Received: from [69.147.107.21] (HELO mrout2-b.corp.re1.yahoo.com) (69.147.107.21)
+ by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 04:28:26 +0000
+Received: from SNV-EXPF01.ds.corp.yahoo.com (snv-expf01.ds.corp.yahoo.com [207.126.227.250])
+ by mrout2-b.corp.re1.yahoo.com (8.13.8/8.13.8/y.out) with ESMTP id n514QYA6099963
+ for <co...@hadoop.apache.org>; Sun, 31 May 2009 21:26:35 -0700 (PDT)
+DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns;
+ h=received:user-agent:date:subject:from:to:message-id:
+ thread-topic:thread-index:in-reply-to:mime-version:content-type:
+ content-transfer-encoding:x-originalarrivaltime;
+ b=YVtSNdgjeeSBS1yY3XDolul49i+HrgNG7QszMo9LzGnrwejjgsl5+iUM6EiQgEpV
+Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXPF01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959);
+ Sun, 31 May 2009 21:26:34 -0700
+Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ;
+ Mon, 1 Jun 2009 04:26:33 +0000
+User-Agent: Microsoft-Entourage/12.17.0.090302
+Date: Mon, 01 Jun 2009 09:56:31 +0530
+Subject: Re: question about when shuffle/sort start working
+From: Jothi Padmanabhan <jo...@yahoo-inc.com>
+To: <co...@hadoop.apache.org>
+Message-ID: <C6...@yahoo-inc.com>
+Thread-Topic: question about when shuffle/sort start working
+Thread-Index: AcnicSNoBw19cMU8UEaXwAdZ1YYhuw==
+In-Reply-To: <44...@web111005.mail.gq1.yahoo.com>
+Mime-version: 1.0
+Content-type: text/plain;
+ charset="US-ASCII"
+Content-transfer-encoding: 7bit
+X-OriginalArrivalTime: 01 Jun 2009 04:26:34.0501 (UTC) FILETIME=[257EAB50:01C9E271]
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+fetch map outputs for a given map only on the receipt of such events.
+
+Jothi
+
+
+On 5/30/09 10:00 AM, "Jianmin Woo" <ji...@yahoo.com> wrote:
+
+> Hi,
+> I am being confused by the protocol between mapper and reducer. When mapper
+> emitting the (key,value) pair done, is there any signal the mapper send out to
+> hadoop framework in protocol to indicate that map is done and the shuffle/sort
+> can begin for reducer? If there is no this signal in protocol, when the
+> framework begin the shuffle/sort?
+>
+> Thanks,
+> Jianmin
+>
+>
+>
+>
+
+
+From core-user-return-14701-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 05:31:14 2009
+Return-Path: <co...@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 38243 invoked from network); 1 Jun 2009 05:31:14 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+ by minotaur.apache.org with SMTP; 1 Jun 2009 05:31:14 -0000
+Received: (qmail 15621 invoked by uid 500); 1 Jun 2009 05:31:24 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 15557 invoked by uid 500); 1 Jun 2009 05:31:24 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <ma...@hadoop.apache.org>
+List-Unsubscribe: <ma...@hadoop.apache.org>
+List-Post: <ma...@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 15547 invoked by uid 99); 1 Jun 2009 05:31:24 -0000
+Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
+ by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 05:31:24 +0000
+X-ASF-Spam-Status: No, hits=2.2 required=10.0
+ tests=HTML_MESSAGE,SPF_PASS
+X-Spam-Check-By: apache.org
+Received-SPF: pass (nike.apache.org: local policy)
+Received: from [68.142.237.94] (HELO n9.bullet.re3.yahoo.com) (68.142.237.94)
+ by apache.org (qpsmtpd/0.29) with SMTP; Mon, 01 Jun 2009 05:31:11 +0000
+Received: from [68.142.237.88] by n9.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 05:30:50 -0000
+Received: from [67.195.9.82] by t4.bullet.re3.yahoo.com with NNFMP; 01 Jun 2009 05:30:49 -0000
+Received: from [67.195.9.99] by t2.bullet.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:30:49 -0000
+Received: from [127.0.0.1] by omp103.mail.gq1.yahoo.com with NNFMP; 01 Jun 2009 05:28:01 -0000
+X-Yahoo-Newman-Property: ymail-3
+X-Yahoo-Newman-Id: 796121.97519.bm@omp103.mail.gq1.yahoo.com
+Received: (qmail 35264 invoked by uid 60001); 1 Jun 2009 05:30:49 -0000
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1243834249; bh=R8qzdi/IbLyO8UwpnaujDpT9E+6bJ7nkmZN2803EmRk=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=vq4c6RIDbkuLPYd8mirusIXf6DqTb/IeT55In7W00Y5Sxx1ZiXBb78yE9+TDfXJ0elsEZvqv4ocyvolGE0eGtyYeJA0mZikpRNu6pidxPNpCplOcLHBRz7YQ7iERwv3TagRlWy2Xd3oD9ZeV0A05P7WUOiNNX1PUUJD1IVdrEZo=
+DomainKey-Signature:a=rsa-sha1; q=dns; c=nofws;
+ s=s1024; d=yahoo.com;
+ h=Message-ID:X-YMail-OSG:Received:X-Mailer:References:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type;
+ b=6HXZV98ON5vBwmE/xS8stVD0D2F4dkMY7a0suX5KVTb736JdR8G59mqBq/dWcpbFTLiCLtxi18LMb/dU1RKRGOEdn3l3j/jKXhBrhIgfg3qtNskPedXDKBvn7JGXiSkqpA/tUtPjvc0Uuk8/LaA01SQTz40Engg7nD8/EJdIAhA=;
+Message-ID: <59...@web111010.mail.gq1.yahoo.com>
+X-YMail-OSG: KzhhrJYVM1m.MCS6vRpRP2ZZO2PrfnbngosELDCIa91ZqvhJph4RdmzfUW0jw9W04RCSch1K730bPohwNpNBIk2QR_zt4_mfbhfq7YEPkSoz9LSXG90P9vIo5Fc8qyZN0U6vA9gtdyGQTpN5ahvillUH9nAF0TMWv2SvZJLjPlQ0Z0p8oK8ltBwGTgLrM8Jtdn9D29yoRyi3_EpVOfdD9OP.EK50Vr1XwSUYMbnpZ0WGHMwd.Yig7A6Elwadm3YVbfOdx2mfrG.jQsUAxQjRBNvbrOM57.FaE11kHTe9aoBWSeihNg--
+Received: from [216.145.54.7] by web111010.mail.gq1.yahoo.com via HTTP; Sun, 31 May 2009 22:30:49 PDT
+X-Mailer: YahooMailRC/1277.43 YahooMailWebService/0.7.289.10
+References: <C6...@yahoo-inc.com>
+Date: Sun, 31 May 2009 22:30:49 -0700 (PDT)
+From: Jianmin Woo <ji...@yahoo.com>
+Subject: Re: question about when shuffle/sort start working
+To: core-user@hadoop.apache.org
+In-Reply-To: <C6...@yahoo-inc.com>
+MIME-Version: 1.0
+Content-Type: multipart/alternative; boundary="0-1193839393-1243834249=:35091"
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+--0-1193839393-1243834249=:35091
+Content-Type: text/plain; charset=us-ascii
+
+Thanks a lot for your explanation, Jothi.
+
+So is this event generated by hadoop framework? Is there any API in mapper to fire this event? Actually, I am thinking to implement a mapper that will emit some <key, value> pairs, then fire this event to let the reducer works, the same mapper task then emit some other <key, value> pairs and repeat. Do you think is this logic feasible by current API?
+
+Thanks,
+Jianmin
+
+
+
+
+
+________________________________
+From: Jothi Padmanabhan <jo...@yahoo-inc.com>
+To: core-user@hadoop.apache.org
+Sent: Monday, June 1, 2009 12:26:31 PM
+Subject: Re: question about when shuffle/sort start working
+
+When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+fetch map outputs for a given map only on the receipt of such events.
+
+Jothi
+
+
+On 5/30/09 10:00 AM, "Jianmin Woo" <ji...@yahoo.com> wrote:
+
+> Hi,
+> I am being confused by the protocol between mapper and reducer. When mapper
+> emitting the (key,value) pair done, is there any signal the mapper send out to
+> hadoop framework in protocol to indicate that map is done and the shuffle/sort
+> can begin for reducer? If there is no this signal in protocol, when the
+> framework begin the shuffle/sort?
+>
+> Thanks,
+> Jianmin
+>
+>
+>
+>
+
+
+
+--0-1193839393-1243834249=:35091--
+
+
+From core-user-return-14702-apmail-hadoop-core-user-archive=hadoop.apache.org@hadoop.apache.org Mon Jun 01 06:04:30 2009
+Return-Path: <co...@hadoop.apache.org>
+Delivered-To: apmail-hadoop-core-user-archive@www.apache.org
+Received: (qmail 53387 invoked from network); 1 Jun 2009 06:04:29 -0000
+Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
+ by minotaur.apache.org with SMTP; 1 Jun 2009 06:04:29 -0000
+Received: (qmail 39066 invoked by uid 500); 1 Jun 2009 06:04:39 -0000
+Delivered-To: apmail-hadoop-core-user-archive@hadoop.apache.org
+Received: (qmail 38970 invoked by uid 500); 1 Jun 2009 06:04:39 -0000
+Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
+Precedence: bulk
+List-Help: <ma...@hadoop.apache.org>
+List-Unsubscribe: <ma...@hadoop.apache.org>
+List-Post: <ma...@hadoop.apache.org>
+List-Id: <core-user.hadoop.apache.org>
+Reply-To: core-user@hadoop.apache.org
+Delivered-To: mailing list core-user@hadoop.apache.org
+Received: (qmail 38955 invoked by uid 99); 1 Jun 2009 06:04:39 -0000
+Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
+ by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:39 +0000
+X-ASF-Spam-Status: No, hits=1.2 required=10.0
+ tests=SPF_NEUTRAL
+X-Spam-Check-By: apache.org
+Received-SPF: neutral (athena.apache.org: local policy)
+Received: from [216.145.54.172] (HELO mrout2.yahoo.com) (216.145.54.172)
+ by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 01 Jun 2009 06:04:28 +0000
+Received: from SNV-EXBH01.ds.corp.yahoo.com (snv-exbh01.ds.corp.yahoo.com [207.126.227.249])
+ by mrout2.yahoo.com (8.13.6/8.13.6/y.out) with ESMTP id n5163FGq038852
+ for <co...@hadoop.apache.org>; Sun, 31 May 2009 23:03:15 -0700 (PDT)
+DomainKey-Signature: a=rsa-sha1; s=serpent; d=yahoo-inc.com; c=nofws; q=dns;
+ h=received:user-agent:date:subject:from:to:message-id:
+ thread-topic:thread-index:in-reply-to:mime-version:content-type:
+ content-transfer-encoding:x-originalarrivaltime;
+ b=rChE4SCnwtWaZpjhovkiXDKfDiVNdRRvsadSGG9S9bgvOexn/9/5JjEQx1pOR7Nb
+Received: from SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.9]) by SNV-EXBH01.ds.corp.yahoo.com with Microsoft SMTPSVC(6.0.3790.3959);
+ Sun, 31 May 2009 23:03:15 -0700
+Received: from 10.66.92.213 ([10.66.92.213]) by SNV-EXVS08.ds.corp.yahoo.com ([207.126.227.58]) with Microsoft Exchange Server HTTP-DAV ;
+ Mon, 1 Jun 2009 06:03:15 +0000
+User-Agent: Microsoft-Entourage/12.17.0.090302
+Date: Mon, 01 Jun 2009 11:33:13 +0530
+Subject: Re: question about when shuffle/sort start working
+From: Jothi Padmanabhan <jo...@yahoo-inc.com>
+To: <co...@hadoop.apache.org>
+Message-ID: <C6...@yahoo-inc.com>
+Thread-Topic: question about when shuffle/sort start working
+Thread-Index: AcnifqWrLG6N7GAk7kqy9QalVWfegQ==
+In-Reply-To: <59...@web111010.mail.gq1.yahoo.com>
+Mime-version: 1.0
+Content-type: text/plain;
+ charset="US-ASCII"
+Content-transfer-encoding: 7bit
+X-OriginalArrivalTime: 01 Jun 2009 06:03:15.0462 (UTC) FILETIME=[A7231260:01C9E27E]
+X-Virus-Checked: Checked by ClamAV on apache.org
+
+
+No you cannot raise this event yourself, this event is generated internally
+by the framework.
+
+I am guessing that what you probably want is to have a chain of MapReduce
+Jobs where the output of one is automatically fed as input to another. You
+can look at these classes: JobControl and ChainMapper/ChainReducer.
+
+Jothi
+
+On 6/1/09 11:00 AM, "Jianmin Woo" <ji...@yahoo.com> wrote:
+
+> Thanks a lot for your explanation, Jothi.
+>
+> So is this event generated by hadoop framework? Is there any API in mapper to
+> fire this event? Actually, I am thinking to implement a mapper that will emit
+> some <key, value> pairs, then fire this event to let the reducer works, the
+> same mapper task then emit some other <key, value> pairs and repeat. Do you
+> think is this logic feasible by current API?
+>
+> Thanks,
+> Jianmin
+>
+>
+>
+>
+>
+> ________________________________
+> From: Jothi Padmanabhan <jo...@yahoo-inc.com>
+> To: core-user@hadoop.apache.org
+> Sent: Monday, June 1, 2009 12:26:31 PM
+> Subject: Re: question about when shuffle/sort start working
+>
+> When a Mapper completes, MapCompletionEvents are generated. Reducers try to
+> fetch map outputs for a given map only on the receipt of such events.
+>
+> Jothi
+>
+>
+> On 5/30/09 10:00 AM, "Jianmin Woo" <ji...@yahoo.com> wrote:
+>
+>> Hi,
+>> I am being confused by the protocol between mapper and reducer. When mapper
+>> emitting the (key,value) pair done, is there any signal the mapper send out
+>> to
+>> hadoop framework in protocol to indicate that map is done and the
+>> shuffle/sort
+>> can begin for reducer? If there is no this signal in protocol, when the
+>> framework begin the shuffle/sort?
+>>
+>> Thanks,
+>> Jianmin
+>>
+>>
+>>
+>>
+>
+>
+>
+
+
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/egyl03.gdas.200811.00Z.grb2
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/egyl03.gdas.200811.00Z.grb2?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/egyl03.gdas.200811.00Z.grb2
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/english.cp500.txt
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/english.cp500.txt?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/english.cp500.txt (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/english.cp500.txt Wed Dec 30 17:36:48 2015
@@ -0,0 +1 @@
+@@@@@@@@@É£
⣣@×ÖæÅÙ@
%
%@@@@@@@@@@@@Ȧ
@
%
%@@@@@@@@@@@@Ç¢@
%
%@@@@@@@@@@@@⤣@@¢
¥
¢@
%
%@@@@@@@@@@@@Ù
¢¤
¢@
%
%
%
%
%@@@@@@@@@@@@Ù
£
@¢
%@@@@@@@@@@@@æ£
¢k@
¢
¢@@£
%@@@@@@@@@@@@צ
@⨢£
¢
%@@@@@@@@@@@@⨢£
¢@
%@@@@@@@@@@@@ÉÂÔ@⨢£
@@Ù
¢
%@@@@@@@@@@@@â@¤¢
¢¢@
¢¤
@
£
%@@@@@@@@@@@@ŧ
¢¢@Á¥£
@@
¤@¤¢
¢¢
%@@@@@@@@@@@@¤¢
¢¢@×£
¢
%
%
%@@@@@@@@@@@@@ÉÂÔ@⨢£
¢@@n@梣£¢@@n@É£
⣣@×ÖæÅÙ@@n@
%@@@@@@@@@@@@@@@@@@Ȧ
@@n@@
%
%@@@@@@@@@@@@@@@@@@ÉÂÔ@É£
⣣@×ÖæÅÙ@ñøõ@ŧ
¢¢@@
%@@@@@@@@@@@@@@@@@@@@@@@@פ£@
£¢Â¦¢
@P@¤¨Á
¢¢
¢
%
%@@@@@@@@@@@@@@@@@@@@@@@@Ö¥
¥
¦@@@j@@@Æ
£¤
¢@P@Â
£¢@@@@j@@@â
£¢@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@j@@@×
@Ä£@@@@j@@
%
%
%
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@@Ó
@
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ÉÂÔ@É£
⣣@×ÖæÅÙ@ñøõ@ŧ
¢¢@¦¢££@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@MõôÒÂ]
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ã
@¥
¥
¦@@£¤£
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Æ£¢@P@
£¤
¢@
£
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@⨢£
@
@
£
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ȧ
@¤
££
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@⤣
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ãÁÇÉãã`ÃÁãÉÁ@åôaÅÕÖåÉÁ@ÄÔä@Å¥¤£@MôõðÒÂ]
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ç
£@Á
V@Ù
V
%
%
%@@@@@@@@@@@@@@@@@@ã
@ÉÂÔ@É£
⣣@צ
@ñøõ@@ÉÂÔ@É£
⣣@צ
@
%@@@@@@@@@@@@@@@@@@òøõ@¥
@
@¤
@¢@@¦
@¦£¦@@
%@@@@@@@@@@@@@@@@@@
£@¦£@@
£¥
@£
@@Ѥ¨@òk@òððùK@Ö@@
%@@@@@@@@@@@@@@@@@@£
@£
@
£¥
@£
@@¦£¦k@¨¤@@@
@
%@@@@@@@@@@@@@@@@@@
@£
¢
@¤£¢@
£¨@@ÉÂÔK@@è¤@¨@£@£
@
%@@@@@@@@@@@@@@@@@@¤£¢@@@¢`¥
@¢¢@£¤@ÉÂÔ@¤¢
¢¢@
%@@@@@@@@@@@@@@@@@@×£
¢K
%@@@@@@@@@@@@@@@@@@@@@@@@É£@@£
@ÉÂÔ@梣£@
¢
%
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@È£¢
%
%
%@@@@@@@@@@@@@@@@@@@@⤣@@£@óò`£@@öô`£@ÔÃÁÄ@óÄ@¢@
%@@@@@@@@@@@@@@@@@@@@£¢@
%@@@@@@@@@@@@@@@@@@@@ؤ
£@
£@@@
¢¢
@
@
%@@@@@@@@@@@@@@@@@@@@Å¢¨@£@¢£@
%@@@@@@@@@@@@@@@@@@@@Ä
¢
@@¢£¦
@
¥
£@@¢£¦
@¢¤£@
%@@@@@@@@@@@@@@@@@@@@
¥
£¢@
%@@@@@@@@@@@@@@@@@@ã
@
¦@ÉÂÔ@É£
⣣V@×ÖæÅÙ9@ñøõ@ŧ
¢¢@¦¢££@
%@@@@@@@@@@@@@@@@@@
¢@@
@@¢@
¢¢
@
@¦£@
¢@
%@@@@@@@@@@@@@@@@@@
£¤
¢@@
§
£@
@@@¤
£@¤¢
@
%@@@@@@@@@@@@@@@@@@
§
K@ä¢
@£@@`
@Ô
@ä£
@Á
@
%@@@@@@@@@@@@@@@@@@Ä
¢@MÔÃÁÄ]k@ä£
@Á
@Å
@MÃÁÅ]k@óÄ@@
%@@@@@@@@@@@@@@@@@@
¢¢@@£
@¤¢
¢¢@@£
@£¢K@æ£@
%@@@@@@@@@@@@@@@@@@òÄ@¢@
£¢k@£@¢@@
§
£@¦¢££@@
%@@@@@@@@@@@@@@@@@@¢£¦
@
¥
£@@¢¤£
%
%@@@@@@@@@@@@@@@@@@æ£@
`
@ÉÂÔ@צ
×ÃV@ù÷ð@
¢¢@£
¨@@
%@@@@@@@@@@@@@@@@@@£
@磌
9k@£
@¥
¨@
@öô`£@¢¨
£@
%@@@@@@@@@@@@@@@@@@¤£
¢¢@É£
⣣@×ÖæÅÙ@ñøõ@ŧ
¢¢@¦¢££@
%@@@@@@@@@@@@@@@@@@
¢@¢£@
a
@
£¢K@É£@¢@£
@
%@@@@@@@@@@@@@@@@@@
£@
£@£@£
@ÉÂÔ@É£
⣣@×ÖæÅÙ@òøõ@
%@@@@@@@@@@@@@@@@@@ŧ
¢¢
@@
@@
@@
K@Á@
%@@@@@@@@@@@@@@@@@@¤@
£
@£
@ÁÉç@õÓ9@@Ó¤§V@
£@¢¨¢£
@MÖâ]k@
%@@@@@@@@@@@@@@@@@@£¢@@
¢¨@¢£O
%
%@@@@@@@@@@@@@@@@@@É£
⣣@×ÖæÅÙ@¦¢££¢@
@£
@¢£@
@
%@@@@@@@@@@@@@@@@@@
¢@¦¢££¢@¢¤£@öô`£@ÃÁãÉÁ@åõ@@
@
%@@@@@@@@@@@@@@@@@@
@¢¢
¢K@Æ@ÃÁãÉÁ@ÔÃÁÄ@¦¢k@£
@×ÖæÅÙ@
%@@@@@@@@@@@@@@@@@@ñøõ@ŧ
¢¢@¥
¢@
@
@£@£¢@
¢¢@@
%@@@@@@@@@@@@@@@@@@ñ@k@£
@ÉÂÔ@É£
⣣@×ÖæÅÙ@ò÷õ@ñKð@ÇÈ©k@¦
@
£@
%@@@@@@@@@@@@@@@@@@
¥
@
¢¢@¢
K@¨@
¤@£
@
£¨@
k@£
@×ÖæÅÙ@ñøõ@
%@@@@@@@@@@@@@@@@@@ŧ
¢¢@¦
¢@£
@¤
@£@¤
@k@¢¦
@ÔÃÁÄ@
¢@
%@@@@@@@@@@@@@@@@@@@¨¢¢@¦¢££¢K
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ã@
£¤
¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ä
¢¢
@¦¢££@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@ñ`@@ò`
@âÔ×@
¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@öô`£@צ
×Ã@£
¨@¦£@Á£å
9@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
§£
¢¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ã
@@òÄ@¢@£
@@óÄ@¢@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
£¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@⤣
@¨@ÁÉç@õÓ9@MåõKò@@åõKó]@@Ó¤§@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@¢£¤£¢@@Ù
@È£@MÙÈÅÓ@Áâ@ô]@@âäâÅ@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ó¤§@MâÓÅâ@ù]
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@Ȧ
@¢¤¨
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ä
¢¢
@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ö
@@£¦@öô`£@òKõ@ÇÈ©@צ
×Ã@ù÷ð@
¢¢¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@ñÔÂ@Óò@
@
@
¢¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@õñòÔÂ@£@øÇÂ@@óóó@ÔÈ©@ÄÄÙ@âÄÙÁÔ
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ƥ@×ÃÉ`ç@£
@¢£¢@Mòöö@ÔÈ©@Möô`£]^@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@òñóó@ÔÈ©@Möô`£]]^@Ö
@×ÃÉ@¢£@Móó@ÔÈ©@
%Móò`£]]
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@ã
@âÃâÉ@¢@¨¢@@¤@£@ðKùãÂ@@£
@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@¢£
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ĥ@
@ä£óòð@âÃâÉ@£
@M£
@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
§£
]
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ĥ@£
@Å£
£@ñðañððañððð@Ô¢@£
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@Ƥ@äâÂk@£¦@¢
@£¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@ã¦@`@
@¨¢
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@óÄ@¢@
¢@@Ççãôõðð×@@Ççãöõðð×
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@óÄ@¢@¤£@
¥
¢â
ÂV@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@â
Ô¤¢
V
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@g@òÄ@¢@¤¢@Ççãñóõ×
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
%
%
%
%
%@@@@@@@@@@@@@@@@@@@@
%
%
%
%@@@@@@@@@@@@@@@@@@@@@ñ@¢
@@â×Åäòððð@
@£
¢£¢@¢@@Æ
¤¨@ñôk@
%@@@@@@@@@@@@@@@@@@òððö@¢¤££
@£@â£@×
@Å¥¤£@ã@
%@@@@@@@@@@@@@@@@@@@£@
@¥
@£@¦¦¦K¢
KK
%@@@@@@@@@@@@@@@@@@@æ
}
@
@£@
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Å¢¨@¦¨¢@£@
£@£
@¢¦
¢@¨¤@
K
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ã@¦
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ù
¤
¢£@@¤£
%
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@¤¢@£
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ñ`øøø`âÈÖ×`ÉÂÔ
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@ؤ£¨@¢¥¢
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@Ó
@
%
%
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@ã
¢¥@¢£
¢
%
%@@@@@@@@@@@@@@@@@@@@@@@@Ã
£
@¤£
@¢@@¨¤@ÉÂÔ@¢¨¢£
¢
%
%
%@@@@@@@@@@@@@@@@@@@@@@@@墣@å¢ÃË
%
%
%
%
%@@Å`@£¢@
×£@£¢@
Ä@£¢â¥
@£@
KK¤¢Á¤£@ÉÂÔ@×¥¨@
%@@ã£@ã
¢@@¤¢
@ÉÂÔ@Æ
¢
%
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/envi_test_header.hdr
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/envi_test_header.hdr?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/envi_test_header.hdr (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/envi_test_header.hdr Wed Dec 30 17:36:48 2015
@@ -0,0 +1,16 @@
+ENVI
+description = {
+ GEO-TIFF File Imported into ENVI [Fri May 25 14:06:23 2012]}
+samples = 2400
+lines = 2400
+bands = 7
+header offset = 0
+file type = ENVI Standard
+data type = 2
+interleave = bip
+sensor type = Unknown
+byte order = 0
+map info = {Sinusoidal, 1.5000, 1.5000, -10007091.3643, 5559289.2856, 4.6331271653e+02, 4.6331271653e+02, , units=Meters}
+projection info = {16, 6371007.2, 0.000000, 0.0, 0.0, Sinusoidal, units=Meters}
+coordinate system string = {PROJCS["Sinusoidal",GEOGCS["GCS_ELLIPSE_BASED_1",DATUM["D_ELLIPSE_BASED_1",SPHEROID["S_ELLIPSE_BASED_1",6371007.181,0.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Sinusoidal"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",0.0],UNIT["Meter",1.0]]}
+wavelength units = Unknown
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/footnotes.docx
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/footnotes.docx?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/footnotes.docx
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/gdas1.forecmwf.2014062612.grib2
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/gdas1.forecmwf.2014062612.grib2?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/gdas1.forecmwf.2014062612.grib2
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headerPic.docx
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headerPic.docx?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headerPic.docx
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headers.mbox
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headers.mbox?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headers.mbox (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/headers.mbox Wed Dec 30 17:36:48 2015
@@ -0,0 +1,7 @@
+From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009
+Return-Path: <na...@domain.com>
+Subject: subject
+From: <au...@domain.com>
+Date: Tue, 9 Jun 2009 23:58:45 -0400
+
+Test content
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/jxl.xls
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/jxl.xls?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/jxl.xls
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/moby.zip
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/moby.zip?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/moby.zip
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/embedded_then_npe.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/embedded_then_npe.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/embedded_then_npe.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/embedded_then_npe.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">main_content</write>
+ <!-- auto detection wasn't working for some reason; add content-type as
+ is to trigger mock on the embedded -->
+ <embedded filename="embed1.xml" content-type="application/mock+xml">
+ <mock>
+ <metadata action="add" name="author">embeddedAuthor</metadata>
+ <write element="p">some_embedded_content</write>
+ </mock>
+ </embedded>
+ <throw class="java.lang.NullPointerException">another null pointer exception</throw>
+
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/example.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/example.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/example.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/example.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,51 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <!-- this file offers all of the options as documentation
+ Parsing should stop at an IOException, of course
+ -->
+
+ <!-- action can be "add" or "set" -->
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <!-- element is the name of the sax event to write, p=paragraph
+ if the element is not specified, the default is <p> -->
+ <write element="p">some content</write>
+ <!-- write something to System.out -->
+ <print_out>writing to System.out</print_out>
+ <!-- write something to System.err -->
+ <print_err>writing to System.err</print_err>
+ <!-- hang
+ millis: how many milliseconds to pause. The actual hang time will probably
+ be a bit longer than the value specified. heavy: whether or not the hang should do something computationally expensive.
+ If the value is false, this just does a Thread.sleep(millis).
+ This attribute is optional, with default of heavy=false.
+ pulse_millis: (required if "heavy" is true), how often to check to see
+ whether the thread was interrupted or that the total hang time exceeded the millis
+ interruptible: whether or not the parser will check to see if its thread
+ has been interrupted; this attribute is optional with default of true
+ -->
+ <hang millis="100" heavy="true" pulse_millis="10" interruptible="true" />
+ <!-- throw an exception or error; optionally include a message or not -->
+ <throw class="java.io.IOException">not another IOException</throw>
+ <!-- perform a genuine OutOfMemoryError -->
+ <oom/>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/fake_oom.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/fake_oom.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/fake_oom.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/fake_oom.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <throw class="java.lang.OutOfMemoryError">not another oom</throw>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/heavy_hang.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/heavy_hang.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/heavy_hang.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/heavy_hang.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <hang millis="3000" heavy="true" pulse_millis="100" />
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/nothing_bad.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/nothing_bad.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/nothing_bad.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/nothing_bad.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,26 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Geoffrey Chaucer</metadata>
+ <write element="p">Whan that Aprille with his shoures soote</write>
+ <write>The droghte of Marche hath perced to the roote,</write>
+ <write>And bathed every veyne in swich licour,</write>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <throw class="java.lang.NullPointerException">another null pointer exception</throw>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer_no_msg.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer_no_msg.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer_no_msg.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/null_pointer_no_msg.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <throw class="java.lang.NullPointerException"/>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/real_oom.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/real_oom.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/real_oom.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/real_oom.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,24 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <oom/>
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <hang millis="3000" heavy="false" />
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_interruptible.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_interruptible.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_interruptible.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_interruptible.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <hang millis="3000" heavy="false" interruptible="true" />
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_not_interruptible.xml
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_not_interruptible.xml?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_not_interruptible.xml (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/mock/sleep_not_interruptible.xml Wed Dec 30 17:36:48 2015
@@ -0,0 +1,25 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+<mock>
+ <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
+ <write element="p">some content</write>
+ <hang millis="3000" heavy="false" interruptible="false" />
+</mock>
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/multiline.mbox
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/multiline.mbox?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/multiline.mbox (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/multiline.mbox Wed Dec 30 17:36:48 2015
@@ -0,0 +1,5 @@
+From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009
+Received: from xxx
+ by xxx with xxx; date
+
+Test content
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/pictures.ppt
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/pictures.ppt?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/pictures.ppt
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protect.xlsx
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protect.xlsx?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protect.xlsx
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedFile.xlsx
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedFile.xlsx?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedFile.xlsx
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedSheets.xlsx
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedSheets.xlsx?rev=1722385&view=auto
==============================================================================
Binary file - no diff available.
Propchange: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/protectedSheets.xlsx
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/quoted.mbox
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/quoted.mbox?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/quoted.mbox (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/quoted.mbox Wed Dec 30 17:36:48 2015
@@ -0,0 +1,4 @@
+From envelope-sender-mailbox-name Mon Jun 01 10:00:00 2009
+
+Test content
+> quoted stuff
\ No newline at end of file
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/resume.html
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/resume.html?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/resume.html (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/resume.html Wed Dec 30 17:36:48 2015
@@ -0,0 +1,73 @@
+
+
+ <div class="js-helper">
+ <style type="text/css">#style_13209008630000000884_BODY{background-color:#FFFFFF;color:#000000;MARGIN:0px 1px;font-family:Tahoma,Arial,Verdana,Sans-Serif}#style_13209008630000000884 TD{font-size:13px;font-family:Tahoma,Arial,Verdana,Sans-Serif;vertical-align:top}#style_13209008630000000884 CAPTION{font-size:13px;font-weight:bold;text-align:left}#style_13209008630000000884 TR.style_13209008630000000884thead TD{font-weight:bold;text-align:center; padding-bottom:6px;padding-top:6px;padding-left:2px;padding-right:2px}#style_13209008630000000884 H1{font-size:24px;margin-bottom:15px;margin-top:5px;display:block;font-weight:normal;}#style_13209008630000000884 H2{font-size:22px;margin-bottom:5px;margin-top:5px;display:block;font-weight:normal;letter-spacing:1px}#style_13209008630000000884 H1.style_13209008630000000884in, #style_13209008630000000884 H2.style_13209008630000000884in, #style_13209008630000000884 H3.style_13209008630000000884in{font-size:100%;margin-bottom:0px;margin-top:0px;di
splay:inline;}#style_13209008630000000884 A, #style_13209008630000000884 A.style_13209008630000000884notvisited:visited, #style_13209008630000000884 .style_13209008630000000884notvisited A:visited, #style_13209008630000000884 .style_13209008630000000884menu A:visited{color:#00418F;text-decoration:none}#style_13209008630000000884 A:visited{color:#6699CC;text-decoration:none;}#style_13209008630000000884 A:hover, #style_13209008630000000884 A.style_13209008630000000884notvisited:hover, #style_13209008630000000884 .style_13209008630000000884notvisited A:hover, #style_13209008630000000884 .style_13209008630000000884menu A:hover{color:#990000;text-decoration:underline}#style_13209008630000000884 .style_13209008630000000884bold, #style_13209008630000000884 .style_13209008630000000884bold H1{font-weight:bold}#style_13209008630000000884 .style_13209008630000000884u{text-decoration:underline}#style_13209008630000000884 .style_13209008630000000884gray, #style_13209008630000000884 A.style_13209
008630000000884gray:visited, #style_13209008630000000884 LEGEND{color:#7A7A7A}#style_13209008630000000884 .style_13209008630000000884red, #style_13209008630000000884 A.style_13209008630000000884red:visited{color:#C2311A}#style_13209008630000000884 EM, #style_13209008630000000884 .style_13209008630000000884imp, #style_13209008630000000884 .style_13209008630000000884field_warning{color:#C2311A;font-weight:bold;font-style:normal}#style_13209008630000000884 TABLE.style_13209008630000000884bl_table TR TD{padding:2px; padding-left:10px}#style_13209008630000000884 TD.style_13209008630000000884bl_row_name{color:#555; width:10%}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD, #style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD, #style_13209008630000000884 TD.style_13209008630000000884serverdark, #style_1320900863
0000000884 TR.style_13209008630000000884serverdark TD{text-align:center;padding-bottom:3px;padding-top:3px;padding-left:1px;padding-right:1px;font-weight:bold}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A:visited, #style_13209008630000000884 TD.style_13209008630000000884vacancydark A:hover, #style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD, #style_13209008630000000884 TD.style_13209008630000000884resumedark A, #style_13209008630000000884 TD.style_13209008630000000884resumedark A:visited, #style_13209008630000000884 TD.style_13209008630000000884resumedark A:hover, #style_13209008630000000884 TD.style_13209008630000000884serverdark, #style_13209008630000000884
TR.style_13209008630000000884serverdark TD, #style_13209008630000000884 TD.style_13209008630000000884serverdark A, #style_13209008630000000884 TD.style_13209008630000000884serverdark A:visited, #style_13209008630000000884 TD.style_13209008630000000884serverdark A:hover{color:#000000;}#style_13209008630000000884 TD.style_13209008630000000884vacancydark, #style_13209008630000000884 TR.style_13209008630000000884vacancydark TD{background-color:#FFDDBB;}#style_13209008630000000884 TD.style_13209008630000000884vacancylight, #style_13209008630000000884 TR.style_13209008630000000884vacancylight TD{background-color:#FFF5EC}#style_13209008630000000884 TD.style_13209008630000000884resumedark, #style_13209008630000000884 TR.style_13209008630000000884resumedark TD{background-color:#D3E9E9;}#style_13209008630000000884 TD.style_13209008630000000884resumelight, #style_13209008630000000884 TR.style_13209008630000000884resumelight TD{background-color:#ECF8F7}#style_13209008630000000884 TD.style_13209
008630000000884serverdark, #style_13209008630000000884 TR.style_13209008630000000884serverdark TD{background-color:#ABC2D5;}#style_13209008630000000884 TR.style_13209008630000000884serverlight TD, #style_13209008630000000884 TD.style_13209008630000000884serverlight{background-color:#E2EBF5}#style_13209008630000000884 TD.style_13209008630000000884blankheader1{font-size:24px; padding:10px}#style_13209008630000000884 TD.style_13209008630000000884blankheader2{font-size:22px; padding:10px}#style_13209008630000000884 TABLE.style_13209008630000000884resumelist TR.thead TD{background-color:#ABC2D5;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist TR.thead TD{background-color:#ABC2D5;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TR.thead TD{background-color:#DBDBDB;}#style_13209008630000000884 TABLE TR.style_13209008630000000884wr TD{background-color:#FFFFFF}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD{bord
er-bottom:#DBDBDB 1px solid}#style_13209008630000000884 .style_13209008630000000884list TR TD{background-color:#E2EBF5;padding:5px}#style_13209008630000000884 .style_13209008630000000884list TR.thead TD{background-color:#ABC2D5;color:#555555;text-align:center; padding-bottom:8px;padding-top:8px;padding-left:1px;padding-right:1px;font-weight:bold;}#style_13209008630000000884 .style_13209008630000000884list TR.wr TD{background-color:#F3F7FB}#style_13209008630000000884 A.style_13209008630000000884list_details, #style_13209008630000000884 A.style_13209008630000000884list_details:visited, #style_13209008630000000884 A.style_13209008630000000884list_details:hover{color:#7A7A7A;text-decoration:none;line-height:120%}#style_13209008630000000884 TD.style_13209008630000000884cell, #style_13209008630000000884 TD.style_13209008630000000884c{padding-top:3px;padding-left:5px;padding-right:5px}#style_13209008630000000884 BIG{font-size:24px}#style_13209008630000000884 .style_13209008630000000884smal
l, #style_13209008630000000884 SMALL{font-size:85%}#style_13209008630000000884 UL{margin-left:25px;margin-bottom:0px}#style_13209008630000000884 TD.style_13209008630000000884small, #style_13209008630000000884 .style_13209008630000000884verysmall, #style_13209008630000000884 .style_13209008630000000884verysmall INPUT, #style_13209008630000000884 .style_13209008630000000884verysmall SELECT{font-size:11px}#style_13209008630000000884 DIV.style_13209008630000000884localmenu{padding-top:10px;margin-bottom:15px;}#style_13209008630000000884 DIV.style_13209008630000000884localmenu A, #style_13209008630000000884 DIV.style_13209008630000000884localmenu A:visited{text-decoration:underline;font-weight:bold}#style_13209008630000000884 DIV.style_13209008630000000884comment{font-size:85%; background-color:#DDFFDD; padding:4px; border:1px solid #CCC;cursor:default;}#style_13209008630000000884 HR{color:#ABC2D5;background-color:#ABC2D5;height:1px;border:0px solid #ABC2D5}#style_13209008630000000884 DI
V.style_13209008630000000884dotsline{font-size:1px; margin-top:4px; margin-bottom:5px; border-bottom:#BACBD7 1px dotted}#style_13209008630000000884 TABLE.style_13209008630000000884rctable TR TD{background-color:#E5EDF7;}#style_13209008630000000884 TD.style_13209008630000000884rc1{padding-top:10px; padding-left:10px;}#style_13209008630000000884 TD.style_13209008630000000884rc2{font-size:1px; width:10px;}#style_13209008630000000884 TD.style_13209008630000000884rc3{height:10px; font-size:1px;}#style_13209008630000000884 TD.style_13209008630000000884rc4{height:10px; font-size:1px;}#style_13209008630000000884 SPAN.style_13209008630000000884super{color:#003398;font-size:150%}#style_13209008630000000884 SPAN.style_13209008630000000884job{color:#FF0000;font-size:150%}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button{background-color:#99cc00; margin:0px 5px 3px 0px;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail
TD TABLE.to_site_button TD{background-color:#99cc00; font-weight:normal; color:#ffffff; border-bottom:0px; padding-top:6px; padding-right:7px; padding-bottom:6px; padding-left:7px; vertical-align:middle; text-align:center;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button TD A, #style_13209008630000000884 TABLE.style_13209008630000000884to_site_button TD A:visited{color:#ffffff; text-decoration:none; font-weight:normal;}#style_13209008630000000884 TABLE.style_13209008630000000884vaclist_for_mail TD TABLE.to_site_button TD A:hover{color:#ffffff; text-decoration:underline; font-weight:normal;}#style_13209008630000000884 .style_13209008630000000884row{clear:left; padding-bottom:4px;}#style_13209008630000000884 .style_13209008630000000884row2{margin-bottom:8px;}#style_13209008630000000884 .style_13209008630000000884col1{float:left; width:140px; color:#555555; margin-right:-145px;}#style_13209008630000000884 .style_13209008630000000884c
ol2{margin-left:145px;}#style_13209008630000000884 DIV.style_13209008630000000884resume_rightcol{float:right; width:280px; margin:0px 0px 10px 30px;}#style_13209008630000000884 DIV.style_13209008630000000884blankheader1{font-size:190%;}
+</style>
+ <div id="style_13209008630000000884" class="mr_read__body">
+ <base target="_self" href="http://e.mail.ru/cgi-bin/" />
+
+ <div id="style_13209008630000000884_BODY">
+
+
+
+<style type="text/css" ></style>
+
+
+<table width="100%" cellspacing="0" cellpadding="0" height="100%" border="0" >
+<tr ><td >
+
+</td></tr>
+<tr ><td style="padding:5px" height="100%" >
+ÐдÑавÑÑвÑйÑе, !<br >
+<br >
+ÐÑедлагаем Ðам ознакомиÑÑÑÑ Ñо ÑпиÑком заÑегиÑÑÑиÑованнÑÑ
компаний, пÑедÑÑавиÑели коÑоÑÑÑ
пÑоÑмоÑÑели ÐаÑе ÑезÑме за поÑледние ÑÑÑки.<br >
+<br >
+<li ><a target="_blank" href="/cgi-bin/link?check=1&cnf=710139&url=http%3A%2F%2;0,0" >Ðомпании, пÑоÑмоÑÑевÑие ÑезÑме â .</a> ÐовÑе: <b >1.</b></li><br >
+<br >
+ÐÑи ÑÐ²ÐµÐ´ÐµÐ½Ð¸Ñ Ð¿ÑедоÑÑавлÑÑÑÑÑ Ðам иÑклÑÑиÑелÑно Ð´Ð»Ñ Ð¸Ð½ÑоÑмаÑии. ÐÑ Ð¼Ð¾Ð¶ÐµÑе опеÑаÑивно оÑÑлеживаÑÑ, какие именно компании наÑли в базе даннÑÑ
Superjob ÐаÑе ÑезÑме и заинÑеÑеÑовалиÑÑ Ð¸Ð¼.<br >
+<br >
+ÐÑли ÐаÑе ÑезÑме ÑазмеÑено в закÑÑÑом доÑÑÑпе, Ñо его могÑÑ Ð¿ÑоÑмаÑÑиваÑÑ ÑолÑко Ñе ÑабоÑодаÑели, коÑоÑÑм ÐÑ Ð¾ÑпÑавили его ÑамоÑÑоÑÑелÑно.<br >
+ÐÑÑоÑÐ¸Ñ Ð¾ÑпÑавки Ñвоего ÑезÑме ÐÑ Ð¼Ð¾Ð¶ÐµÑе поÑмоÑÑеÑÑ Ð¿Ð¾ ÑÑÑлке «ÐÑÑоÑÐ¸Ñ ÑаÑÑÑлки ÑезÑме».<br >
+<br >
+<br >
+<b >Ðнимание!</b><br >
+РпÑоÑеÑÑе поиÑка ÑабоÑÑ ÐÑ Ð¼Ð¾Ð¶ÐµÑе ÑÑолкнÑÑÑÑÑ Ñ Ñакими пÑедложениÑми ÑабоÑодаÑелей или кадÑовÑÑ
агенÑÑÑв, в коÑоÑÑÑ
ÐÐ°Ñ Ð±ÑдÑÑ Ð¿ÑоÑиÑÑ Ð²Ð½ÐµÑÑи оплаÑÑ (за пÑедваÑиÑелÑное обÑÑение, за оÑоÑмление докÑменÑов, за оÑоÑмление обÑзаÑелÑной ÑÑÑаÑ
овки, на закÑÐ¿ÐºÑ Ð¿ÐµÑвой паÑÑии пÑодÑкÑии ко
мпании, пÑедназнаÑенной Ð´Ð»Ñ Ð¿Ñодажи и Ñ.п.) или пÑедоÑÑавиÑÑ Ð¾ÑÑканиÑованнÑе копии докÑменÑов (паÑпоÑÑа, военного билеÑа, ÑÑÑдовой книжки, водиÑелÑÑкиÑ
пÑав, пенÑионного ÑдоÑÑовеÑÐ½Ð¸Ñ Ð¸ Ñ.п.) Ð´Ð»Ñ ÑÐºÐ¾Ð±Ñ Ð¿ÑедваÑиÑелÑного оÑоÑÐ¼Ð»ÐµÐ½Ð¸Ñ Ð¸Ð»Ð¸ подÑвеÑÐ¶Ð´ÐµÐ½Ð¸Ñ Ð´Ð°Ð½Ð½ÑÑ
, ÑказаннÑÑ
в ÐаÑем ÑезÑме.<br >
+ÐÑо один из пÑизнаков моÑенниÑеÑÑва! ÐÑ ÑекомендÑем Ðам оÑÐµÐ½Ñ Ð¾ÑÑоÑожно оÑноÑиÑÑÑÑ Ðº Ñаким пÑедложениÑм и по возможноÑÑи избегаÑÑ ÑобеÑедований Ñ Ð¿Ð¾Ð´Ð¾Ð±Ð½Ñми ÑабоÑодаÑелÑми.<br >
+<br >
+Также Ð¼Ñ Ð½Ð°ÑÑоÑÑелÑно не ÑекомендÑем оÑпÑавлÑÑÑ Ð¿Ð»Ð°ÑнÑе SMS-ÑообÑÐµÐ½Ð¸Ñ Ð½Ð° коÑоÑкие номеÑа Ð´Ð»Ñ Ð¿Ð¾Ð»ÑÑÐµÐ½Ð¸Ñ ÐºÐ¾Ð½ÑакÑов или дÑÑгой инÑоÑмаÑии о ваканÑии или же Ð´Ð»Ñ Ð¿Ð¾Ð»ÑÑÐµÐ½Ð¸Ñ ÑезÑлÑÑаÑов ÑеÑÑиÑованиÑ. С оÑганизаÑиÑми, коÑоÑÑе оказÑваÑÑ Ð¿Ð¾Ð´Ð¾Ð±Ð½Ñе ÑÑлÑги, Ð¼Ñ Ð½Ðµ ÑоÑÑÑдниÑаем и пÑедÑпÑеждаем, Ñ�
�о ÑÑо Ñоже один из пÑиемов моÑенниÑеÑÑва.<br >
+<br >
+<br >
+<em >x</em> <a target="_blank" href="/cgi-bin/link?check=1&cnf=8d972a&url=http%3A%2F%2Fwww.sup;0,0" >ÐÑклÑÑиÑÑ ÑÐ²ÐµÐ´Ð¾Ð¼Ð»ÐµÐ½Ð¸Ñ Ð¾ новÑÑ
пÑоÑмоÑÑаÑ
моиÑ
ÑезÑме</a><br >
+<br >
+Ðо ÑÑÑлкам в ÑÑом пиÑÑме можно войÑи в ÑиÑÑÐµÐ¼Ñ Ð±ÐµÐ· ввода паÑолÑ.
+<br ><br >
+</td>
+</tr>
+<tr >
+<td >
+<span class="style_13209008630000000884noprint" ><br ><br >ÐÑли Ñ ÐÐ°Ñ ÐµÑÑÑ Ð¿Ð¾Ð¶ÐµÐ»Ð°Ð½Ð¸Ñ Ð¸ идеи по ÑлÑÑÑÐµÐ½Ð¸Ñ ÑеÑвиÑа Superjob, пожалÑйÑÑа, <a target="_blank" href="/cgi-bin/link?check=1;0,0" >напиÑиÑе нам</a>.<br ><br ></span>
+<table width="100%" cellspacing="0" cellpadding="10" border="0" class="style_13209008630000000884noprint" >
+<tr ><td align="center" style="border-top:1px solid #BACBD7;" >
+<a target="_blank" href="/cgi-bin/link?check=1&cnf=8fa2f9&url=http%3A%2F%2Fwww.;0,0" ><big >Superjob â РабоÑа должна доÑÑавлÑÑÑ ÑдоволÑÑÑвие!</big></a>
+</td></tr>
+</table>
+<table width="100%" cellspacing="1" cellpadding="0" border="0" class="style_13209008630000000884noprint" >
+<tr ><td align="center" style="padding:5px" >
+<span style="color:#999999;font-size:8pt;" >ÐиÑÑмо оÑпÑавлено: xx.xx.xxxx xx:xx:xx</span>
+</td></tr>
+</table>
+
+</td></tr>
+</table>
+
+
+
+</div>
+
+
+ <base target="_self" href="http://e.mail.ru/cgi-bin/" />
+ </div>
+</div>
+
+
+
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/rsstest.rss
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/rsstest.rss?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/rsstest.rss (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/rsstest.rss Wed Dec 30 17:36:48 2015
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="ISO-8859-1" ?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<rss version="0.91">
+ <channel>
+ <title>TestChannel</title>
+ <link>http://test.channel.com/</link>
+ <description>Sample RSS File for Junit test</description>
+ <language>en-us</language>
+
+ <item>
+ <title>Home Page of Chris Mattmann</title>
+ <link>http://www-scf.usc.edu/~mattmann/</link>
+ <description>Chris Mattmann's home page</description>
+ </item>
+ <item>
+ <title>Awesome Open Source Search Engine</title>
+ <link>http://www.nutch.org/</link>
+ <description>Yup, that's what it is</description>
+ </item>
+ </channel>
+</rss>
Added: tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/russian.cp866.txt
URL: http://svn.apache.org/viewvc/tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/russian.cp866.txt?rev=1722385&view=auto
==============================================================================
--- tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/russian.cp866.txt (added)
+++ tika/branches/2.x/tika-test-resources/src/test/resources/test-documents/russian.cp866.txt Wed Dec 30 17:36:48 2015
@@ -0,0 +1,6 @@
+¤ ¦¤ë, ¢ áâ㤥ãî §¨¬îî ¯®àã,
+ ¨§ «¥áã ¢ë襫; ¡ë« ᨫìë© ¬®à®§.
+«ï¦ã, ¯®¤¨¬ ¥âáï ¬¥¤«¥® ¢ £®àã
+®è ¤ª , ¢¥§ãé ï 墮à®áâã ¢®§.
+, è¥áâ¢ãï ¢ ¦®, ¢ ᯮª®©á⢨¨ 種¬,
+®è ¤ªã ¢¥¤¥â ¯®¤ 㧤æë ¬ã¦¨ç®ª