Posted to user@tika.apache.org by David Kovar <dk...@gmail.com> on 2010/07/10 05:32:07 UTC

Test suite for Tika?

Good evening,

Is there an available set of documents that is used to validate Tika's performance? I am working on validating the performance of some ediscovery tools and such a test set would be very useful.

Thank you.

-David





Re: Test suite for Tika?

Posted by David Kovar <dk...@gmail.com>.
Chris,

Thank you very much.

-David



On Jul 9, 2010, at 11:07 PM, Mattmann, Chris A (388J) wrote:

> Hi David,
> 
> The test documents used by the unit tests for the tika-parsers module are in this directory:
> 
> http://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/test/resources/test-documents/
> 
> HTH,
> Chris


Re: Test suite for Tika?

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi David,

The test documents used by the unit tests for the tika-parsers module are in this directory:

http://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/test/resources/test-documents/

HTH,
Chris



On 7/9/10 8:32 PM, "David Kovar" <dk...@gmail.com> wrote:

> Good evening,
>
> Is there an available set of documents that is used to validate Tika's performance? I am working on validating the performance of some ediscovery tools and such a test set would be very useful.
>
> Thank you.
>
> -David

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++