Posted to dev@tika.apache.org by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2009/07/30 06:55:05 UTC

FW: a new project using tika has begun

All, FYI, kind words from a Tika supporter!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



------ Forwarded Message
From: Craig Stires <cr...@gmail.com>
Date: Wed, 29 Jul 2009 21:39:14 -0700
To: <ri...@apache.org>, <ma...@apache.org>
Subject: a new project using tika has begun


Hi Rida and Chris,

Just want to send in a note of much appreciation for the work you've done (and that of the other Tika contributors, POI, PDF, Lucene, the list goes on).  Work is underway on a project which feeds off the Tika parser as one of its content providers.  Although Tika is still in a pre-1.0 stage, it is providing enough content to allow us to avoid delays and keep momentum.  Thanks for that!

What I am hoping to contribute, as we continue, are examples of files that aren't parsing quite correctly, or have the wrong encoding set, etc.  This project is running against English and Thai data, and will be moving into Japanese and Chinese sometime next year.  So, we may have access to a wider range of Asian-language files than you currently do.

I wish that we had the technical expertise to contribute patches, but if there's anything that can be passed along to you to help with testing or development, I'd be happy to do so.

Thanks again. I just wanted to let you know that your efforts are being put to good use.

-Craig



------ End of Forwarded Message