Posted to dev@tika.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2014/12/01 03:29:15 UTC

Re: Confusion

Hi Peter,

[moving webmaster@apache.org to BCC]

Thanks for your question. You’ll have to subscribe to the Tika list or
check a mailing list archive for Tika to see replies to this. My suggestion
is to subscribe to the Tika list by sending a blank email to
dev-subscribe@tika.apache.org
and following the instructions from there. Some replies below:



-----Original Message-----
From: Peter Hodges <ph...@id.iit.edu>
Date: Sunday, November 30, 2014 at 7:56 AM
To: "webmaster@apache.org" <we...@apache.org>
Subject: Confusion

>Hi.
>
>I'm sure this is not the appropriate email but one must start someplace.
>
>
>I'd like to try Tika for manipulating text.
>
>However, despite the labels "getting started" etc in your online
>directories
>
>I find the directions confusing and hard to understand.
>
>As a designer and a non inner circle software/programmer expert
>
>I'd like to see a simple example:
>
>1) Evidently Tika requires Maven.

If you’d like to build Tika, yes. If you’d like to simply use
Tika in an application, try out the tika-app jar on the downloads
page. The tika-app.jar can be invoked with a Java runtime by typing
java -jar tika-app-X.Y.jar --help

(where X.Y is the version number, e.g., 1.6).
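
As a rough sketch of a first run (assuming a POSIX shell, a Java runtime on
your PATH, and the tika-app jar saved in some directory), the snippet below
picks up the jar and prints the plain text of a document using tika-app's
--text option. The helper names find_tika_jar and extract_text are made up
for illustration and are not part of Tika:

```shell
#!/bin/sh
# Hypothetical helper (not part of Tika): print the path of a tika-app jar
# found in the given directory. If several versions are present, the one
# that sorts last in plain lexical order is chosen.
find_tika_jar() {
  ls "$1"/tika-app-*.jar 2>/dev/null | sort | tail -n 1
}

# Hypothetical helper: print the plain text of a document.
#   $1 = directory holding the tika-app jar
#   $2 = document to read
extract_text() {
  jar=$(find_tika_jar "$1")
  if [ -z "$jar" ]; then
    echo "no tika-app jar found in $1" >&2
    return 1
  fi
  # --text asks tika-app for plain text output
  java -jar "$jar" --text "$2"
}
```

For example, with the jar in $HOME/Downloads you would run
extract_text "$HOME/Downloads" mydocument.pdf
(the directory and file name are placeholders).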

>
>Do these codes then go in the same directory (e.g., usr on Linux)?

If you want to build Tika, a good recipe, e.g., on Linux, is:
[with Maven3.x installed]
[with Java 1.6.x or higher installed]

1. mkdir $HOME/src
2. cd $HOME/src
3. svn co http://svn.apache.org/repos/asf/tika/trunk tika
4. cd tika
5. export MAVEN_OPTS="-Xms128m -Xmx256m"
6. mvn install

(wait a while)

7. inside of tika-app/target - you will find the tika-app JAR file


>
>2) After extraction how does one execute a simple example?
>I followed the Tika directory structure down through four or five levels
>to find the parsing example. This appears to be java code.
>
>I return to the online getting started but find line after line of code
>(is this java? Python? or ?)

See:

http://tika.apache.org/1.6/gettingstarted.html


(especially at the bottom)

You can also use Tika as a REST server, e.g., here:

https://wiki.apache.org/tika/TikaJAXRS


>
>
>The literature contains many papers about HCI, user research,
>participatory design, and other topics related to human centered design.
>
>These are powerful open source tools. It would be helpful to engaging a
>wider community to have some simple, clear directions about how to enter
>into using them.
>

I’m not sure I follow your comment here. How does the literature relate
to Tika, and which literature do you mean?

Hope that helps with some of the answers.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




Re: Confusion

Posted by Tyler Palsulich <tp...@gmail.com>.
>
> subscribe to the Tika list by sending a blank email to
> dev-subscribe@tika.apache.org
> and following the instructions from there. Some replies below:
>

We should clear up the instructions on the website to say this explicitly,
rather than give a link to the general Apache page. No reason not to say
how to subscribe/unsubscribe directly on the Tika page.

Tyler