Posted to dev@tika.apache.org by Roger Carter <ro...@gmail.com> on 2014/08/06 23:03:00 UTC

Starting Advice

Hi Everyone,

I'm new to the Apache scene; I have experience with MATLAB and minimal
experience with Python. This seems like a powerful tool and I'd like to
learn more. If anyone is willing to provide recommendations for resources
or detail their experiences in learning Tika, I would be most grateful.

Thanks,
Roger

Re: Starting Advice

Posted by Tyler Palsulich <tp...@gmail.com>.
Hi Roger,

Thanks for your interest in Tika! In a nutshell, Tika is a content
extraction tool. You can extract metadata and text, identify the language
a document is written in, and translate text using internet APIs (for now;
we're working on machine translation). We're in the process of releasing
version 1.6. Tika in Action is a book written by Chris Mattmann, the lead
and co-creator of Tika. You can find more info at [0].

You can use Tika in multiple ways:

*1. tika-app jar*. Try downloading a release from tika.apache.org and running
`java -jar tika-app.jar [some file]`.
*2. GUI*. Try running `java -jar tika-app.jar --gui`. A graphical interface
will pop up. Then, try dragging a file into the window.
*3. Tika server*. Run `java -jar tika-server.jar` (the REST server ships as
its own jar, separate from tika-app). Then, try one of the commands from [1]
(e.g. `curl -X PUT --data-binary @example.csv
http://localhost:9998/meta --header "Content-Type: text/csv"`).
*4. Java API*. Check out an example of using Parser.parse() at [2].
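
Since Python came up in the original question, here is a rough sketch of
sending roughly the same request as the curl command in option 3, using only
the Python standard library. It assumes a Tika server is already listening on
localhost:9998. The helper names `build_meta_request` and `fetch_metadata`
are made up for illustration and are not part of Tika.

```python
import urllib.request

# Assumption: a Tika server is listening here (same host and port as the curl example).
TIKA_META_URL = "http://localhost:9998/meta"


def build_meta_request(payload: bytes, content_type: str,
                       url: str = TIKA_META_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying the raw file bytes."""
    request = urllib.request.Request(url, data=payload, method="PUT")
    request.add_header("Content-Type", content_type)
    return request


def fetch_metadata(path: str, content_type: str) -> str:
    """Send the file at `path` to the server and return the response body as text."""
    with open(path, "rb") as handle:
        payload = handle.read()
    request = build_meta_request(payload, content_type)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

With the server running, `print(fetch_metadata("example.csv", "text/csv"))`
should print whatever metadata the server reports for that file.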

Hope that helps!

Tyler

[0] - http://www.manning.com/mattmann/
[1] - http://wiki.apache.org/tika/TikaJAXRS
[2] - https://github.com/tpalsulich/TikaExamples


On Wed, Aug 6, 2014 at 11:04 PM, Alex Ott <al...@gmail.com> wrote:

> I think "Tika in Action" is still relevant...

Re: Starting Advice

Posted by Alex Ott <al...@gmail.com>.
I think "Tika in Action" is still relevant...


-- 
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott