Posted to user@tika.apache.org by goog cheng <go...@gmail.com> on 2012/10/25 18:39:33 UTC

input filestream in command line

Tika supports an input file on the CLI. But if the input is a file stream, is there
a command to do it? Any help would be greatly appreciated!

Re: input filestream in command line

Posted by Nick Burch <ap...@gagravarr.org>.
On Sun, 28 Oct 2012, goog cheng wrote:
> it seems tika doesnt stream it to disk

It will do if the parser needs it, so it'll depend on the file type

> have to do manual operation in bash, is there a more detail document 
> about tika

Aren't you in python though? That should make it much easier...

Nick
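For readers following along, the "save it to disk first" route discussed above could look roughly like this in Python. This is only a sketch under assumptions: the jar name tika-app-1.2.jar and the -t flag are taken from the command quoted elsewhere in this thread, the jar is assumed to be in the current directory, and the helper name extract_text_via_tempfile is made up for illustration.

```python
import os
import subprocess
import tempfile

# Assumption: tika-app-1.2.jar sits in the current working directory.
TIKA_CMD = ["java", "-jar", "tika-app-1.2.jar", "-t"]


def extract_text_via_tempfile(data, cmd=TIKA_CMD):
    """Write in-memory bytes to a temporary file, run cmd on it, return its stdout.

    The helper name and the cmd parameter are hypothetical; cmd is any
    command that accepts a file path as its last argument.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False)
    try:
        tmp.write(data)
        tmp.close()  # flush and close so the child process can open the file
        return subprocess.check_output(list(cmd) + [tmp.name])
    finally:
        tmp.close()  # harmless if already closed
        os.unlink(tmp.name)  # always clean up the temporary file


# Usage (requires Java and the Tika jar):
# text = extract_text_via_tempfile(open("report.pdf", "rb").read())
```

The temporary file is removed even if the command fails, so nothing is left behind on disk.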

Re: input filestream in command line

Posted by goog cheng <go...@gmail.com>.
It seems Tika doesn't stream it to disk, so I have to do it manually in
bash. Is there more detailed documentation about Tika?

2012/10/27 Nick Burch <ap...@gagravarr.org>

> On Sat, 27 Oct 2012, goog cheng wrote:
>
>> the file is in memory, i have to save it in disk and then fetch it back
>> again?
>>
>
> Some Tika parsers only work with files, so if you don't stream it to disk
> Tika will do. Otherwise, you could always stream it to Tika on stdin?
>
> Nick
>

Re: input filestream in command line

Posted by Nick Burch <ap...@gagravarr.org>.
On Sat, 27 Oct 2012, goog cheng wrote:
> the file is in memory, i have to save it in disk and then fetch it back 
> again?

Some Tika parsers only work with files, so if you don't write the data to disk
yourself, Tika will do so. Otherwise, you could always stream it to Tika on stdin?

Nick
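A rough Python sketch of the stdin suggestion above, for data that is already in memory. It assumes the tika-app CLI parses standard input when no file argument is given, which is what the reply implies; the jar name and the -t flag come from the command quoted elsewhere in this thread, and the helper name extract_text_via_stdin is hypothetical.

```python
import subprocess

# Assumption: tika-app-1.2.jar sits in the current working directory.
TIKA_CMD = ["java", "-jar", "tika-app-1.2.jar", "-t"]


def extract_text_via_stdin(data, cmd=TIKA_CMD):
    """Pipe in-memory bytes to cmd on stdin and return whatever it prints.

    The helper name and the cmd parameter are hypothetical; cmd is any
    command that reads its input from standard input.
    """
    proc = subprocess.run(
        list(cmd),
        input=data,              # bytes sent to the child's stdin
        stdout=subprocess.PIPE,  # capture the extracted text
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return proc.stdout


# Usage (requires Java and the Tika jar):
# with open("report.pdf", "rb") as f:
#     text = extract_text_via_stdin(f.read())
```

Whether this avoids disk entirely still depends on the parser, as noted above: some parsers need a real file, in which case Tika spools the stream to disk itself.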

Re: input filestream in command line

Posted by goog cheng <go...@gmail.com>.
The file is in memory. Do I have to save it to disk and then fetch it back
again?

2012/10/26 Vigneshwaran <vi...@gmail.com>

> 2012/10/26 goog cheng <go...@gmail.com>:
> > subprocess.check_output("java -jar tika-app-1.2.jar -t "+ file, shell=True)
> >
>
> :D
>
> Why don't you try passing the absolute path name of the file?
>
> subprocess.check_output("java -jar tika-app-1.2.jar -t "+ filename, shell=True)
>

Re: input filestream in command line

Posted by Vigneshwaran <vi...@gmail.com>.
2012/10/26 goog cheng <go...@gmail.com>:
> subprocess.check_output("java -jar tika-app-1.2.jar -t "+ file, shell=True)
>

:D

Why don't you try passing the absolute path name of the file?

subprocess.check_output("java -jar tika-app-1.2.jar -t "+ filename, shell=True)
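As a possible variation on the call above, the command can be passed as a list of arguments instead of a single shell string. That way a filename containing spaces or shell metacharacters reaches Tika unchanged. This is a sketch, not a tested recipe for Tika 1.2: the helper names build_tika_command and extract_text are made up, and the jar is assumed to be in the current directory.

```python
import os
import subprocess


def build_tika_command(filename, jar="tika-app-1.2.jar"):
    """Return the argument list for a text-extraction run on filename.

    Hypothetical helper: the jar name and the -t flag are copied from the
    command quoted in this thread.
    """
    return ["java", "-jar", jar, "-t", os.path.abspath(filename)]


def extract_text(filename):
    """Run Tika on filename and return its standard output as bytes."""
    # No shell=True: each list item is passed to java as one argument.
    return subprocess.check_output(build_tika_command(filename))


# Usage (requires Java and the Tika jar):
# text = extract_text("my report.pdf")
```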

Re: input filestream in command line

Posted by goog cheng <go...@gmail.com>.
subprocess.check_output("java -jar tika-app-1.2.jar -t "+ file, shell=True)

2012/10/26 Nick Burch <ap...@gagravarr.org>

> On 26/10/12 00:52, goog cheng wrote:
>
>> in python,   an opened file object
>>
>
> And how are you currently calling Tika from Python?
>
> Nick
>

Re: input filestream in command line

Posted by Nick Burch <ap...@gagravarr.org>.
On 26/10/12 00:52, goog cheng wrote:
> in python,   an opened file object

And how are you currently calling Tika from Python?

Nick

Re: input filestream in command line

Posted by goog cheng <go...@gmail.com>.
In Python, an opened file object.

2012/10/26 Nick Burch <ap...@gagravarr.org>

> On Fri, 26 Oct 2012, goog cheng wrote:
>
>> Tika supports input file in CLI . But if the input is filestream, is
>> there a command to do it?  Any help would be greatly appreciated!
>>
>
> What do you mean by a "filestream", a pipe? Something else?
>
> Nick
>

Re: input filestream in command line

Posted by Nick Burch <ap...@gagravarr.org>.
On Fri, 26 Oct 2012, goog cheng wrote:
> Tika supports input file in CLI . But if the input is filestream, is 
> there a command to do it?  Any help would be greatly appreciated!

What do you mean by a "filestream", a pipe? Something else?

Nick