Posted to dev@tika.apache.org by dynamolalit <la...@gmail.com> on 2010/06/08 09:57:51 UTC

Reg Autodetector Tika Parser

Hi,

I have built the Apache Tika 0.7 jar from the source available on the Tika site.

I am now trying to extract text using the AutoDetectParser class, but I get
no text back. The code is below:

ContentHandler textHandler = new BodyContentHandler();
Metadata metadata = new Metadata();
metadata.add(Metadata.RESOURCE_NAME_KEY, cName);
AutoDetectParser parser = new AutoDetectParser();
parser.parse(iStream, textHandler, metadata);
// BodyContentHandler collects the body text; toString() returns it
String content = textHandler.toString();
logger.debug("Length of content extracted by Tika: " + content.length());

But the logged length is 0.

Any idea where I am going wrong?
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Reg-Autodetector-Tika-Parser-tp878680p878680.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.

Re: Reg AutoDetectParser Tika Parser

Posted by Ken Krugler <kk...@transpac.com>.
> I am able to parse content now. The issue was with Maven on my side.
>
> I am now using Tika 0.7.
>
>
> However, I am not able to parse PST and image files, which are listed as supported at

"image" covers a lot of formats - do you have an example file you  
could make available?

-- Ken

>
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#supported.formats
>
> For both, I get a zero-length string as the content.
>
> Any idea how to extract text from these formats?

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g





Re: Reg AutoDetectParser Tika Parser

Posted by dynamolalit <la...@gmail.com>.
Hi,

I am able to parse content now. The issue was with Maven on my side.

I am now using Tika 0.7.


However, I am not able to parse PST and image files, which are listed as supported at

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#supported.formats.

For both, I get a zero-length string as the content.

Any idea how to extract text from these formats?
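
For anyone hitting the same empty-output symptom: the message above attributes the original problem to Maven but does not say what exactly was wrong. One possible cause (an assumption here, not confirmed in this thread) is that the jar holding the format-specific parsers was missing from the runtime classpath. A minimal sketch for checking that follows. ClasspathCheck and isOnClasspath are hypothetical names, not part of Tika, and the mapping of the two class names to the tika-core and tika-parsers modules is also an assumption worth verifying against your build.

```java
// Hypothetical diagnostic helper, not part of Tika.
final class ClasspathCheck {

    // Returns true if the named class can be loaded, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not available at runtime.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.tika.parser.AutoDetectParser", // assumed to ship in tika-core
            "org.apache.tika.parser.pdf.PDFParser"     // assumed to ship in tika-parsers
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath as the failing application. If a parser class is reported as MISSING, checking the build's Tika dependencies is a reasonable first step before looking at the parsing code itself.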
-- 
View this message in context: http://lucene.472066.n3.nabble.com/Reg-AutoDetectParser-Tika-Parser-tp878680p881534.html