Posted to dev@tika.apache.org by dynamolalit <la...@gmail.com> on 2010/06/08 09:57:51 UTC
Reg Autodetector Tika Parser
Hi,
I have built the Apache Tika 0.7 jar from the source available on the Tika site.
Now I am trying to extract text using the AutoDetectParser class, but I am getting
no text, as shown below:
ContentHandler textHandler = new BodyContentHandler();
Metadata metadata = new Metadata();
metadata.add(Metadata.RESOURCE_NAME_KEY, cName);
AutoDetectParser parser = new AutoDetectParser();
parser.parse(iStream, textHandler, metadata);
logger.debug("Length of Content extracted by Tika : " + textHandler.toString().length());
But it shows 0.
Any idea where I am going wrong?
--
View this message in context: http://lucene.472066.n3.nabble.com/Reg-Autodetector-Tika-Parser-tp878680p878680.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
Re: Reg AutoDetectParser Tika Parser
Posted by Ken Krugler <kk...@transpac.com>.
> I am able to parse content now. The issue was with Maven on my side.
>
> Now I am using Tika 0.7.
>
>
> But I am not able to parse PST and image files as stated at
"image" covers a lot of formats - do you have an example file you
could make available?
-- Ken
>
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#supported.formats
> .
>
> For both I am getting a 0-length string for content.
>
> Any idea how to achieve this?
--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g
Re: Reg AutoDetectParser Tika Parser
Posted by dynamolalit <la...@gmail.com>.
Hi,
I am able to parse content now. The issue was with Maven on my side.
Now I am using Tika 0.7.
But I am not able to parse PST and image files as stated at
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika#supported.formats.
For both I am getting a 0-length string for content.
Any idea how to achieve this?