Posted to user@poi.apache.org by ArsenA <ar...@gmail.com> on 2011/05/11 08:34:04 UTC

extract content from doc/docx file using Apache POI

Can anyone please help me extract the content (text and images) of a
.doc/.docx file using Apache POI?

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/extract-content-from-doc-docx-file-using-Apache-POI-tp4386571p4386571.html
Sent from the POI - User mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: extract content from doc/docx file using Apache POI

Posted by Nick Burch <ni...@alfresco.com>.
On Tue, 10 May 2011, ArsenA wrote:
> Can anyone please help me extract the content (text and images) of a
> .doc/.docx file using Apache POI?

You might want to take a look at Apache Tika. It uses POI to generate
XHTML from .doc and .docx files, and it also handles image extraction.

Nick
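
For readers who only need the pictures out of a .docx file, here is a rough
sketch that uses nothing but the JDK. It is not the POI or Tika API, and it
is not how either library works internally. It relies on two assumptions:
a .docx file is a ZIP package, and Word conventionally stores embedded
pictures under word/media/ inside it. The class name DocxMediaSketch and
the method extractImages are hypothetical names made up for this example.
It does not cover the older binary .doc format, and it does not extract
text, so POI or Tika remain the better choice for anything beyond this.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of Apache POI or Apache Tika.
final class DocxMediaSketch {

    // Assumption: embedded pictures live under word/media/ in the package.
    private static final String MEDIA_PREFIX = "word/media/";

    /** Copies every entry under word/media/ into outDir; returns the file names written. */
    static List<String> extractImages(InputStream docx, Path outDir) throws IOException {
        List<String> written = new ArrayList<>();
        Files.createDirectories(outDir);
        Path root = outDir.normalize();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(MEDIA_PREFIX)) {
                    continue;
                }
                // Use only the last path segment instead of the full entry name.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                Path target = root.resolve(fileName).normalize();
                // Skip empty names and anything that would land outside outDir.
                if (fileName.isEmpty() || target.equals(root) || !target.startsWith(root)) {
                    continue;
                }
                // ZipInputStream reports end-of-stream at the end of the current entry,
                // so this copies exactly one entry and leaves the archive stream open.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(fileName);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Usage (Java 11 or later): java DocxMediaSketch.java input.docx output-dir
        try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
            System.out.println(extractImages(in, Path.of(args[1])));
        }
    }
}
```

The sketch streams the archive once and ignores everything outside
word/media/, including word/document.xml, which holds the document text as
XML.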
