Posted to user@poi.apache.org by Jebarlin Robertson <je...@gmail.com> on 2014/11/07 14:11:15 UTC

Taking more time in extracting text from 2003 excel file (xls)

Hi,
I want to extract only the plain text from an xls file, without parsing
any of its other properties (styles, cell or row details, images, and so
on).

I am doing this in the SSTDeserializer class, and I observe that reading
the stream takes longer as the unique string count increases. The strings
are read through the RecordInputStream class.

Can anyone help me extract just the plain text from an xls file in less
time?

Thanks in advance

Regards,
Jebarlin Robertson

Re: Taking more time in extracting text from 2003 excel file (xls)

Posted by Nick Burch <ap...@gagravarr.org>.
On Fri, 7 Nov 2014, Jebarlin Robertson wrote:
> I want to extract only the plain text from an xls file, without parsing
> any of its other properties (styles, cell or row details, images, and so
> on).
>
> Can anyone help me extract just the plain text from an xls file in less
> time?

The quickest way would probably be to use the event model. See
https://svn.apache.org/repos/asf/poi/trunk/src/java/org/apache/poi/hssf/extractor/EventBasedExcelExtractor.java
for one implementation, which you might even be able to use as-is.
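
To illustrate why the event model tends to be quicker, here is a minimal
sketch of the idea in plain Java. It is not POI code: it only mimics the
general shape of a BIFF record stream (2-byte id, 2-byte length, then the
payload, little-endian). The class name EventModelSketch, the TEXT_RECORD
id and the UTF-8 payload are all assumptions made for the illustration;
real xls files keep their strings in SST and LabelSST records, with an
encoding of their own. The point is that records are visited once, in
order, and anything that is not text is skipped by length rather than
parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EventModelSketch {

    // Hypothetical record id for this sketch; real xls text lives in SST/LabelSST records.
    public static final int TEXT_RECORD = 0x0001;

    /** Callback that receives each decoded string as it is encountered. */
    public interface TextListener {
        void onText(String text);
    }

    /** Walks the record stream once, decoding only TEXT_RECORD payloads. */
    public static void process(InputStream raw, TextListener listener) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int id;
        while ((id = readUShortLE(in)) >= 0) {
            int length = readUShortLE(in);
            if (length < 0) {
                throw new EOFException("truncated record header");
            }
            if (id == TEXT_RECORD) {
                byte[] payload = new byte[length];
                in.readFully(payload);
                listener.onText(new String(payload, StandardCharsets.UTF_8));
            } else {
                skipFully(in, length); // styles, images, etc. are never parsed
            }
        }
    }

    /** Convenience wrapper: collects every string, one per line. */
    public static String extractText(byte[] recordStream) {
        StringBuilder out = new StringBuilder();
        try {
            process(new ByteArrayInputStream(recordStream),
                    text -> out.append(text).append('\n'));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Builds one record: 2-byte id, 2-byte length (both little-endian), then the payload. */
    public static byte[] record(int id, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(id & 0xFF);
        out.write((id >> 8) & 0xFF);
        out.write(payload.length & 0xFF);
        out.write((payload.length >> 8) & 0xFF);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    /** Reads an unsigned 16-bit little-endian value, or returns -1 at a clean end of stream. */
    private static int readUShortLE(DataInputStream in) throws IOException {
        int lo = in.read();
        if (lo < 0) return -1;
        int hi = in.read();
        if (hi < 0) throw new EOFException("truncated record header");
        return lo | (hi << 8);
    }

    /** Skips exactly count bytes, failing if the stream ends early. */
    private static void skipFully(DataInputStream in, int count) throws IOException {
        int remaining = count;
        while (remaining > 0) {
            int skipped = in.skipBytes(remaining);
            if (skipped <= 0) throw new EOFException("truncated record payload");
            remaining -= skipped;
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        byte[][] records = {
            record(0x0002, new byte[] {1, 2, 3}), // pretend style record, skipped
            record(TEXT_RECORD, "Hello".getBytes(StandardCharsets.UTF_8)),
            record(TEXT_RECORD, "World".getBytes(StandardCharsets.UTF_8)),
        };
        for (byte[] r : records) {
            stream.write(r, 0, r.length);
        }
        System.out.print(extractText(stream.toByteArray()));
    }
}
```

In POI itself the listener role is played by an HSSFListener registered
with HSSFEventFactory, which is the route EventBasedExcelExtractor takes,
so for real files that class is the thing to try first.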

Nick
