Posted to user@poi.apache.org by Nick Burch <ni...@torchbox.com> on 2008/08/12 22:14:52 UTC
Re: Rubbish in extracted text
On Fri, 16 May 2008, Rainer Schwarze wrote:
> these are fields. A quick solution is this: Pass the extracted text
> string through a filter which removes the field codes. Fields are
> delimited by 0x13 (start), 0x14 (separator) and 0x15 (end) bytes. With
> fields which don't have a separator (0x14), remove all from 0x13 to
> 0x15.
I've just added some code to svn that implements this algorithm. It's on
Range, as Range.stripFields(String).
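
For illustration, the filter described in the quoted text could look roughly
like the sketch below. It is not the code committed to svn, and the class name
FieldStripper is hypothetical. It assumes the extracted text is a Java String
in which the 0x13, 0x14 and 0x15 markers survive as individual characters. The
handling of nested fields is an extra assumption that the thread does not
cover: text is kept only when every enclosing field has already passed its
separator.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper, not part of POI: strips Word field codes from extracted text. */
final class FieldStripper {
    private static final char FIELD_START = 0x13;     // begins a field
    private static final char FIELD_SEPARATOR = 0x14; // splits field code from field result
    private static final char FIELD_END = 0x15;       // ends a field

    private FieldStripper() {}

    static String stripFields(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: true once that field's separator has been seen.
        Deque<Boolean> open = new ArrayDeque<>();
        // Number of open fields still in their "code" part (no separator yet).
        int hidden = 0;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == FIELD_START) {
                open.push(Boolean.FALSE);
                hidden++;
            } else if (c == FIELD_SEPARATOR) {
                // Only the innermost open field can be separated here.
                if (!open.isEmpty() && !open.peek()) {
                    open.pop();
                    open.push(Boolean.TRUE);
                    hidden--;
                }
            } else if (c == FIELD_END) {
                // A field that ends without a separator is dropped entirely.
                if (!open.isEmpty() && !open.pop()) {
                    hidden--;
                }
            } else if (hidden == 0) {
                // Ordinary text or a visible field result: keep it.
                out.append(c);
            }
        }
        // Assumption: a field that is never closed hides the rest of the text.
        return out.toString();
    }
}
```

With this sketch, a field that has a separator keeps only its result, so
"Page " + 0x13 + "PAGE" + 0x14 + "12" + 0x15 + "." becomes "Page 12.", while a
field with no separator disappears completely.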
Nick