Posted to j-users@xerces.apache.org by "Heathcote, Guy" <gh...@ordsvy.gov.uk> on 2001/07/19 10:51:40 UTC

My SAX character() Callback is introducing spaces...

Hi,

I'm attempting to parse an XML file containing, amongst other data, large
lists of map co-ordinates.  In one particular file, I have an element
containing a particularly long co-ord list (it fills several pages of my
editor...) which is causing me some problems.  Xerces is splitting the list
into several chunks and supplying each chunk to the characters() callback
separately, which I'm OK with.  However, the text returned during one of
these callbacks also contains a string of space characters mid-way through,
spaces that are not in the input file.  In fact, the spaces occur half way
through one of the co-ordinates, which isn't exactly helpful.  I'm using
Xerces-J version 1.4.0.

Example...

Extract of original XML:

       ... 305156.600,209888.100 305153.500,209888.900 ...

Text returned from callback:

       ... 305156.600,20988                             8.100
305153.500,209888.900...

Code in the characters() callback is just:

    public void characters(char[] ch, int start, int end) throws SAXException
    {
         String thisText = new String(ch,start,end-start);
         thisFullTextString = thisFullTextString + thisText;

         // Temp diagnostics
         System.out.println("Current string = <<<" + thisText + ">>>");
    }


Is there a reason for this behaviour?  Is it a Xerces bug?  If so, is there
a work-around?

Guy Heathcote
Ordnance Survey, England



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: My SAX character() Callback is introducing spaces...

Posted by Ian Roberts <ir...@decisionsoft.com>.
On Thu, 19 Jul 2001, Heathcote, Guy wrote:

> Code in the characters() callback is just:
> 
>     public void characters(char[] ch, int start, int end) throws SAXException

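I think this is your problem: the parameters to characters() are the
character array, the starting index, and the *length* of the chunk, not
the end index.  Subtracting start from the third argument therefore gives
the wrong character count whenever start is non-zero.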
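
As a rough illustration of the point above, here is a minimal sketch of a
callback that treats the third argument as a length.  The class name
CoordTextHandler and the helpers getText() and textOf() are hypothetical,
made up for this example.  It assumes a SAX2 handler extending
org.xml.sax.helpers.DefaultHandler and a JDK with StringBuilder and JAXP;
on an older JDK, StringBuffer would play the same role.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates every chunk of character data it receives.
class CoordTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    // The third argument is the LENGTH of the chunk, not an end index,
    // so it is passed straight through without subtracting start.
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Returns everything accumulated so far.
    public String getText() {
        return text.toString();
    }

    // Hypothetical convenience method: parse an XML string and return its text.
    public static String textOf(String xml) {
        try {
            CoordTextHandler handler = new CoordTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getText();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Appending to a buffer also avoids rebuilding the whole string on every
chunk, which may matter for a co-ordinate list of that size.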

Ian

-- 
Ian Roberts, Software Engineer        DecisionSoft Ltd.
Telephone: +44-1865-203192            http://www.decisionsoft.com

