Posted to c-dev@xerces.apache.org by Matt Lovett <ml...@uk.ibm.com> on 2001/02/20 19:25:27 UTC

Bug in ICU transcoder

Hi all,

I'm using the XMLScanner::getSrcOffset calls in my application, and I
think that I've uncovered a bug in the ICU transcoder.

In ICUTranscoder::transcodeFrom(), there is the following code:

  // <TBD> Does ICU return an extra element to allow us to figure
  //  out the last char size? It better!!
  unsigned int index;
  for (index = 0; index < charsDecoded; index++)
  {
    charSizes[index] = (unsigned char)(fSrcOffsets[index + 1]
                                        - fSrcOffsets[index]);
  }

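Unfortunately, the guess didn't work: ICU does not supply the extra
offset, so the last pass through the loop reads fSrcOffsets[charsDecoded],
which is one past the last entry ICU filled in.  Here is a working
replacement for the code: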

  //  ICU does not return an extra element to allow us to figure
  //  out the last char size, so we have to compute it from the
  //  total bytes used.
  unsigned int index;
  //  'index + 1 < charsDecoded' rather than 'index < charsDecoded - 1',
  //  so the unsigned subtraction cannot wrap when charsDecoded is 0.
  for (index = 0; index + 1 < charsDecoded; index++)
  {
     charSizes[index] = (unsigned char)(fSrcOffsets[index + 1]
                                         - fSrcOffsets[index]);
  }
  if (charsDecoded > 0)
  {
     //  Cast the whole difference, not just bytesEaten.
     charSizes[charsDecoded - 1] = (unsigned char)(bytesEaten
                                    - fSrcOffsets[charsDecoded - 1]);
  }

i.e. bring the upper limit of the loop in by one element, and then compute
that last element separately from the total number of bytes consumed.
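
In case it helps whoever reviews this, here is the same idea as a small
standalone sketch that can be compiled without Xerces or ICU.  The function
name charSizesFromOffsets and the use of std::vector are my own invention
for illustration only; they are not part of either library.  The sketch
assumes that each offset is the byte position at which a decoded character
starts, and that bytesEaten is the total number of source bytes consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Xerces or ICU API): derive the byte size of
// each decoded character from its starting offset in the source buffer.
// No offset exists past the last character, so the final size is taken
// from the total number of bytes consumed.
std::vector<unsigned char> charSizesFromOffsets(
    const std::vector<unsigned int>& srcOffsets,
    unsigned int bytesEaten)
{
    const std::size_t count = srcOffsets.size();
    std::vector<unsigned char> sizes(count);

    // 'i + 1 < count' avoids unsigned wrap-around when count is 0.
    for (std::size_t i = 0; i + 1 < count; ++i)
    {
        sizes[i] = static_cast<unsigned char>(srcOffsets[i + 1]
                                              - srcOffsets[i]);
    }

    if (count > 0)
    {
        // Assumption: the last character starts inside the consumed bytes.
        assert(bytesEaten >= srcOffsets[count - 1]);
        sizes[count - 1] = static_cast<unsigned char>(
            bytesEaten - srcOffsets[count - 1]);
    }
    return sizes;
}
```

For example, with offsets 0, 1, 3, 6 and 10 bytes consumed, the sizes come
out as 1, 2, 3, 4; the last one is only recoverable from the byte total.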

I'd appreciate it if someone would check it in.  I'd supply a patch, but
I'm not near a Unix box at the moment... let me know if you need one.

Cheers,

Matt