Posted to j-dev@xerces.apache.org by bu...@apache.org on 2003/11/27 09:48:27 UTC
DO NOT REPLY [Bug 25045] New: - UT8Reader is throwing a ArrayOutOfBounds Exception

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25045>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25045

UT8Reader is throwing a ArrayOutOfBounds Exception

           Summary: UT8Reader is throwing a ArrayOutOfBounds Exception
           Product: Xerces2-J
           Version: 2.5.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Critical
          Priority: Other
         Component: Other
        AssignedTo: xerces-j-dev@xml.apache.org
        ReportedBy: phil@triloggroup.com


When the UTF8Reader reads into a character array whose available space is 
smaller than its internal buffer (2048 bytes by default), it miscomputes the 
'total' number of bytes to convert: it uses the size of the internal buffer 
without checking it against the size of the buffer passed as a parameter.
As a consequence, ch[out++] throws an ArrayIndexOutOfBoundsException.

In the code below, I added two println() calls to see what is happening; here 
is the output from my application. The read() method is asked to fill 2043 
characters, but tries to fill 2048!


>>>>>Console
UTF8 Reader start: out=0, total=1, fOffset=0
UTF8 Reader:0,2048
UTF8 Reader start: out=0, total=2048, fOffset=0
UTF8 Reader:5,2043
UTF8 Reader start: out=5, total=2048, fOffset=0
java.lang.ArrayIndexOutOfBoundsException: 2048
        at org.apache.xerces.impl.io.UTF8Reader.read(UTF8Reader.java:355)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.skipString(Unknown Source)
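
The numbers in the trace above can be checked with a small standalone sketch. 
It does not use Xerces; the class name TraceArithmetic and the helper 
copyAscii are hypothetical, and the caller's array is assumed to hold exactly 
2048 chars (offset 5 + length 2043), which matches the index reported in the 
exception but is not stated in the report.

```java
public class TraceArithmetic {
    /**
     * Copies ASCII bytes the way the traced loop does and returns the
     * next free index in ch. There is deliberately no check against
     * ch.length, mirroring the excerpt in this report.
     */
    static int copyAscii(byte[] src, int total, char[] ch, int out) {
        for (int in = 0; in < total; in++) {
            byte b = src[in];
            if (b < 0) {
                break; // non-ASCII byte: the real reader decodes these elsewhere
            }
            ch[out++] = (char) b;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] internal = new byte[2048]; // all zeros, so every byte is ASCII
        char[] ch = new char[2048];       // ASSUMPTION: caller buffer of 2048 chars

        // Requested: 2043 chars starting at offset 5 fills ch exactly.
        System.out.println("next index: " + copyAscii(internal, 2043, ch, 5));

        // Traced: total=2048 starting at out=5 runs past the end of ch.
        try {
            copyAscii(internal, 2048, ch, 5);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("overflow: " + e.getMessage());
        }
    }
}
```

With out=5 and total=2048, the 2044th copy targets index 5 + 2043 = 2048, 
which is one past the last valid index of a 2048-char array.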


>>>>Code where the traces are added:

    public int read(char ch[], int offset, int length) throws IOException {
System.out.println( "UTF8 Reader:"+offset+","+length);
        // handle surrogate
        int out = offset;
        if (fSurrogate != -1) {
            ch[offset + 1] = (char)fSurrogate;
            fSurrogate = -1;
            length--;
            out++;
        }

        // read bytes
        int count = 0;
        if (fOffset == 0) {
            // adjust length to read
            if (length > fBuffer.length) {
                length = fBuffer.length;
            }

            // perform read operation
            count = fInputStream.read(fBuffer, 0, length);
            if (count == -1) {
                return -1;
            }
            count += out - offset;
        }

        // skip read; last character was in error
        // NOTE: Having an offset value other than zero means that there was
        //       an error in the last character read. In this case, we have
        //       skipped the read so we don't consume any bytes past the
        //       error. By signalling the error on the next block read we
        //       allow the method to return the most valid characters that
        //       it can on the previous block read. -Ac
        else {
            count = fOffset;
            fOffset = 0;
        }

        // convert bytes to characters
        final int total = count;
        int in;
        byte byte1;
        final byte byte0 = 0;
System.out.println( "UTF8 Reader start: out="+out+", total="+total+", fOffset="+fOffset);
        for (in = 0; in < total; in++) {
            byte1 = fBuffer[in];
            if (byte1 >= byte0) {
//===> Crashing HERE!!!
                ch[out++] = (char)byte1;
            }
            else   {
                break;
            }
        }
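
One way to guard the conversion loop is to clamp the number of bytes converted 
to the room left in the caller's array. The following is only a sketch under 
that assumption, not the Xerces fix: the class BoundedCopy and the method 
copyAsciiBounded are hypothetical names, and a real reader would also have to 
keep the unconverted bytes for the next read() call, which is not shown here.

```java
public class BoundedCopy {
    /**
     * Converts leading ASCII bytes of src to chars, writing at most
     * 'length' chars starting at ch[offset] and never past the end of ch.
     * Returns the number of bytes actually consumed from src.
     */
    static int copyAsciiBounded(byte[] src, int count,
                                char[] ch, int offset, int length) {
        // Room the caller really gave us, never negative.
        int room = Math.max(0, Math.min(length, ch.length - offset));
        // Convert no more bytes than there is room for.
        int total = Math.min(count, room);
        int in;
        for (in = 0; in < total; in++) {
            byte b = src[in];
            if (b < 0) {
                break; // non-ASCII byte: left for multi-byte decoding
            }
            ch[offset + in] = (char) b;
        }
        return in;
    }

    public static void main(String[] args) {
        byte[] internal = new byte[2048]; // 2048 buffered ASCII bytes
        char[] ch = new char[2048];       // ASSUMPTION: caller buffer of 2048 chars
        int consumed = copyAsciiBounded(internal, 2048, ch, 5, 2043);
        System.out.println("consumed " + consumed + " of 2048 bytes");
    }
}
```

With the values from the trace (offset 5, length 2043, 2048 buffered bytes), 
this consumes 2043 bytes and leaves 5 that still have to be delivered later.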
