Posted to c-dev@xerces.apache.org by "George Compton (JIRA)" <xe...@xml.apache.org> on 2009/03/27 13:32:50 UTC

[jira] Created: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Using TranscodeToStr with strings shorter than two characters corrupts memory 
------------------------------------------------------------------------------

                 Key: XERCESC-1858
                 URL: https://issues.apache.org/jira/browse/XERCESC-1858
             Project: Xerces-C++
          Issue Type: Bug
          Components: Utilities
    Affects Versions: 3.0.0
         Environment: Windows XP Pro (32-bit), compiling with MS VisualStudio.Net 2003 with debugging symbols.  
            Reporter: George Compton
            Priority: Minor


I have observed this problem when using TranscodeToStr to transcode "" (empty string) and "1" (the numeral one) to US-ASCII.  Both caused the program to crash, apparently because the debug heap noticed that it had been corrupted.  

TranscodeToStr::transcode will overrun the allocated string fString when called with an input string that is less than two characters long.  When determining whether it needs to allocate additional space for the terminating null characters, it uses the expression:
    if(fBytesWritten > (allocSize - 4)) {
allocSize is of type XMLSize_t, which is unsigned, assuming I followed all the typedefs correctly.  So, when the input string contains exactly one non-null character, allocSize is one times the size of XMLCh, which is two bytes on Windows.  (2 - 4) in unsigned arithmetic wraps around to a very large number, so the condition is false and no additional memory is allocated for the terminator.  The four terminating characters are then written to bytes 2, 3, 4, and 5 of a two-byte array, corrupting whatever lies after it.  
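
To make the wraparound concrete, here is a minimal sketch.  It assumes XMLSize_t behaves like std::size_t; the names Size, wrappedThreshold and needsGrowthAsWritten are made up for this illustration and are not Xerces-C++ code.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for XMLSize_t (assumption: an unsigned type like std::size_t).
using Size = std::size_t;

// The right-hand side of the quoted comparison for a short buffer.
// Unsigned subtraction wraps modulo 2^N instead of going negative.
Size wrappedThreshold(Size allocSize) {
    assert(allocSize < 4);   // only the short-buffer case is of interest here
    return allocSize - 4;    // e.g. 2 - 4 becomes 2^N - 2
}

// The comparison exactly as quoted in the report.
bool needsGrowthAsWritten(Size bytesWritten, Size allocSize) {
    return bytesWritten > (allocSize - 4);
}
```

With a two-byte allocation, needsGrowthAsWritten returns false for any realistic bytesWritten, which is why the buffer is never enlarged before the terminator is written.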

Something similar happens with empty strings.  I haven't followed that all the way through a debugger, but I think there may be an additional problem there.  The method's while(true) loop will transcode at least one character, because it doesn't check the input length until halfway through the first iteration.  Now, it's possible the allocator always allocates at least one byte, or it's possible that the transcoder won't write anything for a NULL character.  I haven't checked into either of those possibilities, but it seems risky to rely on them, even if they hold.  

I have not had a chance to try fixing either problem yet.  For the first, just changing it to (fBytesWritten + 4 > allocSize) should probably work.  The second will probably require moving the length check or adding a second length check outside the loop.  
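
The rearranged comparison suggested above can be sketched as follows.  This is only an illustration under the assumption that XMLSize_t is an unsigned type like std::size_t; Size, kTerminatorBytes and needsGrowthFixed are hypothetical names, and this is not the patch that was eventually committed.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for XMLSize_t (assumption: an unsigned type like std::size_t).
using Size = std::size_t;

// Room reserved for the terminating null characters, per the report.
const Size kTerminatorBytes = 4;

// Adding on the left keeps both sides small and non-negative, so the
// comparison cannot wrap for any realistic buffer size.
bool needsGrowthFixed(Size bytesWritten, Size allocSize) {
    assert(bytesWritten <= allocSize);   // cannot have written past the buffer
    return bytesWritten + kTerminatorBytes > allocSize;
}
```

For a two-byte allocation this now reports that more room is needed, and it does the same for an empty buffer.  The sum could in principle wrap if bytesWritten were within four of the maximum value of the type, which should not happen with real allocations.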

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


[jira] Assigned: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "John Snelson (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Snelson reassigned XERCESC-1858:
-------------------------------------

    Assignee: John Snelson


[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "George Compton (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689886#action_12689886 ] 

George Compton commented on XERCESC-1858:
-----------------------------------------

I see the same problematic code at the head of the trunk at http://svn.apache.org/viewvc/xerces/c/trunk/src/xercesc/util/TransService.cpp?view=markup as of 3/27/09, so this presumably affects 3.0.1 and any other more recent versions, as well as 3.0.0, where I tested it.  


[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "John Snelson (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690126#action_12690126 ] 

John Snelson commented on XERCESC-1858:
---------------------------------------

David - I think we just stepped on each other as we looked at fixing this bug. I've just committed a fix to SVN - almost the same as your patch. I didn't modify the initialization of charsRead, so you might want to do that if you think it's important.


[jira] Updated: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "David Bertoni (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Bertoni updated XERCESC-1858:
-----------------------------------

    Attachment: XERCESC-1858.patch


[jira] Resolved: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
     [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alberto Massari resolved XERCESC-1858.
--------------------------------------

       Resolution: Fixed
    Fix Version/s: 3.1.0
                   3.0.2


[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "David Bertoni (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690140#action_12690140 ] 

David Bertoni commented on XERCESC-1858:
----------------------------------------

I'm not sure if you tested with a zero-length string, but in that case, the transcoder does not write to charsRead, so it contains a garbage value, leading to an endless loop.
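
As a rough sketch of the pattern that avoids this, the loop below initializes the out-parameter and checks the length before the first call.  toyTranscode and transcodeAll are hypothetical stand-ins written for this illustration; they are not the Xerces-C++ transcoder API, and toyTranscode only mimics the behaviour described above of leaving charsRead untouched for empty input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy transcoder: narrows ASCII-range UTF-16 units to bytes.
// Like the case described above, it returns early on empty input
// WITHOUT writing to charsRead.
std::size_t toyTranscode(const char16_t* src, std::size_t srcLen,
                         unsigned char* dst, std::size_t dstLen,
                         std::size_t& charsRead) {
    if (srcLen == 0) return 0;               // charsRead left untouched
    std::size_t i = 0;
    while (i < srcLen && i < dstLen && src[i] < 0x80) {
        dst[i] = static_cast<unsigned char>(src[i]);
        ++i;
    }
    charsRead = i;
    return i;                                // bytes written
}

// Driver loop: returns the number of input characters consumed.
std::size_t transcodeAll(const char16_t* src, std::size_t len, std::string& out) {
    assert(src != nullptr || len == 0);
    std::size_t charsDone = 0;
    while (charsDone < len) {                // length checked before the first call
        std::size_t charsRead = 0;           // "not written" now reads as zero
        unsigned char buf[8];
        const std::size_t n = toyTranscode(src + charsDone, len - charsDone,
                                           buf, sizeof buf, charsRead);
        if (charsRead == 0) break;           // no progress: stop rather than spin
        out.append(reinterpret_cast<const char*>(buf), n);
        charsDone += charsRead;
    }
    return charsDone;
}
```

Either safeguard alone would stop the loop in this toy version; having both means the result does not depend on what the callee happens to do with an empty input.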


[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "John Snelson (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693738#action_12693738 ] 

John Snelson commented on XERCESC-1858:
---------------------------------------

I did test with a zero length string and didn't have a problem, but that's just what can happen with uninitialized variables - I'm glad you noticed it. I'll commit a fix for it, and port the fixes to the xerces-3.0 branch.


[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "Boris Kolpackov (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690172#action_12690172 ] 

Boris Kolpackov commented on XERCESC-1858:
------------------------------------------

Can we also have the fix committed to the xerces-3.0 branch in case we decide to release 3.0.2?



[jira] Commented: (XERCESC-1858) Using TranscodeToStr with strings shorter than two characters corrupts memory

Posted by "David Bertoni (JIRA)" <xe...@xml.apache.org>.
    [ https://issues.apache.org/jira/browse/XERCESC-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690124#action_12690124 ] 

David Bertoni commented on XERCESC-1858:
----------------------------------------

Yes, this code is definitely broken, both for strings of length 0 and 1.  I'm attaching a patch with a proposed fix.  If possible, please apply the patch to your local copy to see if it works for you.
