Posted to dev@stdcxx.apache.org by Farid Zaripov <Fa...@kyiv.vdiweb.com> on 2006/07/04 18:20:44 UTC

rw_match can address to memory after end of string buffer

   I found that the rw_match function can access memory past the 
end of the string buffer.

   It calls __rw_get_char to get the last character and this function 
reads a character after the end of the string buffer:

char.cpp line 534:
     if ('<' == char (ch) && 'U' == src [0] && isxdigit (src [1])) {

char.cpp line 548:
     if ('@' == src [0] && isdigit (src [1])) { 


   The read of src [0] is where it fails.

   I attached a test that illustrates the problem, but it works on 
MSVC/Windows only (it uses MSVC-specific keywords).

Farid.
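
One way to avoid this kind of overrun is to check how many characters
remain before peeking ahead. The sketch below is only an illustration
of that idea, not the code in char.cpp: the helper names and the
explicit `end` pointer are assumptions made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper, not the code in char.cpp: reports whether the
// range [src, end) starts with '@' followed by a decimal digit.
// The remaining length is checked before any element is read.
bool starts_with_at_digit (const char *src, const char *end)
{
    assert (src <= end);

    const std::size_t avail = std::size_t (end - src);

    return 2 <= avail
        && '@' == src [0]
        && 0 != std::isdigit ((unsigned char)src [1]);
}

// Same idea for the '<' 'U' hex-digit form. `ch` is the character that
// has already been read; src [0] and src [1] are the next two, if any.
bool starts_with_u_xdigit (char ch, const char *src, const char *end)
{
    assert (src <= end);

    const std::size_t avail = std::size_t (end - src);

    return '<' == ch
        && 2 <= avail
        && 'U' == src [0]
        && 0 != std::isxdigit ((unsigned char)src [1]);
}
```

With the length test first, src [0] and src [1] are never evaluated
when fewer than two characters remain, because && short-circuits.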

Re: rw_match can address to memory after end of string buffer

Posted by Martin Sebor <se...@roguewave.com>.
Martin Sebor wrote:
> Martin Sebor wrote:
> 
>> Farid Zaripov wrote:
>>
>>>   I found that the rw_match function can access memory past the 
>>> end of the string buffer.
>>>
>>>   It calls __rw_get_char to get the last character and this function 
>>> reads a character after the end of the string buffer:
>>>
>>> char.cpp line 534:
>>>     if ('<' == char (ch) && 'U' == src [0] && isxdigit (src [1])) {
>>>
>>> char.cpp line 548:
>>>     if ('@' == src [0] && isdigit (src [1])) {
>>>
>>>   The read of src [0] is where it fails.
>>
>>
>>
>> Hmm, that does look like a subtle bug in rw_match(). Let me look
>> into how best to fix it.

The commit below should fix it. I'm not 100% happy with the code
and suspect there might still be some bugs lurking in there, but
if there are, the current test doesn't reveal them (even under
Sun dbx with memory checking on), and none of the string tests
shows any sign of problems either.
http://svn.apache.org/viewvc?rev=420363&view=rev

Martin

Re: rw_match can address to memory after end of string buffer

Posted by Martin Sebor <se...@roguewave.com>.
Martin Sebor wrote:
> Farid Zaripov wrote:
> 
>>   I found that the rw_match function can access memory past the 
>> end of the string buffer.
>>
>>   It calls __rw_get_char to get the last character and this function 
>> reads a character after the end of the string buffer:
>>
>> char.cpp line 534:
>>     if ('<' == char (ch) && 'U' == src [0] && isxdigit (src [1])) {
>>
>> char.cpp line 548:
>>     if ('@' == src [0] && isdigit (src [1])) {
>>
>>   The read of src [0] is where it fails.
> 
> 
> Hmm, that does look like a subtle bug in rw_match(). Let me look
> into how best to fix it.

Here's a simple test case demonstrating the bug. The value returned
from rw_match() for two NUL-terminated sequences that are the same
should be the offset of the NUL character plus 1 (i.e., strlen(s0)
+ 1).

$ cat v.cpp && make v && ./v
#include <assert.h>
#include <rw_char.h>
#include <rw_printf.h>

int main ()
{
     const char s0[] = "a\0@2";
     const char s1[] = "a\0@3";

     unsigned i = rw_match (s0, s1);

     rw_printf ("%u\n", i);

     assert (i == 2);
}
gcc -c -I/build/sebor/dev/stdlib/include/ansi -D_RWSTDDEBUG   -pthreads 
-D_RWSTD_USE_CONFIG -I/build/sebor/dev/stdlib/include 
-I/build/sebor/gcc-4.1.0-15s/include -I/build/sebor/dev/stdlib/../rwtest 
-I/build/sebor/dev/stdlib/../rwtest/include 
-I/build/sebor/dev/stdlib/tests/include  -pedantic -nostdinc++ -g  -W 
-Wall -Wcast-qual -Winline -Wshadow -Wwrite-strings -Wno-long-long  v.cpp
gcc v.o -o v -L/build/sebor/gcc-4.1.0-15s/rwtest -lrwtest15s -pthreads 
-L/build/sebor/gcc-4.1.0-15s/lib -lstd15s  -lsupc++ -lm
3
Assertion failed: i == 2, file v.cpp, line 14
Abort (core dumped)
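
For comparison, here is a minimal sketch of the expected result that
does not depend on the rwtest library. The function name match_offset
is hypothetical and this is not how rw_match() is implemented; the
return value for strings that differ (the offset of the first
mismatch) is an assumption. It only mirrors the rule stated above for
equal strings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference function, not rw_match() itself.
// Compares two NUL-terminated strings and returns:
//   - the offset of the first differing character (assumed behavior), or
//   - strlen (s0) + 1 when the strings are equal, i.e., the offset of
//     the NUL character plus 1.
// The loop stops at the first NUL, so nothing after it is ever read.
std::size_t match_offset (const char *s0, const char *s1)
{
    assert (s0 && s1);

    std::size_t i = 0;

    for ( ; s0 [i] == s1 [i]; ++i) {
        if ('\0' == s0 [i])
            return i + 1;   // equal up to and including the NUL
    }

    return i;               // first mismatch
}
```

Called with the two arrays from the test case, match_offset (s0, s1)
stops at the embedded NUL at offset 1 and returns 2, the value the
assertion in v.cpp expects.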

Re: rw_match can address to memory after end of string buffer

Posted by Martin Sebor <se...@roguewave.com>.
Farid Zaripov wrote:
>   I found that the rw_match function can access memory past the 
> end of the string buffer.
> 
>   It calls __rw_get_char to get the last character and this function 
> reads a character after the end of the string buffer:
> 
> char.cpp line 534:
>     if ('<' == char (ch) && 'U' == src [0] && isxdigit (src [1])) {
> 
> char.cpp line 548:
>     if ('@' == src [0] && isdigit (src [1])) {
> 
>   The read of src [0] is where it fails.

Hmm, that does look like a subtle bug in rw_match(). Let me look
into how best to fix it.

> 
>   I attached a test that illustrates the problem, but it works on 
> MSVC/Windows only (it uses MSVC-specific keywords).

Cool! This type of test would be useful in general (AFAIK, this
is the idea behind Electric Fence). How about abstracting this into
a function that would let us do the same thing in a portable way?

Martin
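
A rough sketch of what such a function could look like on POSIX
systems follows. It is not the attached MSVC test and not part of the
rwtest library: the names GuardedBuf, guarded_alloc and guarded_free
are hypothetical, and it assumes mmap() with MAP_ANONYMOUS and
mprotect() are available. The buffer is placed so that its last byte
sits right before an inaccessible page, which makes any read past the
end fault immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): a buffer whose last byte sits
// immediately before an inaccessible guard page.
struct GuardedBuf {
    void        *base;    // start of the mapping
    std::size_t  total;   // mapping size in bytes, guard page included
    char        *data;    // user buffer of `size` bytes
    std::size_t  size;
};

// Maps enough pages for `size` bytes plus one guard page, revokes all
// access to the guard page, and places the buffer flush against it.
// Returns a GuardedBuf with data == 0 on failure.
GuardedBuf guarded_alloc (std::size_t size)
{
    assert (0 < size);

    GuardedBuf buf = { 0, 0, 0, 0 };

    const std::size_t page   = std::size_t (sysconf (_SC_PAGESIZE));
    const std::size_t npages = (size + page - 1) / page;
    const std::size_t total  = (npages + 1) * page;

    void* const base = mmap (0, total, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (MAP_FAILED == base)
        return buf;

    // the last page of the mapping becomes the guard page
    char* const guard = static_cast<char*>(base) + npages * page;

    if (0 != mprotect (guard, page, PROT_NONE)) {
        munmap (base, total);
        return buf;
    }

    buf.base  = base;
    buf.total = total;
    buf.data  = guard - size;   // buffer ends where the guard begins
    buf.size  = size;
    return buf;
}

// Unmaps the buffer together with its guard page and resets `buf`.
void guarded_free (GuardedBuf &buf)
{
    if (buf.base)
        munmap (buf.base, buf.total);

    const GuardedBuf empty = { 0, 0, 0, 0 };
    buf = empty;
}
```

A test would copy the string under test into data (without any extra
padding) and pass it to the function being checked; a read of
data [size] then hits the guard page instead of going unnoticed.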