You are viewing a plain text version of this content. The canonical link for it is here.
Posted to c-dev@xerces.apache.org by Erik Schroeder <ES...@Wausaufs.com> on 2000/07/21 18:59:04 UTC

Something goofy in Win32TransService::Win32TransService()?

In the constructor for Win32TransService, a variable is declared:
char nameBuf[nameBufSz + 1];

Further into the constructor, the following code can be executed:
fCPMap->put(nameBuf, newEntry);

The declaration for put looks like:
template <class TVal> void RefHashTableOf<TVal>::put(void* key, TVal* const
valueToAdopt)

So, we have nameBuf getting passed to a function that takes a void*.
Then, put calls findBucketElem:
RefHashTableBucketElem<TVal>* newBucket = findBucketElem(key, hashVal);

findBucketElem has a declaration of:
template <class TVal> RefHashTableBucketElem<TVal>* RefHashTableOf<TVal>::
findBucketElem(const void* const key, unsigned int& hashVal)

so, findBucketElem takes the param as a const void* const.
Then, findBucketElem calls getHashVal, which has a declaration of:
unsigned int HashXMLCh::getHashVal(const void *const key, unsigned int mod)

No real biggie at that point.
getHashVal calls XMLString::hash, and casts the const void* const key to a
XMLCh*:
return XMLString::hash((XMLCh*)key, mod);

So, something that started out as char nameBuf[nameBufSz + 1]  gets
eventually cast to XMLCh*.
XMLCh is typedef'd:
typedef unsigned short  XMLCh;

As you can guess, this is probably not what is desired - when hash is called
unsigned int XMLString::hash(   const   XMLCh* const    tohash , const
unsigned int    hashModulus)

tohash is expected to be a wide char, but its contents are not.  So, in the
while loop in XMLString::hash:
while (*curCh)
    {
        unsigned int top = hashVal >> 24;
        hashVal += (hashVal * 37) + top + (unsigned int)(*curCh);
        curCh++;
    }

It is quite likely that *curCh will not evaluate to 0 where expected, and
curCh will get incremented to point into uninitialized memory.

I'm wondering if changing the following in the Win32TransService
constructor:
  CPMapEntry* newEntry = new CPMapEntry(nameBuf, CPId, IEId);
  fCPMap->put(nameBuf, newEntry);
to:
  const unsigned int srcLen = strlen(nameBuf);
  const unsigned int targetLen = ::mbstowcs(0, nameBuf, srcLen);
  XMLCh* uniName = new XMLCh[targetLen + 1];
  ::mbstowcs(uniName, nameBuf, srcLen);
  uniName[targetLen] = 0;
  ::wcsupr(uniName);
  CPMapEntry* newEntry = new CPMapEntry(uniName, CPId, IEId);
  fCPMap->put(uniName, newEntry);
  delete[] uniName;

would take care of this?


Erik ©**
Wausau Financial Systems
eschroed@wausaufs.com  --

Thought for the day/week/month/year:*
Disclaimer: The entire physical universe, including this item, may one day
collapse back into an infinitesimally small space. Should another universe
subsequently re-emerge, the existence of this item in that universe cannot
be guaranteed.


Re: Something goofy in Win32TransService::Win32TransService()?

Posted by Joe Polastre <jp...@apache.org>.
Erik,

  This is a good find.  I had to convert alot of the put() functions, and
the format of these functions were just horrible to begin with...

  since you wrote such a long informative email, i figure its only fair to
do the same =)

  when the hashtables and utility functions were originally written, they
were created for specific objects (not templates).  the put function
originally was in the form put(value) with no key identifier!  one knows
from introductory data structures classes that you must always put the key
and value (nothing can be assumed about the value being stored).  So I went
through and replaced the put functions in RefHashTableOf() to take a key and
value, and I further went through to make the hasher dynamic--instead of
only being able to hash strings, you can now hash pointers or whatever you
want as long as you extend the HashBase class.  (you'll see that in the DOM
structure, the userdata is no longer on each node but in a hashtable on the
document, this was the primary motivation for the conversion to dynamic
hashers)  As a result, all the keys are void* in RefHashTableOf

  Anyway, I originally had put(value) just call put(value->getKey(), value)
so that I wouldn't have to change any existing legacy code.  This created
all kinds of problems for AIX and Solaris (go figure).  So something had to
be done.  The only work around we could find was to eliminate the orignal
put(value) function completely and change everything over in the parser to
put(key, value).  In this process, I surely missed some of the keys being
sent to the hashtable correctly (as you pointed out).  This is why.

  You'll notice that CPMapEntry() constructor has one that takes char*
instead of XMLCh*.  It then converts the char* to XMLCh* in the constructor
and stores it as fEncodingName.

  I propose a much simpler change than the one you just proposed, and tell
me your thoughts (it does the same thing, but reuses code more and makes it
a little easier to read)

you wrote:
> constructor:
>   CPMapEntry* newEntry = new CPMapEntry(nameBuf, CPId, IEId);
>   fCPMap->put(nameBuf, newEntry);
> to:
>   const unsigned int srcLen = strlen(nameBuf);
>   const unsigned int targetLen = ::mbstowcs(0, nameBuf, srcLen);
>   XMLCh* uniName = new XMLCh[targetLen + 1];
>   ::mbstowcs(uniName, nameBuf, srcLen);
>   uniName[targetLen] = 0;
>   ::wcsupr(uniName);
>   CPMapEntry* newEntry = new CPMapEntry(uniName, CPId, IEId);
>   fCPMap->put(uniName, newEntry);
>   delete[] uniName;
>

i propose:
constructor:
  CPMapEntry* newEntry = new CPMapEntry(nameBuf, CPId, IEId);
  fCPMap->put(nameBuf, newEntry);
to:
  CPMapEntry* newEntry = new CPMapEntry(nameBuf, CPId, IEId);
  fCPMap->put(newEntry->getEncodingName(), newEntry);

  comments, thoughts?

-Joe Polastre  (jpolast@apache.org)
IBM Cupertino, XML Technology Group


----- Original Message -----
From: "Erik Schroeder" <ES...@Wausaufs.com>
To: <xe...@xml.apache.org>
Sent: Friday, July 21, 2000 9:59 AM
Subject: Something goofy in Win32TransService::Win32TransService()?


> In the constructor for Win32TransService, a variable is declared:
> char nameBuf[nameBufSz + 1];
>
> Further into the constructor, the following code can be executed:
> fCPMap->put(nameBuf, newEntry);
>
> The declaration for put looks like:
> template <class TVal> void RefHashTableOf<TVal>::put(void* key, TVal*
const
> valueToAdopt)
>
> So, we have nameBuf getting passed to a function that takes a void*.
> Then, put calls findBucketElem:
> RefHashTableBucketElem<TVal>* newBucket = findBucketElem(key, hashVal);
>
> findBucketElem has a declaration of:
> template <class TVal> RefHashTableBucketElem<TVal>* RefHashTableOf<TVal>::
> findBucketElem(const void* const key, unsigned int& hashVal)
>
> so, findBucketElem takes the param as a const void* const.
> Then, findBucketElem calls getHashVal, which has a declaration of:
> unsigned int HashXMLCh::getHashVal(const void *const key, unsigned int
mod)
>
> No real biggie at that point.
> getHashVal calls XMLString::hash, and casts the const void* const key to a
> XMLCh*:
> return XMLString::hash((XMLCh*)key, mod);
>
> So, something that started out as char nameBuf[nameBufSz + 1]  gets
> eventually cast to XMLCh*.
> XMLCh is typedef'd:
> typedef unsigned short  XMLCh;
>
> As you can guess, this is probably not what is desired - when hash is
called
> unsigned int XMLString::hash(   const   XMLCh* const    tohash , const
> unsigned int    hashModulus)
>
> tohash is expected to be a wide char, but its contents are not.  So, in
the
> while loop in XMLString::hash:
> while (*curCh)
>     {
>         unsigned int top = hashVal >> 24;
>         hashVal += (hashVal * 37) + top + (unsigned int)(*curCh);
>         curCh++;
>     }
>
> It is quite likely that *curCh will not evaluate to 0 where expected, and
> curCh will get incremented to point into uninitialized memory.
>
> I'm wondering if changing the following in the Win32TransService
> constructor:
>   CPMapEntry* newEntry = new CPMapEntry(nameBuf, CPId, IEId);
>   fCPMap->put(nameBuf, newEntry);
> to:
>   const unsigned int srcLen = strlen(nameBuf);
>   const unsigned int targetLen = ::mbstowcs(0, nameBuf, srcLen);
>   XMLCh* uniName = new XMLCh[targetLen + 1];
>   ::mbstowcs(uniName, nameBuf, srcLen);
>   uniName[targetLen] = 0;
>   ::wcsupr(uniName);
>   CPMapEntry* newEntry = new CPMapEntry(uniName, CPId, IEId);
>   fCPMap->put(uniName, newEntry);
>   delete[] uniName;
>
> would take care of this?
>
>
> Erik ©**
> Wausau Financial Systems
> eschroed@wausaufs.com  --
>
> Thought for the day/week/month/year:*
> Disclaimer: The entire physical universe, including this item, may one day
> collapse back into an infinitesimally small space. Should another universe
> subsequently re-emerge, the existence of this item in that universe cannot
> be guaranteed.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
>
>