Posted to c-users@xerces.apache.org by Boris Kolpackov <bo...@codesynthesis.com> on 2010/02/03 16:07:16 UTC

Re: RefHashTableOfEnumerator returns me elements in different order for different platforms

Hi Gordon,

Gordon Brown <go...@yahoo.com> writes:

> I understand the order of types might not be guaranteed in the sense
> that it might not be the same as presented in the schema itself.
> But is it expected that the same piece of code (complexTypeRegistry)
> might return types in a different order on different platforms?

The hashtable enumeration works this way because the size of the
XMLSize_t type differs between 32-bit and 64-bit platforms, which in
turn leads to different hash values.
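
To illustrate the effect described above, here is a minimal sketch. It
assumes a simple multiplicative string hash; toyHash and bucketOf are
hypothetical names and this is not the actual Xerces-C++ hash function.
It only mimics what happens when the running value is held in a 32-bit
versus a 64-bit unsigned integer before being reduced to a bucket index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative multiplicative string hash, parameterised on the accumulator
// type. NOT the Xerces-C++ hash function; it only mimics the effect of
// holding the running value in a 32-bit versus a 64-bit unsigned integer.
template <typename Acc>
Acc toyHash(const std::string& s)
{
    Acc h = 0;
    for (unsigned char c : s)
        h = static_cast<Acc>(h * 31u + c);   // wraps modulo 2^(bits in Acc)
    return h;
}

// Bucket index for a table with `modulus` buckets (hypothetical helper).
template <typename Acc>
std::size_t bucketOf(const std::string& s, std::size_t modulus)
{
    assert(modulus != 0);
    return static_cast<std::size_t>(toyHash<Acc>(s) % modulus);
}
```

For a short key the two widths agree, because the value never exceeds
32 bits. For a longer key the 32-bit accumulator wraps around while the
64-bit one does not, so bucketOf<std::uint32_t>(name, 29) and
bucketOf<std::uint64_t>(name, 29) can land in different buckets, and an
enumeration that walks the buckets in order then visits the entries in
a different sequence.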

I don't think we guarantee any particular order of enumeration, nor
that it will be the same across all platforms. That said, you can file
an enhancement request[1] for this feature if you believe we should
provide such a guarantee.

[1] http://xerces.apache.org/xerces-c/bug-report.html

Boris


-- 
Boris Kolpackov, Code Synthesis        http://codesynthesis.com/~boris/blog
Open-source XML data binding for C++   http://codesynthesis.com/products/xsd
XML data binding for embedded systems  http://codesynthesis.com/products/xsde
Command line interface to C++ compiler http://codesynthesis.com/projects/cli

RE: RefHashTableOfEnumerator returns me elements in different order for different platforms

Posted by Jesse Pelton <js...@PKC.com>.
If at all possible, you should move away from relying on the ordering produced by a hash.  This reliance is making your code needlessly fragile.  Hashes are not sorting algorithms, which is why there is no ordering guarantee.  In fact, it would be inappropriate to provide such a guarantee, as it would greatly constrain the implementation of improved algorithms.
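
One possible way to follow this advice is to impose an explicit order on whatever the hash table yields, rather than relying on its enumeration order. The sketch below is only an illustration: it uses std::unordered_map as a stand-in for the Xerces-C++ hash table, and sortedKeys is a hypothetical helper name, not part of any library.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Returns the keys of a hash container in sorted order, so the result does
// not depend on the container's hash function or bucket layout.
// std::unordered_map is a stand-in here for the Xerces-C++ hash table.
template <typename Value>
std::vector<std::string> sortedKeys(const std::unordered_map<std::string, Value>& table)
{
    std::vector<std::string> keys;
    keys.reserve(table.size());
    for (const auto& entry : table)
        keys.push_back(entry.first);          // enumeration order is unspecified
    std::sort(keys.begin(), keys.end());      // impose a stable, explicit order
    return keys;
}
```

With the same set of type names, the returned sequence is identical on every platform and library version, whatever order the table itself enumerates them in.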

-----Original Message-----
From: Boris Kolpackov [mailto:boris@codesynthesis.com]
Sent: Thu 2/4/2010 7:13 AM
To: c-users@xerces.apache.org
Subject: Re: RefHashTableOfEnumerator returns me elements in different order for different platforms
 
Hi Gordon,

Gordon Brown <go...@yahoo.com> writes:

> It is clear now that the hash function in XMLString.cpp is re-implemented.

It is not really re-implemented. Rather, we use XMLSize_t instead of
unsigned int to hold the result.


> This is causing big problems.

You are the first person to mention this, so I think it is more accurate
to say it is causing problems for some applications (which, BTW, rely on
something that was never explicitly guaranteed).


> If I bring back the old hash function, will it cause other problems?

If you patch Xerces-C++ to use unsigned int to calculate the hash
value, you will get the old behavior, though the hashing may not be
optimal on 64-bit platforms since only 32 bits will be used.

Boris



Re: RefHashTableOfEnumerator returns me elements in different order for different platforms

Posted by Vitaly Prapirny <ma...@mebius.net>.
Gordon Brown wrote:
> It's true that we really don't care about the order we get for the
> derived types the first time we set up our jobs, but once we have created
> our jobs, we do not expect this order to change in later versions of
> Xerces, nor do we expect the order to change on a different platform.
> I would think this is a reasonable expectation.

No it isn't. Fix your code please.

Good luck!
	Vitaly

Re: RefHashTableOfEnumerator returns me elements in different order for different platforms

Posted by Gordon Brown <go...@yahoo.com>.
Thanks guys.

In our case, we are just enumerating the hashtable created for all the derived types of an abstract type (complexTypeRegistry). It's true that we really don't care about the order we get for the derived types the first time we set up our jobs, but once we have created our jobs, we do not expect this order to change in later versions of Xerces, nor do we expect the order to change on a different platform. I would think this is a reasonable expectation. Also, since we have applications that run on multiple platforms and move jobs around, this is important to us.

Thanks!
Gordon



Re: RefHashTableOfEnumerator returns me elements in different order for different platforms

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi Gordon,

Gordon Brown <go...@yahoo.com> writes:

> It is clear now that the hash function in XMLString.cpp is re-implemented.

It is not really re-implemented. Rather, we use XMLSize_t instead of
unsigned int to hold the result.


> This is causing big problems.

You are the first person to mention this, so I think it is more accurate
to say it is causing problems for some applications (which, BTW, rely on
something that was never explicitly guaranteed).


> If I bring back the old hash function, will it cause other problems?

If you patch Xerces-C++ to use unsigned int to calculate the hash
value, you will get the old behavior, though the hashing may not be
optimal on 64-bit platforms since only 32 bits will be used.

Boris


Re: RefHashTableOfEnumerator returns me elements in different order for different platforms

Posted by Gordon Brown <go...@yahoo.com>.
It is clear now that the hash function in XMLString.cpp is re-implemented. This is causing big problems. On 64-bit Windows and Unix platforms, the jobs that we and our customers created using old versions of Xerces expect the order of types enumerated from the hash table as filled by the old hash function. They won't run any more; they will have to be re-created. Also, a job created on 32-bit Windows won't be able to run on other platforms. If I bring back the old hash function, will it cause other problems?

Thanks much!
