Posted to c-dev@xerces.apache.org by Curt Arnold <ca...@houston.rr.com> on 2001/10/24 08:13:26 UTC

Re: Xerces-C 1.5.2 status update ...]

Sean's patch looks good to me.  Basically, it fixes loadXML which parses an
XML document passed as a string.  Originally, UTF-16 was set as the encoding
which caused Xerces to look for a byte order mark that would not be present
in a string (but would in a file).  Setting UTF-16LE explicitly sets the
byte order.
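
To illustrate why the generic "UTF-16" label depends on a byte order mark, here is a minimal sketch of BOM detection on a raw byte buffer. It is not the Xerces-C code; the names Utf16Order and detectUtf16Bom are hypothetical and exist only for this example. When no BOM is present, the byte order cannot be read from the data and has to come from somewhere else, such as an explicit "UTF-16LE" label.

```cpp
#include <cstddef>
#include <cstdint>

// Byte order implied by the first two bytes of a UTF-16 buffer.
enum class Utf16Order { LittleEndian, BigEndian, NoBom };

// Hypothetical helper (not part of Xerces-C): looks only at the first
// two bytes and reports which BOM, if any, starts the buffer.
Utf16Order detectUtf16Bom(const std::uint8_t* data, std::size_t size) {
    if (size >= 2) {
        // U+FEFF serialized little-endian is FF FE.
        if (data[0] == 0xFF && data[1] == 0xFE) return Utf16Order::LittleEndian;
        // U+FEFF serialized big-endian is FE FF.
        if (data[0] == 0xFE && data[1] == 0xFF) return Utf16Order::BigEndian;
    }
    // No BOM: the caller must supply the byte order explicitly.
    return Utf16Order::NoBom;
}
```

A buffer copied straight out of a BSTR would usually fall into the NoBom case, which is the situation the patch addresses by naming the byte order up front.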

There are other issues, however:

xml4com.h and xml4com_i.c are generated files (from xml4com.idl) and should
not be in CVS.

xml4com.idl attempts to set the version of the type library to 1.5.2 when
only the major.minor format is allowed.  This will cause a build failure.
http://msdn.microsoft.com/library/en-us/midl/mi-laref_1df2.asp?frame=true
Attached patch file.

Will have to review the CLSID changes to see if they conflict with earlier
versions of Xerces-COM.

IXMLDOMDocument.load() does not absolutize relative paths before passing them
to Xerces-C.  I thought that I had addressed this earlier; if not, I know I
have code at work that does it.  load() would also suffer from the same
problem with BSTRs passed by reference that was mentioned last week.  Will
prepare a patch tomorrow.
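
As a rough idea of what absolutizing means here, the sketch below resolves a relative file system path against a base directory. It is an assumption-laden illustration, not the code referred to above: it uses C++17 std::filesystem (which postdates this thread), the helper name absolutize is hypothetical, and it handles plain paths only, not URLs.

```cpp
#include <filesystem>

// Hypothetical helper (not part of Xerces-C or Xerces-COM): returns the
// path unchanged if it is already absolute, otherwise resolves it against
// the given base directory and collapses "." and ".." components.
std::filesystem::path absolutize(const std::filesystem::path& p,
                                 const std::filesystem::path& base) {
    if (p.is_absolute()) {
        return p;
    }
    return (base / p).lexically_normal();
}
```

A caller would typically pass std::filesystem::current_path() as the base, so that a relative name is resolved before the parser ever sees it.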

Will check the BSTR ByRef patch from last week at the same time.

Re: Xerces-C 1.5.2 status update ...]

Posted by Dean Roddey <dr...@charmedquark.com>.
> > Don't assume a BOM *wouldn't* be in a string, it could well be and it's
> > perfectly legal to do so. You should be able to deal with it either way.
>
> I've never encountered any use of a BOM in a BSTR.  Would you ever expect
> one in a java.lang.String?

It's not illegal to have them, so you should be prepared to deal with them.
The most common case where one might occur is when someone reads a file
into a string and feeds it back into the parser. It might never happen in a
Java string, but could easily enough happen in C++ if they read data into
a string. I've seen some hard-coded XML strings in Cpp files which had the
BOM in them.
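
One way to "deal with it either way" is to drop a leading BOM from the in-memory string before handing it to the parser with an explicit byte order. The following is only a sketch under that assumption; stripLeadingBom and kBom are hypothetical names, and the thread does not say how Xerces-C itself treats a BOM once UTF-16LE has been set.

```cpp
#include <string>

// U+FEFF acts as the byte order mark when it is the first code unit.
constexpr char16_t kBom = u'\uFEFF';

// Hypothetical helper (not part of Xerces-C): returns the text with a
// single leading BOM removed, or the text unchanged if there is none.
std::u16string stripLeadingBom(const std::u16string& text) {
    if (!text.empty() && text.front() == kBom) {
        return text.substr(1);
    }
    return text;
}
```

With this in place, a string read from a file (BOM included) and a string built in memory (no BOM) both reach the parser in the same form.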

--------------------------
Dean Roddey
The Charmed Quark Controller
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"If it don't have a control port, don't buy it!"




---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Xerces-C 1.5.2 status update ...]

Posted by Curt Arnold <ca...@houston.rr.com>.
> Don't assume a BOM *wouldn't* be in a string, it could well be and it's
> perfectly legal to do so. You should be able to deal with it either way.

I've never encountered any use of a BOM in a BSTR.  Would you ever expect
one in a java.lang.String?

As it has been, the method would always fail.  Now it might fail in a very
artificial case.  I don't see a justification to keep the patch out, though
a perfect patch could cover the artificial case.

There is a problem (maybe it is just Win XP) where relative URIs are not
being resolved.  Also, I have code that will eliminate the dependency on
WININET and URLMON when parsing local files.  Way too late tonight, working
on the W3C/NIST DOM stuff, maybe tomorrow.




Re: Xerces-C 1.5.2 status update ...]

Posted by Dean Roddey <dr...@charmedquark.com>.
> Sean's patch looks good to me.  Basically, it fixes loadXML which parses an
> XML document passed as a string.  Originally, UTF-16 was set as the encoding
> which caused Xerces to look for a byte order mark that would not be present
> in a string (but would in a file).  Setting UTF-16LE explicitly sets the
> byte order.

Don't assume a BOM *wouldn't* be in a string. It could well be, and it's
perfectly legal for it to be there. You should be able to deal with it
either way.

--------------------------
Dean Roddey
The Charmed Quark Controller
Charmed Quark Software
droddey@charmedquark.com
http://www.charmedquark.com

"If it don't have a control port, don't buy it!"


