Posted to c-users@xerces.apache.org by Suneel Suresh <co...@gmail.com> on 2008/12/14 20:19:39 UTC

xerces c++ getElementById returns NULL

Code below returns null
DOMElement *domElement = parserDom->getDocument()->getElementById(
XMLString::transcode("75"));

parserDom is initialized as
   parserDom = new XercesDOMParser();
    parserDom->setValidationScheme(XercesDOMParser::Val_Always);
    parserDom->setDoNamespaces(false);
    //parserDom->setExternalSchemaLocation("http://www.w3schools.comrules.xsd");
    parserDom->setDoSchema(true);
    parserDom->setValidationSchemaFullChecking(true);
-------------------------------------------------
xml snippet below
......
<Server xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="
http://www.w3schools.com rules.xsd">
<AnElement id="75">
</Server>
----------------------------------
The xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="
http://www.w3schools.com" xmlns="http://www.w3schools.com"
elementFormDefault="unqualified">
<xs:element name="AnElement ">
 <xs:complexType>
   <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
 </xs:complexType>
</xs:element>
</xs:schema>
----------------------------------

I am using VC9 on WinXP, with xerces-c-3.0.0-x86-windows-vc-9.0.
It's really getting on my nerves that this simple API call is not working.
On top of that, the Xerces tutorials are nonexistent.
Can anyone shine a light in this tunnel?

Mr Su

Re: xerces c++ getElementById returns NULL

Posted by Suneel Suresh <co...@gmail.com>.

OK, the problem is solved. It seems it was related to using namespaces
versus not using them. I am posting the code so that others who need a
fast way to set up XSD validation in Xerces can simply copy it without
wasting time on the nonexistent documentation.

------------------------ CODE --------------------------------------------
    XercesDOMParser* parser = new XercesDOMParser();

    parser->setValidationScheme(XercesDOMParser::Val_Always);
    parser->setValidationSchemaFullChecking(true);
    // This must be true, even if you're using xsi:noNamespaceSchemaLocation.
    parser->setDoNamespaces(true);
    parser->setDoSchema(true);

    ErrorHandler* errHandler = (ErrorHandler*) new HandlerBase();
    parser->setErrorHandler(errHandler);

    const char* xmlFile = "test.xml";

    try
    {
        parser->parse(xmlFile);
        // transcode() allocates its result; release each buffer when done.
        XMLCh* id = XMLString::transcode("one");
        DOMElement* domElement = parser->getDocument()->getElementById(id);
        XMLString::release(&id);
        if (domElement != 0)
        {
            char* name = XMLString::transcode(domElement->getNodeName());
            cout << name << endl;
            XMLString::release(&name);
        }
        .......

--------------- rules.xsd ------------------------------------------------

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
<xsd:element name="AnElement">
 <xsd:complexType>
   <xsd:attribute name="id" type="xsd:ID" use="required"/>
 </xsd:complexType>
</xsd:element>
</xsd:schema>

--------------- test.xml -------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<Server xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="rules.xsd" >
    <AnElement id="one" >
        <data>dfdfd</data>
        <value>dfdfd</value>
</AnElement>
</Server>

Re: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jan Suchý <zu...@post.cz>.

I have filed a bug report in Jira:
https://issues.apache.org/jira/browse/XERCESC-1846
I found this problem with Xerces 3.0 and ICU 4.0, but I guess it also affects Xerces 2.8 and ICU 3.8, since the symptoms are there.
Thanks to the Xerces team for fixing this.
Jan Suchy

> ------------ Original message ------------
> From: David Bertoni <db...@apache.org>
> Subject: Re: xerces/ICU unicode alias for weak encoding when
> serializing/converting to CP
> Date: 16.12.2008 22:38:25
> ----------------------------------------
> Jan Suchý wrote:
> > Hi Jesse,
> > thank you for your answer and ideas.
> > I have found one kind of solution: patching the transcoder wrapper class in
> > src\xercesc\util\Transcoders\ICU\ICUTransService.cpp
> > 
> > by adding these lines to the ICUTranscoder::ICUTranscoder constructor:
> > 
> > 	UErrorCode uerr = U_ZERO_ERROR;
> > 	ucnv_setSubstChars(toAdopt, "?", 1, &uerr);
> > ...
> > 
> > Then the "?" character is used as the replacement character when using ICU.
> > This is an ICU-specific solution and it is not clean, because it requires
> > rebuilding the Xerces library. I would like to see some switch on the
> > XMLFormatter class, but at that point it is not known which ICU UConverter
> > will be used, because nothing indicates which transcoder will be called later.
> Please create a Jira issue because this is a bug.  We should not let the 
> ICU use a replacement character that we know will result in a document 
> that's not well-formed.
> 
> Dave

Re: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by David Bertoni <db...@apache.org>.

Jan Suchý wrote:
> Hi Jesse,
> thank you for your answer and ideas.
> I have found one kind of solution: patching the transcoder wrapper class in
> src\xercesc\util\Transcoders\ICU\ICUTransService.cpp
> 
> by adding these lines to the ICUTranscoder::ICUTranscoder constructor:
> 
> 	UErrorCode uerr = U_ZERO_ERROR;
> 	ucnv_setSubstChars(toAdopt, "?", 1, &uerr);
> ...
> 
> Then the "?" character is used as the replacement character when using ICU.
> This is an ICU-specific solution and it is not clean, because it requires rebuilding the Xerces library. I would like to see some switch on the XMLFormatter class, but at that point it is not known which ICU UConverter will be used, because nothing indicates which transcoder will be called later.
Please create a Jira issue because this is a bug.  We should not let the 
ICU use a replacement character that we know will result in a document 
that's not well-formed.

Dave
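
To make the well-formedness point concrete, here is a minimal sketch of the XML 1.0 "Char" production as a plain C++ predicate. The function name isXml10Char is made up for illustration and is not part of Xerces or ICU; the ranges are the ones given in the XML 1.0 recommendation.

```cpp
#include <cstdint>

// Returns true if the code point matches the XML 1.0 "Char" production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
// Hypothetical helper for illustration only; not a Xerces or ICU API.
bool isXml10Char(std::uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

With this check, 0x1A (the substitute character reported in this thread) is rejected, while "?" (0x3F) and U+2013 itself are accepted.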

RE: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jan Suchý <zu...@post.cz>.

Hi Jesse,
thank you for your answer and ideas.
I have found one kind of solution: patching the transcoder wrapper class in
src\xercesc\util\Transcoders\ICU\ICUTransService.cpp

by adding these lines to the ICUTranscoder::ICUTranscoder constructor:

	UErrorCode uerr = U_ZERO_ERROR;
	ucnv_setSubstChars(toAdopt, "?", 1, &uerr);
...

Then the "?" character is used as the replacement character when using ICU.
This is an ICU-specific solution and it is not clean, because it requires rebuilding the Xerces library. I would like to see some switch on the XMLFormatter class, but at that point it is not known which ICU UConverter will be used, because nothing indicates which transcoder will be called later.

Not optimal, but it works.
Thank you,
Jan


> ------------ Original message ------------
> From: Jesse Pelton <js...@PKC.com>
> Subject: RE: xerces/ICU unicode alias for weak encoding when
> serializing/converting to CP
> Date: 16.12.2008 15:31:01
> ----------------------------------------
> I'm not an expert on this area, but the transcoders included with Xerces do not
> provide any way to specify the replacement character, and ICU may be the same.
> Even if ICU gives you a way to do so, I'm not sure how you'd get access to a
> transcoder instance to alter.
> 
> Note, though, that 0x1A is not a legal character in an XML document. (Oracle's
> parser is correct in rejecting it.) I think it's safe to assume in your scenario
> that any such characters in a serialized document are replacements for
> unrepresentable characters. You should therefore be able to post-process the
> serialization output and replace 0x1A with one or more characters of your
> choosing. If you don't want to post-process the whole document, you could derive
> an XMLFormatTarget that replaces the replacement character in each chunk of data
> handed to it. Neither option is exactly elegant, but I'd probably do the latter;
> it'll work regardless of your format target, where the former approach requires
> serializing to memory.
> 
> 
> -----Original Message-----
> From: Jan Suchý [mailto:zuchy@post.cz] 
> Sent: Tuesday, December 16, 2008 5:37 AM
> To: c-users@xerces.apache.org
> Subject: RE: xerces/ICU unicode alias for weak encoding when
> serializing/converting to CP
> 
> Hello again,
> I have tried using the XMLFormatter class:
> 
> http://xerces.apache.org/xerces-c/apiDocs-2/classXMLFormatter.html#_details
> 
> with the flags NoEscapes and UnRep_Replace,
> 
> and the problematic character was replaced by:
> ^Z
> 
> But this still does not let the Oracle DB XML parser parse the XML.
> I get this error:
> 
> ORA-31011: XML parsing failed
> ORA-19202: Error occurred in XML processing
> LPX-00216: invalid character 26 (0x1A)
> Error at line 22
> 
> I would like to replace the unrepresentable character with a character of my
> own choosing that is parseable (for example "?" or "_").
> How can I change the default replacement character?
> 
> Thanks to anybody for any ideas.
> 
> Have a nice day,
> Jan
> 
> 
> > ------------ Original message ------------
> > From: Jan Suchý <zu...@post.cz>
> > Subject: RE: xerces/ICU unicode alias for weak encoding when
> > serializing/converting to CP
> > Date: 16.12.2008 09:35:40
> > ----------------------------------------
> > Hello Jesse,
> > thank you for your answer :-) it seems to be promising. I'll look at it.
> > Jan
> > 
> > 
> > > ------------ Original message ------------
> > > From: Jesse Pelton <js...@PKC.com>
> > > Subject: RE: xerces/ICU unicode alias for weak encoding when
> > > serializing/converting to CP
> > > Date: 15.12.2008 18:15:49
> > > ----------------------------------------
> > > The constructors for the Xerces XMLFormatter object all take an UnRepFlags
> > > argument that allows you to specify how to handle unrepresentable
> > > characters.  So does XMLFormatter::formatBuf().  It appears that the
> > > transcoder gets to decide what character to replace unrepresentable
> > > characters with.
> > > 
> > > Hope that helps.
> > > 
> > > -----Original Message-----
> > > From: Jan Suchý [mailto:zuchy@post.cz] 
> > > Sent: Monday, December 15, 2008 4:25 AM
> > > To: c-users@xerces.apache.org
> > > Subject: xerces/ICU unicode alias for weak encoding when
> > > serializing/converting to CP
> > > 
> > > Hello all,
> > > I need to produce output XML in iso-8859-2 encoding.
> > > I am using UTF-8 as the input encoding.
> > > The UTF-8 XML contains a character that is not representable in
> > > iso-8859-2.
> > > I am using ICU 3.8, Xerces 2.8 and XQilla svn 702.
> > > 
> > > After serializing the XML to iso-8859-2, the problematic character is
> > > serialized by ICU/Xerces/XQilla as:
> > > 
> > > &#x2013;
> > > 
> > > The problem is that if I send a message in iso-8859-2 containing the
> > > character &#x2013; to Oracle DB, the Oracle parser does not like this
> > > character and I get this error:
> > > 
> > > ORA-31011: XML parsing failed, LPX-00217: invalid character 8211 (U+2013)
> > > 
> > > So what I am looking for is a way to tell ICU, Xerces or XQilla that such
> > > a Unicode character must not be included in the result and must instead be
> > > replaced, for example by the character "?", so that the Oracle parser
> > > never has to process it.
> > > 
> > > I would like to find a clean solution, such as telling ICU not to call the
> > > callback function, or defining my own alias or behavior for this
> > > situation. Is it possible?
> > > Any ideas?
> > > Thank you
> > > Jan Suchy

RE: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jesse Pelton <js...@PKC.com>.

I'm not an expert on this area, but the transcoders included with Xerces do not provide any way to specify the replacement character, and ICU may be the same. Even if ICU gives you a way to do so, I'm not sure how you'd get access to a transcoder instance to alter.

Note, though, that 0x1A is not a legal character in an XML document. (Oracle's parser is correct in rejecting it.) I think it's safe to assume in your scenario that any such characters in a serialized document are replacements for unrepresentable characters. You should therefore be able to post-process the serialization output and replace 0x1A with one or more characters of your choosing. If you don't want to post-process the whole document, you could derive an XMLFormatTarget that replaces the replacement character in each chunk of data handed to it. Neither option is exactly elegant, but I'd probably do the latter; it'll work regardless of your format target, where the former approach requires serializing to memory.
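
The per-chunk replacement described above could look roughly like the sketch below. It is deliberately written against plain C++ so it stands alone: replaceSubstitute is a hypothetical helper, not a Xerces function, and in a real program its body would sit inside the writeChars override of a class derived from XMLFormatTarget, forwarding the cleaned chunk to the actual target. It also assumes a single-byte output encoding such as iso-8859-2, where a 0x1A byte can only be the substitute character and never part of a longer sequence.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: copies one chunk of serialized bytes and replaces every
// 0x1A byte (the substitute character seen in the output) with `replacement`.
// Assumes a single-byte target encoding such as iso-8859-2.
std::string replaceSubstitute(const unsigned char* chunk, std::size_t count,
                              char replacement = '?')
{
    std::string out(reinterpret_cast<const char*>(chunk), count);
    std::replace(out.begin(), out.end(), '\x1A', replacement);
    return out;
}
```

Because each chunk is handled independently, this works whether the output ends up in memory, in a file, or elsewhere, which is the advantage of the format-target approach over post-processing the whole document.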

RE: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jan Suchý <zu...@post.cz>.

Hello again,
I have tried using the XMLFormatter class:

http://xerces.apache.org/xerces-c/apiDocs-2/classXMLFormatter.html#_details

with the flags NoEscapes and UnRep_Replace,

and the problematic character was replaced by:
^Z

But this still does not let the Oracle DB XML parser parse the XML. I get this error:

ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00216: invalid character 26 (0x1A)
Error at line 22

I would like to replace the unrepresentable character with a character of my own choosing that is parseable (for example "?" or "_").
How can I change the default replacement character?

Thanks to anybody for any ideas.

Have a nice day,
Jan


RE: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jan Suchý <zu...@post.cz>.

Hello Jesse,
thank you for your answer :-) it seems to be promising. I'll look at it.
Jan


RE: xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jesse Pelton <js...@PKC.com>.

The constructors for the Xerces XMLFormatter object all take an UnRepFlags argument that allows you to specify how to handle unrepresentable characters.  So does XMLFormatter::formatBuf().  It appears that the transcoder gets to decide what character to replace unrepresentable characters with.

Hope that helps.
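
As a rough illustration of what the three unrepresentable-character policies mean, the sketch below mimics them in plain C++ without using Xerces. The enum values are named after XMLFormatter's UnRep_Fail, UnRep_CharRef and UnRep_Replace, but encodeWithPolicy is a made-up function, and the "fits in one byte" test is only a placeholder assumption: a real transcoder consults the target encoding's table, and, as noted above, it is the transcoder rather than the caller that picks the replacement character.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Stand-ins for the three policies; names mirror XMLFormatter::UnRepFlags.
enum class UnRep { Fail, CharRef, Replace };

// Hypothetical sketch. Assumption: a code point counts as "representable"
// if it fits in one byte; this is NOT the real iso-8859-2 mapping.
std::string encodeWithPolicy(const std::u32string& text, UnRep policy,
                             char replacement = '?')
{
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x100) {
            out.push_back(static_cast<char>(cp));    // representable: copy
        } else if (policy == UnRep::CharRef) {
            char buf[16];                            // e.g. "&#x2013;"
            std::snprintf(buf, sizeof buf, "&#x%X;", static_cast<unsigned>(cp));
            out += buf;
        } else if (policy == UnRep::Replace) {
            out.push_back(replacement);              // substitute character
        } else {
            throw std::runtime_error("unrepresentable character");
        }
    }
    return out;
}
```

For the en dash U+2013 from the original question, the CharRef policy yields the `&#x2013;` reference, and the Replace policy yields whatever single replacement character is in effect.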

xerces/ICU unicode alias for weak encoding when serializing/converting to CP

Posted by Jan Suchý <zu...@post.cz>.

Hello all,
I need to produce output XML in iso-8859-2 encoding.
I am using UTF-8 as the input encoding.
The UTF-8 XML contains a character that is not representable in iso-8859-2.
I am using ICU 3.8, Xerces 2.8 and XQilla svn 702.

After serializing the XML to iso-8859-2, the problematic character is serialized by ICU/Xerces/XQilla as:

&#x2013;

The problem is that if I send a message in iso-8859-2 containing the character &#x2013; to Oracle DB, the Oracle parser does not like this character and I get this error:

ORA-31011: XML parsing failed, LPX-00217: invalid character 8211 (U+2013)

So what I am looking for is a way to tell ICU, Xerces or XQilla that such a Unicode character must not be included in the result and must instead be replaced, for example by the character "?", so that the Oracle parser never has to process it.

I would like to find a clean solution, such as telling ICU not to call the callback function, or defining my own alias or behavior for this situation. Is it possible?
Any ideas?
Thank you
Jan Suchy

Re: xerces c++ getElementById returns NULL

Posted by Suneel Suresh <co...@gmail.com>.

I am in a bit of a hurry, so I could use some help. I made the changes,
but getElementById still returns NULL :(

----------------CODE-------------------------
XercesDOMParser* parser = new XercesDOMParser();
    parser->setValidationScheme(XercesDOMParser::Val_Always);
    parser->setValidationSchemaFullChecking(true);
    parser->setDoNamespaces(true);    // optional
    parser->setDoSchema(true);

    ErrorHandler* errHandler = (ErrorHandler*) new HandlerBase();
    parser->setErrorHandler(errHandler);

    char* xmlFile = "test.xml";

    try
    {
        parser->parse(xmlFile);
        cout << parser->getDocument()->getElementById(XMLString::transcode("one")) << endl;
       .......
------------------XSD-----------
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com" elementFormDefault="unqualified">
<xs:element name="AnElement">
 <xs:complexType>
   <xs:attribute name="xml:id" type="xs:ID" use="required"/>
 </xs:complexType>
</xs:element>
</xs:schema>

----------------XML---------
<?xml version="1.0" encoding="UTF-8"?>
<Server xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com rules.xsd" >
    <AnElement xml:id="one" >
        <data>dfdfd</data>
        <value>dfdfd</value>
</AnElement>
</Server>
-------------------------

What am I doing wrong now?

On Mon, Dec 15, 2008 at 2:00 AM, David Bertoni <db...@apache.org> wrote:
> Suneel Suresh wrote:
>>
>> Code below returns null
>> DOMElement *domElement = parserDom->getDocument()->getElementById(
>> XMLString::transcode("75"));
>>
>> parserDom is initialized as
>>   parserDom = new XercesDOMParser();
>>    parserDom->setValidationScheme(XercesDOMParser::Val_Always);
>>    parserDom->setDoNamespaces(false);
>>
>>  //parserDom->setExternalSchemaLocation("http://www.w3schools.comrules.xsd");
>>    parserDom->setDoSchema(true);
>>    parserDom->setValidationSchemaFullChecking(true);
>> -------------------------------------------------
>> xml snippet below
>> ......
>> <Server xmlns="http://www.w3schools.com"
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="
>> http://www.w3schools.com rules.xsd">
>> <AnElement id="75">
>> </Server>
>
> "75" is not a valid ID in XML, because IDs must match the NCName grammar
> production:
>
> http://www.w3.org/TR/xmlschema-2/#NCName
>
>
>> ----------------------------------
>> The xsd
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="
>> http://www.w3schools.com" xmlns="http://www.w3schools.com"
>> elementFormDefault="unqualified">
>> <xs:element name="AnElement ">
>>  <xs:complexType>
>>   <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
>>  </xs:complexType>
>> </xs:element>
>> </xs:schema>
>> ----------------------------------
>>
>> I am using VC9 on WinXP, with xerces-c-3.0.0-x86-windows-vc-9.0.
>> It's really getting on my nerves that this simple API call is not working.
>> On top of that, the Xerces tutorials are nonexistent.
>
> There are many books and tutorials for XML available, so there's no need for
> a Xerces-specific tutorial to explain how ID attributes work in XML.
>
> Just giving an attribute the name "id" is not enough to make it an ID:
>
> http://www.w3.org/TR/xmlschema-2/#ID
>
> If you were to actually declare the attribute as type ID, you would get a
> validation error:
>
> <xs:element name="AnElement">
>  <xs:complexType>
>    <xs:attribute name="id" type="xs:ID" use="required"/>
>  </xs:complexType>
> </xs:element>
>
>> Can anyone shine a light in this tunnel?
>
> You can help save your nerves by understanding the technology you're using.
>
> Dave
>

Re: xerces c++ getElementById returns NULL

Posted by David Bertoni <db...@apache.org>.

Suneel Suresh wrote:
> Code below returns null
> DOMElement *domElement = parserDom->getDocument()->getElementById(
> XMLString::transcode("75"));
> 
> parserDom is initialized as
>    parserDom = new XercesDOMParser();
>     parserDom->setValidationScheme(XercesDOMParser::Val_Always);
>     parserDom->setDoNamespaces(false);
>     //parserDom->setExternalSchemaLocation("http://www.w3schools.comrules.xsd");
>     parserDom->setDoSchema(true);
>     parserDom->setValidationSchemaFullChecking(true);
> -------------------------------------------------
> xml snippet below
> ......
> <Server xmlns="http://www.w3schools.com"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="
> http://www.w3schools.com rules.xsd">
> <AnElement id="75">
> </Server>
"75" is not a valid ID in XML, because IDs must match the NCName grammar 
production:

http://www.w3.org/TR/xmlschema-2/#NCName


> ----------------------------------
> The xsd
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="
> http://www.w3schools.com" xmlns="http://www.w3schools.com"
> elementFormDefault="unqualified">
> <xs:element name="AnElement ">
>  <xs:complexType>
>    <xs:attribute name="id" type="xs:NMTOKEN" use="required"/>
>  </xs:complexType>
> </xs:element>
> </xs:schema>
> ----------------------------------
> 
> I am using VC9 on WinXP, with xerces-c-3.0.0-x86-windows-vc-9.0.
> It's really getting on my nerves that this simple API call is not working.
> On top of that, the Xerces tutorials are nonexistent.
There are many books and tutorials for XML available, so there's no need 
for a Xerces-specific tutorial to explain how ID attributes work in XML.

Just giving an attribute the name "id" is not enough to make it an ID:

http://www.w3.org/TR/xmlschema-2/#ID

If you were to actually declare the attribute as type ID, you would get 
a validation error:

<xs:element name="AnElement">
   <xs:complexType>
     <xs:attribute name="id" type="xs:ID" use="required"/>
   </xs:complexType>
</xs:element>

> Can anyone shine a light in this tunnel?
You can help save your nerves by understanding the technology you're using.

Dave
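
To see why "75" fails where "one" would pass, here is a small sketch of an NCName check in plain C++. It is an ASCII-only approximation with a made-up name (looksLikeNCName), not the validator Xerces uses: the real NCName production also admits many non-ASCII letters and combining characters, which this sketch rejects.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// ASCII-only approximation of the NCName production (hypothetical helper):
// the first character must be a letter or '_'; the rest may be letters,
// digits, '.', '-' or '_'. A colon is never allowed in an NCName.
bool looksLikeNCName(const std::string& s)
{
    if (s.empty()) return false;
    auto uc = [](char c) { return static_cast<unsigned char>(c); };
    if (!std::isalpha(uc(s[0])) && s[0] != '_') return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        const char c = s[i];
        if (!std::isalnum(uc(c)) && c != '.' && c != '-' && c != '_') return false;
    }
    return true;
}
```

Under this check "75" is rejected because it starts with a digit, while "one" and "_75" are accepted, so prefixing numeric keys with a letter or underscore is one way to make them usable as xs:ID values.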