Posted to c-dev@xerces.apache.org by Carolina Garcia-Paris <cg...@bbn.com> on 2004/09/14 23:30:26 UTC

Validating against DTD in memory (Xerces C++ 2.5.0)

Hello there,

We recently upgraded our libraries from Xerces C++ 1.7.0 to Xerces C++
2.5.0 and updated our C++ code to comply with the new parser API.
However, we are still trying to resolve an issue with validating our VXML
files against a DTD held in memory. We are using the XercesDOMParser
object. Our VXML files don't include a DOCTYPE line; we rely on the
in-memory DTD grammar to validate them.

NOTE: our DTD file does not contain explicit ENTITY references, just ELEMENT
and ATTLIST entries.

Prior to the upgrade, our code for loading the DTD and then parsing the
VXML files looked like this:

OLD CODE (compliant with Xerces 1.7.0)
--------

DOMParser VxmlParser;
std::string inputdtd; // contains the DTD text
. . .
vxmlparser::vxmlparser(void)
{
    . . .
    // Turn the Voice XML grammar DTD into a memory buffer input source
    const XMLByte* dtdsource =
        reinterpret_cast<const XMLByte*>(inputdtd.c_str());
    // Create DTD input source
    MemBufInputSource dtdsrc(dtdsource, inputdtd.length(), "VoiceXML");

    // Initialize the DOM parsing object to reuse the VXML grammar DTD
    VxmlParser.setValidationScheme(DOMParser::Val_Always);
    VxmlParser.setErrorHandler(this);

    // Load the VXML grammar DTD. Our DTD is an internal subset
    VxmlParser.parse(dtdsrc);
}

void vxmlparser::parse(const char* filename)
{
    VxmlParser.reset();
    // parse the given VXML file, reusing the VXML grammar DTD
    VxmlParser.parse(filename, true);
    . . .
}

After the upgrade to Xerces 2.5.0, we migrated the parse code as shown
below. However, the VxmlParser->parse() call does not validate the VXML
file against the DTD in memory. The loadGrammar() method does load and
pre-parse the DTD grammar, but the grammar is not used afterwards. The only
way we have found for the VxmlParser->parse() call to succeed is to include
an external DOCTYPE reference in the file (see examples below).

Can anyone shed some light on what's missing/incorrect in the code below?


NEW CODE
--------

XercesDOMParser *VxmlParser;
std::string inputdtd;
. . .
// constructor - loads the VXML grammar DTD
vxmlparser::vxmlparser(void)
{
    . . .
    // Turn the Voice XML grammar DTD into a memory buffer input source
    const XMLByte* dtdsource =
        reinterpret_cast<const XMLByte*>(inputdtd.c_str());
    // Create DTD input source
    MemBufInputSource dtdsrc(dtdsource, inputdtd.length(), "VoiceXML");

    // Initialize the DOM parsing object to reuse the VXML grammar DTD
    VxmlParser = new XercesDOMParser();
    VxmlParser->setValidationScheme(XercesDOMParser::Val_Always);

    VxmlParser->setErrorHandler(this);

    VxmlParser->cacheGrammarFromParse(true); // may be redundant
    // Load VXML DTD grammar and cache it for re-use
    VxmlParser->loadGrammar(dtdsrc, Grammar::DTDGrammarType, true);
    VxmlParser->useCachedGrammarInParse(true);
}

// Method to parse the VXML files against VXML grammar DTD
void vxmlparser::parse(const char* filename)
{
  VxmlParser->resetDocumentPool();
  VxmlParser->parse(filename);
  VxmlParser->resetCachedGrammarPool();
}

--------------------------------------------------

Sample DTD file (person.dtd):
----------------------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<!ELEMENT name (#PCDATA)>
<!ELEMENT person (name)>

Sample XML file that doesn't work:
---------------------------------
<?xml version="1.0"?>
<person>
<name>patrick</name>
</person>

The parser reports this error:
Error (parsing can continue) at file person.grxml, line 4, char 6 --
Message: Unknown element 'name'



Sample XML file that works (with explicit DOCTYPE reference):
------------------------------------------------------------
<?xml version="1.0"?>
<!DOCTYPE person SYSTEM "person.dtd">
<person>
<name>patrick</name>
</person>
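
For files that cannot be edited on disk, one conceivable variant of the
working sample above is to add the DOCTYPE line to the document text in
memory before handing it to the parser. The sketch below does only the
string manipulation. addDoctype is a hypothetical helper, not part of
Xerces, and whether the resulting system ID matches a grammar cached by
loadGrammar() is an assumption that is not verified here.

```cpp
#include <string>

// Hypothetical helper (not a Xerces API): returns a copy of `xml` with a
// DOCTYPE declaration for `rootElement` / `systemId` inserted after the XML
// declaration, or at the very start if there is no XML declaration.
// Text that already contains "<!DOCTYPE" is returned unchanged. The check is
// a plain substring search, so a "<!DOCTYPE" inside a comment also counts.
std::string addDoctype(const std::string& xml,
                       const std::string& rootElement,
                       const std::string& systemId)
{
    if (xml.find("<!DOCTYPE") != std::string::npos) {
        return xml;
    }

    const std::string doctype =
        "<!DOCTYPE " + rootElement + " SYSTEM \"" + systemId + "\">\n";

    std::string::size_type insertAt = 0;
    if (xml.compare(0, 5, "<?xml") == 0) {
        const std::string::size_type declEnd = xml.find("?>");
        if (declEnd != std::string::npos) {
            insertAt = declEnd + 2;
            // Keep the XML declaration on its own line when it already is.
            if (insertAt < xml.size() && xml[insertAt] == '\n') {
                ++insertAt;
            }
        }
    }

    std::string result = xml;
    result.insert(insertAt, doctype);
    return result;
}
```

The returned string could then be wrapped in a MemBufInputSource in the
same way as the DTD text.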



Thanks!!

--Carolina


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


RE: Validating against DTD in memory (Xerces C++ 2.5.0)

Posted by Carolina Garcia-Paris <cg...@bbn.com>.
Hi Alberto!

Thanks SO MUCH for this information, I will try it right now and will let
you know how it works. I hadn't seen this information in any Xerces
documentation (at least that I can remember).

Best regards,

--Carolina


-----Original Message-----
From: Alberto Massari [mailto:amassari@progress.com]
Sent: Wednesday, September 15, 2004 5:58 PM
To: xerces-c-dev@xml.apache.org
Subject: RE: Validating against DTD in memory (Xerces C++ 2.5.0)


Hi Carolina,

At 17.05 15/09/2004 -0400, Carolina Garcia-Paris wrote:
>I posted this message yesterday but haven't gotten a response yet.
>
>The bottomline is... has anyone gotten XercesDOMParser::loadGrammar() to
>work for loading a DTD grammar in memory and then use it to parse XML files
>against it? See my code below. Any hints/recommendations will be greatly
>appreciated.

The problem is that DTD grammars are cached under their system ID, but your
XML files don't specify one, so the cached grammar cannot be looked up. The
only solution is to use a custom XMLValidator.
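
The lookup failure described above can be pictured with a small toy model.
This is not Xerces code: ToyGrammar, ToyDocument and ToyGrammarCache are
hypothetical names, and the model assumes only what is stated above, namely
that the cache key is the system ID.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a pre-parsed DTD; not a Xerces type.
struct ToyGrammar {
    std::string rootElement;
};

// Hypothetical stand-in for an XML file being parsed.
struct ToyDocument {
    bool hasDoctype;       // false when the file has no DOCTYPE line
    std::string systemId;  // only meaningful when hasDoctype is true
};

// Toy cache keyed by system ID.
class ToyGrammarCache {
public:
    void put(const std::string& systemId, ToyGrammar grammar) {
        grammars_[systemId] = std::move(grammar);
    }

    // A document without a DOCTYPE supplies no key, so nothing can be found,
    // no matter what was cached beforehand.
    const ToyGrammar* find(const ToyDocument& doc) const {
        if (!doc.hasDoctype) {
            return nullptr;
        }
        std::map<std::string, ToyGrammar>::const_iterator it =
            grammars_.find(doc.systemId);
        return it == grammars_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, ToyGrammar> grammars_;
};
```

In this model, a grammar stored under "VoiceXML" is found for a document
whose DOCTYPE names that system ID and is not found for a document with no
DOCTYPE at all.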

Try starting with this code

#include <xercesc/validators/common/Grammar.hpp>
#include <xercesc/validators/DTD/DTDValidator.hpp>

...
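// dtdsource and inputdtd are set up as in your constructor
DOMParser dtdParser;
MemBufInputSource dtdsrc(dtdsource, inputdtd.length(), "VoiceXML");
// Pre-parse the DTD; dtdParser caches and owns the resulting grammar,
// so keep dtdParser alive for as long as VxmlParser is in use
Grammar* pDTD = dtdParser.loadGrammar(dtdsrc, Grammar::DTDGrammarType, true);
// Heap-allocate the validator: the parser constructor takes a
// "valToAdopt" pointer and frees it when the parser is destroyed
DTDValidator* dtdValidator = new DTDValidator;
dtdValidator->setGrammar(pDTD);

DOMParser VxmlParser(dtdValidator);
VxmlParser.setValidationScheme(DOMParser::Val_Always);
VxmlParser.parse(...);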

Hope this helps,
Alberto




RE: Validating against DTD in memory (Xerces C++ 2.5.0)

Posted by Carolina Garcia-Paris <cg...@bbn.com>.
I posted this message yesterday but haven't gotten a response yet.

The bottom line is... has anyone gotten XercesDOMParser::loadGrammar() to
work for loading a DTD grammar in memory and then used it to validate XML
files against it? See the code in my original message. Any
hints/recommendations will be greatly appreciated.

Thanks,

--Carolina

