Posted to c-users@xerces.apache.org by Johan Ditmar <jo...@yahoo.co.uk> on 2008/08/04 12:24:43 UTC

Dynamically loading schemas whilst parsing

Hi,

I am using Xerces-C++ to parse an XML document into a hierarchical object structure. There are many different types of objects in this structure, each corresponding to different XML, and I would like to use schema validation.

As I need to be able to parse any object structure, I can't use a fixed schema that I know beforehand. Instead I would like to parse the XML, and when I encounter the start of an object, I would like to dynamically load the schema that is unique for that object and continue parsing and validating the child elements of that object.

For example, my XML could look as follows:

...
<parent type="ObjA">
  <child1 type="ObjB">
    <name>Some Name</name>
  </child1>
  <child2 type="ObjC">
    <value>Some Value</value>
  </child2>
</parent>


Here, when I encounter ObjA during parsing, I want to load the schema of ObjA, which validates that there are two sub-elements, child1 and child2. The same should then be repeated for ObjB and ObjC.

The only way I can see this work at the moment is by parsing the structure twice: once without validation to find all the objects and assemble the schema, and once with validation. But I was wondering whether this can be done in one pass. Any ideas?

Johan





      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html

Re: Dynamically loading schemas whilst parsing

Posted by Alberto Massari <am...@datadirect.com>.
Johan Ditmar wrote:
> Thank you Alberto. I thought about this, but if you look at my XML structure (in which element names are instance names and object types are attributes), I can't write a global schema enumerating all the objects (as they are discriminated based on attribute rather than element name). Or can I? I'd rather not modify the XML structure.
>   

I hadn't looked at the sample XML; now that I have read it, I see that
my approach is not usable as it stands.
BTW, this format can cause you trouble; be very careful when building
the schema or, even better, preprocess the XML into a shape for which my
approach works.
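
One possible way to do that preprocessing is a small XSLT 1.0 stylesheet
that renames each element to the value of its "type" attribute, so that
object types become element names a schema can discriminate on. This is
only a sketch: the "instance" attribute used to keep the original element
name is a hypothetical choice, and it assumes every "type" value is a
valid XML element name.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Any element carrying a "type" attribute is re-emitted under the
       name given by that attribute, e.g. <parent type="ObjA"> becomes
       <ObjA instance="parent">. -->
  <xsl:template match="*[@type]">
    <xsl:element name="{@type}">
      <!-- "instance" is a hypothetical attribute name that preserves
           the original element (instance) name. -->
      <xsl:attribute name="instance">
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <!-- Copy the remaining attributes and process the children. -->
      <xsl:apply-templates select="@*[name() != 'type'] | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Identity rule: everything else is copied unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

A schema for the transformed document would then also have to declare
the "instance" attribute on each object element.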

Alberto

> Johan
>
>
> --- On Mon, 4/8/08, Alberto Massari <am...@datadirect.com> wrote:
> From: Alberto Massari <am...@datadirect.com>
> Subject: Re: Dynamically loading schemas whilst parsing
> To: c-users@xerces.apache.org
> Date: Monday, 4 August, 2008, 11:39 AM
>
> You cannot dynamically populate a schema while parsing (also because 
> when you get the notification of an element, that element has already 
> been validated); I would just build a big schema with the definitions 
> for all the possible objects, with a definition for the root element 
> that allows any number of them (e.g. <xs:any namespace="##local" 
> maxOccurs="unbounded"/>)
>
> Alberto


Re: Dynamically loading schemas whilst parsing

Posted by Johan Ditmar <jo...@yahoo.co.uk>.
Thank you Alberto. I thought about this, but if you look at my XML structure (in which element names are instance names and object types are attributes), I can't write a global schema enumerating all the objects (as they are discriminated based on attribute rather than element name). Or can I? I'd rather not modify the XML structure.

Johan


--- On Mon, 4/8/08, Alberto Massari <am...@datadirect.com> wrote:
From: Alberto Massari <am...@datadirect.com>
Subject: Re: Dynamically loading schemas whilst parsing
To: c-users@xerces.apache.org
Date: Monday, 4 August, 2008, 11:39 AM

You cannot dynamically populate a schema while parsing (also because 
when you get the notification of an element, that element has already 
been validated); I would just build a big schema with the definitions 
for all the possible objects, with a definition for the root element 
that allows any number of them (e.g. <xs:any namespace="##local" 
maxOccurs="unbounded"/>)

Alberto


Re: Dynamically loading schemas whilst parsing

Posted by Alberto Massari <am...@datadirect.com>.
You cannot dynamically populate a schema while parsing (also because 
when you get the notification of an element, that element has already 
been validated); I would just build a big schema with the definitions 
for all the possible objects, with a definition for the root element 
that allows any number of them (e.g. <xs:any namespace="##local" 
maxOccurs="unbounded"/>)
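
As a rough illustration of such a single schema, it might look like the
sketch below. Only the xs:any wildcard comes from the suggestion above;
the root element name "objects" and the content models of ObjA, ObjB
and ObjC are assumptions made for illustration. The sketch also assumes
that each object type appears as an element name, since XML Schema 1.0
selects a declaration by element name and not by attribute value.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical root element: accepts any number of no-namespace
       elements, each of which must match a global declaration below. -->
  <xs:element name="objects">
    <xs:complexType>
      <xs:sequence>
        <xs:any namespace="##local" processContents="strict"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- One global declaration per object type (assumed content). -->
  <xs:element name="ObjA">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="ObjB"/>
        <xs:element ref="ObjC"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="ObjB">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="ObjC">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```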

Alberto
