Posted to c-dev@xerces.apache.org by Matthias Jung <ma...@xtradyne.com> on 2003/10/06 12:52:25 UTC

Question to GrammarResolver class

Hello *

I am trying to use schema validation with preparsed schemas.
In a more complex example, I failed to preparse and cache a schema 
that imports an already loaded schema.

Example:
schema B includes schema A
schema C includes schema A

loadGrammar() of schema B fails because the target namespace of schema A is 
already known.

I helped myself by adapting the GrammarResolver::cacheGrammars() function.
The new behaviour is to skip caching a new schema that is already 
known, rather than throwing an exception as in the original code.
With this little change everything (preparsing and validation) works fine.
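
A minimal sketch of the two behaviours, using a toy cache rather than the
real GrammarResolver. The class ToyGrammarCache, the DuplicatePolicy enum and
the idea of keying on a namespace string are assumptions made for
illustration only; they are not the Xerces implementation.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Toy stand-in for a grammar cache keyed by target namespace.
// This is NOT the Xerces GrammarResolver; all names here are hypothetical.
enum class DuplicatePolicy { Throw, Skip };

class ToyGrammarCache {
public:
    explicit ToyGrammarCache(DuplicatePolicy policy) : policy_(policy) {}

    // Returns true if the grammar was added, false if it was skipped.
    bool cache(const std::string& targetNamespace, const std::string& schemaUri) {
        if (grammars_.count(targetNamespace) != 0) {
            if (policy_ == DuplicatePolicy::Throw) {
                // Original behaviour described above: a duplicate is an error.
                throw std::runtime_error("grammar already cached: " + targetNamespace);
            }
            return false;  // Workaround: keep the grammar that is already cached.
        }
        grammars_[targetNamespace] = schemaUri;
        return true;
    }

    std::size_t size() const { return grammars_.size(); }

private:
    DuplicatePolicy policy_;
    std::map<std::string, std::string> grammars_;
};
```

With DuplicatePolicy::Skip a second cache() call for the same namespace
returns false and leaves the first entry in place; with
DuplicatePolicy::Throw the same call raises std::runtime_error.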

Here are my questions:

- Are there any reasons why GrammarResolver throws an exception if it
   is asked to add a Grammar that is already in the cache? (Maybe I
   don't see other impacts yet.)

- Wouldn't this change be useful for all Xerces users?

- In my eyes an additional merging functionality would be useful, one
   that updates and enhances the definitions of an already cached grammar.
   Example: files a.xsd and b.xsd define elements of the same target
   namespace. It would be useful if the internal grammar representation
   were a union of files a and b. What do you think?


Regards

	Matthias



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Question to GrammarResolver class

Posted by Gareth Reakes <ga...@decisionsoft.com>.
> Yes, it should be sufficient: for instance,
> - you call loadGrammar(D,true). This puts D into the grammar pool.
> - you then call loadGrammar(A,true). This would normally parse the grammars 
> A, B, C and D. When you try to cache these grammars, you get an exception 
> because D is already in the pool. After the change in the code, loadGrammar 
> would still load A, B and C, but reuse D from the grammar pool, and the 
> grammar pool will contain A, B, C and D (which was already there).

Excellent. I will see if this is portable to xerces-j.

Cheers Alby,

Gareth


-- 
Gareth Reakes, Head of Product Development  +44-1865-203192
DecisionSoft Limited                        http://www.decisionsoft.com
XML Development and Services






Re: Question to GrammarResolver class

Posted by Alberto Massari <am...@progress.com>.
At 16.43 28/10/2003 +0000, Gareth Reakes wrote:
>Hey Alby,
>
>         Do you think this will be sufficient? The problem I am thinking
>about is this: We have a top level schema A that includes B, C and D. B
>uses some components from C. We want C to be a legal schema in its own
>right. To do this it imports A (we could import C, but both are legal).
>So you now have the problem that the grammars in the URI are not the same.
>We need to amalgamate them.

Yes, it should be sufficient: for instance,
- you call loadGrammar(D,true). This puts D into the grammar pool.
- you then call loadGrammar(A,true). This would normally parse the grammars 
A, B, C and D. When you try to cache these grammars, you get an exception 
because D is already in the pool. After the change in the code, loadGrammar 
would still load A, B and C, but reuse D from the grammar pool, and the 
grammar pool will contain A, B, C and D (which was already there).

The only constraint you will always have is the one Xerces already has when 
parsing single schemas: one schema namespace must be represented by one 
schema URI (so the schema for namespace D must be the same in all the 
above steps).
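
The two steps above can be walked through with a small simulation. This is
only a sketch: ToyPool and its loadGrammar signature are hypothetical and
model nothing more than "which grammars get parsed, and what ends up in the
pool". It is not the Xerces scanner code.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model of the scenario: schema A pulls in B, C and D.
// ToyPool and this loadGrammar signature are hypothetical, not the Xerces API.
struct ToyPool {
    std::set<std::string> cached;   // grammars currently in the pool
    int parseCount = 0;             // how many grammars were actually parsed

    // `deps` lists the grammars reachable from `root` (root excluded).
    // With reuseCached == true, grammars already in the pool are not re-parsed.
    void loadGrammar(const std::string& root,
                     const std::vector<std::string>& deps,
                     bool reuseCached) {
        std::vector<std::string> all{root};
        all.insert(all.end(), deps.begin(), deps.end());

        std::vector<std::string> parsed;
        for (const std::string& g : all) {
            if (reuseCached && cached.count(g) != 0) {
                continue;  // reuse the pooled grammar instead of parsing again
            }
            parsed.push_back(g);
            ++parseCount;
        }
        // Caching step: a duplicate is an error, as in the original behaviour.
        for (const std::string& g : parsed) {
            if (!cached.insert(g).second) {
                throw std::runtime_error("grammar already in pool: " + g);
            }
        }
    }
};
```

Calling loadGrammar("D", {}, true) and then loadGrammar("A", {"B", "C", "D"},
true) leaves four grammars in the pool with D parsed only once; the same
sequence with reuseCached set to false throws on the second call.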

Alberto

>         I have a bad cold today :( so I hope this makes sense. By all
>means ask me for more information if this is not clear.
>
>Gareth
>





Re: Question to GrammarResolver class

Posted by Gareth Reakes <ga...@decisionsoft.com>.
Hey Alby,

	Do you think this will be sufficient? The problem I am thinking 
about is this: We have a top level schema A that includes B, C and D. B 
uses some components from C. We want C to be a legal schema in its own 
right. To do this it imports A (we could import C, but both are legal).  
So you now have the problem that the grammars in the URI are not the same. 
We need to amalgamate them. 
	I have a bad cold today :( so I hope this makes sense. By all 
means ask me for more information if this is not clear.

Gareth

On Mon, 27 Oct 2003, Alberto Massari wrote:

> Gareth,
> it seems to me that we could fix this problem by changing the following 
> line (inside IGXMLScanner::loadGrammar and SGXMLScanner::loadGrammar) from
> 
>          fGrammarResolver->useCachedGrammarInParse(false);
> 
> to
> 
>          fGrammarResolver->useCachedGrammarInParse(toCache);
> 
> In fact, if we are going to cache the grammar after loading it, we should 
> re-use the grammars already found in the pool, or the exception will occur 
> (or, if we decide to silently skip the caching of a duplicate grammar, an 
> inconsistent set of grammars will be cached).
> I am inclined to check in this change, unless Khaled has something to say 
> about it.
> 
> Alberto
> 

-- 
Gareth Reakes, Head of Product Development  +44-1865-203192
DecisionSoft Limited                        http://www.decisionsoft.com
XML Development and Services






Re: Question to GrammarResolver class

Posted by Alberto Massari <am...@progress.com>.
Gareth,
it seems to me that we could fix this problem by changing the following 
line (inside IGXMLScanner::loadGrammar and SGXMLScanner::loadGrammar) from

         fGrammarResolver->useCachedGrammarInParse(false);

to

         fGrammarResolver->useCachedGrammarInParse(toCache);

In fact, if we are going to cache the grammar after loading it, we should 
re-use the grammars already found in the pool, or the exception will occur 
(or, if we decide to silently skip the caching of a duplicate grammar, an 
inconsistent set of grammars will be cached).
I am inclined to check in this change, unless Khaled has something to say 
about it.
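
One possible reading of the "inconsistent set" concern, sketched with toy
objects: if D is parsed again while loading A, A refers to the fresh copy of
D, yet a cache that silently skips duplicates keeps the old D. ToyGrammar,
Pool, loadA and isConsistent are hypothetical names, and modelling an import
as a single shared pointer is an assumption made only for this illustration.

```cpp
#include <map>
#include <memory>
#include <string>

// Toy grammar object: identified by namespace, may reference one imported grammar.
// All names are hypothetical; this only models object identity, not Xerces internals.
struct ToyGrammar {
    std::string ns;
    std::shared_ptr<ToyGrammar> imported;  // grammar this one refers to, if any
};

using Pool = std::map<std::string, std::shared_ptr<ToyGrammar>>;

// Load A (which imports D) and cache with "skip duplicates" semantics.
// If reuseCached is false, D is parsed again into a fresh object.
std::shared_ptr<ToyGrammar> loadA(Pool& pool, bool reuseCached) {
    std::shared_ptr<ToyGrammar> d;
    auto it = pool.find("D");
    if (reuseCached && it != pool.end()) {
        d = it->second;                      // reuse the pooled D
    } else {
        d = std::make_shared<ToyGrammar>();  // fresh copy of D
        d->ns = "D";
    }
    auto a = std::make_shared<ToyGrammar>();
    a->ns = "A";
    a->imported = d;
    pool.emplace("A", a);  // emplace leaves an existing entry untouched,
    pool.emplace("D", d);  // so a duplicate D is silently skipped
    return a;
}

// True if the D that A refers to is the very object held in the pool.
bool isConsistent(const Pool& pool) {
    return pool.at("A")->imported == pool.at("D");
}
```

With D already pooled, loadA(pool, false) leaves A pointing at a D that is
not the pooled one, while loadA(pool, true) keeps both in agreement.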

Alberto

At 13.35 07/10/2003 +0100, Gareth Reakes wrote:
>Hi,
>
> > Example:
> > schema B includes schema A
> > schema C includes schema A
> >
> > loadGrammar() of schema B fails because the target namespace of schema A is
> > already known.
>
>I have also come across this issue, albeit in a slightly more
>convoluted way.
>
> >
> > I helped myself by adapting the GrammarResolver::cacheGrammars() function.
> > The new behaviour is to skip caching a new schema that is already
> > known, rather than throwing an exception as in the original code.
> > With this little change everything (preparsing and validation) works fine.
>
>As did I, although this has some clear problems if you have a hierarchy of
>schemas and are including from parts of it.
>
> >
> > Here are my questions:
> >
> > - Are there any reasons why GrammarResolver throws an exception if it
> >    is asked to add a Grammar that is already in the cache? (Maybe I
> >    don't see other impacts yet.)
>
>Yes, it relates to your suggestion below. We currently do not have
>the ability to test for equality of the components. The schema spec has
>this to say:
>
>NOTE: The above is carefully worded so that multiple <include>ing of the
>same schema document will not constitute a violation of clause 2 of Schema
>Properties Correct (§3.15.6), but applications are allowed, indeed
>encouraged, to avoid <include>ing the same schema document more than once
>to forestall the necessity of establishing identity component by
>component.
>
>
>So I think that we are currently incorrect by not allowing this. In fact,
>one of our products in development requires this functionality. Xerces-J
>has a problem with it as well.
>
>
> >
> > - Wouldn't this change be useful for all Xerces users?
> >
> > - In my eyes an additional merging functionality would be useful, one
> >    that updates and enhances the definitions of an already cached grammar.
> >    Example: files a.xsd and b.xsd define elements of the same target
> >    namespace. It would be useful if the internal grammar representation
> >    were a union of files a and b. What do you think?
>
>I think so as well, but we would have to do some work to be able to
>establish identity of the components.
>
>
>Gareth
>
>
>--
>Gareth Reakes, Head of Product Development  +44-1865-203192
>DecisionSoft Limited                        http://www.decisionsoft.com
>XML Development and Services
>
>
>
>


Re: Question to GrammarResolver class

Posted by Gareth Reakes <ga...@decisionsoft.com>.
Hi,

> Example:
> schema B includes schema A
> schema C includes schema A
> 
> loadGrammar() of schema B fails because the target namespace of schema A is 
> already known.

I have also come across this issue, albeit in a slightly more 
convoluted way.

> 
> I helped myself by adapting the GrammarResolver::cacheGrammars() function.
> The new behaviour is to skip caching a new schema that is already 
> known, rather than throwing an exception as in the original code.
> With this little change everything (preparsing and validation) works fine.

As did I, although this has some clear problems if you have a hierarchy of 
schemas and are including from parts of it.

> 
> Here are my questions:
> 
> - Are there any reasons why GrammarResolver throws an exception if it
>    is asked to add a Grammar that is already in the cache? (Maybe I
>    don't see other impacts yet.)

Yes, it relates to your suggestion below. We currently do not have 
the ability to test for equality of the components. The schema spec has 
this to say:

NOTE: The above is carefully worded so that multiple <include>ing of the 
same schema document will not constitute a violation of clause 2 of Schema 
Properties Correct (§3.15.6), but applications are allowed, indeed 
encouraged, to avoid <include>ing the same schema document more than once 
to forestall the necessity of establishing identity component by 
component.


So I think that we are currently incorrect by not allowing this. In fact, 
one of our products in development requires this functionality. Xerces-J 
has a problem with it as well. 


> 
> - Wouldn't this change be useful for all Xerces users?
> 
> - In my eyes an additional merging functionality would be useful, one
>    that updates and enhances the definitions of an already cached grammar.
>    Example: files a.xsd and b.xsd define elements of the same target
>    namespace. It would be useful if the internal grammar representation
>    were a union of files a and b. What do you think?

I think so as well, but we would have to do some work to be able to 
establish identity of the components.
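
To illustrate why component identity matters for such a merge, here is a
rough sketch over a deliberately simplified model. Components and mergeInto
are hypothetical names, and reducing a schema component to a name/type pair
of strings is an assumption; real components would need a much deeper
comparison.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy component table: element name -> type name, for one target namespace.
// Hypothetical model; real schema components are far richer than a string.
using Components = std::map<std::string, std::string>;

// Merge `extra` into `cached`. Components with the same name must be identical;
// names that clash with a different definition are returned as conflicts.
std::vector<std::string> mergeInto(Components& cached, const Components& extra) {
    std::vector<std::string> conflicts;
    for (const auto& entry : extra) {
        auto it = cached.find(entry.first);
        if (it == cached.end()) {
            cached.insert(entry);              // new component: add to the union
        } else if (it->second != entry.second) {
            conflicts.push_back(entry.first);  // same name, different definition
        }                                      // identical definition: nothing to do
    }
    return conflicts;
}
```

Merging two tables that agree on every shared name yields their union and no
conflicts; a shared name with a different definition is reported and the
cached definition is left untouched.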


Gareth


-- 
Gareth Reakes, Head of Product Development  +44-1865-203192
DecisionSoft Limited                        http://www.decisionsoft.com
XML Development and Services



