Posted to j-dev@xerces.apache.org by Ias <ia...@apache-korea.org> on 2003/05/10 08:34:27 UTC

[PATCH] Allowing a parser to ignore element order during dtd based validation

Some users of XML parsers want to validate loosely, i.e. to ignore the
element order imposed by a DTD. For example, here is an excerpt from
person.dtd:

<!ELEMENT person (name,email*,url*,link?)>

and a possible fragment of person.xml:

  <person id="ias">
    <name><given>Changshin</given><family>Lee</family></name>
    <url href="http://www.iasandcb.pe.kr"/>
    <email>iasandcb@apache-korea.org</email>
  </person>

If you parse and validate this document with the following snippet,

		// DOMParser: org.apache.xerces.parsers; InputSource: org.xml.sax
		DOMParser parser = new DOMParser();
		parser.setFeature("http://xml.org/sax/features/validation", true);
		parser.parse(new InputSource("personal.xml"));

the result is 

[Error] personal.xml:45:12: The content of element type "person" must match
"(name,email*,url*,link?)".

The attached patches let the parser accept an XML document whose element
order does not comply with its DTD. Only the order is relaxed; all other
aspects, such as which elements must exist and the data format, must still
abide by the DTD. The new feature
"http://apache.org/xml/features/validation/ignore-element-order"
enables this, as follows:

		DOMParser parser = new DOMParser();
		parser.setFeature("http://xml.org/sax/features/validation", true);
		// Feature added by the attached patches.
		parser.setFeature("http://apache.org/xml/features/validation/ignore-element-order", true);
		parser.parse(new InputSource("personal.xml"));

The result is no error.
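
To make the intended semantics concrete, here is a minimal standalone
sketch of one plausible reading of "ignore order, keep everything else":
count the children of an element and check each count against the
occurrence range the content model allows. It is not the code in the
attached patches, which work inside the Xerces DTD validator. The class
UnorderedContentCheck, the PERSON_RULES table and matchesIgnoringOrder are
hypothetical names, and the table is written by hand from
(name,email*,url*,link?) rather than derived from the DTD.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class UnorderedContentCheck {
    // Allowed {min, max} occurrences per child name; max < 0 means unbounded.
    // Assumption: hand-written from (name,email*,url*,link?), not read from a DTD.
    static final Map<String, int[]> PERSON_RULES = new LinkedHashMap<>();
    static {
        PERSON_RULES.put("name", new int[] {1, 1});
        PERSON_RULES.put("email", new int[] {0, -1});
        PERSON_RULES.put("url", new int[] {0, -1});
        PERSON_RULES.put("link", new int[] {0, 1});
    }

    // Returns true if the element children of 'parent' satisfy the rules in any order.
    // Text nodes are skipped; a real validator would also check them.
    static boolean matchesIgnoringOrder(Element parent, Map<String, int[]> rules) {
        Map<String, Integer> counts = new HashMap<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) continue;
            String name = n.getNodeName();
            if (!rules.containsKey(name)) return false; // child not in the content model
            counts.merge(name, 1, Integer::sum);
        }
        for (Map.Entry<String, int[]> rule : rules.entrySet()) {
            int seen = counts.getOrDefault(rule.getKey(), 0);
            int min = rule.getValue()[0];
            int max = rule.getValue()[1];
            if (seen < min || (max >= 0 && seen > max)) return false;
        }
        return true;
    }

    // Parses an XML string without validation and returns its root element.
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element person = parse(
                "<person id='ias'><name/><url href='x'/><email>a@b</email></person>");
        System.out.println(matchesIgnoringOrder(person, PERSON_RULES));
    }
}
```

With this check, url before email is accepted, while a person without a
name or with two link children is still rejected.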

I hope these patches are helpful.

Ias.

===========================================================
Lee, Changshin (Korean name)
Ias (International name)
               Company Web Site: http://www.tmax.co.kr
               Personal Web Site: http://www.iasandcb.pe.kr
---------------------------------------------------------
Senior Researcher, Emerging Technology Evangelist & JCP Activities
Coordinator 
JCP member - http://jcp.org/en/participation/members/L
R&D Institute
Tmax Soft, Inc. 
JCP member - http://jcp.org/en/participation/members/T
==========================================================

RE: [PATCH] Allowing a parser to ignore element order during dtd based validation

Posted by Arnaud Le Hors <le...@us.ibm.com>.
It sounds like the DTD being used in the case you're describing doesn't
actually reflect the real constraints that are meant to be imposed. If the
order in which elements appear doesn't matter, the DTD shouldn't impose one.

But besides this, you do not address the main point I raised: this can be
dealt with at the application level.
--
Arnaud  Le Hors - IBM, XML Standards Strategy Group / W3C AC Rep.


> -----Original Message-----
> From: Ias [mailto:iasandcb@apache-korea.org]
> Sent: Saturday, May 17, 2003 1:53 PM
> To: xerces-j-dev@xml.apache.org
> Cc: 김동은
> Subject: RE: [PATCH] Allowing a parser to ignore element order
> during dtd based validation


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org


RE: [PATCH] Allowing a parser to ignore element order during dtd based validation

Posted by Ias <ia...@apache-korea.org>.
Let me explain the rationale behind these patches. Many users (who are
also customers) have been using XML documents to configure many
applications. XML arrived so suddenly, and while still so immature, that
these users have treated it as just another text format: they write XML
configurations with only the meaning in mind and ignore the accompanying
format definition, such as a DTD. The most common example is that they
know which elements they should write but pay no attention to the order
of the elements.

From the perspective of application developers (and providers), the way
such users write XML documents is a pain in the neck, but we have no
choice but to cope with it. The problem, however, is that applications
still need to verify that the elements required for configuration are
present, regardless of the order defined by a DTD. That is why we cannot
solve the order-ignored-XML problem by simply skipping the validation
process.

These days we have good XML editors such as XMLSpy, but in the past people
simply typed XML documents in text editors, and validation prior to
deployment was desired but difficult ("inconvenient" might be the
right word for the situation). We probably should have told people to
follow DTDs, but that proved too strict in practice: in many cases they
did not do so, for reasons such as the absence of XML tools and a lack of
understanding of XML concepts.

Fortunately, XML Schema has the "all" group, which allows XML writers to
arrange elements freely, while "sequence" still carries over the ordering
of DTDs. From now on, if you want your application's users to be able to
ignore element order, you can do so with XML Schema. Optionally, you can
recommend XML editors so that users validate their work well enough to
pass the actual deployment process.
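
As a rough illustration of the "all" versus "sequence" difference, the
following sketch validates the same out-of-order document against two
otherwise identical schemas, using the standard javax.xml.validation API.
The class AllGroupDemo and the schema itself are made up for this example.
Note that XML Schema 1.0 limits each child of an "all" group to at most one
occurrence, so email* and url* from the DTD are simplified to optional
single elements here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class AllGroupDemo {
    // Builds a tiny schema for <person>; 'group' is either "all" or "sequence".
    // Assumption: a simplified content model, not a translation of person.dtd.
    static String schema(String group) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='person'><xs:complexType><xs:" + group + ">"
             + "<xs:element name='name' type='xs:string'/>"
             + "<xs:element name='email' type='xs:string' minOccurs='0'/>"
             + "<xs:element name='url' type='xs:string' minOccurs='0'/>"
             + "</xs:" + group + "></xs:complexType></xs:element></xs:schema>";
    }

    // Returns true if 'xml' is valid against the schema built with the given group.
    static boolean isValid(String group, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema s = factory.newSchema(new StreamSource(new StringReader(schema(group))));
        try {
            // With no ErrorHandler set, the validator throws on the first error.
            s.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><url>http://example.org</url>"
                   + "<name>Ias</name><email>a@b</email></person>";
        System.out.println("all: " + isValid("all", xml));
        System.out.println("sequence: " + isValid("sequence", xml));
    }
}
```

The document puts url before name, so it should pass under "all" and fail
under "sequence".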

I sincerely understand and agree with your concern about how many of
these "out-of-specification" features we should support. What I'd like
to propose here is to embrace the already-existing element-order-ignored
XML documents with as much validation as possible, just as XML Schema
offers now. I don't think this support is one of the main features of
Xerces DTD validation, since it defaults to "false" and you turn it on
only when it is requested. I also don't imagine this support will
encourage XML authors to ignore the element order described by a DTD in
the future, because XML Schema will take the place of DTD and XML editors
fully support XML Schema for ordinary users as well as professionals.

Some of my opinions are inspired by Dongeun Kim, who develops
configuration tools for a web application server (WAS). She has dealt
with a lot of XML in her day-to-day work on WAS, so I'd
appreciate her comments on this matter.

Thanks,

Ias.

-----Original Message-----
From: Arnaud Le Hors [mailto:lehors@us.ibm.com] 
Sent: Wednesday, May 14, 2003 6:10 AM
To: xerces-j-dev@xml.apache.org
Subject: RE: [PATCH] Allowing a parser to ignore element order during
dtd based validation



RE: [PATCH] Allowing a parser to ignore element order during dtd based validation

Posted by Arnaud Le Hors <le...@us.ibm.com>.
I'm sorry but I don't think it'd be a good idea to apply this patch.

First of all, given that validation errors are not fatal, the application
can simply ignore the ones it wants to ignore. I don't see much value in
adding a flag to have the parser ignore a specific error on behalf of the
application. And how many such flags/behaviors are we going to support?
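
A minimal sketch of what handling this at the application level might
look like, using the standard SAX ErrorHandler interface with the JDK's
JAXP parser: validation errors are collected rather than thrown, and the
application decides which ones to disregard. The class
SelectiveErrorHandler and its helper methods are hypothetical. Matching on
the message text is an assumption made for illustration only; the wording
is parser- and locale-specific. Such an error also does not say by itself
whether only the order was wrong or a required element was missing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class SelectiveErrorHandler implements ErrorHandler {
    final List<SAXParseException> kept = new ArrayList<>();
    final List<SAXParseException> ignored = new ArrayList<>();

    // Assumption: content-model violations are recognized by message text,
    // which is fragile and depends on the parser and its locale.
    static boolean isContentModelError(SAXParseException e) {
        String message = e.getMessage();
        return message != null && message.contains("The content of element type");
    }

    @Override
    public void warning(SAXParseException e) {
        kept.add(e);
    }

    @Override
    public void error(SAXParseException e) {
        // Validation errors are recoverable: record them and let parsing continue.
        (isContentModelError(e) ? ignored : kept).add(e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        throw e; // well-formedness errors cannot be ignored
    }

    // Parses 'xml' with DTD validation on and returns the handler with its findings.
    static SelectiveErrorHandler validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        SelectiveErrorHandler handler = new SelectiveErrorHandler();
        builder.setErrorHandler(handler);
        builder.parse(new InputSource(new StringReader(xml)));
        return handler;
    }
}
```

After validate(...) returns, the application would inspect the kept list
and treat the document as acceptable if it is empty.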

Second of all, it creates a new kind of behavior which doesn't match any
compliance level defined by the XML spec. I'm afraid that multiplying the
number of non compliant behaviors will ultimately lessen the value of XML.
--
Arnaud  Le Hors - IBM, XML Standards Strategy Group / W3C AC Rep.


> -----Original Message-----
> From: Elena Litani [mailto:elitani@ca.ibm.com]
> Sent: Tuesday, May 13, 2003 4:42 PM
> To: xerces-j-dev@xml.apache.org
> Subject: Re: [PATCH] Allowing a parser to ignore element order during
> dtd based validation


Re: [PATCH] Allowing a parser to ignore element order during dtd based validation

Posted by Elena Litani <el...@ca.ibm.com>.
Hi Ias,
Thank you for the patch!

Ias wrote:
> 
> Some users of XML parsers want to validate loosely, i.e. ignore element
> order from DTD. 

Before applying this patch, I would like to investigate whether this new
feature will affect parser performance.
I will let you know in a couple of weeks.

Thanks,
-- 
Elena Litani / IBM Toronto
