Posted to dev@tuscany.apache.org by "Yang ZHONG (JIRA)" <tu...@ws.apache.org> on 2007/02/02 20:47:05 UTC

[jira] Commented: (TUSCANY-1088) SDO should tolerate malformed XML

    [ https://issues.apache.org/jira/browse/TUSCANY-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469844 ] 

Yang ZHONG commented on TUSCANY-1088:
-------------------------------------

Thanks to Ole Ersoy for pointing out the EMF solution: https://bugs.eclipse.org/bugs/show_bug.cgi?id=166127

More requirements have since been requested, such as tolerance for attribute form in addition to element form, and tolerance of a prefix on names that should have been unqualified.


To summarize the form tolerance discussions on the thread, there are two types: element and attribute. Each type has two categories: missing prefix and extra prefix. Each category has several tolerance levels. I've asked EMF to accommodate them, for example by providing an (XMLResource) option and supporting a (System) property for its default value.


The form of an element or an attribute is either qualified or unqualified.

For an element/attribute of qualified form, the prefix may be missing. For one of unqualified form, an extra prefix may be used. It's user-friendly to optionally tolerate both mistakes. The option level can range from zero tolerance to a high one, and the default value can differ per deployment.

Here are 3 scenarios for qualified form. The missing-prefix examples illustrate elements; however, they also apply to attributes.

Missing 3-1.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
			<input1>...</input1>	<!-- should have been tns:input1 -->
		</tns:sayHello>
Tolerance level 1 LaxForm_NO_DefaultNS_NoGlobal(1): tolerate only if there is no global element "input1" without a NameSpace.
Tolerance level 2 LaxForm_NO_DefaultNS(2): tolerate whether or not there is a global element "input1".

Missing 3-2.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="differentNameSpace">
			<onPurpose:input1>...</onPurpose:input1>	<!-- should have been tns:input1 -->
		</tns:sayHello>
No tolerance IMHO.

Missing 3-3.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns="differentNameSpace">	<!-- xmlns= puts all unprefixed elements under "differentNameSpace"; per the XML Namespaces spec it does not apply to unprefixed attributes -->
			<input1>...</input1>	<!-- should have been tns:input1 -->
		</tns:sayHello>
Tolerance level 3 LaxForm_DefaultNS_NoGlobal(3): tolerate only if there is no global element "input1" under "differentNameSpace".
Tolerance level 4 LaxForm_DefaultNS(4): tolerate whether or not there is a global element "input1".
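
To make the three missing-prefix cases concrete, here is a minimal sketch of what a namespace-aware parser reports for "input1" in each of them. It uses only the standard JAXP/DOM APIs; the class and method names (MissingPrefixDemo, childNamespace) are made up for illustration and are not part of SDO or EMF.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class MissingPrefixDemo {
    /** Namespace URI a namespace-aware DOM parser assigns to the first child element (null = no namespace). */
    static String childNamespace(String xml) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP factories are not namespace-aware by default
            root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return n.getNamespaceURI();
            }
        }
        throw new IllegalArgumentException("root has no child element");
    }

    public static void main(String[] args) {
        String tns = "xmlns:tns=\"http://QuickTest/HelloWorld\"";
        // Missing 3-1: unprefixed child, no default namespace in scope
        System.out.println(childNamespace(
                "<tns:sayHello " + tns + "><input1>x</input1></tns:sayHello>"));
        // Missing 3-2: child explicitly bound to another namespace
        System.out.println(childNamespace(
                "<tns:sayHello " + tns + " xmlns:onPurpose=\"differentNameSpace\">"
                + "<onPurpose:input1>x</onPurpose:input1></tns:sayHello>"));
        // Missing 3-3: unprefixed child picks up the default namespace
        System.out.println(childNamespace(
                "<tns:sayHello " + tns + " xmlns=\"differentNameSpace\">"
                + "<input1>x</input1></tns:sayHello>"));
    }
}
```

In 3-1 the child has no namespace at all, while in 3-2 and 3-3 the parser reports the same namespace for it; the cases only differ in whether the author wrote a prefix, which is why they get different tolerance treatment above.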

Here are 3 scenarios for unqualified form. The extra-prefix examples illustrate elements; however, they also apply to attributes.

Extra 3-1.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
			<tns:input1>...</tns:input1>	<!-- should have been <input1> -->
		</tns:sayHello>
Tolerance level 1 LaxForm_NO_DefaultNS_NoGlobal(1): tolerate only if there is no global element "input1" under "http://QuickTest/HelloWorld".
Tolerance level 2 LaxForm_NO_DefaultNS(2): tolerate whether or not there is a global element "input1".

Extra 3-2.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="http://QuickTest/HelloWorld">
			<onPurpose:input1>...</onPurpose:input1>	<!-- should have been <input1> -->
		</tns:sayHello>
No tolerance IMHO.

Extra 3-3.	<sayHello xmlns="http://QuickTest/HelloWorld">	<!-- xmlns= puts all unprefixed elements under "http://QuickTest/HelloWorld"; per the XML Namespaces spec it does not apply to unprefixed attributes -->
			<input1>...</input1>	<!-- should have been <input1 xmlns=""> -->
		</sayHello>
Tolerance level 3 LaxForm_DefaultNS_NoGlobal(3): tolerate only if there is no global element "input1" under "http://QuickTest/HelloWorld".
Tolerance level 4 LaxForm_DefaultNS(4): tolerate whether or not there is a global element "input1".
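
The level rules for scenarios 3-1 to 3-3 could be sketched as a single decision function. This is only one possible reading: it assumes that the levels are cumulative (a higher level also grants everything a lower one does) and that level 0 means zero tolerance, neither of which is settled above. The class, enum and method names are hypothetical; only the LaxForm_* level names and their numbers come from the lists above.

```java
// Hypothetical sketch of the tolerance rules; not EMF or SDO code.
final class LaxFormDecision {
    static final int STRICT = 0; // assumed name for "zero tolerance"
    static final int LaxForm_NO_DefaultNS_NoGlobal = 1;
    static final int LaxForm_NO_DefaultNS = 2;
    static final int LaxForm_DefaultNS_NoGlobal = 3;
    static final int LaxForm_DefaultNS = 4;

    /** Which of the three scenarios (missing or extra prefix alike) the name falls into. */
    enum Scenario { S3_1_NO_DEFAULT_NS, S3_2_DIFFERENT_PREFIX, S3_3_DEFAULT_NS }

    /**
     * @param level        configured tolerance level, 0..4
     * @param scenario     how the offending name was written
     * @param globalExists true if a global element/attribute with the name as written exists
     */
    static boolean tolerate(int level, Scenario scenario, boolean globalExists) {
        switch (scenario) {
            case S3_1_NO_DEFAULT_NS:
                // Level 1 tolerates only without a competing global; level 2 and up always do.
                return level >= LaxForm_NO_DefaultNS
                        || (level == LaxForm_NO_DefaultNS_NoGlobal && !globalExists);
            case S3_3_DEFAULT_NS:
                // Level 3 tolerates only without a competing global; level 4 always does.
                return level >= LaxForm_DefaultNS
                        || (level == LaxForm_DefaultNS_NoGlobal && !globalExists);
            default:
                // 3-2: the author chose a different prefix on purpose; never tolerated.
                return false;
        }
    }
}
```

Under this reading, tolerate(LaxForm_NO_DefaultNS_NoGlobal, Scenario.S3_1_NO_DEFAULT_NS, true) is false because a competing global "input1" exists, while the same call with LaxForm_NO_DefaultNS is true.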

Assuming the bit shifts ELEMENT = 0, ATTRIBUTE = 8, MISSING = 0, EXTRA = 4, the default value of (XMLResource.)OPTION_LAX_FORM_PROCESSING could IMHO be:

LaxForm_NO_DefaultNS<<ELEMENT<<MISSING | LaxForm_DefaultNS<<ELEMENT<<EXTRA | LaxForm_NO_DefaultNS<<ATTRIBUTE<<MISSING | LaxForm_DefaultNS<<ATTRIBUTE<<EXTRA

unless overridden by the (System) property XML.load.form.lax.
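
As a rough sketch of how that packed value could be built and read back: each type/category pair gets its own 4-bit field, so the expression above works out to 0x4242. The class name LaxFormOption and the helpers level() and effectiveDefault() are invented for illustration; the option and the property are only proposals at this point, and reading the property as an integer is an assumption.

```java
// Hypothetical sketch; the option and property are proposals, not an existing EMF API.
final class LaxFormOption {
    // Tolerance levels
    static final int LaxForm_NO_DefaultNS_NoGlobal = 1;
    static final int LaxForm_NO_DefaultNS = 2;
    static final int LaxForm_DefaultNS_NoGlobal = 3;
    static final int LaxForm_DefaultNS = 4;

    // Bit shifts selecting the 4-bit field for a type/category pair
    static final int ELEMENT = 0, ATTRIBUTE = 8;
    static final int MISSING = 0, EXTRA = 4;

    // << binds tighter than | in Java, so no parentheses are needed.
    static final int PROPOSED_DEFAULT =
              LaxForm_NO_DefaultNS << ELEMENT << MISSING
            | LaxForm_DefaultNS << ELEMENT << EXTRA
            | LaxForm_NO_DefaultNS << ATTRIBUTE << MISSING
            | LaxForm_DefaultNS << ATTRIBUTE << EXTRA;

    /** Extracts the tolerance level stored for one type/category pair. */
    static int level(int option, int type, int category) {
        return (option >>> (type + category)) & 0xF;
    }

    /** The proposed default unless the system property supplies an integer value. */
    static int effectiveDefault() {
        return Integer.getInteger("XML.load.form.lax", PROPOSED_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(PROPOSED_DEFAULT));          // 4242
        System.out.println(level(PROPOSED_DEFAULT, ATTRIBUTE, EXTRA));      // 4
    }
}
```

Running with -DXML.load.form.lax=0 would then switch all four fields to zero tolerance for that deployment.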

> SDO should tolerate malformed XML
> ---------------------------------
>
>                 Key: TUSCANY-1088
>                 URL: https://issues.apache.org/jira/browse/TUSCANY-1088
>             Project: Tuscany
>          Issue Type: Improvement
>          Components: Java SDO Implementation
>    Affects Versions: Java-SDO-Mx
>            Reporter: Kevin Williams
>             Fix For: Java-SDO-Mx
>
>
> I had some off-line discussion with Frank and Yang.  Here is the summary:
> As an improvement to consumability, SDO should tolerate some malformed XML.  XML documents are often less than well-formed.  Rather than failing on deserialization when a document does not completely conform to its schema, we should consider making some assumptions and continuing on.  Some competitor technologies do this today.
> Here's an example.  Say we have this schema:
> <?xml version="1.0" encoding="UTF-8"?>
> <xsd:schema targetNamespace="http://QuickTest/HelloWorld"
> 	xmlns:tns="http://QuickTest/HelloWorld"
> 	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> 	elementFormDefault="qualified">
> 	<xsd:element name="sayHello">
> 		<xsd:complexType>
> 			<xsd:sequence>
> 				<xsd:element name="input1" nillable="true"
> 					type="xsd:string" />
> 			</xsd:sequence>
> 		</xsd:complexType>
> 	</xsd:element>
> </xsd:schema>
> If we get an xml that looks like this:
> <?xml version="1.0" encoding="UTF-8"?>
> <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
> 	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> 	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
> 	<input1>input1</input1>
> </tns:sayHello>
> then we will fail validating this since input1 isn't fully qualified.  Here's the xml that would work:
> <?xml version="1.0" encoding="UTF-8"?>
> <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
> 	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> 	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
> 	<tns:input1>tns:input1</tns:input1>
> </tns:sayHello>
> Frank mentioned 2 potential approaches:
>   1. Read the element in as if it was an open content property. If you reserialize it would be the same (still invalid).
>   2. If a property with the same name (but different namespace) exists, then associate it with that. When you reserialize, it will then be correct.
> The latter seems the best approach.
> Yang also contributed the following:
> It's friendly to tolerate if a user forgets to qualify a local element. There are 3 scenarios that may not share the same elementFormDefault="qualified" enforcement policy. What do you think?
> 3-1.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
> 		<input1>input1</input1>
> 	</tns:sayHello>
> The author may have forgotten to qualify the "input1" element, although "input1" may also be a global element without NameSpace.
> It's friendly to tolerate.
> 3-2.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="differentNameSpace">
> 		<onPurpose:input1>input1</onPurpose:input1>
> 	</tns:sayHello>
> The author has qualified "input1" element; I'm not confident we should tolerate.
> 3-3.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns="differentNameSpace"> <!-- xmlns= declares all unqualified elements/attributes under "differentNameSpace" -->
> 		<input1>input1</input1>
> 	</tns:sayHello>
> It's hard to tell whether the author forgot to qualify the "input1" element or not.
> I bet on not. Should we tolerate?
