Posted to dev@tuscany.apache.org by "Kevin Williams (JIRA)" <tu...@ws.apache.org> on 2007/02/02 06:52:05 UTC

[jira] Updated: (TUSCANY-1088) SDO should tolerate malformed XML

     [ https://issues.apache.org/jira/browse/TUSCANY-1088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Williams updated TUSCANY-1088:
------------------------------------

    Description: 
I had some off-line discussion with Frank and Yang.  Here is the summary:

As an improvement to consumability, SDO should tolerate some malformed XML.  In practice, XML documents often do not fully conform to their schema (strictly speaking, they are well-formed but invalid).  Rather than failing on deserialization in that case, we should consider making some assumptions and continuing on.  Some competitor technologies do this today.

Here's an example.  Say we have this schema:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://QuickTest/HelloWorld"
	xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
	elementFormDefault="qualified">
	<xsd:element name="sayHello">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element name="input1" nillable="true"
					type="xsd:string" />
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>
</xsd:schema>

If we get an XML document that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
	<input1>input1</input1>
</tns:sayHello>

then validation fails because input1 is not namespace-qualified.  Here's the XML that would work:

<?xml version="1.0" encoding="UTF-8"?>
<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
	<tns:input1>input1</tns:input1>
</tns:sayHello>
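
To make the failure concrete, here is a minimal sketch that validates both documents with the JDK's javax.xml.validation API rather than with SDO itself. The class name QualifiedFormCheck and the inlined strings are illustrative assumptions; the schema and documents are the ones above with the xsi:schemaLocation hints dropped.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Illustrative helper, not part of SDO or Tuscany.
final class QualifiedFormCheck {
    static final String XSD =
        "<xsd:schema targetNamespace='http://QuickTest/HelloWorld'"
        + " xmlns:tns='http://QuickTest/HelloWorld'"
        + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
        + " elementFormDefault='qualified'>"
        + "<xsd:element name='sayHello'><xsd:complexType><xsd:sequence>"
        + "<xsd:element name='input1' nillable='true' type='xsd:string'/>"
        + "</xsd:sequence></xsd:complexType></xsd:element>"
        + "</xsd:schema>";

    // Local element left unqualified: the case this issue wants to tolerate.
    static final String UNQUALIFIED =
        "<tns:sayHello xmlns:tns='http://QuickTest/HelloWorld'>"
        + "<input1>input1</input1></tns:sayHello>";

    // Local element qualified as elementFormDefault="qualified" requires.
    static final String QUALIFIED =
        "<tns:sayHello xmlns:tns='http://QuickTest/HelloWorld'>"
        + "<tns:input1>input1</tns:input1></tns:sayHello>";

    /** Returns true if the document validates against XSD, false if validation reports an error. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            try {
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException validationError) {
                return false;
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("schema or input could not be read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unqualified valid? " + isValid(UNQUALIFIED));
        System.out.println("qualified valid?   " + isValid(QUALIFIED));
    }
}
```

A strict validator rejects the first document and accepts the second, which is the behaviour the proposal wants to relax on load.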

Frank mentioned 2 potential approaches:

  1. Read the element in as if it were an open content property. If you reserialize, the output would be the same (still invalid).
  2. If a property with the same name (but a different namespace) exists, then associate the element with that property. When you reserialize, the output will then be correct.

The latter seems the best approach.
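
A rough sketch of how approach 2 might look, assuming the declared properties of a type can be listed as QNames. LaxPropertyLookup and resolve are hypothetical names, not existing SDO or Tuscany API. The fallback only rebinds when exactly one declared property shares the local name; otherwise the caller would fall back to approach 1 (open content).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import javax.xml.namespace.QName;

// Hypothetical helper illustrating approach 2; not an SDO API.
final class LaxPropertyLookup {
    /** Resolves an incoming element name against the declared property names. */
    static Optional<QName> resolve(List<QName> declared, QName incoming) {
        // An exact namespace + local-name match always wins.
        if (declared.contains(incoming)) {
            return Optional.of(incoming);
        }
        // Otherwise collect declared properties that share only the local name.
        List<QName> sameLocalName = new ArrayList<>();
        for (QName candidate : declared) {
            if (candidate.getLocalPart().equals(incoming.getLocalPart())) {
                sameLocalName.add(candidate);
            }
        }
        // Rebind only when the match is unambiguous; empty means "treat as open content".
        return sameLocalName.size() == 1 ? Optional.of(sameLocalName.get(0)) : Optional.empty();
    }

    public static void main(String[] args) {
        List<QName> declared = Arrays.asList(new QName("http://QuickTest/HelloWorld", "input1"));
        System.out.println(resolve(declared, new QName("input1")));
        System.out.println(resolve(declared, new QName("unknown")));
    }
}
```

Because the element is bound to the declared property, reserializing writes it with the declared namespace, which is why the output becomes valid.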

Yang also contributed the following:

It's friendly to tolerate a user forgetting to qualify a local element. There are 3 scenarios that may not warrant the same elementFormDefault="qualified" enforcement policy. What do you think?

3-1.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
		<input1>input1</input1>
	</tns:sayHello>

The author may have forgotten to qualify the "input1" element, although "input1" may also be a global element with no namespace.
It's friendly to tolerate this.

3-2.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="differentNameSpace">
		<onPurpose:input1>input1</onPurpose:input1>
	</tns:sayHello>
The author has qualified the "input1" element; I'm not confident we should tolerate this.

3-3.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns="differentNameSpace"> <!-- xmlns= puts all unprefixed elements (but not attributes) under "differentNameSpace" -->
		<input1>input1</input1>
	</tns:sayHello>
It's hard to tell whether the author forgot to qualify the "input1" element or not.
I bet on not. Should we tolerate this?
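
As a sketch of how a loader could tell these scenarios apart, the snippet below parses each document with a namespace-aware DOM parser from the JDK and inspects the first child element. ScenarioProbe and the labels it returns are illustrative assumptions. Note that 3-2 and 3-3 put the child in the same namespace; only the presence of a prefix separates them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative probe, not part of SDO or Tuscany.
final class ScenarioProbe {
    /** Describes how the first child element of the root is qualified. */
    static String describeFirstChild(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI() to be meaningful
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace and comments
                }
                if (n.getNamespaceURI() == null) {
                    return "unqualified";        // scenario 3-1
                }
                return n.getPrefix() != null
                    ? "prefixed"                 // scenario 3-2: namespace chosen explicitly
                    : "default-namespace";       // scenario 3-3: inherited from xmlns="..."
            }
            return "no-child";
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String tns = "xmlns:tns='http://QuickTest/HelloWorld'";
        System.out.println(describeFirstChild(
            "<tns:sayHello " + tns + "><input1>input1</input1></tns:sayHello>"));
        System.out.println(describeFirstChild(
            "<tns:sayHello " + tns + " xmlns:onPurpose='differentNameSpace'>"
            + "<onPurpose:input1>input1</onPurpose:input1></tns:sayHello>"));
        System.out.println(describeFirstChild(
            "<tns:sayHello " + tns + " xmlns='differentNameSpace'>"
            + "<input1>input1</input1></tns:sayHello>"));
    }
}
```

A policy could then tolerate only the "unqualified" case and leave the other two strict, which matches the preference expressed above. A correctly qualified tns:input1 would also report "prefixed", so a real implementation would compare the namespace with the schema's first.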







  was:
I had some off-line discussion with Frank and Yang.  Here is the summary:

As an improvement to consumability, SDO should tolerate some malformed XML.  It is not uncommon for XML documents to not be less than well-formed.  Rather than failing on deserialization when a document does not completely conform to its schema, we should consider making some assumptions and continuing on.  Some competitor technologies do this today.

Here's an example.  Say we have this schema:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://QuickTest/HelloWorld"
	xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
	elementFormDefault="qualified">
	<xsd:element name="sayHello">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element name="input1" nillable="true"
					type="xsd:string" />
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>
</xsd:schema>

If we get an xml that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
	<input1>input1</input1>
</tns:sayHello>

then we will fail validating this since input1 isn't fully qualified.  Here's the xml that would work:

<?xml version="1.0" encoding="UTF-8"?>
<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
	<tns:input1>tns:input1</tns:input1>
</tns:sayHello>

Frank mentioned 2 potential approaches:

  1. Read the element in as if it was an open content property. If you reserialize it would be the same (still invalid).
  2. If a property with the same name (but different namespace) exists, then associate it with that. When you reserialize it will be then be correct.

The later seems the best approach.

Yang also contributed the following:

It's friendly to tolerate if a user forgets to qualify a local element. There're 3 scenarios may not have the same elementFormDefault="qualified" enforcement policy. What do you think?

3-1.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
		<input1>input1</input1>
	</tns:sayHello>

The author may have forgot to qualify "input1" element, although "input1" may also be a global element without NameSpace.
It's friendly to tolerate.

3-2.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="differentNameSpace">
		<onPurpose:input1>input1</onPurpose:input1>
	</tns:sayHello>
The author has qualified "input1" element; I'm not confident we should tolerate.

3-3.	<tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns="differentNameSpace"> <!-- xmlns= declares all unqualified elements/attributes under "differentNameSpace" -->
		<input1>input1</input1>
	</tns:sayHello>
It's hard to tell if the author may have forgot to qualify "input1" element or not.
I bet on not. Should we tolerate?








> SDO should tolerate malformed XML
> ---------------------------------
>
>                 Key: TUSCANY-1088
>                 URL: https://issues.apache.org/jira/browse/TUSCANY-1088
>             Project: Tuscany
>          Issue Type: Improvement
>          Components: Java SDO Implementation
>    Affects Versions: Java-SDO-Mx
>            Reporter: Kevin Williams
>             Fix For: Java-SDO-Mx

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: tuscany-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: tuscany-dev-help@ws.apache.org


Re: [jira] Updated: (TUSCANY-1088) SDO should tolerate malformed XML

Posted by Ole Ersoy <ol...@yahoo.com>.
Hi,

Ed Merks and I did some work on this (mostly Ed :-) ).
I skimmed the JIRA briefly, and I think it applies here:

See 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=166127

Here's a brief summary:


Case 1:
The element is supposed to be namespaced, but is not.

Case 2:
The element is namespaced, but should not be.

Case 3:
The element is namespaced, but the namespace is different from the
namespace that the element is supposed to have per the schema definition.

The redesign of BasicExtendedMetaData supports cases 1 & 2
with a single switch.
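
To illustrate what a single switch covering cases 1 and 2 could mean, here is a small standalone sketch. It is not EMF's BasicExtendedMetaData API; LaxNameMatcher and its lax flag are hypothetical names. With the switch on, a name matches when one side has no namespace; two different non-empty namespaces (case 3) never match.

```java
import javax.xml.namespace.QName;

// Hypothetical matcher; it does not reproduce EMF's actual implementation.
final class LaxNameMatcher {
    private final boolean lax; // the "single switch"

    LaxNameMatcher(boolean lax) {
        this.lax = lax;
    }

    /** True if an element named actual may bind to the feature declared as expected. */
    boolean matches(QName expected, QName actual) {
        if (!expected.getLocalPart().equals(actual.getLocalPart())) {
            return false;
        }
        String expectedNs = expected.getNamespaceURI(); // "" when there is no namespace
        String actualNs = actual.getNamespaceURI();
        if (expectedNs.equals(actualNs)) {
            return true; // exact match, always accepted
        }
        // Case 1: expected is namespaced, actual is not.
        // Case 2: actual is namespaced, expected is not.
        // Case 3 (both namespaced but different) falls through to false.
        return lax && (expectedNs.isEmpty() || actualNs.isEmpty());
    }

    public static void main(String[] args) {
        QName declared = new QName("http://QuickTest/HelloWorld", "input1");
        LaxNameMatcher matcher = new LaxNameMatcher(true);
        System.out.println(matcher.matches(declared, new QName("input1")));
        System.out.println(matcher.matches(declared, new QName("differentNameSpace", "input1")));
    }
}
```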


Hopefully the new design of BasicExtendedMetaData
applies to TUSCANY-1088.

Cheers,
- Ole






 