Posted to jaxme-dev@ws.apache.org by "Nacho G. Mac Dowell" <ig...@informa.es> on 2004/08/19 09:25:48 UTC

Validation

When you marshal a Java XML tree, no validation is performed. That is, say
you have mandatory children for some global element (minOccurs=1) and you
marshal one of these global elements without its mandatory children. The XML
gets generated, but obviously it won't pass if you try to validate it. I
believe there is an inconsistency in the JAXB specification, since it doesn't
say much about validation (in the sense applied here). With the JAXB RI you
can marshal invalid documents like the one I described, but you can't
unmarshal them back. Even if you use a custom ValidationEventHandler that
returns true no matter what it finds, the JAXB RI considers this a fatal
error and aborts the unmarshal operation. I talked with Kohsuke Kawaguchi
(JAXB engineer) and it doesn't seem this is going to change. I find this
really annoying, and it is why I stopped using the RI. With JaxMe, though, I
managed to unmarshal almost anything. Consider the following scenario:

"We have an asynchronous web service which receives an XML attachment for
processing. This XML can get quite complicated, so it would be really nice
to be able to save your work as you construct the document. What I do is
marshal my object into XML and store the partial XML. The user can then
retrieve it by unmarshalling."

Well, Kohsuke told me to validate it before marshalling!!! To me this seems
like a really bad solution. I don't know if this is because I am a
mathematician, but I can't understand how the marshal-unmarshal operations
are not bijective. This should be one of the key specifications! An XML tree
that can't be unmarshalled should not be able to get marshalled.
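
To make the minOccurs=1 case concrete, here is a minimal sketch of what
"validate before marshalling" amounts to. It uses the JDK's
javax.xml.validation API instead of JAXB or JaxMe, and the schema, the
element names parent and child, and the class name MandatoryChildCheck are
all invented for illustration. It only shows that a document without its
mandatory child is well formed yet fails validation; it says nothing about
how either implementation works internally.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class MandatoryChildCheck {
    // Hypothetical schema: <parent> must contain one <child> (minOccurs=1).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='parent'><xs:complexType><xs:sequence>"
      + "<xs:element name='child' type='xs:string' minOccurs='1'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text validates against XSD, false otherwise. */
    public static boolean isValid(String xml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, the first validation error is thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<parent><child>x</child></parent>"));
        System.out.println(isValid("<parent/>"));
    }
}
```

With this made-up schema, isValid accepts the complete document and rejects
<parent/>, which is exactly the kind of document a non-validating marshaller
will happily write.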


What do you think?


PS: I am sending a copy to Kohsuke Kawaguchi as I am talking about him.


 -----Original Message-----
From: Stracuzzi Stefano [mailto:Stefano.Stracuzzi@siemens.com]
Sent: Thursday, 19 August 2004 08:18
To: jaxme-dev@ws.apache.org
Subject: Re: Recursion


> >unmarshaller.setValidating(false);
> >
> >
>
> I must admit that this option is so far completely ignored by JaxMe.
> Even more, I have difficulty imagining that it can be handled
> properly in the future, now that we are starting to change the
> framework to use a parser generator internally.

So you are saying that an unmarshal operation always tries to validate the
XML?!?!
Is there no possibility to skip validation at the moment?

And why does the validation operation give me an error even when I validate
my XML with an external program and the XML is correctly generated by JaxMe
classes?!

---
Cordiali Saluti / Best Regards

Stefano Stracuzzi

:: Web Developer | Java & IBM ::

RE: Validation

Posted by "Nacho G. Mac Dowell" <ig...@informa.es>.
"However, it would mean paying a performance penalty for marshalling, which
I wouldn't be ready to pay in all situations."

Having the choice would do the trick.


-----Original Message-----
From: Jochen Wiedmann [mailto:jochen.wiedmann@freenet.de]
Sent: Thursday, 19 August 2004 13:57
To: Nacho G. Mac Dowell
CC: Stracuzzi Stefano; jaxme-dev@ws.apache.org;
Kohsuke.Kawaguchi@Sun.COM
Subject: Re: Validation


Nacho G. Mac Dowell wrote:

>"But where's the border? There are definitely no clear guidelines."
>I think a reasonable border would be: Unmarshal anything that has been
>marshalled with the same implementation. What do you think?
>
>

I could agree with you on that, as a general guideline, possibly even as
a default.

However, it would mean paying a performance penalty for marshalling, which I
wouldn't be ready to pay in all situations.


Jochen


---------------------------------------------------------------------
To unsubscribe, e-mail: jaxme-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: jaxme-dev-help@ws.apache.org




Re: Validation

Posted by Jochen Wiedmann <jo...@freenet.de>.
Nacho G. Mac Dowell wrote:

>"But where's the border? There are definitely no clear guidelines."
>I think a reasonable border would be: Unmarshal anything that has been
>marshalled with the same implementation. What do you think?
>  
>

I could agree with you on that, as a general guideline, possibly even as 
a default.

However, it would mean paying a performance penalty for marshalling, which I
wouldn't be ready to pay in all situations.


Jochen




RE: Validation

Posted by "Nacho G. Mac Dowell" <ig...@informa.es>.
"At least for simple cases (required attributes or elements that are
missing, too many child elements, wrong order of child elements), this is
considered a bug."
Sorry, I don't follow you here. Is the bug being able to unmarshal the tree,
or not being able to?

"I do not consider your example as valid."
It's probably not a real-world example or a good one, but it illustrates the
point I am making.

"- Use the weaker requirements for the schema and add some additional
validation yourself, depending on the situation. (In your example, don't
declare the longitude as mandatory and verify its presence manually, if
required.)"
I might not be able to weaken the requirements of the schema, as I might be
reusing an existing schema. In my case I don't have this problem, and this is
the approach I have taken. It's a good solution for some cases (like mine).

"- Use two different schemas and two different sets of classes, possibly
implementing a common interface, or something like that. (In your example,
declare the longitude as mandatory in one case and not in another.)"
Using two different schemas with very small differences is not (IMO) a good
idea. You have to maintain both if anything changes, which adds another level
of complexity. Imagine using an existing schema that has a mandatory
attribute (like the example I gave). I would have to create another schema
with just one small difference (making it not mandatory). Using two sets of
classes for almost the same purpose would result in a lot of duplicated
code.

There is another solution (probably the worst one): serializing the objects
and working with the serialized objects. Clearly this makes interoperability
(IMO XML's main purpose) almost impossible.

"I think we all agree that there are cases where the Unmarshaller should
clearly fail"
Yes, I do agree there are cases where unmarshalling should fail, such as
elements that don't exist in the schema or XML that is not well formed (these
would never have been marshalled in the first place). But I also think that
if you used a Marshaller to generate the document, the Unmarshaller should
work the way back. Don't you think? The example you give (foo-bar) would
never have been marshalled (unless there's a bug).

"But where's the border? There are definitely no clear guidelines."
I think a reasonable border would be: Unmarshal anything that has been
marshalled with the same implementation. What do you think?

Nacho

-----Original Message-----
From: Jochen Wiedmann [mailto:jochen.wiedmann@freenet.de]
Sent: Thursday, 19 August 2004 12:59
To: Nacho G. Mac Dowell
CC: Stracuzzi Stefano; jaxme-dev@ws.apache.org;
Kohsuke.Kawaguchi@Sun.COM
Subject: Re: Validation


Nacho G. Mac Dowell wrote:

>Do you mean JaxMe won't unmarshal invalid documents? It does for me...
>

At least for simple cases (required attributes or elements that are missing,
too many child elements, wrong order of child elements), this is considered
a bug. Of course, your suggestion of providing a ValidationEventHandler
should do the trick. However, we cannot guarantee that these won't be fatal
errors in the future.

>I don't know if I expressed myself correctly: What I mean is that I would
>love the Unmarshaller to behave in the same way as the Marshaller (not the
>Marshaller as the Unmarshaller!), that is, not validating. On the other
>hand, I think that this behaviour is inconsistent. Imagine a geographical
>system. Imagine user1, who creates an (incomplete) set of data (it takes
>him, say, a week) and wants to send it to user2 to complete the missing data
>(say user1 only knows the latitude and doesn't know the longitude, so he
>sends it to user2 to complete the data; of course, if the longitude is not
>set for some places, it won't be "valid" data). Imagine the system sends
>user1's data to user2 with no problem, but when user2 receives it the system
>complains that it is incomplete, making it impossible to complete the
>work... Don't you think this situation is undesirable? I really do think so.
>
>
I do not consider your example as valid. If you actually have to distinguish
between cases where a document is valid or not, then you have, IMO, two
possible choices:

- Use the weaker requirements for the schema and add some additional
  validation yourself, depending on the situation. (In your example, don't
  declare the longitude as mandatory and verify its presence manually, if
  required.)
- Use two different schemas and two different sets of classes, possibly
  implementing a common interface, or something like that. (In your example,
  declare the longitude as mandatory in one case and not in another.)

However, the expectation that an Unmarshaller should be able to accept
invalid documents is clearly odd. I think we all agree that there are cases
where the Unmarshaller should clearly fail: for example, if the schema
specifies the root element to be "foo", then the Unmarshaller should reject
"bar". But where's the border? There are definitely no clear guidelines.

Jochen




Re: Validation

Posted by Jochen Wiedmann <jo...@freenet.de>.
Nacho G. Mac Dowell wrote:

>Do you mean JaxMe won't unmarshal invalid documents? It does for me...
>

At least for simple cases (required attributes or elements that are missing,
too many child elements, wrong order of child elements), this is considered
a bug. Of course, your suggestion of providing a ValidationEventHandler
should do the trick. However, we cannot guarantee that these won't be fatal
errors in the future.
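
As a rough illustration of the "report the problem but keep going" idea
behind such a permissive handler, the sketch below uses the JDK's
javax.xml.validation API with a SAX ErrorHandler, not JAXB or JaxMe. In JAXB
the comparable hook is a ValidationEventHandler whose handleEvent method
returns true. The schema, the place/latitude/longitude element names and the
class name LenientValidation are assumptions made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

public class LenientValidation {
    // Hypothetical schema: <place> requires <latitude> followed by <longitude>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='place'><xs:complexType><xs:sequence>"
      + "<xs:element name='latitude' type='xs:decimal'/>"
      + "<xs:element name='longitude' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Validates xml, recording validity errors instead of aborting on them. */
    public static List<String> collectErrors(String xml) throws Exception {
        final List<String> problems = new ArrayList<>();
        Validator validator =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add(e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add(e.getMessage()); // recoverable: note it and continue
            }
            @Override public void fatalError(SAXParseException e)
                    throws SAXParseException {
                throw e; // not well formed: nothing sensible to continue with
            }
        });
        validator.validate(new StreamSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Latitude is known, longitude is still missing.
        for (String p : collectErrors("<place><latitude>40.4</latitude></place>")) {
            System.out.println(p);
        }
    }
}
```

The design point is the split between error and fatalError: a missing
mandatory element is something a caller can choose to tolerate, while input
that is not well formed is not.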

>I don't know if I expressed myself correctly: What I mean is that I would
>love the Unmarshaller to behave in the same way as the Marshaller (not the
>Marshaller as the Unmarshaller!), that is not validating. On the other hand,
>I think that this behaviour is inconsistent. Imagine a geographical system.
>Imagine user1 that creates a (incomplete) set of data (it takes him say a
>week) and wants to send it to user2 for completing the missing data (say
>user1 only knows the latitude and doesn't know the longitude so he sends it
>to user2 for completing the data - of course, if the longitude is not set
>for some places, it won't be "valid" data). Imagine the system sends user1's
>data to user2 with no problem, but when user2 receives it the system
>complains that it is incomplete making it impossible to complete the work...
>Don't you think this situation is undesirable? I really do think so.
>  
>
I do not consider your example as valid. If you actually have to distinguish
between cases where a document is valid or not, then you have, IMO, two
possible choices:

- Use the weaker requirements for the schema and add some additional
  validation yourself, depending on the situation. (In your example, don't
  declare the longitude as mandatory and verify its presence manually, if
  required.)
- Use two different schemas and two different sets of classes, possibly
  implementing a common interface, or something like that. (In your example,
  declare the longitude as mandatory in one case and not in another.)
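
The first option can be sketched roughly as follows. Place, its fields and
missingFields are hypothetical names, standing in for a class generated from
a schema in which longitude is optional; the completeness check is ordinary
application code, run only when a complete record is actually required.

```java
import java.util.ArrayList;
import java.util.List;

public class PlaceCheck {
    /** Hypothetical bean for a schema where longitude has minOccurs=0. */
    public static class Place {
        public final Double latitude;   // null means "not known yet"
        public final Double longitude;  // null means "not known yet"

        public Place(Double latitude, Double longitude) {
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    /** Manual check: names of the fields a complete record still lacks. */
    public static List<String> missingFields(Place place) {
        List<String> missing = new ArrayList<>();
        if (place.latitude == null) {
            missing.add("latitude");
        }
        if (place.longitude == null) {
            missing.add("longitude");
        }
        return missing;
    }

    public static void main(String[] args) {
        Place draft = new Place(40.4, null);        // user1's partial data
        System.out.println(missingFields(draft));   // prints [longitude]

        Place complete = new Place(40.4, -3.7);     // after user2 fills it in
        System.out.println(missingFields(complete).isEmpty()); // prints true
    }
}
```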

However, the expectation that an Unmarshaller should be able to accept
invalid documents is clearly odd. I think we all agree that there are cases
where the Unmarshaller should clearly fail: for example, if the schema
specifies the root element to be "foo", then the Unmarshaller should reject
"bar". But where's the border? There are definitely no clear guidelines.

Jochen




RE: Validation

Posted by "Nacho G. Mac Dowell" <ig...@informa.es>.
"That is exactly what JaxMe currently does."

Do you mean JaxMe won't unmarshal invalid documents? It does for me... But I
would like to know whether this is the desired behaviour, because in the
project I am working on it is important to be able to unmarshal invalid
documents like the example I gave (missing mandatory children).

"You are contradicting yourself, aren't you? On one hand, you want
validation to be turned off. On the other hand, Kohsuke's suggestion
gives you what you request in the last sentence?"

I don't know if I expressed myself correctly: What I mean is that I would
love the Unmarshaller to behave in the same way as the Marshaller (not the
Marshaller as the Unmarshaller!), that is, not validating. On the other hand,
I think that this behaviour is inconsistent. Imagine a geographical system.
Imagine user1, who creates an (incomplete) set of data (it takes him, say, a
week) and wants to send it to user2 to complete the missing data (say user1
only knows the latitude and doesn't know the longitude, so he sends it to
user2 to complete the data; of course, if the longitude is not set for some
places, it won't be "valid" data). Imagine the system sends user1's data to
user2 with no problem, but when user2 receives it the system complains that
it is incomplete, making it impossible to complete the work... Don't you
think this situation is undesirable? I really do think so.

"your best bet is assuming that
validation cannot be turned off actually for the Unmarshaller"

This should be stated in the JAXB specification (as the RI behaves like
this), both for the marshaller and the unmarshaller. Although I'd prefer:

"your best bet is assuming that
validation can be turned off for the Unmarshaller as it can be turned off
for marshalling"

I know the issue is the complexity of unmarshalling. But just because
marshalling is much simpler than unmarshalling, we shouldn't have
inconsistent behaviour.


-----Original Message-----
From: Jochen Wiedmann [mailto:jochen.wiedmann@freenet.de]
Sent: Thursday, 19 August 2004 10:55
To: Nacho G. Mac Dowell
CC: Stracuzzi Stefano; jaxme-dev@ws.apache.org;
Kohsuke.Kawaguchi@Sun.COM
Subject: Re: Validation


Nacho G. Mac Dowell wrote:

> When you marshal a Java XML tree, no validation is performed. That is,
> say you have mandatory children for some global element
> (minOccurs=1) and you marshal one of these global elements without
> its mandatory children. The XML gets generated, but obviously it
> won't pass if you try to validate it. I believe there is an
> inconsistency in the JAXB specification, since it doesn't say much
> about validation (in the sense applied here). With the JAXB RI you can
> marshal invalid documents like the one I described, but you can't
> unmarshal them back.

That is exactly what JaxMe currently does.

> Even if you use a custom ValidationEventHandler that returns true no
> matter what it finds, the JAXB RI considers this a fatal error and
> aborts the unmarshal operation.


I did not consider the ValidationEventHandler, but that's worth a try.
However, it possibly won't work in the future, when we depend on
external parser generators, as we cannot control their ability to
recover from an error. In other words, your best bet is assuming that
validation cannot be turned off actually for the Unmarshaller. (That
said, I think we can keep the current behaviour for the relatively
simple XML that is currently supported. I do not think so for more
complex schemas with nested groups, wildcards, and all that stuff.
We also might be able to create relaxed grammars on the user's demand,
but that won't be something that I implement.)

> Well, Kohsuke told me to validate it before marshalling!!! To me
> this seems like a really bad solution. I don't know if this is because
> I am a mathematician, but I can't understand how the marshal-unmarshal
> operations are not bijective. This should be one of the key
> specifications! An XML tree that can't be unmarshalled should not be
> able to get marshalled.

You are contradicting yourself, aren't you? On one hand, you want
validation to be turned off. On the other hand, Kohsuke's suggestion
gives you what you request in the last sentence?


Jochen




Re: Validation

Posted by Jochen Wiedmann <jo...@freenet.de>.
Nacho G. Mac Dowell wrote:

> When you marshal a Java XML tree, no validation is performed. That is,
> say you have mandatory children for some global element
> (minOccurs=1) and you marshal one of these global elements without
> its mandatory children. The XML gets generated, but obviously it
> won't pass if you try to validate it. I believe there is an
> inconsistency in the JAXB specification, since it doesn't say much
> about validation (in the sense applied here). With the JAXB RI you can
> marshal invalid documents like the one I described, but you can't
> unmarshal them back.

That is exactly what JaxMe currently does.

> Even if you use a custom ValidationEventHandler that returns true no
> matter what it finds, the JAXB RI considers this a fatal error and
> aborts the unmarshal operation.


I did not consider the ValidationEventHandler, but that's worth a try.
However, it possibly won't work in the future, when we depend on
external parser generators, as we cannot control their ability to
recover from an error. In other words, your best bet is assuming that
validation cannot be turned off actually for the Unmarshaller. (That
said, I think we can keep the current behaviour for the relatively
simple XML that is currently supported. I do not think so for more
complex schemas with nested groups, wildcards, and all that stuff.
We also might be able to create relaxed grammars on the user's demand,
but that won't be something that I implement.)

> Well, Kohsuke told me to validate it before marshalling!!! To me
> this seems like a really bad solution. I don't know if this is because
> I am a mathematician, but I can't understand how the marshal-unmarshal
> operations are not bijective. This should be one of the key
> specifications! An XML tree that can't be unmarshalled should not be
> able to get marshalled.

You are contradicting yourself, aren't you? On one hand, you want
validation to be turned off. On the other hand, Kohsuke's suggestion
gives you what you request in the last sentence?


Jochen

