Posted to users@cocoon.apache.org by Lars Huttar <la...@sil.org> on 2005/05/27 17:41:43 UTC

validating XML data in pipelines should be a standard feature

Dear Cocoon users/developers,

Cocoon is billed as an "XML Web Development Framework". As such, you 
would expect it to be strong on the fundamentals of XML, one of which is 
ease of validation of XML documents. Not every environment supports 
the newer schema languages well, but surely DTDs should be covered?
Yet for some reason, Cocoon falls pretty flat with regard to XML validation.

The Cocoon doc page 
http://cocoon.apache.org/2.1/userdocs/concepts/validation.html, which is 
still "under development", says
    'You really should validate documents in your editing environment. 
It is not the concern of Cocoon.'
But it's hard to imagine how it could not be the concern of an adequate 
XML web development framework to validate XML documents.

This doc page seems to have in mind the validation of Cocoon 
*configuration* files, which are mostly static. Sure, you can validate 
static documents in your editor. But what about generated documents?
We have a Cocoon application in which XML is transformed into many 
different forms in many different pipelines. Some of the intermediate 
forms are documented with DTDs, and others may be documented later.
It would be very valuable, for unit testing purposes, to be able to 
validate those intermediate forms against their corresponding DTDs.

Since the Cocoon pipeline model (generator, transformer*, serializer) 
tends to be characterized by intermediate XML documents, I would expect 
this kind of intermediate validation to be useful for many complex projects.

The only built-in way to turn on XML validation in Cocoon is to set the 
"validate" parameter on the XML Parser in cocoon.xconf. (Someone please 
tell me if there is another way.) (Even this I can't find in the 
online Cocoon documentation; you have to dig it out of a hard-copy book 
or from the comments in cocoon.xconf.)
However, this switch turns on validation for all parsed XML documents 
for that Cocoon servlet -- not just one pipeline or even one webapp.
A straightforward fix would seem to be adding a parameter to certain 
generators, especially the file generator, to tell the parser whether 
or not to perform validation. This has probably been discussed before...
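
To make the per-document idea concrete, here is a minimal sketch in plain
JAXP. It is not Cocoon code: the class and method names
(PerDocumentValidation, parse) are invented for illustration, and it only
shows that DTD validation can be switched on for one parse at a time
rather than for a whole servlet, which is roughly what a hypothetical
"validate" parameter on a generator would need to do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon: validation is chosen per parse.
final class PerDocumentValidation {

    // Parses the given XML text and returns the validity errors reported.
    // With validate == false the DOCTYPE is read but not enforced.
    static List<String> parse(String xml, boolean validate) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(validate); // DTD validation for this parser only
        SAXParser parser = factory.newSAXParser();

        final List<String> errors = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable: record them and keep parsing.
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return errors;
    }
}
```

Calling parse(xml, true) for one document and parse(xml, false) for
another leaves each parse independent of the other, since every call
builds its own parser.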

This would enable us to turn on run-time validation of static documents. 
But it would not fulfill the requirement for validation of intermediate 
XML trees. For that, we need something like ValidatorTransformer 
(http://wiki.apache.org/cocoon/ValidationTransformer).
ValidatorTransformer, as described, does exactly what we need. It checks 
the incoming SAX stream against a specified DTD, XML Schema, or Relax NG 
Schema, and outputs any validation errors to a log. All SAX events are 
passed through, so the validator is transparent to the pipeline. A 
variant of this, the ValidatorReportTransformer, sends the validation 
errors down the pipeline, a feature useful for explicit testing by a 
third party who doesn't have to know where the error log files reside.
This functionality is ideal because we can put it in a pipeline at 
whatever stage it is needed. We could even turn it on and off using a 
<map:select>.

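The pass-through behaviour described above can be sketched with the
standard javax.xml.validation API (JAXP 1.3, so Java 5 or later is
assumed). This is not the ValidatorTransformer code from the wiki; the
class name PassThroughValidation is hypothetical, and a stock JDK
handles only W3C XML Schema through this API, not DTDs or Relax NG. A
ValidatorHandler sits between the SAX source and the next handler,
records validation errors, and forwards the events downstream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of a transparent validating stage in a SAX chain.
final class PassThroughValidation {
    final List<String> problems = new ArrayList<>();       // stands in for the log
    final List<String> seenDownstream = new ArrayList<>(); // elements that got through

    void run(String xsd, String xml) throws Exception {
        Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        ValidatorHandler validator = schema.newValidatorHandler();

        // Record validity problems instead of aborting the stream.
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { problems.add("warning: " + e.getMessage()); }
            public void error(SAXParseException e) { problems.add("error: " + e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        // The next stage of the "pipeline": here it just notes element names.
        validator.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                seenDownstream.add(local);
            }
        });

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // ValidatorHandler expects namespace-aware events
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(validator);
        reader.parse(new InputSource(new StringReader(xml)));
    }
}
```

A report-style variant would turn the recorded problems into SAX events
for the downstream handler instead of keeping them in a list.
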
Unfortunately, using ValidatorTransformer requires downloading *.java 
files, figuring out how to put them into the Cocoon build tree, and 
rebuilding Cocoon -- skills that move you a bit beyond the role of using 
Cocoon for application development, to one of manipulating the Cocoon 
internals. Personally, I'm familiar with Java, and probably know more 
about Cocoon than anybody else in my organization, except one person; 
but I have had to email the author of ValidatorTransformer to find out 
how to do the above.
Even once I figure out how, there is the headache of getting the same 
customization done on the desktop of every Cocoon developer who might 
need it; and then the hassle of keeping track of all customizations so 
that when we upgrade our version of Cocoon, we're able to keep all the 
libraries and extra features we've added.

In conclusion... this is an appeal to please support general XML 
validation as part of the standard Cocoon distribution. A serious "XML 
Web Development Framework" deserves nothing less.

Thanks... I'd be happy to help if someone can point me in the right 
direction.

Feedback -- does this seem to other Cocoon users like a common need? 
Have you wished for XML validation in your Cocoon apps?

Lars






---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: validating XML data in pipelines should be a standard feature

Posted by Lars Huttar <la...@sil.org>.
Upayavira wrote:

> Bertrand Delacretaz wrote:
>
>> On 27 May 05, at 22:20, Bertrand Delacretaz wrote:
>>
>>>> ...You acknowledge that Software is not designed, licensed or
>>>> intended for use in the design, construction, operation or
>>>> maintenance of any nuclear facility...
>>>
>>>
>>>
>>> I'm afraid this might be a problem, I'll check on the dev@ list...
>>
>>
>>
>> Turns out this was discussed already, and unfortunately we cannot 
>> redistribute it with Cocoon:
>>
>> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=108573706205469&w=2
>>
>> But if someone writes a block for this it could be hosted somewhere 
>> else (cocoondev.org for example), to make it easier for people to 
>> include it at build time.
>
That would be good progress until a solution is found that can be 
distributed with Cocoon.

>
> Or better still, recode it to use compatible licences so we can embed 
> it into Cocoon. There must be some around, eh?
>
> Regards, Upayavira

You would think so...
Do Xerces or JaxpParser provide classes to validate an already-parsed DOM?
That might not give all the kinds of validation we want, but just having 
DTD and possibly XML Schema-based validation would be a good step forward.
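
On the question of validating an already-parsed DOM: the
javax.xml.validation package (JAXP 1.3, bundled with Java 5 and later)
can do this for W3C XML Schema. A minimal sketch, assuming such a
runtime is available; DomValidation and its method names are
illustrative only, and DTD validation of an existing DOM is not covered
by this route.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: checks an in-memory DOM against an XML Schema.
final class DomValidation {

    // Builds a namespace-aware DOM without any validation.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Validates the existing DOM and returns the error messages found.
    static List<String> validate(Document doc, String xsd) throws Exception {
        Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();

        final List<String> errors = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        validator.validate(new DOMSource(doc)); // no re-parsing of text needed
        return errors;
    }
}
```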

Lars





Re: validating XML data in pipelines should be a standard feature

Posted by Sebastien Arbogast <se...@gmail.com>.
> > Presumably, Sun doesn't want to be held liable if your software is used
> > in a nuclear plant, malfunctions, and causes a Chernobyl.
> 
> Yep, basically if you tell the system to cool the reactor, and the JVM
> is like, "wait a while, I have to do some garbage collection," you
> aren't guaranteed that the system is going to cool down when you need it
> (or other critical things).
> 
> The same could hold true for things like Airplane controls, etc.

Precisely. That's why singling out nuclear facilities seems
odd to me. There are so many application domains where some stupid
Java-oriented architecture choice could be fatal. Besides, is this
license clause in all Sun licenses? Even for Solaris systems? ;-P
That would be funny: "my Unix is more reliable than your Windows, but
we can't be held responsible if you use it inside a nuclear system"
lol
Never mind... licenses are really a tricky thing, and it always
fascinates me that licenses like ASL or GPL are successfully applied
in such a maze of US law...

-- 
Sebastien ARBOGAST



Re: validating XML data in pipelines should be a standard feature

Posted by Tony Collen <co...@umn.edu>.
Lars Huttar wrote:
> Sebastien Arbogast wrote:
> 
>> I'm somewhat intrigued by this nuclear clause. Does someone know its
>> origin ? Is it some US law restriction or some anti-nuclear conviction
>> from Sun ? or what ?
>>  
>>
> Presumably, Sun doesn't want to be held liable if your software is used 
> in a nuclear plant, malfunctions, and causes a Chernobyl.


Yep, basically if you tell the system to cool the reactor, and the JVM 
is like, "wait a while, I have to do some garbage collection," you 
aren't guaranteed that the system is going to cool down when you need it 
(or other critical things).

The same could hold true for things like airplane controls, etc.

Tony



Re: validating XML data in pipelines should be a standard feature

Posted by Lars Huttar <la...@sil.org>.
Sebastien Arbogast wrote:

>I'm somewhat intrigued by this nuclear clause. Does someone know its
>origin ? Is it some US law restriction or some anti-nuclear conviction
>from Sun ? or what ?
>  
>
Presumably, Sun doesn't want to be held liable if your software is used 
in a nuclear plant, malfunctions, and causes a Chernobyl.

Lars





Re: validating XML data in pipelines should be a standard feature

Posted by Sebastien Arbogast <se...@gmail.com>.
I'm somewhat intrigued by this nuclear clause. Does anyone know its
origin? Is it some US legal restriction, or some anti-nuclear conviction
on Sun's part? Or what?

-- 
Sebastien ARBOGAST



Re: validating XML data in pipelines should be a standard feature

Posted by Upayavira <uv...@odoko.co.uk>.
Bertrand Delacretaz wrote:
> On 27 May 05, at 22:20, Bertrand Delacretaz wrote:
> 
>>> ...You acknowledge that Software is not designed, licensed or
>>> intended for use in the design, construction, operation or
>>> maintenance of any nuclear facility...
>>
>>
>> I'm afraid this might be a problem, I'll check on the dev@ list...
> 
> 
> Turns out this was discussed already, and unfortunately we cannot 
> redistribute it with Cocoon:
> 
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=108573706205469&w=2
> 
> But if someone writes a block for this it could be hosted somewhere else 
> (cocoondev.org for example), to make it easier for people to include it 
> at build time.

Or better still, recode it to use compatible licences so we can embed it 
into Cocoon. There must be some around, eh?

Regards, Upayavira





Re: validating XML data in pipelines should be a standard feature

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 27 May 05, at 22:20, Bertrand Delacretaz wrote:
>> ...You acknowledge that Software is not designed, licensed or
>> intended for use in the design, construction, operation or
>> maintenance of any nuclear facility...
>
> I'm afraid this might be a problem, I'll check on the dev@ list...

Turns out this was discussed already, and unfortunately we cannot 
redistribute it with Cocoon:

http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=108573706205469&w=2

But if someone writes a block for this it could be hosted somewhere 
else (cocoondev.org for example), to make it easier for people to 
include it at build time.

-Bertrand

Re: validating XML data in pipelines should be a standard feature

Posted by Upayavira <uv...@odoko.co.uk>.
Bertrand Delacretaz wrote:
> On 27 May 05, at 21:49, Lars Huttar wrote:
> 
>> Bertrand Delacretaz wrote:
>> ...I was thinking more about the libraries used, 
>> http://www.sun.com/software/xml/developers/multischema/..
> 
> 
>> ...The license agreement is:
> 
> 
> <snip looks like an ASL license>
> 
>> ...You acknowledge that Software is not designed, licensed or
>> intended for use in the design, construction, operation or
>> maintenance of any nuclear facility...

I remember that nuclear clause. That is the problem: it makes the 
license incompatible with the ASL.

Regards, Upayavira



Re: validating XML data in pipelines should be a standard feature

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 27 May 05, at 21:49, Lars Huttar wrote:

> Bertrand Delacretaz wrote:
> ...I was thinking more about the libraries used, 
> http://www.sun.com/software/xml/developers/multischema/..

> ...The license agreement is:

<snip looks like an ASL license>

> ...You acknowledge that Software is not designed, licensed or
> intended for use in the design, construction, operation or
> maintenance of any nuclear facility...

I'm afraid this might be a problem, I'll check on the dev@ list.

-Bertrand

Re: validating XML data in pipelines should be a standard feature

Posted by Lars Huttar <la...@sil.org>.
Bertrand Delacretaz wrote:

> On 27 May 05, at 19:00, Lars Huttar wrote:
>
>> ...David, can this code be put under the Apache license or something?..
>
>
> I was thinking more about the libraries used, 
> http://www.sun.com/software/xml/developers/multischema/
>
> -Bertrand

The license agreement is:

Copyright (c) 2001-2003 Sun Microsystems, Inc.  All Rights
Reserved.

Redistribution and use in source and binary forms, with or
without modification, are permitted provided that the
following conditions are met:

- Redistributions of source code must retain the above
  copyright notice, this list of conditions and the
  following disclaimer.

- Redistribution in binary form must reproduce the above
  copyright notice, this list of conditions and the
  following disclaimer in the documentation and/or other
  materials provided with the distribution.

Neither the name of Sun Microsystems, Inc.  or the names of
contributors may be used to endorse or promote products
derived from this software without specific prior written
permission.

This software is provided "AS IS," without a warranty of any
kind.  ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS
AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
NON-INFRINGEMENT, ARE HEREBY EXCLUDED.  SUN AND ITS
LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES OR LIABILITIES
SUFFERED BY LICENSEE AS A RESULT OF OR RELATING TO USE,
MODIFICATION OR DISTRIBUTION OF THE SOFTWARE OR ITS
DERIVATIVES.  IN NO EVENT WILL SUN OR ITS LICENSORS BE
LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT,
INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE
DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF
LIABILITY, ARISING OUT OF THE USE OF OR INABILITY TO USE
SOFTWARE, EVEN IF SUN HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

You acknowledge that Software is not designed, licensed or
intended for use in the design, construction, operation or
maintenance of any nuclear facility.





Re: validating XML data in pipelines should be a standard feature

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 27 May 05, at 19:00, Lars Huttar wrote:
> ...David, can this code be put under the Apache license or something?..

I was thinking more about the libraries used, 
http://www.sun.com/software/xml/developers/multischema/

-Bertrand

Re: validating XML data in pipelines should be a standard feature

Posted by Lars Huttar <la...@sil.org>.
Bertrand Delacretaz wrote:

> On 27 May 05, at 17:41, Lars Huttar wrote:
>
>> ...Feedback -- does this seem to other Cocoon users like a common 
>> need? Have you wished for XML validation in your Cocoon apps?..
>
>
> Yes, it would certainly be useful. Not only a yes/no validation, but 
> also a report of discrepancies against a schema would be useful, while 
> letting the data through.
>
> I'm not sure if the 
> http://wiki.apache.org/cocoon/ValidationTransformer code can be 
> distributed with Cocoon, have you checked the licenses?
>
Good question.
I looked through the Java source files on the wiki page and don't see 
any information about licenses.
David, can this code be put under the Apache license or something?

Regards,
Lars




Re: validating XML data in pipelines should be a standard feature

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 27 May 05, at 17:41, Lars Huttar wrote:
> ...Feedback -- does this seem to other Cocoon users like a common 
> need? Have you wished for XML validation in your Cocoon apps?..

Yes, it would certainly be useful. Beyond a yes/no validation, a report 
of discrepancies against a schema would also help, while letting the 
data through.

I'm not sure whether the http://wiki.apache.org/cocoon/ValidationTransformer 
code can be distributed with Cocoon. Have you checked the licenses?

If it can't, a RelaxNG validator (I think we're already distributing the 
RelaxNG libraries) would be a good addition, IMHO.

-Bertrand