Posted to dev@sling.apache.org by Felix Meschberger <fm...@gmail.com> on 2007/12/06 09:50:06 UTC

Request Data Validation

Hi all,

There is a general need to validate input sent from the client to the
server before actually acting upon the input. Of course, we could leave
this validation to the implementation of the input consumer. But then we
would prevent the use of the generic microjax handling of input as we
would have to implement POST (and other) servlets or scripts for each
case. Rather, I suggest we define an extensible input validation system,
which is used mainly by the microjax handling but may also be used by
other consumers of client-supplied input data.

I have been looking at the Spring 2.5 validation package [1] for some
inspiration. Though I think this package is somewhat overkill for our
case, there are two interfaces, which look interesting:

   * Validator - this is the interface to be implemented by input
     validators
   * Errors - this is the interface gathering all validation issues

For Sling, the validator interface will be rather simple :

   public interface Validator {
       void validate(??? input, Errors errors);
   }

The validate method checks the input and reports issues to the errors
object. I am not sure about the type of the input. It would probably be
best if it were the request object. The validator itself does not send
any response, it just fills the errors object.
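
A minimal sketch of the two interfaces, assuming the input is (a view
of) the request. RequestInput, the String-based Errors and the
TitleRequiredValidator are made-up stand-ins for illustration, not
existing Sling or Spring types:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the request: parameter name -> submitted values.
// In Sling this would more likely be the request object itself.
interface RequestInput {
    Map<String, String[]> getParameterMap();
}

// Collects validation issues; deliberately much smaller than Spring's Errors.
final class Errors {
    private final List<String> issues = new ArrayList<>();

    void add(String issue) {
        issues.add(issue);
    }

    boolean hasErrors() {
        return !issues.isEmpty();
    }

    List<String> getIssues() {
        return Collections.unmodifiableList(issues);
    }
}

// The validator only reports issues; it never writes a response.
interface Validator {
    void validate(RequestInput input, Errors errors);
}

// Example validator (assumption): requires a non-empty "title" parameter.
final class TitleRequiredValidator implements Validator {
    @Override
    public void validate(RequestInput input, Errors errors) {
        String[] values = input.getParameterMap().get("title");
        if (values == null || values.length == 0 || values[0].trim().isEmpty()) {
            errors.add("required parameter missing: title");
        }
    }
}
```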

The Validator implementation is registered as an OSGi service (hold on,
scripting will follow) of type Validator. The question is, how the
validator is selected for a given request :

   (1) Simply use the resource type
or (2) Same as servlet/script resolution: take resource type, selectors,
       extension and request method into account
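
To make the difference between the two options concrete, here is a
rough sketch of a lookup table. ValidatorRegistry and its key format
are assumptions made for this example only; selectors are left out to
keep it short:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry, not an existing Sling API.
// V is whatever type ends up representing a validator.
final class ValidatorRegistry<V> {
    private final Map<String, V> validators = new HashMap<>();

    // key is either "<resourceType>" or "<resourceType>:<METHOD>:<extension>"
    void register(String key, V validator) {
        validators.put(key, validator);
    }

    // Option (1): the resource type alone selects the validator.
    V byResourceType(String resourceType) {
        return validators.get(resourceType);
    }

    // Option (2): try the most specific key first, then fall back
    // to the plain resource type.
    V byRequest(String resourceType, String method, String extension) {
        V specific = validators.get(resourceType + ":" + method + ":" + extension);
        return specific != null ? specific : byResourceType(resourceType);
    }
}
```

With option (2) a validator registered for the plain resource type
still acts as the default, so option (1) is a special case of it.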

A validator may also be a script. The script would of course get
different input than a normal request rendering script. The location of
the script should be different from the normal request rendering
scripts, e.g. /apps/<resourcetype>/validators/* (of course this collides
with a normal request rendering script for the selector "validators",
which may or may not be an issue).

To use the validation framework, we would have a validation service :

    public interface ValidationService {
        Errors validate(SlingHttpServletRequest request);
    }

Scripts and servlets requiring validation would call the
ValidationService.validate method with the request object.
The ValidationService internally manages the validator selection and
call. The validate method returns null if the input validates
successfully. Otherwise an Errors instance is returned which may be
inspected and from which a response may be generated to send back to the
client.
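
A caller could then look roughly like the sketch below. To keep it
self-contained, a plain parameter map stands in for
SlingHttpServletRequest, and PostHandler, the status codes and the
String-based Errors are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal stand-in for the proposed Errors type.
final class Errors {
    private final List<String> issues = new ArrayList<>();

    void add(String issue) {
        issues.add(issue);
    }

    List<String> getIssues() {
        return issues;
    }
}

interface ValidationService {
    // Returns null if the input is valid, otherwise the collected issues.
    Errors validate(Map<String, String[]> requestParameters);
}

// Hypothetical consumer: validates first, then either acts or reports.
final class PostHandler {
    private final ValidationService validation;

    PostHandler(ValidationService validation) {
        this.validation = validation;
    }

    // Returns an HTTP-like status: 200 if accepted, 400 on validation failure.
    int handle(Map<String, String[]> requestParameters, StringBuilder responseBody) {
        Errors errors = validation.validate(requestParameters);
        if (errors == null) {
            responseBody.append("stored");
            return 200;
        }
        // The caller, not the service, decides how issues are rendered.
        for (String issue : errors.getIssues()) {
            responseBody.append(issue).append('\n');
        }
        return 400;
    }
}
```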

WDYT ?

Regards
Felix

[1]
http://static.springframework.org/spring/docs/2.5.x/api/org/springframework/validation/package-summary.html

Re: Request Data Validation

Posted by Carsten Ziegeler <cz...@apache.org>.

Felix Meschberger wrote:
> Hi,
> 
> On Thursday, 06.12.2007 at 10:16 +0100, Carsten Ziegeler wrote:
>> But then, I think it might get a little bit tricky. First, I don't see a
>> need for a Validator interface - there is currently no place to hook it in.
> 
> What do you mean ? The ValidationService is just there to be called by
> servlets and scripts. The ValidationService in turn listens for Validator
> services being registered and grabs them, plus looks for validator scripts
> on demand.
> 
Ah, ok, hmm, so assume I call the ValidationService from within my
script, how do I tell the service what and how to validate?
Or are the Validator services somehow registered to be validators for my
script? In this case I could call them directly from within my script.

> 
> First we already have the ErrorHandlerServlet mechanism in place.
> Second, any script or servlet may include a response suitable to render
> the validation errors. I think, there are enough mechanisms in place,
> they just must be used. 

Ah, yes, you're right - I forgot about the error handler servlet. So,
yes, this should be sufficient here.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Request Data Validation

Posted by Felix Meschberger <fm...@gmail.com>.

Hi,

On Thursday, 06.12.2007 at 10:16 +0100, Carsten Ziegeler wrote:
> But then, I think it might get a little bit tricky. First, I don't see a
> need for a Validator interface - there is currently no place to hook it in.

What do you mean ? The ValidationService is just there to be called by
servlets and scripts. The ValidationService in turn listens for Validator
services being registered and grabs them, plus looks for validator scripts
on demand.
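
Roughly like this, as a sketch. The bind/unbind method names follow the
usual OSGi declarative services pattern but are assumptions here, as
are the simplified Validator signature and the class name; the script
lookup is left out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified validator (assumption): appends one message per problem found.
interface Validator {
    void validate(Map<String, String> parameters, List<String> issues);
}

final class TrackingValidationService {
    // Thread-safe, because services may come and go while requests run.
    private final List<Validator> validators = new CopyOnWriteArrayList<>();

    // Would be called by the framework when a Validator service appears.
    void bindValidator(Validator validator) {
        validators.add(validator);
    }

    // Would be called by the framework when a Validator service goes away.
    void unbindValidator(Validator validator) {
        validators.remove(validator);
    }

    // Returns null when no validator reports an issue.
    List<String> validate(Map<String, String> parameters) {
        List<String> issues = new ArrayList<>();
        for (Validator validator : validators) {
            validator.validate(parameters, issues);
        }
        return issues.isEmpty() ? null : issues;
    }
}
```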

> Second, I could imagine that it might get difficult to have a generic
> mechanism which creates a "good looking" response out of the Errors
> object. Therefore I guess we need some application specific error
> handling (whatever that means, I have no clean picture yet).

Very true. This is why the ValidationService does not render the response
itself but rather returns an Errors object containing all validation
issues.

> Depending on what people use to create their response (jsps, javascript
> etc.), they are free to use whatever validation framework (spring,
> commons validation etc.) in their application. All we need to provide is
> a hook to call this stuff at the right time (through the
> ValidationService interface or some scripting stuff).

Yes, the ValidationService as proposed above.

> So I think all we need to do is to come up with two additional
> resolutions during process handling: resolving the validation service
> for the current request and in the case of errors resolving an error
> handler.

First we already have the ErrorHandlerServlet mechanism in place.
Second, any script or servlet may include a response suitable to render
the validation errors. I think, there are enough mechanisms in place,
they just must be used. Maybe the microjax interface must be enhanced to
handle validation check feedback.

Regards
Felix

Re: Request Data Validation

Posted by Carsten Ziegeler <cz...@apache.org>.

Felix Meschberger wrote:
> Hi all,
> 
> There is a general need to validate input sent from the client to the
> server before actually acting upon the input. Of course, we could leave
> this validation to the implementation of the input consumer. But then we
> would prevent the use of the generic microjax handling of input as we
> would have to implement POST (and other) servlets or scripts for each
> case. Rather I suggest, we define an extensible input validation system,
> which is used mainly by the microjax handling but may also be used by other
> consumers of client supplied input data.
> 
> I have been looking at the Spring 2.5 validation package [1] for some
> inspiration. Though I think this package is somewhat overkill for our
> case, there are two interfaces, which look interesting:
> 
>    * Validator - this is the interface to be implemented by input
>      validators
>    * Errors - this is the interface gathering all validation issues
> 
> For Sling, the validator interface will be rather simple :
> 
>    public interface Validator {
>        void validate(??? input, Errors errors);
>    }
> 
> The validate method checks the input and reports issues to the errors
> object. I am not sure about the type of the input. Probably it is best,
> if it would be the request object. The validator itself does not send
> any response, it just fills the errors object.
> 
> The Validator implementation is registered as an OSGi service (hold on,
> scripting will follow) of type Validator. The question is, how the
> validator is selected for a given request :
> 
>    (1) Simply use the resource type
> or (2) Same as servlet/script resolution: take resource type, selectors,
>        extension and request method into account
> 
> A validator may also be a script. The script would of course get
> different input than a normal request rendering script. The location of
> the script should be different from the normal request rendering
> scripts, e.g. /apps/<resourcetype>/validators/* (of course this collides
> with a normal request rendering script for the selector "validators",
> which may or may not be an issue).
> 
> To use the validation framework, we would have a validation service :
> 
>     public interface ValidationService {
>         Errors validate(SlingHttpServletRequest request);
>     }
> 
> Scripts and servlets requiring validation would call the
> ValidationService.validate method with the request object.
> The ValidationService internally manages the validator selection and
> call. The validate method returns null if the input validates
> successfully. Otherwise an Errors instance is returned which may be
> inspected and from which a response may be generated to send back to the
> client.
> 
> WDYT ?
>

Currently Sling has no means of describing forms or input fields etc. (which
is absolutely fine). Therefore the only interface we can provide for
validation is to validate the whole request - this means, the
ValidationService interface from above makes sense and is needed.
We also might need an Errors object holding all error information.

But then, I think it might get a little bit tricky. First, I don't see a
need for a Validator interface - there is currently no place to hook it in.
Second, I could imagine that it might get difficult to have a generic
mechanism which creates a "good looking" response out of the Errors
object. Therefore I guess we need some application specific error
handling (whatever that means, I have no clean picture yet).

Depending on what people use to create their response (jsps, javascript
etc.), they are free to use whatever validation framework (spring,
commons validation etc.) in their application. All we need to provide is
a hook to call this stuff at the right time (through the
ValidationService interface or some scripting stuff).

So I think all we need to do is to come up with two additional
resolutions during process handling: resolving the validation service
for the current request and in the case of errors resolving an error
handler.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Request Data Validation

Posted by Felix Meschberger <fm...@gmail.com>.

Hi,

On Thursday, 06.12.2007 at 11:37 +0100, Tobias Bocanegra wrote:
> imo we need not request input validation.

Are you kidding ? :-)

Honestly, all input MUST be validated somehow before it is applied. At
the minimum XSS must be prevented ! This can only be done by some form of
input validation.

> if you want to restrict, use the repository value constraints :-)

This has two serious drawbacks: (1) it requires node types and (2) it is
rather limited and not extensible. So IMHO this option is not a real
one.

Regards
Felix

Re: Request Data Validation

Posted by "Roy T.Fielding" <fi...@gbiv.com>.

Most validation services are designed to validate small RPC
parameters.  That is crappy infrastructure to support crap design.

I think we should start with a use case.  For example, let's say
I want to create a service that allows me to post one of David's
keynote presentations as a zip file, results in a content
hierarchy of resources (based on each of the components within
the zip), and links to a separate resource (view) consisting of
a slideshare-style presentation window that points to slide 1
of a virtual path through the just-posted keynote.

For validation purposes, I want to exclude those components that
are not actually referenced within the presentation (i.e., the
bits that David decided not to use) and reject the entire
presentation if it is not a keynote zip or it includes a
component that looks like ActiveX.

Now, how does Sling provide a validation service that

   a) doesn't require the entire request object be in memory;
   b) identifies data to be excluded;
   c) excludes the data before it is stored; and,
   d) aborts processing on data to be rejected.
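
One way to read (a) to (d) is as a filter over the incoming stream. The
sketch below uses java.util.zip.ZipInputStream to walk the body entry
by entry. It assumes the set of referenced component names is already
known when streaming starts, treats a ".ocx" suffix as a crude stand-in
for "looks like ActiveX", and uses a made-up store callback; none of
this is existing Sling API:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Signals that the whole upload must be rejected.
final class RejectedException extends IOException {
    RejectedException(String message) {
        super(message);
    }
}

final class StreamingZipFilter {
    // Reads the body one entry at a time, so the complete upload is
    // never held in memory (a). Returns the number of entries stored.
    static int process(InputStream body, Set<String> referenced,
                       BiConsumer<String, InputStream> store) throws IOException {
        int stored = 0;
        try (ZipInputStream zip = new ZipInputStream(body)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Hypothetical rejection rule standing in for a real check.
                if (name.toLowerCase(Locale.ROOT).endsWith(".ocx")) {
                    throw new RejectedException("rejected component: " + name); // (d)
                }
                if (!referenced.contains(name)) {
                    continue; // (b) + (c): skipped before anything is stored
                }
                store.accept(name, zip); // the stream yields only the current entry
                stored++;
            }
        }
        return stored;
    }
}
```

Note that a rejection can arrive after earlier entries were already
handed to the store, so the store would have to be undoable, for
example by keeping changes transient until the whole body has passed.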

I can do such a thing with protocol filters or relay handlers.
I can't do it if the validation occurs after request processing.

To add another wrinkle, let's assume that David's presentations
have a bit of repetition with respect to previously stored presos,
and we only want to store the new content as nodes and create
reference-only nodes for the stuff already in the repository.

Finally, I'd like to be able to define a parallel resource (view)
of the same presentation, except that each occurrence of the
TradeGothic font is replaced with Arial (without changing the
content).

That's a real use case, with plenty of test data available. ;-)

....Roy

Re: Request Data Validation

Posted by David Nuescheler <da...@day.com>.

Hi Michael,

thanks for the reminder. I think this is very relevant for this discussion.

> When this was first discussed I think the result was to "implement the POST
> script". But I would like to extend the behavior of the PostServlet without
> having to reimplement everything it does. That would be something like
> "implement the POST script (do some validation there) and pass on the
> request to the POST servlet or not (depending on the validation)".
Exactly. I think you should be able to do whatever custom logic you would
like to do in the POST script and then call the default PostHandler with
something like ...persistPostChangesFromRequest(jcrsession, request)
or similar ;)
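
In plain Java the idea could be sketched like this. DefaultPostHandler,
persistPostChanges and GuardedPostHandler are made-up names standing in
for whatever the default POST behaviour ends up exposing, and the
parameter map stands in for the request:

```java
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the default POST behaviour.
interface DefaultPostHandler {
    void persistPostChanges(Map<String, String> parameters);
}

// Runs custom logic first and only then passes the request on.
final class GuardedPostHandler {
    private final Predicate<Map<String, String>> accept;
    private final DefaultPostHandler delegate;

    GuardedPostHandler(Predicate<Map<String, String>> accept,
                       DefaultPostHandler delegate) {
        this.accept = accept;
        this.delegate = delegate;
    }

    // Returns true if the request was passed on to the default handler.
    boolean handle(Map<String, String> parameters) {
        if (!accept.test(parameters)) {
            return false; // custom logic vetoed the request
        }
        delegate.persistPostChanges(parameters);
        return true;
    }
}
```

The custom part stays a one-liner (a spam check, say), and nothing of
the default behaviour has to be reimplemented.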

At least my two cents....

regards,
david

Re: Request Data Validation

Posted by Michael Marth <mm...@day.com>.

Hi David,

re the use case I still got this one I brought up a while ago: catching spam
comments on a blog.

However, for this use case I do not think I need the validator functionality
typically offered in web frameworks (like checking for empty strings or
stuff like that). I would like to intercept POSTs to the microjax servlet
with some custom logic (in a script).

If that is possible all the simpler validation stuff (like checking for
required values) can be implemented (by an application writer) as well if
someone wishes to do so.

When this was first discussed I think the result was to "implement the POST
script". But I would like to extend the behavior of the PostServlet without
having to reimplement everything it does. That would be something like
"implement the POST script (do some validation there) and pass on the
request to the POST servlet or not (depending on the validation)".

Cheers
Michael

On 12/6/07, David Nuescheler <da...@day.com> wrote:
>
> hi carsten,
>
>
> hehe ;)
> > He, you're taking all the fun out of this discussion :)
>
>
> > Now, I think it makes sense to have an additional validation mechanism
> > on top of JCR. I guess with the node type definitions you can't handle
> > all validation cases (like validating one field depends on the value of
> > another one etc.). So we need these hooks.
> i am all for a general "validate()" hook.
> (... to take care of the "dream"-case. ;) )
>
> > It makes sense to leverage the validation information from the
> > nodetypes, of course. And I also think that it makes sense to validate
> > the input based on this information before a commit. So some sort of
> > general service doing this would be great...
> excellent. then all the general cases from xss to integer validation
> can easily be taken care of via regular nodetype definition.
>
> regards,
> david
>

-- 
Michael Marth, http://dev.day.com

Re: Request Data Validation

Posted by David Nuescheler <da...@day.com>.

hi carsten,


hehe ;)
> He, you're taking all the fun out of this discussion :)


> Now, I think it makes sense to have an additional validation mechanism
> on top of JCR. I guess with the node type definitions you can't handle
> all validation cases (like validating one field depends on the value of
> another one etc.). So we need these hooks.
i am all for a general "validate()" hook.
(... to take care of the "dream"-case. ;) )

> It makes sense to leverage the validation information from the
> nodetypes, of course. And I also think that it makes sense to validate
> the input based on this information before a commit. So some sort of
> general service doing this would be great...
excellent. then all the general cases from xss to integer validation
can easily be taken care of via regular nodetype definition.

regards,
david

Re: Request Data Validation

Posted by Carsten Ziegeler <cz...@apache.org>.

David Nuescheler wrote:
>> imo we need not request input validation. if you want to restrict, use
>> the repository value constraints :-)
> seriously: +1
> 
> anyway, i think we should not even consider a different level of validation
> that could possibly conflict with what we have in the repository.
> sling can use the nodetype definitions to do the validation before trying to
> commit.
> 
> if we have a use case that goes beyond the validations we have in jcr then
> we can engage in further discussions. the focus is on "use"-case not
> "make-up"-case.
> 
> i sure don't hope this is another case of "abstracting away from the repo"
> and that's why we would like to introduce a whole new constraint framework
> even for simple things such as "number in a range" or a "string that matches
> a given regexp".
> 
He, you're taking all the fun out of this discussion :)
Now, I think it makes sense to have an additional validation mechanism
on top of JCR. I guess with the node type definitions you can't handle
all validation cases (like validating one field depends on the value of
another one etc.). So we need these hooks.
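
As an illustration of such a cross-field case, here is a small sketch
in plain Java. The field names "startDate" and "endDate" and the class
name are invented for the example; the point is only that the check
needs two values at once:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DateRangeCheck {
    // Returns one message per issue; an empty list means the input is valid.
    static List<String> validate(Map<String, String> fields) {
        List<String> issues = new ArrayList<>();
        LocalDate start = parse(fields, "startDate", issues);
        LocalDate end = parse(fields, "endDate", issues);
        // The cross-field rule: it depends on both values together.
        if (start != null && end != null && end.isBefore(start)) {
            issues.add("endDate must not be before startDate");
        }
        return issues;
    }

    // Single-field part: presence and ISO date format (yyyy-MM-dd).
    private static LocalDate parse(Map<String, String> fields, String name,
                                   List<String> issues) {
        String value = fields.get(name);
        if (value == null) {
            issues.add("missing: " + name);
            return null;
        }
        try {
            return LocalDate.parse(value);
        } catch (DateTimeParseException e) {
            issues.add("not a date: " + name);
            return null;
        }
    }
}
```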

It makes sense to leverage the validation information from the
nodetypes, of course. And I also think that it makes sense to validate
the input based on this information before a commit. So some sort of
general service doing this would be great...

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Request Data Validation

Posted by David Nuescheler <da...@day.com>.

> imo we need not request input validation. if you want to restrict, use
> the repository value constraints :-)
seriously: +1

anyway, i think we should not even consider a different level of validation
that could possibly conflict with what we have in the repository.
sling can use the nodetype definitions to do the validation before trying to
commit.

if we have a use case that goes beyond the validations we have in jcr then
we can engage in further discussions. the focus is on "use"-case not
"make-up"-case.

i sure don't hope this is another case of "abstracting away from the repo"
and that's why we would like to introduce a whole new constraint framework
even for simple things such as "number in a range" or a "string that matches
a given regexp".

regards,
david

Re: Request Data Validation

Posted by Tobias Bocanegra <to...@day.com>.

imo we need not request input validation. if you want to restrict, use
the repository value constraints :-)

regards, toby

On 12/6/07, Bertrand Delacretaz <bd...@apache.org> wrote:
> Hi,
>
> On Dec 6, 2007 9:50 AM, Felix Meschberger <fm...@gmail.com> wrote:
>
> > ...Rather I suggest, we define an extensible input validation system,
> > which is used mainly by the microjax handling but may also be used by other
> > consumers of client supplied input data....
>
> Agreed
>
> > ...I have been looking at the Spring 2.5 validation package [1] for some
> > inspiration....
>
> There's also commons-validator, I'm not familiar with it though:
> http://commons.apache.org/validator/
>
> > ...The question is, how the
> > validator is selected for a given request :
> >
> >    (1) Simply use the resource type
> > or (2) Same as servlet/script resolution: take resource type, selectors,
> >        extension and request method into account...
>
> Seems like the resource type should be sufficient, and the validation
> script or code can access more info about the request if needed.
>
> > ...The location of
> > the script should be different from the normal request rendering
> > scripts, e.g. /apps/<resourcetype>/validators/* (of course this collides
> > with a normal request rendering script for the selector "validators",
> > which may or may not be an issue)....
>
> We could also simply use the script name, e.g. "validation.js" to
> indicate its role, that's consistent with what we do now with POST.js,
> html.js, etc.
>
> > ...To use the validation framework, we would have a validation service :
> >
> >     public interface ValidationService {
> >         Errors validate(SlingHttpServletRequest request);
> >     }
> >
> > Scripts and servlets requiring validation would call the
> > ValidationService.validate method with the request object....
>
> Ok, so the framework does not call this automatically, scripts have to
> call it explicitly, sounds good to me.
>
> > ...The ValidationService internally manages the validator selection and
> > call. The validate method returns null if the input validates
> > successfully. Otherwise an Errors instance is returned which may be
> > inspected and from which a response may be generated to send back to the
> > client....
>
> Sounds good, and we should use subclasses of Errors like
> RequiredParameterMissing, BadNumberFormat, etc.
>
> One idea (that doesn't impact the overall design): if we can use
> javascript to define validation scripts, they might be useful on both
> the client and server sides.
>
> -Bertrand
>


-- 
-----------------------------------------< tobias.bocanegra@day.com >---
Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
T +41 61 226 98 98, F +41 61 226 98 97
-----------------------------------------------< http://www.day.com >---

Re: Request Data Validation

Posted by Felix Meschberger <fm...@gmail.com>.

Hi,

On Thursday, 06.12.2007 at 10:48 +0100, Bertrand Delacretaz wrote:
> There's also commons-validator, I'm not familiar with it though:
> http://commons.apache.org/validator/

Ah, good point. I will look at that, whether we can leverage it. Thanks.

> We could also simply use the script name, e.g. "validation.js" to
> indicate its role, that's consistent with what we do now with POST.js,
> html.js, etc.

Makes sense.

> > ...The ValidationService internally manages the validator selection and
> > call. The validate method returns null if the input validates
> > successfully. Otherwise an Errors instance is returned which may be
> > inspected and from which a response may be generated to send back to the
> > client....
> 
> Sounds good, and we should use subclasses of Errors like
> RequiredParameterMissing, BadNumberFormat, etc.

Actually the Errors object is a collection of issues arising from the
validation. The validator packs all issues it finds into the Errors
collection. So it would rather be:

    errors.add(requiredParameterMissing("xyz"));
    errors.add(badNumberFormat("name", wrongValue));
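
Spelled out a bit more, as a sketch only: ValidationIssue, its fields
and the two factory methods are made up here to match the two calls
above and are not taken from Spring or commons-validator:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// One validation issue; the kind is a field rather than a subclass.
final class ValidationIssue {
    final String code;
    final String parameter;
    final String detail;

    private ValidationIssue(String code, String parameter, String detail) {
        this.code = code;
        this.parameter = parameter;
        this.detail = detail;
    }

    static ValidationIssue requiredParameterMissing(String parameter) {
        return new ValidationIssue("required", parameter, "parameter is missing");
    }

    static ValidationIssue badNumberFormat(String parameter, String wrongValue) {
        return new ValidationIssue("number", parameter, "not a number: " + wrongValue);
    }
}

// Errors is just the collection the validators fill.
final class Errors {
    private final List<ValidationIssue> issues = new ArrayList<>();

    void add(ValidationIssue issue) {
        issues.add(issue);
    }

    List<ValidationIssue> getIssues() {
        return Collections.unmodifiableList(issues);
    }
}
```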

> One idea (that doesn't impact the overall design): if we can use
> javascript to define validation scripts, they might be useful on both
> the client and server sides.

I am not sure, because the API might be different on client and server
side. But if we can do something, why not. But I would not see this as a
blocker.

Regards
Felix

Re: Request Data Validation

Posted by Bertrand Delacretaz <bd...@apache.org>.

Hi,

On Dec 6, 2007 9:50 AM, Felix Meschberger <fm...@gmail.com> wrote:

> ...Rather I suggest, we define an extensible input validation system,
> which is used mainly by the microjax handling but may also be used by other
> consumers of client supplied input data....

Agreed

> ...I have been looking at the Spring 2.5 validation package [1] for some
> inspiration....

There's also commons-validator, I'm not familiar with it though:
http://commons.apache.org/validator/

> ...The question is, how the
> validator is selected for a given request :
>
>    (1) Simply use the resource type
> or (2) Same as servlet/script resolution: take resource type, selectors,
>        extension and request method into account...

Seems like the resource type should be sufficient, and the validation
script or code can access more info about the request if needed.

> ...The location of
> the script should be different from the normal request rendering
> scripts, e.g. /apps/<resourcetype>/validators/* (of course this collides
> with a normal request rendering script for the selector "validators",
> which may or may not be an issue)....

We could also simply use the script name, e.g. "validation.js" to
indicate its role, that's consistent with what we do now with POST.js,
html.js, etc.

> ...To use the validation framework, we would have a validation service :
>
>     public interface ValidationService {
>         Errors validate(SlingHttpServletRequest request);
>     }
>
> Scripts and servlets requiring validation would call the
> ValidationService.validate method with the request object....

Ok, so the framework does not call this automatically, scripts have to
call it explicitly, sounds good to me.

> ...The ValidationService internally manages the validator selection and
> call. The validate method returns null if the input validates
> successfully. Otherwise an Errors instance is returned which may be
> inspected and from which a response may be generated to send back to the
> client....

Sounds good, and we should use subclasses of Errors like
RequiredParameterMissing, BadNumberFormat, etc.

One idea (that doesn't impact the overall design): if we can use
javascript to define validation scripts, they might be useful on both
the client and server sides.

-Bertrand