Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2007/12/14 12:57:11 UTC

[ServerEntry new API] Q about BasicServerAttribute

Hi,

just a quick Q :

we have many methods 'add' in this class. Should we always check that 
the added value is syntactically correct ? The current version does not 
check this :

    public boolean add( String val )
    {
        return values.add( new ServerStringValue( attributeType, val ) );
    }

I suggest that we should write this method this way :

    public boolean add( String val ) throws InvalidAttributeValueException, NamingException
    {
        if ( attributeType.getSyntax().isHumanReadable() )
        {
            attributeType.getSyntax().getSyntaxChecker().assertSyntax( val );

            return values.add( new ServerStringValue( attributeType, val ) );
        }
        else
        {
            throw new InvalidAttributeValueException();
        }
    }
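
For illustration, here is a self-contained sketch of what the checked add()
buys us. The SyntaxChecker interface and the INTEGER-like rule below are
stand-ins for the real schema classes, not our actual API :

    import javax.naming.NamingException;
    import javax.naming.directory.InvalidAttributeValueException;

    // Stand-alone sketch : a SyntaxChecker stand-in guarding add(), as proposed above.
    public class AddCheckSketch
    {
        interface SyntaxChecker
        {
            void assertSyntax( String value ) throws NamingException;
        }

        public static void main( String[] args )
        {
            // Hypothetical checker for an INTEGER-like syntax.
            SyntaxChecker integerChecker = value ->
            {
                if ( !value.matches( "-?\\d+" ) )
                {
                    throw new InvalidAttributeValueException( value + " violates the INTEGER syntax" );
                }
            };

            try
            {
                integerChecker.assertSyntax( "42" );        // accepted : add() would store the value
                integerChecker.assertSyntax( "forty-two" ); // rejected before anything is stored
            }
            catch ( NamingException ne )
            {
                System.out.println( "rejected : " + ne.getMessage() );
            }
        }
    }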

wdyt ?

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: [ServerEntry new API] Q about BasicServerAttribute

Posted by Alex Karasulu <ak...@apache.org>.
Meant to say server side versus client side here below, sorry.

On Dec 14, 2007 10:04 AM, Alex Karasulu <ak...@apache.org> wrote:

> Hmmm good question.  I guess this leads to another question.  Should we
> allow the user to create entries which have attribute values that are
> syntactically incorrect?  And what does this do with respect to expected
> behavior for this Server side attribute versus a user side analog?
>
> Originally I liked this idea.  But after some time I started thinking this
> is not such a good idea.  I think we should avoid adding too much
> functionality into this.  Containment is the objective, not syntax checking.
> We simply need a minimal amount of schema checking to prevent certain
> anomalies on the server side.  However we should leave this function to the
> proper service whose role is explicitly dedicated towards fulfilling this
> function which it centralizes while considering all aspects.  For example
> schema checking will be done by parts of the schema service which must
> consider a slew of aspects associated with this:
>
>    o concurrent changes to schema during checks (aka locking)
>    o configuration of schema checking: we might want to disable or relax
> certain syntaxes
>    o ...
>
> The list goes on but I have to take off here.  I hope you see what I mean
> here .. I don't want to have to be considering all these aspects and
> duplicating such functionality here in the Entry/Attribute API which is
> really there for composition of entries.
>
> Further back on the issue of what users will expect in terms of behavior;
> we will have a divergence in behavior from the client and server versions of
> the API.  We already have some divergence but this will be wider in that it
> will feel like the server is checking the user while things are being done.
> Where do we stop with schema checks then? If the answer is apply all schema
> checks then how do we deal with situations where the entry is inconsistent
> during composition but will be consistent at the end?  For example you have
> an inetOrgPerson that requires sn and cn attributes.  The user adds the
> objectClass attribute with the inetOrgPerson value into the Entry.  If we
> have schema checks enabled then this user action will trigger a violation
> error.   Likewise if they add sn or cn before they add the objectClass
> attribute since these attributes will not be in the must/may list yet.  So I
> think we open a Pandora's box if we try to overload too much functionality
> into this Entry/Attribute API whose primary purpose is with respect to
> managing entry composition.
>
> I know we have to rely a bit on schema to do it right on the server side
> but let's keep this to a minimum to avoid anomalies.
>
> Alex

Re: [ServerEntry new API] Q about BasicServerAttribute

Posted by Emmanuel Lecharny <el...@gmail.com>.
Alex Karasulu wrote:

Short abstract :
so we should consider that checking for H/R is mandatory when adding a 
new value, but nothing more.

Some more elements about the other parts of your mail :
>  
>
>
>     We can let the Schema interceptor deal with normalization and syntax
>     checking, instead of asking the EntryAttribute to do the checking.
>     That
>     means we _must_ put this interceptor very high in the chain.
>
>  
> Right now I think this is split into two interceptors.  The first one 
> which is executed immediately is the Normalization interceptor.  It's 
> really an extension of the schema subsystem.  Normalization cannot 
> occur without schema information and the process of normalization 
> automatically enforces value syntax.  This is because to normalize 
> most parsers embedded in a normalizer must validate the syntax 
> to transform the value to a canonical representation using String 
> prep rules.
Those two work hand in hand... If we consider the Normalization 
Interceptor alone, it is a pretty specific animal, yes. It is run as 
early as possible, to be sure that the elements sent by the client are 
in good shape (ie, comparable). But we may have to normalize values 
later too : while searching for a value in an attribute, or when adding 
new attributes through some inner mechanism (a trigger, for instance)...
>  
> The big difference that has evolved between the Normalization 
> interceptor and the Schema interceptor is that the Normalization 
> interceptor is not designed to fully check schema.  It does *ONLY* 
> what it needs to do to evaluate the validity of a request against the 
> DIT.  For example the DN is normalized and the filter expression is 
> normalized early to determine if we can short this process with a 
> rapid return.  This reduces latency and weeds out most incorrect 
> requests.  Now with normalized parameters the Exception interceptor 
> can more accurately do its work to determine whether or not the 
> request makes sense: i.e. does the entry that is being deleted 
> actually exist?  Then the request goes deeper into the interceptor 
> chain for further processing.  The key concept in terms of 
> normalization and schema checking is lazy execution.
Yes, but failing fast is also a good thing to have.
>  
> Lazy execution makes sense most of the time but from the many 
> conversations we've had it seems this might actually be harming us 
> since we're doing many of the same computations over and over again 
> while discarding the results, especially where normalization is 
> concerned.
So true... At some point, we might want to keep both the UP form and the 
normalized form for values, as we do for DNs. It will cost some more 
memory, but :
1) entries are transient, and can be discarded at will,
2) now that we will have StreamedValue, this won't be a big issue anymore,
3) normalizing values over and over may cost much more than storing 
twice the size of the data (in the worst cases),
4) we should consider that very often, UP value == normalized value, so 
we have an easy way to avoid doubling memory consumption.

This needs to be discussed further, in another thread...
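
As a quick strawman of points 1)-4) (the names are hypothetical, and the 
normalizer is reduced to a simple function) :

    import java.util.function.Function;

    // Sketch : a value keeps its user-provided (UP) form, and computes the
    // normalized form once, on demand.
    public class CachedValue
    {
        private final String upValue;
        private final Function<String, String> normalizer;
        private String normValue; // cached after the first access

        public CachedValue( String upValue, Function<String, String> normalizer )
        {
            this.upValue = upValue;
            this.normalizer = normalizer;
        }

        public String getUpValue()
        {
            return upValue;
        }

        public String getNormValue()
        {
            if ( normValue == null )
            {
                normValue = normalizer.apply( upValue );

                // Point 4) : when both forms are equal, share the reference
                // so memory consumption is not doubled.
                if ( normValue.equals( upValue ) )
                {
                    normValue = upValue;
                }
            }

            return normValue;
        }
    }

With a case-folding normalizer, new CachedValue( "People", String::toLowerCase )
would pay the normalization cost once instead of at every comparison.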
>  
>
>
>     Here are the possible checks we can have on a value for an attribute :
>
>  
>
>
>     H/R : could be done when creating the attribute or adding some
>     value into it
>
>  
> Yes this will have to happen very early within the codec I guess right?
Yes. We will build the ServerEntry objects in the codec, as we already 
do for DNs atm. That means we will need access to the registries 
in the codec.
>  
>
>
>     Syntax checking : SchemaInterceptor
>     Normalization : SchemaInterceptor
>
>  
> Right now request parameters are normalized within the 
> Normalization interceptor, and these other aspects (items) are 
> being handled in the Schema interceptor.
>  
>
> <snip/>
>  
>
>
>     It brings to my mind another concern :
>     let's think about what could happen if we change the schema : we will
>     have to update all the existing Attributes, which is simply not
>     possible. Thus, storing the AttributeType within the
>     EntryAttribute does
>     not sound good anymore. (unless we kill all the current requests
>     before
>     we change the schema). It would be better to store an accessor to the
>     schema sub-system, no ?
>
>  
> This is a big concern.  For this reason I prefer holding references to 
> high level service objects which can swap out things like registries 
> when the schema changes.  This is especially important within services 
> and interceptors that depend in particular on the schema service.  I 
> would rather spend an extra cycle to do more lookups, with lazy 
> resolution, which leads to a more dynamic architecture. Changes to 
> components are reflected immediately this way and have little impact 
> in terms of leaving stale objects around which may present problems 
> and need to be cleaned up.
You are right. I was overlooking this part. We should simply consider 
that if the schema changes, then we must 'reboot' the server. At least, 
it will work in any case. Schema updates are not really meant to be done 
often (we are not designing AD, are we ? ;).

The fact is that if we need to keep the server up and running even when 
the schema changes, then it's a little bit more complex than simply 
interacting with the values already loaded by the requests being processed.
>  
> However on the flip side there's a line we need to draw.  Where we 
> draw this line will determine the level of isolation we want.  Let me 
> draw out a couple of specific scenarios to clarify. 
>  
> Scenario 1
> ========
>  
> A client binds to the server and pulls the schema at version 1, then 
> before issuing an add operation for a specific objectClass the schema 
> changes and one of the objectClasses in the entry to be added is no 
> longer present.  The request will fail and should since the schema 
> changed.  Incidentally a smart client should check the 
> subschemaSubentry timestamps before issuing write operations to see if 
> it needs to check for schema changes that make the request invalid.
That won't be enough. Here, we need a kind of two-phase commit, as we 
are modifying two sources of data at the same time. Not very simple to 
handle. We should also consider that we may have concurrent requests on 
the same data...
>  
> Scenario 2
> ========
>  
> A client binds to the server and pulls schema at version 1, then 
> issues an add request, as the add request is being processed by the 
> server the schema changes and one of the objectClasses in the entry to 
> be added is no longer present. 
>  
> Scenario 1 is pretty clear and easy to handle.  It will be handled 
> automatically for us anyway without having to explicitly code the 
> correct behavior.  Scenario 2 is a bit tricky.  First of all we have 
> to determine the correct behavior that needs to be exhibited.  Before 
> confirming with the specifications (which we need to do) my suspicions 
> would incline me to think that this add request should be allowed 
> since it was issued and received before the schema change was 
> committed.  In this case it's OK for the add request to contain 
> handles on schema data which might be old but consistent with the time 
> at which that request was issued.
>  
> So to conclude I think it's OK, preferred and efficient for request 
> parameters and intermediate derived data structures used to evaluate 
> requests to have and leverage schema information that is not 
> necessarily up to date with the last schema change.  This brings up a 
> slew of other problems we have to tackle btw but we can talk about 
> this in another thread.
Oh, yeah... No need to stop and think right now, as the current server 
does not handle those problems anyway. First, we have to 'clean' the 
Entry code :)

<snipped the rest of the convo, it will bring us far away from my 
initial short Q  ;) />
-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: [ServerEntry new API] Q about BasicServerAttribute

Posted by Alex Karasulu <ak...@apache.org>.
Hi Emmanuel,

On Dec 14, 2007 10:53 AM, Emmanuel Lecharny <el...@gmail.com> wrote:

> Very valid points, Alex. We have had the same discussion a while back
> about DN parsing...
>

Yeah I think we talked about this too a while back while annotating this
experimental code with ideas in the java docs.


>
> My personal guess is that you are almost fully right, but there might be
> cases where we may want to check some parts of the values. The H/R
> aspect, for instance, directly drives the type of value we will create.


Yeah, we're not completely free of having to do something, I agree.  We just
want to minimize how much schema checking we enforce.  We do what we have
to do to remove some headaches, but it's not our primary objective in this
region of the code.


>
> We can let the Schema interceptor deal with normalization and syntax
> checking, instead of asking the EntryAttribute to do the checking. That
> means we _must_ put this interceptor very high in the chain.
>

Right now I think this is split into two interceptors.  The first one, which
is executed immediately, is the Normalization interceptor.  It's really an
extension of the schema subsystem.  Normalization cannot occur without
schema information, and the process of normalization automatically enforces
value syntax.  This is because, to normalize, most parsers embedded in a
normalizer must validate the syntax in order to transform the value to a
canonical representation using string preparation rules.
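
A toy example of that coupling (this is not the real normalizer code, just
an illustration) : the parse step that produces the canonical form is the
very step that rejects bad syntax.

    // Toy normalizer : producing the canonical form enforces the syntax for free.
    public class TelephoneNumberNormalizer
    {
        public String normalize( String value )
        {
            // Strip the usual separators to get a canonical representation.
            String canonical = value.replaceAll( "[ .-]", "" );

            if ( !canonical.matches( "\\+?\\d+" ) )
            {
                // The value could not be parsed : normalization and syntax
                // checking fail at the same place.
                throw new IllegalArgumentException( value + " is not a telephone number" );
            }

            return canonical;
        }

        public static void main( String[] args )
        {
            TelephoneNumberNormalizer n = new TelephoneNumberNormalizer();
            System.out.println( n.normalize( "+33 1 23-45-67-89" ) ); // prints +33123456789
        }
    }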

The big difference that has evolved between the Normalization interceptor
and the Schema interceptor is that the Normalization interceptor is not
designed to fully check schema.  It does *ONLY* what it needs to do to
evaluate the validity of a request against the DIT.  For example the DN is
normalized and the filter expression is normalized early to determine if we
can short this process with a rapid return.  This reduces latency and weeds
out most incorrect requests.  Now with normalized parameters the Exception
interceptor can more accurately do its work to determine whether or not the
request makes sense: i.e. does the entry that is being deleted actually
exist?  Then the request goes deeper into the interceptor chain for further
processing.  The key concept in terms of normalization and schema checking
is lazy execution.

Lazy execution makes sense most of the time, but from the many conversations
we've had it seems this might actually be harming us since we're doing many
of the same computations over and over again while discarding the results,
especially where normalization is concerned.


>
> Here are the possible checks we can have on a value for an attribute :



>
> H/R : could be done when creating the attribute or adding some value into
> it


Yes this will have to happen very early within the codec I guess right?


>
> Syntax checking : SchemaInterceptor
> Normalization : SchemaInterceptor


Right now request parameters are normalized within the Normalization
interceptor, and these other aspects (items) are being handled in the
Schema interceptor.


>
> Single value : SchemaInterceptor
>
> So I would say we should simply test the H/R flag in EntryAttribute.


Yes this sounds like something we must do to create the correct entry
composition in the codec.  Otherwise we would need an intermediate
representation which is a waste of memory and cycles.


>
> It brings to my mind another concern :
> let's think about what could happen if we change the schema : we will
> have to update all the existing Attributes, which is simply not
> possible. Thus, storing the AttributeType within the EntryAttribute does
> not sound good anymore. (unless we kill all the current requests before
> we change the schema). It would be better to store an accessor to the
> schema sub-system, no ?


This is a big concern.  For this reason I prefer holding references to high
level service objects which can swap out things like registries when the
schema changes.  This is especially important within services and
interceptors that depend in particular on the schema service.  I would
rather spend an extra cycle to do more lookups, with lazy resolution,
which leads to a more dynamic architecture.  Changes to components are
reflected immediately this way and have little impact in terms of leaving
stale objects around which may present problems and need to be cleaned up.
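
A minimal sketch of that indirection, with hypothetical names : the attribute
keeps an OID plus a handle on the registries, resolves on every use, and a
registry swap is picked up immediately.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch : store an accessor to the schema subsystem rather
    // than the resolved AttributeType itself.
    public class RegistryAccessorSketch
    {
        // Stand-in for the AttributeType registry; a schema change swaps the map.
        static class Registries
        {
            private volatile Map<String, String> attributeTypes = new ConcurrentHashMap<String, String>();

            String lookup( String oid )
            {
                return attributeTypes.get( oid );
            }

            void swap( Map<String, String> newRegistry )
            {
                attributeTypes = newRegistry; // atomic reference swap
            }
        }

        // The attribute resolves its type on each use instead of caching it.
        static class EntryAttribute
        {
            private final String oid;
            private final Registries registries;

            EntryAttribute( String oid, Registries registries )
            {
                this.oid = oid;
                this.registries = registries;
            }

            String attributeTypeDescription()
            {
                // Always reflects the latest schema, even right after a swap.
                return registries.lookup( oid );
            }
        }
    }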

However on the flip side there's a line we need to draw.  Where we draw
this line will determine the level of isolation we want.  Let me draw out a
couple of specific scenarios to clarify.

Scenario 1
========

A client binds to the server and pulls the schema at version 1, then before
issuing an add operation for a specific objectClass the schema changes and
one of the objectClasses in the entry to be added is no longer present.  The
request will fail, and should, since the schema changed.  Incidentally a smart
client should check the subschemaSubentry timestamps before issuing write
operations to see if it needs to check for schema changes that make the request
invalid.
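
A rough sketch of that smart-client check over JNDI. The subschemaSubentry DN
should really be read from the Root DSE; "cn=schema" and the URL below are
only illustrative :

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.NamingException;
    import javax.naming.directory.Attributes;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;

    // Sketch : read the subschemaSubentry's modifyTimestamp before a write,
    // and re-pull the schema if it moved since we cached it.
    public class SchemaFreshnessCheck
    {
        public static void main( String[] args ) throws NamingException
        {
            Hashtable<String, String> env = new Hashtable<String, String>();
            env.put( Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory" );
            env.put( Context.PROVIDER_URL, "ldap://localhost:10389" );

            DirContext ctx = new InitialDirContext( env );

            // modifyTimestamp is an operational attribute : ask for it explicitly.
            Attributes attrs = ctx.getAttributes( "cn=schema", new String[] { "modifyTimestamp" } );
            String current = (String) attrs.get( "modifyTimestamp" ).get();

            // Compare 'current' with the timestamp cached when the schema was
            // pulled; if it differs, reload the schema before issuing the write.
            System.out.println( "schema last modified at " + current );

            ctx.close();
        }
    }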

Scenario 2
========

A client binds to the server and pulls schema at version 1, then issues an
add request, as the add request is being processed by the server the schema
changes and one of the objectClasses in the entry to be added is no longer
present.

Scenario 1 is pretty clear and easy to handle.  It will be handled
automatically for us anyway without having to explicitly code the correct
behavior.  Scenario 2 is a bit tricky.  First of all we have to determine
the correct behavior that needs to be exhibited.  Before confirming with the
specifications (which we need to do) my suspicions would incline me to think
that this add request should be allowed since it was issued and received
before the schema change was committed.  In this case it's OK for the add
request to contain handles on schema data which might be old but consistent
with the time at which that request was issued.

So to conclude I think it's OK, preferred and efficient for request
parameters and intermediate derived data structures used to evaluate
requests to have and leverage schema information that is not necessarily up
to date with the last schema change.  This brings up a slew of other
problems we have to tackle btw but we can talk about this in another thread.

SNIP ...


>
> > If the answer is apply all schema checks then how do we deal with
> > situations where the entry is inconsistent during composition but will
> > be consistent at the end?  For example you have an inetOrgPerson that
> > requires sn and cn attributes.  The user adds the objectClass
> > attribute with the inetOrgPerson value into the Entry.  If we have
> > schema checks enabled then this user action will trigger a violation
> > error.   Likewise if they add sn or cn before they add the objectClass
> > attribute since these attributes will not be in the must/may list yet.
> That's not exactly what we want to introduce into the Entry class. This
> is clearly done by the Schema interceptor system. But it was not my
> initial concern either, as I was specifically mentioning the
> EntryAttribute alone, not the Entry as a whole. So we are on the same
> page here.


I was just trying to say: if we start doing schema checks, where do we stop?
However we may want to do these early too for very specific attributes like
for example the objectClass attribute.

Another mode of thinking may suggest performing all schema checks
immediately in one place since circumstances force us to deal with a part of
the problem anyway.  This line of thinking favors the benefit of keeping
such code associated with a specific function together in one place.

I don't know what the correct answer is here, but I was expressing the
different ways we can approach this problem.  I know you were talking about
attribute values but soon you'll find this pulls us into the conversation
about schema checks at the entry level.


>
> > So I think we open a Pandora's box if we try to overload too much
> > functionality into this Entry/Attribute API whose primary purpose is
> > with respect to managing entry composition.
> yeah. We need some balance. This is the reason I asked before doing
> stupid things :) In the end, this will be less work for me ;)


Oh you're more right than you can imagine. This is why I'm being overly
analytical myself.  Slipping up here will have repercussions all over.
There's no right answer, though some answers will be very, very
detrimental.  The top few best answers will also have tradeoffs associated
with them, and evaluating these and coming to a conclusion on how best to
proceed is what makes this such a difficult design problem.

Alex

Re: [ServerEntry new API] Q about BasicServerAttribute

Posted by Emmanuel Lecharny <el...@gmail.com>.
Very valid points, Alex. We have had the same discussion a while back 
about DN parsing...

My personal guess is that you are almost fully right, but there might be 
cases where we may want to check some parts of the values. The H/R 
aspect, for instance, directly drives the type of value we will create. 
We can let the Schema interceptor deal with normalization and syntax 
checking, instead of asking the EntryAttribute to do the checking. That 
means we _must_ put this interceptor very high in the chain.

Here are the possible checks we can have on a value for an attribute :
H/R : could be done when creating the attribute or adding some value into it
Syntax checking : SchemaInterceptor
Normalization : SchemaInterceptor
Single value : SchemaInterceptor

So I would say we should simply test the H/R flag in EntryAttribute.
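
A sketch of that single test, with stand-in value types : the H/R flag alone
decides which kind of value gets created, and everything else is left to the
interceptors.

    // Stand-in types : only the H/R flag is consulted when creating a value.
    public class HumanReadableSketch
    {
        static class StringValue
        {
            final String value;
            StringValue( String value ) { this.value = value; }
        }

        static class BinaryValue
        {
            final byte[] value;
            BinaryValue( byte[] value ) { this.value = value; }
        }

        static Object createValue( boolean isHumanReadable, Object rawValue )
        {
            if ( isHumanReadable )
            {
                if ( !( rawValue instanceof String ) )
                {
                    throw new IllegalArgumentException( "H/R attribute expects a String" );
                }

                return new StringValue( (String) rawValue );
            }

            if ( !( rawValue instanceof byte[] ) )
            {
                throw new IllegalArgumentException( "non-H/R attribute expects a byte[]" );
            }

            return new BinaryValue( (byte[]) rawValue );
        }
    }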

It brings to my mind another concern :
let's think about what could happen if we change the schema : we will 
have to update all the existing Attributes, which is simply not 
possible. Thus, storing the AttributeType within the EntryAttribute does 
not sound good anymore. (unless we kill all the current requests before 
we change the schema). It would be better to store an accessor to the 
schema sub-system, no ?

Alex Karasulu wrote:
> Hmmm good question.  I guess this leads to another question.  Should 
> we allow the user to create entries which have attribute values that 
> are syntactically incorrect?  And what does this do with respect to 
> expected behavior for this Server side attribute versus a user side 
> analog?
>
> Originally I liked this idea.  But after some time I started thinking 
> this is not such a good idea.  I think we should avoid adding too much 
> functionality into this.  Containment is the objective, not syntax 
> checking.  We simply need a minimal amount of schema checking to 
> prevent certain anomalies on the server side.  However we should leave 
> this function to the proper service whose role is explicitly dedicated 
> towards fulfilling this function which it centralizes while 
> considering all aspects.  For example schema checking will be done by 
> parts of the schema service which must consider a slew of aspects 
> associated with this:
>
>    o concurrent changes to schema during checks (aka locking)
>    o configuration of schema checking: we might want to disable or 
> relax certain syntaxes
>    o ...
>
> The list goes on but I have to take off here.  I hope you see what I 
> mean here .. I don't want to have to be considering all these aspects 
> and duplicating such functionality here in the Entry/Attribute API 
> which is really there for composition of entries.
Agreed, considering my first point is valid (see my previous comment)
>
> Further back on the issue of what users will expect in terms of 
> behavior; we will have a divergence in behavior from the client and 
> server versions of the API.  We already have some divergence but this 
> will be wider in that it will feel like the server is checking the 
> user while things are being done.  Where do we stop with schema checks 
> then?
Pretty early, as you said. No need to transform a pretty well-designed 
interceptor chain into a rigid system...
> If the answer is apply all schema checks then how do we deal with 
> situations where the entry is inconsistent during composition but will 
> be consistent at the end?  For example you have an inetOrgPerson that 
> requires sn and cn attributes.  The user adds the objectClass 
> attribute with the inetOrgPerson value into the Entry.  If we have 
> schema checks enabled then this user action will trigger a violation 
> error.   Likewise if they add sn or cn before they add the objectClass 
> attribute since these attributes will not be in the must/may list yet.
That's not exactly what we want to introduce into the Entry class. This 
is clearly done by the Schema interceptor system. But it was not my 
initial concern either, as I was specifically mentioning the 
EntryAttribute alone, not the Entry as a whole. So we are on the same 
page here.
> So I think we open a Pandora's box if we try to overload too much 
> functionality into this Entry/Attribute API whose primary purpose is 
> with respect to managing entry composition. 
yeah. We need some balance. This is the reason I asked before doing 
stupid things :) In the end, this will be less work for me ;)
>
> I know we have to rely a bit on schema to do it right on the server 
> side but let's keep this to a minimum to avoid anomalies.
Sure, a very pragmatic position I am fond of. Thanks !
>
> Alex
>


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: [ServerEntry new API] Q about BasicServerAttribute

Posted by Alex Karasulu <ak...@apache.org>.
Hmmm good question.  I guess this leads to another question.  Should we
allow the user to create entries which have attribute values that are
syntactically incorrect?  And what does this do with respect to expected
behavior for this Server side attribute versus a user side analog?

Originally I liked this idea.  But after some time I started thinking this
is not such a good idea.  I think we should avoid adding too much
functionality into this.  Containment is the objective, not syntax checking.
We simply need a minimal amount of schema checking to prevent certain
anomalies on the server side.  However we should leave this function to the
proper service whose role is explicitly dedicated towards fulfilling this
function which it centralizes while considering all aspects.  For example
schema checking will be done by parts of the schema service which must
consider a slew of aspects associated with this:

   o concurrent changes to schema during checks (aka locking)
   o configuration of schema checking: we might want to disable or relax
certain syntaxes
   o ...

The list goes on but I have to take off here.  I hope you see what I mean
here .. I don't want to have to be considering all these aspects and
duplicating such functionality here in the Entry/Attribute API which is
really there for composition of entries.

Further back on the issue of what users will expect in terms of behavior; we
will have a divergence in behavior from the client and server versions of
the API.  We already have some divergence but this will be wider in that it
will feel like the server is checking the user while things are being done.
Where do we stop with schema checks then? If the answer is apply all schema
checks then how do we deal with situations where the entry is inconsistent
during composition but will be consistent at the end?  For example you have
an inetOrgPerson that requires sn and cn attributes.  The user adds the
objectClass attribute with the inetOrgPerson value into the Entry.  If we
have schema checks enabled then this user action will trigger a violation
error.   Likewise if they add sn or cn before they add the objectClass
attribute since these attributes will not be in the must/may list yet.  So I
think we open a Pandora's box if we try to overload too much functionality
into this Entry/Attribute API whose primary purpose is with respect to
managing entry composition.
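
To make the ordering problem concrete, here is a runnable toy; the Entry type
is a trivial stand-in, not our API :

    import java.util.ArrayList;
    import java.util.List;

    // Toy illustration : per-add schema checks would reject every intermediate
    // state below, even though the final entry is valid.
    public class CompositionOrderSketch
    {
        static class Entry
        {
            final List<String> attributes = new ArrayList<String>();

            void add( String id, String value )
            {
                // No schema check here : intermediate inconsistency is allowed,
                // the Schema interceptor validates the finished entry.
                attributes.add( id + ": " + value );
            }
        }

        public static void main( String[] args )
        {
            Entry entry = new Entry();

            entry.add( "objectClass", "inetOrgPerson" ); // invalid alone : cn and sn are MUSTs
            entry.add( "cn", "John Doe" );               // still invalid : sn missing
            entry.add( "sn", "Doe" );                    // only now is the entry consistent

            for ( String attribute : entry.attributes )
            {
                System.out.println( attribute );
            }
        }
    }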

I know we have to rely a bit on schema to do it right on the server side but
let's keep this to a minimum to avoid anomalies.

Alex

