You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2011/06/13 16:39:55 UTC

Alias cycle detection

Alias cycle detection
---------------------

There is an unsolved question about how we should detect Alias cycles. 
Right now, we check for cycles *before* they are created. The 
alternative would be to stop any search that could lead to an infinite loop.

A third - but unrealistic - solution would be to don't detect cycle, and 
process the search until we reach the time or size limit (in other 
words, it's up to the admin to avoid the creation of such cycle; Highly 
dangerous…).

The problem with the first approach is that we can't anymore pass the 
VSLDAP tests. It's a major burden. Also most of the current servers 
support this feature.

So it seems that everything concurs to get the cycle be detected while 
searching, not before. Now, how can we do that efficiently ?

Detecting a cycle is a pretty obvious algorithm : every entry we went 
through should be memorized, and we should check that we aren't going 
through an entry we already returned.

This is easy, but highly memory consuming, as we have to keep track of 
*all* the entries during a search. Simply idealistic.

Hopefully, we can use a simplified algorithm : if we hit an Alias, we 
have to check that we don't come back to the original DN we came from, 
or one of its descendant. We should also memorize each indirection (each 
of them is like a new search)

initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
a -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)

or

initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
a -~-> D.E… -~-> b(alias) : (D.E memorized)
b -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)

This way we only have to memorize the roots for all searches, instead of 
memorize all he entries. As we are very unlikely to have many aliases 
and many indirection, it should be safe from the memory consumption POV.

One other aspect is that this should be only done if the user has 
required that we dereference aliases on the server.

Last, not least, if the user has sent the ManageDSAIt control, we should 
not have to memorize anything either, as we don't dereference aliases in 
this case.

The DN cache will be stored in the session, and removed as soon as we 
reached the end of the search. It will be associated with the message 
ID, so that a user can issue more than one search in parallel.

thoughts ?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Alex Karasulu <ak...@apache.org>.

On Wed, Jun 15, 2011 at 8:33 AM, Emmanuel Lecharny <el...@gmail.com> wrote:
> On 6/15/11 2:28 AM, Alex Karasulu wrote:
>>
>>> In that case we can also get rid of all the alias indices (aliasIdx,
>>> oneAliasIdx, subAliasIdx).
>>
>> This would be a big mistake. Then you'd be better off just getting rid
>> of aliases all together. You don't know how many aliases you'll have
>> at the end of the day and presuming some number is making presumptions
>> on behalf of the user even if aliases in practice are not used.
>
> Detecting if an entry is an alias is just a matter of checking the 'alias'
> ObjectClass. This is already indexed.
>
> Sorry, but can you refresh my memory and tell what are we using those index
> for ? I was wondering yesterday if we needed to keep an 'aliased Object' ->
> 'alias' reverse index.

OK let me do this. Seems this page is missing information here so I
will add it so we're all on the same page with these indices. Will be
good for others to see as well:

    https://cwiki.apache.org/confluence/display/DIRxSRVx11/Alias+and+Index

After I fill this out perhaps we can discuss this in more depth. But
let's make sure that check to prevent alias cycles is put back and
improved otherwise we'll see odd behavior in search with alias
dereferencing enabled. We cannot just go and do this without modifying
a few other things.

> Maybe I'm overlooking this aspect. Anyway, the index are stillpresent atm.

I will update this. Also seems the people that removed the updn and
ndn indices introducing the rdn indices never updated the
documentation. This is a core concept and it should be done too for
those who want to come up to speed with ADS internals.

Regards,
Alex

Re: Alias cycle detection

Posted by Emmanuel Lecharny <el...@gmail.com>.

On 6/15/11 2:28 AM, Alex Karasulu wrote:
>
>> In that case we can also get rid of all the alias indices (aliasIdx,
>> oneAliasIdx, subAliasIdx).
> This would be a big mistake. Then you'd be better off just getting rid
> of aliases all together. You don't know how many aliases you'll have
> at the end of the day and presuming some number is making presumptions
> on behalf of the user even if aliases in practice are not used.

Detecting if an entry is an alias is just a matter of checking the 
'alias' ObjectClass. This is already indexed.

Sorry, but can you refresh my memory and tell what are we using those 
index for ? I was wondering yesterday if we needed to keep an 'aliased 
Object' -> 'alias' reverse index.

Maybe I'm overlooking this aspect. Anyway, the index are stillpresent atm.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Alex Karasulu <ak...@apache.org>.

On Tue, Jun 14, 2011 at 12:19 AM, Stefan Seelmann <se...@apache.org> wrote:
> On Mon, Jun 13, 2011 at 11:12 PM, Emmanuel Lécharny
> <el...@apache.org> wrote:
>> On 6/13/11 10:12 PM, Howard Chu wrote:
>>>
>>> Ludovic Poitou wrote:
>>>>
>>>> Sun Directory Server, OpenDS, OpenDJ, Port 389 (formerly known as Fedora
>>>> Directory) at least, do not support Alias dereferencing because the
>>>> complexity
>>>> out-weights the benefits, and very few client applications make use of it
>>>> anyway.
>>>>
>>>> My 2 cents.
>>>
>>> We had no alias support in back-bdb originally; I added it back in 2003.
>>>
>>> Indeed, you only need to track all of the alias entries that you hit, and
>>> that's typically a small number. Each new entry defines a new search scope.
>>> Any time a dereference leads to an entry that resides anywhere within your
>>> already-known search scopes that deref can be considered fully resolved,
>>> because the result was already part of the scope.
>>
>> This is teh algorithm I foresee too.
>
> +1
>
>>>
>>> It's actually quite simple and quite fast. Using the objectclass index
>>> it's trivial to obtain the list of all alias entries within the database, so
>>> from the outset you already know the maximum size of what you're dealing
>>> with.
>>
>> We already have a cache that is constructed at startup, gathering all the
>> aliases from the backend, using the OC index. This cache is of course
>> updated on the fly, if one alias is added or removed.
>>
>> I don't think it should take more than one day to fix this issue.
>
> In that case we can also get rid of all the alias indices (aliasIdx,
> oneAliasIdx, subAliasIdx).

This would be a big mistake. Then you'd be better off just getting rid
of aliases all together. You don't know how many aliases you'll have
at the end of the day and presuming some number is making presumptions
on behalf of the user even if aliases in practice are not used.

Re: Alias cycle detection

Posted by Kiran Ayyagari <ka...@apache.org>.

thats a huge plus if we get rid of 3 indices, +1 to the algo and index
removal

On 14-Jun-2011 2:50 AM, "Stefan Seelmann" <se...@apache.org> wrote:

On Mon, Jun 13, 2011 at 11:12 PM, Emmanuel Lécharny
<el...@apache.org> wrote:
> On 6/13/11 10:12...
+1


>>
>> It's actually quite simple and quite fast. Using the objectclass index
>> it's trivial to obt...
In that case we can also get rid of all the alias indices (aliasIdx,
oneAliasIdx, subAliasIdx).

Kind Regards,
Stefan

Re: Alias cycle detection

Posted by Emmanuel Lecharny <el...@gmail.com>.

On 6/14/11 4:31 PM, Kiran Ayyagari wrote:
> we can use ehcache to counter the memory issue(we can defer this  step
> though)

This was my first idea, although if we don't need a cache, then ... ;)

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Kiran Ayyagari <ka...@apache.org>.

we can use ehcache to counter the memory issue(we can defer this  step
though)

On 14-Jun-2011 6:56 PM, "Stefan Seelmann" <se...@apache.org> wrote:

On Tue, Jun 14, 2011 at 2:29 PM, Emmanuel Lécharny <el...@apache.org>
wrote:
> On 6/13/11 11:19 ...
I wonder why we need an alias cache? For fast lookup of the search
base in case the "find" bit is set?


> - create an AliasInterceptor to manage the Add and Delete operations done
on
> alias entries (and...
You mean that interceptor is used to update the cache, right?


> - modify the Search to handle a set of met aliases.
Yep.


> I'll proceed by creating the alias interceptor first, and Ill remove the
> part that handle Alias...
Ok.


Two other issues I see with the new algorighm:
- It is efficient if there are only few aliases. But if a user adds
million of alias entries we may get a memory problem. I just want to
mention that to make clear that such an issue may occur. I don't think
it makes sense to create so many alias entries, but I saw an example
where group membership was implemented using aliases...
- It is possible that duplicates occur, for example if an alias
enlarges the initial search scope by pointing to a parent of the
initial search base. I think duplicates can be avoided by tracking
each search base and filter result enties within already processed
search bases.

Kind Regards,
Stefan

Re: Alias cycle detection

Posted by Emmanuel Lécharny <el...@apache.org>.

On 6/14/11 3:26 PM, Stefan Seelmann wrote:
>> There are a few steps we also have to fulfill :
>> - create an Alias cache ( I thought we had one, but in fact, we have the
>> opposite : a notAliasCache in the ExceptionInterceptor)
> I wonder why we need an alias cache? For fast lookup of the search
> base in case the "find" bit is set?
Good point... As soon as we fetch an alias, we just have to check its OC 
to see if it's an alias, or not.
>> - create an AliasInterceptor to manage the Add and Delete operations done on
>> alias entries (and also move, rename and combined ops)
> You mean that interceptor is used to update the cache, right?
Right. But if we don't need a cache, we don't either need this interceptor.

I will keep it atm to insolate the search filter in charge of detecting 
the cycle.
>> The Alias index removal will be done at the end.
> Ok.
>
>
> Two other issues I see with the new algorighm:
> - It is efficient if there are only few aliases. But if a user adds
> million of alias entries we may get a memory problem.
If a user adds millions of Alias entries, he will have issues with any 
LDAP server, I guess. And it's very unlikely to happen...

What we can do is to decide that we don't check for cycle once we have 
more than, say, 10 aliases in the search alias cache.

>   I just want to
> mention that to make clear that such an issue may occur. I don't think
> it makes sense to create so many alias entries, but I saw an example
> where group membership was implemented using aliases...
Make sense.

> - It is possible that duplicates occur, for example if an alias
> enlarges the initial search scope by pointing to a parent of the
> initial search base. I think duplicates can be avoided by tracking
> each search base and filter result enties within already processed
> search bases.
Duplicate should not occur if we store the root DN  in the search alias 
cache : we will meet it at some point if the alias point to a parent.

Thanks for your feedback, it's very valuable.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Stefan Seelmann <se...@apache.org>.

On Tue, Jun 14, 2011 at 2:29 PM, Emmanuel Lécharny <el...@apache.org> wrote:
> On 6/13/11 11:19 PM, Stefan Seelmann wrote:
>>>>
>>>> It's actually quite simple and quite fast. Using the objectclass index
>>>> it's trivial to obtain the list of all alias entries within the
>>>> database, so
>>>> from the outset you already know the maximum size of what you're dealing
>>>> with.
>>>
>>> We already have a cache that is constructed at startup, gathering all the
>>> aliases from the backend, using the OC index. This cache is of course
>>> updated on the fly, if one alias is added or removed.
>>>
>>> I don't think it should take more than one day to fix this issue.
>>
>> In that case we can also get rid of all the alias indices (aliasIdx,
>> oneAliasIdx, subAliasIdx).
>
> Yes absolutely.
>
> There are a few steps we also have to fulfill :
> - create an Alias cache ( I thought we had one, but in fact, we have the
> opposite : a notAliasCache in the ExceptionInterceptor)

I wonder why we need an alias cache? For fast lookup of the search
base in case the "find" bit is set?

> - create an AliasInterceptor to manage the Add and Delete operations done on
> alias entries (and also move, rename and combined ops)

You mean that interceptor is used to update the cache, right?

> - modify the Search to handle a set of met aliases.

Yep.

> I'll proceed by creating the alias interceptor first, and Ill remove the
> part that handle Aliases in ExceptionInterceptor
>
> The Alias index removal will be done at the end.

Ok.


Two other issues I see with the new algorighm:
- It is efficient if there are only few aliases. But if a user adds
million of alias entries we may get a memory problem. I just want to
mention that to make clear that such an issue may occur. I don't think
it makes sense to create so many alias entries, but I saw an example
where group membership was implemented using aliases...
- It is possible that duplicates occur, for example if an alias
enlarges the initial search scope by pointing to a parent of the
initial search base. I think duplicates can be avoided by tracking
each search base and filter result enties within already processed
search bases.

Kind Regards,
Stefan

Re: Alias cycle detection

Posted by Emmanuel Lécharny <el...@apache.org>.

On 6/13/11 11:19 PM, Stefan Seelmann wrote:
>>> It's actually quite simple and quite fast. Using the objectclass index
>>> it's trivial to obtain the list of all alias entries within the database, so
>>> from the outset you already know the maximum size of what you're dealing
>>> with.
>> We already have a cache that is constructed at startup, gathering all the
>> aliases from the backend, using the OC index. This cache is of course
>> updated on the fly, if one alias is added or removed.
>>
>> I don't think it should take more than one day to fix this issue.
> In that case we can also get rid of all the alias indices (aliasIdx,
> oneAliasIdx, subAliasIdx).

Yes absolutely.

There are a few steps we also have to fulfill :
- create an Alias cache ( I thought we had one, but in fact, we have the 
opposite : a notAliasCache in the ExceptionInterceptor)
- create an AliasInterceptor to manage the Add and Delete operations 
done on alias entries (and also move, rename and combined ops)
- modify the Search to handle a set of met aliases.

I'll proceed by creating the alias interceptor first, and Ill remove the 
part that handle Aliases in ExceptionInterceptor

The Alias index removal will be done at the end.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Stefan Seelmann <se...@apache.org>.

On Mon, Jun 13, 2011 at 11:12 PM, Emmanuel Lécharny
<el...@apache.org> wrote:
> On 6/13/11 10:12 PM, Howard Chu wrote:
>>
>> Ludovic Poitou wrote:
>>>
>>> Sun Directory Server, OpenDS, OpenDJ, Port 389 (formerly known as Fedora
>>> Directory) at least, do not support Alias dereferencing because the
>>> complexity
>>> out-weights the benefits, and very few client applications make use of it
>>> anyway.
>>>
>>> My 2 cents.
>>
>> We had no alias support in back-bdb originally; I added it back in 2003.
>>
>> Indeed, you only need to track all of the alias entries that you hit, and
>> that's typically a small number. Each new entry defines a new search scope.
>> Any time a dereference leads to an entry that resides anywhere within your
>> already-known search scopes that deref can be considered fully resolved,
>> because the result was already part of the scope.
>
> This is teh algorithm I foresee too.

+1

>>
>> It's actually quite simple and quite fast. Using the objectclass index
>> it's trivial to obtain the list of all alias entries within the database, so
>> from the outset you already know the maximum size of what you're dealing
>> with.
>
> We already have a cache that is constructed at startup, gathering all the
> aliases from the backend, using the OC index. This cache is of course
> updated on the fly, if one alias is added or removed.
>
> I don't think it should take more than one day to fix this issue.

In that case we can also get rid of all the alias indices (aliasIdx,
oneAliasIdx, subAliasIdx).

Kind Regards,
Stefan

Re: Alias cycle detection

Posted by Emmanuel Lécharny <el...@apache.org>.

On 6/13/11 10:12 PM, Howard Chu wrote:
> Ludovic Poitou wrote:
>> Sun Directory Server, OpenDS, OpenDJ, Port 389 (formerly known as Fedora
>> Directory) at least, do not support Alias dereferencing because the 
>> complexity
>> out-weights the benefits, and very few client applications make use 
>> of it anyway.
>>
>> My 2 cents.
>
> We had no alias support in back-bdb originally; I added it back in 2003.
>
> Indeed, you only need to track all of the alias entries that you hit, 
> and that's typically a small number. Each new entry defines a new 
> search scope. Any time a dereference leads to an entry that resides 
> anywhere within your already-known search scopes that deref can be 
> considered fully resolved, because the result was already part of the 
> scope.
This is teh algorithm I foresee too.
>
> It's actually quite simple and quite fast. Using the objectclass index 
> it's trivial to obtain the list of all alias entries within the 
> database, so from the outset you already know the maximum size of what 
> you're dealing with.
We already have a cache that is constructed at startup, gathering all 
the aliases from the backend, using the OC index. This cache is of 
course updated on the fly, if one alias is added or removed.

I don't think it should take more than one day to fix this issue.

Thanks !


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Howard Chu <hy...@symas.com>.

Ludovic Poitou wrote:
> Sun Directory Server, OpenDS, OpenDJ, Port 389 (formerly known as Fedora
> Directory) at least, do not support Alias dereferencing because the complexity
> out-weights the benefits, and very few client applications make use of it anyway.
>
> My 2 cents.

We had no alias support in back-bdb originally; I added it back in 2003.

Indeed, you only need to track all of the alias entries that you hit, and 
that's typically a small number. Each new entry defines a new search scope. 
Any time a dereference leads to an entry that resides anywhere within your 
already-known search scopes that deref can be considered fully resolved, 
because the result was already part of the scope.

It's actually quite simple and quite fast. Using the objectclass index it's 
trivial to obtain the list of all alias entries within the database, so from 
the outset you already know the maximum size of what you're dealing with.

And just to remind, alias deref is a 2-bit option - deref base of the search, 
and deref (the rest of the search). They're selected independently.


> Ludo
> ---
> Ludovic Poitou
> http://ludopoitou.wordpress.com
>
>
> On Mon, Jun 13, 2011 at 4:39 PM, Emmanuel Lecharny <elecharny@gmail.com
> <ma...@gmail.com>> wrote:
>
>     Alias cycle detection
>     ---------------------
>
>     There is an unsolved question about how we should detect Alias cycles.
>     Right now, we check for cycles *before* they are created. The alternative
>     would be to stop any search that could lead to an infinite loop.
>
>     A third - but unrealistic - solution would be to don't detect cycle, and
>     process the search until we reach the time or size limit (in other words,
>     it's up to the admin to avoid the creation of such cycle; Highly dangerous…).
>
>     The problem with the first approach is that we can't anymore pass the
>     VSLDAP tests. It's a major burden. Also most of the current servers
>     support this feature.
>
>     So it seems that everything concurs to get the cycle be detected while
>     searching, not before. Now, how can we do that efficiently ?
>
>     Detecting a cycle is a pretty obvious algorithm : every entry we went
>     through should be memorized, and we should check that we aren't going
>     through an entry we already returned.
>
>     This is easy, but highly memory consuming, as we have to keep track of
>     *all* the entries during a search. Simply idealistic.
>
>     Hopefully, we can use a simplified algorithm : if we hit an Alias, we have
>     to check that we don't come back to the original DN we came from, or one
>     of its descendant. We should also memorize each indirection (each of them
>     is like a new search)
>
>     initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
>     a -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)
>
>     or
>
>     initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
>     a -~-> D.E… -~-> b(alias) : (D.E memorized)
>     b -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)
>
>     This way we only have to memorize the roots for all searches, instead of
>     memorize all he entries. As we are very unlikely to have many aliases and
>     many indirection, it should be safe from the memory consumption POV.
>
>     One other aspect is that this should be only done if the user has required
>     that we dereference aliases on the server.
>
>     Last, not least, if the user has sent the ManageDSAIt control, we should
>     not have to memorize anything either, as we don't dereference aliases in
>     this case.
>
>     The DN cache will be stored in the session, and removed as soon as we
>     reached the end of the search. It will be associated with the message ID,
>     so that a user can issue more than one search in parallel.
>
>     thoughts ?
>
>     --
>     Regards,
>     Cordialement,
>     Emmanuel Lécharny
>     www.iktek.com <http://www.iktek.com>
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: Alias cycle detection

Posted by Ludovic Poitou <lu...@gmail.com>.

Sun Directory Server, OpenDS, OpenDJ, Port 389 (formerly known as Fedora
Directory) at least, do not support Alias dereferencing because the
complexity out-weights the benefits, and very few client applications make
use of it anyway.

My 2 cents.

Ludo
---
Ludovic Poitou
http://ludopoitou.wordpress.com


On Mon, Jun 13, 2011 at 4:39 PM, Emmanuel Lecharny <el...@gmail.com>wrote:

> Alias cycle detection
> ---------------------
>
> There is an unsolved question about how we should detect Alias cycles.
> Right now, we check for cycles *before* they are created. The alternative
> would be to stop any search that could lead to an infinite loop.
>
> A third - but unrealistic - solution would be to don't detect cycle, and
> process the search until we reach the time or size limit (in other words,
> it's up to the admin to avoid the creation of such cycle; Highly
> dangerous…).
>
> The problem with the first approach is that we can't anymore pass the
> VSLDAP tests. It's a major burden. Also most of the current servers support
> this feature.
>
> So it seems that everything concurs to get the cycle be detected while
> searching, not before. Now, how can we do that efficiently ?
>
> Detecting a cycle is a pretty obvious algorithm : every entry we went
> through should be memorized, and we should check that we aren't going
> through an entry we already returned.
>
> This is easy, but highly memory consuming, as we have to keep track of
> *all* the entries during a search. Simply idealistic.
>
> Hopefully, we can use a simplified algorithm : if we hit an Alias, we have
> to check that we don't come back to the original DN we came from, or one of
> its descendant. We should also memorize each indirection (each of them is
> like a new search)
>
> initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
> a -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)
>
> or
>
> initial search : A.B.C -~-> a(alias)  (A.B.C memorized)
> a -~-> D.E… -~-> b(alias) : (D.E memorized)
> b -~-> B.C… -~-> A.B.C : cycle detected (B.C memorized)
>
> This way we only have to memorize the roots for all searches, instead of
> memorize all he entries. As we are very unlikely to have many aliases and
> many indirection, it should be safe from the memory consumption POV.
>
> One other aspect is that this should be only done if the user has required
> that we dereference aliases on the server.
>
> Last, not least, if the user has sent the ManageDSAIt control, we should
> not have to memorize anything either, as we don't dereference aliases in
> this case.
>
> The DN cache will be stored in the session, and removed as soon as we
> reached the end of the search. It will be associated with the message ID, so
> that a user can issue more than one search in parallel.
>
> thoughts ?
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>

Re: Alias cycle detection

Posted by Emmanuel Lécharny <el...@apache.org>.

On 6/15/11 11:21 AM, Howard Chu wrote:
> Alex Karasulu wrote:
>> On Mon, Jun 13, 2011 at 5:39 PM, Emmanuel 
>> Lecharny<el...@gmail.com>  wrote:
>>> Alias cycle detection
>>> ---------------------
>>>
>>> There is an unsolved question about how we should detect Alias 
>>> cycles. Right
>>> now, we check for cycles *before* they are created. The alternative 
>>> would be
>>> to stop any search that could lead to an infinite loop.
>>
>> That would slow down reads. The best is to stop this from happening
>> with write operations: meaning doing the computation to detect and
>> prevent the cycle then and there instead of exhausting the search
>> process to deal with such wicked constructs.
>
> You may be being over-paranoid here. First a client has to explicitly 
> request alias dereferencing and most of them don't by default, 
Actually, they do. The default for people using JNDI is to Deref 
Always.(http://download.oracle.com/javase/jndi/tutorial/ldap/misc/aliases.html) 
("When you use Sun's LDAP service provider, you can control how aliases 
are dereferenced in one of four ways, by using the 
"java.naming.ldap.derefAliases" environment property, as shown in the 
following table. If this environment property is not set, then the 
default is "always".")
> so in general reads will be unaffected. Also the DB operations 
> required to detect a cycle at write time are the kinds of things you 
> would already be performing efficiently in a search. Doing them at 
> search time is far better from a concurrency perspective because 
> you're only doing read operations inside a reader transaction, and 
> nothing touched inside the DB needs to stay locked for long. If you're 
> doing these searches during a write operation then you're going to 
> accumulate huge numbers of locks that must be held until the write 
> transaction commits.

And as I said in a previous mail, detecting cycle during search is 
probably totally unnoticeable, wrt performances.Of course, with millions 
of aliases, that would be a different problem, but in this case, I would 
likely tell the user to switch to another LDAP server :)



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Howard Chu <hy...@symas.com>.

Alex Karasulu wrote:
> On Mon, Jun 13, 2011 at 5:39 PM, Emmanuel Lecharny<el...@gmail.com>  wrote:
>> Alias cycle detection
>> ---------------------
>>
>> There is an unsolved question about how we should detect Alias cycles. Right
>> now, we check for cycles *before* they are created. The alternative would be
>> to stop any search that could lead to an infinite loop.
>
> That would slow down reads. The best is to stop this from happening
> with write operations: meaning doing the computation to detect and
> prevent the cycle then and there instead of exhausting the search
> process to deal with such wicked constructs.

You may be being over-paranoid here. First a client has to explicitly request 
alias dereferencing and most of them don't by default, so in general reads 
will be unaffected. Also the DB operations required to detect a cycle at write 
time are the kinds of things you would already be performing efficiently in a 
search. Doing them at search time is far better from a concurrency perspective 
because you're only doing read operations inside a reader transaction, and 
nothing touched inside the DB needs to stay locked for long. If you're doing 
these searches during a write operation then you're going to accumulate huge 
numbers of locks that must be held until the write transaction commits.

>> A third - but unrealistic - solution would be to don't detect cycle, and
>> process the search until we reach the time or size limit (in other words,
>> it's up to the admin to avoid the creation of such cycle; Highly
>> dangerous…).
>
> Agreed - really dangerous.
>
>> The problem with the first approach is that we can't anymore pass the VSLDAP
>> tests. It's a major burden. Also most of the current servers support this
>> feature.
>
> Is there a VSLDAP test that allows for alias cycle creation? If so we
> should be able to bring this up with the Open Group. This is
> definitely a gray area in the protocol but it makes little sense to
> create alias cycles. Alias chaining on the other hand is a different
> story.

Since alias dereferencing is not implicit, it makes no sense to prohibit 
creation of alias cycles. I.e., they're otherwise just plain LDAP entries and 
if they still obey the schema then you don't have much justification for 
rejecting them.

> So let me ask once again since I know little about the VSLDAP tests:
> do they allow alias chaining or alias loops? The two would be
> different.
>
> Alex
>

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: Alias cycle detection

Posted by Emmanuel Lécharny <el...@apache.org>.

On 6/15/11 2:25 AM, Alex Karasulu wrote:
> On Mon, Jun 13, 2011 at 5:39 PM, Emmanuel Lecharny<el...@gmail.com>  wrote:
>> Alias cycle detection
>> ---------------------
>>
>> There is an unsolved question about how we should detect Alias cycles. Right
>> now, we check for cycles *before* they are created. The alternative would be
>> to stop any search that could lead to an infinite loop.
> That would slow down reads. The best is to stop this from happening
> with write operations: meaning doing the computation to detect and
> prevent the cycle then and there instead of exhausting the search
> process to deal with such wicked constructs.

If we do that, then we won't be able to pass VSLDAP tests. And the 
slowdown will be *very* minimal : we just need an additional filter 
which checks if the read entry is an alias (if not, we just do nothing), 
and if it's an alias, we put its DN in a Set of DNs in the 
SearchRequestContect.
>
>> The problem with the first approach is that we can't anymore pass the VSLDAP
>> tests. It's a major burden. Also most of the current servers support this
>> feature.
> Is there a VSLDAP test that allows for alias cycle creation? If so we
> should be able to bring this up with the Open Group. This is
> definitely a gray area in the protocol but it makes little sense to
> create alias cycles. Alias chaining on the other hand is a different
> story.
Alias creation is not the server business. The user (or admin) is 
responsible for such mistakes. Sadly, it *will* happen, and we must 
forbid this. Right now, we detect a cycle the hard way, using indexes, 
and I'm pretty sure that in complex cases where the cycle is far away 
from the base search, it might take a while to detect it.

True, this is a gray area, because nowhere in the RFCs it is said that 
Alias cycle are forbidden, and we must protect the server from such 
behavior.
> So let me ask once again since I know little about the VSLDAP tests:
> do they allow alias chaining or alias loops? The two would be
> different.

Alias chaining is definitively accepted.

Now, we also have some JIRAs that are a direct consequence of what we 
have coded in the server atm :
https://issues.apache.org/jira/browse/DIRSERVER-803
And this is a VSLDAP Standard compliance test.

I will continue my experiments on the Filter approach, it's almost done 
(I was surprised how easy it was to implement a cycle detection this 
way), then we can compare this algorithm with another where we forbid 
the cycle creation when adding a new alias.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Alias cycle detection

Posted by Alex Karasulu <ak...@apache.org>.

On Mon, Jun 13, 2011 at 5:39 PM, Emmanuel Lecharny <el...@gmail.com> wrote:
> Alias cycle detection
> ---------------------
>
> There is an unsolved question about how we should detect Alias cycles. Right
> now, we check for cycles *before* they are created. The alternative would be
> to stop any search that could lead to an infinite loop.

That would slow down reads. The best is to stop this from happening
with write operations: meaning doing the computation to detect and
prevent the cycle then and there instead of exhausting the search
process to deal with such wicked constructs.

> A third - but unrealistic - solution would be to don't detect cycle, and
> process the search until we reach the time or size limit (in other words,
> it's up to the admin to avoid the creation of such cycle; Highly
> dangerous…).

Agreed - really dangerous.

> The problem with the first approach is that we can't anymore pass the VSLDAP
> tests. It's a major burden. Also most of the current servers support this
> feature.

Is there a VSLDAP test that allows for alias cycle creation? If so we
should be able to bring this up with the Open Group. This is
definitely a gray area in the protocol but it makes little sense to
create alias cycles. Alias chaining on the other hand is a different
story.

So let me ask once again since I know little about the VSLDAP tests:
do they allow alias chaining or alias loops? The two would be
different.

Alex