Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@apache.org> on 2005/06/19 10:42:12 UTC

Re: [apacheds] Idea on refactoring Database, ContextPartition, and RootNexus

Hi Trustin,

> I thought over simplifying Database and ContextPartition, and here's my idea:
> 
> 1) Do we need that many modification operations (add, delete, move
> ...) in Database?
> 
> No, we don't need that many; all we need is:
> 
> * Put (used to add and replace entries)
> * Remove (used to delete entries)

I suggest that you keep an operation to change entries. It may be much
faster to do a Change than to do a Remove/Put. Even a Put in place of a
Change can be costly, because every element of the entry has to be
rewritten. Obviously, it depends on the underlying database!


> We should be able to combine put and remove operations to move or rename
> entries. Rename and move operations are therefore split into several
> small operations, so we need a transaction here:
> 
> Database db = ...;
> Transaction tx = db.beginTransaction();
> tx.delete( entryWithOldName );
> tx.put( entryWithNewName );
> tx.commit();

Previous comment aside, I really like the Tx stuff. This is
something that many LDAP implementations don't have.
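
A minimal Java sketch of the rename-as-delete-plus-put idea inside a
transaction. Only the beginTransaction/delete/put/commit call shape comes
from the snippet quoted above. TxDatabase, its get and rename methods, and
the copy-on-begin snapshot are assumptions made for illustration, not the
ApacheDS API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store, for illustration only (not the ApacheDS API).
final class TxDatabase {
    private final Map<String, String> entries = new HashMap<>();

    /** Works on a private copy; the store changes only when commit() runs. */
    final class Transaction {
        private final Map<String, String> working = new HashMap<>(entries);
        private boolean finished;

        String get(String dn) { check(); return working.get(dn); }

        void put(String dn, String entry) { check(); working.put(dn, entry); }

        void delete(String dn) { check(); working.remove(dn); }

        void commit() {
            check();
            synchronized (TxDatabase.this) {
                // Naive "last commit wins": replace the store with the working copy.
                entries.clear();
                entries.putAll(working);
            }
            finished = true;
        }

        void rollback() { finished = true; } // simply discards the working copy

        private void check() {
            if (finished) throw new IllegalStateException("transaction already finished");
        }
    }

    synchronized Transaction beginTransaction() { return new Transaction(); }

    synchronized String get(String dn) { return entries.get(dn); }

    /** Rename = delete under the old name + put under the new name, in one transaction. */
    void rename(String oldDn, String newDn) {
        Transaction tx = beginTransaction();
        try {
            String entry = tx.get(oldDn);
            if (entry == null) throw new IllegalArgumentException("no such entry: " + oldDn);
            tx.delete(oldDn);
            tx.put(newDn, entry);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }
}

final class RenameDemo {
    public static void main(String[] args) {
        TxDatabase db = new TxDatabase();
        TxDatabase.Transaction tx = db.beginTransaction();
        tx.put("cn=old,ou=users", "entry data");
        tx.commit();
        db.rename("cn=old,ou=users", "cn=new,ou=users");
        System.out.println(db.get("cn=new,ou=users"));
    }
}
```

Because nothing reaches the store before commit(), a failure between the
delete and the put leaves the entry under its old name.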

Emmanuel Lécharny

Re: [apacheds] Idea on refactoring Database, ContextPartition, and RootNexus

Posted by Trustin Lee <tr...@gmail.com>.

Hi Emmanuel,

2005/6/19, Emmanuel Lecharny <el...@apache.org>:
> Not so sure. You have two basic choices here:
> - use a specific indexed Database (BDB, or whatever fits your needs)
> - use a classic relational Database (Or*cle(TM), PostgreSQL,
> MySQL, ...)
> 
> I bet that if you are to use ApacheDS in production, you will favor
> reliability over performance. This will be the choice of many clients,
> I assure you! Not that performance is unimportant, but clients hate
> to lose data, strangely enough ;)
> 
> Or*cle(TM) has implemented a new hierarchical engine in V10, which
> greatly improves performance for data that has this kind of structure.
> 
> Anyway, I really buy your idea of having the simplest possible
> interface, because it will help to implement any kind of backend
> Database.

Thanks for the information. I have now moved most BTree-specific
classes to the btree package, so it will be clearer how to implement those
context partitions.

> There is also a point that I want to raise: a Modify request does not
> give you all the attributes of an entry, as you well know. So a Put
> will be quite complicated to implement, as you won't be able to deal
> with deleted attributes without fetching the full entry first. So the
> sequence will be:
> - fetch the entry from the Database
> - modify the entry in memory
> - update the entry into the database
> 
> It will be very costly.
> 
> I also bet that we won't have ten thousand Databases to deal with, so
> even if the proposed interface is a little bit more complicated than
> just a Put and a Delete, it's not a big deal.

Right, so I retained the current API.

> So your NotDeletingDatabase is still seen as a Database, am I wrong? If
> so, this is quite a smart idea. The end user should not need to know what's
> going on in the kitchen; he just wants to have good food on his plate!

Yes, exactly, but I found it more powerful to make interceptors look
more like ContextPartitions, to reduce the impedance mismatch between
Interceptor and ContextPartition.  I'll show how I can make this
happen today or tomorrow.

> Much more important: just add transactions, because they are a must. This is
> something I always had problems with in OpenLDAP, because updates are
> dangerous in a running environment. If you change a piece of information, you
> can't guarantee that nobody is already using it. The problem is that
> you have to extend the transaction mechanism to reads, because you want
> a reader to have a fresh entry, not something that is currently being
> modified. (Updating an entry can be costly if you store
> images, so this is a scenario which could occur)

Right, I'm trying to find out the best way to add transaction support now.

> > 5) add AbstractDatabase class that helps users to implement databases
> > easily.  (e.g. two modify operations are delegated to one modify
> > method after some normalization)
> 
> +1, but can you elaborate a little on what you mean by 'two modify
> operations are delegated to one modify method after some
> normalization'?

I got that wrong, so please just ignore it. :)

> > 6) provide standardized initialization method like 'open' instead of
> > constructors like we did for ContextPartition so that users can
> > instantiate Database and DatabaseContextPartition in configuration
> > phase.
> +1

I've done this in the branch. :)

> really cool, and we need it actually, we will have 30° today here in
> Paris ;)

It's also hot here in South Korea.  I'd like to drink a Chi Chi cocktail! :)

Trustin
-- 
what we call human nature is actually human habit
--
http://gleamynode.net/

Re: [apacheds] Idea on refactoring Database, ContextPartition, and RootNexus

Posted by Emmanuel Lecharny <el...@apache.org>.

Hi Trustin,

> If you're going to change an entry, a put operation will be sufficient.
> But if the underlying database implementation stores attributes sparsely,
> it can be insufficient, of course.  I guess that kind of implementation
> will be rare and can be improved by using a cache IMHO, so the problem
> here is possibly moving or renaming.

Not so sure. You have two basic choices here:
 - use a specific indexed Database (BDB, or whatever fits your needs)
 - use a classic relational Database (Or*cle(TM), PostgreSQL,
MySQL, ...)

I bet that if you are to use ApacheDS in production, you will favor
reliability over performance. This will be the choice of many clients,
I assure you! Not that performance is unimportant, but clients hate
to lose data, strangely enough ;)

Or*cle(TM) has implemented a new hierarchical engine in V10, which
greatly improves performance for data that has this kind of structure.

Anyway, I really buy your idea of having the simplest possible
interface, because it will help to implement any kind of backend
Database.

There is also a point that I want to raise: a Modify request does not
give you all the attributes of an entry, as you well know. So a Put
will be quite complicated to implement, as you won't be able to deal
with deleted attributes without fetching the full entry first. So the
sequence will be:
 - fetch the entry from the Database
 - modify the entry in memory
 - update the entry into the database

It will be very costly. 
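
The three steps above can be sketched in Java as follows. This is only an
illustration under stated assumptions: PutOnlyStore, its get/put methods,
removeAttribute, and the read/write counters are hypothetical names, and an
entry is modelled as a plain map from attribute name to values.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical store exposing only get and put (not the ApacheDS API).
final class PutOnlyStore {
    private final Map<String, Map<String, Set<String>>> entries = new HashMap<>();
    int reads;   // number of full-entry fetches
    int writes;  // number of full-entry writes

    Map<String, Set<String>> get(String dn) {
        reads++;
        Map<String, Set<String>> stored = entries.get(dn);
        return stored == null ? null : deepCopy(stored);
    }

    void put(String dn, Map<String, Set<String>> entry) {
        writes++;
        entries.put(dn, deepCopy(entry));
    }

    /** A Modify that deletes one attribute, emulated with get + put. */
    void removeAttribute(String dn, String attribute) {
        Map<String, Set<String>> entry = get(dn);   // 1. fetch the full entry
        if (entry == null) throw new IllegalArgumentException("no such entry: " + dn);
        entry.remove(attribute);                    // 2. modify the entry in memory
        put(dn, entry);                             // 3. write the whole entry back
    }

    private static Map<String, Set<String>> deepCopy(Map<String, Set<String>> source) {
        Map<String, Set<String>> copy = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : source.entrySet()) {
            copy.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        return copy;
    }
}

final class ModifyCostDemo {
    public static void main(String[] args) {
        PutOnlyStore store = new PutOnlyStore();
        Map<String, Set<String>> entry = new HashMap<>();
        entry.put("cn", new HashSet<>(Collections.singleton("Jane")));
        entry.put("jpegPhoto", new HashSet<>(Collections.singleton("<large image>")));
        store.put("cn=Jane,ou=users", entry);
        store.removeAttribute("cn=Jane,ou=users", "jpegPhoto");
        System.out.println(store.get("cn=Jane,ou=users").keySet());
        System.out.println("reads=" + store.reads + " writes=" + store.writes);
    }
}
```

Dropping a single attribute costs one full read and one full write of the
entry here, which is the overhead a dedicated modify operation could avoid
on a backend that supports it.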

I also bet that we won't have ten thousand Databases to deal with, so
even if the proposed interface is a little bit more complicated than
just a Put and a Delete, it's not a big deal.

> 
> The reason I proposed these simple operations was actually to
> create an interceptor that suppresses the delete operation and marks the
> target entry as 'deleted' instead of really deleting it.

I guess that you'll use it for Mitosis?

> I was able
> to simplify the implementation of that filter this way.  But
> now I found that we can implement NotDeletingDatabase, which wraps an
> existing Database to intercept delete, move or modifyRN operations and
> fulfills my aim while, of course, retaining the current Database interface.

So your NotDeletingDatabase is still seen as a Database, am I wrong? If
so, this is quite a smart idea. The end user should not need to know what's
going on in the kitchen; he just wants to have good food on his plate!


> So.. here's my more refined idea:
> 
> 1) create a new package 'partition' and place all partition-related
> stuff there for better package layout.
> 2) merge AbstractContextPartition and ApplicationContextPartition to
> DatabaseContextPartition.
> 3) move db package into 'partition' package. (i.e. partition.database)
> and put DatabaseContextPartition to 'database' package for better
> package layout.
> 4) retain the interface of current Database class, and add transaction
> support to make all operations atomic.

Much more important: just add transactions, because they are a must. This is
something I always had problems with in OpenLDAP, because updates are
dangerous in a running environment. If you change a piece of information, you
can't guarantee that nobody is already using it. The problem is that
you have to extend the transaction mechanism to reads, because you want
a reader to have a fresh entry, not something that is currently being
modified. (Updating an entry can be costly if you store
images, so this is a scenario which could occur)
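
One possible way to keep readers away from a half-updated entry is a
read/write lock, sketched below in Java. This is an assumption chosen for
illustration, not how ApacheDS handles it, and it is weaker than a real
transaction (no rollback, no multi-entry atomicity). GuardedStore and its
update/read methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical store guarding entries with a read/write lock (illustration only).
final class GuardedStore {
    private final Map<String, Map<String, String>> entries = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Applies all attribute changes while holding the write lock. */
    void update(String dn, Map<String, String> changes) {
        lock.writeLock().lock();
        try {
            // Readers are blocked until every attribute has been written.
            entries.computeIfAbsent(dn, key -> new HashMap<>()).putAll(changes);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns a copy of the entry as it was before or after an update, never in between. */
    Map<String, String> read(String dn) {
        lock.readLock().lock();
        try {
            Map<String, String> entry = entries.get(dn);
            return entry == null ? null : new HashMap<>(entry);
        } finally {
            lock.readLock().unlock();
        }
    }
}

final class ReadIsolationDemo {
    public static void main(String[] args) {
        GuardedStore store = new GuardedStore();
        Map<String, String> changes = new HashMap<>();
        changes.put("cn", "Jane");
        changes.put("jpegPhoto", "<large image>");
        store.update("cn=Jane,ou=users", changes);
        System.out.println(store.read("cn=Jane,ou=users").get("cn"));
    }
}
```

Several readers can hold the read lock at once, so only a slow update
(a large image, for instance) makes them wait.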

> 5) add AbstractDatabase class that helps users to implement databases
> easily.  (e.g. two modify operations are delegated to one modify
> method after some normalization)

+1, but can you elaborate a little on what you mean by 'two modify
operations are delegated to one modify method after some
normalization'?

> 6) provide standardized initialization method like 'open' instead of
> constructors like we did for ContextPartition so that users can
> instantiate Database and DatabaseContextPartition in configuration
> phase.
+1

> 
> I must admit that my first idea was a kind of expression of
> stupidity.  

-1. You are not stupid at all!

> > Previous comment aside, I really like the Tx stuff. This is
> > something that many LDAP implementations don't have.
> 
> Yeah, we'll be able to put quite useful metadata into Database. Cool? :)

really cool, and we need it actually, we will have 30° today here in
Paris ;)

Emmanuel Lécharny

Re: [apacheds] Idea on refactoring Database, ContextPartition, and RootNexus

Posted by Trustin Lee <tr...@gmail.com>.

Hi Emmanuel,

2005/6/19, Emmanuel Lecharny <el...@apache.org>:
> > * Put (used to add and replace entries)
> > * Remove (used to delete entries)
> 
> I suggest that you keep an operation to change entries. It may be much
> faster to do a Change than to do a Remove/Put. Even a Put in place of a
> Change can be costly, because every element of the entry has to be
> rewritten. Obviously, it depends on the underlying database!

If you're going to change an entry, a put operation will be sufficient.
But if the underlying database implementation stores attributes sparsely,
it can be insufficient, of course.  I guess that kind of implementation
will be rare and can be improved by using a cache IMHO, so the problem
here is possibly moving or renaming.

The reason I proposed these simple operations was actually to
create an interceptor that suppresses the delete operation and marks the
target entry as 'deleted' instead of really deleting it.  I was able
to simplify the implementation of that filter this way.  But
now I found that we can implement NotDeletingDatabase, which wraps an
existing Database to intercept delete, move or modifyRN operations and
fulfills my aim while, of course, retaining the current Database interface.
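
A rough Java sketch of such a wrapper, covering only the delete case. The
Database interface shown here is a heavily reduced, hypothetical stand-in
(write, lookup, delete) rather than the real one, and the entryDeleted
marker attribute is an assumed name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, reduced interface; the real Database has more operations.
interface Database {
    void write(String dn, Map<String, String> entry); // adds or overwrites an entry
    Map<String, String> lookup(String dn);            // returns null if absent
    void delete(String dn);
}

/** Plain in-memory implementation that really deletes entries. */
final class MapDatabase implements Database {
    private final Map<String, Map<String, String>> entries = new HashMap<>();

    public void write(String dn, Map<String, String> entry) {
        entries.put(dn, new HashMap<>(entry));
    }

    public Map<String, String> lookup(String dn) {
        Map<String, String> entry = entries.get(dn);
        return entry == null ? null : new HashMap<>(entry);
    }

    public void delete(String dn) { entries.remove(dn); }
}

/** Wraps another Database and turns delete into "mark as deleted". */
final class NotDeletingDatabase implements Database {
    static final String DELETED = "entryDeleted"; // assumed marker attribute
    private final Database delegate;

    NotDeletingDatabase(Database delegate) { this.delegate = delegate; }

    public void write(String dn, Map<String, String> entry) { delegate.write(dn, entry); }

    public Map<String, String> lookup(String dn) {
        Map<String, String> entry = delegate.lookup(dn);
        // Entries carrying the marker are hidden from callers.
        return entry == null || "true".equals(entry.get(DELETED)) ? null : entry;
    }

    public void delete(String dn) {
        Map<String, String> entry = delegate.lookup(dn);
        if (entry == null) return;
        entry.put(DELETED, "true");
        delegate.write(dn, entry); // keep the entry, flagged as deleted
    }
}

final class NotDeletingDemo {
    public static void main(String[] args) {
        Database raw = new MapDatabase();
        Database db = new NotDeletingDatabase(raw);
        Map<String, String> entry = new HashMap<>();
        entry.put("cn", "Jane");
        db.write("cn=Jane,ou=users", entry);
        db.delete("cn=Jane,ou=users");
        System.out.println(db.lookup("cn=Jane,ou=users"));  // hidden from callers
        System.out.println(raw.lookup("cn=Jane,ou=users")); // still stored, with marker
    }
}
```

Callers keep programming against Database and cannot tell whether the
wrapper is in place; move and modifyRN would need the same treatment.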

But my idea on DatabaseContextPartition is still valid.  Users who
want to implement a partition such as an LDAP proxy will have to implement
ContextPartition directly, but users who want to use ApacheDS's
Database support will have to extend DatabaseContextPartition.   For
now, AbstractContextPartition and ApplicationContextPartition seem to
do what DatabaseContextPartition should do.

So.. here's my more refined idea:

1) create a new package 'partition' and place all partition-related
stuff there for better package layout.
2) merge AbstractContextPartition and ApplicationContextPartition to
DatabaseContextPartition.
3) move db package into 'partition' package. (i.e. partition.database)
and put DatabaseContextPartition to 'database' package for better
package layout.
4) retain the interface of current Database class, and add transaction
support to make all operations atomic.
5) add AbstractDatabase class that helps users to implement databases
easily.  (e.g. two modify operations are delegated to one modify
method after some normalization)
6) provide standardized initialization method like 'open' instead of
constructors like we did for ContextPartition so that users can
instantiate Database and DatabaseContextPartition in configuration
phase.
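
Point 6 could look roughly like the Java sketch below: a cheap no-argument
constructor, setters used during the configuration phase, and an explicit
open() that acquires resources. ConfigurableDatabase, setWorkingDirectory,
isOpen and close are hypothetical names chosen for this illustration; only
the idea of an 'open' method comes from the list above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lifecycle sketch (not the ApacheDS API).
final class ConfigurableDatabase {
    private String workingDirectory;     // set during the configuration phase
    private Map<String, String> entries; // allocated only by open()

    ConfigurableDatabase() { }           // cheap: no I/O, no resources

    void setWorkingDirectory(String workingDirectory) {
        if (isOpen()) throw new IllegalStateException("cannot reconfigure an open database");
        this.workingDirectory = workingDirectory;
    }

    /** Acquires resources; a real backend would open its files here. */
    void open() {
        if (workingDirectory == null) throw new IllegalStateException("workingDirectory not set");
        if (!isOpen()) entries = new HashMap<>();
    }

    boolean isOpen() { return entries != null; }

    void close() { entries = null; }

    void put(String dn, String entry) { requireOpen(); entries.put(dn, entry); }

    String get(String dn) { requireOpen(); return entries.get(dn); }

    private void requireOpen() {
        if (!isOpen()) throw new IllegalStateException("database is not open");
    }
}

final class OpenDemo {
    public static void main(String[] args) {
        ConfigurableDatabase db = new ConfigurableDatabase(); // instantiated by configuration
        db.setWorkingDirectory("/tmp/example-partition");      // configured
        db.open();                                             // started later by the server
        db.put("cn=Jane,ou=users", "entry data");
        System.out.println(db.get("cn=Jane,ou=users"));
        db.close();
    }
}
```

Separating construction from open() lets a configuration mechanism create
and wire the object without touching the disk.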

I must admit that my first idea was a kind of expression of
stupidity.  But this discussion led me to find better
solutions.  I think I'll be satisfied with these 6 improvements.

> Previous comment aside, I really like the Tx stuff. This is
> something that many LDAP implementations don't have.

Yeah, we'll be able to put quite useful metadata into Database. Cool? :)

Thanks for everyone's feedback!

Trustin
-- 
what we call human nature is actually human habit
--
http://gleamynode.net/