Posted to users@directory.apache.org by Marcel Bruch <ma...@codetrails.com> on 2014/03/03 20:46:53 UTC

[ApacheDS] Using DS in a many-clients-one-master setup over the internet?

Hi ds-users,

I’m currently evaluating an idea for which Apache DS sounds like at least a partial fit. However, I’m not sure, so I’m seeking some advice. Without details on the exact requirements and use case, it may sound weird.

We have highly structured, hierarchical data (basically a knowledge base several GB in size) that is stored on a server and updated from time to time.

In the (far) future there *might* be 10,000 to 100,000 clients somewhere on the web that need to access parts of that data. Currently there are a few hundred clients.

These clients should be able to replicate some small parts of that hierarchical data (according to some access rights) to speed up their data access and work in some "offline mode" if required. These slaves should be updated from time to time with data from the master server.


My first question is: Is LDAP in general a suitable protocol for these requirements, and is Apache DS an appropriate server for such a master-slave scenario with slaves all over the internet? The slaves would run as embedded clients inside a Java application on a desktop PC.

My second question would be: Do firewalls typically allow connections to LDAP or LDAPS ports? If not, is there any way to run replication over something that firewalls usually permit?


Thanks in advance,
Marcel


Re: [ApacheDS] Using DS in a many-clients-one-master setup over the internet?

Posted by Marcel Bruch <ma...@codetrails.com>.
Thanks for your answers. From your comments it looks like we should start modeling the data structures to see where we get with this. The ability to sync LDAP servers via DSML/JSON over HTTP was a key piece of information for me. We’ll give it a spin and evaluate Apache DS.

Thanks,
Marcel

On 04.03.2014 at 03:51, Kiran Ayyagari <ka...@apache.org> wrote:



Re: [ApacheDS] Using DS in a many-clients-one-master setup over the internet?

Posted by Kiran Ayyagari <ka...@apache.org>.
(Parts of the reply are top-posted because I wanted to keep Emmanuel's reply
in context as well.)
On Tue, Mar 4, 2014 at 2:20 AM, Emmanuel Lécharny <el...@gmail.com> wrote:

> On 3/3/14 8:46 PM, Marcel Bruch wrote:
> > Hi ds-users,
> >
> > I'm currently evaluating an idea to which using Apache DS partially
> sounds like a good fit. However, I'm not sure and I'm seeking some advice.
> Without detailing on the exact requirements and use case it may sound weird.
> >
> > We have highly structured and hierarchical data (basically a several GB
> huge knowledge-base) that is stored on a server and updated from time to
> time.
> >
> > In a (far) future there *might* be 10.000 up to 100.000 clients
> somewhere on the web that need to access parts of that data. Currently
> there are a few hundred clients.
> >
> > These clients should be able to replicate some small parts of that
> hierarchical data (according to some access rights) to speed up their data
> access and work in some "offline mode" if required. These slaves
>
Using ApacheDS, you can perfectly well replicate only parts of the tree (i.e., the DIT).
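
To make "parts of the tree" concrete, here is a minimal sketch of selecting the entries under one branch by DN. It uses only the JDK's javax.naming.ldap.LdapName and is not how ApacheDS itself scopes replication; the class name SubtreeSelectionSketch and the example DNs are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;

// Sketch only: picks the entries that live under one branch of the DIT.
// The class name and the example DNs are hypothetical.
class SubtreeSelectionSketch {

    // Parse a DN string, turning the checked exception into an unchecked one.
    static LdapName parse(String dn) {
        try {
            return new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    // Keep only the DNs that are equal to baseDn or located below it.
    static List<String> underBase(List<String> entryDns, String baseDn) {
        LdapName base = parse(baseDn);
        List<String> selected = new ArrayList<>();
        for (String dn : entryDns) {
            // LdapName indexes components from the right, so the base DN is a "prefix".
            if (parse(dn).startsWith(base)) {
                selected.add(dn);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> all = List.of(
                "cn=topic1,ou=public,dc=example,dc=com",
                "cn=topic2,ou=restricted,dc=example,dc=com",
                "ou=public,dc=example,dc=com");
        System.out.println(underBase(all, "ou=public,dc=example,dc=com"));
    }
}
```

In a real deployment the branch a client may replicate would be decided by the server's access controls, not by client-side filtering like this.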

> should be updated from time to time with data from the master server.
>
The best configuration is to use "Refresh-Only" mode from each slave that
connects to the master. (This avoids keeping 100k connections to the master
alive constantly, and your application can let the client choose the interval
at which it polls for updates.)
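
The polling idea can be illustrated with a toy model: the master appends every modification to an ordered log, and each slave remembers a cookie marking how far it has read. This is a sketch under stated assumptions, not the ApacheDS or syncrepl API; the names RefreshOnlySketch, Provider, Consumer and Batch are hypothetical, and the real protocol uses an opaque cookie, not a plain list index.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of refresh-only polling. Not the ApacheDS API: all names are hypothetical.
class RefreshOnlySketch {

    // What the provider returns for one poll: the pending changes plus a new cookie.
    static class Batch {
        final List<String> changes;
        final int cookie;

        Batch(List<String> changes, int cookie) {
            this.changes = changes;
            this.cookie = cookie;
        }
    }

    // The master: appends every modification to an ordered change log.
    static class Provider {
        private final List<String> log = new ArrayList<>();

        void modify(String change) {
            log.add(change);
        }

        // Return everything recorded after the position the cookie points to.
        Batch changesSince(int cookie) {
            List<String> pending = new ArrayList<>(log.subList(cookie, log.size()));
            return new Batch(pending, log.size());
        }
    }

    // A slave: connects, pulls what it missed, remembers the cookie, disconnects.
    static class Consumer {
        private final List<String> applied = new ArrayList<>();
        private int cookie = 0;

        // One poll; returns how many changes were fetched.
        int poll(Provider provider) {
            Batch batch = provider.changesSince(cookie);
            applied.addAll(batch.changes);
            cookie = batch.cookie;
            return batch.changes.size();
        }

        List<String> applied() {
            return applied;
        }
    }

    public static void main(String[] args) {
        Provider master = new Provider();
        Consumer slave = new Consumer();

        master.modify("add cn=a");
        master.modify("add cn=b");
        System.out.println(slave.poll(master)); // first poll fetches both changes

        master.modify("delete cn=a");
        System.out.println(slave.poll(master)); // next poll fetches only the new one
        System.out.println(slave.poll(master)); // nothing new since the last cookie
    }
}
```

Because the slave only holds a connection for the duration of a poll, the master never has to keep one open per client.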

> >
> >
> > My first question is: Is LDAP in general a suitable protocol for these
> requirements
>
Let me emphasize it: YES. The part that stands out here is your
requirement to replicate parts of the tree and to use access controls
to prevent the rest from replicating.

> Yes. Definitely yes. For the record, this is what Microsoft is doing
> with Active Directory, where everyone can log in on his/her machine even
> if it's not connected to the domain server.
>
>

> > and is Apache DS an appropriate server when it comes to such
> master-slave scenario with slaves all over the internet?
>
And again YES. The best part of ApacheDS is the ability to embed it (and I see
that you are already using this in your slave nodes), a double-edged sword, if
I may say so ;). If needed, you don't have to expose the LDAP port and protocol
data at all: all of this can be done over HTTP/S using the DSML or JSON format.

>
> Assuming you don't have a lot of modifications, most certainly. And if
>
Let me add a bit here: the only concern is bandwidth, and only *iff* you
have very many modifications. It is not a reliability concern: the master
ApacheDS keeps track of all the modifications and updates the slaves
whenever they connect.

> ApacheDS is not fast enough for your needs, you can even use OpenLDAP as
> a central server, with ApacheDS being distributed: they are using the
> same replication protocol, syncrepl.
>
> > The slaves would run as embedded clients inside a java application on a
> desktop pc.
>
Perfect.

> That's fine.
> >
> > My second question would be: Do firewalls typically allow connections to
> LDAP or LDAPS ports?
> This has to be configured. But if this becomes a problem, we have worked
> on a scenario where we use DSML instead of pure LDAP, thus allowing
> your application to use port 80. This is not part of the main server,
> though; it has to be added (and no, this is not complicated).
>
ApacheDS already comes with a built-in HTTP server, and all that is needed is
to deploy a DSML gateway app (and optionally turn off/protect the LDAP port).
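
As a rough illustration of what a client would send to such a gateway, the sketch below builds a DSMLv2 search request and wraps it in an HTTP POST using the JDK's java.net.http classes. The gateway URL, the base DN and the class name DsmlRequestSketch are assumptions, and nothing is sent over the network. It shows a plain read request only; how replication would be tunneled through the gateway is not specified in this thread.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds a DSMLv2 search request and wraps it in an HTTP POST.
// The gateway URL is hypothetical; nothing is sent over the network here.
class DsmlRequestSketch {

    static final String DSML_NS = "urn:oasis:names:tc:DSML:2:0:core";

    // Escape characters that are not allowed verbatim in an XML attribute value.
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    // A subtree search for every entry below baseDn, expressed as a DSMLv2 batch.
    static String subtreeSearch(String baseDn) {
        return "<batchRequest xmlns=\"" + DSML_NS + "\">"
             + "<searchRequest dn=\"" + escapeAttr(baseDn) + "\""
             + " scope=\"wholeSubtree\" derefAliases=\"neverDerefAliases\">"
             + "<filter><present name=\"objectClass\"/></filter>"
             + "</searchRequest>"
             + "</batchRequest>";
    }

    // Wrap the DSML body in a POST to a (hypothetical) gateway endpoint.
    static HttpRequest toPost(String gatewayUrl, String dsmlBody) {
        return HttpRequest.newBuilder(URI.create(gatewayUrl))
                .header("Content-Type", "text/xml; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(dsmlBody))
                .build();
    }

    public static void main(String[] args) {
        String body = subtreeSearch("ou=kb,dc=example,dc=com");
        HttpRequest request = toPost("https://ldap.example.com/dsml", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Since the request travels as ordinary HTTPS, it passes through firewalls that block the LDAP and LDAPS ports.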

>
> > if not, is there any way to run replication over something that
> firewalls usually permit?
>
> Replication is pure LDAP. Using a DSML proxy should work, or some LDAP
> <-> JSON transport. I will let Kiran reply here.
>
I created such a translation layer in 2011 to make it easy for apps
to access LDAP data. Technically, however, it is no different from the
DSML format, and DSML is the ideal choice if you are planning to
replicate over HTTP.

Let me know; I will be happy to help.

> Hope it helps.
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>


-- 
Kiran Ayyagari
http://keydap.com

Re: [ApacheDS] Using DS in a many-clients-one-master setup over the internet?

Posted by Emmanuel Lécharny <el...@gmail.com>.
On 3/3/14 8:46 PM, Marcel Bruch wrote:
> Hi ds-users,
>
> I’m currently evaluating an idea for which Apache DS sounds like at least a partial fit. However, I’m not sure, so I’m seeking some advice. Without details on the exact requirements and use case, it may sound weird.
>
> We have highly structured, hierarchical data (basically a knowledge base several GB in size) that is stored on a server and updated from time to time.
>
> In the (far) future there *might* be 10,000 to 100,000 clients somewhere on the web that need to access parts of that data. Currently there are a few hundred clients.
>
> These clients should be able to replicate some small parts of that hierarchical data (according to some access rights) to speed up their data access and work in some "offline mode" if required. These slaves should be updated from time to time with data from the master server.
>
>
> My first question is: Is LDAP in general a suitable protocol for these requirements 

Yes. Definitely yes. For the record, this is what Microsoft is doing
with Active Directory, where everyone can log in on his/her machine even
if it's not connected to the domain server.


> and is Apache DS an appropriate server for such a master-slave scenario with slaves all over the internet?

Assuming you don't have a lot of modifications, most certainly. And if
ApacheDS is not fast enough for your needs, you can even use OpenLDAP as
a central server, with ApacheDS being distributed: they are using the
same replication protocol, syncrepl.

> The slaves would run as embedded clients inside a Java application on a desktop PC.

That's fine.
>
> My second question would be: Do firewalls typically allow connections to LDAP or LDAPS ports? 
This has to be configured. But if this becomes a problem, we have worked
on a scenario where we use DSML instead of pure LDAP, thus allowing
your application to use port 80. This is not part of the main server,
though; it has to be added (and no, this is not complicated).


> If not, is there any way to run replication over something that firewalls usually permit?

Replication is pure LDAP. Using a DSML proxy should work, or some LDAP
<-> JSON transport. I will let Kiran reply here.

Hope it helps.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com