Posted to dev@directory.apache.org by Berin Loritsch <bl...@d-haven.org> on 2004/11/24 20:08:38 UTC

[RT] SEDA Package: Rework Proposal (long, sorry)

My initial reaction to what is currently the SEDA package is that doing
what I want with it would result in a hack. I'm not opposed to doing
that for the short term, but I do want to look at a sustainable future.
While the networking part isn't the bread and butter of Eve and
Kerberos, it does play an important part. Things like firewalling
(dynamic and static), load shedding (for protocols that allow that sort
of thing), and load balancing are all network layer functions.

After talking to Alex a bit, it seems that he, Trustin, and I are all on
the same page that SEDA needs a bit of rework. The question, of course,
is how to do it in a sane fashion. That obviously means that we work on
a branch and keep the one interface that everything else is defining and
using. Beyond that, it is a question of what would make the best
approach.

The purpose of this email is to get some dialog going to come up with
the best approach overall. I am stating my thoughts as a starting point,
and am not married to these thoughts.

Alex put together what we have now with the only intention that it be
quick and dirty and be something that could get the project going. The
SEDA project fills that need right now.

I understand Trustin comes to the table with the Netty2 library, of
which I must admit ignorance. He also has been doing most of the work
getting SEDA to work the way it does now.

Now, I come to the table with an existing, well tested library that is
used by several projects including Excalibur, some more stuff at
D-Haven, and several commercial projects. That is the D-Haven Event
library.

What I would like to propose is that, as much as possible, we avoid NIH
syndrome (Not Invented Here) and use stuff that we know works. I will be
the first to admit that the D-Haven Event library as it is now is not a
drop-in replacement for SEDA. It provides the plumbing work which can be
leveraged for the final product. If it makes sense to use Event and
Netty2 together under one umbrella then we should pursue that.

Nevertheless, here is my understanding of SEDA in a nutshell. As the
current project does right now, each stage listens for events, processes
them, and then sends new or processed events on to the next stage.
Everything is decoupled through Queues or Pipes. Typically each stage
will have at least two output pipes: the loopback pipe and the output
pipe. The loopback pipe is used to push events we can't deal with right
now out of the way until we can. The output pipe of course is used to
push events on to the next stage. With this basic structure, we can
really do wonders pushing events through the system using enqueue
predicates, multicast pipes, load balancing pipes, etc.
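
To make the loopback/output idea concrete, here is a minimal sketch of one
stage. Every name in it (LoopbackStage, runOnce, the "processed:" prefix) is
made up for illustration and is not taken from the current SEDA package,
Netty2, or D-Haven Event. It assumes the loopback pipe simply feeds the
stage's own input queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical names throughout; not the API of any library discussed here.
final class LoopbackStage {
    private final Queue<String> input = new ArrayDeque<>(); // loopback feeds this queue too
    private final Queue<String> output;                      // input queue of the next stage
    private final Predicate<String> ready;                   // can this event be handled now?

    LoopbackStage(Queue<String> output, Predicate<String> ready) {
        this.output = output;
        this.ready = ready;
    }

    void enqueue(String event) {
        input.add(event);
    }

    // Looks at each currently queued event exactly once.
    // Returns how many events were pushed back on the loopback pipe.
    int runOnce() {
        int deferred = 0;
        int pending = input.size();
        for (int i = 0; i < pending; i++) {
            String event = input.remove();
            if (ready.test(event)) {
                output.add("processed:" + event); // stand-in for real processing
            } else {
                input.add(event);                 // loopback: retry on a later pass
                deferred++;
            }
        }
        return deferred;
    }

    int pendingCount() {
        return input.size();
    }
}
```

A thread policy would call runOnce() repeatedly; events that are not ready
keep circulating without blocking the ones behind them.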

D-Haven Event handles the core part of moving things through a pipeline
quite well, and is pretty well tested. It comes to the table with the
following features:

* An event pipeline (pre-matched pairs of a set of sources and an event
  handler)
* A thread manager (using whatever policy you want for pushing events
  through each pipeline)
* A rate limiting enqueue predicate
* A multicast pipe
* Asynchronous command management (including periodically recurring
  commands)
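
For readers who have not met enqueue predicates or multicast pipes, the
following is a rough sketch of the two ideas. The interfaces and classes
(EnqueuePredicate, Sink, RateLimitingPredicate, ListSink, MulticastSink) are
hypothetical stand-ins written for this example; the real D-Haven Event types
may look quite different. The clock is injected only so the behaviour can be
checked deterministically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical interfaces for illustration only.
interface EnqueuePredicate<E> { boolean accept(E event); }
interface Sink<E> { boolean enqueue(E event); }

// Accepts at most maxPerWindow events in each window of windowMillis.
final class RateLimitingPredicate<E> implements EnqueuePredicate<E> {
    private final int maxPerWindow;
    private final long windowMillis;
    private final LongSupplier clock; // injected time source (assumption)
    private long windowStart;
    private int accepted;

    RateLimitingPredicate(int maxPerWindow, long windowMillis, LongSupplier clock) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    @Override
    public synchronized boolean accept(E event) {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) { // a new window begins
            windowStart = now;
            accepted = 0;
        }
        if (accepted >= maxPerWindow) {
            return false;                        // shed the load
        }
        accepted++;
        return true;
    }
}

// A sink guarded by a predicate: rejected events never reach the list.
final class ListSink<E> implements Sink<E> {
    final List<E> events = new ArrayList<>();
    private final EnqueuePredicate<E> predicate;

    ListSink(EnqueuePredicate<E> predicate) { this.predicate = predicate; }

    @Override
    public boolean enqueue(E event) {
        return predicate.accept(event) && events.add(event);
    }
}

// Offers every event to all downstream sinks.
final class MulticastSink<E> implements Sink<E> {
    private final List<Sink<E>> targets;

    MulticastSink(List<Sink<E>> targets) { this.targets = new ArrayList<>(targets); }

    @Override
    public boolean enqueue(E event) {
        boolean all = true;
        for (Sink<E> target : targets) {
            all &= target.enqueue(event); // every target is offered the event
        }
        return all;
    }
}
```

The point of the split is that admission policy lives at the pipe, so a stage
does not need to know whether it is being throttled or fanned out.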

Now, all this is just the skeleton of what makes a SEDA system go. The
real power is in what the stages do and how the pipelines and stages are
configured. That part is not done in the D-Haven Event library. The core
set of stages that I see we need are as follows:

1. ConnectionManager (this includes firewalling by dropping unallowable
   connections)
2. Reader (start reading bytes from the stream)
   * Router (1 pipe per protocol)
3. Decoder (use the decoder from the protocol handler)
4. RequestHandler (use the request handler from the protocol handler)
5. Encoder (use the encoder from the protocol handler)
6. Writer (start writing bytes to the stream)

The reader will route the ByteBuffers to the proper protocol (based on a
port mapping or something like that) which will then do all its
necessary steps for dealing with its own type of information, and write
the response ByteBuffers to the output stream.
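
A minimal sketch of the port-mapping router follows. PortRouter and
PortRouterDemo are hypothetical names, and each protocol is collapsed into a
single function standing in for its decode, handle, and encode stages. Ports
389 (LDAP) and 88 (Kerberos) are used only as familiar examples.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one "pipe" per protocol, selected by the local port.
final class PortRouter {
    private final Map<Integer, Function<ByteBuffer, ByteBuffer>> protocols = new HashMap<>();

    void register(int port, Function<ByteBuffer, ByteBuffer> protocol) {
        protocols.put(port, protocol);
    }

    // Returns the response buffer, or null when no protocol is bound to the port.
    ByteBuffer route(int localPort, ByteBuffer request) {
        Function<ByteBuffer, ByteBuffer> protocol = protocols.get(localPort);
        return protocol == null ? null : protocol.apply(request);
    }
}

final class PortRouterDemo {
    static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.US_ASCII));
    }

    static String text(ByteBuffer b) {
        return StandardCharsets.US_ASCII.decode(b).toString();
    }

    public static void main(String[] args) {
        PortRouter router = new PortRouter();
        // Stand-ins for full decoder -> request handler -> encoder chains.
        router.register(389, req -> bytes("ldap:" + text(req)));
        router.register(88, req -> bytes("krb:" + text(req)));

        System.out.println(text(router.route(389, bytes("bind"))));
        System.out.println(router.route(25, bytes("helo")) == null);
    }
}
```

In a real server the router would hand the buffer to the protocol's input
queue rather than call it directly; the direct call just keeps the sketch
short.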

Later on we can look at using load-balancing pipelines for each protocol
as necessary. That will allow us the flexibility to merely forward
requests to other machines in a DMZ if we need it.

I haven't looked at the details of the rest of the stuff in the SEDA
package to see how that would best be served in this architectural view,
but I am definitely open to suggestions.

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning."
- Rich Cook

Re: [RT] SEDA Package: Rework Proposal (long, sorry)

Posted by Berin Loritsch <bl...@d-haven.org>.

Alex Karasulu wrote:

<snip/>

> Let me also inject that I would like to see some statistics and user 
> feedback too.  The good thing about doing this release with the 
> current SEDA framework is that we get things out there for people to 
> complain about.  That gives us feedback and requirements.  I would 
> also like to bang requests against both servers to see where the 
> performance bottlenecks are so we can design with these in mind.  I 
> don't want to design the next best internet protocol server framework 
> without these metrics.  It makes me feel like I'm spinning my wheels.  
> I think we all agree with this.

Agreed.  Just something we can work with.

<snip/>

> Say Berin, do you have documentation on the event package at D-Haven 
> that is at 50K feet with some drill down?  I'd love to look at it.  
> Likewise I'd like to look at documentation on Netty2 and the Geronimo 
> Networking code.  I think we should take the best of all the worlds 
> here.  Plus I have some serious ACE research to do as well.  I think 
> Trustin has been doing the same till now.

I have some documentation here:

http://projects.d-haven.org/modules/sections/index.php?op=listarticles&secid=4

I have not yet gotten the build to create the Xdocs.

Essentially, you assemble the "big" pipeline by wiring together several 
"small" pipelines,
and registering each with the ThreadManager.  The ThreadManager will use 
whatever
thread policy you decide for pushing the events through the pipeline.  
Assembling the
pipeline is really not too hard.  The DefaultPipeline has an array of 
Source objects (usually
Queues) and one EventHandler.  The EventHandler is an object that does 
something with
those events.
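
A rough sketch of that wiring, under stated assumptions: SketchPipeline and
SingleThreadManager below are hypothetical stand-ins, not the real
DefaultPipeline or ThreadManager, whose signatures may differ. The thread
policy shown is the simplest one possible, where the calling thread drains
every registered pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical stand-in for a "small" pipeline: several sources, one handler.
final class SketchPipeline<E> {
    final List<Queue<E>> sources; // usually queues
    final Consumer<E> handler;    // the single event handler

    SketchPipeline(List<Queue<E>> sources, Consumer<E> handler) {
        this.sources = sources;
        this.handler = handler;
    }
}

// Hypothetical thread policy: the caller's thread services all pipelines.
final class SingleThreadManager {
    private final List<SketchPipeline<?>> pipelines = new ArrayList<>();

    void register(SketchPipeline<?> pipeline) {
        pipelines.add(pipeline);
    }

    // Keeps sweeping until a full pass handles nothing; returns events handled.
    int drain() {
        int total = 0;
        int handled;
        do {
            handled = 0;
            for (SketchPipeline<?> pipeline : pipelines) {
                handled += drainOne(pipeline);
            }
            total += handled;
        } while (handled > 0);
        return total;
    }

    private static <E> int drainOne(SketchPipeline<E> pipeline) {
        int handled = 0;
        for (Queue<E> source : pipeline.sources) {
            E event;
            while ((event = source.poll()) != null) {
                pipeline.handler.accept(event);
                handled++;
            }
        }
        return handled;
    }
}
```

The "big" pipeline appears when one handler enqueues into a queue that is a
source of another registered pipeline; swapping the manager changes the
threading without touching the handlers.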

The easiest solution to "hardwire" something together for what we have 
here is to create some EventHandlers that are handed the Sink objects 
they need to pass events on to the next stage.
This is where something simple is done.  We can add some variation to it 
to handle more
complex event routing as needed.  Adding the command subsystem as part 
of the pipeline,
and a routing stage (something that routes events based on type of 
event) will provide us
with a very flexible and easily tuned system.

>> Now, all this is just the skeleton of what makes a SEDA system go.
>> The real power is
>> in what the stages do and how the pipelines and stages are
>> configured.  That part is not
>> done in the D-Haven Event library.  The core set of stages that I see
>> we need are as follows:
>>
>> 1. ConnectionManager (this includes firewalling by dropping
>> unallowable connections)
>> 2. Reader (start reading bytes from the stream)
>>   * Router (1 pipe per protocol)
>> 3. Decoder (use the decoder from the protocol handler)
>> 4. RequestHandler (use the request handler from the protocol handler)
>> 5. Encoder (use the encoder from the protocol handler)
>>
>> 6. Writer (start writing bytes to the stream)
>>
> These are the exact same components in SEDA btw.

Right, but they are a bit too strongly typed IMO.  Keep in mind that as 
necessary we can
deal with non-reentrant protocol handler stages by providing a load 
balancing multiplexer/
demultiplexer.  IOW, being able to handle multiple requests at a time by 
providing a separate
pipeline per concurrency needed.  It would ensure that only one thread 
is operating on the
sensitive area at a time--but there are multiple instances of the set up 
making it easier to deal
with.
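
One way to picture the multiplexer/demultiplexer idea is sketched below. It
is only an illustration under assumptions: HandlerPool and CountingHandler
are invented names, round-robin stands in for whatever load-balancing policy
is chosen, and a single-threaded executor per handler instance stands in for
"a separate pipeline per concurrency needed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch: N private copies of a non-reentrant handler, each fed
// by its own single thread, so no instance is entered by two threads at once.
final class HandlerPool<I, O> implements AutoCloseable {
    private final List<Function<I, O>> handlers = new ArrayList<>();
    private final List<ExecutorService> lanes = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    HandlerPool(int concurrency, Supplier<Function<I, O>> handlerFactory) {
        for (int i = 0; i < concurrency; i++) {
            handlers.add(handlerFactory.get());             // one instance per lane
            lanes.add(Executors.newSingleThreadExecutor()); // one thread per instance
        }
    }

    // Demultiplex: round-robin the request onto one lane.
    Future<O> submit(I request) {
        int lane = Math.floorMod(next.getAndIncrement(), lanes.size());
        Function<I, O> handler = handlers.get(lane);
        Callable<O> task = () -> handler.apply(request);
        return lanes.get(lane).submit(task);
    }

    @Override
    public void close() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
    }
}

// Deliberately non-reentrant: mutates an unsynchronized field on every call.
final class CountingHandler implements Function<String, String> {
    private int calls;

    @Override
    public String apply(String request) {
        calls++;
        return request + "#" + calls;
    }
}
```

The Futures returned by submit() play the role of the multiplexer side:
responses from all lanes can be collected back into one output pipe.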

> I think the best way to proceed is to start up the dialog as you have 
> recommended and have done.  This is excellent.  Now I think we should 
> all get familiar with SEDA, Netty2, Geronimo Networking, D-Haven Event 
> and the ACE architecture and incorporate them into our 
> conversations.  Let's start a branch or several branches where we can 
> play with these ideas and these constructs.  Meanwhile let's get this 
> release out the door and see what's good and bad about SEDA.
> I really want the best of all the worlds and could care less what we 
> have at the end of the day so long as some very basic fundamentals are 
> met:
>
> 1). I don't want users having to know SEDA theory to write a protocol 
> server that snaps in to the framework.  So details can be hidden and 
> administrators deploying servers can be concerned with SEDA settings 
> and dynamics.  SEDA or ACE is just a model and we should not get 
> carried away with it.  We are in the business of writing protocol 
> servers not extending Matt Welsh's dissertation.

Right, and part of that is being able to parallelize non-reentrant 
code--which is currently
not possible.

> 2). Make sure we have a simple, clean and intuitive ProtocolProvider 
> interface with helper interfaces whatever they may be

I think we have this, and I don't think it needs to be altered--unless 
we come up with a need for it.

> 3). Make sure the framework leverages encoder/decoder pairs that can 
> chunk data and maintain state between chunks - this way we actually 
> utilize non-blocking facilities to the fullest extent

I think this ability is best done by maintaining state in the event 
itself (making it easier to
make reentrant stages).
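
As a small illustration of keeping the state in the event, here is a toy
decoder. DecodeEvent and ChunkDecoder are hypothetical names, and the framing
(a single length byte followed by the payload) is invented for the example;
it is not the wire format of any protocol discussed here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical event type: carries its own partial-decode state between chunks.
final class DecodeEvent {
    int expected = -1; // payload length; -1 until the length byte has been seen
    final ByteArrayOutputStream payload = new ByteArrayOutputStream();

    boolean complete() {
        return expected >= 0 && payload.size() == expected;
    }

    String message() {
        return new String(payload.toByteArray(), StandardCharsets.US_ASCII);
    }
}

// Stateless, and therefore reentrant: everything it needs arrives with the event.
final class ChunkDecoder {
    // Consumes as much of the chunk as this message needs.
    // Returns true once the whole message has been assembled.
    static boolean decode(DecodeEvent event, ByteBuffer chunk) {
        if (event.expected < 0 && chunk.hasRemaining()) {
            event.expected = chunk.get() & 0xFF; // toy framing: one length byte
        }
        while (!event.complete() && chunk.hasRemaining()) {
            event.payload.write(chunk.get());
        }
        return event.complete();
    }
}
```

Because the decoder holds no fields, any thread can pick up the next chunk
for a given event, which is what makes the stage easy to run in parallel.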

> 4). Make sure the framework is fast and optimized for rapidly 
> implementing internet protocol servers and in this regard I would like 
> design decisions to be driven by some statistics and consensus

Right, and with the ability to have some callbacks for events and errors 
set up, we can monitor
a running system.

> 5). Avoid generic framework-itis: we want a specific framework for 
> writing internet protocol servers that behaves sort of like inetd in a 
> single process.

It's all about leveraging simplicity in design.  I'm not trying to create 
Avalon over here.

> Lastly although least important in the decision making process I would 
> like the internals to be easy to maintain and grasp for those 
> developing the framework and maintaining it.  However this is less 
> important than the points above.

If we work with a small set of principles, it makes the whole thing 
easier to grasp.  I have a
feeling that the current SEDA system has too many principles to 
grasp--making it more
difficult than it needs to be.

-- 

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning."
                - Rich Cook

Re: [RT] SEDA Package: Rework Proposal (long, sorry)

Posted by Alex Karasulu <ao...@bellsouth.net>.

Berin Loritsch wrote:

> My initial reaction to what is currently the SEDA package is that in 
> order to
> do what I want would result in a hack.  I'm not opposed to doing that 
> for the
> short term, but I do want to look at a sustainable future.  While the 
> networking
> part isn't the bread and butter of Eve and Kerberos, it does play an 
> important
> part.  Things like firewalling (dynamic and static), load shedding 
> (for protocols
> that allow that sort of thing), and load balancing are all network 
> layer functions.

> After talking to Alex a bit, it seems that he, Trustin, and I are all 
> on the same
> page that SEDA needs a bit of rework.  The question of course is to do 
> it in
> a sane fashion.  That obviously means that we deal with a branch and 
> keep the
> one interface that everything else is defining and using.  Beyond that 
> is a question
> of what would make the best approach.
>
Let me also inject that I would like to see some statistics and user 
feedback too.  The good thing about doing this release with the current 
SEDA framework is that we get things out there for people to complain 
about.  That gives us feedback and requirements.  I would also like to 
bang requests against both servers to see where the performance 
bottlenecks are so we can design with these in mind.  I don't want to 
design the next best internet protocol server framework without these 
metrics.  It makes me feel like I'm spinning my wheels.  I think we all 
agree with this.

> The purpose of this email is to get some dialog going to come up with 
> the best
> approach overall.  I am stating my thoughts as a starting point, and 
> am not
> married to these thoughts.
>
> Alex put together what we have now with the only intention that it be 
> quick
> and dirty and be something that could get the project going.  The SEDA 
> project
> fills that need right now.
>
> I understand Trustin comes to the table with the Netty2 library, of 
> which I must
> admit ignorance.  He also has been doing most of the work getting SEDA to
> work the way it does now.
>
> Now, I come to the table with an existing, well tested library that is 
> used by
> several projects including Excalibur, some more stuff at D-Haven, and 
> several
> commercial projects.  That is the D-Haven Event library.
>
> What I would like to propose is that, as much as possible, we avoid 
> NIH syndrome
> (Not Invented Here) and use stuff that we know works.  I will be the 
> first to admit
> that the D-Haven Event library as it is now is not a drop in 
> replacement for SEDA.
> It provides the plumbing work which can be leveraged for the final 
> product.  If it
> makes sense to use Event and Netty2 together under one umbrella then 
> we should
> pursue that.
>
> Nevertheless, here is my understanding of SEDA in a nutshell.  As the 
> current project
> does right now, each stage listens for events, processes them, and 
> then sends new or
> processed events on to the next stage.  Everything is decoupled 
> through Queues or
> Pipes.  Typically each stage will have at least two output pipes: the 
> loopback pipe and
> the output pipe.  The loopback pipe is used to push events we can't 
> deal with right now
> out of the way until we can.  The output pipe of course is used to 
> push events on to
> the next stage.  With this basic structure, we can really do wonders 
> pushing events through
> the system using enqueue predicates, multicast pipes, load balancing 
> pipes, etc.
>
> D-Haven Event handles the core part of moving things through a 
> pipeline quite well,
> and is pretty well tested.  It comes to the table with the following 
> features:
>
> * An event pipeline (pre-matched pairs of a set of sources and an 
> event handler)
> * A thread manager (using whatever policy you want for pushing events 
> through each pipeline)
> * A rate limiting enqueue predicate
> * A multicast pipe
> * Asynchronous command management (including periodically recurring 
> commands)
>
Say Berin, do you have documentation on the event package at D-Haven that 
is at 50K feet with some drill down?  I'd love to look at it.  Likewise 
I'd like to look at documentation on Netty2 and the Geronimo Networking 
code.  I think we should take the best of all the worlds here.  Plus I 
have some serious ACE research to do as well.  I think Trustin has been 
doing the same till now.

> Now, all this is just the skeleton of what makes a SEDA system go.  
> The real power is
> in what the stages do and how the pipelines and stages are 
> configured.  That part is not
> done in the D-Haven Event library.  The core set of stages that I see 
> we need are as follows:
>
> 1. ConnectionManager (this includes firewalling by dropping 
> unallowable connections)
> 2. Reader (start reading bytes from the stream)
>   * Router (1 pipe per protocol)
> 3. Decoder (use the decoder from the protocol handler)
> 4. RequestHandler (use the request handler from the protocol handler)
> 5. Encoder (use the encoder from the protocol handler)
>
> 6. Writer (start writing bytes to the stream)
>
These are the exact same components in SEDA btw. 

> The reader will route the ByteBuffers to the proper protocol (based on 
> a port mapping
> or something like that) which will then do all its necessary steps for 
> dealing with its
> own type of information, and write the response ByteBuffers to the 
> output stream.
>
> Later on we can look at the pipelines to use load-balancing pipelines 
> for each protocol
> as necessary.  That will allow us the flexibility to merely forward 
> requests to other
> machines in a DMZ if we need it.
>
> I haven't looked at the details of the rest of the stuff in the SEDA 
> package to see how
> that would best be served in this architectural view, but I am 
> definitely open to suggestions.
>
I think the best way to proceed is to start up the dialog as you have 
recommended and have done.  This is excellent.  Now I think we should 
all get familiar with SEDA, Netty2, Geronimo Networking, D-Haven Event 
and the ACE architecture and incorporate them into our conversations.  
Let's start a branch or several branches where we can play with these 
ideas and these constructs.  Meanwhile let's get this release out the 
door and see what's good and bad about SEDA. 

I really want the best of all the worlds and could care less what we 
have at the end of the day so long as some very basic fundamentals are met:

1). I don't want users having to know SEDA theory to write a protocol 
server that snaps in to the framework.  So details can be hidden and 
administrators deploying servers can be concerned with SEDA settings and 
dynamics.  SEDA or ACE is just a model and we should not get carried 
away with it.  We are in the business of writing protocol servers not 
extending Matt Welsh's dissertation.
2). Make sure we have a simple, clean and intuitive ProtocolProvider 
interface with helper interfaces whatever they may be
3). Make sure the framework leverages encoder/decoder pairs that can 
chunk data and maintain state between chunks - this way we actually 
utilize non-blocking facilities to the fullest extent
4). Make sure the framework is fast and optimized for rapidly 
implementing internet protocol servers and in this regard I would like 
design decisions to be driven by some statistics and consensus
5). Avoid generic framework-itis: we want a specific framework for 
writing internet protocol servers that behaves sort of like inetd in a 
single process.

Lastly although least important in the decision making process I would 
like the internals to be easy to maintain and grasp for those developing 
the framework and maintaining it.  However this is less important than 
the points above.

Alex

Re: [RT] SEDA Package: Rework Proposal (long, sorry)

Posted by Trustin Lee <tr...@gmail.com>.

Hi,

> I understand Trustin comes to the table with the Netty2 library, of
> which I must
> admit ignorance.  He also has been doing most of the work getting SEDA to
> work the way it does now.

The URL of Netty2 is:
http://gleamynode.net/dev/projects/netty2

> Now, I come to the table with an existing, well tested library that is
> used by
> several projects including Excalibur, some more stuff at D-Haven, and
> several
> commercial projects.  That is the D-Haven Event library.

Alex and I have been talking about the stream hierarchy model used by
ACE and the Geronimo stack.  It is a kind of 'chain of responsibility'
pattern rather than an event routing model.  I really like their
approach and it is proven to work great.  The SEDA event propagation
use case is not really event routing.  It has a static flow of events,
which is a specific subset of the event routing model, so I think we
need to go simpler.  Of course there are some related issues, such as
propagation of ConnectEvent and DisconnectEvent, but they should be
easily resolved by using a global session registry.  I want to show
this in a few days in my branch.
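
To show what a static flow looks like next to a routed one, here is a very
small sketch of a fixed chain. StaticChain is a hypothetical name and this is
not the ACE or Geronimo API; it only assumes that each link hands its result
straight to the next link and that returning null stops the chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch of a fixed chain of responsibility: the order of the
// links is decided once at construction time, with no per-event routing.
final class StaticChain {
    private final List<UnaryOperator<String>> links;

    StaticChain(List<UnaryOperator<String>> links) {
        this.links = new ArrayList<>(links);
    }

    // Each link hands its result to the next; a null result stops the chain.
    String process(String message) {
        String current = message;
        for (UnaryOperator<String> link : links) {
            current = link.apply(current);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

Compared with queues between stages, there is nothing to look up per event:
the next step is always the next element of the list.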

But, of course, I think we need to look at the D-Haven library to see
whether it provides what we want, so could you give me the URL?

-- 
what we call human nature is actually human habit
--
http://gleamynode.net/