Posted to dev@mina.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2008/11/01 01:43:53 UTC

Re: [About the Filter Chain] Proposals

Jeanfrancois Arcand wrote:
> Salut,
Bonsoir,
>
> Emmanuel Lecharny wrote:
>> <snip/>
>> - The filter chain is not static, but it should be thread safe, so 
>> any kind of modification must *not* lead to some exception (NPE or 
>> inconsistent state)
>
> Can you make this behavior configurable? In a project I'm working on 
> :-) we made this configurable, e.g. a filter can be thread safe or 
> not. When not safe, we internally pool filters. That might sound like a 
> weird design, but we have seen applications that needed to write 
> 'stateful'/thread-unsafe filters. The performance penalty is limited 
> to the WorkerThread that polls for an instance of those filters... which 
> is not that significant.
My first understanding was that once the chain is defined, it never 
changes. That is true in almost all cases. But if you dig a bit, you 
might find cases where it would be a cool feature to make this chain 
configurable. That leads to a decision: should this chain be thread 
safe? The chain is associated with a session, and as you may have more 
than one message being processed on this session (we may have an 
executorFilter in front of the chain to dispatch the processing to a 
pool of threads), this may be a problem. On the other hand, that's a 
penalty we would rather not pay in all the cases where we don't need to 
change the chain.

If we implement the chain using a simple pointer to the next filter in 
each filter (a reference to the next filter; 'pointer' is not to be 
understood as if MINA was written in C :), then this reference could 
obviously be volatile, protecting the chain from NPE.

Another option, and it's not a bad idea, would be to define either a 
threadSafe chain, which can be modified, or an unsynchronized chain, 
which can't be changed. We have to dig into those ideas.
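
To make the volatile idea concrete, here is a minimal sketch. SketchFilter, RecordingTail and LinkedFilter are hypothetical names invented for this illustration; they are not MINA types. The sketch only shows that a volatile, never-null successor reference lets one thread splice in a filter without another thread hitting an NPE. It does not make a multi-step modification of the chain atomic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical minimal filter contract, for illustration only (not the MINA API).
interface SketchFilter {
    void messageReceived(String message);
}

// Terminal element that records what reaches the end of the chain.
final class RecordingTail implements SketchFilter {
    final List<String> seen = new ArrayList<>();

    @Override
    public void messageReceived(String message) {
        seen.add(message);
    }
}

// A filter that holds its successor in a volatile field.
final class LinkedFilter implements SketchFilter {
    private final String name;
    // volatile: a replacement written by one thread is visible to other threads.
    private volatile SketchFilter next;

    LinkedFilter(String name, SketchFilter next) {
        this.name = name;
        this.next = Objects.requireNonNull(next);
    }

    // Rejecting null keeps the field non-null once the filter is constructed.
    void setNext(SketchFilter newNext) {
        this.next = Objects.requireNonNull(newNext);
    }

    SketchFilter getNextFilter() {
        return next;
    }

    @Override
    public void messageReceived(String message) {
        // Read the reference once, then use that single value for the call.
        SketchFilter target = getNextFilter();
        target.messageReceived(name + ">" + message);
    }
}

final class VolatileChainDemo {
    static List<String> run() {
        RecordingTail tail = new RecordingTail();
        LinkedFilter head = new LinkedFilter("head", tail);
        head.messageReceived("m1");
        // Splice an extra filter in while the chain is live.
        head.setNext(new LinkedFilter("audit", tail));
        head.messageReceived("m2");
        return tail.seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The design choice here is that the cost is one volatile read per hop, which is much cheaper than a lock but only protects a single reference swap.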
>
>
>> - Passing to the next filter should be possible in the middle of a 
>> filter processing (ie, in a specific filter, you can do some 
>> pre-processing, call the next filter, and do some post-processing)
>
> Would it make it "too" complicated for users? What I did in Grizzly 
> was to split the task into 2 operations (Filter.execute(), 
> Filter.postExecute()). The execute() method returns a boolean 
> telling the chain whether to invoke the next filter or not. Then, in 
> reverse order, we invoke postExecute() on the previously invoked 
> filters. But I might be wrong here :-)
We discussed this option: having preExecute(), callNext() and 
postExecute() methods. At first sight, it sounds interesting. But there 
are a couple of problems with this approach:
- this is quite heavy for the user, as he potentially has to implement 3 
methods for each event, instead of one
- it does not cover the case where you have some branching in the 
preExecute method. Let me explain with a small piece of pseudo-code :

if ( message meets condition )
  then callNext filter
  else
    do some processing on the message
    if ( processed message meets some other condition )
      then callNext filter
       ...

In this case, having a preExecute() method does not help a lot. 
Something better would be to have a getNext() method which computes the 
next filter (or possibly this filter is already fixed when the chain is 
built, for instance if you have an immutable chain); then you can do 
whatever processing you need and call the next filter when you want.
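
The branching case from the pseudo-code can be sketched in Java as follows. BranchNext, BranchingFilter and the two conditions (a "raw:" prefix, a non-empty result) are assumptions made up for this illustration; the only point is that with a getNext() method the filter decides where, and whether, the next filter is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical next-filter contract, for illustration only.
interface BranchNext {
    void messageReceived(String message);
}

final class BranchingFilter implements BranchNext {
    private final BranchNext next;

    BranchingFilter(BranchNext next) {
        this.next = next;
    }

    // In an immutable chain this is a fixed reference; a mutable chain
    // could compute it on each call instead.
    private BranchNext getNext() {
        return next;
    }

    @Override
    public void messageReceived(String message) {
        if (message.startsWith("raw:")) {
            // First branch: the message meets the condition, pass it on as is.
            getNext().messageReceived(message);
            return;
        }
        // Otherwise do some processing on the message first.
        String processed = message.trim().toUpperCase(Locale.ROOT);
        if (!processed.isEmpty()) {
            // Second branch: the processed message meets another condition.
            getNext().messageReceived(processed);
        }
        // Anything else is dropped without calling the next filter.
    }
}

final class BranchingDemo {
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        BranchingFilter filter = new BranchingFilter(seen::add);
        filter.messageReceived("raw:abc");
        filter.messageReceived("  hi ");
        filter.messageReceived("   ");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```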

Of course, I'm talking with a one-way chain in mind, not a two-way chain.

But this is only what I have in mind, and I think we can elaborate a 
bit more, possibly discarding my crazy ideas :)
 
>
>
>> - We should be able to use different pipelines depending on the 
>> service (filters can be arranged differently on the same server for 2 
>> different services)
>
> That one sounds interesting. I'm curious to find out how you will detect 
> which pipeline to invoke. You will need some Mapper, right?
Yeah. For instance, if you are building a kind of proxy, you will most 
certainly have more than one service, but you may share filters. Each 
service will proxy for a specific protocol, thus will have its own 
chain. Now, the service is associated with an established connection, 
which means you can define which service to invoke depending on the port 
the connection is connected to.
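
A rough sketch of such a port-based mapper, assuming Java 10 or later. PortFilter, PortChainMapper and the port numbers are hypothetical and chosen only for the example; a filter is reduced to a name so that the lookup itself stays visible. Note that the same logging filter instance is shared by both chains.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a filter; only its name matters here.
final class PortFilter {
    final String name;

    PortFilter(String name) {
        this.name = name;
    }
}

// Selects the filter chain of a service from the local port of the connection.
final class PortChainMapper {
    private final Map<Integer, List<PortFilter>> chainsByPort;
    private final List<PortFilter> fallback;

    PortChainMapper(Map<Integer, List<PortFilter>> chainsByPort, List<PortFilter> fallback) {
        // Defensive immutable copies: the mapping never changes after construction.
        this.chainsByPort = Map.copyOf(chainsByPort);
        this.fallback = List.copyOf(fallback);
    }

    List<PortFilter> chainFor(int localPort) {
        return chainsByPort.getOrDefault(localPort, fallback);
    }
}

final class PortChainDemo {
    static PortChainMapper build() {
        PortFilter logging = new PortFilter("logging"); // shared by both services
        PortFilter ldapCodec = new PortFilter("ldap-codec");
        PortFilter httpCodec = new PortFilter("http-codec");
        return new PortChainMapper(
            Map.of(389, List.of(logging, ldapCodec),
                   8080, List.of(logging, httpCodec)),
            List.of(logging));
    }

    public static void main(String[] args) {
        PortChainMapper mapper = build();
        for (PortFilter f : mapper.chainFor(389)) {
            System.out.println(f.name);
        }
    }
}
```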

Another idea is that you may have a demuxer which will determine which 
chain to invoke depending on the encapsulating protocol elements. This 
is something we use in LDAP, as we know which message we are dealing 
with only after having decoded a part of the message; then we chain to 
many different handlers, but we could also have more 
protocolCodecFilters for the encapsulated protocol element.

But this is going a bit too far from the initial discussion here :)
>
>> - Even for two different sessions, we may have different chains of 
>> filters (one session might use an Audit filter while the next is just 
>> fine without this filter)
>> - We want to decouple the message processing from the socket 
>> processor, using a special filter which uses an executor to process 
>> the message in its own thread
>
> Yes that one will for sure improve performance :-)
FYI, we already have this filter.
>> Proposition (d) The Protocolhandler should be a filter, like any 
>> other one.
>
>
> We had the same discussion in Grizzly and we came to the same 
> conclusion :-)
Pfewww !!! I'm not alone in the dark ;)
>> PS : All those changes need to be validated, as I may have missed 
>> some points. I also suggest that some prototype be written, in a 
>> sandbox. This will be the best way to check if those ideas are sane 
>> or insane, and also to correctly evaluate the impact on existing code.
>>
>> So, wdyt, guys ?
>
> From an outlier view, that looks promising (and dangerous for the 
> outlier project :-))
I'm more concerned about the time it will take to implement this in MINA 
than about the harm it may cause to other projects, especially to Grizzly 
:)  All in all, we benefit from each other, and this is good for our users.

>
> -- Jeanfrancois
Thanks Jean-François! It is a pleasure to have you around. I feel so 
comfortable having guys like you or Howard Chu (OpenLDAP) around, because 
you guys share ideas. This is a rare quality!

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org

Re: [About the Filter Chain] Proposals

Posted by Emmanuel Lecharny <el...@gmail.com>.

Jeanfrancois Arcand wrote:
> Salut,
>
>> Another option, and it's not a bad idea, would be to define either a 
>> threadSafe chain, which can be modified, or an unsynchronized chain, 
>> which can't be changed. We have to dig into those ideas.
>
> Just for my understanding, can the chain be modified by the user? 
> Would it be too demanding to ask the user to sync their filters 
> themselves? That's one of the reasons why we have 2 modes in Grizzly, 
> but that might not be a good idea :-)
Yes, the user can modify the chain (well, 'user' here means the guy who 
writes the application using MINA :) But we can also think about a 
self-configuring system which adds a new filter to the chain right in the 
middle of event processing. A good example would be adding a new 
logging filter when we hit some error we want to log, but only if we get 
this error.

In order to protect the chain against concurrent modification, the ideas 
I have are :
- always call the next filter using a getNextFilter() method which can 
contain some protected section
- define a thread-safe chain a bit like you can declare a protected 
collection (Collections.synchronizedCollection( set )). We may do the 
very same for the chain: ChainUtil.synchronizeChain( myChain).
- maybe set a simple volatile flag, used in the getNextFilter to avoid a 
heavy synchronization

Anyway, there are options here, and it can be user driven (as you create 
the chain before it is injected into the server, you can create it 
thread safe, or not).
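
As a sketch of the second idea, here is what a wrapper in the spirit of Collections.synchronizedCollection could look like. ChainUtil.synchronizeChain is only the name floated above, not an existing MINA API, and MutableChain and PlainChain are hypothetical types reduced to the bare minimum (filters are plain names).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical chain contract, reduced to what the sketch needs.
interface MutableChain {
    void addLast(String filterName);

    List<String> snapshot();
}

// Unsynchronized chain: cheap, but not safe to modify from several threads.
final class PlainChain implements MutableChain {
    private final List<String> filters = new ArrayList<>();

    @Override
    public void addLast(String filterName) {
        filters.add(filterName);
    }

    @Override
    public List<String> snapshot() {
        return new ArrayList<>(filters);
    }
}

final class ChainUtil {
    private ChainUtil() {
    }

    // Wraps any chain so that every access goes through one lock.
    static MutableChain synchronizeChain(MutableChain chain) {
        return new SynchronizedChain(chain);
    }

    private static final class SynchronizedChain implements MutableChain {
        private final MutableChain delegate;

        SynchronizedChain(MutableChain delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized void addLast(String filterName) {
            delegate.addLast(filterName);
        }

        @Override
        public synchronized List<String> snapshot() {
            return delegate.snapshot();
        }
    }
}

final class SyncChainDemo {
    // Four threads each append 250 filters; returns the final chain size.
    static int run() {
        MutableChain chain = ChainUtil.synchronizeChain(new PlainChain());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    chain.addLast("filter-" + id + "-" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return chain.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The user pays for the lock only when asking for the wrapped chain, which matches the "user driven" choice described above.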

It has to be tested in the branch...
>> But this is only what I have in mind, and I think we can elaborate a 
>> bit more, possibly discarding my crazy ideas :)
>
> The getNext() approach is promising. At least in Tomcat they moved 
> their internal Valve architecture to use exactly that approach. 
> GlassFish forked Tomcat and the approach used there is 
> preExecute/postExecute. There are pros and cons for both 
> (preExecute/postExecute was faster when we measured a long time ago), 
> but I would think that for the user getNext() is simpler to use 
> (strange that I say that, as I've implemented the pre/post :-)).
We have code like this:

    public void messageReceived(IoSession session, Object message) {
        // Anything that is not an IoBuffer is passed on untouched.
        if (!(message instanceof IoBuffer)) {
            getNextFilter().messageReceived(session, message);
            return;
        }

        IoBuffer in = (IoBuffer) message;
        ProtocolDecoder decoder = getDecoder(session);

        if (decoder == null) {
            // No decoder available: report the problem to the next filter.
            ProtocolDecoderException pde = new ProtocolDecoderException(
                "Cannot decode if the decoder is null. Add the filter in the chain " +
                "before the first session is created");
            getNextFilter().exceptionCaught(session, pde);
            return;
        }

        ProtocolDecoderOutput decoderOut = getDecoderOut(session, getNextFilter());

        if (decoderOut == null) {
            // No decoder output available: same handling as above.
            ProtocolDecoderException pde = new ProtocolDecoderException(
                "Cannot decode if the decoder output is null. Add the filter in the chain " +
                "before the first session is created");
            getNextFilter().exceptionCaught(session, pde);
            return;
        }

        // Decode as long as there is data left in the buffer.
        while (in.hasRemaining()) {
            int oldPos = in.position();
            try {
                synchronized (decoderOut) {
                    decoder.decode(session, in, decoderOut);
                }

                // Flushing hands the decoded messages to the next filter.
                decoderOut.flush();
            } catch (Throwable t) {
                ...
            }
        }
    }

As you can see (it's in the ProtocolCodecFilter), we have more than one 
place where we call the next filter, for different messages. 
Implementing such behavior with pre(), callNext() and post() methods 
would be quite a PITA... (forget about this code, it's not necessarily 
a perfect example of good code, but it demonstrates what I mean)


Thanks !

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org

Re: [About the Filter Chain] Proposals

Posted by Jeanfrancois Arcand <Je...@Sun.COM>.

Salut,

Emmanuel Lecharny wrote:
> Jeanfrancois Arcand wrote:
>> Salut,
> Bonsoir,
>>
>> Emmanuel Lecharny wrote:
>>> <snip/>
>>> - The filter chain is not static, but it should be thread safe, so 
>>> any kind of modification must *not* lead to some exception (NPE or 
>>> inconsistent state)
>>
>> Can you make this behavior configurable? In a project I'm working on 
>> :-) we made this configurable, e.g. a filter can be thread safe or 
>> not. When not safe, we internally pool filters. That might sound like a 
>> weird design, but we have seen applications that needed to write 
>> 'stateful'/thread-unsafe filters. The performance penalty is limited 
>> to the WorkerThread that polls for an instance of those filters... which 
>> is not that significant.
> My first understanding was that once the chain is defined, it never 
> changes. That is true in almost all cases. But if you dig a bit, you 
> might find cases where it would be a cool feature to make this chain 
> configurable. That leads to a decision: should this chain be thread 
> safe? The chain is associated with a session, and as you may have more 
> than one message being processed on this session (we may have an 
> executorFilter in front of the chain to dispatch the processing to a 
> pool of threads), this may be a problem. On the other hand, that's a 
> penalty we would rather not pay in all the cases where we don't need to 
> change the chain.
> 
> If we implement the chain using a simple pointer to the next filter in 
> each filter (a reference to the next filter; 'pointer' is not to be 
> understood as if MINA was written in C :), then this reference could 
> obviously be volatile, protecting the chain from NPE.
> Another option, and it's not a bad idea, would be to define either a 
> threadSafe chain, which can be modified, or an unsynchronized chain, 
> which can't be changed. We have to dig into those ideas.

Just for my understanding, can the chain be modified by the user? Would 
it be too demanding to ask the user to sync their filters themselves? 
That's one of the reasons why we have 2 modes in Grizzly, but that might 
not be a good idea :-)


>>
>>
>>> - Passing to the next filter should be possible in the middle of a 
>>> filter processing (ie, in a specific filter, you can do some 
>>> pre-processing, call the next filter, and do some post-processing)
>>
>> Would it make it "too" complicated for users? What I did in Grizzly 
>> was to split the task into 2 operations (Filter.execute(), 
>> Filter.postExecute()). The execute() method returns a boolean 
>> telling the chain whether to invoke the next filter or not. Then, in 
>> reverse order, we invoke postExecute() on the previously invoked 
>> filters. But I might be wrong here :-)
> We discussed this option: having preExecute(), callNext() and 
> postExecute() methods. At first sight, it sounds interesting. But there 
> are a couple of problems with this approach:
> - this is quite heavy for the user, as he potentially has to implement 3 
> methods for each event, instead of one

Agree.


> - it does not cover the case where you have some branching in the 
> preExecute method. Let me explain with a small piece of pseudo-code :
> 
> if ( message meets condition )
>  then callNext filter
>  else
>    do some processing on the message
>    if ( processed message meets some other condition )
>      then callNext filter
>       ...
> 
> In this case, having a preExecute() method does not help a lot. 
> Something better would be to have a getNext() method which computes the 
> next filter (or possibly this filter is already fixed when the chain is 
> built, for instance if you have an immutable chain); then you can do 
> whatever processing you need and call the next filter when you want.
> 
> Of course, I'm talking with a one-way chain in mind, not a two-way chain.
> 
> But this is only what I have in mind, and I think we can elaborate a 
> bit more, possibly discarding my crazy ideas :)

The getNext() approach is promising. At least in Tomcat they moved their 
internal Valve architecture to use exactly that approach. GlassFish 
forked Tomcat and the approach used there is preExecute/postExecute. 
There are pros and cons for both (preExecute/postExecute was faster when 
we measured a long time ago), but I would think that for the user 
getNext() is simpler to use (strange that I say that, as I've 
implemented the pre/post :-)).
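
For comparison, the execute()/postExecute() style discussed in this thread can be sketched like this. TwoPhaseFilter, TwoPhaseChain and NamedStep are hypothetical names and this is not the Grizzly API; the sketch also assumes that a filter which stops the chain still gets its own postExecute() call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical two-phase filter contract, for illustration only.
interface TwoPhaseFilter {
    // Returns true if the chain should go on to the next filter.
    boolean execute(List<String> log);

    void postExecute(List<String> log);
}

final class TwoPhaseChain {
    private final List<TwoPhaseFilter> filters;

    TwoPhaseChain(List<TwoPhaseFilter> filters) {
        this.filters = new ArrayList<>(filters);
    }

    void run(List<String> log) {
        // Stack of the filters whose execute() has been called so far.
        Deque<TwoPhaseFilter> invoked = new ArrayDeque<>();
        for (TwoPhaseFilter filter : filters) {
            invoked.push(filter);
            if (!filter.execute(log)) {
                break; // this filter asked the chain to stop
            }
        }
        // Unwind in reverse order of invocation.
        while (!invoked.isEmpty()) {
            invoked.pop().postExecute(log);
        }
    }
}

// Test filter that logs its calls and returns a fixed answer from execute().
final class NamedStep implements TwoPhaseFilter {
    private final String name;
    private final boolean proceed;

    NamedStep(String name, boolean proceed) {
        this.name = name;
        this.proceed = proceed;
    }

    @Override
    public boolean execute(List<String> log) {
        log.add(name + ".execute");
        return proceed;
    }

    @Override
    public void postExecute(List<String> log) {
        log.add(name + ".postExecute");
    }
}

final class TwoPhaseDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<TwoPhaseFilter> filters = new ArrayList<>();
        filters.add(new NamedStep("A", true));
        filters.add(new NamedStep("B", false)); // stops the chain
        filters.add(new NamedStep("C", true));  // never reached
        new TwoPhaseChain(filters).run(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Here the chain, not the filter, drives the iteration, which is why a filter cannot pick between several call sites for the next filter the way the getNext() style allows.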


> 
>>
>>
>>> - We should be able to use different pipelines depending on the 
>>> service (filters can be arranged differently on the same server for 2 
>>> different services)
>>
>> That one sounds interesting. I'm curious to find out how you will detect 
>> which pipeline to invoke. You will need some Mapper, right?
> Yeah. For instance, if you are building a kind of proxy, you will most 
> certainly have more than one service, but you may share filters. Each 
> service will proxy for a specific protocol, thus will have its own 
> chain. Now, the service is associated with an established connection, 
> which means you can define which service to invoke depending on the port 
> the connection is connected to.
> 
> Another idea is that you may have a demuxer which will determine which 
> chain to invoke depending on the encapsulating protocol elements. This 
> is something we use in LDAP, as we know which message we are dealing 
> with only after having decoded a part of the message; then we chain to 
> many different handlers, but we could also have more 
> protocolCodecFilters for the encapsulated protocol element.
> 
> But this is going a bit too far from the initial discussion here :)
>>
>>> - Even for two different sessions, we may have different chains of 
>>> filters (one session might use an Audit filter while the next is just 
>>> fine without this filter)
>>> - We want to decouple the message processing from the socket 
>>> processor, using a special filter which uses an executor to process 
>>> the message in its own thread
>>
>> Yes that one will for sure improve performance :-)
> FYI, we already have this filter.
>>> Proposition (d) The Protocolhandler should be a filter, like any 
>>> other one.
>>
>>
>> We had the same discussion in Grizzly and we came to the same 
>> conclusion :-)
> Pfewww !!! I'm not alone in the dark ;)
>>> PS : All those changes need to be validated, as I may have missed 
>>> some points. I also suggest that some prototype be written, in a 
>>> sandbox. This will be the best way to check if those ideas are sane 
>>> or insane, and also to correctly evaluate the impact on existing code.
>>>
>>> So, wdyt, guys ?
>>
>> From an outlier view, that looks promising (and dangerous for the 
>> outlier project :-))
> I'm more concerned about the time it will take to implement this in MINA 
> than about the harm it may cause to other projects, especially to Grizzly 
> :)  All in all, we benefit from each other, and this is good for our 
> users.

Agree.

> 
>>
>> -- Jeanfrancois
> Thanks Jean-François! It is a pleasure to have you around. I feel so 
> comfortable having guys like you or Howard Chu (OpenLDAP) around, because 
> you guys share ideas. This is a rare quality!

Thanks :-)

-- Jeanfrancois

