Posted to dev@mina.apache.org by 이희승 (Trustin Lee) <t...@gmail.com> on 2008/04/28 15:21:57 UTC

Redesigning IoBuffer, IoFilter and IoHandler

I have been thinking about the current MINA API these days while idle,
and came up with some ideas for improvement:

1) Split IoBuffer into two parts - array and buffer

IoBuffer is basically an improvement on ByteBuffer.  Unfortunately, that
also means it inherited some bad assets from ByteBuffer - all the
stateful properties such as position, limit and mark.  I think those
should be provided as a separate class, and there should be a class
dedicated to storing bytes only (something like ByteArray?).

BTW, why is mixing them a bad idea?  Because it makes the implementation
too complicated.  For example, it is almost impossible to implement a
composite buffer to support close-to-zero-copy I/O.  And what about all
the weird rules related to auto-expansion and buffer derivation?  They
increase the learning curve.
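To make the split concrete, here is a rough sketch.  ByteArray is the
name floated above; Cursor and HeapByteArray are invented purely for
illustration, not proposed API:

```java
// Storage: holds bytes, nothing else - no position, limit or mark.
interface ByteArray {
    int length();
    byte get(int index);
    void put(int index, byte b);
}

// State: the cursor lives outside the storage, so several readers can
// walk the same bytes independently, and a composite ByteArray stays
// feasible because the storage carries no per-reader state.
final class Cursor {
    private final ByteArray array;
    private int position;

    Cursor(ByteArray array) { this.array = array; }

    byte read() { return array.get(position++); }
    int position() { return position; }
}

// Trivial heap-backed storage, just enough to exercise the sketch.
final class HeapByteArray implements ByteArray {
    private final byte[] bytes;
    HeapByteArray(byte[] bytes) { this.bytes = bytes; }
    public int length() { return bytes.length; }
    public byte get(int index) { return bytes[index]; }
    public void put(int index, byte b) { bytes[index] = b; }
}
```

With this separation, two cursors over the same storage don't interfere
with each other - exactly what position/limit/mark inside the buffer
makes impossible today.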

2) Get rid of IoHandler and let IoFilter replace it.

IoHandler is just a special IoFilter at the end of the filter chain.  I
don't see any reason to keep it special considering that we are often
building multi-layered protocols.

3) Split IoFilter into multiple interfaces.

If IoHandler is removed, IoFilter should be renamed to represent itself
better.  Moreover, IoFilter should be split into more than one interface
(e.g. UpstreamHandler for receiving events and DownstreamHandler for
sending events from/to an IoProcessor) so implementors can choose more
conveniently what to override.
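A sketch of what the split could look like.  The two interface names
come from the proposal above, but the event methods and their
signatures are invented for illustration:

```java
// Hypothetical split: inbound and outbound events live in separate
// interfaces, so a filter implements only the side it cares about.
interface UpstreamHandler {
    void messageReceived(Object session, Object message);
}

interface DownstreamHandler {
    void messageSent(Object session, Object message);
}

// An inbound-only filter: one interface, no downstream stubs to write.
final class CountingUpstreamFilter implements UpstreamHandler {
    int received;

    public void messageReceived(Object session, Object message) {
        received++; // count inbound messages only
    }
}
```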

4) UDP as a first-class transport.

MINA currently handles UDP as if it were a TCP connection.
Consequently, we had to provide additional machinery like
IoSessionRecycler and IoAcceptor.newSession().  IMHO, session state
management should be completely optional when a user is using the UDP
transport.  A possible solution is to provide handler and session
interfaces different from the TCP ones.

How do we execute all these changes?  Well... I think it's pretty easy
to do from a purely technical point of view.  The changes will mean
bigger backward incompatibility for users, but these issues should be
fixed somehow to make MINA a better network application framework, IMHO.

Any feedback is welcome.
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
David M. Lloyd wrote:
> On 04/28/2008 11:40 AM, Emmanuel Lecharny wrote:
>>> Yes and no - we should at least make support methods available to 
>>> make this easier.  MINA won't win any fans by being labeled 
>>> "framework that makes decoder implementers do the most work" :-)
>> The best would be to offer an API with such a method: a blocking 
>> getNextByte() method.  Atm, you have to do something like 
>> BB.hasRemaining() before grabbing a byte.  Painful ... and inefficient!
>
> Blocking also has the problem of consuming threads.
> The ideal codec would be a fully non-blocking state machine (this is 
> what I mean when I say "can exit at any point").  So for example, an 
> HTTP state machine might accept "GE" before the buffer runs out.  The 
> method would then return; the state machine would know to expect "T" 
> next.
You may be right.  This is what we implemented in ADS to decode LDAP 
PDUs.  The thing is that you have to check whether some bytes are 
available before reading them, which is a little bit costly.

I'm just trying to imagine the ultimate decoder: one which will not 
block a thread, which will not have to deal with exceptions when the 
incoming BB is empty, and which will be fast :)
>
> In practice, this is really hard to do in Java.
Well, not that hard.  And this is certainly not a Java issue, just a 
problem of people being unable to cope with the complexity of a state 
machine/stateful decoder.
> This is why Trustin is talking about a tool that is "similar to 
> ANTLR"; it would be a tool to parse a simple language to generate a 
> highly efficient non-blocking state machine (such as a DFA) in bytecode.  
ASN.1 compiler :)
> This type of tool has the potential to be really excellent - it would 
> make it easy for anyone to write an extremely efficient protocol codec 
> for just about any protocol you could imagine.
I started a lab at Apache 2 years ago (labs.apache.org, project 
Dungeon: http://cwiki.apache.org/confluence/display/labs/dungeon), but 
haven't had time since then to go any farther ...
>> I would prefer the underlying layer to handle this case. The encoder 
>> is responsible for encoding the data, not for handling the client's 
>> sloppiness.
>
> But the behavior of the client is almost always a part of the 
> protocol. Also, the right behavior might not be to block.  My use case 
> in JBoss Remoting, for example, is much more complex.  A connection 
> might have four or five client threads using it at the same time.  So 
> I'd need to detect a saturated channel, and block the client threads.  
> If the output channel is saturated, the input channel might still be 
> usable, and vice-versa.
>
> The output channel and the input channel operate completely 
> independently in many protocols (including mine).  Also, UDP-based 
> protocols often have completely different policies.
Sure, but this should be the role of a throttle filter, maybe. Mixing 
encoding and throttle management will lead to complexity, I guess.
>
>> At some point, we need to establish some policy about how to handle 
>> such problems. Even a disk is a limited resource.
>
> Yes, but ultimately that decision *has* to be made by the protocol 
> implementor.  
Usually, a protocol says nothing about how to handle slow clients and 
similar problems.  This is why I think it should not be handled directly 
at the protocol level (i.e., at least not at the codec level), but 
around it.
> It depends on what is more important: allowing the sender to utilize 
> maximum throughput, no matter what (in this case you'd spool to disk 
> if you can't handle the messages fast enough), or to "push back" when 
> the server is saturated (by either blocking the client using TCP 
> mechanisms or by sending your own squelch message or some other 
> mechanism).
There are so many possibilities ... We won't be able to define a 
catch-all solution in this thread, that's for sure :)



-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 11:40 AM, Emmanuel Lecharny wrote:
>> Yes and no - we should at least make support methods available to make 
>> this easier.  MINA won't win any fans by being labeled "framework that 
>> makes decoder implementers do the most work" :-)
> The best would be to offer an API with such a method: a blocking 
> getNextByte() method.  Atm, you have to do something like 
> BB.hasRemaining() before grabbing a byte.  Painful ... and inefficient!

Blocking also has the problem of consuming threads.  The ideal codec would 
be a fully non-blocking state machine (this is what I mean when I say "can 
exit at any point").  So for example, an HTTP state machine might accept 
"GE" before the buffer runs out.  The method would then return; the state 
machine would know to expect "T" next.
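The "GE"-then-"T" example above can be sketched in a few lines.
HttpMethodDecoder is a made-up name, and a real codec would cover far
more than one fixed token; the point is only the resumability:

```java
import java.nio.ByteBuffer;

// A non-blocking, resumable recognizer for the fixed prefix "GET ".
// It consumes whatever bytes are available and remembers how far it
// got, so it can exit at any point and resume on the next buffer.
final class HttpMethodDecoder {
    private static final byte[] EXPECTED = "GET ".getBytes();
    private int matched; // state carried across buffers: bytes matched so far

    /**
     * Returns true once the whole prefix has been seen.  If the buffer
     * runs out after "GE", this simply returns false; the next call
     * resumes, expecting "T".
     */
    boolean decode(ByteBuffer in) {
        while (in.hasRemaining() && matched < EXPECTED.length) {
            if (in.get() != EXPECTED[matched]) {
                throw new IllegalStateException("not a GET request");
            }
            matched++;
        }
        return matched == EXPECTED.length;
    }
}
```

No thread ever blocks waiting for the "T": the state machine returns to
the selector loop and picks up where it left off.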

In practice, this is really hard to do in Java.  This is why Trustin is 
talking about a tool that is "similar to ANTLR"; it would be a tool to 
parse a simple language to generate a highly efficient non-blocking state 
machine (such as a DFA) in bytecode.  This type of tool has the potential 
to be really excellent - it would make it easy for anyone to write an 
extremely efficient protocol codec for just about any protocol you could 
imagine.

>> If you saturate the output channel though, you'll have to be able to 
>> handle that situation somehow.  It ultimately has to be up to the 
>> protocol decoder to detect a saturated output channel and perform 
>> whatever action is necessary to squelch the output message source 
>> until the output channel can be cleared again.

> I would prefer the underlying layer to handle this case. The encoder is 
> responsible for encoding the data, not for handling the client's sloppiness.

But the behavior of the client is almost always a part of the protocol. 
Also, the right behavior might not be to block.  My use case in JBoss 
Remoting, for example, is much more complex.  A connection might have four 
or five client threads using it at the same time.  So I'd need to detect a 
saturated channel, and block the client threads.  If the output channel is 
saturated, the input channel might still be usable, and vice-versa.

The output channel and the input channel operate completely independently 
in many protocols (including mine).  Also, UDP-based protocols often have 
completely different policies.

> At some point, we need to establish some policy about how to handle such 
> problems. Even a disk is a limited resource.

Yes, but ultimately that decision *has* to be made by the protocol 
implementor.  It depends on what is more important: allowing the sender to 
utilize maximum throughput, no matter what (in this case you'd spool to 
disk if you can't handle the messages fast enough), or to "push back" when 
the server is saturated (by either blocking the client using TCP mechanisms 
or by sending your own squelch message or some other mechanism).

>> So I'd cast my (useless and non-binding) vote behind either using 
>> ByteBuffer with static support methods, or using a byte array 
>> abstraction object with a separate buffer abstraction like Trustin 
>> suggests.

> Some experiment with code could help, at this point. Navigating at such 
> high level does not help a lot to grasp the real bytes we are dealing 
> with on the network level ;)

Agreed.

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
David M. Lloyd wrote:
> On 04/28/2008 10:46 AM, Emmanuel Lecharny wrote:
>> David M. Lloyd wrote:
>>>> - should we serialize the stream at some point ?
>>>
>>> What do you mean by "serialize"?
>> Write to disk if the received data are too big. See my previous point 
>> (it's up to the decoder to deal with this)
>
> Ah I understand.  Yes, it would be up to the decoder.  Though 
> hopefully most decoders can process the entire buffer without having 
> to keep the buffer data around.  The faster you can get rid of 
> incoming buffers, the less memory you will consume overall.
+1
>
>>> Note that a buffer might contain data from more than one message as 
>>> well. So it's important to use only a slice of the buffer in this case.
>> Not a big deal. Again, it's the decoder task to handle such a case. 
>> We have experimented such a case in LDAP too.
>
> Yes and no - we should at least make support methods available to make 
> this easier.  MINA won't win any fans by being labeled "framework that 
> makes decoder implementers do the most work" :-)
The best would be to offer an API with such a method: a blocking 
getNextByte() method.  Atm, you have to do something like 
BB.hasRemaining() before grabbing a byte.  Painful ... and inefficient!

>
>>>> - how to write an efficient encoder when you have no idea about the 
>>>> size of the data you are going to send ?
>>>
>>> Use a buffer factory, such as IoBufferAllocator, or use an even 
>>> simpler interface like this:
>>>
>>> public interface BufferFactory {
>>>     ByteBuffer createBuffer();
>>> }
>>> [..]
>> That's an idea. But this does not solve one little pb : if the reader 
>> is slow, you may saturate the server memory with prepared BB. So you 
>> may need a kind of throttle mechanism, or a blocking queue, to manage 
>> this issue : a new BB should not be created unless the previous one 
>> has been completely sent.
>
> Well that would really depend on the use case.  If you're sending the 
> buffers as soon as they're filled, it might not be a problem.  
I don't think so.  If you do that, you may stack up thousands of BBs if 
the client reader is slow.  Likely to become a real problem quite fast 
... unless you meant "sending to the client".
> If you saturate the output channel though, you'll have to be able to 
> handle that situation somehow.  It ultimately has to be up to the 
> protocol decoder to detect a saturated output channel and perform 
> whatever action is necessary to squelch the output message source 
> until the output channel can be cleared again.
I would prefer the underlying layer to handle this case.  The encoder is 
responsible for encoding the data, not for handling the client's 
sloppiness.  So when you have filled a BB to be sent to the client, just 
push it to the underlying layer and wait for that layer to tell you that 
you can provide one more BB, and so on.  With a correct timeout, you may 
even kill the encoder task to free some resources for serious clients :)
>
> The right answer might be to spool output data to a disk-backed queue, 
> or it might be to block further requests, etc.  Basically the same 
> situation that exists today.
At some point, we need to establish some policy about how to handle such 
problems.  Even a disk is a limited resource (yeah, you can still create 
GMail accounts and store data as mails, as each GMail account lets you 
store 2 GB or more, but sooner or later they will knock at your door ;)

(I have patented this idea: "An almost unlimited storage using GMail 
accounts")
>>> Another option is to skip ByteBuffers and go with raw byte[] objects 
>>> (though this closes the door completely to direct buffers).
>> Well, ByteBuffers are so intimately wired with NIO that I don't think 
>> we can easily use byte[] without losing performances... (not sure 
>> though ...)
>
> Not that I'm in favor of using byte[] directly, but it's easy enough 
> to wrap a (single) byte[] with a ByteBuffer using the 
> ByteBuffer.wrap() methods.
Yep.
>
>>> Yet another option is to have a simplified abstraction for byte 
>>> arrays like Trustin proposes, and use the stream classes for the 
>>> buffer state implementation.
>>>
>>> This is all in addition to Trustin's idea of providing a byte array 
>>> abstraction and a buffer state abstraction class.
>> I'm afraid that offering a byte[] abstraction might lead to more 
>> complexity, wrt with what you wrote about the way codec should handle 
>> data. At some point, your ideas are just the good ones, IMHO : use 
>> BB, and let the codec deal with it. No need to add more complex data 
>> structure on top of it.
>
> Maybe.  If this route is taken though, a very comprehensive set of 
> "helper" methods will probably be needed.  ByteBuffer has a pretty 
> lousy API. :-)
ByteBuffers are, hmmm, buffers :)
>
> So I'd cast my (useless and non-binding) vote behind either using 
> ByteBuffer with static support methods, or using a byte array 
> abstraction object with a separate buffer abstraction like Trustin 
> suggests.
Some experiments with code could help at this point.  Navigating at such 
a high level does not help much to grasp the real bytes we are dealing 
with at the network level ;)

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 10:46 AM, Emmanuel Lecharny wrote:
> David M. Lloyd wrote:
>>> - should we serialize the stream at some point ?
>>
>> What do you mean by "serialize"?
> Write to disk if the received data are too big. See my previous point 
> (it's up to the decoder to deal with this)

Ah I understand.  Yes, it would be up to the decoder.  Though hopefully 
most decoders can process the entire buffer without having to keep the 
buffer data around.  The faster you can get rid of incoming buffers, the 
less memory you will consume overall.

>> Note that a buffer might contain data from more than one message as 
>> well. So it's important to use only a slice of the buffer in this case.
> Not a big deal. Again, it's the decoder task to handle such a case. We 
> have experimented such a case in LDAP too.

Yes and no - we should at least make support methods available to make this 
easier.  MINA won't win any fans by being labeled "framework that makes 
decoder implementers do the most work" :-)

>>> - how to write an efficient encoder when you have no idea about the 
>>> size of the data you are going to send ?
>>
>> Use a buffer factory, such as IoBufferAllocator, or use an even 
>> simpler interface like this:
>>
>> public interface BufferFactory {
>>     ByteBuffer createBuffer();
>> }
>>[..]
> That's an idea. But this does not solve one little pb : if the reader is 
> slow, you may saturate the server memory with prepared BB. So you may 
> need a kind of throttle mechanism, or a blocking queue, to manage this 
> issue : a new BB should not be created unless the previous one has been 
> completely sent.

Well that would really depend on the use case.  If you're sending the 
buffers as soon as they're filled, it might not be a problem.  If you 
saturate the output channel though, you'll have to be able to handle that 
situation somehow.  It ultimately has to be up to the protocol decoder to 
detect a saturated output channel and perform whatever action is necessary 
to squelch the output message source until the output channel can be 
cleared again.

The right answer might be to spool output data to a disk-backed queue, or 
it might be to block further requests, etc.  Basically the same situation 
that exists today.

One of the "classic" NIO fallacies is to assume that the output channel 
will never block. :-)

>> Another option is to skip ByteBuffers and go with raw byte[] objects 
>> (though this closes the door completely to direct buffers).
> Well, ByteBuffers are so intimately wired with NIO that I don't think we 
> can easily use byte[] without losing performances... (not sure though ...)

Not that I'm in favor of using byte[] directly, but it's easy enough to 
wrap a (single) byte[] with a ByteBuffer using the ByteBuffer.wrap() methods.
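To spell out the zero-copy point: ByteBuffer.wrap() creates a view, not
a copy, so a write through the buffer lands in the original array.
WrapDemo is a made-up name for illustration:

```java
import java.nio.ByteBuffer;

// Demonstrates that ByteBuffer.wrap() copies nothing: mutating the
// buffer mutates the wrapped byte[] directly.
final class WrapDemo {
    static void zeroFirst(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        buf.put(0, (byte) 0); // absolute put - visible in raw[]
    }
}
```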

>> Yet another option is to have a simplified abstraction for byte arrays 
>> like Trustin proposes, and use the stream cleasses for the buffer 
>> state implementation.
>>
>> This is all in addition to Trustin's idea of providing a byte array 
>> abstraction and a buffer state abstraction class.
> I'm afraid that offering a byte[] abstraction might lead to more 
> complexity, wrt with what you wrote about the way codec should handle 
> data. At some point, your ideas are just the good ones, IMHO : use BB, 
> and let the codec deal with it. No need to add more complex data 
> structure on top of it.

Maybe.  If this route is taken though, a very comprehensive set of "helper" 
methods will probably be needed.  ByteBuffer has a pretty lousy API. :-)

So I'd cast my (useless and non-binding) vote behind either using 
ByteBuffer with static support methods, or using a byte array abstraction 
object with a separate buffer abstraction like Trustin suggests.

> Otherwise, the idea may be to define some simple codec which transforms a 
> BB into a byte[], for those who need it. As we have a cool Filter chain, 
> let's use it...

Any non-direct buffer is already backed by a byte[], so this would be 
pretty easy.  Though you'd have to pass up a byte[], int offs, int len to 
avoid copying.  Copying is really the #1 problem with IoBuffer as it exists 
today.
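The (byte[], offs, len) triple mentioned above has one subtlety: for a
wrapped or sliced buffer, the offset into the backing array is not the
position alone.  A minimal helper (the name SliceView is invented):

```java
import java.nio.ByteBuffer;

// For a non-direct buffer, the unread bytes can be described as
// (array, offset, length) with no copy at all; the offset must account
// for both arrayOffset() and position().
final class SliceView {
    static int offsetOfRemaining(ByteBuffer buf) {
        return buf.arrayOffset() + buf.position();
    }
}
```

A consumer would then receive `buf.array()`,
`SliceView.offsetOfRemaining(buf)` and `buf.remaining()` and never touch
a copy.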

- DML


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
David M. Lloyd wrote:
> <snip/>
>> I think that using a byte[] (for instance in the encoder), transform 
>> it to a ByteBuffer, is another way to deal with the problem.
>>
>> One important point is that ByteBuffers are just mean to contains a 
>> fixed amount of data. It's a buffer, not a data structure. 
>> Transforming ByteBuffer to make them able to expand twist their 
>> intrinsic semantic.
>
> Yes, it makes far more sense to accumulate buffers until you can 
> decode your message from it.
Or decode the stream as it comes, creating the object on the fly.  A 
stateful decoder...
>
>> So I would say that BB should be used on the very low level (reading 
>> data and sending data), but then, the other layers should use byte[] 
>> or a stream of bytes.
>
> I don't see the advantage of using byte[] honestly - using at the 
> least a wrapper object seems preferable.  
This is what we are doing in ADS: LDAP messages are built on the fly, 
simply by working with the ByteBuffers as they arrive.

So "accumulating BBs to create a big byte[]" should be understood as: 
transform the BBs directly into the targeted wrapper objects.  Thanks 
for correcting me :)
> And if you're going to use a wrapper object, why not just use ByteBuffer.
Because you may receive more than one BB before you can build the 
wrapper object.
>
>> This will lead to very interesting performance questions:
>> - how to handle large stream of data ?
>
> One buffer at a time. :-)
Well, I tried to think about other strategies, but, eh, you are just 
plain right ! It's up to the codec filter to deal with the complexity of 
the data it has to decode !
>
>> - should we serialize the stream at some point ?
>
> What do you mean by "serialize"?
Write to disk if the received data are too big. See my previous point 
(it's up to the decoder to deal with this)
>
>> - how to write an efficient decoder, when you may receive fractions 
>> of what you are waiting for ?
>
> An ideal decoder would be a state machine which can be entered and 
> exited at any state.  This way, even a partial buffer can be fully 
> consumed before returning to wait for the next buffer.
This is what we have in ADS: a stateful decoder.  Not as simple as when 
you have the whole data in memory, especially if you have to deal with 
multi-byte markers, but not too complex either.
>
> However many decoders are not ideal due to various constraints.  In 
> the worst case, you could accumulate ByteBuffer instances until you 
> have a complete message that can be handled.  What I do at this point 
> is to create a DataInputStream that encapsulates all the received 
> buffers.
Yeah, 100% agree.
>
> Note that a buffer might contain data from more than one message as 
> well. So it's important to use only a slice of the buffer in this case.
Not a big deal.  Again, it's the decoder's task to handle such a case. 
We have experimented with such a case in LDAP too.
(Makes me think that we should describe the LDAP codec on the MINA site, 
just to give some insight to people who want to write a stateful decoder.)
>
>> - how to write an efficient encoder when you have no idea about the 
>> size of the data you are going to send ?
>
> Use a buffer factory, such as IoBufferAllocator, or use an even 
> simpler interface like this:
>
> public interface BufferFactory {
>     ByteBuffer createBuffer();
> }
>
> which mass-produces pre-sized buffers.  In the case of stream-oriented 
> systems like TCP or serial, you could probably send buffers as you 
> fill them.  For message-oriented protocols like UDP, you can 
> accumulate all the buffers to send, and then use a single gathering 
> write to send them as a single message (yes, this stinks in the 
> current NIO implementation, as Trustin pointed out in DIRMINA-518, but 
> it's no worse than the repeated copying that auto-expanding buffers 
> use; and APR and other possible backends [and, if I have any say at 
> all in it, future OpenJDK implementations] would hopefully not suffer 
> from this limitation).
That's an idea.  But it does not solve one little problem: if the reader 
is slow, you may saturate the server memory with prepared BBs.  So you 
may need some kind of throttle mechanism, or a blocking queue, to manage 
this issue: a new BB should not be created unless the previous one has 
been completely sent.
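The throttle described here can be sketched with a counting semaphore.
ThrottledBufferFactory is a made-up name, and a real implementation
would be wired to the write-completion events of the transport:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.Semaphore;

// At most maxInFlight buffers exist at once: createBuffer() blocks
// until the underlying layer reports an earlier buffer as fully sent.
final class ThrottledBufferFactory {
    private final Semaphore permits;
    private final int bufferSize;

    ThrottledBufferFactory(int maxInFlight, int bufferSize) {
        this.permits = new Semaphore(maxInFlight);
        this.bufferSize = bufferSize;
    }

    ByteBuffer createBuffer() {
        permits.acquireUninterruptibly(); // blocks while the quota is used up
        return ByteBuffer.allocate(bufferSize);
    }

    // To be called by the layer that completed the write.
    void release(ByteBuffer sent) {
        permits.release();
    }
}
```

A slow reader then throttles the encoder naturally: once maxInFlight
buffers are queued, the producing thread waits instead of allocating.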
>
>> For all these reasons, the mail I sent a few days ago express my 
>> personnal opinion that IoBuffer may be a little bit overkilling 
>> (remember that this class -and the associated tests- represent around 
>> 13% of all mina common code ! )
>
> Yes, that's very heavy.  I looked at resolving DIRMINA-489 more than 
> once, and was overwhelmed by the sheer number of methods that had to 
> be implemented, and the overly complex class structure.
>
> One option could be to use ByteBuffer with some static support methods, 
+1
> and streams to act as the "user interface" into collections of 
> buffers.  For example, an InputStream that reads from a collection of 
> buffers, and an OutputStream that is configurable to auto-allocate 
> buffers, performing an action every time a buffer is filled:
>
> public interface BufferSink {
>     void handleBuffer(ByteBuffer buffer);
> }
That's an option.
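The auto-allocating OutputStream sketched in the quote might look like
this.  BufferSink matches the interface quoted above;
BufferOutputStream is a made-up name:

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

interface BufferSink {
    void handleBuffer(ByteBuffer buffer);
}

// Auto-allocates fixed-size buffers and hands each one to the sink as
// soon as it fills up; flush() pushes out a partially filled buffer.
final class BufferOutputStream extends OutputStream {
    private final BufferSink sink;
    private final int bufferSize;
    private ByteBuffer current;

    BufferOutputStream(BufferSink sink, int bufferSize) {
        this.sink = sink;
        this.bufferSize = bufferSize;
        this.current = ByteBuffer.allocate(bufferSize);
    }

    @Override
    public void write(int b) {
        current.put((byte) b);
        if (!current.hasRemaining()) {
            // buffer filled: flip it, hand it off, allocate the next one
            current.flip();
            sink.handleBuffer(current);
            current = ByteBuffer.allocate(bufferSize);
        }
    }

    @Override
    public void flush() {
        if (current.position() > 0) {
            current.flip();
            sink.handleBuffer(current);
            current = ByteBuffer.allocate(bufferSize);
        }
    }
}
```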
>
> Another option is to skip ByteBuffers and go with raw byte[] objects 
> (though this closes the door completely to direct buffers).
Well, ByteBuffers are so intimately wired into NIO that I don't think we 
can easily use byte[] without losing performance... (not sure though ...)
>
> Yet another option is to have a simplified abstraction for byte arrays 
> like Trustin proposes, and use the stream classes for the buffer 
> state implementation.
>
> This is all in addition to Trustin's idea of providing a byte array 
> abstraction and a buffer state abstraction class.
I'm afraid that offering a byte[] abstraction might lead to more 
complexity, with respect to what you wrote about the way a codec should 
handle data.  At some point, your ideas are just the right ones, IMHO: 
use BB and let the codec deal with it.  No need to add a more complex 
data structure on top of it.

Otherwise, the idea may be to define some simple codec which transforms 
a BB into a byte[], for those who need it.  As we have a cool filter 
chain, let's use it...

wdyt ?


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by fr...@free.fr.
Hi all,
I am not part of the MINA development process; I am just an end user.
I try to follow your discussion and I mostly agree with your points.

I have, however, one concern.  OK, it is a personal one, but I am
probably not the only one.

I know that the MINA 2 M1 API is not stable and could change in the RC.
But I would suggest creating one specific "help" page for migrating
from 2-M1 (or trunk) to 2-RC when it is released, and starting this doc
from an early revision (M2, M3, ...).
Why?  Because you have often suggested using the 2-M1 version instead
of the 1.x version (for obvious reasons of improvement).

And now, as an end user, I have just published the V1.0
(ready-for-production) version of my project using MINA 2 (a pre-M2
version, so from trunk).  Of course, I had taken into account that the
V2 API can change, but I would like those changes to be made with
respect for the end user, hence the documentation.
In order to be independent of MINA's roadmap, I always bundle "my
stable" version of MINA with the project.
From time to time, I update the project to use a more recent version
of MINA.

Of course, if you think it can be useful, I can help with this doc,
since I would probably get into it whether the doc exists or not.
On the other hand, I haven't used every part of MINA, so I could
probably miss some parts...

This was just my 2 cents of contribution.
But please go ahead to improve Mina, as always !

Frederic

"이희승 (Trustin Lee)" wrote:

> Emmanuel Lecharny wrote:
> > "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> >> So... The changes I am proposing are all about simplification, sacrificing
> >> backward compatibility.  This might affect people who are using MINA 2
> >> M1 or trunk.  We had to think about all these issues before recommending
> >> people to use 2 M1 and placing the download links in the very top of the
> >> download page.  (we could move the links to the unstable releases
> >> section.)
> >>
> > The problem is that we already have people who are using MINA 2. Now,
> > the question is : should we release it and go for a MINA 3 with a
> > completely new API, or should we change MINA 2 API now before we release
> > it as a RC...
> >
> > That's not a simple question.
> >
> > My personal choice would be to do it now, before MINA is widely used.
> > We never said that MINA 2 API was stable. As soon as we deliver the
> > first RC, we are done.
> >
> > Another option (and may be better) would be to deprecate dead API (but
> > that means we still have to guarantee some compatibility... No fun at
> > all !)
>
> I concur with you.  I'd prefer to make the changes in M2.
>
> --
> Trustin Lee - Principal Software Engineer, JBoss, Red Hat
> --
> what we call human nature is actually human habit
> --
> http://gleamynode.net/
>
>



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> So... The changes I am proposing are all about simplification, sacrificing
>> backward compatibility.  This might affect people who are using MINA 2
>> M1 or trunk.  We had to think about all these issues before recommending
>> people to use 2 M1 and placing the download links in the very top of the
>> download page.  (we could move the links to the unstable releases
>> section.)
>>   
> The problem is that we already have people who are using MINA 2. Now,
> the question is : should we release it and go for a MINA 3 with a
> completely new API, or should we change MINA 2 API now before we release
> it as a RC...
> 
> That's not a simple question.
> 
> My personal choice would be to do it now, before MINA is widely used.
> We never said that MINA 2 API was stable. As soon as we deliver the
> first RC, we are done.
> 
> Another option (and may be better) would be to deprecate dead API (but
> that means we still have to guarantee some compatibility... No fun at
> all !)

I concur with you.  I'd prefer to make the changes in M2.

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> So... The changes I am proposing are all about simplification,
> sacrificing backward compatibility.  This might affect people who are
> using MINA 2 M1 or trunk.  We had to think about all these issues before
> recommending people to use 2 M1 and placing the download links at the
> very top of the download page.  (We could move the links to the unstable
> releases section.)
>   
The problem is that we already have people who are using MINA 2. Now, 
the question is : should we release it and go for a MINA 3 with a 
completely new API, or should we change MINA 2 API now before we release 
it as a RC...

That's not a simple question.

My personal choice would be to do it now, before MINA is widely used. 
We never said that the MINA 2 API was stable.  As soon as we deliver the 
first RC, we are done.

Another option (and maybe a better one) would be to deprecate the dead 
API (but that means we still have to guarantee some compatibility... No 
fun at all!)

wdyt ?


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Emmanuel Lecharny wrote:
> 
>>> I don't see the advantage of using byte[] honestly - using at the least
>>> a wrapper object seems preferable.  And if you're going to use a wrapper
>>> object, why not just use ByteBuffer.
>>>     
>>
>> I'd prefer to introduce an interface type because ByteBuffer is
>> impossible to extend.  I can do some quick and dirty coding to prove the
>> concept.  How does it sound?
>>   
> Defining an interface won't be helpful... It will end up with almost
> the very same implementation as the one we currently have (IoBuffer).
> 
> If you consider the reasons why IoBuffer was created (i.e., a lousy BB
> interface, not expandable), I think that David's idea of defining a
> helper class seems slightly better.
> 
> For instance, BufferUtils.getLong( ByteBuffer ) or such methods can be
> as convenient.
> 
> (That's a pity we can't extend ByteBuffer...)

Indeed.  The interface I am trying to introduce is something smaller
than IoBuffer.  It's rather close to what ByteBuffer offers: simple
getters and setters for primitive types.

We have static imports, so introducing such a utility class shouldn't be
a problem.
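To make that concrete, here is a rough sketch of what I have in mind (all the names here - ByteArray, HeapByteArray - are hypothetical, quick proof-of-concept code rather than actual MINA API):

```java
public class ByteArraySketch {
    // Hypothetical, minimal byte storage abstraction: no position,
    // limit or mark, just indexed access to bytes.
    public interface ByteArray {
        int length();
        byte getByte(int index);
        void setByte(int index, byte value);
    }

    // Simplest possible implementation, backed by a plain byte[].
    public static final class HeapByteArray implements ByteArray {
        private final byte[] array;
        public HeapByteArray(int capacity) { array = new byte[capacity]; }
        public int length() { return array.length; }
        public byte getByte(int index) { return array[index]; }
        public void setByte(int index, byte value) { array[index] = value; }
    }

    // Utility-style accessor, usable via static import: multi-byte
    // getters live outside the storage type (big-endian here).
    public static int getInt(ByteArray b, int index) {
        return ((b.getByte(index)     & 0xFF) << 24)
             | ((b.getByte(index + 1) & 0xFF) << 16)
             | ((b.getByte(index + 2) & 0xFF) <<  8)
             |  (b.getByte(index + 3) & 0xFF);
    }

    public static void main(String[] args) {
        ByteArray b = new HeapByteArray(8);
        b.setByte(0, (byte) 0x12);
        b.setByte(1, (byte) 0x34);
        b.setByte(2, (byte) 0x56);
        b.setByte(3, (byte) 0x78);
        System.out.println(Integer.toHexString(getInt(b, 0))); // 12345678
    }
}
```

The buffer-state part (read index, write index) would then be a separate class layered on top of this.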

>> Allocating a pre-sized buffer per connection can cause OOM when there
>> are massive number of idle connections.  
> +1
>> We could use composite buffer
>> in this case too because we can adjust the size of the buffer more
>> flexibly without causing reallocation.
>>   
> Creating a BB is not costly in java 6. The idea would be to create BB
> with the best possible size depending on the network capability (of
> course, that mean we have some way to know what is the best size ...)
> 
>> Gathering write is also a piece of cake if composite buffer is realized.
>>  Of course it should be friendly with your suggested allocation
>> mechanism.
>>   
> I remember having read some post about gathering writes, where it is
> explained that you should not take care of that on the BB level, as the
> JVM will handle it by itself. Jean-François Arcan's post may be ?

Hmm, I don't remember.  But don't you at least have to provide an array
of ByteBuffers to perform a gathering write or scattering read?
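For reference, a gathering write at the NIO level does require the caller to assemble the ByteBuffer[] itself - a minimal self-contained sketch (using a temp-file FileChannel in place of a socket, and modern Java conveniences for brevity):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GatheringWriteSketch {
    // Writes several buffers in a single gathering operation and returns
    // the resulting file contents so the effect is visible.
    public static String gatherToFile() throws IOException {
        ByteBuffer header = ByteBuffer.wrap("HEAD:".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer body   = ByteBuffer.wrap("body".getBytes(StandardCharsets.US_ASCII));

        Path tmp = Files.createTempFile("gather", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            // GatheringByteChannel.write(ByteBuffer[]) - the array must be
            // assembled by the caller; the JVM does not do this for you.
            ch.write(new ByteBuffer[] { header, body });
        }
        String result = new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII);
        Files.delete(tmp);
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(gatherToFile()); // HEAD:body
    }
}
```

A composite buffer could hand its internal ByteBuffer[] straight to such a write, which is where the close-to-zero-copy claim comes from.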

>> <snip/>
>>
>> Thanks for mentioning all possibilities.  I'm leaning toward to my
>> original idea of providing byte array abstraction and buffer state
>> abstraction types because it can cover all cases you've mentioned.
>> Also, we can provide the sheer number of getters and putters for users
>> who like them anyway - it should end up with much simpler one monolithic
>> class - a way simpler.
>>   
> That would be good ! And easier for people who want to enter the API to
> maintain it !

I even think such a monolithic class might not be needed if we are going
to use utility classes backed by static imports.

So.. The changes I am proposing are all about simplification at the
expense of backward compatibility.  This might affect people who are
using MINA 2 M1 or trunk.  We have to think about all these issues
before recommending people use 2 M1 and placing the download links at
the very top of the download page.  (We could move the links to the
unstable releases section.)

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
>> I don't see the advantage of using byte[] honestly - using at the least
>> a wrapper object seems preferable.  And if you're going to use a wrapper
>> object, why not just use ByteBuffer.
>>     
>
> I'd prefer to introduce an interface type because ByteBuffer is
> impossible to extend.  I can do some quick and dirty coding to prove the
> concept.  How does it sound?
>   
Defining an interface won't be helpful... It will end up with almost 
the same implementation as the one we currently have (IoBuffer).

If you consider the reasons why IoBuffer was created (ie, lousy BB 
interface, not expandable), I think that David's idea to define a helper 
class seems slightly better.

For instance, BufferUtils.getLong( ByteBuffer ) or similar methods can 
be just as convenient.

(That's a pity we can't extend ByteBuffer...)
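A quick sketch of that helper-class idea (BufferUtils here is hypothetical; with a static import the call sites stay short, and absolute-index reads never touch position/limit):

```java
import java.nio.ByteBuffer;

// Hypothetical helper class in the spirit of David's suggestion:
// static methods over a plain NIO ByteBuffer, usable via static import.
public final class BufferUtils {
    private BufferUtils() {}

    // Absolute read: does not move the buffer's position.
    public static long getUnsignedInt(ByteBuffer buf, int index) {
        return buf.getInt(index) & 0xFFFFFFFFL;
    }

    // Relative read; just delegates, but shows where convenience
    // methods missing from ByteBuffer itself could live.
    public static long getLong(ByteBuffer buf) {
        return buf.getLong();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putInt(0, -1); // stores 0xFFFFFFFF
        System.out.println(getUnsignedInt(buf, 0)); // 4294967295
    }
}
```

Call sites would then read `getUnsignedInt(buf, 0)` via `import static`, with no new buffer type to learn.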
> Allocating a pre-sized buffer per connection can cause OOM when there
> are massive number of idle connections.  
+1
> We could use composite buffer
> in this case too because we can adjust the size of the buffer more
> flexibly without causing reallocation.
>   
Creating a BB is not costly in Java 6. The idea would be to create BBs 
with the best possible size depending on the network capability (of 
course, that means we need some way to know what the best size is...)

> Gathering write is also a piece of cake if composite buffer is realized.
>  Of course it should be friendly with your suggested allocation mechanism.
>   
I remember having read some post about gathering writes, where it is 
explained that you should not take care of that at the BB level, as the 
JVM will handle it by itself. Jean-François Arcand's post, maybe ?
> <snip/>
>
> Thanks for mentioning all possibilities.  I'm leaning toward to my
> original idea of providing byte array abstraction and buffer state
> abstraction types because it can cover all cases you've mentioned.
> Also, we can provide the sheer number of getters and putters for users
> who like them anyway - it should end up with much simpler one monolithic
> class - a way simpler.
>   
That would be good ! And easier for people who want to enter the API to 
maintain it !


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David M. Lloyd wrote:
>> On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
>>> May be but this is just a Buffer, not a data structure ! BB are really
>>> meant to be used as a fixed and temporary storage, not as something a
>>> user application can use at will.
>> Yes, I think the important change is to break the 1:1 association
>> between a buffer and a message.
>>
>> That's what this part of the thread is really all about I think.
> 
> Very true.  I actually don't care even if we use NIO ByteBuffer as a
> underlying storage and build something on top of it.  The problem is to
> find out how we can protect users from modifying position, limit, mark
> and order - these four properties must be handled in our abstraction
> layer, which breaks the 1:1 association.
> 
> We might be able to enforce such restriction without introducing new
> types.  Let me research a little bit about this and check in the prototype.

Uh-oh.  It's simply impossible due to the two abstract methods in
ByteBuffer - _get() and _set(). :-(

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> Emmanuel Lecharny wrote:
>   
>>> If those internal values shouldn't be modified, then there should be
>>> zero possibility.  Otherwise, it's not internal and the contract is too
>>> loose.  API should minimize user's mistake wherever possible.
>>>   
>>>       
>> The pb is that BB should be seen as a 2 ways API : you can read a BB,
>> but you can also feed it with data. This is the reason why you can
>> modify the position, mark etc.
>>
>> Would they want to protect those values, they would have provided
>> factories for a read-only BB.
>>     
>
> Read only ByteBuffer doesn't mean read-only position, limit, etc.
>   
Sure, but you get the idea: we don't really need to care about position, 
limit, etc...
>   
>> Gain, I think you are over-interpreting the BB semantic. They are just
>> Buffers, so they are just supposed to be transitory, not permanent. An
>> advised user should be aware of that, so I don't think we should be
>> over-pretective, because it comes to a price the user don't necessarily
>> want to pay.
>>     
>
> Not all users use codec framework we provide, and users often build
> their own filter which deals with buffers.  They should be covered.
>   
When a user comes to write a Filter, he usually has quite a good 
knowledge and is not a beginner. We should assume that we are dealing 
with experienced users (even if that's not always the case).
>>>> If we do care, maybe another option would be to transform those BB to
>>>> byte[] and let the user play with the limits by himself...
>>>>     
>>>>         
>>> Direct buffer is a show-stopper for that unfortunately.
>>>   
>>>       
>> True, you can't get a byte[] out of a direct buffer, but direct buffer
>> has a very specific semantic which is known by the user. If the user has
>> selected a direct buffer, he is aware that he can't manipulate bytes at
>> will. In any case, Direct buffers should not be used in the vast
>> majority of cases. Even the small performance tests you have conducted
>> show that Direct buffer perform slower than Heap buffer.
>>
>> I do think that Direct BB should be used very carefully, and in the
>> spirit they were created for : video memory access, device memory
>> access, etc... Not as a speed up in a network framework...
>>     
>
> According to my recent benchmark, Java 6 VM showed that a direct buffer
> outperforms a heap buffer for a unknown reason.  
Well, I would like to see some more specific benchmarks (real ones, not 
micro-benchmarks), but anyway, this is not the thing we are discussing 
here. Even if direct buffers are faster, we should _not_ use them just 
because they are faster. There are many costs associated with direct 
buffers, including the fact that you can simply kill a machine by eating 
all of its memory, which is not possible with heap buffers. Again, direct 
buffers were not created for such a general purpose.
> The problem is
> allocation and deallocation, but it is also possible to work around.  Of
> course, we should be careful, but it doesn't mean that we shouldn't
> support a direct buffer.
>   
We should support them, but BB already has support for them. Let the 
user decide what to do, and let him deal with the fact that some of BB's 
methods won't work when he is using direct buffers.
>> - huge code has to be written (13% of the total MINA code base)
>>     
>
> It's because of the current implementation.  It should be different in
> the new implementation we are discussing now.
>   
Sure
>   
>> - users have to learn a new API
>>     
>
> The number of classes they have to learn doesn't change much at all
> considering the number of all MINA classes.
>   
Yes, but no need to add a new one then :) KISS !
>   
>> - as users are lazzy (we all are...), they will make mistakes using the
>> new API, so we will spend more time fixing their errors
>>     
> The new API is more intuitive than NIO ByteBuffer IMHO.
Maybe. But I'm not sure. And our users are supposed to know about BB already.
>   I think most
> users know better about accessing an array and using Iterator than using
> flip() and compact().  
Iterator is one of the few classes I really don't like to deal with. 
It is cumbersome when debugging, as you have no way to know which will 
be the next object until the next() method is called. It's a good 
pattern for many complex collections, but when it comes to a byte array, 
it is not exactly what comes to mind at first shot...

I agree that flip() and compact() are also a burden... (I never used 
compact(), and I have been burnt by flip() more than once, so...), but at 
least, once you understand how a BB works, this should not be a problem.
> We have received a lot of questions from users
> who forgot to call flip() - they didn't even know what flip() is in many
> cases.  
Very true. And you will get plenty of user questions about the new API 
too. Users may be inexperienced, that's life !
> We don't get such a question now after we modified MINA to throw
> an exception with some message ("call flip()").
>   
Sometime, a good RTFM helps :)
> However, why should we do such a runtime check when we can provide
> something much easier to understand?
>   
Ok, I see your point.

The basic idea here would be to offer two options depending on the 
user's level :
- BB access for experienced users
- a layer on top of BB for dummies, with all the associated costs

I agree that the BB API is lousy, that it's a pity it can't be extended, 
and that we don't have subclasses for DirectBB and HeapBB, but for those 
of us who are writing real stacks on top of MINA, I would rather deal 
with the BB API than pay the overhead of learning a new API and pay a 
price for using it in terms of speed.
>   
>> - performances will be lower, because we will stack another layer on top
>> of the existing one
>>     
>
> I did performance test to make sure the performance doesn't get worse if
> we use an interface to access a byte array.  There was no performance loss.
>   
You will have a performance loss, that's guaranteed. If you see no loss, 
then it's because your measurements were not accurate enough. Better to 
say that the performance loss will be minimal, but you have to be aware 
that you will use more CPU, more heap, more memory and more GC. There is 
nothing bad about it, but it has to be known.
>   
>> - you will loose some semantic, and hide some of it to the user, who for
>> instance may use Direct BB when they should not, simply because you
>> masked the underlying semantic.
>>     
>
> I don't understand what you mean here.  The default buffer will be heap
> buffers and when users want to alocate a direct buffer, it should be
> explicitly specified.
>   
What I mean is that having a flag telling the API that the structure you 
want to allocate is a Direct BB under the hood is not enough. The BB API 
is lousy in this respect too, as some of its methods won't react the same 
way depending on the kind of BB you have allocated (ie, the array() 
method). Sun should have come up with a DirectBB inheriting from a BB 
interface instead of merging those two kinds of BB. Anyway... The 
important point is that Direct BB have their own semantics, which are 
totally different from Heap BB, and we should let the user deal with this.

Direct BB should _not_ be used in place of Heap BB just because the user 
set a flag.
> Of course, it's very different from calling position(int) or limit(int)
> by mistake (?).  They are sitting down there in front of users and the
> users are supposed to make a mistake unless they are hidden, and we can
> hide them in compilation time.
>   
Users make mistakes, all the time :) Poor users ;)

Let's close this thread, which is becoming much more theoretical than 
necessary :
- I buy the idea of having an 'uber' API which hides the intricacies of 
BB from level 1 users
- I think that if we write such an API, we may want to expose two 
interfaces : one for Heap BB and one for Direct BB
- I also want to be able to use NIO BB without depending on this high 
level abstraction, and I'm pretty sure that some level 2 users will like 
to do so.

wdyt ?
> Thanks,
>   


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>>> Why do you want to protect those internal values ? The contract is
>>> pretty clear, and if a user fucks with those values, eh, too bad for
>>> him !
>>>     
>>
>> If those internal values shouldn't be modified, then there should be
>> zero possibility.  Otherwise, it's not internal and the contract is too
>> loose.  API should minimize user's mistake wherever possible.
>>   
> The pb is that BB should be seen as a 2 ways API : you can read a BB,
> but you can also feed it with data. This is the reason why you can
> modify the position, mark etc.
> 
> Would they want to protect those values, they would have provided
> factories for a read-only BB.

A read-only ByteBuffer doesn't mean a read-only position, limit, etc.
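A tiny demonstration of the point - asReadOnlyBuffer() only protects the content, not the cursor state:

```java
import java.nio.ByteBuffer;

public class ReadOnlyStateSketch {
    // Shows that a read-only view still has fully mutable position/limit.
    public static int movePosition() {
        ByteBuffer ro = ByteBuffer.allocate(16).asReadOnlyBuffer();
        // ro.put((byte) 1) would throw ReadOnlyBufferException here, but...
        ro.position(8);   // ...the cursor state is freely modifiable,
        ro.limit(12);     // so a downstream filter can still corrupt it.
        return ro.position();
    }

    public static void main(String[] args) {
        System.out.println(movePosition()); // 8
    }
}
```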

> Gain, I think you are over-interpreting the BB semantic. They are just
> Buffers, so they are just supposed to be transitory, not permanent. An
> advised user should be aware of that, so I don't think we should be
> over-pretective, because it comes to a price the user don't necessarily
> want to pay.

Not all users use the codec framework we provide, and users often build
their own filters which deal with buffers.  They should be covered.

>>> If we do care, maybe another option would be to transform those BB to
>>> byte[] and let the user play with the limits by himself...
>>>     
>>
>> Direct buffer is a show-stopper for that unfortunately.
>>   
> True, you can't get a byte[] out of a direct buffer, but direct buffer
> has a very specific semantic which is known by the user. If the user has
> selected a direct buffer, he is aware that he can't manipulate bytes at
> will. In any case, Direct buffers should not be used in the vast
> majority of cases. Even the small performance tests you have conducted
> show that Direct buffer perform slower than Heap buffer.
> 
> I do think that Direct BB should be used very carefully, and in the
> spirit they were created for : video memory access, device memory
> access, etc... Not as a speed up in a network framework...

According to my recent benchmark, a Java 6 VM showed that a direct buffer
outperforms a heap buffer for an unknown reason.  The problem is
allocation and deallocation, but that is also possible to work around.
Of course, we should be careful, but it doesn't mean that we shouldn't
support direct buffers.

>>> But I do think that as soon as you have been burnt once with the limit,
>>> position and mark, as a smart user, you RTFM and try to respect the API
>>> contract :)
>>>     
>>
>> That's what's called 'bad impression' IMHO.  API should respect the user
>> before the user respects the API.
>>   
> <IMHO>
> 
> Actually, I think it should be the opposite : Users should respect the
> API, and the API should respect the user who respects the API...
> 
> As a matter of fact, the french law system says that everything which is
> not strictly forbidden by law is authorized, which is much more
> powerfull than the opposite position : you can only do what the law
> authorizes. (I guess it's the same in any democratic country). An API
> should be as permissive as the law.
> 
> That don't mean you can't derive a more strict API on top of BB (this is
> what we currently have), but at some point, you have to question
> yourself : "is it valuable ?". Here, I think the cost is too high in
> many respects :
> - huge code has to be written (13% of the total MINA code base)

It's because of the current implementation.  It should be different in
the new implementation we are discussing now.

> - users have to learn a new API

The number of classes they have to learn doesn't change much at all
considering the number of all MINA classes.

> - as users are lazzy (we all are...), they will make mistakes using the
> new API, so we will spend more time fixing their errors

The new API is more intuitive than NIO ByteBuffer IMHO.  I think most
users are more familiar with accessing an array and using an Iterator
than with flip() and compact().  We have received a lot of questions
from users who forgot to call flip() - they often didn't even know what
flip() was.  We don't get such questions anymore, now that we modified
MINA to throw an exception with a message ("call flip()").

However, why should we do such a runtime check when we can provide
something much easier to understand?
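For anyone who hasn't been bitten by it, the flip() pitfall in miniature:

```java
import java.nio.ByteBuffer;

public class FlipSketch {
    // The classic write-then-read cycle: forgetting flip() means reading
    // from the current position (past the data) instead of from index 0.
    public static int writeThenRead() {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putInt(42);          // position is now 4
        buf.flip();              // limit = 4, position = 0: ready to read
        return buf.getInt();     // without flip(), this would read the
                                 // zero bytes after the data instead
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead()); // 42
    }
}
```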

> - performances will be lower, because we will stack another layer on top
> of the existing one

I did performance test to make sure the performance doesn't get worse if
we use an interface to access a byte array.  There was no performance loss.

> - you will loose some semantic, and hide some of it to the user, who for
> instance may use Direct BB when they should not, simply because you
> masked the underlying semantic.

I don't understand what you mean here.  The default buffer will be a
heap buffer, and when users want to allocate a direct buffer, it should
be explicitly specified.

Of course, it's very different from calling position(int) or limit(int)
by mistake.  Those methods are sitting right there in front of users,
and users are bound to misuse them unless they are hidden - and we can
hide them at compile time.

Thanks,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> Why do you want to protect those internal values ? The contract is
>> pretty clear, and if a user fucks with those values, eh, too bad for him !
>>     
>
> If those internal values shouldn't be modified, then there should be
> zero possibility.  Otherwise, it's not internal and the contract is too
> loose.  API should minimize user's mistake wherever possible.
>   
The pb is that BB should be seen as a 2 ways API : you can read a BB, 
but you can also feed it with data. This is the reason why you can 
modify the position, mark, etc.

Had they wanted to protect those values, they would have provided 
factories for a read-only BB.

Again, I think you are over-interpreting the BB semantics. They are just 
Buffers, so they are just supposed to be transitory, not permanent. An 
advised user should be aware of that, so I don't think we should be 
over-protective, because it comes at a price the user doesn't 
necessarily want to pay.
>   
>> If we do care, maybe another option would be to transform those BB to
>> byte[] and let the user play with the limits by himself...
>>     
>
> Direct buffer is a show-stopper for that unfortunately.
>   
True, you can't get a byte[] out of a direct buffer, but a direct buffer 
has a very specific semantic which is known by the user. If the user has 
selected a direct buffer, he is aware that he can't manipulate bytes at 
will. In any case, direct buffers should not be used in the vast 
majority of cases. Even the small performance tests you have conducted 
show that direct buffers perform slower than heap buffers.

I do think that Direct BB should be used very carefully, and in the 
spirit they were created for : video memory access, device memory 
access, etc... Not as a speed-up in a network framework...
>   
>> But I do think that as soon as you have been burnt once with the limit,
>> position and mark, as a smart user, you RTFM and try to respect the API
>> contract :)
>>     
>
> That's what's called 'bad impression' IMHO.  API should respect the user
> before the user respects the API.
>   
<IMHO>

Actually, I think it should be the opposite : users should respect the 
API, and the API should respect the user who respects the API...

As a matter of fact, the French legal system says that everything which 
is not strictly forbidden by law is authorized, which is much more 
powerful than the opposite position : you can only do what the law 
authorizes. (I guess it's the same in any democratic country.) An API 
should be as permissive as the law.

That doesn't mean you can't derive a stricter API on top of BB (this is 
what we currently have), but at some point, you have to ask yourself : 
"is it valuable ?". Here, I think the cost is too high in many respects :
- a huge amount of code has to be written (13% of the total MINA code base)
- users have to learn a new API
- as users are lazy (we all are...), they will make mistakes using the 
new API, so we will spend more time fixing their errors
- performance will be lower, because we will stack another layer on top 
of the existing one
- you will lose some semantics, and hide some of them from the user, who 
may for instance use Direct BB when they should not, simply because you 
masked the underlying semantics.

</IMHO>


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> David M. Lloyd wrote:
>>  
>>> On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
>>>    
>>>> May be but this is just a Buffer, not a data structure ! BB are really
>>>> meant to be used as a fixed and temporary storage, not as something a
>>>> user application can use at will.
>>>>       
>>> Yes, I think the important change is to break the 1:1 association
>>> between a buffer and a message.
>>>
>>> That's what this part of the thread is really all about I think.
>>>     
>>
>> Very true.  I actually don't care even if we use NIO ByteBuffer as a
>> underlying storage and build something on top of it.  The problem is to
>> find out how we can protect users from modifying position, limit, mark
>> and order - these four properties must be handled in our abstraction
>> layer, which breaks the 1:1 association.
>>   
> Why do you want to protect those internal values ? The contract is
> pretty clear, and if a user fucks with those values, eh, too bad for him !

If those internal values shouldn't be modified, then there should be
zero possibility of modifying them.  Otherwise, they're not internal and
the contract is too loose.  An API should minimize user mistakes
wherever possible.

> If we do care, maybe another option would be to transform those BB to
> byte[] and let the user play with the limits by himself...

A direct buffer is a show-stopper for that, unfortunately.
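Concretely, the byte[] escape hatch is simply unavailable for direct buffers:

```java
import java.nio.ByteBuffer;

public class DirectArraySketch {
    // Returns whether a backing byte[] is reachable for the given buffer.
    public static boolean backedByArray(ByteBuffer buf) {
        return buf.hasArray();   // false for direct buffers (no heap array)
    }

    public static void main(String[] args) {
        System.out.println(backedByArray(ByteBuffer.allocate(8)));       // true
        System.out.println(backedByArray(ByteBuffer.allocateDirect(8))); // false
        // Calling allocateDirect(8).array() would throw
        // UnsupportedOperationException instead of returning bytes.
    }
}
```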

> But I do think that as soon as you have been burnt once with the limit,
> position and mark, as a smart user, you RTFM and try to respect the API
> contract :)

That's what's called a 'bad impression' IMHO.  The API should respect
the user before the user respects the API.

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David M. Lloyd wrote:
>   
>> On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
>>     
>>> May be but this is just a Buffer, not a data structure ! BB are really
>>> meant to be used as a fixed and temporary storage, not as something a
>>> user application can use at will.
>>>       
>> Yes, I think the important change is to break the 1:1 association
>> between a buffer and a message.
>>
>> That's what this part of the thread is really all about I think.
>>     
>
> Very true.  I actually don't care even if we use NIO ByteBuffer as a
> underlying storage and build something on top of it.  The problem is to
> find out how we can protect users from modifying position, limit, mark
> and order - these four properties must be handled in our abstraction
> layer, which breaks the 1:1 association.
>   
Why do you want to protect those internal values ? The contract is 
pretty clear, and if a user fucks with those values, eh, too bad for him !

If we do care, maybe another option would be to transform those BB to 
byte[] and let the user play with the limits by himself...

But I do think that as soon as you have been burnt once by the limit, 
position and mark, as a smart user, you RTFM and try to respect the API 
contract :)

Am I missing something important ?

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Daniel Wirtz <da...@virtunity.com>.
Hello,
I have followed this IoBuffer discussion for some time now, and I have a
general concern:

I don't see a need to deal with buffers at all outside the framework, or
even to invent a buffer API a developer has to learn. What about
implementing the whole buffer thing as a non-blocking composite stream?

The first thing is: you said that there is a need to not expose the API
too much, to prevent mistakes. When using an IoInputStream version of a
composite buffer, for example, there is no need to deal with position,
limit, mark etc. Everything is just read once, and MINA could convert
input streams to/from output streams for internal population or for
giving them to protocols in the right type.

The next thing is the composite buffer idea: when using streams, complete
or partial basic byte arrays can be appended with zero copy. The only
thing that needs to be done is to take care of this when reading data
from the combined byte arrays, merging them if necessary, e.g. for
readInt(), readString() etc. You could even slice the underlying byte
array arrays into another stream with appropriate methods. Buffers could
even overlap between the original and the sliced buffer, assuming that no
more data is written to the sliced one (it's a stream, a pure
IoInputStream without put methods).

Another advantage is that the coherent composite stream could be reused
after a decoder has read the available bytes, and filled with more data
as it comes in, so that a developer would not need to handle buffering on
his own. He just checks if enough data is available (e.g. for the header
of a binary protocol, or a string for a text-based one) and reads it -
the decoder just needs to be notified when new data is available. If only
partially enough content is available, the not-yet-consumed bytes are
kept inside the stream, which is filled further the next time data is
read. Also, completely consumed byte arrays could be freed automatically
by removing their references from the stream.
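A minimal sketch of such a composite stream (the name and details are hypothetical; a real implementation would need thread safety and the stream conversions described above):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical composite stream: byte[] chunks are appended without
// copying, fully consumed chunks are dropped so they can be collected,
// and available() tells a decoder how much data it may read.
public class CompositeByteStream extends InputStream {
    private final Deque<byte[]> chunks = new ArrayDeque<byte[]>();
    private int offset;      // read offset into the head chunk
    private int available;   // cached total of unread bytes

    // Zero-copy append: the array is referenced, not copied.
    public void append(byte[] chunk) {
        chunks.addLast(chunk);
        available += chunk.length;
    }

    @Override public int available() { return available; }

    @Override public int read() throws IOException {
        byte[] head = chunks.peekFirst();
        if (head == null) return -1;     // no data buffered
        int b = head[offset++] & 0xFF;
        available--;
        if (offset == head.length) {     // chunk fully consumed:
            chunks.removeFirst();        // drop the reference so it
            offset = 0;                  // can be garbage collected
        }
        return b;
    }

    // Convenience accessor in the spirit of readInt() (big-endian),
    // transparently reading across chunk boundaries.
    public int readInt() throws IOException {
        if (available < 4) throw new EOFException("need 4 bytes");
        return (read() << 24) | (read() << 16) | (read() << 8) | read();
    }
}
```

A decoder would call available() (or catch the EOFException) to decide whether enough data has arrived, exactly as described above.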

On this basis, MINA could populate an IoOutputStream for each connection
and fill it with byte arrays without needing to copy anything; it just
appends basic byte arrays to it. The decoder is notified when new data is
available so it can check whether it is able to continue processing (e.g.
the stream emits an EOFException if not enough data is available on
readString(), or a binary decoder just checks the available() byte
count). For this, MINA converts the IoOutputStream to an IoInputStream
(new IoInputStream(output) - zero copy of course, same byte array arrays)
and gives it to the decoder for further processing, which generates some
sort of messages for the decoder output - you all know the procedure.
When this is done, the stream is reused (new IoOutputStream(input)) and
filled with even more byte arrays. All not-yet-consumed data is kept
available for the decoder.

If a developer creates a new IoOutputStream, large enough byte arrays
could just be appended. For small values (e.g. for putInt() etc.), the
stream could expand by just adding another buffer of a predefined size
(e.g. 128 bytes by default, or explicitly set by the developer if he is
concerned about performance) for a sequence of put operations - so there
is still an easy way to provide the extension mechanism.

So: almost every Java developer knows how to deal with streams, and the
composite and API issues would be solved, too. It should not be too hard
to add all the put/get Byte/Short/Int/Long/String methods to it to read
data types from overlapping byte arrays. The available() method counting
the remaining bytes can also be cached on read/write so that no costly
iteration over byte array arrays (I start to like this phrase) needs to
be done.

However, I am very new to MINA and I may just be missing the point, but
for my binary FastCGI implementation this kind of stream would satisfy
all my needs.

For example: first, an 8-byte header is read, containing some
protocol-specific stuff and the content length. Afterwards, when enough
content bytes are available, the stream is added to the message as the
content (sliced to a fixed length) and the message containing the sliced
stream is written to the decoder output. Afterwards, decoding starts
again on the still-existing composite stream, and nothing had to be
copied. The buffers at the end of the stream (which may contain the last
content bytes and a new header) could even overlap between
duplicates/slices, without any possibility of modifying the wrong bytes,
because of the nature of a stream, as mentioned above.

Afaik this works the same way for HTTP (mixed text/binary), and
text-based protocols would be even simpler than they already are,
because they could just check whether a string is available (this could
also be cached internally, so as not to check for (CR)LF on
already-checked bytes when the appropriate getString() method is called
another time).

As I already said, I don't see a need to struggle with buffers at all on
the user/protocol-developer side, so wouldn't this be the best way?

regards
Daniel

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 07:56 PM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David M. Lloyd wrote:
>> On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
>>> Maybe, but this is just a Buffer, not a data structure!  BBs are really
>>> meant to be used as fixed and temporary storage, not as something a
>>> user application can use at will.
>> Yes, I think the important change is to break the 1:1 association
>> between a buffer and a message.
>>
>> That's what this part of the thread is really all about I think.
> 
> Very true.  I actually don't care even if we use NIO ByteBuffer as a
> underlying storage and build something on top of it.  The problem is to
> find out how we can protect users from modifying position, limit, mark
> and order - these four properties must be handled in our abstraction
> layer, which breaks the 1:1 association.

Why must we protect the users from modifying these four properties?  If a 
user thinks they've got a better way to recombine buffers, why not let them?

I don't see how the byte array interface would be used in practice.  Do you 
have an example (just a mockup is fine)?

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
David M. Lloyd wrote:
> On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
>> Maybe, but this is just a Buffer, not a data structure!  BBs are really
>> meant to be used as fixed and temporary storage, not as something a
>> user application can use at will.
> 
> Yes, I think the important change is to break the 1:1 association
> between a buffer and a message.
> 
> That's what this part of the thread is really all about I think.

Very true.  I actually don't care even if we use NIO ByteBuffer as a
underlying storage and build something on top of it.  The problem is to
find out how we can protect users from modifying position, limit, mark
and order - these four properties must be handled in our abstraction
layer, which breaks the 1:1 association.

We might be able to enforce such a restriction without introducing new
types.  Let me research this a little and check in a prototype.
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 01:09 PM, Emmanuel Lecharny wrote:
> Maybe, but this is just a Buffer, not a data structure!  BBs are really
> meant to be used as fixed and temporary storage, not as something a
> user application can use at will.

Yes, I think the important change is to break the 1:1 association between a 
buffer and a message.

That's what this part of the thread is really all about I think.

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
David M. Lloyd wrote:
> On 04/28/2008 01:45 PM, Outside - Karl's ACM wrote:
>> Exciting conversations this morning. ;)
>>
>> I also have a personal interest in a multithread safe, nonblocking,
>> bytecode targeting DFA (and Thompson NFA?) compiler. If such a project
>> exists or is forming I would like to become involved.
> 
> Great!
> 
>> With regards to mutating the mark / limit of buffers would anyone favor
>> method invariants that can be checked in a test pipeline and not in the
>> production pipeline? Another pattern I have used when dealing with
>> ByteBuffers is to .slice() and then move the position ahead on the
>> original buffer after the call (slightly slower than invariants).
> 
> DIRMINA-490 was all about adding this functionality (sort of a
> "consuming" slice() if you will).  I've also solved this problem with
> standard NIO ByteBuffers more times than I care to think about. :-)
> 
> Usually I just have an IoUtil class or something with a static method..
> something like this:
> 
>     // untested by the way ;-)
>     public static ByteBuffer sliceBuffer(ByteBuffer buffer, int sliceSize) {
>         if (sliceSize > buffer.remaining()
>                 || sliceSize < -buffer.remaining()) {
>             throw new BufferUnderflowException();
>         }
>         final int oldPos = buffer.position();
>         final int oldLim = buffer.limit();
>         if (sliceSize < 0) {
>             // count from end: slice covers the last -sliceSize bytes
>             buffer.position(oldLim + sliceSize);
>             try {
>                 return buffer.slice();
>             } finally {
>                 buffer.position(oldPos);
>                 buffer.limit(oldLim + sliceSize);
>             }
>         } else {
>             // count from start: slice covers the first sliceSize bytes
>             buffer.limit(oldPos + sliceSize);
>             try {
>                 return buffer.slice();
>             } finally {
>                 buffer.position(oldPos + sliceSize);
>                 buffer.limit(oldLim);
>             }
>         }
>     }

Actually I used a similar technique in some cases, but I felt
uncomfortable with it because it can't stop a user from modifying the
parent buffer's position and limit as long as the parent is exposed to
the user.

Checking the state of the parent buffer in the filter chain can solve
the problem to some extent, as Karl pointed out.  However, I find that it
somewhat degrades the user experience, because it means I can't use all
the access methods in ByteBuffer in the first place, even though they are
sitting right there.  Methods that always fail shouldn't be exposed IMHO.
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 01:45 PM, Outside - Karl's ACM wrote:
> Exciting conversations this morning. ;)
> 
> I also have a personal interest in a multithread safe, nonblocking,
> bytecode targeting DFA (and Thompson NFA?) compiler. If such a project
> exists or is forming I would like to become involved.

Great!

> With regards to mutating the mark / limit of buffers would anyone favor
> method invariants that can be checked in a test pipeline and not in the
> production pipeline? Another pattern I have used when dealing with
> ByteBuffers is to .slice() and then move the position ahead on the
> original buffer after the call (slightly slower than invariants).

DIRMINA-490 was all about adding this functionality (sort of a "consuming" 
slice() if you will).  I've also solved this problem with standard NIO 
ByteBuffers more times than I care to think about. :-)

Usually I just have an IoUtil class or something with a static method.. 
something like this:

     // untested by the way ;-)
     public static ByteBuffer sliceBuffer(ByteBuffer buffer, int sliceSize) {
         if (sliceSize > buffer.remaining() || sliceSize < -buffer.remaining()) {
             throw new BufferUnderflowException();
         }
         final int oldPos = buffer.position();
         final int oldLim = buffer.limit();
         if (sliceSize < 0) {
             // count from end: slice covers the last -sliceSize bytes
             buffer.position(oldLim + sliceSize);
             try {
                 return buffer.slice();
             } finally {
                 buffer.position(oldPos);
                 buffer.limit(oldLim + sliceSize);
             }
         } else {
             // count from start: slice covers the first sliceSize bytes
             buffer.limit(oldPos + sliceSize);
             try {
                 return buffer.slice();
             } finally {
                 buffer.position(oldPos + sliceSize);
                 buffer.limit(oldLim);
             }
         }
     }

- DML

RE: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Outside - Karl's ACM <kp...@acm.org>.
Exciting conversations this morning. ;)

I also have a personal interest in a multithread safe, nonblocking, bytecode targeting DFA (and Thompson NFA?) compiler. If such a project exists or is forming I would like to become involved.

Regarding ByteBuffer, tho I am a little greener than several of you I'll offer my opinion: I favor the Sun ByteBuffer because it will be easier for me to migrate any homegrown schemes. I would go so far as to request that the MINA interfaces conform to the standard Java ones where possible, and if needed conform to JSR draft interfaces (i.e. NIO.2). I don't mean for MINA to be dependent on any JSR, but where possible to conform to the interfaces which are anticipated to be standardized and included in the JVM.

With regards to mutating the mark / limit of buffers would anyone favor method invariants that can be checked in a test pipeline and not in the production pipeline? Another pattern I have used when dealing with ByteBuffers is to .slice() and then move the position ahead on the original buffer after the call (slightly slower than invariants).

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David M. Lloyd wrote:
>   
>> On 04/28/2008 11:39 AM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>>     
>>> I'd prefer to introduce an interface type because ByteBuffer is
>>> impossible to extend.
>>>       
>> That's fair, but I don't see why you'd need to extend ByteBuffer anyway?
>>     
>
> Because we can't create a Composite ByteBuffer by extending ByteBuffer.
>  ByteBuffer also has unnecessary stuff like position, mark, limit,
> flip(), compact() and order() which changes the state and behavior of
> the buffer.  There's no way to make the position and limit read-only
> while making the buffer content writable.
>   
Maybe, but this is just a Buffer, not a data structure!  BBs are really
meant to be used as fixed and temporary storage, not as something a
user application can use at will.
>> I don't mean to allocate a buffer per connection, I mean to allocate
>> buffers as needed.  An idle connection should need no buffers, if the
>> codec is efficiently implemented.
>>     
>
> An efficient buffer class is needed to implement an efficient codec IMO.
>   
We are using ByteBuffers (now renamed to IoBuffer in MINA 2) as if they 
were plain Sun ByteBuffer in ADS decoder, and it's pretty efficient.  
I'm not sure I need something different, and I'm not sure at all it can 
bring some improvement.


Keep in mind that people who write decoders will know about Sun BB,
but not about any other data structure. Don't force them to learn a new
API, even if it's very close to the existing one! A helper class with
facility methods is generally enough.

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
David M. Lloyd wrote:
> On 04/28/2008 12:34 PM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> David M. Lloyd wrote:
>>> On 04/28/2008 11:39 AM, "이희승 (Trustin Lee) <tr...@gmail.com>"
>>> wrote:
>>>> I'd prefer to introduce an interface type because ByteBuffer is
>>>> impossible to extend.
>>> That's fair, but I don't see why you'd need to extend ByteBuffer anyway?
>>
>> Because we can't create a Composite ByteBuffer by extending ByteBuffer.
>>  ByteBuffer also has unnecessary stuff like position, mark, limit,
>> flip(), compact() and order() which changes the state and behavior of
>> the buffer.  There's no way to make the position and limit read-only
>> while making the buffer content writable.
> 
> You don't really need an actual composite ByteBuffer though.  You'd just
> have to make sure that all the send receive methods can handle more than
> one ByteBuffer at once.  You could even make a separate class that can
> do the heavy lifting for you (this is why I keep mentioning
> DataInput/Output, though their default implementations aren't terribly
> efficient).  In a way this isn't *too* different from the two-class
> approach you mentioned, having a separate "array" vs "buffer management"
> class.  You'd just use ByteBuffer directly for the "array" class.

Great point.  We should see the composite byte buffer from the buffer
management perspective.  Actually I was somewhat confused about my own
idea because a composite buffer can be implemented in either layer.
Drawing a strict line seems to be a good idea.
>>>> My long term idea is to write something similar to ANTLR (parser
>>>> generator) which works in a binary level; we can call it decoder
>>>> generator.
>>> This would be better. ;-)
>>
>> Yeah.  The only problem is all nice parser generator implementations
>> depend on InputStream and consequently blocking I/O.  We need to write a
>> new generator which works with non-blocking I/O.
> 
> Yeah, this is my complaint too.  I've started work on more than one
> occasion on a DFA bytecode generator that works in a non-blocking
> fashion.  But this is not a small task and unfortunately it is low on my
> list of priorities. :-)
> 
> Maybe a collaboration is possible with the ANTLR project to achieve this
> goal.

Agreed.  We will definitely have a chance although we don't need to
hurry up too much.

Thanks,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 12:34 PM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David M. Lloyd wrote:
>> On 04/28/2008 11:39 AM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>>> I'd prefer to introduce an interface type because ByteBuffer is
>>> impossible to extend.
>> That's fair, but I don't see why you'd need to extend ByteBuffer anyway?
> 
> Because we can't create a Composite ByteBuffer by extending ByteBuffer.
>  ByteBuffer also has unnecessary stuff like position, mark, limit,
> flip(), compact() and order() which changes the state and behavior of
> the buffer.  There's no way to make the position and limit read-only
> while making the buffer content writable.

You don't really need an actual composite ByteBuffer though.  You'd just 
have to make sure that all the send receive methods can handle more than 
one ByteBuffer at once.  You could even make a separate class that can do 
the heavy lifting for you (this is why I keep mentioning DataInput/Output, 
though their default implementations aren't terribly efficient).  In a way 
this isn't *too* different from the two-class approach you mentioned, 
having a separate "array" vs "buffer management" class.  You'd just use 
ByteBuffer directly for the "array" class.

>>> My long term idea is to write something similar to ANTLR (parser
>>> generator) which works in a binary level; we can call it decoder
>>> generator.
>> This would be better. ;-)
> 
> Yeah.  The only problem is all nice parser generator implementations
> depend on InputStream and consequently blocking I/O.  We need to write a
> new generator which works with non-blocking I/O.

Yeah, this is my complaint too.  I've started work on more than one 
occasion on a DFA bytecode generator that works in a non-blocking fashion. 
  But this is not a small task and unfortunately it is low on my list of 
priorities. :-)

Maybe a collaboration is possible with the ANTLR project to achieve this goal.

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
David M. Lloyd wrote:
> On 04/28/2008 11:39 AM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> I'd prefer to introduce an interface type because ByteBuffer is
>> impossible to extend.
> 
> That's fair, but I don't see why you'd need to extend ByteBuffer anyway?

Because we can't create a Composite ByteBuffer by extending ByteBuffer.
 ByteBuffer also has unnecessary stuff like position, mark, limit,
flip(), compact() and order() which changes the state and behavior of
the buffer.  There's no way to make the position and limit read-only
while making the buffer content writable.

>> I can do some quick and dirty coding to prove the concept.  How does
>> it sound?
> 
> Sounds good.

Great.  Let me come up with some working code tomorrow.

>> My long term idea is to write something similar to ANTLR (parser
>> generator) which works in a binary level; we can call it decoder
>> generator.
> 
> This would be better. ;-)

Yeah.  The only problem is all nice parser generator implementations
depend on InputStream and consequently blocking I/O.  We need to write a
new generator which works with non-blocking I/O.

>>>> - how to write an efficient encoder when you have no idea about the
>>>> size of the data you are going to send ?
>>> Use a buffer factory, such as IoBufferAllocator, or use an even simpler
>>> interface like this:
>>>
>>> public interface BufferFactory {
>>>     ByteBuffer createBuffer();
>>> }
>>>
>>> which mass-produces pre-sized buffers.  In the case of stream-oriented
>>> systems like TCP or serial, you could probably send buffers as you fill
>>> them.  For message-oriented protocols like UDP, you can accumulate all
>>> the buffers to send, and then use a single gathering write to send them
>>> as a single message (yes, this stinks in the current NIO implementation,
>>> as Trustin pointed out in DIRMINA-518, but it's no worse than the
>>> repeated copying that auto-expanding buffers use; and APR and other
>>> possible backends [and, if I have any say at all in it, future OpenJDK
>>> implementations] would hopefully not suffer from this limitation).
>>
>> Allocating a pre-sized buffer per connection can cause OOM when there
>> is a massive number of idle connections.
> 
> I don't mean to allocate a buffer per connection, I mean to allocate
> buffers as needed.  An idle connection should need no buffers, if the
> codec is efficiently implemented.

An efficient buffer class is needed to implement an efficient codec IMO.
 We could probably start from the low-level buffer (or array)
implementation and see what we can do with our existing codecs?

Thanks,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 11:39 AM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> I'd prefer to introduce an interface type because ByteBuffer is
> impossible to extend.

That's fair, but I don't see why you'd need to extend ByteBuffer anyway?

> I can do some quick and dirty coding to prove the concept.  How does it sound?

Sounds good.

> We already have a state machine based codec, but it's far from
> documented.  Blame me. ;)

This doesn't seem like the ideal solution...

> My long term idea is to write something similar to ANTLR (parser
> generator) which works in a binary level; we can call it decoder generator.

This would be better. ;-)

>>> - how to write an efficient encoder when you have no idea about the
>>> size of the data you are going to send ?
>> Use a buffer factory, such as IoBufferAllocator, or use an even simpler
>> interface like this:
>>
>> public interface BufferFactory {
>>     ByteBuffer createBuffer();
>> }
>>
>> which mass-produces pre-sized buffers.  In the case of stream-oriented
>> systems like TCP or serial, you could probably send buffers as you fill
>> them.  For message-oriented protocols like UDP, you can accumulate all
>> the buffers to send, and then use a single gathering write to send them
>> as a single message (yes, this stinks in the current NIO implementation,
>> as Trustin pointed out in DIRMINA-518, but it's no worse than the
>> repeated copying that auto-expanding buffers use; and APR and other
>> possible backends [and, if I have any say at all in it, future OpenJDK
>> implementations] would hopefully not suffer from this limitation).
> 
> Allocating a pre-sized buffer per connection can cause OOM when there
> is a massive number of idle connections.

I don't mean to allocate a buffer per connection, I mean to allocate 
buffers as needed.  An idle connection should need no buffers, if the codec 
is efficiently implemented.

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
It's nice to see some cool discussion. :)

David M. Lloyd wrote:
> On 04/28/2008 08:57 AM, Emmanuel Lecharny wrote:
>> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>>> I thought about the current MINA API these days while I am idle, and got
>>> some idea of improvements:
>>>
>>> 1) Split IoBuffer into two parts - array and buffer
>>>
>>> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
>>> means it inherited some bad asset from ByteBuffer - all the stateful
>>> properties such as position, limit and mark.  I think they should be
>>> provided as a separate class, and there should be classes dedicated for
>>> storing bytes only (something like ByteArray?)
>>>   
>> I was also thinking about this class. ByteBuffers are allocated once,
>> with a specific size, and the JVM will optimize the way they are
>> handled. When dealing with IO, I don't see a lot of reasons to have
>> expandable BB.
> 
> Indeed.  This was my motivation for opening DIRMINA-489.
> 
>> I think that using a byte[] (for instance in the encoder) and
>> transforming it to a ByteBuffer is another way to deal with the problem.
>>
>> One important point is that ByteBuffers are just meant to contain a
>> fixed amount of data. It's a buffer, not a data structure.
>> Transforming ByteBuffers to make them able to expand twists their
>> intrinsic semantics.
> 
> Yes, it makes far more sense to accumulate buffers until you can decode
> your message from it.
> 
>> So I would say that BB should be used on the very low level (reading
>> data and sending data), but then, the other layers should use byte[]
>> or a stream of bytes.
> 
> I don't see the advantage of using byte[] honestly - using at the least
> a wrapper object seems preferable.  And if you're going to use a wrapper
> object, why not just use ByteBuffer.

I'd prefer to introduce an interface type because ByteBuffer is
impossible to extend.  I can do some quick and dirty coding to prove the
concept.  How does it sound?

>> This will lead to very interesting performance questions:
>> - how to handle large stream of data ?
> 
> One buffer at a time. :-)
> 
>> - should we serialize the stream at some point ?
> 
> What do you mean by "serialize"?
> 
>> - how to write an efficient decoder, when you may receive fractions of
>> what you are waiting for ?
> 
> An ideal decoder would be a state machine which can be entered and
> exited at any state.  This way, even a partial buffer can be fully
> consumed before returning to wait for the next buffer.

We already have a state machine based codec, but it's far from
documented.  Blame me. ;)

My long term idea is to write something similar to ANTLR (a parser
generator) which works at the binary level; we can call it a decoder generator.

>> - how to write an efficient encoder when you have no idea about the
>> size of the data you are going to send ?
> 
> Use a buffer factory, such as IoBufferAllocator, or use an even simpler
> interface like this:
> 
> public interface BufferFactory {
>     ByteBuffer createBuffer();
> }
> 
> which mass-produces pre-sized buffers.  In the case of stream-oriented
> systems like TCP or serial, you could probably send buffers as you fill
> them.  For message-oriented protocols like UDP, you can accumulate all
> the buffers to send, and then use a single gathering write to send them
> as a single message (yes, this stinks in the current NIO implementation,
> as Trustin pointed out in DIRMINA-518, but it's no worse than the
> repeated copying that auto-expanding buffers use; and APR and other
> possible backends [and, if I have any say at all in it, future OpenJDK
> implementations] would hopefully not suffer from this limitation).

Allocating a pre-sized buffer per connection can cause OOM when there
is a massive number of idle connections.  We could use a composite buffer
in this case too because we can adjust the size of the buffer more
flexibly without causing reallocation.

Gathering write is also a piece of cake once a composite buffer is realized.
Of course it should play nicely with your suggested allocation mechanism.

>> For all these reasons, the mail I sent a few days ago expresses my
>> personal opinion that IoBuffer may be a little bit of overkill
>> (remember that this class - and the associated tests - represents
>> around 13% of all MINA common code!)
> 
> Yes, that's very heavy.  I looked at resolving DIRMINA-489 more than
> once, and was overwhelmed by the sheer number of methods that had to be
> implemented, and the overly complex class structure.
> 
> One option could be to use ByteBuffer with some static support methods,
> and streams to act as the "user interface" into collections of buffers. 
> For example, an InputStream that reads from a collection of buffers, and
> an OutputStream that is configurable to auto-allocate buffers,
> performing an action every time a buffer is filled:
> 
> public interface BufferSink {
>     void handleBuffer(ByteBuffer buffer);
> }
> 
> Another option is to skip ByteBuffers and go with raw byte[] objects
> (though this closes the door completely to direct buffers).
> 
> Yet another option is to have a simplified abstraction for byte arrays
> like Trustin proposes, and use the stream classes for the buffer state
> implementation.
> 
> This is all in addition to Trustin's idea of providing a byte array
> abstraction and a buffer state abstraction class.

Thanks for mentioning all the possibilities.  I'm leaning toward my
original idea of providing byte array abstraction and buffer state
abstraction types because it can cover all the cases you've mentioned.
Also, we can provide the sheer number of getters and putters for users
who like them anyway - it should end up much simpler than one monolithic
class - way simpler.
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 08:57 AM, Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> I thought about the current MINA API these days while I am idle, and got
>> some idea of improvements:
>>
>> 1) Split IoBuffer into two parts - array and buffer
>>
>> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
>> means it inherited some bad asset from ByteBuffer - all the stateful
>> properties such as position, limit and mark.  I think they should be
>> provided as a separate class, and there should be classes dedicated for
>> storing bytes only (something like ByteArray?)
>>   
> I was also thinking about this class. ByteBuffers are allocated once, 
> with a specific size, and the JVM will optimize the way they are 
> handled. When dealing with IO, I don't see a lot of reasons to have 
> expandable BB.

Indeed.  This was my motivation for opening DIRMINA-489.

> I think that using a byte[] (for instance in the encoder) and transforming
> it to a ByteBuffer is another way to deal with the problem.
> 
> One important point is that ByteBuffers are just meant to contain a
> fixed amount of data. It's a buffer, not a data structure. Transforming
> ByteBuffers to make them able to expand twists their intrinsic semantics.

Yes, it makes far more sense to accumulate buffers until you can decode 
your message from it.

> So I would say that BB should be used on the very low level (reading 
> data and sending data), but then, the other layers should use byte[] or 
> a stream of bytes.

I don't see the advantage of using byte[] honestly - using at the least a 
wrapper object seems preferable.  And if you're going to use a wrapper 
object, why not just use ByteBuffer.

> This will lead to very interesting performance questions:
> - how to handle large stream of data ?

One buffer at a time. :-)

> - should we serialize the stream at some point ?

What do you mean by "serialize"?

> - how to write an efficient decoder, when you may receive fractions of 
> what you are waiting for ?

An ideal decoder would be a state machine which can be entered and exited 
at any state.  This way, even a partial buffer can be fully consumed before 
returning to wait for the next buffer.

However many decoders are not ideal due to various constraints.  In the 
worst case, you could accumulate ByteBuffer instances until you have a 
complete message that can be handled.  What I do at this point is to create 
a DataInputStream that encapsulates all the received buffers.

Note that a buffer might contain data from more than one message as well. 
So it's important to use only a slice of the buffer in this case.
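For instance, a length-prefixed decoder that keeps its state between calls
could look roughly like this (a sketch only; the class and method names are
made up, not MINA API):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a length-prefixed decoder that can be re-entered
// with partial input at any point, keeping its state across calls.
public class LengthPrefixedDecoder {
    private enum State { READ_LENGTH, READ_BODY }

    private State state = State.READ_LENGTH;
    private final ByteBuffer lengthBuf = ByteBuffer.allocate(4);
    private ByteBuffer body;

    // Consumes whatever is in the input; returns complete messages, if any.
    public List<byte[]> decode(ByteBuffer in) {
        List<byte[]> messages = new ArrayList<>();
        while (in.hasRemaining()) {
            if (state == State.READ_LENGTH) {
                while (in.hasRemaining() && lengthBuf.hasRemaining()) {
                    lengthBuf.put(in.get());
                }
                if (lengthBuf.hasRemaining()) {
                    break; // partial length prefix; wait for the next buffer
                }
                lengthBuf.flip();
                body = ByteBuffer.allocate(lengthBuf.getInt());
                lengthBuf.clear();
                state = State.READ_BODY;
            } else {
                // use only a slice of the input: it may hold the next message too
                int n = Math.min(in.remaining(), body.remaining());
                ByteBuffer slice = in.slice();
                slice.limit(n);
                body.put(slice);
                in.position(in.position() + n);
                if (!body.hasRemaining()) {
                    messages.add(body.array());
                    state = State.READ_LENGTH;
                }
            }
        }
        return messages;
    }
}
```

Each call consumes as much as it can and leaves the machine parked in
whatever state the input ran out in, so a partial buffer is always fully
consumed before returning.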

> - how to write an efficient encoder when you have no idea about the size 
> of the data you are going to send ?

Use a buffer factory, such as IoBufferAllocator, or use an even simpler 
interface like this:

public interface BufferFactory {
     ByteBuffer createBuffer();
}

which mass-produces pre-sized buffers.  In the case of stream-oriented 
systems like TCP or serial, you could probably send buffers as you fill 
them.  For message-oriented protocols like UDP, you can accumulate all the 
buffers to send, and then use a single gathering write to send them as a 
single message (yes, this stinks in the current NIO implementation, as 
Trustin pointed out in DIRMINA-518, but it's no worse than the repeated 
copying that auto-expanding buffers use; and APR and other possible 
backends [and, if I have any say at all in it, future OpenJDK 
implementations] would hopefully not suffer from this limitation).
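A hedged sketch of that accumulate-then-gather pattern with plain NIO (the
class name is hypothetical, and the channel is assumed to be set up
elsewhere):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.util.List;

public final class GatheringWriteExample {
    // Sends all accumulated buffers with a single gathering write call
    // per pass, instead of copying them into one expanded buffer.
    public static long writeAll(GatheringByteChannel channel,
                                List<ByteBuffer> accumulated) throws IOException {
        ByteBuffer[] array = accumulated.toArray(new ByteBuffer[0]);
        long written = 0;
        // a non-blocking channel may need several passes to drain everything
        while (hasRemaining(array)) {
            written += channel.write(array);
        }
        return written;
    }

    private static boolean hasRemaining(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return true;
            }
        }
        return false;
    }
}
```

On a message-oriented transport the whole array goes out as one datagram;
on a stream transport it simply avoids the intermediate copy.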

> For all these reasons, the mail I sent a few days ago expresses my
> personal opinion that IoBuffer may be a little bit of overkill
> (remember that this class - and the associated tests - represents
> around 13% of all MINA common code!)

Yes, that's very heavy.  I looked at resolving DIRMINA-489 more than once, 
and was overwhelmed by the sheer number of methods that had to be 
implemented, and the overly complex class structure.

One option could be to use ByteBuffer with some static support methods, and 
streams to act as the "user interface" into collections of buffers.  For 
example, an InputStream that reads from a collection of buffers, and an 
OutputStream that is configurable to auto-allocate buffers, performing an 
action every time a buffer is filled:

public interface BufferSink {
     void handleBuffer(ByteBuffer buffer);
}
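Such an auto-allocating OutputStream around the BufferSink could be
sketched like this (a sketch only; the fixed buffer size and the
flush-on-full policy are my assumptions, and the class name is made up):

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical sketch: auto-allocates fixed-size buffers and hands each
// filled one to a BufferSink, performing an action per filled buffer.
public class BufferSinkOutputStream extends OutputStream {
    public interface BufferSink {
        void handleBuffer(ByteBuffer buffer);
    }

    private final BufferSink sink;
    private final int bufferSize;
    private ByteBuffer current;

    public BufferSinkOutputStream(BufferSink sink, int bufferSize) {
        this.sink = sink;
        this.bufferSize = bufferSize;
    }

    @Override
    public void write(int b) {
        if (current == null) {
            current = ByteBuffer.allocate(bufferSize); // allocate on demand only
        }
        current.put((byte) b);
        if (!current.hasRemaining()) {
            flush(); // buffer filled: hand it off and start fresh
        }
    }

    @Override
    public void flush() {
        if (current != null) {
            current.flip();
            sink.handleBuffer(current);
            current = null;
        }
    }
}
```

An idle connection holds no buffer at all; one is only allocated once the
encoder actually writes something.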

Another option is to skip ByteBuffers and go with raw byte[] objects 
(though this closes the door completely to direct buffers).

Yet another option is to have a simplified abstraction for byte arrays like 
Trustin proposes, and use the stream classes for the buffer state 
implementation.

This is all in addition to Trustin's idea of providing a byte array 
abstraction and a buffer state abstraction class.

- DML


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote
>>> The differences between IoHandler and IoFilter are:
>>>
>>> * IoHandler doesn't have life cycle callbacks (onPreAdd,
>>> onPostRemove...) while IoFilter has.
>>> * IoHandler doesn't have handler methods for requests such as write,
>>> setTrafficMask and close while IoFilter has.
>>>
>>> This is why I'm proposing to split IoFilter into multiple interfaces:
>>>
>>> * Interface A which provides handler methods for incoming events.
>>> * Interface B which extends the interface A and provides handler methods
>>> for outgoing requests
>>> * Interface C which extends the interface B and provides lifecycle
>>> management methods
>>>
>>> This inheritance relationship might not be precise.  For example, both
>>> interface B and C could extend the interface A directly.
>>>     
>>
>> UpstreamFilter and DownstreamFilter sounds somewhat strange. :)
>>
>> What about InputHandler, OutputHandler, IoHandler (extends InputHandler
>> and OutputHandler) and LifecycleAwareHandler?
>>
>>   
> Just a quick Q : do we really need those three interfaces ?
> 
> For instance, A and B can be merged, as I don't know of many protocols
> that deal with incoming requests but never send back a reply (of course,
> there are some, but it's not the general case). If you don't have any
> outgoing requests, then an abstract class can implement a null operation
> for outgoing requests.

Current IoHandler implementations probably don't need to handle outgoing
requests?  They could just leave the unused methods empty.  :)

Once IoFilter replaces IoHandler, even the last handler will be invoked
with a NextFilter (or NextHandler, if renamed), which is somewhat
suboptimal.  We can take care of this by providing an abstract class, of
course.  How does that sound?

> 
> The very same for C, I see no reason why we should not merge it with A
> and B.
> 
> In any case, if you have an inheritance scheme like A <-- B <-- C (<-- =
> extends), then I think we can merge those three interfaces, except if
> implementing a specific interface will bring some specific function.
> 
> A class which needs to implement A and C, for instance, would be
> better described as class MyClass implements A,C instead of having an
> inheritance between C, B and A.
> 
> What I mean is that interfaces might not need to inherit from each
> other, if they are just extending the API.
> 
> Here, if we don't merge the interface, something like :
> 
> class UpstreamFilter implements A
> 
> class DownStreamFilter implements B
> 
> class ManagedUpstreamFilter implements A,C
> 
> class ManagedDownStreamFilter implements B,C
> 
> class ManagedFullFilter implements A,B,C
> 
> covers all the cases.
> 
> About the names, keep Filter. Handler has a different semantic to me...
> (may be it's just me)

Yep.  I agree with you that having all these types doesn't look that clean.
Maybe we could just replace IoHandler with IoFilter for now and see
what users think.

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote
>> The differences between IoHandler and IoFilter are:
>>
>> * IoHandler doesn't have life cycle callbacks (onPreAdd,
>> onPostRemove...) while IoFilter has.
>> * IoHandler doesn't have handler methods for requests such as write,
>> setTrafficMask and close while IoFilter has.
>>
>> This is why I'm proposing to split IoFilter into multiple interfaces:
>>
>> * Interface A which provides handler methods for incoming events.
>> * Interface B which extends the interface A and provides handler methods
>> for outgoing requests
>> * Interface C which extends the interface B and provides lifecycle
>> management methods
>>
>> This inheritance relationship might not be precise.  For example, both
>> interface B and C could extend the interface A directly.
>>     
>
> UpstreamFilter and DownstreamFilter sounds somewhat strange. :)
>
> What about InputHandler, OutputHandler, IoHandler (extends InputHandler
> and OutputHandler) and LifecycleAwareHandler?
>
>   
Just a quick Q : do we really need those three interfaces ?

For instance, A and B can be merged, as I don't know of many protocols 
that deal with incoming requests but never send back a reply (of course, 
there are some, but it's not the general case). If you don't have any 
outgoing requests, then an abstract class can implement a null operation 
for outgoing requests.

The very same for C, I see no reason why we should not merge it with A 
and B.

In any case, if you have an inheritance scheme like A <-- B <-- C (<-- = 
extends), then I think we can merge those three interfaces, unless 
implementing a specific interface brings some specific function.

A class which needs to implement A and C, for instance, would be 
better described as class MyClass implements A,C instead of having an 
inheritance between C, B and A.

What I mean is that interfaces might not need to inherit from each 
other, if they are just extending the API.

Here, if we don't merge the interface, something like :

class UpstreamFilter implements A

class DownStreamFilter implements B

class ManagedUpstreamFilter implements A,C

class ManagedDownStreamFilter implements B,C

class ManagedFullFilter implements A,B,C

covers all the cases.

About the names, keep Filter. Handler has a different semantic to me... 
(maybe it's just me)


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by "David M. Lloyd" <da...@redhat.com>.
On 04/28/2008 12:39 PM, "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> The differences between IoHandler and IoFilter are:
>>
>> * IoHandler doesn't have life cycle callbacks (onPreAdd,
>> onPostRemove...) while IoFilter has.
>> * IoHandler doesn't have handler methods for requests such as write,
>> setTrafficMask and close while IoFilter has.
>>
>> This is why I'm proposing to split IoFilter into multiple interfaces:
> [...]
> UpstreamFilter and DownstreamFilter sounds somewhat strange. :)
> 
> What about InputHandler, OutputHandler, IoHandler (extends InputHandler
> and OutputHandler) and LifecycleAwareHandler?

Also you've got Session management to consider, which would presumably be a 
parent interface to both InputHandler and OutputHandler.

Also I'm not so sure about LifecycleAwareHandler, but I don't have a better 
idea. ;-)

- DML

Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> David, you and I discussed about IoBuffer stuff, so let me comment about
> other proposed changes...
> 
> Emmanuel Lecharny wrote:
>> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>>> 2) Get rid of IoHandler and let IoFilter replace it.
>>>
>>> IoHandler is just a special IoFilter at the end of the filter chain.  I
>>> don't see any reason to keep it special considering that we are often
>>> building multi-layered protocols.
>>>   
>> I didn't go deep into this class, but if it's just the last part of the
>> chain, then it's just a filter like any other, a catch-all filter. So I
>> agree with the idea of defining it as an IoFilter too.
> 
> The differences between IoHandler and IoFilter are:
> 
> * IoHandler doesn't have life cycle callbacks (onPreAdd,
> onPostRemove...) while IoFilter has.
> * IoHandler doesn't have handler methods for requests such as write,
> setTrafficMask and close while IoFilter has.
> 
> This is why I'm proposing to split IoFilter into multiple interfaces:
> 
> * Interface A which provides handler methods for incoming events.
> * Interface B which extends the interface A and provides handler methods
> for outgoing requests
> * Interface C which extends the interface B and provides lifecycle
> management methods
> 
> This inheritance relationship might not be precise.  For example, both
> interface B and C could extend the interface A directly.

UpstreamFilter and DownstreamFilter sounds somewhat strange. :)

What about InputHandler, OutputHandler, IoHandler (extends InputHandler
and OutputHandler) and LifecycleAwareHandler?
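Sketched out, that naming might look like the following. Every name and method signature here is tentative (none of it is existing MINA API), and Object stands in for the real session and message types:

```java
// Tentative sketch of the proposed split.  Method names like
// messageReceived/filterWrite are illustrative placeholders.
interface InputHandler {
    void messageReceived(Object session, Object message);   // incoming events
}

interface OutputHandler {
    void filterWrite(Object session, Object message);       // outgoing requests
}

// The full handler is just the combination of the two...
interface IoHandler extends InputHandler, OutputHandler {
}

// ...while lifecycle callbacks stay opt-in, in their own interface.
interface LifecycleAwareHandler extends IoHandler {
    void onPreAdd();
    void onPostRemove();
}
```

A simple endpoint implements only InputHandler; a filter in the middle of a chain implements IoHandler or LifecycleAwareHandler.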

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
David, you and I discussed about IoBuffer stuff, so let me comment about
other proposed changes...

Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> 2) Get rid of IoHandler and let IoFilter replace it.
>>
>> IoHandler is just a special IoFilter at the end of the filter chain.  I
>> don't see any reason to keep it special considering that we are often
>> building multi-layered protocols.
>>   
> I didn't go deep into this class, but if it's just the last part of the
> chain, then it's just a filter like any other, a catch-all filter. So I
> agree with the idea of defining it as an IoFilter too.

The differences between IoHandler and IoFilter are:

* IoHandler doesn't have life cycle callbacks (onPreAdd,
onPostRemove...) while IoFilter has.
* IoHandler doesn't have handler methods for requests such as write,
setTrafficMask and close while IoFilter has.

This is why I'm proposing to split IoFilter into multiple interfaces:

* Interface A which provides handler methods for incoming events.
* Interface B which extends the interface A and provides handler methods
for outgoing requests
* Interface C which extends the interface B and provides lifecycle
management methods

This inheritance relationship might not be precise.  For example, both
interface B and C could extend the interface A directly.

>> 3) Split IoFilter into multiple interfaces.
>>
>> If IoHandler is removed, IoFilter should be renamed to represent itself
>> better.  Moreover, IoFilter should be split into more than one interface
>>  (e.g. UpstreamHandler for receiving events and DownstreamHandler for
>> sending events from/to an IoProcessor) so they can choose what to
>> override more conveniently.
>>   
> I'm a little bit reluctant... Filters are two-way, so I'm not sure it's
> a good idea to define two separate kinds of filters (upstream and
> downstream). Let me explain:
> 
> You have something similar to what Eiffel (the language) has : pre and
> post processing. Calling the next filter is just an action. The flow is
> much more like :
> 
> filter N-1 -> [Filter N] pre processing... call Filter N+1... post
> processing -> return to filter N-1
> 
> If you separate up stream and downstream filters, this will become
> slightly more complicated to write, because post-processing won't be
> executed, but moved to the up-stream Filter. This will split the logic
> of a filter into two classes.

That's true, and that's why I am proposing some inheritance relationship.
Of course, it's food for thought to determine which is the simpler
design (i.e. with IoHandler vs. without IoHandler).  I personally think no
IoHandler is better, looking at the org.apache.mina.handler.chain package.
It's simply a copycat of the filter chain.  Yeah, I wrote that crap!  :)

> I have some other suggestion which is not related : The CoR we are using
> is really a PITA to handle when debugging. If you don't know which
> filter will be called next, either you step through a container object
> before jumping into the next filter, or you set a breakpoint in every
> filter (but then, if you have many filters this is a burden).
> 
> I would rather see something much simpler, where the call for
> the next filter points directly to the next filter. Filter instantiation
> will differ, but not that much.
> 
> Atm, the logic is :
> 
> filter N-1
>  call next filter
>    compute next filter
>      call next filter
>        Filter N
>          etc...
> 
> It would be much simpler to have something like:
> Filter N-1
>  next filter.call()
>    Filter N
> 
> Or at least :
> Filter N-1
>  next Filter = compute next filter
>  next filter.call()
>    Filter N

I also want to see the stack depth kept as shallow as possible.  This
might be easier than we expect.  One bad guy is the anonymous accessor
methods; removing them will decrease the stack depth quite a lot
without much effort.  Of course, more research will cut out more.

>> 4) UDP as the first citizen transport.
>>   
> I know nothing about UDP ...

UDP is basically a connectionless transport.  MINA's IoSession assumes
all sessions have a connection, causing an impedance mismatch for UDP
users.  That's why we have IoSessionRecycler to emulate a connection.
It's like using a cookie to store the session ID in HTTP.

However, what most UDP clients and servers need is different.  They just
don't need such an emulation.  IoHandler should provide enough
information for UDP transport users; it should be provided with the
source and destination addresses of each received message, which TCP
doesn't need.

The same issue lies in IoSession.  IoSession.getRemoteAddress() changes
whenever a new packet is received, and therefore should be considered
thread-unsafe.  IoSession.write(Object) has the same issue.
IoSession.write(Object, SocketAddress) is actually the only valid write
method in UDP.  For now, emulation takes care of this issue.
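Plain NIO already models this well: with DatagramChannel, the peer address travels with every packet instead of living in session state, which is essentially what write(Object, SocketAddress) mirrors. A self-contained loopback round trip, purely for illustration:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

// Illustration only: in plain NIO the source address arrives with every
// datagram (receive) and the destination is explicit on every send.
public class ConnectionlessSketch {
    public static SocketAddress roundTrip() {
        try (DatagramChannel receiver =
                 DatagramChannel.open().bind(new InetSocketAddress("127.0.0.1", 0));
             DatagramChannel sender = DatagramChannel.open()) {
            sender.send(ByteBuffer.wrap(new byte[] { 1, 2 }),
                        receiver.getLocalAddress());
            ByteBuffer buf = ByteBuffer.allocate(512);
            return receiver.receive(buf);  // blocks until the packet arrives
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Nothing in this exchange needed a session object; the address pair on each packet was enough.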

Then should we introduce separate IoSession and IoHandler interfaces
which are not compatible with the stream-based IoSession and IoHandler
interfaces?  It's also food for thought; just retaining the current
emulation workaround might be more consistent than introducing
incompatible interfaces.

Or... the UDP users might not care about this issue as long as our UDP
transport performs well.  :D

>> Any feed back is welcome.
>>   
> Hope I'm not totally off the rails with my feedback!

It was pretty worthwhile.  As you know, discussing and realizing an
idea leads us to another level of thought.

Cheers,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> I thought about the current MINA API these days while I am idle, and got
> some idea of improvements:
>
> 1) Split IoBuffer into two parts - array and buffer
>
> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
> means it inherited some bad asset from ByteBuffer - all the stateful
> properties such as position, limit and mark.  I think they should be
> provided as a separate class, and there should be classes dedicated for
> storing bytes only (something like ByteArray?)
>   
I was also thinking about this class. ByteBuffers are allocated once, 
with a specific size, and the JVM will optimize the way they are 
handled. When dealing with I/O, I don't see a lot of reasons to have 
expandable BBs.

I think that using a byte[] (for instance in the encoder) and 
transforming it to a ByteBuffer is another way to deal with the problem.

One important point is that ByteBuffers are just meant to contain a 
fixed amount of data. A ByteBuffer is a buffer, not a data structure. 
Transforming ByteBuffers to make them able to expand twists their 
intrinsic semantics.

So I would say that BB should be used on the very low level (reading 
data and sending data), but then, the other layers should use byte[] or 
a stream of bytes.

This will lead to very interesting performance questions:
- how to handle large streams of data?
- should we serialize the stream at some point?
- how to write an efficient decoder, when you may receive fractions of 
what you are waiting for?
- how to write an efficient encoder when you have no idea about the size 
of the data you are going to send?

For all these reasons, the mail I sent a few days ago expressed my 
personal opinion that IoBuffer may be a little bit of overkill 
(remember that this class -and the associated tests- represent around 
13% of all the MINA common code!)
> BTW, why is mixing them a bad idea?  It's because it makes the
> implementation too complicated.  For example, it is almost impossible to
> implement a composite buffer to support close-to-zero-copy I/O.  What
> about all the weird rules related with auto-expansion and buffer
> derivation?  It's increasing the learning curve.
>   
Very true.
> 2) Get rid of IoHandler and let IoFilter replace it.
>
> IoHandler is just a special IoFilter at the end of the filter chain.  I
> don't see any reason to keep it special considering that we are often
> building multi-layered protocols.
>   
I didn't go deep into this class, but if it's just the last part of the 
chain, then it's just a filter like any other, a catch-all filter. So I 
agree with the idea of defining it as an IoFilter too.
> 3) Split IoFilter into multiple interfaces.
>
> If IoHandler is removed, IoFilter should be renamed to represent itself
> better.  Moreover, IoFilter should be split into more than one interface
>  (e.g. UpstreamHandler for receiving events and DownstreamHandler for
> sending events from/to an IoProcessor) so they can choose what to
> override more conveniently.
>   
I'm a little bit reluctant... Filters are two-way, so I'm not sure it's 
a good idea to define two separate kinds of filters (upstream and 
downstream). Let me explain:

You have something similar to what Eiffel (the language) has : pre and 
post processing. Calling the next filter is just an action. The flow is 
much more like :

filter N-1 -> [Filter N] pre processing... call Filter N+1... post 
processing -> return to filter N-1

If you separate upstream and downstream filters, this will become 
slightly more complicated to write, because the post-processing won't be 
executed in place, but moved to the upstream filter. This will split the 
logic of a filter into two classes.

I have some other suggestion which is not related : The CoR we are using 
is really a PITA to handle when debugging. If you don't know which 
filter will be called next, either you step through a container object 
before jumping into the next filter, or you set a breakpoint in every 
filter (but then, if you have many filters this is a burden).

I would rather see something much simpler, where the call for 
the next filter points directly to the next filter. Filter instantiation 
will differ, but not that much.

Atm, the logic is :

filter N-1
  call next filter
    compute next filter
      call next filter
        Filter N
          etc...

It would be much simpler to have something like:
Filter N-1
  next filter.call()
    Filter N

Or at least :
Filter N-1
  next Filter = compute next filter
  next filter.call()
    Filter N
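A toy version of that direct-pointer chain, with the Eiffel-style pre/post processing kept in one class. Filter and the class names are illustrative, not MINA's actual IoFilter API:

```java
// Sketch only: each filter holds a direct reference to its successor,
// so "next filter.call()" is a plain method call with no chain lookup,
// and a debugger steps straight from filter N-1 into filter N.
interface Filter {
    void messageReceived(Object message);
}

class LoggingFilter implements Filter {
    static final StringBuilder LOG = new StringBuilder();

    private final Filter next;

    LoggingFilter(Filter next) {
        this.next = next;
    }

    public void messageReceived(Object message) {
        LOG.append("pre;");               // pre-processing
        next.messageReceived(message);    // direct call to the next filter
        LOG.append("post;");              // post-processing, after the return
    }
}

class TailFilter implements Filter {
    public void messageReceived(Object message) {
        LoggingFilter.LOG.append("tail;");
    }
}
```

Instantiation differs, as Emmanuel says: new LoggingFilter(new LoggingFilter(new TailFilter())) builds the whole chain up front.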



> 4) UDP as the first citizen transport.
>   
I know nothing about UDP ...

> Any feed back is welcome.
>   
Hope I'm not totally off the rails with my feedback!


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
Julien Vermillard wrote:
> Hi,
>
> Look like I'm late, you should try to slow down on the reply button
> guys ! It's hard to follow :)
>
>
>  On Mon, 28 Apr 2008 22:21:57 +0900 (KST)
> "이희승 (Trustin Lee) <tr...@gmail.com> wrote:
>
>   
>> I thought about the current MINA API these days while I am idle, and
>> got some idea of improvements:
>>
>> 1) Split IoBuffer into two parts - array and buffer
>>
>> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
>> means it inherited some bad asset from ByteBuffer - all the stateful
>> properties such as position, limit and mark.  I think they should be
>> provided as a separate class, and there should be classes dedicated
>> for storing bytes only (something like ByteArray?)
>>
>> BTW, why is mixing them a bad idea?  It's because it makes the
>> implementation too complicated.  For example, it is almost impossible
>> to implement a composite buffer to support close-to-zero-copy I/O.
>> What about all the weird rules related with auto-expansion and buffer
>> derivation?  It's increasing the learning curve.
>>     
>
> I never liked BB, I find it painful to use, and the plain Sun API
> don't even give method for dealing with unsigned types.
>   
Just because Java base types are signed ;) And, yeah, BB is a lousy API...
> The byte array is probably sucking even more, because you can't cast it
> like in C to the type you want and fetch the data. the ones who ever
> wrote a binary decoder in C understand what I mean :
> gather enough bytes in a buffer, cast the buffer pointer to the good
>  struct/type and extract the data, increment the pointer to the next
>  step.
>   
Ahhh... the good old times, when I was able to handle registers in my 
preferred assembly code :) Maybe we are getting too old, Julien ;)
> Now compare that with freaking BB decoding, sweep, mark,slice, get,
> apply & 0xFF for handling sign.. and if you want to extract a bunch of
> structured data in 1 pass, it's impossible.
>   
It still is, but you have to wake up early !
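For reference, the `& 0xFF` masking Julien mentions is mechanical enough to wrap once. Java's integral types are all signed, so unsigned wire values have to be widened and masked by hand; a small helper sketch:

```java
import java.nio.ByteBuffer;

// Sketch: the standard sign-masking idioms, wrapped as helpers so the
// decoder code doesn't repeat them everywhere.
public class UnsignedReads {
    public static int getUnsignedByte(ByteBuffer buf) {
        return buf.get() & 0xFF;          // widen to int, drop the sign
    }

    public static int getUnsignedShort(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    public static long getUnsignedInt(ByteBuffer buf) {
        return buf.getInt() & 0xFFFFFFFFL;  // needs a long to hold 2^32-1
    }
}
```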
> I think the problem is the way you implement decoder in Java using the
> ByteBuffer abstraction, the way we extract structured data .
>
> I have no clear solution, but if we work out a good practise for writing
> binary decoders (text are excluded, it's too easy ;) ), we will have
> clearer view of the base buffer we need to have for producing
> great/fast/simple decoders.
>   
The secret of a good decoder lies much more in knowledge about stateful 
state machines than in the byte storage. As soon as you can grab a 
single byte, you are all set. (I mean, it's then up to you to deal with 
the decoding of your incoming flow...)
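To make that concrete, here is a minimal byte-at-a-time decoder driven by a state machine. The frame format (a one-byte length followed by the payload) is purely illustrative:

```java
// Sketch: the state machine, not the buffer type, does all the work.
public class FrameDecoder {
    private enum State { READ_LENGTH, READ_PAYLOAD }

    private State state = State.READ_LENGTH;
    private byte[] payload;
    private int filled;

    // Feed a single byte; returns a complete frame, or null if more
    // bytes are needed.
    public byte[] decode(byte b) {
        switch (state) {
            case READ_LENGTH:
                payload = new byte[b & 0xFF];  // unsigned length prefix
                filled = 0;
                state = State.READ_PAYLOAD;
                return payload.length == 0 ? finish() : null;
            case READ_PAYLOAD:
                payload[filled++] = b;
                return filled == payload.length ? finish() : null;
            default:
                return null;
        }
    }

    private byte[] finish() {
        state = State.READ_LENGTH;         // ready for the next frame
        return payload;
    }
}
```

Because it consumes one byte at a time, it survives any fragmentation of the incoming flow.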
>   
>> 3) Split IoFilter into multiple interfaces.
>>
>> If IoHandler is removed, IoFilter should be renamed to represent
>> itself better.  Moreover, IoFilter should be split into more than one
>> interface (e.g. UpstreamHandler for receiving events and
>> DownstreamHandler for sending events from/to an IoProcessor) so they
>> can choose what to override more conveniently.
>>     
>
> Do we really need it ? Isn't going to clutter the API ?
>   
I agree with Julien here, and have expressed the concern in another mail.
> Any feed back is welcome.
>   
>
> If we focus on providing a great codec framework, we will know what we
> need as underlying engine.
>
> Actually we have to many helpers and way to make a decoder with MINA
> and it's not helping much to decide if we need to use expendable
> buffers or compositing byte arrays. I know some wants the two, but do
> we really need it ?
>   
One concern I have about encoding is that usually I have to compute the 
size of the encoded BB before being able to feed it with some encoded 
data, just because the BB has a fixed size, and also because my protocol 
mandates that I store the PDU size as the second piece of information in 
my PDU (Type, Length, Value...)

Otherwise, it's not really a problem for many protocols, as they can 
fill fixed BBs and send them to the underlying writer without bothering 
about the gathering (which can be handled with a GatheringByteChannel).
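One common way around the "length before value" problem is to reserve the length slot and patch it with an absolute putInt() once the value is encoded. In the sketch below the size happens to be known up front, so the TLV layout (one-byte type, four-byte length) and the class are purely illustrative, but the patching pattern generalizes to encoders that only discover the size while encoding:

```java
import java.nio.ByteBuffer;

// Sketch: reserve the 4-byte length slot, encode the value, then patch
// the length in with an absolute put.
public class TlvEncoder {
    public static ByteBuffer encode(byte type, byte[] value) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + value.length);
        buf.put(type);
        int lengthSlot = buf.position();   // remember where the length goes
        buf.putInt(0);                     // placeholder for now
        buf.put(value);
        buf.putInt(lengthSlot, buf.position() - lengthSlot - 4);  // patch it
        buf.flip();
        return buf;
    }
}
```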
> Julien
>   


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Emmanuel Lecharny <el...@apache.org>.
>> Now compare that with freaking BB decoding, sweep, mark,slice, get,
>> apply & 0xFF for handling sign.. and if you want to extract a bunch of
>> structured data in 1 pass, it's impossible.
>>
>> I think the problem is the way you implement decoder in Java using the
>> ByteBuffer abstraction, the way we extract structured data .
>>     
>
> Agreed.  For Java, what we can ask is just 'which abstraction is
> better?'  My point is to separate state information from data, like we
> did in C with malloc().  Using ByteBuffer is like overwriting the return
> value of malloc() without storing its initial value, in that position and
> limit can be changed whenever the user wants.
>   
Not sure I get the analogy. malloc() just returns a pointer to a memory 
area; you are free, as a user, to move this pointer back and forth, and 
it's up to you to make sure you don't step out of the area... Java is 
much more defensive, and when you do that, you quickly get an 
out-of-bounds exception.

I would say that it's up to the user to deal with this problem, unless 
you provide a copy of the buffer as a byte[] (and with direct buffers, 
this will be costly). So if we provide such a layer, be sure that 
experienced users can still hit the underlying BB without passing 
through another API.

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Julien Vermillard wrote:
> Hi,
> 
> Look like I'm late, you should try to slow down on the reply button
> guys ! It's hard to follow :)
> 
> 
>  On Mon, 28 Apr 2008 22:21:57 +0900 (KST)
> "이희승 (Trustin Lee) <tr...@gmail.com> wrote:
> 
>> I thought about the current MINA API these days while I am idle, and
>> got some idea of improvements:
>>
>> 1) Split IoBuffer into two parts - array and buffer
>>
>> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
>> means it inherited some bad asset from ByteBuffer - all the stateful
>> properties such as position, limit and mark.  I think they should be
>> provided as a separate class, and there should be classes dedicated
>> for storing bytes only (something like ByteArray?)
>>
>> BTW, why is mixing them a bad idea?  It's because it makes the
>> implementation too complicated.  For example, it is almost impossible
>> to implement a composite buffer to support close-to-zero-copy I/O.
>> What about all the weird rules related with auto-expansion and buffer
>> derivation?  It's increasing the learning curve.
> 
> I never liked BB, I find it painful to use, and the plain Sun API
> don't even give method for dealing with unsigned types.
> 
> The byte array is probably sucking even more, because you can't cast it
> like in C to the type you want and fetch the data. the ones who ever
> wrote a binary decoder in C understand what I mean :
> gather enough bytes in a buffer, cast the buffer pointer to the good
>  struct/type and extract the data, increment the pointer to the next
>  step.

You still have to swap the bytes of each field to ensure your
application builds and runs on different architectures.  Unfortunately,
Java just doesn't provide a way to copy fields like that, no matter
which type (ByteBuffer or ByteArray) we use.
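To be fair, ByteBuffer does make the swapping declarative: set the wire's byte order once and every multi-byte read honours it. A tiny sketch:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: reads a little-endian int from wire bytes; the order is
// declared once instead of hand-swapping each field.
public class WireOrder {
    public static int readLeInt(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        buf.order(ByteOrder.LITTLE_ENDIAN);  // declare the wire order once
        return buf.getInt();
    }
}
```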

> Now compare that with freaking BB decoding, sweep, mark,slice, get,
> apply & 0xFF for handling sign.. and if you want to extract a bunch of
> structured data in 1 pass, it's impossible.
> 
> I think the problem is the way you implement decoder in Java using the
> ByteBuffer abstraction, the way we extract structured data .

Agreed.  For Java, what we can ask is just 'which abstraction is
better?'  My point is to separate state information from data, like we
did in C with malloc().  Using ByteBuffer is like overwriting the return
value of malloc() without storing its initial value, in that position and
limit can be changed whenever the user wants.

> I have no clear solution, but if we work out a good practise for writing
> binary decoders (text are excluded, it's too easy ;) ), we will have
> clearer view of the base buffer we need to have for producing
> great/fast/simple decoders.
> 
>> 2) Get rid of IoHandler and let IoFilter replace it.
>>
>> IoHandler is just a special IoFilter at the end of the filter chain.
>> I don't see any reason to keep it special considering that we are
>> often building multi-layered protocols.
> 
> Agree.
> 
>> 3) Split IoFilter into multiple interfaces.
>>
>> If IoHandler is removed, IoFilter should be renamed to represent
>> itself better.  Moreover, IoFilter should be split into more than one
>> interface (e.g. UpstreamHandler for receiving events and
>> DownstreamHandler for sending events from/to an IoProcessor) so they
>> can choose what to override more conveniently.
> 
> Do we really need it ? Isn't going to clutter the API ?

Right.  That's why I changed my opinion - let's just replace IoHandler
with IoFilter.
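
As a toy sketch of why the distinction is artificial, here is a chain
where the "handler" is simply a filter that never forwards.  The Filter
and NextFilter interfaces below are illustrative only, not the real MINA
API:

```java
import java.util.ArrayList;
import java.util.List;

public class ChainDemo {
    interface Filter {
        void messageReceived(String message, NextFilter next);
    }

    interface NextFilter {
        void forward(String message);
    }

    public static void main(String[] args) {
        List<Filter> chain = new ArrayList<>();
        // An ordinary filter: transforms the message and forwards it.
        chain.add((msg, next) -> next.forward(msg.toUpperCase()));
        // The "handler" is just the tail filter; it consumes the event
        // instead of forwarding, and needs no special interface.
        chain.add((msg, next) -> System.out.println("handled: " + msg));

        fire(chain, 0, "hello");
    }

    static void fire(List<Filter> chain, int index, String message) {
        if (index < chain.size()) {
            chain.get(index).messageReceived(message,
                    m -> fire(chain, index + 1, m));
        }
    }
}
```

Nothing in the dispatch loop distinguishes the handler from the other
filters, which is the argument for merging the two concepts.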

>> 4) UDP as the first citizen transport.
>>
>> MINA currently handles UDP like it's a TCP connection.  Consequently,
>> we had to provide some additional stuff like IoSessionRecycler and
>> IoAcceptor.newSession().  IMHO, session state management should be
>> completely optional when a user is using the UDP transport.  Possible
>> solution is to provide a different handler and session interface from
>> TCP.
>>
>> How do we execute all these changes? Well.. I think it's pretty easy
>> to do purely from a technical point of view.  All the changes will
>> make users experience greater backward incompatibility, but these
>> issues should be fixed somehow to make MINA a better netapp framework
>> IMHO.
>>
> 
> The multihoming and connectionless nature of some transports (UDP
> multicast, RS485, CAN, Ethernet) doesn't really mix well with MINA,
> but perhaps it's out of scope for MINA ?
> I think not, because the whole decoding process is nearly the same.

I can't think of a nice solution yet.  The current UDP API just needs a
little bit more learning.

>> Any feed back is welcome.
> 
> If we focus on providing a great codec framework, we will know what we
> need as underlying engine.
> 
> Actually we have too many helpers and ways to make a decoder with MINA
> and it's not helping much to decide if we need to use expandable
> buffers or composite byte arrays. I know some want both, but do
> we really need it ?

The problem is not expandability itself but the cost of copying on
expansion.  A composite byte array is one of the ways to resolve that
issue.  Less copying is better IMO.
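
A minimal sketch of the composite idea: instead of reallocating and
copying when more room is needed, keep a list of chunks and index across
them.  This is illustrative only; a real implementation would keep
per-chunk offsets for faster random access:

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeDemo {
    static class CompositeByteArray {
        private final List<byte[]> chunks = new ArrayList<>();
        private int length;

        // "Expansion" is just appending a chunk - no copying at all.
        void addLast(byte[] chunk) {
            chunks.add(chunk);
            length += chunk.length;
        }

        // Walk the chunks to resolve a logical index.
        byte get(int index) {
            for (byte[] chunk : chunks) {
                if (index < chunk.length) {
                    return chunk[index];
                }
                index -= chunk.length;
            }
            throw new IndexOutOfBoundsException();
        }

        int length() { return length; }
    }

    public static void main(String[] args) {
        CompositeByteArray buf = new CompositeByteArray();
        buf.addLast(new byte[] { 10, 20 });
        buf.addLast(new byte[] { 30 });
        System.out.println("length=" + buf.length() + " buf[2]=" + buf.get(2));
    }
}
```

Contrast this with an auto-expanding buffer, which must allocate a
bigger array and copy every existing byte each time it grows.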

Thanks,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: Redesigning IoBuffer, IoFilter and IoHandler

Posted by Julien Vermillard <jv...@archean.fr>.
Hi,

Looks like I'm late; you should try to slow down on the reply button,
guys!  It's hard to follow :)


On Mon, 28 Apr 2008 22:21:57 +0900 (KST)
"이희승 (Trustin Lee)" <tr...@gmail.com> wrote:

> I thought about the current MINA API these days while I am idle, and
> got some idea of improvements:
> 
> 1) Split IoBuffer into two parts - array and buffer
> 
> IoBuffer is basically an improvement to ByteBuffer.  Improvement here
> means it inherited some bad asset from ByteBuffer - all the stateful
> properties such as position, limit and mark.  I think they should be
> provided as a separate class, and there should be classes dedicated
> for storing bytes only (something like ByteArray?)
> 
> BTW, why is mixing them a bad idea?  It's because it makes the
> implementation too complicated.  For example, it is almost impossible
> to implement a composite buffer to support close-to-zero-copy I/O.
> What about all the weird rules related with auto-expansion and buffer
> derivation?  It's increasing the learning curve.

I never liked BB; I find it painful to use, and the plain Sun API
doesn't even give methods for dealing with unsigned types.

The byte array probably sucks even more, because you can't cast it
like in C to the type you want and fetch the data.  Anyone who has ever
written a binary decoder in C understands what I mean:
gather enough bytes in a buffer, cast the buffer pointer to the right
struct/type, extract the data, and increment the pointer to the next
step.

Now compare that with freaking BB decoding: sweep, mark, slice, get,
apply & 0xFF for handling sign... and if you want to extract a bunch of
structured data in one pass, it's impossible.
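
For reference, this is the masking dance in question: a plain ByteBuffer
has no unsigned getters, so every unsigned field needs a manual mask.

```java
import java.nio.ByteBuffer;

public class UnsignedDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(
                new byte[] { (byte) 0xFF, (byte) 0xFF, (byte) 0x80 });

        byte signed = buf.get(2);                     // -128 as a Java byte
        int unsignedByte = buf.get(2) & 0xFF;         // 128 after masking
        int unsignedShort = buf.getShort(0) & 0xFFFF; // 65535 after masking

        System.out.println(signed + " " + unsignedByte + " " + unsignedShort);
    }
}
```

Every unsigned field in a protocol header forces one of these masks,
which is exactly the kind of noise a better buffer abstraction could
absorb.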

I think the problem is the way you implement decoders in Java using the
ByteBuffer abstraction, the way we extract structured data.

I have no clear solution, but if we work out a good practice for writing
binary decoders (text is excluded, it's too easy ;) ), we will have a
clearer view of the base buffer we need to have for producing
great/fast/simple decoders.

> 
> 2) Get rid of IoHandler and let IoFilter replace it.
> 
> IoHandler is just a special IoFilter at the end of the filter chain.
> I don't see any reason to keep it special considering that we are
> often building multi-layered protocols.

Agree.

> 
> 3) Split IoFilter into multiple interfaces.
> 
> If IoHandler is removed, IoFilter should be renamed to represent
> itself better.  Moreover, IoFilter should be split into more than one
> interface (e.g. UpstreamHandler for receiving events and
> DownstreamHandler for sending events from/to an IoProcessor) so they
> can choose what to override more conveniently.

Do we really need it ? Isn't it going to clutter the API ?

> 
> 4) UDP as the first citizen transport.
> 
> MINA currently handles UDP like it's a TCP connection.  Consequently,
> we had to provide some additional stuff like IoSessionRecycler and
> IoAcceptor.newSession().  IMHO, session state management should be
> completely optional when a user is using the UDP transport.  Possible
> solution is to provide a different handler and session interface from
> TCP.
> 
> How do we execute all these changes? Well.. I think it's pretty easy
> to do purely from a technical point of view.  All the changes will
> make users experience greater backward incompatibility, but these
> issues should be fixed somehow to make MINA a better netapp framework
> IMHO.
> 

The multihoming and connectionless nature of some transports (UDP
multicast, RS485, CAN, Ethernet) doesn't really mix well with MINA,
but perhaps it's out of scope for MINA ?
I think not, because the whole decoding process is nearly the same.

> Any feed back is welcome.

If we focus on providing a great codec framework, we will know what we
need as underlying engine.

Actually we have too many helpers and ways to make a decoder with MINA,
and it's not helping much to decide if we need to use expandable
buffers or composite byte arrays. I know some want both, but do
we really need it ?

Julien