Posted to dev@mina.apache.org by 이희승 (Trustin Lee) <t...@gmail.com> on 2008/04/29 19:06:21 UTC

IoBuffer refactoring (final take?)

After a long IRC conversation with Emmanuel Lecharny, Julien
Vermillard, and David M. Lloyd, we seem to have reached the following
temporary consensus:

1) Ditch IoBuffer and use ByteBuffer directly.
2) Provide all the convenience methods IoBuffer provides as static methods
so they can be static-imported.
3) Modify our filter implementations to understand ByteBuffer and
Iterable<ByteBuffer>.
4) Update the IoFilter tutorial to inform users about this change.

This change means we decided not to create a new type to support
composite buffers.  Iterable<ByteBuffer> seems to be enough.

This change will take place in a new branch, and a review will be
requested before merging.

Another issue to think about is how we can implement auto-expansion.
Many users find it very useful when they construct a variable-length
message.  My idea right now is to provide a builder class which builds
an Iterable<ByteBuffer> or a ByteBuffer depending on user preference.
The same preference property should be provided by the protocol codec
framework for those who still want a single ByteBuffer.  I will also
explore this in the branch.
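
As a rough illustration of the builder idea (purely hypothetical names and
behavior, not a committed design), a sketch might look like this: writes
accumulate into fixed-size chunks, and the caller chooses between the
chunk list and one merged buffer at the end.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder: accumulates writes into fixed-size chunks and
// exposes the result either as a list of buffers or as one merged buffer.
public class BufferListBuilder {
    private final List<ByteBuffer> chunks = new ArrayList<>();
    private final int chunkSize;
    private ByteBuffer current;

    public BufferListBuilder(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    public BufferListBuilder putLong(long value) {
        // Simplification: a chunk whose tail is smaller than the value
        // is abandoned rather than split across chunks.
        if (current == null || current.remaining() < 8) {
            newChunk();
        }
        current.putLong(value);
        return this;
    }

    private void newChunk() {
        if (current != null) {
            current.flip(); // make the finished chunk readable
        }
        current = ByteBuffer.allocate(chunkSize);
        chunks.add(current);
    }

    // User preference 1: keep the chunks separate.
    public List<ByteBuffer> toBufferList() {
        finish();
        return chunks;
    }

    // User preference 2: copy everything into a single ByteBuffer.
    public ByteBuffer toSingleBuffer() {
        finish();
        int total = 0;
        for (ByteBuffer b : chunks) {
            total += b.remaining();
        }
        ByteBuffer merged = ByteBuffer.allocate(total);
        for (ByteBuffer b : chunks) {
            merged.put(b.duplicate()); // duplicate() leaves chunk positions untouched
        }
        merged.flip();
        return merged;
    }

    private void finish() {
        if (current != null) {
            current.flip();
            current = null;
        }
    }
}
```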

Any more feedback before I proceed?

Cheers,
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: IoBuffer refactoring (final take?)

Posted by Julien Vermillard <jv...@archean.fr>.
On Wed, 30 Apr 2008 02:13:09 +0900 (KST)
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:

> Someone could post the IRC log (or link to it) just for the record.  I
> turned off logging because my HDD has only 32GB capacity and don't
> know where the log is stored in FreeNode.
> 

Here are the logs:
http://www.antwerkz.com/javabot/javabot/home/3/%23mina/2/29/1/04/0/2008/

Thanks to David :)

Julien
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> > After long long IRC conversation with Emmanuel Lecharny, Julien
> > Vermillard and David M. Lloyd.  We seem to have reached to the
> > following temporary consensus:
> > 
> > 1) Ditch IoBuffer and use ByteBuffer directly.
> > 2) Provide all convenience methods IoBuffer provide as static
> > methods so they can be static-imported.
> > 3) Modify our filter implementations to understand ByteBuffer and
> > Iterable<ByteBuffer>.
> > 4) Update the IoFilter tutorial to inform users about this change.
> > 
> > This change means we decided not to create a new type to support
> > composite buffer.  Iterable<ByteBuffer> seems to be enough.
> > 
> > This change will take place in a new branch and its review will be
> > requested before merger.
> > 
> > Another issue to think about is how we can implement auto-expansion.
> > Many users find it very useful when they construct a variable length
> > message.  My idea right now is to provide a builder class which
> > builds an Iterable<ByteBuffer> or ByteBuffer depending on user
> > preference. Same preference property should be provided by the
> > protocol codec framework for those who still wants a single
> > ByteBuffer.  I will also explore this in the branch.
> > 
> > Any more feed back before I proceed?
> > 
> > Cheers,
> 


Re: IoBuffer refactoring (final take?)

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Could someone post the IRC log (or a link to it), just for the record?  I
turned off logging because my HDD has only 32GB of capacity, and I don't
know where the log is stored on FreeNode.

"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> After long long IRC conversation with Emmanuel Lecharny, Julien
> Vermillard and David M. Lloyd.  We seem to have reached to the following
> temporary consensus:
> 
> 1) Ditch IoBuffer and use ByteBuffer directly.
> 2) Provide all convenience methods IoBuffer provide as static methods so
> they can be static-imported.
> 3) Modify our filter implementations to understand ByteBuffer and
> Iterable<ByteBuffer>.
> 4) Update the IoFilter tutorial to inform users about this change.
> 
> This change means we decided not to create a new type to support
> composite buffer.  Iterable<ByteBuffer> seems to be enough.
> 
> This change will take place in a new branch and its review will be
> requested before merger.
> 
> Another issue to think about is how we can implement auto-expansion.
> Many users find it very useful when they construct a variable length
> message.  My idea right now is to provide a builder class which builds
> an Iterable<ByteBuffer> or ByteBuffer depending on user preference.
> Same preference property should be provided by the protocol codec
> framework for those who still wants a single ByteBuffer.  I will also
> explore this in the branch.
> 
> Any more feed back before I proceed?
> 
> Cheers,

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: IoBuffer refactoring (final take?)

Posted by Emmanuel Lecharny <el...@apache.org>.
Thanks Trustin,

you just perfectly summarized the 5-hour conversation we had :)

"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> After long long IRC conversation with Emmanuel Lecharny, Julien
> Vermillard and David M. Lloyd.  We seem to have reached to the following
> temporary consensus:
>
> 1) Ditch IoBuffer and use ByteBuffer directly.
> 2) Provide all convenience methods IoBuffer provide as static methods so
> they can be static-imported.
> 3) Modify our filter implementations to understand ByteBuffer and
> Iterable<ByteBuffer>.
> 4) Update the IoFilter tutorial to inform users about this change.
>
> This change means we decided not to create a new type to support
> composite buffer.  Iterable<ByteBuffer> seems to be enough.
>
> This change will take place in a new branch and its review will be
> requested before merger.
>
> Another issue to think about is how we can implement auto-expansion.
> Many users find it very useful when they construct a variable length
> message.  My idea right now is to provide a builder class which builds
> an Iterable<ByteBuffer> or ByteBuffer depending on user preference.
> Same preference property should be provided by the protocol codec
> framework for those who still wants a single ByteBuffer.  I will also
> explore this in the branch.
>
> Any more feed back before I proceed?
>
> Cheers,
>   


-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: IoBuffer refactoring (final take?)

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> Therefore, I'd propose to use a dedicated type instead of
> Iterable<ByteBuffer>.  WDYT?

Or we could keep using Iterable<ByteBuffer> and leave down-casting (or
construction of a new CompositeByteBuffer) as an implementor's homework?

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: IoBuffer refactoring (final take?)

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Uh, your reply was faster than mine. :)

Emmanuel Lecharny wrote:
> "이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
>> Last night, we purely focused on sequential scanning of incoming data
>> regarding handling a composite buffer.  I kept thinking about our
>> decision not to introduce any new type while I sleep, and I found a few
>> critical issues with it:
>>
>> * As Rich pointed out, random-accessing the composite buffer has too
>> much overhead with finding a right ByteBuffer when just an index and
>> Iterable<ByteBuffer> are provided.  More information (e.g. cache or
>> index tree) should be provided for faster access.
>> * Even if what user is doing is just sequential scanning of incoming
>> data, it's so much pain if a user happens to access data which spans
>> over more than one ByteBuffer.  For example, imagine the first buffer
>> contains 3 bytes and you want to read a long integer.
>>   
> I don't see this as a critical issue. It all depend on what you accept
> to deal with on your first filter. If we consider that everything which
> has been discussed applies to the very first filter (generally speaking,
> the codec), here is how I see the full problem :
> 
> 1) you have some incoming bytes, received packet by packet, those packet
> being stored in BB, and accumulating until we find a free thread to deal
> with it
> 2) The processing thread absorb those buffers and start to process them.
> Suppose this is a codec, we will have to decode the incoming "stream",
> assuming that this so called "stream" is not really a stream, but a list
> of BB (list != List here. we may just have more than one BB waiting)
> 3) The decoder role is to transform this "stream" of bytes to a "Message"
> 4) To add some complexity to this whole picture, a BB may contains more
> than one "Message" (or to be very clear, the start of a new "message"
> can be found in the middle of an incoming BB)
> 5) Some "Message" can't be created unless we received the whole bytes
> which compose the message (for instance, a newLine terminated message
> has to wait for the '\n' char, which may come after having received a
> lot of bytes in a lot of BB
> 6) You may have to keep in memory a really big piece of data if your
> message is big
> 7) When your decoder is done with the "Message" it is processing, it can
> forward it to the next filter, and discard the BB, if not already done,
> freeing some memory.
> 
> I'm not describing the opposite operation (ie encoding and sending data
> back to the client, but it's a little bit more simpler, as you handle
> the Message to encode and the BB to send. There is just a little pb :
> how to handle a slow client and how to protect the server against OOM in
> this case, as the messages can accumulate before they are sent to the
> client)
> 
> In many respects, the codec will not be funny to write if it has to deal
> with BB, as you will have to check for the remaining number of bytes in
> the BB _before_ getting some bytes out of it. Typically, your decoder
> will looks like :
> 
> for each ByteBuffer
>  do
>    while buffer.hasRemaining()
>      decoder.process( buffer.get() ); // Over simplified example: you
> may read more than one byte
> 
> This means that your codec should be able to stop decoding in the middle
> of a construction (Trustin's example is very clear : you have only 3
> bytes in the current BB, and you want to construct a Long)

Actually, my example was when a codec is provided with an
Iterable<ByteBuffer> which contains two buffers: the first one has 3
bytes and the second one has more than 5.

Of course, the decoder can read the 3 bytes and 5 bytes separately and
create a long integer from them, but it's very inconvenient:

  if (firstBuf.remaining() >= 8) {
    value = firstBuf.getLong();
  } else {
    value = .... you know the complexity ...
  }
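
For the record, here is a minimal illustrative sketch (not a proposed
API) of what that else branch involves by hand: assembling a big-endian
long whose 8 bytes straddle the boundary between two buffers.

```java
import java.nio.ByteBuffer;

public class SplitLongReader {
    // Reads one big-endian long even when its 8 bytes straddle the
    // boundary between two ByteBuffers -- the inconvenience being
    // argued about above.
    public static long readLong(ByteBuffer first, ByteBuffer second) {
        if (first.remaining() >= 8) {
            return first.getLong(); // the easy case
        }
        long value = 0;
        int read = 0;
        // Drain whatever the first buffer has...
        while (read < 8 && first.hasRemaining()) {
            value = (value << 8) | (first.get() & 0xFFL);
            read++;
        }
        // ...and take the remaining bytes from the second buffer.
        while (read < 8) {
            value = (value << 8) | (second.get() & 0xFFL);
            read++;
        }
        return value;
    }
}
```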

> This is possible, you just have to keep a state, and deal with it. You
> also have to kick the codec when you receive new bytes.

Of course it is possible, and what I am saying is that it should be
convenient to implement.

> Another possibility is to mask all this complexity and offers an higher
> level abstraction to allow the codec to poll the bytes from a "stream"
> (as suggested by some of our users), something like :
> 
> codec :
>  ...
>  byte[] data = stream.get( 8 )
>  long myLong = createLong( data )
>  ...
> 
> In this case, your codec is not statefull, and you don't want to know
> that behind the curtain, the stream is stored in BB. We can also see
> that in this case, you want specifically 8 bytes, which means that if
> the underlying BB does not contains those 8 bytes, the get( 8 ) call
> will block until some more bytes will be received.

What I am suggesting is somewhat different, and it's not related to
thread pools at all:

CompositeByteBuffer buf = new CompositeByteBuffer(firstBuf, secondBuf);

if (buf.remaining() >= 8) {
    value = buf.getLong();
}

Much simpler than without CompositeByteBuffer, no?

As I already posted in this thread, I think we could just accept
Iterable<ByteBuffer> and provide down-casting to CompositeByteBuffer, or
creation of a new CompositeByteBuffer from an Iterable<ByteBuffer>, as a
utility method.  For example:

  Iterable<ByteBuffer> bufs = ...;
  // Downcast or create a new one.
  CompositeByteBuffer buf = CompositeByteBuffer.getInstance(bufs);

> So the question is : why do we have to define a very special new type
> when a specific filter can do the work ? What if we provide this
> filter, using existing java classes to ease the Codec implementor work
> instead ?

It's because we cannot provide a generic filter implementation.  Every
protocol has its own nature, so it needs to be dealt with by the
codec/filter implementor.  For some protocols, it's not so special.  We
will have to provide filters for a few typical cases, though
(delimiter-based and header-field-based framing).

> (PS: it does not change the consensus we reached yesturday : the first
> filter will still have to deal with either BB or Iterator<BB>)

Yes, basically it doesn't change anything.
-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


Re: IoBuffer refactoring (final take?)

Posted by Emmanuel Lecharny <el...@apache.org>.
"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> Last night, we purely focused on sequential scanning of incoming data
> regarding handling a composite buffer.  I kept thinking about our
> decision not to introduce any new type while I sleep, and I found a few
> critical issues with it:
>
> * As Rich pointed out, random-accessing the composite buffer has too
> much overhead with finding a right ByteBuffer when just an index and
> Iterable<ByteBuffer> are provided.  More information (e.g. cache or
> index tree) should be provided for faster access.
> * Even if what user is doing is just sequential scanning of incoming
> data, it's so much pain if a user happens to access data which spans
> over more than one ByteBuffer.  For example, imagine the first buffer
> contains 3 bytes and you want to read a long integer.
>   
I don't see this as a critical issue. It all depends on what you accept 
to deal with in your first filter. If we consider that everything which 
has been discussed applies to the very first filter (generally speaking, 
the codec), here is how I see the full problem:

1) You have some incoming bytes, received packet by packet, those packets 
being stored in BBs and accumulating until we find a free thread to deal 
with them.
2) The processing thread absorbs those buffers and starts to process them. 
Suppose this is a codec: we will have to decode the incoming "stream", 
keeping in mind that this so-called "stream" is not really a stream, but 
a list of BBs (list != List here; we may just have more than one BB waiting).
3) The decoder's role is to transform this "stream" of bytes into a "Message".
4) To add some complexity to this whole picture, a BB may contain more 
than one "Message" (or, to be very clear, the start of a new "Message" 
can be found in the middle of an incoming BB).
5) Some "Messages" can't be created until we have received all the bytes 
which compose the message (for instance, a newline-terminated message 
has to wait for the '\n' char, which may come after a lot of bytes 
received in a lot of BBs).
6) You may have to keep a really big piece of data in memory if your 
message is big.
7) When your decoder is done with the "Message" it is processing, it can 
forward it to the next filter and discard the BBs, if not already done, 
freeing some memory.

I'm not describing the opposite operation (i.e. encoding and sending data 
back to the client), but it's a little bit simpler, as you handle the 
Message to encode and the BBs to send. There is just a little problem: 
how to handle a slow client, and how to protect the server against OOM in 
this case, as the messages can accumulate before they are sent to the 
client.

In many respects, the codec will not be fun to write if it has to deal 
with BBs, as you will have to check the remaining number of bytes in 
the BB _before_ getting some bytes out of it. Typically, your decoder 
will look like:

for (ByteBuffer buffer : buffers) {
    while (buffer.hasRemaining()) {
        decoder.process( buffer.get() ); // Oversimplified example: you
    }                                    // may read more than one byte
}

This means that your codec should be able to stop decoding in the middle 
of a construction (Trustin's example is very clear: you have only 3 
bytes in the current BB, and you want to construct a Long).

This is possible; you just have to keep a state and deal with it. You 
also have to kick the codec when you receive new bytes.

Another possibility is to mask all this complexity and offer a 
higher-level abstraction to allow the codec to poll the bytes from a 
"stream" (as suggested by some of our users), something like:

codec :
  ...
  byte[] data = stream.get( 8 )
  long myLong = createLong( data )
  ...

In this case, your codec is not stateful, and you don't want to know 
that, behind the curtain, the stream is stored in BBs. We can also see 
that in this case you want specifically 8 bytes, which means that if 
the underlying BBs do not contain those 8 bytes, the get( 8 ) call 
will block until some more bytes are received.

This leads to some other kind of complexity, as your codec might consume 
a Thread, and that can lead to thread exhaustion in the pool of threads. 
But from the user's POV, it is much simpler to implement a codec using 
such a paradigm.
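
A minimal sketch of such a blocking stream, assuming a simple queue-fed
design (the class and method names are illustrative, not an actual MINA
API), could be:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical "stream over ByteBuffers": get(n) blocks until n bytes
// have arrived, hiding the BB boundaries from the codec.  This is the
// blocking variant discussed above, with the same thread-exhaustion
// caveat.
public class ByteBufferStream {
    private final BlockingQueue<ByteBuffer> incoming = new LinkedBlockingQueue<>();
    private ByteBuffer current;

    // Called by the I/O layer when a new buffer arrives.
    public void offer(ByteBuffer bb) {
        incoming.add(bb);
    }

    // Called by the codec; blocks until n bytes are available.
    public byte[] get(int n) {
        byte[] data = new byte[n];
        int filled = 0;
        while (filled < n) {
            if (current == null || !current.hasRemaining()) {
                try {
                    current = incoming.take(); // blocks until more data arrives
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for data", e);
                }
            }
            int len = Math.min(n - filled, current.remaining());
            current.get(data, filled, len);
            filled += len;
        }
        return data;
    }
}
```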

Now, I don't see how those two scenarios are incompatible. We can provide 
a Filter which will transform the incoming BBs into any kind of other 
structure, including a byte stream. The idea is that from one Filter to 
the next, you have transformed the data. Here, we will have:

client data stored into BBs -> Filter 1 transforms BBs into a stream --> 
Filter 2 is the decoder, transforming the stream into messages --> Filter 
3 ...

So the question is: why do we have to define a very special new type 
when a specific filter can do the work? What if we provide this filter, 
using existing Java classes, to ease the Codec implementor's work instead?

(PS: it does not change the consensus we reached yesterday: the first 
filter will still have to deal with either BB or Iterator<BB>)

WDYT? Am I missing something?

-- 
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: IoBuffer refactoring (final take?)

Posted by 이희승 (Trustin Lee) <t...@gmail.com>.
Last night, we focused purely on sequential scanning of incoming data
when handling a composite buffer.  I kept thinking about our decision
not to introduce any new type while I slept, and I found a few critical
issues with it:

* As Rich pointed out, random access into the composite buffer has too
much overhead in finding the right ByteBuffer when just an index and an
Iterable<ByteBuffer> are provided.  More information (e.g. a cache or an
index tree) should be provided for faster access.
* Even if all the user is doing is sequential scanning of incoming data,
it's a lot of pain if the user happens to access data which spans more
than one ByteBuffer.  For example, imagine the first buffer contains 3
bytes and you want to read a long integer.

So... I think we should introduce a new type at least for the composite
buffer, even if we are going to ditch IoBuffer.  Probably something like
this?

public interface CompositeByteBuffer extends Iterable<ByteBuffer> {
    // These two are optional and can throw an exception when
    // this is read only.
    CompositeByteBuffer prepend(ByteBuffer bb);
    CompositeByteBuffer append (ByteBuffer bb);

    // All ByteBuffer methods here; make sure to provide the same
    // experience for better learning curve.
    int getInt(int index);
    int getInt();
    ...

    CompositeByteBuffer asReadOnly();
    CompositeByteBuffer duplicate();
    ...
}

We might not even need to make it an interface.  The implementation
shouldn't overlap much with that of the NIO ByteBuffer, because most
methods will be implemented in three steps:

1) Find the proper ByteBuffer.
2) Forward the call to the found ByteBuffer.
3) If the call spans more than one ByteBuffer, handle it.
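
A hypothetical sketch of that three-step pattern for an absolute getInt
(illustrative only; a real implementation would cache the per-buffer
offsets for the random-access reasons mentioned above):

```java
import java.nio.ByteBuffer;

public class CompositeGet {
    // Absolute getInt over a sequence of buffers, following the three
    // steps above: locate the buffer, forward when possible, and fall
    // back to byte-by-byte assembly when the value spans a boundary.
    public static int getInt(ByteBuffer[] bufs, int index) {
        // Step 1: find the buffer containing 'index' (linear scan here;
        // a cached offset table or index tree would make this faster).
        int i = 0;
        while (index >= bufs[i].remaining()) {
            index -= bufs[i].remaining();
            i++;
        }
        // Step 2: forward to the underlying ByteBuffer when the whole
        // int fits inside it.
        if (bufs[i].remaining() - index >= 4) {
            return bufs[i].getInt(bufs[i].position() + index);
        }
        // Step 3: the int spans a boundary; assemble it byte by byte.
        int value = 0;
        for (int k = 0; k < 4; k++) {
            while (index >= bufs[i].remaining()) {
                index -= bufs[i].remaining();
                i++;
            }
            value = (value << 8) | (bufs[i].get(bufs[i].position() + index) & 0xFF);
            index++;
        }
        return value;
    }
}
```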

Why didn't we discuss this issue?  I don't know, but I think it is quite
an important one. :)

Therefore, I'd propose to use a dedicated type instead of
Iterable<ByteBuffer>.  WDYT?

"이희승 (Trustin Lee) <tr...@gmail.com>" wrote:
> After long long IRC conversation with Emmanuel Lecharny, Julien
> Vermillard and David M. Lloyd.  We seem to have reached to the following
> temporary consensus:
> 
> 1) Ditch IoBuffer and use ByteBuffer directly.
> 2) Provide all convenience methods IoBuffer provide as static methods so
> they can be static-imported.
> 3) Modify our filter implementations to understand ByteBuffer and
> Iterable<ByteBuffer>.
> 4) Update the IoFilter tutorial to inform users about this change.
> 
> This change means we decided not to create a new type to support
> composite buffer.  Iterable<ByteBuffer> seems to be enough.
> 
> This change will take place in a new branch and its review will be
> requested before merger.
> 
> Another issue to think about is how we can implement auto-expansion.
> Many users find it very useful when they construct a variable length
> message.  My idea right now is to provide a builder class which builds
> an Iterable<ByteBuffer> or ByteBuffer depending on user preference.
> Same preference property should be provided by the protocol codec
> framework for those who still wants a single ByteBuffer.  I will also
> explore this in the branch.
> 
> Any more feed back before I proceed?
> 
> Cheers,

-- 
Trustin Lee - Principal Software Engineer, JBoss, Red Hat
--
what we call human nature is actually human habit
--
http://gleamynode.net/


RE: IoBuffer refactoring (final take?)

Posted by "Pauls, Karl" <kp...@trigeo.com>.

> Another issue to think about is how we can implement auto-expansion.
> Many users find it very useful when they construct a variable length
> message.  My idea right now is to provide a builder class which builds
> an Iterable<ByteBuffer> or ByteBuffer depending on user preference.
> Same preference property should be provided by the protocol codec
> framework for those who still wants a single ByteBuffer.  I will also
> explore this in the branch.

If Iterable<ByteBuffer> could be extended by specifying a buffer pool/factory in the builder class, I think users should make use of that for expanding their message size. However, we could delegate the expansion function to a factory if users want the convenience of writing to a single buffer. I don't think it should be automatic anymore.
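
One way to read the factory-delegated idea (a hedged sketch with made-up
names, not the pastebin proposal): growth is never silent, it always goes
through a factory the user supplied, so a pool can sit behind it.

```java
import java.nio.ByteBuffer;

public class ExpandingWriter {
    // Hypothetical pool/factory hook; a real pool would recycle buffers.
    public interface ByteBufferFactory {
        ByteBuffer allocate(int capacity);
    }

    private final ByteBufferFactory factory;
    private ByteBuffer buf;

    public ExpandingWriter(ByteBufferFactory factory, int initialCapacity) {
        this.factory = factory;
        this.buf = factory.allocate(initialCapacity);
    }

    // Explicit, factory-delegated expansion instead of automatic growth:
    // when the data does not fit, ask the factory for a larger buffer
    // and copy the contents over.
    public void put(byte[] data) {
        if (buf.remaining() < data.length) {
            ByteBuffer bigger = factory.allocate(
                    Math.max(buf.capacity() * 2, buf.position() + data.length));
            buf.flip();
            bigger.put(buf);
            buf = bigger;
        }
        buf.put(data);
    }

    // Read-only view of what has been written so far.
    public ByteBuffer result() {
        ByteBuffer r = buf.duplicate();
        r.flip();
        return r;
    }
}
```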

Here are some ideas regarding a ByteBuffer pool/factory interface and its
requirements, and how we could use a factory to possibly achieve
auto-expand functionality:
http://pastebin.com/f3934290d

-karl