Posted to dev@mina.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2011/12/01 10:00:52 UTC

[MINA 3.0] IoBuffer usage

Hi guys,

yesterday, I committed some changes that make the NioSelectorProcessor 
use the IoBuffer class instead of a single buffer to store the 
incoming data. Here is the snippet of changed code :

    int readCount = 0;
    IoBuffer ioBuffer = session.getIoBuffer();

    do {
        ByteBuffer readBuffer = ByteBuffer.allocate(1024);
        readCount = channel.read(readBuffer);
        LOGGER.debug("read {} bytes", readCount);

        if (readCount < 0) {
            // session closed by the remote peer
            LOGGER.debug("session closed by the remote peer");
            sessionsToClose.add(session);
            break;
        } else if (readCount > 0) {
            readBuffer.flip();
            ioBuffer.add(readBuffer);
        }
    } while (readCount > 0);

    // we have read some data
    // limit at the current position & rewind buffer back to start & push to the chain
    session.getFilterChain().processMessageReceived(session, ioBuffer);

As you can see, instead of reading one buffer and calling the chain, we 
gather as much data as we can (ie as much as the channel can provide), 
and then we call the chain.
This has one major advantage : we don't call the chain many times if the 
data is bigger than the buffer size (currently set to 1024 bytes), and 
as a side effect it does not require that we define a bigger buffer (not 
really a big deal, we can afford to use a 64kb buffer here, as there is 
only one buffer per selector).
The drawback is that we allocate ByteBuffers on the fly. This can be 
improved by using a pre-allocated buffer (say a 64kb buffer), and if we 
still have something to read, then we allocate some more (this is 
probably what I will change).
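
As a rough illustration of that improvement, the read loop could reuse one 
pre-allocated buffer and copy what was actually read into a right-sized 
chunk before adding it to the IoBuffer (a minimal sketch with hypothetical 
helper names, not the committed code) :

    // Sketch only : one pre-allocated 64kb buffer per selector, reused for every read
    private final ByteBuffer readBuffer = ByteBuffer.allocate(64 * 1024);

    private void readAll(SocketChannel channel, IoSession session) throws IOException {
        IoBuffer ioBuffer = session.getIoBuffer();
        int readCount;

        do {
            readBuffer.clear();
            readCount = channel.read(readBuffer);

            if (readCount < 0) {
                // session closed by the remote peer
                sessionsToClose.add(session);
                break;
            } else if (readCount > 0) {
                readBuffer.flip();
                // copy into a right-sized buffer so the shared one can be reused
                ByteBuffer chunk = ByteBuffer.allocate(readCount);
                chunk.put(readBuffer);
                chunk.flip();
                ioBuffer.add(chunk);
            }
        } while (readCount > 0);

        session.getFilterChain().processMessageReceived(session, ioBuffer);
    }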

The rest of the code is not changed much, except the decoder and every 
filter that expected to receive a ByteBuffer (like the LoggingFilter). 
It's just a matter of casting the Object to IoBuffer and processing the 
data, as the IoBuffer methods are the same as the ByteBuffer ones (except 
that you can't inject anything but ByteBuffers into an IoBuffer, so no 
put method, for instance).
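
For instance, a filter that used to receive a ByteBuffer now just casts 
the message (a sketch ; the exact filter signature may differ) :

    public void messageReceived(IoSession session, Object message) {
        // the message is now the session's IoBuffer instead of a plain ByteBuffer
        IoBuffer buffer = (IoBuffer) message;
        LOGGER.debug("received {} bytes", buffer.remaining());
        // ... decode or log the accumulated data ...
    }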

The decoders for Http and Ldap have been changed to deal with the 
IoBuffer. The big gain here, in the Http case, is that we don't have to 
accumulate the data into a new ByteBuffer : the IoBuffer already 
accumulates the data itself.

The IoBuffer is stored in the session, which means we can reuse it 
over and over, no need to create a new one. I still have to implement 
the compact() method, which will remove the consumed ByteBuffers, in 
order for this IoBuffer not to grow out of bounds.
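
A minimal sketch of what that compact() could look like, assuming the 
IoBuffer is backed by a list of ByteBuffers (the field name is 
hypothetical, this is not the actual implementation) :

    // Hypothetical sketch : drop the ByteBuffers that have been fully consumed
    public void compact() {
        Iterator<ByteBuffer> it = buffers.iterator();   // 'buffers' is the assumed backing list

        while (it.hasNext()) {
            ByteBuffer buffer = it.next();

            if (buffer.hasRemaining()) {
                // first buffer still holding unread data : keep it and everything after it
                break;
            }

            it.remove();
        }
    }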

thoughts, comments ?

Thanks !

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
On 1/18/12 2:57 AM, Chad Beaulac wrote:
> Ok. I'll test the throughput on it with the tests I have.

Let me know if you have any issues with the throughput you obtain.

Be sure to set the default read buffer size to the maximum size ; AFAIR, 
it's set to 2048 bytes in MINA 2 (and can't be higher than 65536), 
otherwise you may get atrocious throughput...
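
Something along these lines on the session config should do it (a sketch ; 
double-check the exact bounds for the MINA 2 version you use) :

    // Sketch : raise the read buffer size on the MINA 2 session config
    IoSessionConfig config = session.getConfig();
    config.setReadBufferSize(65536);      // default is 2048 bytes
    config.setMaxReadBufferSize(65536);   // upper bound used by the adaptive read buffer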
>
> Thanks
> Chad
>
> Sent from my iPad
>
> On Jan 17, 2012, at 11:33 AM, Emmanuel Lecharny<el...@gmail.com>  wrote:
>
>> AFAICT, on MINA 2, you should not have any issue.
>>
>> The loop where we write buffers into the channel is :
>>
>>     ...
>>         final int maxWrittenBytes = session.getConfig().getMaxReadBufferSize()
>>                 + (session.getConfig().getMaxReadBufferSize()>>>  1);
>>         int writtenBytes = 0;
>>         WriteRequest req = null;
>>
>>         try {
>>             // Clear OP_WRITE
>>             setInterestedInWrite(session, false);
>>
>>             do {
>>                 // Check for pending writes.
>>                 req = session.getCurrentWriteRequest();
>>
>>                 if (req == null) {
>>                     req = writeRequestQueue.poll(session);
>>
>>                     if (req == null) {
>>                         break;
>>                     }
>>
>>                     session.setCurrentWriteRequest(req);
>>                 }
>>
>>                 int localWrittenBytes = 0;
>>                 Object message = req.getMessage();
>>
>>                 if (message instanceof IoBuffer) {
>>                     localWrittenBytes = writeBuffer(session, req,
>>                             hasFragmentation, maxWrittenBytes - writtenBytes,
>>                             currentTime);
>>
>>                     if ((localWrittenBytes > 0)
>>                             && ((IoBuffer) message).hasRemaining()) {
>>                         // the buffer isn't empty, we re-interest it in writing
>>                         writtenBytes += localWrittenBytes;
>>                         setInterestedInWrite(session, true);
>>                         return false;
>>                     }
>>                 } else { // Blahhh }
>>
>>                 if (localWrittenBytes == 0) {
>>                     // Kernel buffer is full.
>>                     setInterestedInWrite(session, true);
>>                     return false;
>>                 }
>>
>>                 writtenBytes += localWrittenBytes;
>>
>>                 if (writtenBytes >= maxWrittenBytes) {
>>                     // Wrote too much
>>                     scheduleFlush(session);
>>                     return false;
>>                 }
>>             } while (writtenBytes < maxWrittenBytes);
>>     ...
>>
>> with :
>>
>>     private int writeBuffer(S session, WriteRequest req,
>>             boolean hasFragmentation, int maxLength, long currentTime)
>>             throws Exception {
>>         IoBuffer buf = (IoBuffer) req.getMessage();
>>         int localWrittenBytes = 0;
>>
>>         if (buf.hasRemaining()) {
>>             int length;
>>
>>             if (hasFragmentation) {
>>                 length = Math.min(buf.remaining(), maxLength);
>>             } else {
>>                 length = buf.remaining();
>>             }
>>
>>             localWrittenBytes = write(session, buf, length);
>>         }
>>
>> and :
>>
>>     protected int write(NioSession session, IoBuffer buf, int length)
>>             throws Exception {
>>         if (buf.remaining() <= length) {
>>             return session.getChannel().write(buf.buf());
>>         }
>>
>>         int oldLimit = buf.limit();
>>         buf.limit(buf.position() + length);
>>         try {
>>             return session.getChannel().write(buf.buf());
>>         } finally {
>>             buf.limit(oldLimit);
>>         }
>>     }
>>
>> So we try our best to stuff the channel with as many bytes as possible, before giving up (either because we don't have anything to write, or because the channel is full...)
>>
>> I don't see how we can do any better.
>>
>> On 1/17/12 3:57 PM, Chad Beaulac wrote:
>>> On Jan 17, 2012, at 9:32 AM, Emmanuel Lécharny<el...@apache.org>   wrote:
>>>
>>>> On 1/17/12 3:02 PM, Chad Beaulac wrote:
>>>>> On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny<el...@gmail.com>wrote:
>>>>>
>>>>>> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>>>>>>
>>>>>>> Emmanuel, (all)
>>>>>>>
>>>>>>> I'm working on this Camel ticket:
>>>>>>> https://issues.apache.org/jira/browse/CAMEL-2624
>>>>>>>
>>>>>>> I finished the initial cut of
>>>>>>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>>>>>>> component in Camel.
>>>>>>>
>>>>>>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>>>>>>
>>>>>>> Would it be possible to backport the IoBuffer reading and writing
>>>>>>> discussed
>>>>>>> in this email thread from Mina3 to Mina2?
>>>>>>> Following the depth of the stack trace through
>>>>>>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>>>>>>> My current code (mina-less) is supporting single TCP channels with 320+
>>>>>>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>>>>>>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>>>>>>> CAMEL-2624 and setup some throughput tests.
>>>>>>>
>>>>>> Another option would be to port this to MINA3, and make it work with
>>>>>> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
>>>>>> with Http and LDAP. I'd rather spend some time making it a bit more solid
>>>>>> and working well in your case instead of trying to inject the code in MINA2.
>>>>>>
>>>>>> Now, it's your call. We can discuss the pros and cons of both approach if
>>>>>> you like.
>>>>>>
>>>>>>
>>>>> Hi Emmanuel,
>>>>>
>>>>> One of my pros/cons trade-offs is time-to-market. I can have a solution in
>>>>> Camel with Mina2 fairly quickly. Although I might have issues with high
>>>>> data rate streams.
>>>>> With that said, my approach would be the following:
>>>>> 1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
>>>>> endpoints I need and a quick time-to-market. I'll put off issues concerning
>>>>> high throughput.
>>>>> 2) Work on Mina3 to ensure it has low latency with small data rate streams
>>>>> and high throughput with large data pipes.
>>>>> 3) Upgrade my Google protocol buffers codec for Mina3.
>>>>> 4) When Mina3 is ready, open a new Camel ticket and create a new mina3
>>>>> Camel Component.
>>>>>
>>>>> What do you think?
>>>> I'll try to squeeze 2 hours to backport the patch to MINA 2 today or tomorrow.
>>>>
>>>> Feel free to ping me on mail or on #mina if I don't send any feedback in the next 2 days (I'm pretty busy and may slip)
>>>>
>>>>
>>>> -- 
>>>> Regards,
>>>> Cordialement,
>>>> Emmanuel Lécharny
>>>> www.iktek.com
>>>>
>>> Wow. That is nice! Look forward to checking it out. I'll move forward with my plan in the meantime.
>>>
>>> Chad
>>> Sent from my iPhone
>>>
>>
>> -- 
>> Regards,
>> Cordialement,
>> Emmanuel Lécharny
>> www.iktek.com
>>


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Ok. I'll test the throughput on it with the tests I have. 

Thanks
Chad

Sent from my iPad

On Jan 17, 2012, at 11:33 AM, Emmanuel Lecharny <el...@gmail.com> wrote:

> AFAICT, on MINA 2, you should not have any issue.
> 
> The loop where we write buffers into the channel is :
> 
>    ...
>        final int maxWrittenBytes = session.getConfig().getMaxReadBufferSize()
>                + (session.getConfig().getMaxReadBufferSize() >>> 1);
>        int writtenBytes = 0;
>        WriteRequest req = null;
> 
>        try {
>            // Clear OP_WRITE
>            setInterestedInWrite(session, false);
> 
>            do {
>                // Check for pending writes.
>                req = session.getCurrentWriteRequest();
> 
>                if (req == null) {
>                    req = writeRequestQueue.poll(session);
> 
>                    if (req == null) {
>                        break;
>                    }
> 
>                    session.setCurrentWriteRequest(req);
>                }
> 
>                int localWrittenBytes = 0;
>                Object message = req.getMessage();
> 
>                if (message instanceof IoBuffer) {
>                    localWrittenBytes = writeBuffer(session, req,
>                            hasFragmentation, maxWrittenBytes - writtenBytes,
>                            currentTime);
> 
>                    if ((localWrittenBytes > 0)
>                            && ((IoBuffer) message).hasRemaining()) {
>                        // the buffer isn't empty, we re-interest it in writing
>                        writtenBytes += localWrittenBytes;
>                        setInterestedInWrite(session, true);
>                        return false;
>                    }
>                } else { // Blahhh }
> 
>                if (localWrittenBytes == 0) {
>                    // Kernel buffer is full.
>                    setInterestedInWrite(session, true);
>                    return false;
>                }
> 
>                writtenBytes += localWrittenBytes;
> 
>                if (writtenBytes >= maxWrittenBytes) {
>                    // Wrote too much
>                    scheduleFlush(session);
>                    return false;
>                }
>            } while (writtenBytes < maxWrittenBytes);
>    ...
> 
> with :
> 
>    private int writeBuffer(S session, WriteRequest req,
>            boolean hasFragmentation, int maxLength, long currentTime)
>            throws Exception {
>        IoBuffer buf = (IoBuffer) req.getMessage();
>        int localWrittenBytes = 0;
> 
>        if (buf.hasRemaining()) {
>            int length;
> 
>            if (hasFragmentation) {
>                length = Math.min(buf.remaining(), maxLength);
>            } else {
>                length = buf.remaining();
>            }
> 
>            localWrittenBytes = write(session, buf, length);
>        }
> 
> and :
> 
>    protected int write(NioSession session, IoBuffer buf, int length)
>            throws Exception {
>        if (buf.remaining() <= length) {
>            return session.getChannel().write(buf.buf());
>        }
> 
>        int oldLimit = buf.limit();
>        buf.limit(buf.position() + length);
>        try {
>            return session.getChannel().write(buf.buf());
>        } finally {
>            buf.limit(oldLimit);
>        }
>    }
> 
> So we try our best to stuff the channel with as many bytes as possible, before giving up (either because we don't have anything to write, or because the channel is full...)
> 
> I don't see how we can do any better.
> 
> On 1/17/12 3:57 PM, Chad Beaulac wrote:
>> 
>> On Jan 17, 2012, at 9:32 AM, Emmanuel Lécharny<el...@apache.org>  wrote:
>> 
>>> On 1/17/12 3:02 PM, Chad Beaulac wrote:
>>>> On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny<el...@gmail.com>wrote:
>>>> 
>>>>> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>>>>> 
>>>>>> Emmanuel, (all)
>>>>>> 
>>>>>> I'm working on this Camel ticket:
>>>>>> https://issues.apache.org/jira/browse/CAMEL-2624
>>>>>> 
>>>>>> I finished the initial cut of
>>>>>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>>>>>> component in Camel.
>>>>>> 
>>>>>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>>>>> 
>>>>>> Would it be possible to backport the IoBuffer reading and writing
>>>>>> discussed
>>>>>> in this email thread from Mina3 to Mina2?
>>>>>> Following the depth of the stack trace through
>>>>>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>>>>>> My current code (mina-less) is supporting single TCP channels with 320+
>>>>>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>>>>>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>>>>>> CAMEL-2624 and setup some throughput tests.
>>>>>> 
>>>>> Another option would be to port this to MINA3, and make it work with
>>>>> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
>>>>> with Http and LDAP. I'd rather spend some time making it a bit more solid
>>>>> and working well in your case instead of trying to inject the code in MINA2.
>>>>> 
>>>>> Now, it's your call. We can discuss the pros and cons of both approach if
>>>>> you like.
>>>>> 
>>>>> 
>>>> Hi Emmanuel,
>>>> 
>>>> One of my pros/cons trade-offs is time-to-market. I can have a solution in
>>>> Camel with Mina2 fairly quickly. Although I might have issues with high
>>>> data rate streams.
>>>> With that said, my approach would be the following:
>>>> 1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
>>>> endpoints I need and a quick time-to-market. I'll put off issues concerning
>>>> high throughput.
>>>> 2) Work on Mina3 to ensure it has low latency with small data rate streams
>>>> and high throughput with large data pipes.
>>>> 3) Upgrade my Google protocol buffers codec for Mina3.
>>>> 4) When Mina3 is ready, open a new Camel ticket and create a new mina3
>>>> Camel Component.
>>>> 
>>>> What do you think?
>>> I'll try to squeeze 2 hours to backport the patch to MINA 2 today or tomorrow.
>>> 
>>> Feel free to ping me on mail or on #mina if I don't send any feedback in the next 2 days (I'm pretty busy and may slip)
>>> 
>>> 
>>> -- 
>>> Regards,
>>> Cordialement,
>>> Emmanuel Lécharny
>>> www.iktek.com
>>> 
>> Wow. That is nice! Look forward to checking it out. I'll move forward with my plan in the meantime.
>> 
>> Chad
>> Sent from my iPhone
>> 
> 
> 
> -- 
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
> 

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
AFAICT, on MINA 2, you should not have any issue.

The loop where we write buffers into the channel is :

     ...
         final int maxWrittenBytes = session.getConfig().getMaxReadBufferSize()
                 + (session.getConfig().getMaxReadBufferSize() >>> 1);
         int writtenBytes = 0;
         WriteRequest req = null;

         try {
             // Clear OP_WRITE
             setInterestedInWrite(session, false);

             do {
                 // Check for pending writes.
                 req = session.getCurrentWriteRequest();

                 if (req == null) {
                     req = writeRequestQueue.poll(session);

                     if (req == null) {
                         break;
                     }

                     session.setCurrentWriteRequest(req);
                 }

                 int localWrittenBytes = 0;
                 Object message = req.getMessage();

                 if (message instanceof IoBuffer) {
                     localWrittenBytes = writeBuffer(session, req,
                             hasFragmentation, maxWrittenBytes - writtenBytes,
                             currentTime);

                     if ((localWrittenBytes > 0)
                             && ((IoBuffer) message).hasRemaining()) {
                         // the buffer isn't empty, we re-interest it in writing
                         writtenBytes += localWrittenBytes;
                         setInterestedInWrite(session, true);
                         return false;
                     }
                 } else { /* Blahhh */ }

                 if (localWrittenBytes == 0) {
                     // Kernel buffer is full.
                     setInterestedInWrite(session, true);
                     return false;
                 }

                 writtenBytes += localWrittenBytes;

                 if (writtenBytes >= maxWrittenBytes) {
                     // Wrote too much
                     scheduleFlush(session);
                     return false;
                 }
             } while (writtenBytes < maxWrittenBytes);
     ...

with :

     private int writeBuffer(S session, WriteRequest req,
             boolean hasFragmentation, int maxLength, long currentTime)
             throws Exception {
         IoBuffer buf = (IoBuffer) req.getMessage();
         int localWrittenBytes = 0;

         if (buf.hasRemaining()) {
             int length;

             if (hasFragmentation) {
                 length = Math.min(buf.remaining(), maxLength);
             } else {
                 length = buf.remaining();
             }

             localWrittenBytes = write(session, buf, length);
         }

and :

     protected int write(NioSession session, IoBuffer buf, int length)
             throws Exception {
         if (buf.remaining() <= length) {
             return session.getChannel().write(buf.buf());
         }

         int oldLimit = buf.limit();
         buf.limit(buf.position() + length);
         try {
             return session.getChannel().write(buf.buf());
         } finally {
             buf.limit(oldLimit);
         }
     }

So we try our best to stuff the channel with as many bytes as possible, 
before giving up (either because we don't have anything to write, or 
because the channel is full...)

I don't see how we can do any better.

On 1/17/12 3:57 PM, Chad Beaulac wrote:
>
> On Jan 17, 2012, at 9:32 AM, Emmanuel Lécharny<el...@apache.org>  wrote:
>
>> On 1/17/12 3:02 PM, Chad Beaulac wrote:
>>> On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny<el...@gmail.com>wrote:
>>>
>>>> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>>>>
>>>>> Emmanuel, (all)
>>>>>
>>>>> I'm working on this Camel ticket:
>>>>> https://issues.apache.org/jira/browse/CAMEL-2624
>>>>>
>>>>> I finished the initial cut of
>>>>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>>>>> component in Camel.
>>>>>
>>>>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>>>>
>>>>> Would it be possible to backport the IoBuffer reading and writing
>>>>> discussed
>>>>> in this email thread from Mina3 to Mina2?
>>>>> Following the depth of the stack trace through
>>>>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>>>>> My current code (mina-less) is supporting single TCP channels with 320+
>>>>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>>>>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>>>>> CAMEL-2624 and setup some throughput tests.
>>>>>
>>>> Another option would be to port this to MINA3, and make it work with
>>>> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
>>>> with Http and LDAP. I'd rather spend some time making it a bit more solid
>>>> and working well in your case instead of trying to inject the code in MINA2.
>>>>
>>>> Now, it's your call. We can discuss the pros and cons of both approach if
>>>> you like.
>>>>
>>>>
>>> Hi Emmanuel,
>>>
>>> One of my pros/cons trade-offs is time-to-market. I can have a solution in
>>> Camel with Mina2 fairly quickly. Although I might have issues with high
>>> data rate streams.
>>> With that said, my approach would be the following:
>>> 1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
>>> endpoints I need and a quick time-to-market. I'll put off issues concerning
>>> high throughput.
>>> 2) Work on Mina3 to ensure it has low latency with small data rate streams
>>> and high throughput with large data pipes.
>>> 3) Upgrade my Google protocol buffers codec for Mina3.
>>> 4) When Mina3 is ready, open a new Camel ticket and create a new mina3
>>> Camel Component.
>>>
>>> What do you think?
>> I'll try to squeeze 2 hours to backport the patch to MINA 2 today or tomorrow.
>>
>> Feel free to ping me on mail or on #mina if I don't send any feedback in the next 2 days (I'm pretty busy and may slip)
>>
>>
>> -- 
>> Regards,
>> Cordialement,
>> Emmanuel Lécharny
>> www.iktek.com
>>
> Wow. That is nice! Look forward to checking it out. I'll move forward with my plan in the meantime.
>
> Chad
> Sent from my iPhone
>


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.

On Jan 17, 2012, at 9:32 AM, Emmanuel Lécharny <el...@apache.org> wrote:

> On 1/17/12 3:02 PM, Chad Beaulac wrote:
>> On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny<el...@gmail.com>wrote:
>> 
>>> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>>> 
>>>> Emmanuel, (all)
>>>> 
>>>> I'm working on this Camel ticket:
>>>> https://issues.apache.org/jira/browse/CAMEL-2624
>>>> 
>>>> I finished the initial cut of
>>>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>>>> component in Camel.
>>>> 
>>>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>>> 
>>>> Would it be possible to backport the IoBuffer reading and writing
>>>> discussed
>>>> in this email thread from Mina3 to Mina2?
>>>> Following the depth of the stack trace through
>>>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>>>> My current code (mina-less) is supporting single TCP channels with 320+
>>>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>>>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>>>> CAMEL-2624 and setup some throughput tests.
>>>> 
>>> Another option would be to port this to MINA3, and make it work with
>>> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
>>> with Http and LDAP. I'd rather spend some time making it a bit more solid
>>> and working well in your case instead of trying to inject the code in MINA2.
>>> 
>>> Now, it's your call. We can discuss the pros and cons of both approach if
>>> you like.
>>> 
>>> 
>> Hi Emmanuel,
>> 
>> One of my pros/cons trade-offs is time-to-market. I can have a solution in
>> Camel with Mina2 fairly quickly. Although I might have issues with high
>> data rate streams.
>> With that said, my approach would be the following:
>> 1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
>> endpoints I need and a quick time-to-market. I'll put off issues concerning
>> high throughput.
>> 2) Work on Mina3 to ensure it has low latency with small data rate streams
>> and high throughput with large data pipes.
>> 3) Upgrade my Google protocol buffers codec for Mina3.
>> 4) When Mina3 is ready, open a new Camel ticket and create a new mina3
>> Camel Component.
>> 
>> What do you think?
> 
> I'll try to squeeze 2 hours to backport the patch to MINA 2 today or tomorrow.
> 
> Feel free to ping me on mail or on #mina if I don't send any feedback in the next 2 days (I'm pretty busy and may slip)
> 
> 
> -- 
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
> 
Wow. That is nice! Look forward to checking it out. I'll move forward with my plan in the meantime. 

Chad
Sent from my iPhone

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lécharny <el...@apache.org>.
On 1/17/12 3:02 PM, Chad Beaulac wrote:
> On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny<el...@gmail.com>wrote:
>
>> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>>
>>> Emmanuel, (all)
>>>
>>> I'm working on this Camel ticket:
>>> https://issues.apache.org/jira/browse/CAMEL-2624
>>>
>>> I finished the initial cut of
>>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>>> component in Camel.
>>>
>>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>>
>>> Would it be possible to backport the IoBuffer reading and writing
>>> discussed
>>> in this email thread from Mina3 to Mina2?
>>> Following the depth of the stack trace through
>>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>>> My current code (mina-less) is supporting single TCP channels with 320+
>>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>>> CAMEL-2624 and setup some throughput tests.
>>>
>> Another option would be to port this to MINA3, and make it work with
>> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
>> with Http and LDAP. I'd rather spend some time making it a bit more solid
>> and working well in your case instead of trying to inject the code in MINA2.
>>
>> Now, it's your call. We can discuss the pros and cons of both approach if
>> you like.
>>
>>
> Hi Emmanuel,
>
> One of my pros/cons trade-offs is time-to-market. I can have a solution in
> Camel with Mina2 fairly quickly. Although I might have issues with high
> data rate streams.
> With that said, my approach would be the following:
> 1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
> endpoints I need and a quick time-to-market. I'll put off issues concerning
> high throughput.
> 2) Work on Mina3 to ensure it has low latency with small data rate streams
> and high throughput with large data pipes.
> 3) Upgrade my Google protocol buffers codec for Mina3.
> 4) When Mina3 is ready, open a new Camel ticket and create a new mina3
> Camel Component.
>
> What do you think?

I'll try to squeeze 2 hours to backport the patch to MINA 2 today or 
tomorrow.

Feel free to ping me on mail or on #mina if I don't send any feedback in 
the next 2 days (I'm pretty busy and may slip)


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
On Mon, Jan 16, 2012 at 1:10 PM, Emmanuel Lecharny <el...@gmail.com>wrote:

> On 1/16/12 2:56 PM, Chad Beaulac wrote:
>
>> Emmanuel, (all)
>>
>> I'm working on this Camel ticket:
>> https://issues.apache.org/jira/browse/CAMEL-2624
>>
>> I finished the initial cut of
>> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
>> component in Camel.
>>
>> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>>
>> Would it be possible to backport the IoBuffer reading and writing
>> discussed
>> in this email thread from Mina3 to Mina2?
>> Following the depth of the stack trace through
>> AbstractIoSession.write(...), I'm a little concerned about the throughput.
>> My current code (mina-less) is supporting single TCP channels with 320+
>> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
>> Buffers codec I wrote. I know if this is a real problem soon when I finish
>> CAMEL-2624 and setup some throughput tests.
>>
>
> Another option would be to port this to MINA3, and make it work with
> Camel. Right now, MINA 3 is pretty rough, but we have made it works well
> with Http and LDAP. I'd rather spend some time making it a bit more solid
> and working well in your case instead of trying to inject the code in MINA2.
>
> Now, it's your call. We can discuss the pros and cons of both approach if
> you like.
>
>
Hi Emmanuel,

One of my pros/cons trade-offs is time-to-market. I can have a solution in
Camel with Mina2 fairly quickly. Although I might have issues with high
data rate streams.
With that said, my approach would be the following:
1) Finish CAMEL-2624 with Mina2. This will give me the asynchronous
endpoints I need and a quick time-to-market. I'll put off issues concerning
high throughput.
2) Work on Mina3 to ensure it has low latency with small data rate streams
and high throughput with large data pipes.
3) Upgrade my Google protocol buffers codec for Mina3.
4) When Mina3 is ready, open a new Camel ticket and create a new mina3
Camel Component.

What do you think?

Regards,
Chad
www.objectivesolutions.com


>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
On 1/16/12 2:56 PM, Chad Beaulac wrote:
> Emmanuel, (all)
>
> I'm working on this Camel ticket:
> https://issues.apache.org/jira/browse/CAMEL-2624
>
> I finished the initial cut of
> https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
> component in Camel.
>
> CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.
>
> Would it be possible to backport the IoBuffer reading and writing discussed
> in this email thread from Mina3 to Mina2?
> Following the depth of the stack trace through
> AbstractIoSession.write(...), I'm a little concerned about the throughput.
> My current code (mina-less) is supporting single TCP channels with 320+
> Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
> Buffers codec I wrote. I know if this is a real problem soon when I finish
> CAMEL-2624 and setup some throughput tests.

Another option would be to port this to MINA3, and make it work with 
Camel. Right now, MINA 3 is pretty rough, but we have made it work well 
with Http and LDAP. I'd rather spend some time making it a bit more 
solid and working well in your case instead of trying to inject the code 
in MINA2.

Now, it's your call. We can discuss the pros and cons of both approaches 
if you like.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Emmanuel, (all)

I'm working on this Camel ticket:
https://issues.apache.org/jira/browse/CAMEL-2624

I finished the initial cut of
https://issues.apache.org/jira/browse/CAMEL-3471 to create a mina2
component in Camel.

CAMEL-2624 adds the full async behavior for Mina2 endpoints in Camel.

Would it be possible to backport the IoBuffer reading and writing discussed
in this email thread from Mina3 to Mina2?
Following the depth of the stack trace through
AbstractIoSession.write(...), I'm a little concerned about the throughput.
My current code (mina-less) is supporting single TCP channels with 320+
Mb/s rates. I'm porting my code to use Mina with a Mina Google Protocol
Buffers codec I wrote. I'll know if this is a real problem soon when I finish
CAMEL-2624 and set up some throughput tests.

Regards,
Chad


On Tue, Dec 6, 2011 at 1:27 AM, Chad Beaulac <ca...@gmail.com> wrote:

> Looks good Emmanuel.
>
> Sent from my iPhone
>
> On Dec 5, 2011, at 10:13 AM, Emmanuel Lecharny <el...@gmail.com>
> wrote:
>
> > On 12/5/11 3:50 PM, Julien Vermillard wrote:
> >> since it's a ConcurrentLinkedQueue it could be a perf killer to do a
> .size() from the oracle javadoc : ""Beware that, unlike in most
> collections, the size method is NOT a constant-time operation. Because of
> the asynchronous nature of these queues, determining the current number of
> elements requires a traversal of the elements.""
> > Damn right...
> >
> > What about using a read/write lock instead of a synchronized block ? The
> problem with the synchornized block on queue is that we must still protect
> the queue when it's being written with another synchronized block, when if
> we use a read/write lock, we can allow parallel writes in the queue, but
> once it comes to write in the channel, we acquire a write lock and the
> queue is now protected. Something like :
> >
> > private final ReadWriteLock lock = new ReentrantReadWriteLock();
> > private final Lock queueReadLock = lock.readLock();
> > private final Lock queueWriteLock= lock.writeLock();
> > ...
> > try {
> > queueWriteLock.lock();
> >
> >    do {
> >        WriteRequest wreq = queue.peek();
> >
> >        if (wreq == null) {
> >            break;
> >        }
> >        ...
> >    } while (!queue.isEmpty());
> > } finally {
> >    queueWriteLock.unlock();
> > }
> >
> > ...
> >
> >    public WriteRequest enqueueWriteRequest(Object message) {
> >        DefaultWriteRequest request = new DefaultWriteRequest(message);
> >
> >        try {
> > queueReadLock().lock()
> >            writeQueue.add(request);
> >        } finally {
> > queueReadLock.unlock();
> >        }
> >    }
> >
> > -- Regards, Cordialement, Emmanuel Lécharny www.iktek.com
>

Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Looks good Emmanuel. 

Sent from my iPhone

On Dec 5, 2011, at 10:13 AM, Emmanuel Lecharny <el...@gmail.com> wrote:

> On 12/5/11 3:50 PM, Julien Vermillard wrote:
>> since it's a ConcurrentLinkedQueue it could be a perf killer to do a .size() from the oracle javadoc : ""Beware that, unlike in most collections, the size method is NOT a constant-time operation. Because of the asynchronous nature of these queues, determining the current number of elements requires a traversal of the elements.""
> Damn right...
> 
> What about using a read/write lock instead of a synchronized block ? The problem with the synchornized block on queue is that we must still protect the queue when it's being written with another synchronized block, when if we use a read/write lock, we can allow parallel writes in the queue, but once it comes to write in the channel, we acquire a write lock and the queue is now protected. Something like :
> 
> private final ReadWriteLock lock = new ReentrantReadWriteLock();
> private final Lock queueReadLock = lock.readLock();
> private final Lock queueWriteLock= lock.writeLock();
> ...
> try {
> queueWriteLock.lock();
> 
>    do {
>        WriteRequest wreq = queue.peek();
> 
>        if (wreq == null) {
>            break;
>        }
>        ...
>    } while (!queue.isEmpty());
> } finally {
>    queueWriteLock.unlock();
> }
> 
> ...
> 
>    public WriteRequest enqueueWriteRequest(Object message) {
>        DefaultWriteRequest request = new DefaultWriteRequest(message);
> 
>        try {
> queueReadLock().lock()
>            writeQueue.add(request);
>        } finally {
> queueReadLock.unlock();
>        }
>    }
> 
> -- Regards, Cordialement, Emmanuel Lécharny www.iktek.com

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
On 12/5/11 3:50 PM, Julien Vermillard wrote:
> since it's a ConcurrentLinkedQueue it could be a perf killer to do a 
> .size() from the oracle javadoc : ""Beware that, unlike in most 
> collections, the size method is NOT a constant-time operation. Because 
> of the asynchronous nature of these queues, determining the current 
> number of elements requires a traversal of the elements.""
Damn right...

What about using a read/write lock instead of a synchronized block ? The 
problem with the synchronized block on the queue is that we must still 
protect the queue when it's being written with another synchronized 
block, whereas if we use a read/write lock, we can allow parallel writes 
in the queue, but once it comes to write in the channel, we acquire a 
write lock and the queue is now protected. Something like :

private final ReadWriteLock lock = new ReentrantReadWriteLock();
private final Lock queueReadLock = lock.readLock();
private final Lock queueWriteLock = lock.writeLock();
...
try {
    queueWriteLock.lock();

    do {
        WriteRequest wreq = queue.peek();

        if (wreq == null) {
            break;
        }
        ...
    } while (!queue.isEmpty());
} finally {
    queueWriteLock.unlock();
}

...

    public WriteRequest enqueueWriteRequest(Object message) {
        DefaultWriteRequest request = new DefaultWriteRequest(message);

        try {
            queueReadLock.lock();
            writeQueue.add(request);
        } finally {
            queueReadLock.unlock();
        }

        return request;
    }

-- Regards, Cordialement, Emmanuel Lécharny www.iktek.com

Re: [MINA 3.0] IoBuffer usage

Posted by Julien Vermillard <jv...@gmail.com>.
Since it's a ConcurrentLinkedQueue, it could be a perf killer to do a .size().
From the Oracle javadoc :

""Beware that, unlike in most collections, the size method is NOT a
constant-time operation. Because of the asynchronous nature of these
queues, determining the current number of elements requires a
traversal of the elements.""

Julien

On Mon, Dec 5, 2011 at 3:20 PM, Chad Beaulac <ca...@gmail.com> wrote:
> Excellent. I like the counter in the write algorithm.
>
> Sent from my iPhone
>
> On Dec 5, 2011, at 8:50 AM, Emmanuel Lecharny <el...@gmail.com> wrote:
>
>> On 12/5/11 2:39 PM, Chad Beaulac wrote:
>>> Looks perfect except one thing. Don't allow clients to put WriteRequest's
>>> into the queue while you're draining it. If you allow other threads to
>>> enqueue more WriteRequest's, the algorithm is unbounded and you risk
>>> getting stuck writing for only one channel. I added a synchronized block
>>> below.
>>
>> We can avoid the synchronized section by getting the queue size at the beginning :
>>
>> Queue<WriteRequest>  queue = session.getWriteQueue();
>>
>> int size = queue.size();
>>
>> while (size > 0) {
>>    ... // process each entry in the queue
>> size--;
>> }
>>
>> But I don't even think it's necessary : a session can only be processed by one single thread, which is the one which process the write (the selector's thread).
>>
>> Of course, if we add an executor somewhere in the chain, this is a different story, but then, we will need to add some synchronization at the executor level, not on the selector level, IMO.
>>
>>
>>
>> --
>> Regards,
>> Cordialement,
>> Emmanuel Lécharny
>> www.iktek.com
>>

Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Excellent. I like the counter in the write algorithm. 

Sent from my iPhone

On Dec 5, 2011, at 8:50 AM, Emmanuel Lecharny <el...@gmail.com> wrote:

> On 12/5/11 2:39 PM, Chad Beaulac wrote:
>> Looks perfect except one thing. Don't allow clients to put WriteRequest's
>> into the queue while you're draining it. If you allow other threads to
>> enqueue more WriteRequest's, the algorithm is unbounded and you risk
>> getting stuck writing for only one channel. I added a synchronized block
>> below.
> 
> We can avoid the synchronized section by getting the queue size at the beginning :
> 
> Queue<WriteRequest>  queue = session.getWriteQueue();
> 
> int size = queue.size();
> 
> while (size > 0) {
>    ... // process each entry in the queue
> size--;
> }
> 
> But I don't even think it's necessary : a session can only be processed by one single thread, which is the one which process the write (the selector's thread).
> 
> Of course, if we add an executor somewhere in the chain, this is a different story, but then, we will need to add some synchronization at the executor level, not on the selector level, IMO.
> 
> 
> 
> -- 
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
> 

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
On 12/5/11 2:39 PM, Chad Beaulac wrote:
> Looks perfect except one thing. Don't allow clients to put WriteRequest's
> into the queue while you're draining it. If you allow other threads to
> enqueue more WriteRequest's, the algorithm is unbounded and you risk
> getting stuck writing for only one channel. I added a synchronized block
> below.

We can avoid the synchronized section by getting the queue size at the 
beginning :

Queue<WriteRequest> queue = session.getWriteQueue();

int size = queue.size();

while (size > 0) {
    ... // process each entry in the queue
    size--;
}

But I don't even think it's necessary : a session can only be processed 
by one single thread, which is the one which processes the write (the 
selector's thread).

Of course, if we add an executor somewhere in the chain, this is a 
different story, but then, we will need to add some synchronization at 
the executor level, not on the selector level, IMO.



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Looks perfect except one thing. Don't allow clients to put WriteRequests
into the queue while you're draining it. If you allow other threads to
enqueue more WriteRequests, the algorithm is unbounded and you risk
getting stuck writing for only one channel. I added a synchronized block
below.

On Mon, Dec 5, 2011 at 4:07 AM, Julien Vermillard <jv...@gmail.com>wrote:

> Hi,
> snipped a lot of the message :)
> >> So, do you mean that the underlying layer will not allow us to push say,
> >> 20M, without informing the session that it's full ? In other word, there
> >> is a limited size that can be pushed and we don't have to take care of
> >> this limit ourselves ?
> >
> >
> >>
> > Sort of. If the TCP send window (OS layer) has less room in it than the
> > outputBuffer.remaining(), the write will only write a portion of
> > outputBufffer. Consider this the CONGESTION_CONTROLLED state. If the TCP
> > send window is full when you try to write, the write will return 0. The
> > algorithm should never see this case because you should always stop
> trying
> > to write when only a portion of the outputBuffer is written. And, always
> > continue to try and write when an entire outputBuffer is written and
> there
> > are more outputBuffers to write in the output queue.
> >
>
> Here the write algorithm used in trunk (3.0), we give up writing if
> the buffer is not written totally because we consider the kernel
> buffer is full or congested :
>
> Queue<WriteRequest> queue = session.getWriteQueue();
>
> do {
>    // get a write request from the queue
>
      synchronized (queue) {

>    WriteRequest wreq = queue.peek();
>    if (wreq == null) {
>        break;
>    }
>    ByteBuffer buf = (ByteBuffer) wreq.getMessage();
>
>
>    int wrote = session.getSocketChannel().write(buf);
>    if (LOGGER.isDebugEnabled()) {
>        LOGGER.debug("wrote {} bytes to {}", wrote, session);
>    }
>
>    if (buf.remaining() == 0) {
>        // completed write request, let's remove
>        // it
>        queue.remove();
>        // complete the future
>        DefaultWriteFuture future = (DefaultWriteFuture) wreq.getFuture();
>        if (future != null) {
>            future.complete();
>        }
>    } else {
>        // output socket buffer is full, we need
>        // to give up until next selection for
>        // writing
>        break;
>    }
>
>     } // end synchronized(queue)


> } while (!queue.isEmpty());
>

Re: Re: [MINA 3.0] IoBuffer usage

Posted by Julien Vermillard <jv...@gmail.com>.
Hi,
snipped a lot of the message :)
>> So, do you mean that the underlying layer will not allow us to push say,
>> 20M, without informing the session that it's full ? In other word, there
>> is a limited size that can be pushed and we don't have to take care of
>> this limit ourselves ?
>
>
>>
> Sort of. If the TCP send window (OS layer) has less room in it than the
> outputBuffer.remaining(), the write will only write a portion of
> outputBufffer. Consider this the CONGESTION_CONTROLLED state. If the TCP
> send window is full when you try to write, the write will return 0. The
> algorithm should never see this case because you should always stop trying
> to write when only a portion of the outputBuffer is written. And, always
> continue to try and write when an entire outputBuffer is written and there
> are more outputBuffers to write in the output queue.
>

Here is the write algorithm used in trunk (3.0) : we give up writing if
the buffer is not written totally, because we consider the kernel
buffer full or congested :

Queue<WriteRequest> queue = session.getWriteQueue();

do {
    // get a write request from the queue
    WriteRequest wreq = queue.peek();
    if (wreq == null) {
        break;
    }
    ByteBuffer buf = (ByteBuffer) wreq.getMessage();


    int wrote = session.getSocketChannel().write(buf);
    if (LOGGER.isDebugEnabled()) {
        LOGGER.debug("wrote {} bytes to {}", wrote, session);
    }

    if (buf.remaining() == 0) {
        // completed write request, let's remove
        // it
        queue.remove();
        // complete the future
        DefaultWriteFuture future = (DefaultWriteFuture) wreq.getFuture();
        if (future != null) {
            future.complete();
        }
    } else {
        // output socket buffer is full, we need
        // to give up until next selection for
        // writing
        break;
    }

} while (!queue.isEmpty());

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lécharny <el...@apache.org>.
On 12/4/11 2:34 PM, Chad Beaulac wrote:

Hi Chad,




> Agreed :-)
> Do you have a Git repo setup for Mina3? I'll help you write some of it if
> you like. With unit tests, of course. ;-)

http://git.apache.org/

This is a read-only Git repository. Internally, we are -still- using SVN (but 
we are discussing moving to Git in 2012).
Your participation would be *very* welcomed !

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
On Sun, Dec 4, 2011 at 8:04 AM, Emmanuel Lecharny <el...@gmail.com>wrote:

> Posted on the wrong mailing list... Forwarding there.
>
> Hi Chad,
>
>
>
> On 12/4/11 1:25 AM, Chad Beaulac wrote:
>
>>  A single algorithm
>>
>>>  can handle large data pipes and provide extremely low latency for
>>>>  variable,
>>>>  small and large message sizes at the same time.
>>>>
>>>>   AFAIU, it' snot because you use a big buffer that you will put some
>>> strain
>>>  when dealing with small messages : the buffer will only contain a few
>>>  useful bytes, and that's it. In any case, this buffer won't be allocated
>>>  everytime we read from the channel, so it's just a container. But it's
>>> way
>>>  better to have a big buffer when dealing with big messages, because then
>>>  you'll have less roundtrips between the read and the processing. But the
>>>  condition, as you said, is that you don't read the channel until there
>>> is
>>>  no more bytes to read. You just read *once* get what you get, and go
>>> fetch
>>>  the processing part of your application with these bytes.
>>>
>>>  The write has exactly the same kind of issue, as you said : don't pound
>>>  the channel, let other channel the opportunity to be written too...
>>>
>>>
>>>   The write has the same sort of issue but it can be handled more
>> optimally
>>  in a different manner. The use case is slightly different because it's
>> the
>>  client producer code driving the algorithm instead the Selector.
>>  Producer Side
>>  - Use a queue of ByteBuffers as a send queue.
>>  - When send is possible for the selector, block on the queue, loop over
>> the
>>  output queue and send until SocketChannel.send(ByteBuffer src)
>>  (returnVal
>>  <   src.remaining || returnVal == 0) or you catch exception.
>>  - This is a fair algorithm when dealing with multiple selectors because
>> the
>>  amount of time the sending thread will spend inside the "send" method is
>>  bounded by how much data is in the ouputQueue and nothing can put data
>> into
>>  the queue while draining the queue to send data out.
>>
>
> Right, but there are some cases where, for one session, there is a lot to
> write while the other sessions are waiting, as the thread is busy
> flushing all its data. This is why I proposed to chunk the writes in
> small chunks (well, small does not mean 1kb here).
>
>
This won't work when you have channels with very large data pipes and
channels with small data pipes in the same selector. It will end up being
inefficient for the large data pipe channel.  Chunking the writes is
unnecessary and will consume extra resources.
Yes, the other sessions will be waiting when you're writing for one
channel. This is true for the entire algorithm.
An example of the fairness of the algorithm is as follows:
Consider a selector with two channels in it that you're writing to.
Channel-1 is a 300Mb/second stream.
Channel-2 is a 2Mb/second stream.
To be fair, the system will need to spend a lot more time writing data for
channel-1. Chunking the data creates overhead at the TCP layer that is best
avoided. Let the TCP layer figure out how it wants to segment TCP packets.
If you have 40MB to write, just call channel1.write(outputBuffer). It is ok
that output for channel-2 is waiting while you're writing for channel-1.
Either the call to write will immediately work and all data will be
written, only a portion of it will be written, or some error will occur
because the socket is closed or something. In the first case, you'll look
in the queue for more output, which is synchronized so nobody can put more
data into it while this write is occurring.
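
A rough sketch of that producer-side drain, as described above (hypothetical
names ; the queue is synchronized so nothing can be enqueued while draining) :

    private void flushOutputQueue(SocketChannel channel, Queue<ByteBuffer> outputQueue)
            throws IOException {
        synchronized (outputQueue) {
            while (!outputQueue.isEmpty()) {
                ByteBuffer outputBuffer = outputQueue.peek();
                channel.write(outputBuffer);

                if (outputBuffer.hasRemaining()) {
                    // partial write : the TCP send window is full, give up until
                    // the selector reports the channel writable again
                    break;
                }

                // the whole buffer went out, move on to the next one
                outputQueue.remove();
            }
        }
    }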


> If we have more than one selector, it's still the same issue, as a
> session will always use the same selector.
>
>
Not sure why you'll need more than one selector.


>
>

>>  Consumer Side
>>  - Use a ByteBuffer(64k) as a container to receive data into
>>  - Only call SocketChannel.read(inputBuffer) once for the channel
>> that's
>>  ready to read.
>>  - Create a new ByteBuffer for the size read. Copy the inputBuffer into
>>  the new ByteBuffer. Give the new ByteBuffer to the session to process.
>>
> Not sure we want to copy the ByteBuffer. It coud be an option, but if we
> can save us this copy, that would be cool.
>
>   Rewind the input ByteBuffer. An alternative to creating a new ByteBuffer
>>  every time for the size read is allow client code to specify a custom
>>  ByteBuffer factory. This allows client code to pre-allocate memory and
>>  create a ring buffer or something like that.
>>
>>  I use these algorithms in C++ (using ACE - Adaptive Communications
>>  Environment) and Java. The algorithm is basically the same in C++ and
>> Java
>>  and handles protocols with a lot of small messages, variable message size
>>  protocols and large data block sizes.
>>
> I bet it's pretty much the same kind of algorithm, ACE and MINA are based
> on the same logic.
>
> Thanks for your input. I guess I have to put it down somewhere so that
> we have a clear algorithm described before starting implementing anything !
>
>
Agreed :-)
Do you have a Git repo setup for Mina3? I'll help you write some of it if
you like. With unit tests, of course. ;-)


>
>
>>
>>
>>   On the Producer side:
>>>>  Application code should determine the block sizes that are pushed onto
>>>> the
>>>>  output queue. Logic would be as previously stated:
>>>>  - write until there's nothing left to write, unregister for the write
>>>>  event, return to event processing
>>>>
>>>>   This is what we do. I'm afraid that it may be a bit annoying for the
>>> other
>>>  sessions, waiting to send data. At some point, it could be better to
>>> write
>>>  only a limited number of bytes, then give back control to the selector,
>>> and
>>>  be awakened when the selector sets the OP_WRITE flag again (which will be
>>>  during the next loop anyway, or maybe another one later).
>>>
>>>   - write until the channel is congestion controlled, stay registered
>>>
>>>>  for
>>>>  write event, return to event processing
>>>>
>>>>   And what about a third option : write until the buffer we have
>>> prepared is
>>>  empty, even if the channel is not full ? That means even if the producer
>>> has
>>>  prepared a -say- 1Mb block of data to write, it will be written in 16
>>>  blocks of 64Kb, even if the channel can absorb more.
>>>
>>>  Does it make sense ?
>>>
>>>
>>>   No. Doesn't make sense to me. Let the TCP layer handle optimizing how
>> large
>>  chunks of data are handled. If the client puts a ByteBuffer of 1MB or 20MB
>>  or whatever onto the outputQueue, call
>>  SocketChannel.write(outputByteBuffer). Don't chunk it up.
>>
> But then, while we push all that data into the channel, we may have the
> other sessions waiting until it's done (unless the channel is full, and
> we can switch to the next session).
>
> Correct and that is what you want.


> So, do you mean that the underlying layer will not allow us to push say,
> 20M, without informing the session that it's full ? In other words, there
> is a limited size that can be pushed and we don't have to take care of
> this limit ourselves ?


>
Sort of. If the TCP send window (OS layer) has less room in it than
outputBuffer.remaining(), the write will only write a portion of the
outputBuffer. Consider this the CONGESTION_CONTROLLED state. If the TCP
send window is full when you try to write, the write will return 0. The
algorithm should never see this case, because you should always stop trying
to write when only a portion of the outputBuffer is written, and always
continue writing when an entire outputBuffer is written and there
are more outputBuffers to write in the output queue.
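
Expressed as a small sketch in plain NIO (the WriteState name is purely
illustrative, not MINA API), the outcome of a single write tells you which
state you are in:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class WriteState {
    // Returns true when the channel is "congestion controlled": the TCP send
    // window could not take the whole buffer (the write was partial, or even
    // wrote 0 bytes), so we must wait for the next OP_WRITE event.
    static boolean writeOnce(SocketChannel channel, ByteBuffer outputBuffer) throws IOException {
        channel.write(outputBuffer);

        // Everything fit into the send window: keep draining the output queue.
        if (!outputBuffer.hasRemaining()) {
            return false;
        }

        // Partial (or zero-byte) write: stop here and let the selector tell us
        // when the window has room again.
        return true;
    }
}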

>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>
>
>


Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lécharny <el...@apache.org>.
Hi Chad,


On 12/4/11 1:25 AM, Chad Beaulac wrote:
> A single algorithm
>>> can handle large data pipes and provide extremely low latency for
>>> variable,
>>> small and large message sizes at the same time.
>>>
>> AFAIU, it's not because you use a big buffer that you will put some strain
>> when dealing with small messages : the buffer will only contain a few
>> useful bytes, and that's it. In any case, this buffer won't be allocated
>> every time we read from the channel, so it's just a container. But it's way
>> better to have a big buffer when dealing with big messages, because then
>> you'll have fewer roundtrips between the read and the processing. But the
>> condition, as you said, is that you don't read the channel until there are
>> no more bytes to read. You just read *once*, get what you get, and go feed
>> the processing part of your application with these bytes.
>>
>> The write has exactly the same kind of issue, as you said : don't pound
>> the channel, let other channels have the opportunity to be written too...
>>
>>
> The write has the same sort of issue but it can be handled more optimally
> in a different manner. The use case is slightly different because it's the
> client producer code driving the algorithm instead of the Selector.
> Producer Side
> - Use a queue of ByteBuffers as a send queue.
> - When send is possible for the selector, block on the queue, loop over the
> output queue and send until SocketChannel.write(ByteBuffer src)  (returnVal
> <   src.remaining || returnVal == 0) or you catch an exception.
> - This is a fair algorithm when dealing with multiple selectors because the
> amount of time the sending thread will spend inside the "send" method is
> bounded by how much data is in the outputQueue and nothing can put data into
> the queue while draining the queue to send data out.

Right, but there are some cases where, for one session, there is a lot to 
write while the other sessions are waiting, as the thread is busy 
flushing all its data. This is why I proposed to chunk the writes into 
small chunks (well, small does not mean 1kb here).

If we have more than one selector, it's still the same issue, as a 
session will always use the same selector.
>
> Consumer Side
> - Use a ByteBuffer(64k) as a container to receive data into
> - Only call SocketChannel.read(inputBuffer) once for the channel that's
> ready to read.
> - Create a new ByteBuffer for the size read. Copy the inputBuffer into
> the new ByteBuffer. Give the new ByteBuffer to the session to process.
Not sure we want to copy the ByteBuffer. It could be an option, but if we 
can save us this copy, that would be cool.
> Rewind the input ByteBuffer. An alternative to creating a new ByteBuffer
> every time for the size read is to allow client code to specify a custom
> ByteBuffer factory. This allows client code to pre-allocate memory and
> create a ring buffer or something like that.
>
> I use these algorithms in C++ (using ACE - Adaptive Communications
> Environment) and Java. The algorithm is basically the same in C++ and Java
> and handles protocols with a lot of small messages, variable message size
> protocols and large data block sizes.
I bet it's pretty much the same kind of algorithm, ACE and MINA are based 
on the same logic.

Thanks for your input. I guess I have to put it down somewhere so that 
we have a clear algorithm described before starting to implement anything !

>
>
>
>>> On the Producer side:
>>> Application code should determine the block sizes that are pushed onto the
>>> output queue. Logic would be as previously stated:
>>> - write until there's nothing left to write, unregister for the write
>>> event, return to event processing
>>>
>> This is what we do. I'm afraid that it may be a bit annoying for the other
>> sessions, waiting to send data. At some point, it could be better to write
>> only a limited number of bytes, then give back control to the selector, and
>> be awakened when the selector sets the OP_WRITE flag again (which will be
>> during the next loop anyway, or maybe another one later).
>>
>>   - write until the channel is congestion controlled, stay registered
>>> for
>>> write event, return to event processing
>>>
>> And what about a third option : write until the buffer we have prepared is
>> empty, even if the channel is not full ? That means even if the producer has
>> prepared a -say- 1Mb block of data to write, it will be written in 16
>> blocks of 64Kb, even if the channel can absorb more.
>>
>> Does it make sense ?
>>
>>
> No. Doesn't make sense to me. Let the TCP layer handle optimizing how large
> chunks of data are handled. If the client puts a ByteBuffer of 1MB or 20MB
> or whatever onto the outputQueue, call
> SocketChannel.write(outputByteBuffer). Don't chunk it up.
But then, while we push all that data into the channel, we may have the 
other sessions waiting until it's done (unless the channel is full, and 
we can switch to the next session).

So, do you mean that the underlying layer will not allow us to push say, 
20M, without informing the session that it's full ? In other words, there 
is a limited size that can be pushed and we don't have to take care of 
this limit ourselves ?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
On Fri, Dec 2, 2011 at 10:19 AM, Emmanuel Lécharny <el...@apache.org>wrote:

> On 12/2/11 2:02 PM, Chad Beaulac wrote:
>
>> I think I'm not communicating my thoughts well enough.
>>
> Well, I hope I have understood what you said, at least :)
>
>
>   A single algorithm
>> can handle large data pipes and provide extremely low latency for
>> variable,
>> small and large message sizes at the same time.
>>
> AFAIU, it's not because you use a big buffer that you will put some strain
> when dealing with small messages : the buffer will only contain a few
> useful bytes, and that's it. In any case, this buffer won't be allocated
> every time we read from the channel, so it's just a container. But it's way
> better to have a big buffer when dealing with big messages, because then
> you'll have fewer roundtrips between the read and the processing. But the
> condition, as you said, is that you don't read the channel until there are
> no more bytes to read. You just read *once*, get what you get, and go feed
> the processing part of your application with these bytes.
>
> The write has exactly the same kind of issue, as you said : don't pound
> the channel, let other channels have the opportunity to be written too...
>
>
The write has the same sort of issue but it can be handled more optimally
in a different manner. The use case is slightly different because it's the
client producer code driving the algorithm instead of the Selector.
Producer Side
- Use a queue of ByteBuffers as a send queue.
- When send is possible for the selector, block on the queue, loop over the
output queue and send until SocketChannel.write(ByteBuffer src)  (returnVal
< src.remaining || returnVal == 0) or you catch an exception.
- This is a fair algorithm when dealing with multiple selectors because the
amount of time the sending thread will spend inside the "send" method is
bounded by how much data is in the outputQueue and nothing can put data into
the queue while draining the queue to send data out.

Consumer Side
- Use a ByteBuffer(64k) as a container to receive data into
- Only call SocketChannel.read(inputBuffer) once for the channel that's
ready to read.
- Create a new ByteBuffer for the size read. Copy the inputBuffer into
the new ByteBuffer. Give the new ByteBuffer to the session to process.
Rewind the input ByteBuffer. An alternative to creating a new ByteBuffer
every time for the size read is to allow client code to specify a custom
ByteBuffer factory. This allows client code to pre-allocate memory and
create a ring buffer or something like that.
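
A minimal sketch of that consumer side (a single read into a reused 64k
container, then a copy into a right-sized buffer obtained from a pluggable
factory) might look like this; the Receiver and BufferFactory names are
illustrative assumptions, not existing MINA classes:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

class Receiver {
    // One pre-allocated 64k container, reused for every read on this selector.
    private final ByteBuffer inputBuffer = ByteBuffer.allocate(64 * 1024);

    // Pluggable allocation strategy, so client code can pre-allocate memory
    // (ring buffer, pool, ...) instead of creating a new ByteBuffer per read.
    interface BufferFactory {
        ByteBuffer acquire(int size);
    }

    // Called when the selector says the channel is readable: read *once*,
    // copy the bytes into a right-sized buffer and hand it to the session.
    ByteBuffer readOnce(SocketChannel channel, BufferFactory factory) throws IOException {
        inputBuffer.clear();
        int readCount = channel.read(inputBuffer);

        if (readCount <= 0) {
            // 0 : nothing to read; -1 : the remote peer closed the session.
            return null;
        }

        inputBuffer.flip();
        ByteBuffer message = factory.acquire(readCount);
        message.put(inputBuffer);
        message.flip();

        return message;
    }
}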

I use these algorithms in C++ (using ACE - Adaptive Communications
Environment) and Java. The algorithm is basically the same in C++ and Java
and handles protocols with a lot of small messages, variable message size
protocols and large data block sizes.



>
>> On the Producer side:
>> Application code should determine the block sizes that are pushed onto the
>> output queue. Logic would be as previously stated:
>> - write until there's nothing left to write, unregister for the write
>> event, return to event processing
>>
> This is what we do. I'm afraid that it may be a bit annoying for the other
> sessions, waiting to send data. At some point, it could be better to write
> only a limited number of bytes, then give back control to the selector, and
> be awakened when the selector sets the OP_WRITE flag again (which will be
> during the next loop anyway, or maybe another one later).
>
>  - write until the channel is congestion controlled, stay registered
>> for
>> write event, return to event processing
>>
>
> And what about a third option : write until the buffer we have prepared is
> empty, even if the channel is not full ? That means even if the producer has
> prepared a -say- 1Mb block of data to write, it will be written in 16
> blocks of 64Kb, even if the channel can absorb more.
>
> Does it make sense ?
>
>
No. Doesn't make sense to me. Let the TCP layer handle optimizing how large
chunks of data are handled. If the client puts a ByteBuffer of 1MB or 20MB
or whatever onto the outputQueue, call
SocketChannel.write(outputByteBuffer). Don't chunk it up.



>  This handles very low latency for 1K message blocks and ensures optimal
>> usage of a socket for large data blocks.
>>
>> On the Consumer side:
>> 64K non-blocking read of channel when read selector fires. Don't read
>> until
>> there's nothing left to read. Let the Selector tell you when it's time to
>> read again.
>>
> Read you. Totally agree.
>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lécharny <el...@apache.org>.
On 12/2/11 2:02 PM, Chad Beaulac wrote:
> I think I'm not communicating my thoughts well enough.
Well, I hope I have understood what you said, at least :)

>   A single algorithm
> can handle large data pipes and provide extremely low latency for variable,
> small and large message sizes at the same time.
AFAIU, it's not because you use a big buffer that you will put some 
strain when dealing with small messages : the buffer will only contain a 
few useful bytes, and that's it. In any case, this buffer won't be 
allocated every time we read from the channel, so it's just a container. 
But it's way better to have a big buffer when dealing with big messages, 
because then you'll have fewer roundtrips between the read and the 
processing. But the condition, as you said, is that you don't read the 
channel until there are no more bytes to read. You just read *once*, get 
what you get, and go feed the processing part of your application with 
these bytes.

The write has exactly the same kind of issue, as you said : don't pound 
the channel, let other channels have the opportunity to be written too...
>
> On the Producer side:
> Application code should determine the block sizes that are pushed onto the
> output queue. Logic would be as previously stated:
> - write until there's nothing left to write, unregister for the write
> event, return to event processing
This is what we do. I'm afraid that it may be a bit annoying for the 
other sessions, waiting to send data. At some point, it could be better 
to write only a limited number of bytes, then give back control to the 
selector, and be awakened when the selector sets the OP_WRITE flag again 
(which will be during the next loop anyway, or maybe another one later).
> - write until the channel is congestion controlled, stay registered for
> write event, return to event processing

And what about a third option : write until the buffer we have prepared 
is empty, even if the channel is not full ? That means even if the 
producer has prepared a -say- 1Mb block of data to write, it will be 
written in 16 blocks of 64Kb, even if the channel can absorb more.

Does it make sense ?
> This handles very low latency for 1K message blocks and ensures optimal
> usage of a socket for large data blocks.
>
> On the Consumer side:
> 64K non-blocking read of channel when read selector fires. Don't read until
> there's nothing left to read. Let the Selector tell you when it's time to
> read again.
Read you. Totally agree.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
I think I'm not communicating my thoughts well enough. A single algorithm
can handle large data pipes and provide extremely low latency for variable,
small and large message sizes at the same time.

On the Producer side:
Application code should determine the block sizes that are pushed onto the
output queue. Logic would be as previously stated:
- write until there's nothing left to write, unregister for the write
event, return to event processing
- write until the channel is congestion controlled, stay registered for
write event, return to event processing
This handles very low latency for 1K message blocks and ensures optimal
usage of a socket for large data blocks.

On the Consumer side:
64K non-blocking read of channel when read selector fires. Don't read until
there's nothing left to read. Let the Selector tell you when it's time to
read again.
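
For completeness, a sketch of the selector loop that drives both sides could
look like this (plain NIO; the read and write handlers are only hinted at in
comments, and none of the names are MINA API):

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

class EventLoop {
    // Single loop: the selector decides when a channel is worth reading or
    // writing; we never spin on one channel until it is fully drained.
    void run(Selector selector) throws IOException {
        while (selector.isOpen()) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();

            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();

                if (key.isValid() && key.isReadable()) {
                    // one 64k non-blocking read, then hand the bytes to the session
                }

                if (key.isValid() && key.isWritable()) {
                    // drain the output queue until it is empty or a write is partial
                }
            }
        }
    }
}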





On Thu, Dec 1, 2011 at 11:53 AM, Emmanuel Lecharny <el...@gmail.com>wrote:

> On 12/1/11 5:28 PM, Steve Ulrich wrote:
>
>> Hi (quickly reading),
>>
>> reading everything-you-can-get might starve the application logic.
>> We currently have some "realtime" stuff which must be transferred as
>> quickly as possible, but it's just some bytes (Biggest messages are 1K,
>> smallest about 10 bytes). This logic would increase roundtrip times to
>> numbers where we can shut our servers down.
>>
>
> Yes, Chad pointed out that it was not an option, so I reverted my changes.
>
>
>> In such a setup it would be nice if every 1K ByteBuffer is pushed to the
>> chain, since in most cases it's a full message and waiting any longer just
>> increases roundtrip times.
>> In this case, streaming big data would be very inefficient, so don't
>> expect a simple solution that fits all problems.
>>
>
> Right now, we use one single buffer associated with the selector, and it's
> now set to 64Kb, so it works for streaming big data as well as small ones. We can
> make this size configurable.
>
>
>> Maybe the application/decoder logic should set some hints to the
>> Processor on a per-session basis. This way you could even switch a running
>> session between short reaction time and efficient streaming.
>>
>> A quick and unfinished thought about a hint-class:
>>
>> class DecodingHints {
>>   static DecodingHints MASS_DATA = new DecodingHints(65535, 10)
>>   static DecodingHints NORMAL = new DecodingHints(16384, 10)
>>   static DecodingHints QUICK = new DecodingHints(1024, 1)
>>
>>   DecodingHints(int bufferSize, int maxBufferedBuffersCount){
>> ...
>>   }
>> }
>>
>> Usage:
>>
>> class MyDecoder {
>>   ...
>>   if (isStreamingBegin){
>>     session.setDecodingHints(DecodingHints.MASS_DATA);
>>   } else if (isStreamingEnd) {
>>     session.setDecodingHints(NORMAL);
>>   }
>>   ...
>> }
>>
>
> This is something we can probably implement in the selector's logic, sure.
> We can even let the session define which size best fits its needs,
> starting with a small buffer and increasing it later.
>
> It can even be interesting on a highly loaded server to process small
> chunks of data, in order to allow other sessions to be processed.
>
> A kind of adaptive system...
>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>

Re: [MINA 3.0] IoBuffer usage

Posted by Emmanuel Lecharny <el...@gmail.com>.
On 12/1/11 5:28 PM, Steve Ulrich wrote:
> Hi (quickly reading),
>
> reading everything-you-can-get might starve the application logic.
> We currently have some "realtime" stuff which must be transferred as quickly as possible, but it's just some bytes (Biggest messages are 1K, smallest about 10 bytes). This logic would increase roundtrip times to numbers where we can shut our servers down.

Yes, Chad pointed out that it was not an option, so I reverted my changes.
>
> In such a setup it would be nice if every 1K ByteBuffer is pushed to the chain, since in most cases it's a full message and waiting any longer just increases roundtrip times.
> In this case, streaming big data would be very inefficient, so don't expect a simple solution that fits all problems.

Right now, we use one single buffer associated with the selector, and 
it's now set to 64Kb, so it works for streaming big data as well as small ones. 
We can make this size configurable.
>
> Maybe the application/decoder logic should set some hints to the Processor on a per-session basis. This way you could even switch a running session between short reaction time and efficient streaming.
>
> A quick and unfinished thought about a hint-class:
>
> class DecodingHints {
>    static DecodingHints MASS_DATA = new DecodingHints(65535, 10)
>    static DecodingHints NORMAL = new DecodingHints(16384, 10)
>    static DecodingHints QUICK = new DecodingHints(1024, 1)
>
>    DecodingHints(int bufferSize, int maxBufferedBuffersCount){
> ...
>    }
> }
>
> Usage:
>
> class MyDecoder {
>    ...
>    if (isStreamingBegin){
>      session.setDecodingHints(DecodingHints.MASS_DATA);
>    } else if (isStreamingEnd) {
>      session.setDecodingHints(NORMAL);
>    }
>    ...
> }

This is something we can probably implement in the selector's logic, 
sure. We can even let the session define which size best fits its 
needs, starting with a small buffer and increasing it later.

It can even be interesting on a highly loaded server to process small 
chunks of data, in order to allow other sessions to be processed.

A kind of adaptive system...
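
A minimal sketch of what such an adaptive read buffer could look like (the
class name and the grow/shrink thresholds are invented for illustration, not
an existing MINA mechanism):

import java.nio.ByteBuffer;

class AdaptiveReadBuffer {
    private static final int MIN_SIZE = 1024;
    private static final int MAX_SIZE = 64 * 1024;

    private ByteBuffer buffer = ByteBuffer.allocate(MIN_SIZE);

    ByteBuffer get() {
        return buffer;
    }

    // Call after each read: if the last read filled the buffer completely,
    // the session is probably streaming, so grow; if only a few bytes came
    // in, shrink back towards the small, low-latency size.
    void adjust(int lastReadCount) {
        int capacity = buffer.capacity();

        if (lastReadCount == capacity && capacity < MAX_SIZE) {
            buffer = ByteBuffer.allocate(capacity * 2);
        } else if (lastReadCount < capacity / 4 && capacity > MIN_SIZE) {
            buffer = ByteBuffer.allocate(capacity / 2);
        }
    }
}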


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com


RE: [MINA 3.0] IoBuffer usage

Posted by Steve Ulrich <st...@proemion.com>.
Hi (quickly reading),

reading everything-you-can-get might starve the application logic.
We currently have some "realtime" stuff which must be transferred as quickly as possible, but it's just some bytes (Biggest messages are 1K, smallest about 10 bytes). This logic would increase roundtrip times to numbers where we can shut our servers down.

In such a setup it would be nice if every 1K ByteBuffer is pushed to the chain, since in most cases it's a full message and waiting any longer just increases roundtrip times.
In this case, streaming big data would be very inefficient, so don't expect a simple solution that fits all problems.

Maybe the application/decoder logic should set some hints to the Processor on a per-session basis. This way you could even switch a running session between short reaction time and efficient streaming.

A quick and unfinished thought about a hint-class:

class DecodingHints {
  // read buffer size / max number of buffered buffers per hint
  static final DecodingHints MASS_DATA = new DecodingHints(65535, 10);
  static final DecodingHints NORMAL = new DecodingHints(16384, 10);
  static final DecodingHints QUICK = new DecodingHints(1024, 1);

  DecodingHints(int bufferSize, int maxBufferedBuffersCount){
...
  }
}

Usage:

class MyDecoder {
  ...
  if (isStreamingBegin){
    session.setDecodingHints(DecodingHints.MASS_DATA);
  } else if (isStreamingEnd) {
    session.setDecodingHints(NORMAL);
  }
  ...
}



> Chad Beaulac [mailto:cabeaulac@gmail.com] wrote
>
> Hi Emmanuel,
>
> A 1k ByteBuffer will be too small for large data pipes. Consider using
> 64k
> like you mentioned yesterday.
> Draining the channel before returning control to the program can be
> problematic. This thread can monopolize the CPU and other necessary
> processing could get neglected. The selector will fire again when
> there's
> more data to read. Suggest removing the loop below and using a 64k
> input
> buffer.
>
> Regards,
> Chad
>
>
> On Thu, Dec 1, 2011 at 4:00 AM, Emmanuel Lecharny
> <el...@gmail.com>wrote:
>
[snip]
> >
> > As you can see, instead of reading one buffer, and call the chain, we
> > gather as many data as we can (ie as many as the channel can
> provide), and
> > we call the chain.
> > This has one major advantage : we don't call the chain many times if
> the
> > data is bigger than the buffer size (currently set to 1024 bytes),
> and as a
> > side effect does not require that we define a bigger buffer (not
> really a
> > big deal, we can afford to use a 64kb buffer here, as there is only
> one
> > buffer per selector)
> > The drawback is that we allocate ByteBuffers on the fly. This can be
> > improved by using a pre-allocated buffer (say a 64kb buffer), and if
> we
> > still have something to read, then we allocate some more (this is
> probably
> > what I will change).
> >
> > The rest of the code is not changed widely, except the decoder and
> every
> > filter that expected to receive a ByteBuffer (like the
> LoggingFilter). It's
> > just a matter of casting the Object to IoBuffer, and process the
> data, as
> > the IoBuffer methods are the same as the ByteBuffer (except that
> you
> > can't inject anything but ByteBuffers into an IoBuffer, so no put
> method,
> > for instance).
> >
> > The decoders for Http and Ldap have been changed to deal with the
> > IoBuffer. The big gain here, in the Http case, is that we don't have
> to
> > accumulate the data into a new ByteBuffer : the IoBuffer already
> accumulates
> > data itself.
> >
> > The IoBuffer is stored into the session, which means we can reuse it
> over
> > and over, no need to create a new one. I still have to implement the
> > compact() method which will remove the used ByteBuffers, in order for
> this
> > IoBuffer not to grow out of bounds.
> >
> > thoughts, comments ?
> >
> > Thanks !
> >
> > --
> > Regards,
> > Cordialement,
> > Emmanuel Lécharny
> > www.iktek.com
> >
> >
>



--------------------------------------------------------------------------
PROEMION GmbH

Steve Ulrich

IT Development (IT/DEV)

Donaustrasse 14
D-36043 Fulda, Germany
Phone +49 (0) 661 9490-601
Fax +49 (0) 661 9490-333

http://www.proemion.com

Geschäftsführer: Dipl. Ing. Robert Michaelides
Amtsgericht-Registergericht-Fulda: 5 HRB 1867
--------------------------------------------------------------------------



Re: [MINA 3.0] IoBuffer usage

Posted by Chad Beaulac <ca...@gmail.com>.
Hi Emmanuel,

A 1k ByteBuffer will be too small for large data pipes. Consider using 64k
like you mentioned yesterday.
Draining the channel before returning control to the program can be
problematic. This thread can monopolize the CPU and other necessary
processing could get neglected. The selector will fire again when there's
more data to read. Suggest removing the loop below and using a 64k input
buffer.

Regards,
Chad


On Thu, Dec 1, 2011 at 4:00 AM, Emmanuel Lecharny <el...@gmail.com>wrote:

> Hi guys,
>
> yesterday, I committed some changes that make the NioSelectorProcessor to
> use the IoBuffer class instead of a single buffer to store the incoming
> data. Here is the snippet of changed code :
>
>                                int readCount = 0;
>                                IoBuffer ioBuffer = session.getIoBuffer();
>
>                                do {
>                                    ByteBuffer readBuffer =
> ByteBuffer.allocate(1024);
>                                    readCount = channel.read(readBuffer);
>                                    LOGGER.debug("read {} bytes",
> readCount);
>
>                                    if (readCount < 0) {
>                                        // session closed by the remote peer
>                                        LOGGER.debug("session closed by the
> remote peer");
>                                        sessionsToClose.add(session);
>                                        break;
>                                    } else if (readCount > 0) {
>                                        readBuffer.flip();
>                                        ioBuffer.add(readBuffer);
>                                    }
>                                } while (readCount > 0);
>
>                                // we have read some data
>                                // limit at the current position & rewind
> buffer back to start & push to the chain
>                                session.getFilterChain().processMessageReceived(session, ioBuffer);
>
> As you can see, instead of reading one buffer, and call the chain, we
> gather as many data as we can (ie as many as the channel can provide), and
> we call the chain.
> This has one major advantage : we don't call the chain many times if the
> data is bigger than the buffer size (currently set to 1024 bytes), and as a
> side effect does not require that we define a bigger buffer (not really a
> big deal, we can afford to use a 64kb buffer here, as there is only one
> buffer per selector)
> The drawback is that we allocate ByteBuffers on the fly. This can be
> improved by using a pre-allocated buffer (say a 64kb buffer), and if we
> still have something to read, then we allocate some more (this is probably
> what I will change).
>
> The rest of the code is not changed widely, except the decoder and every
> filter that expected to receive a ByteBuffer (like the LoggingFilter). It's
> just a matter of casting the Object to IoBuffer, and process the data, as
> the IoBuffer methods are the same as the ByteBuffer (except that you
> can't inject anything but ByteBuffers into an IoBuffer, so no put method,
> for instance).
>
> The decoders for Http and Ldap have been changed to deal with the
> IoBuffer. The big gain here, in the Http case, is that we don't have to
> accumulate the data into a new ByteBuffer : the IoBuffer already accumulates
> data itself.
>
> The IoBuffer is stored into the session, which means we can reuse it over
> and over, no need to create a new one. I still have to implement the
> compact() method which will remove the used ByteBuffers, in order for this
> IoBuffer not to grow out of bounds.
>
> thoughts, comments ?
>
> Thanks !
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
>