Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2009/02/06 20:00:08 UTC

Re: HADOOP-3856

So it looks like a complete rewrite of the datanode network I/O
in the async model is necessary. I don't want to invest the time
if in the end it isn't enough to significantly affect HBase
stability in a positive way. 

Blocking in disk I/O will be out of scope.

   - Andy


> From: Andrew Purtell 
> 
> I have started work on this. The plan is to use selectable
> channels (set to blocking mode), a work queue, and a pool of
> worker threads to handle xceiver connections.
[...]
> However, it may be necessary to go fully async
[...]
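
For reference, the initial plan quoted above (blocking channels, a work queue, and a pool of worker threads) might look roughly like the following. This is only a minimal sketch and not the HADOOP-3856 patch: the class name PooledXceiverSketch, the echo handler, and the pool and queue sizes are all hypothetical stand-ins, and the real xceiver protocol is not modelled.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: blocking channels served by a bounded worker pool.
public class PooledXceiverSketch {
    private final ServerSocketChannel server;
    private final ThreadPoolExecutor workers;

    public PooledXceiverSketch(int port, int poolSize, int queueCapacity) throws IOException {
        server = ServerSocketChannel.open();
        server.configureBlocking(true); // the default mode, stated for clarity
        server.socket().bind(new InetSocketAddress("127.0.0.1", port));
        // Fixed-size pool fed by a bounded work queue. When the queue is full,
        // CallerRunsPolicy makes the accepting thread do the work itself,
        // which naturally slows down further accepts.
        workers = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public int port() {
        return server.socket().getLocalPort();
    }

    // Accept loop: one thread accepts, pooled workers handle each connection.
    public void serve() {
        try {
            while (true) {
                final SocketChannel conn = server.accept();
                workers.execute(() -> handle(conn));
            }
        } catch (IOException e) {
            // accept() throws once close() is called; treat that as shutdown.
        }
    }

    // Placeholder handler (assumption): echoes bytes back until the peer closes.
    private static void handle(SocketChannel conn) {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try (SocketChannel c = conn) {
            while (c.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    c.write(buf);
                }
                buf.clear();
            }
        } catch (IOException e) {
            // A failed connection only affects this worker's current task.
        }
    }

    public void close() throws IOException {
        server.close();
        workers.shutdown();
    }
}
```

Each worker still blocks for as long as its connection is open, so the pool size caps the number of concurrent connections. That is the limitation that motivates going fully async.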



      

Re: HADOOP-3856

Posted by Andrew Purtell <ap...@apache.org>.
I'll see about MINA.

  http://mina.apache.org/
  http://mina.apache.org/features.html

   - Andy

> From: stack
>
> >  Andrew Purtell wrote:
> > I'm wondering if any async framework offers enough
> > to make an external dependency worthwhile. What benefit
> > do they provide, at the low level that the datanode would
> > operate?
> 
> I don't have enough experience to say whether or which.
> Reading -- not implementing -- what I learned about an NIO
> implementation is that it's easy to make a mess; in particular
> an implementation that works but that is dog slow or resource
> expensive.  I got the sense that using one of the frameworks
> would help avoid the ready traps.
[...]



      

Re: HADOOP-3856

Posted by stack <st...@duboce.net>.
On Sat, Feb 7, 2009 at 2:27 PM, Andrew Purtell <ap...@apache.org> wrote:

> I'm wondering if any async framework offers enough to make an
> external dependency worthwhile. What benefit do they provide, at
> the low level that the datanode would operate? I'm not sure.



I don't have enough experience to say whether or which.  Reading -- not
implementing -- what I learned about an NIO implementation is that it's easy
to make a mess; in particular an implementation that works but that is dog
slow or resource expensive.  I got the sense that using one of the
frameworks would help avoid the ready traps.  Then again, I was reading the
frameworks' own literature and did not discount for self-promotion.

Regarding the dependency: if it's working, the dependency will be overlooked (IMO).

St.Ack

>
> I've used Netty before, but it's LGPL. Grizzly and MINA are two
> more or less equivalent options that I am aware of. MINA is ASF.
> That's probably the least objectionable external dependency, but
> then I go back to the question of how much time savings would it
> offer? There's a learning curve for me in any direction, picking
> apart the DN, piecing it back together. There's no argument
> against a new dependency if I just go with plain NIO.
>
> Is anyone aware of some compelling benefit to Grizzly or MINA or
> some serious pitfall with just using plain NIO? I'm a C++ guy
> still learning the Java landscape here...
>
>   - Andy
>
> > From: stack
> >
> > Did you see in the Hadoop issue where Raghu talks of Grizzly
> > mayhaps being better suited because it has a more accommodating
> > community and it's written more to the datanode level?
> >
> > Would suggest keeping Raghu in the loop.  Perhaps by
> > posting intermediary patches up against the hadoop issue.
> >
> > I can help pre-review patches and with testing.

Re: HADOOP-3856

Posted by Andrew Purtell <ap...@apache.org>.
Thanks for the offer of support Stack.

I'm wondering if any async framework offers enough to make an
external dependency worthwhile. What benefit do they provide, at
the low level that the datanode would operate? I'm not sure.
I've used Netty before, but it's LGPL. Grizzly and MINA are two
more or less equivalent options that I am aware of. MINA is ASF.
That's probably the least objectionable external dependency, but
then I go back to the question of how much time savings would it
offer? There's a learning curve for me in any direction, picking
apart the DN, piecing it back together. There's no argument
against a new dependency if I just go with plain NIO. 

Is anyone aware of some compelling benefit to Grizzly or MINA or
some serious pitfall with just using plain NIO? I'm a C++ guy
still learning the Java landscape here...

   - Andy
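
To make the plain NIO option concrete, here is a minimal single-threaded selector loop. It is a sketch under stated assumptions, not datanode code: the class name SelectorEchoSketch, the echo behaviour, and the 4096-byte per-connection buffer are hypothetical. It does show where the bookkeeping goes, namely handling partial writes and switching a key's interest between OP_READ and OP_WRITE.

```java
import java.io.Closeable;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical sketch: one thread multiplexing all connections with plain NIO.
public class SelectorEchoSketch implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;
    private volatile boolean running = true;

    public SelectorEchoSketch(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false); // required before registering with a selector
        server.socket().bind(new InetSocketAddress("127.0.0.1", port));
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    public int port() {
        return server.socket().getLocalPort();
    }

    public void stop() {
        running = false;
        selector.wakeup(); // unblock select() so the loop can observe the flag
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // selected keys must be removed by hand
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        accept();
                    } else {
                        handleIo(key);
                    }
                }
            }
        } catch (IOException e) {
            // A selector failure ends the loop in this sketch.
        } finally {
            for (SelectionKey k : selector.keys()) {
                closeQuietly(k.channel());
            }
            closeQuietly(selector);
        }
    }

    private void accept() throws IOException {
        SocketChannel c = server.accept();
        if (c == null) {
            return; // nothing pending after all
        }
        c.configureBlocking(false);
        // One buffer per connection, carried as the key's attachment.
        c.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(4096));
    }

    private void handleIo(SelectionKey key) {
        SocketChannel c = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        try {
            if (key.isReadable()) {
                if (c.read(buf) == -1) {
                    c.close(); // peer finished; closing also cancels the key
                    return;
                }
                buf.flip(); // switch the buffer from filling to draining
            }
            c.write(buf); // may write only part of the buffer
            if (buf.hasRemaining()) {
                key.interestOps(SelectionKey.OP_WRITE); // finish the write later
            } else {
                buf.clear();
                key.interestOps(SelectionKey.OP_READ);
            }
        } catch (IOException e) {
            closeQuietly(c);
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // best effort during shutdown
        }
    }
}
```

Even for an echo server, the state handling (buffer mode, interest ops, shutdown) is easy to get subtly wrong, and it is this kind of plumbing that MINA and Grizzly aim to take over. Disk I/O would still block the selector thread in a design like this.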

> From: stack
>
> Did you see in the Hadoop issue where Raghu talks of Grizzly
> mayhaps being better suited because it has a more accommodating
> community and it's written more to the datanode level?
> 
> Would suggest keeping Raghu in the loop.  Perhaps by
> posting intermediary patches up against the hadoop issue.
> 
> I can help pre-review patches and with testing.



      

Re: HADOOP-3856

Posted by stack <st...@duboce.net>.
Did you see in the Hadoop issue where Raghu talks of Grizzly mayhaps being
better suited because it has a more accommodating community and it's written
more to the datanode level?

Would suggest keeping Raghu in the loop.  Perhaps by posting intermediary
patches up against the hadoop issue.

I can help pre-review patches and with testing.

You are a good man Andrew Purtell.
St.Ack

On Fri, Feb 6, 2009 at 11:00 AM, Andrew Purtell <ap...@apache.org> wrote:

> So it looks like a complete rewrite of the datanode network I/O
> in the async model is necessary. I don't want to invest the time
> if in the end it isn't enough to significantly affect HBase
> stability in a positive way.
>
> Blocking in disk I/O will be out of scope.
>
>   - Andy
>
>
> > From: Andrew Purtell
> >
> > I have started work on this. The plan is to use selectable
> > channels (set to blocking mode), a work queue, and a pool of
> > worker threads to handle xceiver connections.
> [...]
> > However, it may be necessary to go fully async
> [...]