Posted to dev@hbase.apache.org by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com> on 2009/06/14 12:03:01 UTC

You guys rocked the house this week!

I have received nothing but compliments in all my schmoozing
this week.

Although I was mostly absent from 0.20, it is 0.20
that has everyone excited. Congrats, and great work guys.

However, we still have to deliver on 0.20. It has to be rock
solid, or the buzz will turn against us.

Friday, I was at Cloudera along with Doug C, Erik14, Owen O'Malley,
Arun Murthy, Alan Gates (Pig), a guy from Hive (whose name I
can't remember at the moment), Dhruba (Facebook) and, of course,
the Cloudera guys (Todd Lipcon, Jeff Hammerbacher, Christophe,
Amr, etc.).

The day went something like this:

1. 1st exercise: write (on a post-it) 5 things you like about
   Hadoop and 5 things you don't (most people submitted more
   than 10), followed by discussion.

2. 2nd exercise: write (on a post-it) features that you'd like
   to see in the short term in Hadoop.

   - We had submissions that were truly short term and some that
     were truly "blue sky". These were divided into categories:
     Map/Reduce, HDFS, Build/Test, Core (including Avro)

   - We then split up into separate sessions. I attended HDFS.
     (the session leaders are supposed to send in notes from
     their session, and as soon as I get them, I will post.)

   - The biggest issue from HDFS was append (actually flush/sync).
     It wasn't just me: there were about 7 votes for it (just "append"),
     whereas my votes were more like "flush/sync in 0.21" and HADOOP-4379
     in 0.20.x. (A small sketch of what flush/sync buys us follows this
     list.)

3. Third session: Blue Sky: not much happened here because everyone
   was kind of burned out at this point.
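
For context on the flush/sync point above, here is a minimal sketch (mine,
not something from the meeting) of the pattern HBase needs for its
write-ahead log. It assumes the Hadoop 0.19/0.20-era
FSDataOutputStream.sync() call; HADOOP-4379 is about making that call
durable and the data visible to readers while the file is still open. The
path and the "edit" format below are made up.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // Sketch of the WAL pattern that depends on a working flush/sync:
  // edits are appended to a long-lived open file and sync()'d after each
  // batch; if the writer dies, everything sync()'d so far must still be
  // readable, otherwise HBase loses acknowledged writes.
  public class WalSyncSketch {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      Path log = new Path("/hbase/.logs/example-wal");      // made-up path
      FSDataOutputStream out = fs.create(log);

      out.writeBytes("put row1 family:qualifier value1\n"); // a fake edit
      out.sync();  // without HADOOP-4379 this is where edits can be lost

      // ... many more edits and syncs; the file is only closed on log roll ...
      out.close();
    }
  }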

Important points (for HBase):

1. We need to deliver a rock-solid 0.20 release or we will lose
   all the credibility that we gained this week.

   BTW: Cloudera's next release is going to be based on 0.20, and
   they will either include HBase as alpha software, or put us
   in their supported stack, depending on the reaction from our
   community. And despite the fact that their revenue stream depends
   on the Hadoop community, I got the feeling that they are getting
   pressured to have a version of HBase (not so much on 0.18, but
   more on 0.20). They have a '$' interest in seeing us succeed.

2. Once we get 0.20 out, we need to focus on beating the sh*t out
   of the HADOOP-4379 patch for 0.20. Once we think it is solid, we
   need to create a script that randomly fails region servers and
   datanodes. If we do this, Cloudera has volunteered to run that
   script on EC2 on a ~100 node cluster to burn it in (they have
   some arrangement with Amazon), and to run the test on a "big"
   cluster for us. They will run it for several days if necessary
   to prove that it works.

   -- We need to be sure HADOOP-4379 is solid, which could lead to
      getting 4379 into hadoop 0.20.x if so. Dhruba, who led the HDFS
      breakout session, will do what it takes to fix issues around
      his current patch, as long as we give him feedback. However,
      if we don't do #1 above, it won't matter.

   -- Master failover works

   -- Region server failover works.

3. After this week, both Pig and Hive are excited about using
   HBase as a source and a sink for the map-reduce jobs they
   spawn. They have both come to realize that we are becoming
   more important in the Hadoop community, and are willing to
   devote resources to make their stuff work with HBase. (Each
   will look bad if the other supports HBase and they do not -
   although to be fair, there was no data store available before
   this that met their needs either.)

So keep up the great work and MAKE SURE 0.20 IS ROCK SOLID STABLE!!!

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)



Re: You guys rocked the house this week!

Posted by Andrew Purtell <ap...@apache.org>.
> Andrew Purtell volunteered long ago to aid with packaging

Yes.

> and has been doing ongoing work to make hbase TRUNK works on hadoop 0.18.3.

Yes. More work today, in fact.

> I know there was some difficulty communicating at first, has this
> been worked out since?

I believe so.

I'll produce RPMs and DEBs for the 0.20 release, for both generic packaging (top-level
spec file in the HBase distrib for Hadoop 0.20) and Cloudera-specific packaging (0.18.3 branch).

Beyond this, I haven't heard anything specifically requested. I would expect
some integration with their config wizard would be needed to become part of
the official release. I have offered to support that, as well as to make time
quarterly for release engineering.

   - Andy





Re: You guys rocked the house this week!

Posted by stack <st...@duboce.net>.
Sounds like an interesting meeting.  Thanks for representing.

Being part of the Cloudera bundle would be a nice-to-have, but it sounds like
they are still on the fence, and in the meantime they want us to do some scripting?

I thought we already had a designated Cloudera point person?  Andrew Purtell
volunteered long ago to aid with packaging and has been doing ongoing work
to make hbase TRUNK work on hadoop 0.18.3.  Correct me if I'm wrong, Andrew,
but I thought you were doing this at Cloudera's request?  If this is not
what's wanted, then there should be some messaging to prevent wasted
volunteer effort.

Andrew, I know there was some difficulty communicating at first, has this
been worked out since?

St.Ack

RE: You guys rocked the house this week!

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.
I'm going to try to answer all the questions around Cloudera, etc., in
this email.

> -----Original Message-----
> From: Andrew Purtell [mailto:apurtell@apache.org]
> Sent: Sunday, June 14, 2009 11:18 AM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: You guys rocked the house this week!
>
> Hi Jim,
>
> > BTW: Cloudera's next release is going to be based on 0.20, and
> > they will either include HBase as alpha software, or put us
> > in their supported stack, depending on the reaction from our
> > community.
>
> What does that mean, "depending on the reaction from our  community"?

If the community tries it out and starts saying:
- it's not as fast as we claim
- its failover does not work as advertised
- it's not as solid as advertised
- etc.

i.e., if we receive a lot of negative press, we end up in the 2nd-class bin.
Otherwise, they will start devoting a resource to us and we will end up as
a top-tier app.

> > If we do this, Cloudera has volunteered to run that
> > script on EC2 on a ~100 node cluster to burn it in. (They have
> > some arrangement with Amazon) and they have volunteered to run
> > the test on a "big" cluster for us.
>
> I think HBase, as well as Hadoop frankly, can also use a reasonably
> scaled performance, reliability, and fault tolerance automated test
> platform. (See "Re: scanner is returning everything in parent region
> plus one of the daughters?") Think of it as expanding Hudson to
> a cluster of several nodes hosted with community resources, perhaps
> on EC2, running some suite once per day, or perhaps triggered by a
> project once they reach a certain milestone, so each project could
> be allocated a budget in terms of hours/month and time limits of
> hours/day or similar. ~10 nodes seems reasonably affordable, with
> ~100 used on occasion, the difference being daily versus weekly, or
> weekly versus monthly.
>
> Stepping back from blue sky, I wonder if HBase anyway can pool
> resources to run such a reasonably scaled performance, reliability,
> and fault tolerance automated test at least twice a week. 10 extra
> large EC2 instances running 5 hours per day is about $300/month.

That was one of the issues discussed on Friday. Nigel will be working
with Tom White on getting this set up but it won't be there for a while.

> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of stack
> Sent: Sunday, June 14, 2009 2:06 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: You guys rocked the house this week!
>
> Sounds like an interesting meeting.  Thanks for representing.
>
> Being part of the Cloudera bundle would be a nice-to-have but sounds
> like they are still on the fence and meantime they want us to do some
> scripting?

All they want is a simple script that launches Performance Evaluation
and randomly kills Master, Region servers and datanodes. They will handle
the startup and config of the EC2 cluster.

I viewed this as a positive step. We get our stuff run at scale, and they
are willing to devote some resources to it.
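
For what it's worth, a rough, hypothetical sketch of such a kill script is
below. It is not the script Cloudera asked for, just an illustration of the
idea: it assumes PerformanceEvaluation is already running against the
cluster, that the listed hosts are reachable over passwordless ssh, and that
the daemons can be found by class name with pkill. The host names are made
up, and the master could be killed the same way.

  import java.util.Arrays;
  import java.util.List;
  import java.util.Random;

  // Hypothetical chaos sketch: every few minutes, pick a random daemon on a
  // random node and kill -9 it, then give the cluster time to recover before
  // the next round. Host names and the ssh/pkill mechanism are assumptions,
  // not something specified in this thread.
  public class RandomKiller {

    // Made-up hosts; a real run would read the regionservers/slaves files.
    static final List<String> REGION_SERVERS = Arrays.asList("rs01", "rs02", "rs03");
    static final List<String> DATANODES = Arrays.asList("dn01", "dn02", "dn03");

    public static void main(String[] args) throws Exception {
      Random rnd = new Random();
      while (true) {
        boolean killRegionServer = rnd.nextBoolean();
        String host = killRegionServer
            ? REGION_SERVERS.get(rnd.nextInt(REGION_SERVERS.size()))
            : DATANODES.get(rnd.nextInt(DATANODES.size()));
        String daemon = killRegionServer ? "HRegionServer" : "DataNode";

        // Kill the chosen daemon hard; whatever supervises the daemons is
        // expected to restart it so the cluster keeps running degraded.
        new ProcessBuilder("ssh", host, "pkill -9 -f " + daemon)
            .inheritIO().start().waitFor();
        System.out.println("Killed " + daemon + " on " + host);

        // Give HBase/HDFS time to notice the failure and reassign work.
        Thread.sleep(5L * 60L * 1000L);
      }
    }
  }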

> -----Original Message-----
> From: Andrew Purtell [mailto:apurtell@apache.org]
> Sent: Sunday, June 14, 2009 6:28 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: You guys rocked the house this week!
>
> > Andrew Purtell volunteered long ago to aid with packaging
>
> Yes.
>
> > and has been doing ongoing work to make hbase TRUNK works on hadoop
> > 0.18.3.
>
> Yes. More work today, in fact.
>

> > I know there was some difficulty communicating at first, has this
> > been worked out since?
>
> I believe so.
>
> I'll produce RPMs and DEBs for 0.20 release for both generic (top level
> spec file in HBase distrib for Hadoop 0.20) and also Cloudera specific
> packaging (0.18.3 branch).
>
> Beyond this, I haven't heard anything specifically requested. I would
> expect some integration with their config wizard would be needed to
> become part of the official release. I have offered to support that, as
> well as make time quarterly for release engineering.

I don't know what their status is around releases, except that their
next release will be based on Hadoop 0.20. If we can get the packaging,
etc., done for 0.18.3, then when they go to 0.20 there should be almost
nothing extra to do, so I don't think doing the support for 0.18.3 is a
waste of time.

I'm sorry if I confused anyone. I wanted to convey that Cloudera was
very high on HBase after last week, and I think we should see them
devoting some resources to HBase in the near future.

-Jim

Re: You guys rocked the house this week!

Posted by Andrew Purtell <ap...@apache.org>.
Hi Jim,

> BTW: Cloudera's next release is going to be based on 0.20, and
> they will either include HBase as alpha software, or put us
> in their supported stack, depending on the reaction from our
> community.

What does that mean, "depending on the reaction from our  community"?

> If we do this, Cloudera has volunteered to run that
> script on EC2 on a ~100 node cluster to burn it in. (They have
> some arrangement with Amazon) and they have volunteered to run
> the test on a "big" cluster for us. 

I think HBase, as well as Hadoop frankly, can also use a reasonably
scaled performance, reliability, and fault tolerance automated test
platform. (See "Re: scanner is returning everything in parent region
plus one of the daughters?") Think of it as expanding Hudson to 
a cluster of several nodes hosted with community resources, perhaps
on EC2, running some suite once per day, or perhaps triggered by a
project once they reach a certain milestone, so each project could
be allocated a budget in terms of hours/month and time limits of
hours/day or similar. ~10 nodes seems reasonably affordable, with
~100 used on occasion, the difference being daily versus weekly, or
weekly versus monthly. 

Stepping back from blue sky, I wonder if HBase anyway can pool
resources to run such a reasonably scaled performance, reliability,
and fault tolerance automated test at least twice a week. 10 extra
large EC2 instances running 5 hours per day is about $300/month.

  - Andy



