Posted to common-dev@hadoop.apache.org by Stefan Groschupf <sg...@media-style.com> on 2006/05/30 22:42:57 UTC

HBase Design Ideas, terminology

Hi Michael,

What do you think the next steps are?
I would really love to help with the implementation.
Ideally we define interfaces or create lists of functionality, and
maybe people can provide implementations for the different components.
Or is your idea to provide a first implementation and then continue
with patch-file-based development?

What do you think about creating a wiki page where we at least list all
components with a short description and a list of functionality?
We can start with your emails and discuss each component in detail
here on the list.

Does that make sense?

BTW,
Yoram and I discussed that a kind of big table could be an
interesting store for the data that is held in memory in the jobtracker
and namenode and causes some scalability problems.
What do you think?


Greetings,
Stefan



Re: HBase Design Ideas, terminology

Posted by Andrzej Bialecki <ab...@getopt.org>.
Michael Cafarella wrote:
> Storing namenode and MapReduce status in a BigTable-like structure is
> intriguing, but seems dangerous.  (All the database stuff is built on
> top of DFS and MapReduce.  What happens when the database needs a file
> in DFS, which needs a record from the database?  Oh lord...)

Let me pipe in with a "me too" comment: I think it's important to 
implement HBase in a way that other parts of Hadoop don't depend on it. 
So far we have a very nice separation of layers: DFS can run 
independently of map-reduce, and DFS + map-reduce should run 
independently of HBase, too. IMHO, any tricks like the one you guys 
describe would better be left to the high-level applications, and not 
hard-wired into the infrastructure ...
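
The layering argument can be illustrated with a small dependency-graph check. This is only a rough sketch, not Hadoop code: the class name LayerDeps, the component names, and both graphs are assumptions made up for illustration. It shows that the current layering has no cycle, while moving namenode state into HBase would add a DFS -> HBase edge and close a loop.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the Hadoop code base.
public class LayerDeps {

    // Edges point from a component to the components it depends on.
    static Map<String, List<String>> layered() {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("DFS", List.of());
        deps.put("MapReduce", List.of("DFS"));
        deps.put("HBase", List.of("DFS", "MapReduce"));
        return deps;
    }

    // Same graph, but with DFS (namenode state) stored in HBase.
    static Map<String, List<String>> withNamenodeStateInHBase() {
        Map<String, List<String>> deps = new HashMap<>(layered());
        deps.put("DFS", List.of("HBase"));
        return deps;
    }

    // Depth-first search; a node seen again on the current path means a cycle.
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : deps.keySet()) {
            if (visit(node, deps, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> deps,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;   // back edge: node depends on itself transitively
        }
        if (done.contains(node)) {
            return false;  // already fully explored, no cycle through here
        }
        onPath.add(node);
        for (String dep : deps.getOrDefault(node, List.of())) {
            if (visit(dep, deps, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        System.out.println("layered: cycle = " + hasCycle(layered()));
        System.out.println("namenode state in HBase: cycle = "
                + hasCycle(withNamenodeStateInHBase()));
    }
}
```

Under these assumptions the first graph reports no cycle and the second reports one, which is the "database needs a file in DFS, which needs a record from the database" situation from the quoted message.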

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: HBase Design Ideas, terminology

Posted by Stefan Groschupf <sg...@media-style.com>.
Hi,

> Storing namenode and MapReduce status in a BigTable-like structure is
> intriguing, but seems dangerous.  (All the database stuff is built on
> top of DFS and MapReduce.  What happens when the database needs a file
> in DFS, which needs a record from the database?  Oh lord...)

Right, I missed that. :-) In general I also completely agree with
Andrzej: having HBase as an independent component that uses io, rpc,
dfs and map reduce is the way we should go.

Stefan 

Re: HBase Design Ideas, terminology

Posted by Michael Cafarella <mi...@gmail.com>.
Hi Stefan,

I'm continuing to plug away at code myself.  My goal is to have a
fully-working (though feature-poor) system before committing any code.
I don't want to clutter the Hadoop workspace with experimental stuff.
(I feel like I did that a little too much in the early days of DFS,
and I want to avoid that experience ;)

A wiki for current docs, status, etc, is a good idea.  I'll look to add
that to the Hadoop pages.  (Also, let's talk offline about how we can
best work together on this.  Once the first commit is finished, then I
think it should enter the general bug-patch-commit development mode.)

Storing namenode and MapReduce status in a BigTable-like structure is
intriguing, but seems dangerous.  (All the database stuff is built on
top of DFS and MapReduce.  What happens when the database needs a file
in DFS, which needs a record from the database?  Oh lord...)

--Mike
