Posted to general@hadoop.apache.org by Ajay Anand <aa...@yahoo-inc.com> on 2009/02/05 20:25:03 UTC

Hadoop Summit 2009 - Call for submissions

We are planning the 2009 Hadoop Summit, to be held the second week of
June in Santa Clara, CA.

Please send me (aanand@yahoo-inc.com) your presentation proposals and
suggested topics.

Areas we plan to cover include:

- Core Hadoop - areas of development and key contributions
- Subprojects and related projects: including Pig, Hive, HBase, Mahout,
  ZooKeeper and others
- Administration: monitoring tools and framework, services, experiences
  with administering Hadoop clusters
- Adoption: deployment examples, hosted services
- Applications on Hadoop: examples of innovative applications being
  developed and deployed on Hadoop
- Research: research topics being explored, university research examples

Thanks, and looking forward to a strong event,

Ajay


RE: Hadoop Summit 2009 - Call for submissions

Posted by Ashish Thusoo <at...@facebook.com>.
Hi Ajay,

Please put us down for a session on Hive too.

Thanks,
Ashish 

-----Original Message-----
From: Ajay Anand [mailto:aanand@yahoo-inc.com] 
Sent: Thursday, February 05, 2009 11:25 AM
To: core-user@hadoop.apache.org; general@hadoop.apache.org; zookeeper-user@hadoop.apache.org; hbase-user@hadoop.apache.org; pig-user@hadoop.apache.org
Subject: Hadoop Summit 2009 - Call for submissions

We are planning the 2009 Hadoop Summit, to be held the second week of June in Santa Clara, CA.

Please send me (aanand@yahoo-inc.com) your presentation proposals and suggested topics.

Areas we plan to cover include:

- Core Hadoop - areas of development and key contributions
- Subprojects and related projects: including Pig, Hive, HBase, Mahout,
  ZooKeeper and others
- Administration: monitoring tools and framework, services, experiences
  with administering Hadoop clusters
- Adoption: deployment examples, hosted services
- Applications on Hadoop: examples of innovative applications being
  developed and deployed on Hadoop
- Research: research topics being explored, university research examples

Thanks, and looking forward to a strong event,

Ajay


Re: Hadoop Summit 2009 - Call for submissions

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Hey all,

Jonathan, I think this is all good stuff.

On my end, I'd like to give a presentation on the integration of ZooKeeper
in HBase that Nitay Joffe and I have been working on. The main subjects
would be:

- The main features that are implemented for 0.20 (.ROOT. location stored in
ZK, HMaster failover, Region Server leases replaced with ephemeral nodes,
etc.)

- How HBase manages ZK instances.

- The issues we encountered during the integration and how we solved them
(mostly configuration-related).

- How using ZK improved HBase. One example is that by removing the RS
leases, a dead node is detected within 2-3 seconds instead of, in the worst
case, after the full length of a lease. This way, we can reassign the
regions faster.

- The main features planned for 0.21, such as using ZK to distribute the
HMaster role to Region Servers.

- If HBase 0.21 is released, or close to release, by June, I would also
cover its new issues and improvements.
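
To make the ephemeral-node point above more concrete, here is a minimal sketch in plain Python. It is a toy, in-memory model and not HBase or ZooKeeper code: the `TinyCoordinator` class, the `/rs/server-1` path, the session id, and the timeout values are all hypothetical choices made for illustration. It assumes that a server's node disappears once its session has missed heartbeats for longer than the session timeout, and that a watcher is then notified.

```python
# Illustrative model only: this is not HBase or ZooKeeper code, and every
# name, path, and timeout below is a hypothetical choice for the sketch.


class TinyCoordinator:
    """In-memory stand-in for a coordination service with ephemeral nodes."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}  # session id -> time of last heartbeat
        self.ephemeral = {}       # node path -> owning session id
        self.watchers = []        # callbacks called with a deleted node's path

    def create_ephemeral(self, path, session_id, now):
        # The node lives only as long as the owning session stays alive.
        self.ephemeral[path] = session_id
        self.last_heartbeat[session_id] = now

    def heartbeat(self, session_id, now):
        self.last_heartbeat[session_id] = now

    def watch_deletions(self, callback):
        self.watchers.append(callback)

    def tick(self, now):
        # Expire sessions that have been silent longer than the timeout,
        # delete their ephemeral nodes, and notify watchers.
        expired = {
            sid for sid, seen in self.last_heartbeat.items()
            if now - seen > self.session_timeout
        }
        for path, owner in list(self.ephemeral.items()):
            if owner in expired:
                del self.ephemeral[path]
                for callback in self.watchers:
                    callback(path)
        for sid in expired:
            del self.last_heartbeat[sid]


def simulate_detection_delay(session_timeout, crash_at=10.0, step=0.5,
                             horizon=300.0):
    """Return how long after the crash a watcher learns the server is gone."""
    coord = TinyCoordinator(session_timeout)
    deleted = []
    coord.watch_deletions(deleted.append)
    coord.create_ephemeral("/rs/server-1", "session-1", now=0.0)

    now = 0.0
    while now < horizon:
        now += step
        if now < crash_at:
            # The server is still alive and keeps its session fresh.
            coord.heartbeat("session-1", now)
        coord.tick(now)
        if deleted:
            return now - crash_at
    return None  # not detected within the simulated horizon


if __name__ == "__main__":
    print("short timeout:", simulate_detection_delay(session_timeout=3.0))
    print("long timeout: ", simulate_detection_delay(session_timeout=60.0))
```

In this model the detection delay simply tracks the timeout, so a session timeout of a few seconds gives a much faster reaction than a long lease would.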

J-D

On Thu, Feb 5, 2009 at 3:59 PM, Jonathan Gray <jl...@streamy.com> wrote:

> Following some discussion on IRC, I'd like to propose giving a presentation
> for the Hadoop Summit.  JD is also interested.
>
> My current idea would be to title it something like "HBase Goes Realtime"
> or something about HBase being Web-ready.
>
> Would include discussion of:
>
> - Improvements in the (released at this point) 0.20 and potentially
> 0.21/0.22 releases by June.
> - Specifically those related to performance and the use of HBase for
> realtime serving of a web site.  (New file format, cell caching, block
> caching, indexes, rpc, whatever else we tackle by the summit).
> - Use of HBase in production at Streamy.  We're using it in a number of
> different ways and should be able to cover most of the issues related to
> using HBase for the web.  Specifically, we'd touch on the migration from
> relational databases, decisions we made along the way, trade-offs in
> architecture and schema design, etc. but all in all a "success story" on
> using HBase to serve a realtime web site.
> - Touch on improved scalability.
>
> I will let JD speak for himself, but I believe he was thinking of talking
> about Zookeeper and HBase.  We'd then be able to cover the stability and
> fault-tolerance improvements made in HBase in this talk.
>
> I think that would cover our bases.
>
> Let me know what you all think.
>
> JG

RE: Hadoop Summit 2009 - Call for submissions

Posted by Jonathan Gray <jl...@streamy.com>.
Following some discussion on IRC, I'd like to propose giving a presentation
for the Hadoop Summit.  JD is also interested.

My current idea would be to title it something like "HBase Goes Realtime" or
something about HBase being Web-ready.

Would include discussion of:

- Improvements in the (released at this point) 0.20 and potentially
0.21/0.22 releases by June.
- Specifically those related to performance and the use of HBase for
realtime serving of a web site.  (New file format, cell caching, block
caching, indexes, rpc, whatever else we tackle by the summit). 
- Use of HBase in production at Streamy.  We're using it in a number of
different ways and should be able to cover most of the issues related to
using HBase for the web.  Specifically, we'd touch on the migration from
relational databases, decisions we made along the way, trade-offs in
architecture and schema design, etc. but all in all a "success story" on
using HBase to serve a realtime web site.
- Touch on improved scalability.

I will let JD speak for himself, but I believe he was thinking of talking
about Zookeeper and HBase.  We'd then be able to cover the stability and
fault-tolerance improvements made in HBase in this talk.

I think that would cover our bases.

Let me know what you all think.

JG 




[event] Cloud Computing: Using the Open Source Hadoop to Generate Data-Intensive Insights

Posted by Bonesata <bo...@bonesata.com>.
Registration:
http://www.meetup.com/CIO-IT-Executives/calendar/9528874/

Gaining and keeping a competitive edge in Internet offerings has increasingly become a matter of continuously processing enormous volumes of data about users, user activities, Web sites, ads, and Web searches. There is gold in the mountain of data, but it is often impossible to extract it in time to make use of it if you are constrained to a single (albeit powerful) computer or database. Hadoop (http://hadoop.apache....) is open source software for creating a cluster of commodity computers, from one node to several thousand nodes in size, and internally managing petabytes of data. It provides a simple interface for attaching user-written code to be executed in parallel on some or all of the nodes in the cluster. As an alternative to creating your own Hadoop cluster, there are Hadoop AMIs (Amazon Machine Images - virtual machines) that allow you to create and run Hadoop programs on Amazon's EC2 infrastructure.

Rob will talk about what Hadoop is, options for writing programs that run on a Hadoop cluster, and Yahoo use cases where Hadoop has proved beneficial in dealing with very large data volumes.
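
As a rough illustration of the "user-written code executed in parallel" idea described above, the sketch below shows the map/reduce style that Hadoop programs typically follow, using word counting as the example job. It is a single-process model in plain Python and does not use the Hadoop API; the function names `map_words`, `reduce_counts`, and `run_job` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_words(line):
    """User-written map step: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """User-written reduce step: combine all values emitted for one word."""
    return word, sum(counts)


def run_job(lines):
    """Stand-in for the framework: group map output by key, then reduce.

    On a real cluster the map calls would run in parallel on many nodes;
    here everything runs sequentially in one process.
    """
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_words(line):
            groups[key].append(value)
    return dict(reduce_counts(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_job(["Hadoop runs on clusters", "clusters run Hadoop jobs"]))
```

Only the map and reduce functions are specific to the job; the grouping step in between is the part a framework would take care of.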

--

Rob Weltman has been Director of Engineering in Enterprise Software at Netscape, Chief Architect at AOL, and Director of Engineering for Yahoo's data warehouse technology. He is currently Director of Grid Services at Yahoo.

Gourmet dinner and wine are included.