Posted to common-dev@hadoop.apache.org by Jaesun Han <js...@gmail.com> on 2008/11/27 07:54:47 UTC
Hadoop Tutorial Workshop in South Korea
Hi all,
The Korea Hadoop Community is hosting a half-day Hadoop Tutorial Workshop
on November 28 (Friday) in Seoul, South Korea.
You can find details and register for the workshop on our website.
http://www.hadoop.or.kr/?document_srl=1945
Time: Friday, November 28, 14:00 ~ 18:00
Location: Seoul National University School of Dentistry main building 121
Free and open event (but limited to 100 people)
Agenda
- Hadoop Overview
- Hadoop Installation & Management
- Managing a Hadoop Cluster
- MapReduce Programming
- Advanced MapReduce Programming
Look forward to seeing you there!
Jason
Best practices of using Hadoop
Posted by Ricky Ho <rh...@adobe.com>.
I am trying to get answers to these kinds of questions, as they pop up frequently ...
1) What kinds of problems fit Hadoop best, and which do not?
2) What is the dark side of Hadoop, where other parallel processing models (e.g. MPI, TupleSpace, etc.) fit better?
3) What is the demarcation point between choosing a Hadoop model versus a multi-threaded shared-memory model?
4) Given that we can partition and replicate an RDBMS table, we can make it as big as we like and spread the workload across machines. Why isn't that good enough for scalability? Why do we need BigTable or HBase, which require adopting a new data model?
5) Is there a general methodology for transforming an arbitrary algorithm into map/reduce form?
6) How would one choose between Hadoop Java, Hadoop Streaming, and Pig? It looks like any problem that can be solved in one can be solved in the others. If so, Pig is more attractive because it provides higher-level semantics.
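On question 5, there is no fully general recipe, but many algorithms do decompose the way word count does: a map phase that turns each input record into (key, value) pairs, a shuffle that groups pairs by key, and a reduce phase that aggregates each group. A minimal sketch in plain Python, simulating the framework locally (the function names here are illustrative, not the Hadoop API):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Map: each input record emits zero or more (key, value) pairs.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: the framework groups pairs by key across the cluster;
    # locally a sort followed by groupby has the same effect.
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

def reduce_phase(grouped):
    # Reduce: aggregate all values seen for each key.
    for key, group in grouped:
        yield (key, sum(value for _, value in group))

lines = ["hadoop map reduce", "map reduce map"]
counts = dict(reduce_phase(shuffle(map_phase(lines))))
print(counts)  # {'hadoop': 1, 'map': 3, 'reduce': 2}
```

Algorithms that fit this shape (aggregations, joins on a key, inverted indexes) port naturally; algorithms with fine-grained shared state or tight iteration tend to be the hard cases, which ties back to questions 2 and 3.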
I would appreciate it if anyone who has faced these decisions could share their thoughts.
Rgds,
ricky