Posted to common-user@hadoop.apache.org by meda vijendharreddy <me...@yahoo.co.in> on 2007/07/25 13:05:23 UTC

help on hadoop

Hi,
   I am new to Hadoop and want to use it in my
application.

Currently I want to simulate something like
"SELECT FROM WHERE"

The FieldSelectionMapReduce class can be used to reduce
the number of columns (which is like SELECT blah blah).

For FROM, if I have more than 2 tables, then I can do a
join on those, and if it is a single table then I can
achieve it easily.

How can I achieve the WHERE condition functionality?

Please help me with this. I have no clue at the
moment.


Thanks in Advance,




-----
Thanks
Vijen



Re: help on hadoop

Posted by Ted Dunning <td...@veoh.com>.
Remember that you can do more than one map/reduce step.

Suppose that you want to implement something that looks like this:

Select f(x), g(y), z from table1 join table2 using (j1, j2) where z > 0

Also assume that table1 and table2 have lots of columns besides x, y and z.

You can implement this with a map-reduce where the map step gets both table1
and table2 as inputs.  For a record with z <= 0 the map step emits nothing;
otherwise it emits (j1, j2) as key and whichever of f(x), g(y), z that record
can supply as value.  The reduce function will get records from table1 and
table2 all mixed together, but grouped according to the join key.  It can
combine these into the desired output.
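
A rough sketch of that join, simulated in plain Python rather than written
against the Hadoop API.  Everything named here is made up for illustration:
the sample rows, the bodies of f and g, the assumption that x and z live in
table1 and y lives in table2, and the helper names map_record, reduce_group
and run_join.  The map step drops rows with z <= 0 and tags each value with
its source table so the reduce step can tell the two sides apart.

```python
from collections import defaultdict
from itertools import product

# Hypothetical column functions standing in for f and g in the query.
def f(x):
    return x * 2

def g(y):
    return y.upper()

def map_record(source, row):
    """Emit (join key, tagged value) pairs; emit nothing when z <= 0."""
    key = (row["j1"], row["j2"])
    if source == "table1":
        # Assumption: x and z are columns of table1, so WHERE z > 0 is checked here.
        if row["z"] > 0:
            yield key, ("table1", f(row["x"]), row["z"])
    else:
        # Assumption: y is a column of table2.
        yield key, ("table2", g(row["y"]))

def reduce_group(key, values):
    """Pair every surviving table1 value with every table2 value for one key."""
    left = [v[1:] for v in values if v[0] == "table1"]
    right = [v[1:] for v in values if v[0] == "table2"]
    for (fx, z), (gy,) in product(left, right):
        yield fx, gy, z

def run_join(table1, table2):
    """Simulate map, shuffle (group by key), and reduce in memory."""
    groups = defaultdict(list)
    tagged = [("table1", r) for r in table1] + [("table2", r) for r in table2]
    for source, row in tagged:
        for key, value in map_record(source, row):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reduce_group(key, groups[key]))
    return out

table1 = [
    {"j1": 1, "j2": "a", "x": 10, "z": 5, "other": "ignored"},
    {"j1": 1, "j2": "b", "x": 20, "z": -1, "other": "ignored"},
    {"j1": 2, "j2": "a", "x": 30, "z": 7, "other": "ignored"},
]
table2 = [
    {"j1": 1, "j2": "a", "y": "foo"},
    {"j1": 1, "j2": "b", "y": "bar"},
    {"j1": 3, "j2": "c", "y": "baz"},
]
result = run_join(table1, table2)
print(result)
```

Only the key (1, "a") has a table1 row with z > 0 and a matching table2 row,
so that is the only key that produces output.  In a real job the grouping
done by run_join is what the sort phase between map and reduce does for you.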

If you add a "group by y, z" clause, then f has to be a function of a set of
values of x (like max or average, but you get to write it).  You should
change the map function so that the key is now (j1, j2, y, z) and the value
would be x, g(y), z.  Then change the reduce function to collect the values
of x and compute f(x) (and pass through g(y) and z).
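
The group-by variant might look something like the sketch below, again
simulated in plain Python and not tied to the Hadoop API.  It assumes the
rows already carry j1, j2, x, y and z together (for example as the output of
an earlier join step), uses max as a stand-in for the aggregate f, and the
sample data and the helper names map_row, reduce_group and run_group_by are
invented for the example.

```python
from collections import defaultdict

def g(y):
    # Hypothetical per-row function standing in for g in the query.
    return y.upper()

def f(xs):
    # Hypothetical aggregate standing in for f; here, the maximum of x.
    return max(xs)

def map_row(row):
    """Emit ((j1, j2, y, z), (x, g(y), z)) for rows passing WHERE z > 0."""
    if row["z"] > 0:
        key = (row["j1"], row["j2"], row["y"], row["z"])
        yield key, (row["x"], g(row["y"]), row["z"])

def reduce_group(values):
    """Collect x values for one key, apply f, and pass g(y) and z through."""
    xs = [x for x, _, _ in values]
    _, gy, z = values[0]  # g(y) and z are the same for every value in a group
    return f(xs), gy, z

def run_group_by(rows):
    """Simulate map, shuffle (group by key), and reduce in memory."""
    groups = defaultdict(list)
    for row in rows:
        for key, value in map_row(row):
            groups[key].append(value)
    return [reduce_group(groups[key]) for key in sorted(groups)]

rows = [
    {"j1": 1, "j2": "a", "x": 10, "y": "foo", "z": 5},
    {"j1": 1, "j2": "a", "x": 40, "y": "foo", "z": 5},
    {"j1": 1, "j2": "a", "x": 25, "y": "bar", "z": 5},
    {"j1": 2, "j2": "b", "x": 99, "y": "foo", "z": 0},
]
grouped = run_group_by(rows)
print(grouped)
```

Because y and z are part of the key, every value in a group shares the same
g(y) and z, which is why the reduce step can take them from the first value
and only needs to aggregate x.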

Hope this helps.

The key here is that the output can be polymorphic so you can use the sort
phase between map and reduce to do the join.
