Posted to general@hadoop.apache.org by Philip Zeyliger <ph...@cloudera.com> on 2009/09/04 18:34:49 UTC

Re: drivers to bridge familiar SQL queries to Hadoop MapReduce internals?

Hi Benjamin,

This is actually very much on the mark.

Take a look at the Hive project -- http://hadoop.apache.org/hive/ ; there is
also a video at http://www.cloudera.com/hadoop-training-hive-introduction.
Hive is a SQL-like interface, developed initially at Facebook, for
exactly that: running SQL-style queries as MapReduce jobs. Pig is also
working on something similar -- see
http://issues.apache.org/jira/browse/PIG-824.
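
To make the idea concrete, here is a minimal sketch in plain Python of how
a layer like this might turn your avg/max "group by year" query into map,
shuffle, and reduce steps. It is only an illustration under stated
assumptions: it is not Hive's or Pig's actual implementation, the function
names are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict

# Hypothetical sample rows of (year, airTemp); made up for illustration.
ROWS = [(1949, 111), (1949, 78), (1950, 0), (1950, 22), (1950, -11)]


def map_phase(rows):
    """Emit (key, value) pairs; the key is the GROUP BY column (year)."""
    for year, air_temp in rows:
        yield year, air_temp


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Compute (avg, max) of airTemp for each year."""
    return {
        year: (sum(temps) / len(temps), max(temps))
        for year, temps in groups.items()
    }


# Chain the three steps the way a single MapReduce job would.
result = reduce_phase(shuffle(map_phase(ROWS)))
```

Here result maps each year to an (average, maximum) pair, which is the same
shape of answer the SQL query asks for.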

Cheers,

-- Philip



On Fri, Sep 4, 2009 at 9:16 AM,
benjamin.cotton@lehman.com<Be...@lehman.com> wrote:
>
> I am brand new to Hadoop and have a very newbie question: Is it a Hadoop
> community priority to build drivers (or layers of drivers) that will help
> bridge simple, familiar SQL queries to Hadoop MapReduce internals,
> liberating the application query developer from necessarily having to learn
> Hadoop-specific technologies, APIs, and tactics?
>
> E.g. in the initial example of "Hadoop: The Definitive Guide", I would like
> to STILL just be able to write
>
> Select avg(weatherStationTable.airTemp), max(weatherStationTable.airTemp)
> from weatherStationTable
> group by weatherStationTable.year
>
> and depend on some Driver (or layer of Drivers) to bridge that familiar SQL
> relational query to a Hadoop MapReduce job that is deployed across HDFS
> (or another Hadoop-specific data hosting layer), executes in Hadoop, and
> returns my result.
>
> Is the notion of this potential capability off the mark re: current Hadoop
> community development priorities?
>