Posted to general@hadoop.apache.org by Philip Zeyliger <ph...@cloudera.com> on 2009/09/04 18:34:49 UTC
Re: drivers to bridge familiar SQL queries to Hadoop MapReduce internals?
Hi Benjamin,
This is actually very much on the mark.
Take a look at the Hive project -- http://hadoop.apache.org/hive/ ,
also video at http://www.cloudera.com/hadoop-training-hive-introduction.
Hive is a SQL-like interface developed initially at Facebook for
exactly that. Pig is also working on something similar -- see
http://issues.apache.org/jira/browse/PIG-824.
Cheers,
-- Philip
On Fri, Sep 4, 2009 at 9:16 AM, benjamin.cotton@lehman.com <Be...@lehman.com> wrote:
>
> I am brand new to Hadoop and have a very newbie question: Is it a Hadoop
> community priority to build drivers (or layers of drivers) that will help
> bridge simple, familiar SQL queries to Hadoop MapReduce internals -
> liberating the application query developer from having to necessarily learn
> Hadoop-specific technologies, APIs, and tactics?
>
> E.g. in the "Hadoop - The Definitive Guide" initial example, I would like
> to STILL just be able to write
>
> Select avg(weatherStationTable.airTemp), max(weatherStationTable.airTemp)
> from weatherStationTable
> group by weatherStationTable.year
>
> and depend on some Driver (or layer of Drivers) to bridge that familiar SQL
> relational query to a Hadoop MapReduce job that is deployed across the HDFS
> (or other Hadoop-specific data hosting layer) to execute in Hadoop and
> return my result.
>
> Is the notion of this potential capability off the mark re: current Hadoop
> community development priorities?
>
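
For readers wondering what such a bridge would have to generate, here is a rough sketch of how the quoted GROUP BY query decomposes into map, shuffle, and reduce steps. It is plain Python running in a single process, not Hive's or Pig's actual translation and not the Hadoop API; the sample records and the function names (map_phase, shuffle, reduce_phase, run_query) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows of weatherStationTable as (year, airTemp) pairs.
# These values are invented for illustration only.
records = [(1949, 10.0), (1949, 20.0), (1950, 5.0), (1950, 7.0), (1950, 9.0)]


def map_phase(record):
    """Emit one (key, value) pair per row, keyed by the GROUP BY column."""
    year, air_temp = record
    yield year, air_temp


def shuffle(pairs):
    """Group values by key; in Hadoop the framework does this between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(year, temps):
    """Compute the aggregates in the SELECT list for one group."""
    return year, sum(temps) / len(temps), max(temps)


def run_query(rows):
    """Single-process stand-in for: SELECT avg(airTemp), max(airTemp) ... GROUP BY year."""
    pairs = (pair for row in rows for pair in map_phase(row))
    groups = shuffle(pairs)
    return [reduce_phase(year, temps) for year, temps in sorted(groups.items())]


results = run_query(records)
for year, avg_temp, max_temp in results:
    print(year, avg_temp, max_temp)
```

The GROUP BY column becomes the map output key and the aggregate functions become the reducer body, which is the kind of mapping a SQL-to-MapReduce layer has to automate.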