Posted to common-user@hadoop.apache.org by Zheng Shao <zs...@gmail.com> on 2010/02/26 22:56:12 UTC

Hive User Group Meeting 3/18/2010 7pm at Facebook

Hi all,

We are going to hold the second Hive User Group Meeting at 7 PM on
Thursday, 3/18/2010.

The agenda will be:

* Hive Tutorial: 20 min
* Hive User Case Study: 20 min
* New Features and API: 25 min
 JDBC/ODBC and CTAS
 UDF/UDAF/UDTF
 Create View/HBaseInputFormat
 Hive Join Strategy
 SerDe

The audience is beginner to intermediate Hive users/developers.

*** The details are here: http://www.facebook.com/event.php?eid=319237846974 ***
*** Please RSVP so we can schedule logistics accordingly. ***

-- 
Yours,
Zheng

Re: hive 0.50 and hwi

Posted by Edward Capriolo <ed...@gmail.com>.
On Wed, Mar 3, 2010 at 3:50 PM, Massoud Mazar <Ma...@avg.com> wrote:
> Just installed hive 0.50 and HWI does not work. I could not find the .war file:
>
> [hadoop@centos1 hive]$ bin/hive --service hwi
> ls: /hadoop/hive/lib/hive-hwi-*.war: No such file or directory
> 10/03/03 15:42:39 INFO hwi.HWIServer: HWI is starting up
> 10/03/03 15:42:39 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 10/03/03 15:42:39 INFO mortbay.log: jetty-6.1.14
> 10/03/03 15:42:39 INFO mortbay.log: Started SocketConnector@192.168.1.22:9999
>

I feel like a broken record on this :)

Luckily I think we have this licked for the final time. You must be
working with 0.5.0-rc0.

https://issues.apache.org/jira/browse/HIVE-1183

In any case, you can find the war file in your build directory.
Move it to hive/lib.

Add this to your hive-site.xml if need be.
<property>
  <name>hive.hwi.war.file</name>
  <value>lib/hive-hwi-0.5.0.war</value>
</property>

Edward

hive 0.50 and hwi

Posted by Massoud Mazar <Ma...@avg.com>.
Just installed Hive 0.5.0 and HWI does not work. I could not find the .war file:

[hadoop@centos1 hive]$ bin/hive --service hwi
ls: /hadoop/hive/lib/hive-hwi-*.war: No such file or directory
10/03/03 15:42:39 INFO hwi.HWIServer: HWI is starting up
10/03/03 15:42:39 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
10/03/03 15:42:39 INFO mortbay.log: jetty-6.1.14
10/03/03 15:42:39 INFO mortbay.log: Started SocketConnector@192.168.1.22:9999

Re: hive 0.50 on hadoop 0.22

Posted by Zheng Shao <zs...@gmail.com>.
Hi Massoud,

Great work!

Yes, this is exactly the use of shims. When we see an API change across
Hadoop versions, we add a new function to the shims interface and
implement it in each of the shims.

For this one, you probably want to wrap the logic in Driver.java into
a single shim interface function, and implement that function in all
shim versions.
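
A minimal sketch of that pattern, for illustration only. All names below
(HadoopShimsSketch, Shims20Sketch, Shims22Sketch, ShimLoaderSketch,
currentUserName) are hypothetical and are not Hive's actual classes, and
the returned strings stand in for the real version-specific Hadoop calls:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shim interface: one method per Hadoop API that differs across versions.
interface HadoopShimsSketch {
    // Returns the current user name, however the given Hadoop version exposes it.
    String currentUserName();
}

// Stand-in for the 0.20 implementation; real code would call the 0.20 login API here.
final class Shims20Sketch implements HadoopShimsSketch {
    @Override
    public String currentUserName() {
        return "user-from-0.20-api";
    }
}

// Stand-in for the 0.22 implementation; real code would call the 0.22 API here.
final class Shims22Sketch implements HadoopShimsSketch {
    @Override
    public String currentUserName() {
        return "user-from-0.22-api";
    }
}

// Hypothetical loader: maps a Hadoop major version to the matching shim.
final class ShimLoaderSketch {
    private static final Map<String, HadoopShimsSketch> SHIMS = new HashMap<>();

    static {
        SHIMS.put("0.20", new Shims20Sketch());
        SHIMS.put("0.22", new Shims22Sketch());
    }

    private ShimLoaderSketch() {}

    // Reduces a full version such as "0.22.0" to its major version "0.22".
    static String majorVersion(String fullVersion) {
        String[] parts = fullVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version string: " + fullVersion);
        }
        return parts[0] + "." + parts[1];
    }

    // Looks up the shim for a Hadoop version, failing loudly if none is registered.
    static HadoopShimsSketch forVersion(String fullVersion) {
        HadoopShimsSketch shims = SHIMS.get(majorVersion(fullVersion));
        if (shims == null) {
            throw new IllegalStateException("No shim registered for Hadoop " + fullVersion);
        }
        return shims;
    }
}
```

Caller code would then use ShimLoaderSketch.forVersion(version).currentUserName()
instead of calling a version-specific login API directly, so only the shim
classes need to change when the Hadoop API changes.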

Does that make sense?

Zheng

On Mon, Mar 1, 2010 at 1:08 PM, Massoud Mazar <Ma...@avg.com> wrote:
> Zheng,
>
> Thanks for answering.
> I've decided to give it (hive 0.50 on hadoop 0.22) a try. I'm a developer, but not a Java developer, so with some initial help I can spend time and work on this.
> Just to start, I modified the ShimLoader.java and copied the same HADOOP_SHIM_CLASSES and JETTY_SHIM_CLASSES from 0.20 to 0.22 to see where it breaks.
>
> I built and deployed hive 0.50 to a running hadoop 0.22 and did "show tables;" in hive, and I got this:
>
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.security.UserGroupInformation: method <init>()V not found
>        at org.apache.hadoop.security.UnixUserGroupInformation.<init>(UnixUserGroupInformation.java:69)
>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:271)
>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:300)
>        at org.apache.hadoop.hive.ql.Driver.<init>(Driver.java:243)
>        at org.apache.hadoop.hive.ql.processors.CommandProcessorFactory.get(CommandProcessorFactory.java:40)
>        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:116)
>        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
>        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
>
> Now, when I look at the UserGroupInformation class in the Hadoop 0.22 source code, it does not have a parameterless constructor, but the documentation at http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/security/UserGroupInformation.html shows such a constructor.
>
> Now, my question is: is this something that can be fixed by shims? Or is it a problem with Hadoop?
>
> -----Original Message-----
> From: Zheng Shao [mailto:zshao9@gmail.com]
> Sent: Saturday, February 27, 2010 4:24 AM
> To: hive-user@hadoop.apache.org
> Subject: Re: hive 0.50 on hadoop 0.22
>
> Hi Mazar,
>
> We have not tried Hive on Hadoop higher than 0.20 yet.
>
> However, Hive has the shim infrastructure which makes it easy to port
> to new Hadoop versions.
> Please see the shim directory inside Hive.
>
> Zheng
>
> On Fri, Feb 26, 2010 at 1:59 PM, Massoud Mazar <Ma...@avg.com> wrote:
>> Is it possible to run release-0.5.0-rc0 on top of hadoop 0.22.0 (trunk)?
>>
>
>
>
> --
> Yours,
> Zheng
>



-- 
Yours,
Zheng

RE: hive 0.50 on hadoop 0.22

Posted by Massoud Mazar <Ma...@avg.com>.
Zheng,

Thanks for answering.
I've decided to give it (Hive 0.5.0 on Hadoop 0.22) a try. I'm a developer, but not a Java developer, so with some initial help I can spend time and work on this.
Just to start, I modified ShimLoader.java and copied the same HADOOP_SHIM_CLASSES and JETTY_SHIM_CLASSES from 0.20 to 0.22 to see where it breaks.

I built and deployed Hive 0.5.0 to a running Hadoop 0.22 and ran "show tables;" in Hive, and I got this:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.security.UserGroupInformation: method <init>()V not found
        at org.apache.hadoop.security.UnixUserGroupInformation.<init>(UnixUserGroupInformation.java:69)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:271)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:300)
        at org.apache.hadoop.hive.ql.Driver.<init>(Driver.java:243)
        at org.apache.hadoop.hive.ql.processors.CommandProcessorFactory.get(CommandProcessorFactory.java:40)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:116)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:187)

Now, when I look at the UserGroupInformation class in the Hadoop 0.22 source code, it does not have a parameterless constructor, but the documentation at http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/security/UserGroupInformation.html shows such a constructor.
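
In the error above, "<init>()V" is the JVM descriptor for a constructor
that takes no arguments. One way to check which constructors a class on
the runtime classpath actually has is reflection. A minimal sketch:
ConstructorProbe and hasNoArgConstructor are hypothetical helper names,
and JDK classes are used as stand-ins because Hadoop is not assumed to
be on the classpath here.

```java
import java.lang.reflect.Constructor;

// Hypothetical helper, not part of Hive or Hadoop.
final class ConstructorProbe {
    private ConstructorProbe() {}

    // Returns true if the class declares a constructor with no parameters
    // (of any visibility), i.e. one whose JVM descriptor is <init>()V.
    static boolean hasNoArgConstructor(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }
}
```

With the Hadoop jars on the classpath, the same check could be run against
Class.forName("org.apache.hadoop.security.UserGroupInformation") to see
what the deployed jar contains, as opposed to what the online
documentation describes.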

Now, my question is: is this something that can be fixed by shims? Or is it a problem with Hadoop?

-----Original Message-----
From: Zheng Shao [mailto:zshao9@gmail.com] 
Sent: Saturday, February 27, 2010 4:24 AM
To: hive-user@hadoop.apache.org
Subject: Re: hive 0.50 on hadoop 0.22

Hi Mazar,

We have not tried Hive on Hadoop higher than 0.20 yet.

However, Hive has the shim infrastructure which makes it easy to port
to new Hadoop versions.
Please see the shim directory inside Hive.

Zheng

On Fri, Feb 26, 2010 at 1:59 PM, Massoud Mazar <Ma...@avg.com> wrote:
> Is it possible to run release-0.5.0-rc0 on top of hadoop 0.22.0 (trunk)?
>



-- 
Yours,
Zheng

Re: hive 0.50 on hadoop 0.22

Posted by Zheng Shao <zs...@gmail.com>.
Hi Mazar,

We have not tried Hive on Hadoop versions higher than 0.20 yet.

However, Hive has the shim infrastructure which makes it easy to port
to new Hadoop versions.
Please see the shim directory inside Hive.

Zheng

On Fri, Feb 26, 2010 at 1:59 PM, Massoud Mazar <Ma...@avg.com> wrote:
> Is it possible to run release-0.5.0-rc0 on top of hadoop 0.22.0 (trunk)?
>



-- 
Yours,
Zheng

hive 0.50 on hadoop 0.22

Posted by Massoud Mazar <Ma...@avg.com>.
Is it possible to run release-0.5.0-rc0 on top of hadoop 0.22.0 (trunk)?

Re: Hive User Group Meeting 3/18/2010 7pm at Facebook

Posted by Zheng Shao <zs...@gmail.com>.
Just a reminder that we have the Hive User Group Meeting this Thursday at Facebook.

Please register on
http://www.meetup.com/Hive-User-Group-Meeting/calendar/12741356/ if
you plan to come.

Zheng

On Mon, Mar 1, 2010 at 12:57 PM, Zheng Shao <zs...@gmail.com> wrote:
> We also created a Meetup group in case you prefer to register on meetup.com
>
> http://www.meetup.com/Hive-User-Group-Meeting/calendar/12741356/
>
> We are hosting a Hive User Group Meeting, open to all current and
> potential hadoop/hive users.
>
> Agenda:
> * Hive Tutorial (Carl Steinbach, cloudera): 20 min
> * Hive User Case Study (Eva Tse, netflix): 20 min
> * New Features and API (Hive team, Facebook): 25 min
> JDBC/ODBC and CTAS(Create Table As Select)
> UDF/UDAF/UDTF (User-defined Functions)
> Create View/HBaseInputFormat (Hive and HBase integration)
> Hive Join Strategy (How Hive does the join)
> SerDe (Hive's serialization/deserialization framework)
>
>
> Hive is a scalable data warehouse infrastructure built on top of
> Hadoop. It provides tools to enable easy data ETL, a mechanism to put
> structure on the data, and the capability to query and analyze
> large data sets stored in Hadoop files. Hive defines a simple SQL-like
> query language, called HiveQL, that enables users familiar with SQL to
> query the data. At the same time, this language also allows
> programmers who are familiar with MapReduce to plug in
> their custom mappers and reducers to perform more sophisticated
> analysis.
>
> The current largest deployment of Hive is the silver cluster at
> Facebook, which consists of 1100 nodes with 8 CPU cores and twelve
> 1 TB disks each. The total capacity is 8800 CPU cores with 13 PB of raw
> storage space. More than 4 TB of compressed data (20+ TB uncompressed)
> is loaded into Hive every day.
>
>
> If you'd like to network with fellow Hive/Hadoop users online, feel
> free to find them here:
> http://www.facebook.com/event.php?eid=319237846974
>
>
>
> Zheng
>
> On Fri, Feb 26, 2010 at 1:56 PM, Zheng Shao <zs...@gmail.com> wrote:
>> Hi all,
>>
>> We are going to hold the second Hive User Group Meeting at 7PM on
>> 3/18/2010 Thursday.
>>
>> The agenda will be:
>>
>> * Hive Tutorial: 20 min
>> * Hive User Case Study: 20 min
>> * New Features and API: 25 min
>>  JDBC/ODBC and CTAS
>>  UDF/UDAF/UDTF
>>  Create View/HBaseInputFormat
>>  Hive Join Strategy
>>  SerDe
>>
>> The audience is beginner to intermediate Hive users/developers.
>>
>> *** The details are here: http://www.facebook.com/event.php?eid=319237846974 ***
>> *** Please RSVP so we can schedule logistics accordingly. ***
>>
>> --
>> Yours,
>> Zheng
>>
>
>
>
> --
> Yours,
> Zheng
>



-- 
Yours,
Zheng

Re: Hive User Group Meeting 3/18/2010 7pm at Facebook

Posted by Zheng Shao <zs...@gmail.com>.
We also created a Meetup group in case you prefer to register on meetup.com

http://www.meetup.com/Hive-User-Group-Meeting/calendar/12741356/

We are hosting a Hive User Group Meeting, open to all current and
potential hadoop/hive users.

Agenda:
* Hive Tutorial (Carl Steinbach, Cloudera): 20 min
* Hive User Case Study (Eva Tse, Netflix): 20 min
* New Features and API (Hive team, Facebook): 25 min
 JDBC/ODBC and CTAS (Create Table As Select)
 UDF/UDAF/UDTF (User-defined Functions)
 Create View/HBaseInputFormat (Hive and HBase integration)
 Hive Join Strategy (How Hive does the join)
 SerDe (Hive's serialization/deserialization framework)


Hive is a scalable data warehouse infrastructure built on top of
Hadoop. It provides tools to enable easy data ETL, a mechanism to put
structure on the data, and the capability to query and analyze
large data sets stored in Hadoop files. Hive defines a simple SQL-like
query language, called HiveQL, that enables users familiar with SQL to
query the data. At the same time, this language also allows
programmers who are familiar with MapReduce to plug in
their custom mappers and reducers to perform more sophisticated
analysis.

The current largest deployment of Hive is the silver cluster at
Facebook, which consists of 1100 nodes with 8 CPU cores and twelve
1 TB disks each. The total capacity is 8800 CPU cores with 13 PB of raw
storage space. More than 4 TB of compressed data (20+ TB uncompressed)
is loaded into Hive every day.


If you'd like to network with fellow Hive/Hadoop users online, feel
free to find them here:
http://www.facebook.com/event.php?eid=319237846974



Zheng

On Fri, Feb 26, 2010 at 1:56 PM, Zheng Shao <zs...@gmail.com> wrote:
> Hi all,
>
> We are going to hold the second Hive User Group Meeting at 7PM on
> 3/18/2010 Thursday.
>
> The agenda will be:
>
> * Hive Tutorial: 20 min
> * Hive User Case Study: 20 min
> * New Features and API: 25 min
>  JDBC/ODBC and CTAS
>  UDF/UDAF/UDTF
>  Create View/HBaseInputFormat
>  Hive Join Strategy
>  SerDe
>
> The audience is beginner to intermediate Hive users/developers.
>
> *** The details are here: http://www.facebook.com/event.php?eid=319237846974 ***
> *** Please RSVP so we can schedule logistics accordingly. ***
>
> --
> Yours,
> Zheng
>



-- 
Yours,
Zheng
