Posted to user@hive.apache.org by Edward Capriolo <ed...@gmail.com> on 2010/01/27 22:00:02 UTC

HIVE-49 and other forms of CLI niceness

All,

Some simple features in Hive can really bring down the learning curve
for new users. I am teaching some people how to use Hive.

A buddy of mine did this.
hive> select * from mt_date_test;
OK
a	2010-01-01	NULL
b	2009-12-31	NULL
c	2010-01-27	NULL

hive> select * from mt_date_test where my_date > '2010-01-01';

2010-01-27 08:18:27,008 map = 100%,  reduce =100%
Ended Job = job_200909171715_20264
OK

I instantly suspected 1) whitespace 2) delimiters

hive> select key from mt_date_test;

OK
a       2010-01-01
b       2009-12-31
c       2010-01-27

!!BINGO!!

Should we use a pipe | or some other column delimiter like the mysql
CLI does, and have this be a property that is on by default?

hive.cli.columnseparator='\t'
hive.cli.columnseparator='|'

In its current state the user understandably made the assumption that
'>' does not work on strings.

Should we expose the format of the results in Driver so that the CLI
can effectively split the rows by column?
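
To make the ambiguity concrete, here is a minimal sketch in Python (not Hive code). It assumes the table has two columns, key and my_date, and that each whole input line landed in key, which is what the outputs above suggest. The render function and the rows list are hypothetical names for illustration, and hive.cli.columnseparator is only the property proposed in this message, not an existing setting.

```python
# Rows as the table appears to store them (an assumption based on the
# output above): the whole input line sits in key, and my_date is NULL.
rows = [
    ("a\t2010-01-01", None),
    ("b\t2009-12-31", None),
    ("c\t2010-01-27", None),
]


def render(row, separator):
    """Join one row's columns with the separator, printing NULL for None."""
    return separator.join("NULL" if col is None else col for col in row)


# With a tab separator, a tab embedded in key looks like a column boundary.
tab_lines = [render(row, "\t") for row in rows]

# With a pipe separator, the single real column boundary is visible.
pipe_lines = [render(row, "|") for row in rows]

for line in tab_lines + pipe_lines:
    print(line)
```

With the tab separator each line appears to have three fields. With the pipe separator it is clear that there are only two columns and that the date is part of key.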

RE: HIVE-49 and other forms of CLI niceness

Posted by Ashish Thusoo <at...@facebook.com>.

Looks like a good suggestion. Ideally the driver code should just return a structure that encodes the columns separately, as opposed to the single serialized string it returns today, and the formatting logic should all be in the CliDriver.

Ashish 
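
A rough sketch of that split, written in Python for brevity rather than as Hive's actual Java code. The names fetch_rows and format_row are hypothetical and do not correspond to the real Driver or CliDriver API. The driver side hands back each row as separate column values, and the CLI side alone decides on the separator, the NULL text, and how to show embedded tabs.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[str], ...]


def fetch_rows() -> List[Row]:
    """Hypothetical driver side: return columns separately, not one string."""
    return [
        ("a\t2010-01-01", None),
        ("b\t2009-12-31", None),
        ("c\t2010-01-27", None),
    ]


def format_row(row: Row, separator: str = "|", null_text: str = "NULL") -> str:
    """Hypothetical CLI side: all presentation choices are made here."""
    cells = []
    for col in row:
        if col is None:
            cells.append(null_text)
        else:
            # Show embedded tabs as a visible two-character escape.
            cells.append(col.replace("\t", "\\t"))
    return separator.join(cells)


for row in fetch_rows():
    print(format_row(row))
```

Because the rows arrive as structured values, changing the separator or escaping rule touches only the formatting function.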

-----Original Message-----
From: Edward Capriolo [mailto:edlinuxguru@gmail.com] 
Sent: Wednesday, January 27, 2010 1:00 PM
To: hive-user@hadoop.apache.org
Subject: HIVE-49 and other forms of CLI niceness

All,

Some simple features in Hive can really bring down the learning curve for new users. I am teaching some people how to use Hive.

A buddy of mine did this.
hive> select * from mt_date_test;
OK
a	2010-01-01	NULL
b	2009-12-31	NULL
c	2010-01-27	NULL

hive> select * from mt_date_test where my_date > '2010-01-01';

2010-01-27 08:18:27,008 map = 100%,  reduce =100%
Ended Job = job_200909171715_20264
OK

I instantly suspected 1) whitespace 2) delimiters

hive> select key from mt_date_test;

OK
a       2010-01-01
b       2009-12-31
c       2010-01-27

!!BINGO!!

Should we use a pipe | or some other column delimiter like the mysql CLI does, and have this be a property that is on by default?

hive.cli.columnseparator='\t'
hive.cli.columnseparator='|'

In its current state the user understandably made the assumption that '>' does not work on strings.

Should we expose the format of the results in Driver so that the CLI can effectively split the rows by column?
