Posted to common-user@hadoop.apache.org by Zheng Shao <zs...@facebook.com> on 2009/10/06 00:16:25 UTC

RE: Pig and Hive on the same data?

Hi Chris,

Hive does not mandate a separator either.

CREATE TABLE mytable (a STRING, b INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

If you don't specify "ROW FORMAT ...", the default field delimiter is Ctrl-A.
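
To make the delimiter choice concrete, here is a minimal Python sketch (not Hive code) of how one line of delimited text maps onto the columns (a STRING, b INT). The helper name parse_row and the sample values are made up for illustration only.

```python
# Illustration only: how one line of delimited text maps to the
# columns (a STRING, b INT). parse_row is a hypothetical helper.
CTRL_A = "\x01"  # the Ctrl-A character, code point 1


def parse_row(line, sep=CTRL_A):
    """Split one text line into (a, b) using the given field separator."""
    a, b = line.rstrip("\n").split(sep)
    return a, int(b)


# Same logical row, stored with two different separators.
ctrl_a_row = parse_row("hello" + CTRL_A + "42\n")
tab_row = parse_row("hello\t42\n", sep="\t")
print(ctrl_a_row == tab_row)
```

Either separator yields the same logical row, so the choice only has to be consistent between the data and the table definition.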

Also, if you have more questions on Hive, please post on the hive-user list for the same reason.


Zheng
-----Original Message-----
From: Ashutosh Chauhan [mailto:ashutosh.chauhan@gmail.com] 
Sent: Wednesday, September 30, 2009 8:46 AM
To: common-user@hadoop.apache.org
Cc: core-user@hadoop.apache.org
Subject: Re: Pig and Hive on the same data?

Hi Chris,

Pig doesn't mandate Ctrl-A or any other character as the field
delimiter. You can tell Pig which delimiter to use. For example, you can
specify Ctrl-A as the field delimiter as follows:

A = load 'mydata' using PigStorage('\u0001');

If you don't specify a delimiter, as in A = load 'mydata'; then tab is assumed
to be the delimiter.
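
The escape '\u0001' used above denotes the same single character as Ctrl-A,
which Hive DDL usually writes as '\001'. As a quick sanity check, and only as
an illustration (Python happens to accept the same escape spellings), the
following sketch confirms they all refer to code point 1:

```python
# Python accepts these escape spellings too, so this shows that they
# all denote one character: code point 1 (Ctrl-A, also called SOH).
spellings = ["\u0001", "\001", "\x01", chr(1)]
same = all(s == chr(1) for s in spellings)
print(same)
```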

Also, if you have more questions on Pig, please post on the pig-user list to
get a faster response.

Thanks,
Ashutosh

On Wed, Sep 30, 2009 at 10:55, dumbfounder <ch...@searchles.com> wrote:

>
> We would like to use the same data for Pig and Hive queries for flexibility,
> has anyone done this without having 2 copies of the data? Hive seems to only
> want to work with CTRL-A delimited data, and I don't see a way to specify
> CTRL-A as a delimiter for Pig. Is there another efficient regex that people
> have used for Pig, or has anyone figured out a way to use delimiters that
> aren't CTRL-A for Hive? Or are there any other outside the box ideas?
> --
> View this message in context:
> http://www.nabble.com/Pig-and-Hive-on-the-same-data--tp25682735p25682735.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>