Posted to user@hive.apache.org by zuohua zhang <zu...@gmail.com> on 2012/08/07 06:09:12 UTC

question on output hive table to file

I have used the following to output a hive table to a file:
DROP TABLE IF EXISTS myoutputtable;
CREATE TABLE myoutputtable
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
AS
select
*
from originaltable;
INSERT OVERWRITE DIRECTORY '/outputtable.txt'
select * from myoutputtable;

Then I used
hadoop dfs -getmerge /outputtable.txt /mnt/

but the /mnt/outputtable.txt file contains strange ^A characters.
What did I do wrong?

Re: question on output hive table to file

Posted by Gabi D <ga...@gmail.com>.
I haven't tried this, but since your myoutputtable table is tab delimited,
and if this format suits your needs, you could create it as an external
table and specify its Hadoop path, then run the getmerge command off of
that location (without needing the 'insert overwrite directory ...' command,
so you even gain some performance).
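
A rough, untested sketch of that approach (the location and the column list
are placeholders; note that CREATE TABLE ... AS SELECT cannot create an
external table, so the columns have to be spelled out):

CREATE EXTERNAL TABLE myoutputtable (col1 STRING, col2 INT)  -- placeholder: list all columns of originaltable here
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/hive/export/myoutputtable';
INSERT OVERWRITE TABLE myoutputtable
select * from originaltable;

then, with no 'insert overwrite directory' step at all:
hadoop dfs -getmerge /user/hive/export/myoutputtable /mnt/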

On Tue, Aug 7, 2012 at 8:04 AM, Vinod Singh <vi...@vinodsingh.com> wrote:

> If output file is not too big then ^A can be replaced by using simple
> command like-
>
> $ tr "\001" "," < src_file > out_file
>
> Thanks,
> Vinod
>
>
> On Tue, Aug 7, 2012 at 10:27 AM, zuohua zhang <zu...@gmail.com> wrote:
>
>> Thanks so much!!!!!!!!! that did work. I have 200+ columns so it is quite
>> an ugly thing. No shortcut?
>>
>>
>> On Mon, Aug 6, 2012 at 9:50 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>>
>>> Change the query to something like-
>>>
>>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>>> select concat(col1, ',', col2, ',', col3)  from myoutputtable;
>>>
>>> That way columns will be separated by ,.
>>>
>>> Thanks,
>>> Vinod
>>>
>>>
>>> On Tue, Aug 7, 2012 at 10:16 AM, zuohua zhang <zu...@gmail.com> wrote:
>>>
>>>> I used the following that it won't help?
>>>>
>>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>>
>>>> On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>>>>
>>>>> Columns of a Hive table are separated by ^A character. Instead of
>>>>> doing a "SELECT * ", you may like to use concat function to have a
>>>>> separator of your choice.
>>>>>
>>>>> Thanks,
>>>>> Vinod
>>>>>
>>>>>
>>>>> On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
>>>>>
>>>>>> I have used the following to output a hive table to a file:
>>>>>> DROP TABLE IF EXISTS myoutputable;
>>>>>> CREATE TABLE myoutputtable
>>>>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>>>> STORED AS TEXTFILE
>>>>>> AS
>>>>>> select
>>>>>> *
>>>>>> from originaltable;
>>>>>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>>>>>> select * from myoutputtable;
>>>>>>
>>>>>> then i used
>>>>>> hadoop dfs -getmerge /outputtable.txt /mnt/
>>>>>>
>>>>>> but the /mnt/outputtable.txt file shows strange characters ^A in the
>>>>>> file. What did I do wrong?
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: question on output hive table to file

Posted by Vinod Singh <vi...@vinodsingh.com>.
If the output file is not too big, then ^A can be replaced using a simple
command like-

$ tr "\001" "," < src_file > out_file
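
With the file names used earlier in this thread, that would be, for
example (the .csv name is just an illustration):

$ tr "\001" "," < /mnt/outputtable.txt > /mnt/outputtable.csv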

Thanks,
Vinod

On Tue, Aug 7, 2012 at 10:27 AM, zuohua zhang <zu...@gmail.com> wrote:

> Thanks so much!!!!!!!!! that did work. I have 200+ columns so it is quite
> an ugly thing. No shortcut?
>
>
> On Mon, Aug 6, 2012 at 9:50 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>
>> Change the query to something like-
>>
>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>> select concat(col1, ',', col2, ',', col3)  from myoutputtable;
>>
>> That way columns will be separated by ,.
>>
>> Thanks,
>> Vinod
>>
>>
>> On Tue, Aug 7, 2012 at 10:16 AM, zuohua zhang <zu...@gmail.com> wrote:
>>
>>> I used the following that it won't help?
>>>
>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>
>>> On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>>>
>>>> Columns of a Hive table are separated by ^A character. Instead of doing
>>>> a "SELECT * ", you may like to use concat function to have a separator of
>>>> your choice.
>>>>
>>>> Thanks,
>>>> Vinod
>>>>
>>>>
>>>> On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
>>>>
>>>>> I have used the following to output a hive table to a file:
>>>>> DROP TABLE IF EXISTS myoutputable;
>>>>> CREATE TABLE myoutputtable
>>>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>>> STORED AS TEXTFILE
>>>>> AS
>>>>> select
>>>>> *
>>>>> from originaltable;
>>>>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>>>>> select * from myoutputtable;
>>>>>
>>>>> then i used
>>>>> hadoop dfs -getmerge /outputtable.txt /mnt/
>>>>>
>>>>> but the /mnt/outputtable.txt file shows strange characters ^A in the
>>>>> file. What did I do wrong?
>>>>>
>>>>
>>>>
>>>
>>
>

RE: question on output hive table to file

Posted by Tony Burton <TB...@SportingIndex.com>.
I use the following example to set my own delimiter; I hope it's easy to adjust for your own needs:

hive> create external table input (a int, b string, c float) row format delimited fields terminated by "\t" stored as sequencefile location 's3://path/to/data/input/';
hive> create external table output (a int, b string, c float) row format delimited fields terminated by "~" stored as textfile location 's3://path/to/data/output/';
hive> insert overwrite table output select * from input;
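
Assuming the output table's location were on HDFS rather than S3, a single
local file could then be pulled the same way as in the original post (the
path below is a placeholder):

hadoop dfs -getmerge /path/to/data/output /mnt/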

Let me know if it works!

Tony




From: zuohua zhang [mailto:zuohua@gmail.com]
Sent: 07 August 2012 05:58
To: user@hive.apache.org
Subject: Re: question on output hive table to file

Thanks so much!!!!!!!!! that did work. I have 200+ columns so it is quite an ugly thing. No shortcut?
On Mon, Aug 6, 2012 at 9:50 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
Change the query to something like-

INSERT OVERWRITE DIRECTORY '/outputable.txt'
select concat(col1, ',', col2, ',', col3)  from myoutputtable;

That way columns will be separated by ,.

Thanks,
Vinod

On Tue, Aug 7, 2012 at 10:16 AM, zuohua zhang <zu...@gmail.com> wrote:
I used the following that it won't help?

ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
Columns of a Hive table are separated by ^A character. Instead of doing a "SELECT * ", you may like to use concat function to have a separator of your choice.

Thanks,
Vinod

On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
I have used the following to output a hive table to a file:
DROP TABLE IF EXISTS myoutputable;
CREATE TABLE myoutputtable
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
AS
select
*
from originaltable;
INSERT OVERWRITE DIRECTORY '/outputable.txt'
select * from myoutputtable;

then i used
hadoop dfs -getmerge /outputtable.txt /mnt/

but the /mnt/outputtable.txt file shows strange characters ^A in the file. What did I do wrong?






Re: question on output hive table to file

Posted by zuohua zhang <zu...@gmail.com>.
Thanks so much, that did work! I have 200+ columns, so it is quite
an ugly thing. Is there no shortcut?

On Mon, Aug 6, 2012 at 9:50 PM, Vinod Singh <vi...@vinodsingh.com> wrote:

> Change the query to something like-
>
> INSERT OVERWRITE DIRECTORY '/outputable.txt'
> select concat(col1, ',', col2, ',', col3)  from myoutputtable;
>
> That way columns will be separated by ,.
>
> Thanks,
> Vinod
>
>
> On Tue, Aug 7, 2012 at 10:16 AM, zuohua zhang <zu...@gmail.com> wrote:
>
>> I used the following that it won't help?
>>
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>
>> On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>>
>>> Columns of a Hive table are separated by ^A character. Instead of doing
>>> a "SELECT * ", you may like to use concat function to have a separator of
>>> your choice.
>>>
>>> Thanks,
>>> Vinod
>>>
>>>
>>> On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
>>>
>>>> I have used the following to output a hive table to a file:
>>>> DROP TABLE IF EXISTS myoutputable;
>>>> CREATE TABLE myoutputtable
>>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>>> STORED AS TEXTFILE
>>>> AS
>>>> select
>>>> *
>>>> from originaltable;
>>>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>>>> select * from myoutputtable;
>>>>
>>>> then i used
>>>> hadoop dfs -getmerge /outputtable.txt /mnt/
>>>>
>>>> but the /mnt/outputtable.txt file shows strange characters ^A in the
>>>> file. What did I do wrong?
>>>>
>>>
>>>
>>
>

Re: question on output hive table to file

Posted by Vinod Singh <vi...@vinodsingh.com>.
Change the query to something like-

INSERT OVERWRITE DIRECTORY '/outputable.txt'
select concat(col1, ',', col2, ',', col3)  from myoutputtable;

That way, the columns will be separated by commas.
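
A variation on the same idea that only states the separator once is
concat_ws, assuming the columns are strings or are cast to string first
(concat_ws only accepts string arguments), e.g.:

select concat_ws(',', col1, cast(col2 as string), col3) from myoutputtable;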

Thanks,
Vinod

On Tue, Aug 7, 2012 at 10:16 AM, zuohua zhang <zu...@gmail.com> wrote:

> I used the following that it won't help?
>
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>
> On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>
>> Columns of a Hive table are separated by ^A character. Instead of doing a
>> "SELECT * ", you may like to use concat function to have a separator of
>> your choice.
>>
>> Thanks,
>> Vinod
>>
>>
>> On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
>>
>>> I have used the following to output a hive table to a file:
>>> DROP TABLE IF EXISTS myoutputable;
>>> CREATE TABLE myoutputtable
>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>>> STORED AS TEXTFILE
>>> AS
>>> select
>>> *
>>> from originaltable;
>>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>>> select * from myoutputtable;
>>>
>>> then i used
>>> hadoop dfs -getmerge /outputtable.txt /mnt/
>>>
>>> but the /mnt/outputtable.txt file shows strange characters ^A in the
>>> file. What did I do wrong?
>>>
>>
>>
>

Re: question on output hive table to file

Posted by zuohua zhang <zu...@gmail.com>.
I used the following; won't that help?
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'

On Mon, Aug 6, 2012 at 9:43 PM, Vinod Singh <vi...@vinodsingh.com> wrote:

> Columns of a Hive table are separated by ^A character. Instead of doing a
> "SELECT * ", you may like to use concat function to have a separator of
> your choice.
>
> Thanks,
> Vinod
>
>
> On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:
>
>> I have used the following to output a hive table to a file:
>> DROP TABLE IF EXISTS myoutputable;
>> CREATE TABLE myoutputtable
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
>> STORED AS TEXTFILE
>> AS
>> select
>> *
>> from originaltable;
>> INSERT OVERWRITE DIRECTORY '/outputable.txt'
>> select * from myoutputtable;
>>
>> then i used
>> hadoop dfs -getmerge /outputtable.txt /mnt/
>>
>> but the /mnt/outputtable.txt file shows strange characters ^A in the
>> file. What did I do wrong?
>>
>
>

Re: question on output hive table to file

Posted by Vinod Singh <vi...@vinodsingh.com>.
Columns of a Hive table are separated by the ^A character by default.
Instead of doing a "SELECT *", you may like to use the concat function to
get a separator of your choice.
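
^A is the non-printing byte \001, Hive's default field delimiter, and in
the Hive versions current at the time of this thread INSERT OVERWRITE
DIRECTORY always wrote with that default, regardless of the source table's
ROW FORMAT. A quick way to see the bytes in the merged local file (file
name taken from the original post):

$ head -1 /mnt/outputtable.txt | cat -v

cat -v prints each \001 byte as ^A.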

Thanks,
Vinod

On Tue, Aug 7, 2012 at 9:39 AM, zuohua zhang <zu...@gmail.com> wrote:

> I have used the following to output a hive table to a file:
> DROP TABLE IF EXISTS myoutputable;
> CREATE TABLE myoutputtable
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE
> AS
> select
> *
> from originaltable;
> INSERT OVERWRITE DIRECTORY '/outputable.txt'
> select * from myoutputtable;
>
> then i used
> hadoop dfs -getmerge /outputtable.txt /mnt/
>
> but the /mnt/outputtable.txt file shows strange characters ^A in the file.
> What did I do wrong?
>