Posted to dev@hive.apache.org by Bhavesh Shah <bh...@gmail.com> on 2012/02/15 10:03:29 UTC

Doubt in INSERT query in Hive?

Hello,
Whenever we want to insert into a table, we use:
INSERT OVERWRITE TABLE TBL_NAME
(SELECT ....)
Because of this, the table gets overwritten every time.

I don't want to overwrite the table; I want to append to it every time.
I thought about LOAD DATA, but writing the file may take more time, and I
don't think it will be efficient.

Does Hive support INSERT INTO TABLE TAB_NAME?
(I am using hive-0.7.1.)
Is there a patch for it? (I don't know how to apply a patch, though.)

Please advise as soon as possible.
Thanks.



-- 
Regards,
Bhavesh Shah

Re: Doubt in INSERT query in Hive?

Posted by hadoop hive <ha...@gmail.com>.
If you want to append data, you can use partitioning: create a new
partition every time.

On Wed, Feb 15, 2012 at 3:33 PM, Gabi D <ga...@gmail.com> wrote:

> Hi Bhavesh,
> You could consider partitioning your table. Then every insert would go to
> a different partition without overwriting the previous ones, and a
> SELECT * would still read all partitions. Depending on your use case, this
> might also help with queries that need only the data from a certain
> run/partition.
>
>
> On Wed, Feb 15, 2012 at 11:45 AM, <be...@yahoo.com> wrote:
>
>> Bhavesh,
>> If you are not using INSERT INTO, you will need a temporary table: write
>> the query output to it, then load that data from there into your target
>> table's data directory.
>> You are not writing anything to a file during the LOAD DATA operation.
>> Rather, you are just moving the files (in HDFS) from the source location
>> to the table's data directory (where the previous data files are already
>> present). An HDFS move is only a metadata operation at the file system
>> level.
>>
>> Go with INSERT INTO, as it is the cleaner way from an HQL perspective.
>> Regards
>> Bejoy K S
>>
>> From handheld, Please excuse typos.
>> ------------------------------
>> *From: * Bhavesh Shah <bh...@gmail.com>
>> *Date: *Wed, 15 Feb 2012 15:03:07 +0530
>> *To: *<us...@hive.apache.org>; <be...@yahoo.com>
>> *ReplyTo: * user@hive.apache.org
>> *Subject: *Re: Doubt in INSERT query in Hive?
>>
>> Hi Bejoy K S,
>> Thanks for your reply.
>> The overhead is that my SELECT query has about 85 columns. Writing
>> this to a file and then loading it again may take some time.
>> For that reason, I think it will be inefficient.
>>
>>
>>
>> --
>> Regards,
>> Bhavesh Shah
>>
>>
>> On Wed, Feb 15, 2012 at 2:51 PM, <be...@yahoo.com> wrote:
>>
>>> Hi Bhavesh,
>>> INSERT INTO is supported in Hive 0.8. An upgrade would get things
>>> rolling for you.
>>> LOAD DATA inefficient? What performance overhead were you facing
>>> here?
>>> Regards
>>> Bejoy K S
>>>
>>> From handheld, Please excuse typos.
>>> ------------------------------
>>> *From: * Bhavesh Shah <bh...@gmail.com>
>>> *Date: *Wed, 15 Feb 2012 14:33:29 +0530
>>> *To: *<us...@hive.apache.org>; <de...@hive.apache.org>
>>> *ReplyTo: * user@hive.apache.org
>>> *Subject: *Doubt in INSERT query in Hive?
>>>
>>> Hello,
>>> Whenever we want to insert into a table, we use:
>>> INSERT OVERWRITE TABLE TBL_NAME
>>> (SELECT ....)
>>> Because of this, the table gets overwritten every time.
>>>
>>> I don't want to overwrite the table; I want to append to it every time.
>>> I thought about LOAD DATA, but writing the file may take more time, and
>>> I don't think it will be efficient.
>>>
>>> Does Hive support INSERT INTO TABLE TAB_NAME?
>>> (I am using hive-0.7.1.)
>>> Is there a patch for it? (I don't know how to apply a patch, though.)
>>>
>>> Please advise as soon as possible.
>>> Thanks.
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Bhavesh Shah
>>>
>>>
>>
>>
>

Re: Doubt in INSERT query in Hive?

Posted by Gabi D <ga...@gmail.com>.
Hi Bhavesh,
You could consider partitioning your table. Then every insert would go to a
different partition without overwriting the previous ones, and a SELECT *
would still read all partitions. Depending on your use case, this might
also help with queries that need only the data from a certain run/partition.
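
For illustration, the partition-per-run idea might look roughly like the
sketch below. All table and column names (run_results, source_table, run_id,
id, value) and the partition value are hypothetical and are not taken from
this thread; adjust them to your own schema.

```sql
-- Hypothetical names throughout; this is a sketch, not a tested script.
CREATE TABLE run_results (
  id    INT,
  value STRING
)
PARTITIONED BY (run_id STRING);

-- Each run writes to its own partition. INSERT OVERWRITE with a
-- PARTITION clause replaces only that partition, so the partitions
-- written by earlier runs are left untouched.
INSERT OVERWRITE TABLE run_results PARTITION (run_id = '2012-02-15')
SELECT id, value
FROM source_table;

-- Without a filter on the partition column, all partitions are read.
SELECT * FROM run_results;

-- A filter on the partition column limits the read to a single run.
SELECT * FROM run_results WHERE run_id = '2012-02-15';
```

Since this uses only INSERT OVERWRITE, it should not depend on INSERT INTO
being available.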


Re: Doubt in INSERT query in Hive?

Posted by be...@yahoo.com.
Bhavesh,
If you are not using INSERT INTO, you will need a temporary table: write the
query output to it, then load that data from there into your target table's
data directory.
You are not writing anything to a file during the LOAD DATA operation.
Rather, you are just moving the files (in HDFS) from the source location to
the table's data directory (where the previous data files are already
present). An HDFS move is only a metadata operation at the file system level.

Go with INSERT INTO, as it is the cleaner way from an HQL perspective.
Regards
Bejoy K S

From handheld, Please excuse typos.
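
As a rough sketch of the temporary-table workaround described above: the
names target_tbl, tmp_tbl, source_table, col_a and col_b are hypothetical,
and the HDFS path assumes the default warehouse location, which may differ
on your cluster.

```sql
-- Hypothetical names throughout; this is a sketch, not a tested script.
-- tmp_tbl copies the schema and storage format of target_tbl, because
-- LOAD DATA moves files as they are and does not convert them.
CREATE TABLE tmp_tbl LIKE target_tbl;

-- Step 1: write the query output to the temporary table.
INSERT OVERWRITE TABLE tmp_tbl
SELECT col_a, col_b
FROM source_table;

-- Step 2: move the temporary table's files into the target table.
-- Without the OVERWRITE keyword, the files already in target_tbl are kept.
-- The path below is an assumption; DESCRIBE EXTENDED tmp_tbl shows the
-- actual location.
LOAD DATA INPATH '/user/hive/warehouse/tmp_tbl'
INTO TABLE target_tbl;
```

One thing worth checking on a test table first: depending on the Hive
version, a moved file whose name matches a file already present in the
target directory may replace that file rather than sit alongside it.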



Re: Doubt in INSERT query in Hive?

Posted by Bhavesh Shah <bh...@gmail.com>.
Hi Bejoy K S,
Thanks for your reply.
The overhead is that my SELECT query has about 85 columns. Writing this
to a file and then loading it again may take some time.
For that reason, I think it will be inefficient.



-- 
Regards,
Bhavesh Shah



Re: Doubt in INSERT query in Hive?

Posted by be...@yahoo.com.
Hi Bhavesh,
INSERT INTO is supported in Hive 0.8. An upgrade would get things rolling
for you.
LOAD DATA inefficient? What performance overhead were you facing here?

Regards
Bejoy K S

From handheld, Please excuse typos.
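
For reference, the append form asked about in the original question would
look roughly like this after an upgrade. The names target_tbl, source_table,
col_a and col_b are hypothetical placeholders.

```sql
-- Requires Hive 0.8 or later, per the reply above.
-- Hypothetical names; the new rows are added to target_tbl and the
-- existing rows are kept.
INSERT INTO TABLE target_tbl
SELECT col_a, col_b
FROM source_table;
```

The statement has the same shape as the INSERT OVERWRITE version, so an
existing query should only need the keyword changed.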


