Posted to user@hive.apache.org by Josh Ferguson <jo...@besquared.net> on 2009/01/12 06:48:53 UTC

INSERT OVERWRITE not working with map/reduce transform

I have a query that returns the proper results:

SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;

But when I do

INSERT OVERWRITE TABLE percentiles
SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;

It says it loads data into the percentiles table but when I ask for  
data from that table I get:

hive> SELECT actor_id, percentile, count FROM percentiles;
FAILED: Error in semantic analysis: org.apache.hadoop.hive.ql.metadata.HiveException: Path /user/hive/warehouse/percentiles not a valid path

$ hadoop fs -ls /user/hive/warehouse/percentiles/
Found 1 items
-rw-r--r--   1 Josh supergroup          0 2009-01-11 21:45 /user/hive/warehouse/percentiles/attempt_200901112100_0010_r_000000_0

It's nothing but an empty file.

Am I doing something wrong?

Josh Ferguson
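
For readers without the original script: below is a minimal Ruby sketch of what a transform script of this shape might look like. It is hypothetical (the thread never shows '/my/script.rb'); it reads one actor_id per line from stdin and emits tab-separated actor_id, percentile, count rows, which is the output contract discussed later in the thread.

#!/usr/bin/env ruby
# Hypothetical stand-in for '/my/script.rb'; not the poster's actual script.
# Hive streams one input row per line on stdin. Here each row is a single
# actor_id, and CLUSTER BY sends equal ids to the same reducer.
counts = Hash.new(0)
STDIN.each_line do |line|
  actor_id = line.chomp.split("\t").first
  counts[actor_id] += 1 unless actor_id.nil? || actor_id.empty?
end

# Rank actors by count and emit one tab-separated line per actor:
# actor_id, percentile, count. Hive splits each output line on tabs to
# fill AS (actor_id, percentile, count). Note the percentile here is
# computed only over the rows this reducer saw.
ranked = counts.sort_by { |_, count| count }
ranked.each_with_index do |(actor_id, count), i|
  percentile = ((i + 1) * 100) / ranked.size
  puts [actor_id, percentile, count].join("\t")
end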

Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Prasad Chakka <pr...@facebook.com>.
I think it is the same problem as Jeremy Chow's (the names configured in hadoop-site.xml and hive-default.xml seem to be different).
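
A hypothetical illustration of that kind of mismatch, assuming the setting in question is fs.default.name (the default filesystem URI; the host and port values below are made up). If the two files disagree like this, warehouse paths recorded under one URI fail validation under the other:

<!-- hadoop-site.xml -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode:9000</value>
</property>

<!-- hive-default.xml: a stale copy of the same property pointing at a
     different filesystem -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>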




RE: INSERT OVERWRITE not working with map/reduce transform

Posted by Ashish Thusoo <at...@facebook.com>.
Can you also post the output of

explain <query>

Thanks,
Ashish
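
That is, prefix the failing statement with EXPLAIN. Using the query from the original post, that would be:

hive> EXPLAIN
INSERT OVERWRITE TABLE percentiles
SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;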



Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Josh Ferguson <jo...@besquared.net>.
A follow-up:

If I insert into a local file and then load the file into the table using a load command, it works:

hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hdfs_out'
SELECT TRANSFORM(actor_id) USING '/Users/Josh/percentiles.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actor;

hive> LOAD DATA LOCAL INPATH '/tmp/hdfs_out/attempt_200901112100_0030_r_000000_0'
OVERWRITE INTO TABLE percentiles
PARTITION (account='cUU5T7y6DmdzMJFcFt3JDe', application='Test Application', dataset='Purchases', hour=342007, span=1);
Copying data from file:/tmp/hdfs_out/attempt_200901112100_0030_r_000000_0
Loading data to table percentiles partition {account=cUU5T7y6DmdzMJFcFt3JDe, application=Test Application, dataset=Purchases, hour=342007, span=1}
OK

hive> SELECT * FROM percentiles;
OK
00d5c3f0-1b29-4bd3-bc8c-e3f0a7ba5949	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
03e58605-7de7-48fb-9852-781700823a71	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
06986e3e-7c73-4466-b0dc-d92038e3f665	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
06af307c-0da6-4795-860a-6b26425bdbc8	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
0cca5b18-efe3-4903-9387-a8c7eb8432e4	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
0e20b565-59aa-4d57-8f68-ddcaaa75673b	100	1	cUU5T7y6DmdzMJFcFt3JDe	Test Application	Purchases	342007	1
...

So it works in two parts but not in one, and I'm not sure why.

Josh F.
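
For comparison, the one-step equivalent would name the partition directly on the INSERT, matching the partition the LOAD above targets. This is a sketch only; the thread does not confirm whether it resolves the original failure:

INSERT OVERWRITE TABLE percentiles
PARTITION (account='cUU5T7y6DmdzMJFcFt3JDe', application='Test Application', dataset='Purchases', hour=342007, span=1)
SELECT TRANSFORM(actor_id) USING '/Users/Josh/percentiles.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actor;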



Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Zheng Shao <zs...@gmail.com>.
Did the map-reduce job produce any output? Can you check the mapred job
page?

If so, then the "moveTask" after the map-reduce job must have failed.

Zheng


Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Prasad Chakka <pr...@facebook.com>.
I have never used 'int' partition keys.

You might want to try inserting the transform output into an HDFS file using 'insert overwrite directory' and then use 'load into table' to load that file into the correct partition.
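
Concretely, that two-step route might look like the sketch below. The directory and file names are hypothetical, and unlike the LOCAL variants in Josh's follow-up, these forms write to and read from HDFS:

INSERT OVERWRITE DIRECTORY '/tmp/percentiles_out'
SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;

LOAD DATA INPATH '/tmp/percentiles_out/attempt_200901112100_0030_r_000000_0'
OVERWRITE INTO TABLE percentiles
PARTITION (account='cUU5T7y6DmdzMJFcFt3JDe', application='Test Application', dataset='Purchases', hour=342007, span=1);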



Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Josh Ferguson <jo...@besquared.net>.
https://gist.github.com/3cb4be29625442c90140

Josh


Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Zheng Shao <zs...@gmail.com>.
It should be tab separated.

If you run it without the insert, is there any data on the screen?

Zheng


Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Josh Ferguson <jo...@besquared.net>.
The only thing I could figure is that my output is incorrect. Is the
output from the transform script supposed to be tab separated, or
separated by the delimiters of the table you're trying to insert into?
It doesn't seem to make any difference (my table is still empty no
matter which one I try), but I'd better make sure just in case.

Josh



Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Josh Ferguson <jo...@besquared.net>.
The transform works by itself, but when coupled with an overwrite, the
overwritten table contains nothing. If there are partitions, do they
have to be specified?

Josh F



Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Zheng Shao <zs...@gmail.com>.
Here are some examples:

[zshao@xxx /hive.root] find ./ql/src/test/queries/clientpositive -name '*.q' | xargs grep TRANSFORM
./ql/src/test/queries/clientpositive/input14_limit.q:  SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14_limit.q:  SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14.q:  SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14.q:  SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input18.q:  SELECT TRANSFORM(src.key, src.value, 1+2, 3+4)
./ql/src/test/queries/clientpositive/input18.q:  SELECT TRANSFORM(src.key, src.value, 1+2, 3+4)
./ql/src/test/queries/clientpositive/scriptfile1.q:  SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input5.q:  SELECT TRANSFORM(src_thrift.lint, src_thrift.lintstring)
./ql/src/test/queries/clientpositive/input5.q:  SELECT TRANSFORM(src_thrift.lint, src_thrift.lintstring)
./ql/src/test/queries/clientpositive/input17.q:  SELECT TRANSFORM(src_thrift.aint + src_thrift.lint[0], src_thrift.lintstring[0])
./ql/src/test/queries/clientpositive/input17.q:  SELECT TRANSFORM(src_thrift.aint + src_thrift.lint[0], src_thrift.lintstring[0])




-- 
Yours,
Zheng

Re: INSERT OVERWRITE not working with map/reduce transform

Posted by Josh Ferguson <jo...@besquared.net>.
Anyone have any word on why this might not work? Can someone give me  
an example of a query they use to INSERT OVERWRITE a table from a map  
and/or reduce job that I could use as a reference?

Josh F.
