Posted to user@pig.apache.org by kiranprasad <ki...@imimobile.com> on 2011/09/06 11:02:24 UTC

How to Generate single output file(part-m-0001) instead of multiple files

Hi

I am new to Pig. I would like to know how to generate a single output file using STORE.

Regards
Kiran.G

Re: How to Generate single output file(part-m-0001) instead of multiple files

Posted by Laukik Chitnis <la...@yahoo-inc.com>.
If you really want a single output file, you will have to pay the price of an additional MapReduce job with a single reducer (scalability alert: every record passes through that one reducer!):

Store (foreach (group output all) generate flatten($1)) into 'outputfile';
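
A minimal sketch of the same idea written as separate statements, in case the nested one-liner is hard to read. The alias `data` and both paths are placeholders, not names from this thread, and PARALLEL 1 is added here only to make the single reducer explicit:

```pig
-- 'input/data' and 'output/single' are hypothetical paths
data    = LOAD 'input/data' USING PigStorage(',');
-- GROUP ... ALL puts every record into one group, so one reducer receives all of it
grouped = GROUP data ALL PARALLEL 1;
-- $1 is the bag holding all records; FLATTEN turns it back into individual rows
flat    = FOREACH grouped GENERATE FLATTEN($1);
STORE flat INTO 'output/single' USING PigStorage(',');
```

With one reducer the output directory should contain a single part file (part-r-00000 rather than part-m-*), at the cost of pushing the whole data set through one task.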

Cheers,
Laukik


RE: How to Generate single output file(part-m-0001) instead of multiple files

Posted by Marek Miglinski <mm...@seven.com>.
"When writing to a file system, 'processed' [the STORE target in the quoted example] will be a directory with part files rather than a single file. But how many part files will be created? That depends on the parallelism of the last job before the store. If it has reduces, it will be determined by the parallel level set for that job (1). If it is a map-only job then it will be determined by the number of maps, which is controlled by Hadoop and not Pig."
(r) Alan Gates

1:
--defaultparallel.pig
set default_parallel 10;


Read here http://wiki.apache.org/pig/PigLatin#Increasing_the_parallelism
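
To connect the quote to the original question: if the last job before the STORE has a reduce phase, requesting one reducer should give one part file. A small sketch, assuming hypothetical paths and an ORDER BY on the first field purely to force a reduce phase:

```pig
-- defaultparallel.pig, adapted for a single output file
set default_parallel 1;   -- default number of reducers for jobs in this script

data   = LOAD 'input/data' USING PigStorage(',');
-- ORDER BY needs a reduce phase, so the reducer count above applies to it
sorted = ORDER data BY $0;
STORE sorted INTO 'output/result' USING PigStorage(',');
```

This does not help a map-only script (for example a plain LOAD, FILTER, STORE), because there the number of part files follows the number of maps, as the quote says.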




Re: How to Generate single output file(part-m-0001) instead of multiple files

Posted by kiranprasad <ki...@imimobile.com>.
Hi Marek,

Thanks for the quick response.
I have tried the STORE statement you suggested, but multiple files are still
generated inside the result folder ('output/result').
I would like to know how to generate a single output file only. Each
generated output file is 3000k in size.

Regards
Kiran.G


RE: How to Generate single output file(part-m-0001) instead of multiple files

Posted by Marek Miglinski <mm...@seven.com>.
Hi,

STORE param INTO 'output/result' USING PigStorage(',');

If your data is comma delimited.


Marek M.
