Posted to user@pig.apache.org by kiranprasad <ki...@imimobile.com> on 2011/09/06 11:02:24 UTC
How to Generate single output file(part-m-0001) instead of multiple files
Hi
I am new to Pig and would like to know how to generate only a single output file using STORE.
Regards
Kiran.G
Re: How to Generate single output file(part-m-0001) instead of multiple files
Posted by Laukik Chitnis <la...@yahoo-inc.com>.
If you really really want a single output file, you will have to pay the price of an additional MR job with a single reducer (scalability alert!):
Store (foreach (group output all) generate flatten($1)) into 'outputfile';
Cheers,
Laukik
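The one-liner above can also be spelled out one statement at a time. This is only a sketch: the input path, the output path and the relation name param (borrowed from the STORE example elsewhere in this thread) are assumptions, not part of the original reply. GROUP ... ALL collects every record under a single key, so the job that follows runs with one reducer and writes one part file.

```pig
-- Hypothetical input path; 'param' is the relation name used elsewhere in this thread.
param = LOAD 'input/data.csv' USING PigStorage(',');

-- GROUP ALL puts every record into one bag under a single key,
-- which forces a reduce phase with exactly one reducer.
grouped = GROUP param ALL;

-- $1 is the bag of original records; FLATTEN turns it back into plain rows.
flat = FOREACH grouped GENERATE FLATTEN($1);

-- Hypothetical output path.
STORE flat INTO 'output/single' USING PigStorage(',');
```

The result is still a directory ('output/single' here) holding a single part file, typically named part-r-00000. Every record passes through one reducer, so this is only sensible when the output is small.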
On Sep 6, 2011, at 2:33 AM, "Marek Miglinski" <mm...@seven.com> wrote:
> "When writing to a file system processed will be a directory with part files rather than a single file. But how many part files will be created? That depends on the parallelism of the last job before the store. If it has reduces, it will be determined by the parallel level set for that job (1). If it is a map-only job then it will be determined by the number of maps, which is controlled by Hadoop and not Pig."
> (r) Alan Gates
>
> 1:
> --defaultparallel.pig
> set default_parallel 10;
>
>
> Read here http://wiki.apache.org/pig/PigLatin#Increasing_the_parallelism
>
>
>
> -----Original Message-----
> From: kiranprasad [mailto:kiranprasad.g@imimobile.com]
> Sent: Tuesday, September 06, 2011 12:19 PM
> To: user@pig.apache.org
> Subject: Re: How to Generate single output file(part-m-0001) instead of multiple files
>
> Hi Marek,
>
> Thanks for quick response.
> I have tried it, after using the below mentioned, multiple files are generated inside the result folder( 'output/result' ).
> But I would like to know how to generate only single output file. the size of each outputfile generated is 3000k.
>
> Regards
> Kiran.G
>
> IMImobile Plot 770, Rd. 44 Jubilee Hills, Hyderabad - 500033
> M +91 9000170909 T +91 40 2355 5945 - Ext: 229 www.imimobile.com
> -----Original Message-----
> From: Marek Miglinski
> Sent: Tuesday, September 06, 2011 2:40 PM
> To: user@pig.apache.org
> Subject: RE: How to Generate single output file(part-m-0001) instead of multiple files
>
> Hi,
>
> STORE param INTO 'output/result' USING PigStorage(',');
>
> If your data is comma delimited.
>
>
> Marek M.
>
> -----Original Message-----
> From: kiranprasad [mailto:kiranprasad.g@imimobile.com]
> Sent: Tuesday, September 06, 2011 12:02 PM
> To: user@pig.apache.org
> Cc: kiranprasad
> Subject: How to Generate single output file(part-m-0001) instead of multiple files
>
> Hi
>
> I am new to PIG, I would like to know how to generate only single output file by using STORE.
>
> Regards
> Kiran.G
>
>
RE: How to Generate single output file(part-m-0001) instead of multiple files
Posted by Marek Miglinski <mm...@seven.com>.
"When writing to a file system processed will be a directory with part files rather than a single file. But how many part files will be created? That depends on the parallelism of the last job before the store. If it has reduces, it will be determined by the parallel level set for that job (1). If it is a map-only job then it will be determined by the number of maps, which is controlled by Hadoop and not Pig."
(r) Alan Gates
(1) For example:
--defaultparallel.pig
set default_parallel 10;
Read here http://wiki.apache.org/pig/PigLatin#Increasing_the_parallelism
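For the single-file case asked about in this thread, the same knob can be turned down rather than up. A minimal sketch, assuming the script already contains a reduce-side operator (ORDER here); the paths, field names and schema are made up for illustration. PARALLEL only affects jobs that have a reduce phase, so it does nothing for a map-only script.

```pig
-- Hypothetical input path and schema, for illustration only.
param = LOAD 'input/data.csv' USING PigStorage(',') AS (id:chararray, value:int);

-- ORDER needs a reduce phase; PARALLEL 1 asks for a single reducer,
-- so the final job writes a single part file.
sorted = ORDER param BY id PARALLEL 1;

STORE sorted INTO 'output/result' USING PigStorage(',');
```

Putting "set default_parallel 1;" at the top of the script should have a similar effect on every reduce job in it, at the cost of running all of them on one reducer.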
Re: How to Generate single output file(part-m-0001) instead of multiple files
Posted by kiranprasad <ki...@imimobile.com>.
Hi Marek,
Thanks for the quick response.
I tried the statement below, but multiple files are still generated inside
the result folder ('output/result').
I would like to know how to generate only a single output file. Each
output file generated is 3000k in size.
Regards
Kiran.G
IMImobile Plot 770, Rd. 44 Jubilee Hills, Hyderabad - 500033
M +91 9000170909 T +91 40 2355 5945 - Ext: 229 www.imimobile.com
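If the part files already exist, another option is to leave the job alone and concatenate them afterwards. A sketch, assuming the output directory from this thread and a hypothetical local file name; it relies on Pig's fs command, which hands its arguments to the Hadoop FsShell, so getmerge can be called from Grunt or from a script.

```pig
-- Concatenate every part file under output/result into one local file.
-- 'result.csv' is a hypothetical local destination.
fs -getmerge output/result result.csv
```

The merged file lands on the local file system, not in HDFS, and the part files are joined in listing order, so this suits cases where the row order across files does not matter.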
RE: How to Generate single output file(part-m-0001) instead of multiple files
Posted by Marek Miglinski <mm...@seven.com>.
Hi,
STORE param INTO 'output/result' USING PigStorage(',');
If your data is comma delimited.
Marek M.