Posted to user@hadoop.apache.org by samir das mohapatra <sa...@gmail.com> on 2013/06/03 11:34:54 UTC
How to get the intermediate mapper output file name
Hi all,
How can I get the mapper output file name inside the mapper?
or
How can I change the mapper output file name?
By default it looks like part-m-00000, part-m-00001, etc.
Regards,
samir.
Re: How to get the intermediate mapper output file name
Posted by Rahul Bhattacharjee <ra...@gmail.com>.
I think the naming format of the mapper and reducer split files is hard-wired
into the Hadoop code; however, you can prepend something to the beginning of
the file name, or even a directory, using the multiple output format.
thanks,
Rahul
On Mon, Jun 3, 2013 at 3:04 PM, samir das mohapatra <samir.helpdoc@gmail.com
> wrote:
> Hi all,
> How can I get the mapper output file name inside the mapper?
>
> or
>
> How can I change the mapper output file name?
> By default it looks like part-m-00000, part-m-00001, etc.
>
> Regards,
> samir.
>
Re: How to get the intermediate mapper output file name
Posted by Serega Sheypak <se...@gmail.com>.
See
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputFormat.html
- Case two: This class is used for a map-only job. The job wants to use an
  output file name that is either a part of the input file name of the input
  data, or some derivation of it.
- Case three: This class is used for a map-only job. The job wants to use an
  output file name that depends on both the keys and the input file name.
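One way to picture case three is a name derived from both the record key and the input file name. The plain-Java sketch below is only an illustration and does not use the Hadoop API; `KeyedOutputNames` and `forKey` are hypothetical names, and the `<key>/<input file name>` layout is an assumption, not something the Javadoc prescribes.

```java
// Hypothetical stand-alone illustration: derive an output path from a record
// key and the path of the input file the record came from.
class KeyedOutputNames {

    // Returns "<sanitized key>/<last component of inputPath>".
    static String forKey(String key, String inputPath) {
        // Keep only the file name part of the (slash-separated) input path.
        int slash = inputPath.lastIndexOf('/');
        String inputName = (slash >= 0) ? inputPath.substring(slash + 1) : inputPath;
        // Replace characters that would be awkward in a directory name.
        String safeKey = key.replaceAll("[^A-Za-z0-9_.-]", "_");
        return safeKey + "/" + inputName;
    }

    public static void main(String[] args) {
        System.out.println(forKey("US", "/logs/2013/clicks.log"));
    }
}
```

In an actual job, logic of this kind would live in a subclass of the output format linked above rather than in a free-standing helper.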
On Monday, June 3, 2013 at 13:34:54 UTC+4, samir das mohapatra wrote:
>
> Hi all,
> How can I get the mapper output file name inside the mapper?
>
> or
>
> How can I change the mapper output file name?
> By default it looks like part-m-00000, part-m-00001, etc.
>
> Regards,
> samir.
>
Re: How to get the intermediate mapper output file name
Posted by Rahul Bhattacharjee <ra...@gmail.com>.
Thanks Dino, good to know this.
On Mon, Jun 3, 2013 at 3:12 PM, Dino Kečo <di...@gmail.com> wrote:
> Hi Samir,
>
> File naming is defined in the FileOutputFormat class, and there is a property,
> mapreduce.output.basename, which you can use to tweak the file naming.
>
> Please check this code
> http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java#FileOutputFormat for
> more details (line 272).
>
> HTH
>
> Regards,
>
> Dino Kečo
> mail: dino.keco@gmail.com
> skype: dino.keco
> phone: +387 61 507 851
>
>
> On Mon, Jun 3, 2013 at 11:34 AM, samir das mohapatra <
> samir.helpdoc@gmail.com> wrote:
>
>> Hi all,
>> How can I get the mapper output file name inside the mapper?
>>
>> or
>>
>> How can I change the mapper output file name?
>> By default it looks like part-m-00000, part-m-00001, etc.
>>
>> Regards,
>> samir.
>>
>
>
Re: How to get the intermediate mapper output file name
Posted by Dino Kečo <di...@gmail.com>.
Hi Samir,
File naming is defined in the FileOutputFormat class, and there is a property,
mapreduce.output.basename,
which you can use to tweak the file naming.
Please check this code
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java#FileOutputFormat
for
more details (line 272).
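
As a rough illustration of the naming scheme described above: the part file name is built from a base name, a one-letter task type, and a zero-padded partition number. The sketch below models that in plain Java without any Hadoop dependency. The names `PartFileNames` and `uniqueFile` are made up for this example, and the five-digit padding is inferred from the default names quoted in this thread.

```java
import java.util.Locale;

// Hypothetical stand-alone model of how part file names are composed.
// It does not call Hadoop; it only mirrors the <base>-<m|r>-<NNNNN> pattern.
class PartFileNames {

    // Default base name, matching the "part" prefix seen in part-m-00000.
    static final String DEFAULT_BASE = "part";

    // Builds <base>-<taskType>-<partition padded to 5 digits><extension>.
    static String uniqueFile(String base, char taskType, int partition, String extension) {
        if (taskType != 'm' && taskType != 'r') {
            throw new IllegalArgumentException("taskType must be 'm' or 'r'");
        }
        if (partition < 0) {
            throw new IllegalArgumentException("partition must be non-negative");
        }
        return String.format(Locale.ROOT, "%s-%c-%05d%s", base, taskType, partition, extension);
    }

    public static void main(String[] args) {
        System.out.println(uniqueFile(DEFAULT_BASE, 'm', 0, ""));   // part-m-00000
        System.out.println(uniqueFile("events", 'm', 1, ".gz"));    // events-m-00001.gz
    }
}
```

In a real job the base name would be supplied through the job configuration, for example `job.getConfiguration().set("mapreduce.output.basename", "events")` before submission. Whether that property is honored depends on the Hadoop version and the output format in use, so it is worth checking against the FileOutputFormat source for your release.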
HTH
Regards,
Dino Kečo
mail: dino.keco@gmail.com
skype: dino.keco
phone: +387 61 507 851
On Mon, Jun 3, 2013 at 11:34 AM, samir das mohapatra <
samir.helpdoc@gmail.com> wrote:
> Hi all,
> How can I get the mapper output file name inside the mapper?
>
> or
>
> How can I change the mapper output file name?
> By default it looks like part-m-00000, part-m-00001, etc.
>
> Regards,
> samir.
>
Re: How to get the intermediate mapper output file name
Posted by dvohra <dv...@yahoo.com>.
The part-m-00000, part-m-00001 file names follow Hadoop's naming convention. To
use custom output file names, use the MultipleOutputs class.
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
With MultipleOutputs the file name may be customized as
<namedOutput>_<multiName>-(m|r)-<part-number>
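
To make the pattern above concrete, here is a small plain-Java sketch that assembles a name of the form `<namedOutput>_<multiName>-(m|r)-<part-number>`. It does not use the Hadoop API. `MultiOutputNames` and `fileName` are hypothetical names, and the five-digit part number is assumed from the default `part-m-00000` names.

```java
import java.util.Locale;

// Hypothetical stand-alone model of the multi named output file name pattern.
class MultiOutputNames {

    // Builds <namedOutput>_<multiName>-<m|r>-<part padded to 5 digits>.
    static String fileName(String namedOutput, String multiName, boolean mapSide, int part) {
        if (part < 0) {
            throw new IllegalArgumentException("part must be non-negative");
        }
        char type = mapSide ? 'm' : 'r';   // 'm' for map output, 'r' for reduce output
        return String.format(Locale.ROOT, "%s_%s-%c-%05d", namedOutput, multiName, type, part);
    }

    public static void main(String[] args) {
        System.out.println(fileName("seq", "A", true, 0));   // seq_A-m-00000
    }
}
```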
On Monday, June 3, 2013 2:34:54 AM UTC-7, samir das mohapatra wrote:
>
> Hi all,
> How can I get the mapper output file name inside the mapper?
>
> or
>
> How can I change the mapper output file name?
> By default it looks like part-m-00000, part-m-00001, etc.
>
> Regards,
> samir.
>
Re: How to get the intermediate mapper output file name
Posted by Raj K Singh <ra...@gmail.com>.
You can use *getInputFileBasedOutputFileName*(JobConf job, String name),
which generates the output file name based on a given name and the input file
name.
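
A simplified model of the idea, deriving an output name from the input file path: the sketch keeps the last few components ("legs") of the input path and falls back to the given name when no input path is known. This is plain Java, not the Hadoop method itself. `InputBasedNames`, `fromInputPath`, and the `trailingLegs` parameter are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch: build an output name from the trailing
// components of a slash-separated input path.
class InputBasedNames {

    // Returns the last `trailingLegs` components of inputPath joined by "/",
    // or `name` unchanged if there is no input path or trailingLegs <= 0.
    static String fromInputPath(String inputPath, String name, int trailingLegs) {
        if (inputPath == null || trailingLegs <= 0) {
            return name;
        }
        // Split on "/" and drop empty components (leading or doubled slashes).
        List<String> legs = new ArrayList<>();
        for (String leg : inputPath.split("/")) {
            if (!leg.isEmpty()) {
                legs.add(leg);
            }
        }
        if (legs.isEmpty()) {
            return name;
        }
        int from = Math.max(0, legs.size() - trailingLegs);
        return String.join("/", legs.subList(from, legs.size()));
    }

    public static void main(String[] args) {
        System.out.println(fromInputPath("/data/2013/06/03/clicks.log", "part-00000", 2));
    }
}
```

With `trailingLegs` set to 2, the example input path yields `03/clicks.log`, so output files land under a directory named after the input file's parent.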
thanks
::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370
On Mon, Jun 3, 2013 at 3:04 PM, samir das mohapatra <samir.helpdoc@gmail.com
> wrote:
> Hi all,
> How can I get the mapper output file name inside the mapper?
>
> or
>
> How can I change the mapper output file name?
> By default it looks like part-m-00000, part-m-00001, etc.
>
> Regards,
> samir.
>