Posted to common-user@hadoop.apache.org by YouPeng Yang <yy...@gmail.com> on 2013/04/22 04:54:36 UTC

Append MR output file to an existing HDFS file

Hi All

     Can I append an MR output file to an existing file on HDFS?

     I'm using CDH4.1.2 with MRv2.


Regards

Re: Append MR output file to an existing HDFS file

Posted by Jay Vyas <ja...@gmail.com>.
I might be misunderstanding, but do you want each Reducer to append its
output to a corresponding file that already exists in HDFS?

Remember that the reducers usually output a glob of part files, so your
output will have several parts. The append therefore has to be done so
that each new reducer partition corresponds to an old partition.
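
To make the partition-to-file correspondence concrete, here is a minimal
sketch in plain Java. It assumes the part-r-NNNNN naming that the new-API
FileOutputFormat uses for reduce output (the old mapred API names files
part-NNNNN instead). The class and method names are hypothetical, chosen
only for illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps reducer partition numbers to part file names and back.
final class PartFileNames {
    // Assumed naming scheme: part-r-00000, part-r-00001, ...
    private static final Pattern PART = Pattern.compile("part-r-(\\d{5,})");

    /** Returns the part file name that partition N would write to. */
    static String nameFor(int partition) {
        if (partition < 0) {
            throw new IllegalArgumentException("partition must be >= 0");
        }
        return String.format(Locale.ROOT, "part-r-%05d", partition);
    }

    /** Returns the partition number encoded in a part file name, or -1 if it is not one. */
    static int partitionOf(String fileName) {
        Matcher m = PART.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(nameFor(3));
        System.out.println(partitionOf("part-r-00012"));
        System.out.println(partitionOf("_SUCCESS"));
    }
}
```

Note that the correspondence only holds if the new job runs with the same
number of reducers and the same partitioner as the job that produced the
existing files.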

If so - maybe you could play with your own OutputFormat: take the
source of an existing one as a starting point and replace the
"create..." stream call with a call to append().

The reason this is tricky is that each OutputFormat is going to have to
find the corresponding file to append to.
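
As a rough illustration of the append-or-create idea, the sketch below uses
the local filesystem through java.nio as a stand-in, so it runs without a
cluster. In a real OutputFormat the analogous HDFS calls would be
FileSystem.exists(), FileSystem.append() and FileSystem.create(). The class
name, method names and tab-separated record layout are assumptions made for
the example, not taken from any Hadoop source.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

// Hypothetical local-filesystem stand-in for the stream-opening part of a RecordWriter.
final class AppendingPartWriter {

    /** Opens the part file for one reducer partition, appending if it already exists. */
    static OutputStream open(Path targetDir, int partition) throws IOException {
        // Assumed naming scheme for reduce output: part-r-NNNNN.
        Path part = targetDir.resolve(String.format(Locale.ROOT, "part-r-%05d", partition));
        return Files.exists(part)
                ? Files.newOutputStream(part, StandardOpenOption.APPEND)
                : Files.newOutputStream(part, StandardOpenOption.CREATE_NEW);
    }

    /** Writes one key/value pair as a tab-separated line. */
    static void writeRecord(OutputStream out, String key, String value) throws IOException {
        out.write((key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("mr-append-demo");
        // Two "runs" writing to partition 0: the second appends to the first run's file.
        for (String run : new String[] {"old", "new"}) {
            try (OutputStream out = open(dir, 0)) {
                writeRecord(out, "k", run);
            }
        }
        System.out.print(new String(
                Files.readAllBytes(dir.resolve("part-r-00000")), StandardCharsets.UTF_8));
    }
}
```

One caveat with this approach on a real cluster: writing straight into the
existing files bypasses the usual commit step, so a failed or speculative
task attempt could leave partial or duplicated records appended.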


On Sun, Apr 21, 2013 at 10:54 PM, YouPeng Yang <yy...@gmail.com> wrote:

> Hi All
>
>      Can I append an MR output file to an existing file on HDFS?
>
>      I'm using CDH4.1.2 with MRv2.
>
>
> Regards
>



-- 
Jay Vyas
http://jayunit100.blogspot.com
