Posted to user@hadoop.apache.org by - <co...@ymail.com> on 2013/10/25 21:35:40 UTC
How / when does On-disk merge work?
Hi All,
Can anyone point me to documentation that explains in detail how the on-disk merge in the reduce phase works in Hadoop 2.2.0?
There is an explanation in this page, but I am afraid it could be outdated: what I observe in my log files is a burst of "OnDiskMerger - Thread to merge on-disk map-outputs" activity at the end of the merge phase.
Thanks,
-
Re: How / when does On-disk merge work?
Posted by Ravi Prakash <ra...@ymail.com>.
Hi!
Tom White's "Hadoop: The Definitive Guide" is probably the best source of information on this, apart from the code itself ;-) Take a look at MergeManagerImpl.java if you are so inclined.
HTH
Ravi
On Friday, October 25, 2013 2:36 PM, - <co...@ymail.com> wrote:
Hi All,
Can anyone provide documentation regarding how on-disk merge on reduce phase works in detail in Hadoop 2.2.0?
There is an explanation in this page but I am afraid it could be outdated since what I observe in my log files is a bunch of "OnDiskMerger - Thread to merge on-disk map-outputs" work at the end of merge phase.
Thanks,
-