Posted to hdfs-user@hadoop.apache.org by John Vines <jo...@ugov.gov> on 2012/05/04 16:14:47 UTC

Question regarding FSDataOutputStream.close() behavior

So I'm trying to figure out the behavior of calling
DFSOutputStream.close(), as well as whether it changed and, if so, in which
version. I see in the javadocs that the complete call (which close calls) will
not return until "all the file's blocks have been replicated the minimum
number of times". Is the minimum number dfs.replication.min, or is it the
file's replication factor?

Regardless of that answer, I'm curious when the behavior changed, if it
wasn't always like that. I remember that it used to rely on lazy
replication, though that may just be for bringing under-replicated blocks up
to snuff. Any insight would be appreciated.

Re: Question regarding FSDataOutputStream.close() behavior

Posted by Todd Lipcon <to...@cloudera.com>.
On Fri, May 4, 2012 at 7:14 AM, John Vines <jo...@ugov.gov> wrote:
> So I'm trying to figure out the behavior of calling DFSOutputStream.close(),
> as well as whether it changed and, if so, in which version. I see in the
> javadocs that the complete call (which close calls) will not return until
> "all the file's blocks have been replicated the minimum number of times".
> Is the minimum number dfs.replication.min, or is it the file's replication
> factor?

The former.
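
To make the distinction concrete, here is a rough sketch of the rule. It is a toy model, not Hadoop code: the class and method names are made up for illustration, and the values 1 (for dfs.replication.min) and 3 (for the file's replication factor) are assumed settings, not ones taken from this thread.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not part of the Hadoop API.
public class CloseSemanticsSketch {

    // The file can be completed once every block has at least
    // minReplication reported replicas (the dfs.replication.min idea).
    static boolean canComplete(List<Integer> replicasPerBlock, int minReplication) {
        for (int replicas : replicasPerBlock) {
            if (replicas < minReplication) {
                return false;
            }
        }
        return true;
    }

    // Counts blocks that are still below the file's own replication factor.
    // In this model they do not hold up close(); they are topped up later.
    static long underReplicated(List<Integer> replicasPerBlock, int fileReplication) {
        return replicasPerBlock.stream().filter(r -> r < fileReplication).count();
    }

    public static void main(String[] args) {
        int minReplication = 1;   // assumed dfs.replication.min
        int fileReplication = 3;  // assumed per-file replication factor
        List<Integer> replicas = Arrays.asList(3, 2, 1); // replicas reported per block

        System.out.println("close() may return: "
                + canComplete(replicas, minReplication));
        System.out.println("blocks still below target: "
                + underReplicated(replicas, fileReplication));
    }
}
```

In this model close() is gated only by the minimum, so a file can be closed while some of its blocks are still short of the file's replication factor.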

>
> Regardless of that answer, I'm curious when the behavior changed, if it
> wasn't always like that. I remember that it used to rely on lazy
> replication, though that may just be for bringing under-replicated blocks up
> to snuff. Any insight would be appreciated.

I'm not aware of any change in recent years... what versions are you comparing?

-Todd
--
Todd Lipcon
Software Engineer, Cloudera

Re: Question regarding FSDataOutputStream.close() behavior

Posted by John Vines <jo...@ugov.gov>.
On Fri, May 4, 2012 at 1:53 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Fri, May 4, 2012 at 7:14 AM, John Vines <jo...@ugov.gov> wrote:
> > So I'm trying to figure out the behavior of calling
> > DFSOutputStream.close(), as well as whether it changed and, if so, in
> > which version. I see in the javadocs that the complete call (which close
> > calls) will not return until "all the file's blocks have been replicated
> > the minimum number of times". Is the minimum number dfs.replication.min,
> > or is it the file's replication factor?
>
> The former.
>
> > Regardless of that answer, I'm curious when the behavior changed, if it
> > wasn't always like that. I remember that it used to rely on lazy
> > replication, though that may just be for bringing under-replicated blocks
> > up to snuff. Any insight would be appreciated.
>
> I'm not aware of any change in recent years... what versions are you
> comparing?
>

I thought there had been changes in the last few releases, but I guess I was
mistaken. Thank you for the help.



John