Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2014/01/23 15:57:23 UTC

HDFS buffer sizes

What is the interaction between dfs.stream-buffer-size and dfs.client-write-packet-size?
I see that the default for dfs.stream-buffer-size is 4K.  Does anyone have experience using larger buffers to optimize large writes?
Thanks
John
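
Both keys are plain client-side configuration values; a minimal sketch of an experiment that raises them, assuming the stock Hadoop 2.x Configuration API and the documented defaults of 4096 and 65536 bytes:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  // Sketch only: raise both buffer-related keys before opening the FileSystem.
  Configuration conf = new Configuration();
  conf.setInt("dfs.stream-buffer-size", 128 * 1024);        // documented default: 4096
  conf.setInt("dfs.client-write-packet-size", 128 * 1024);  // documented default: 65536
  FileSystem fs = FileSystem.get(conf);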


RE: HDFS buffer sizes

Posted by John Lilley <jo...@redpoint.net>.
Thanks.  Experimentally, I have found that changing the buffer sizes has no effect, so that makes sense.
John

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Tuesday, January 28, 2014 12:35 AM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

Looks like DistributedFileSystem ignores it though.

On Sat, Jan 25, 2014 at 6:09 AM, John Lilley <jo...@redpoint.net> wrote:
There is this in FileSystem.java, which appears to use the default buffer size of 4096 in the create() call unless otherwise specified via io.file.buffer.size:

  public FSDataOutputStream create(Path f, short replication,
      Progressable progress) throws IOException {
    return create(f, true,
                  getConf().getInt(
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
                      CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT),
                  replication,
                  getDefaultBlockSize(f), progress);
  }

But this discussion is missing the point. What I really want to know is: is there any benefit to setting a larger bufferSize in FileSystem.create() and FileSystem.append()?
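
A minimal way to try it directly, assuming the public create(Path, boolean, int) and append(Path, int) overloads (the path and the 1 MB size below are made up for illustration):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  FileSystem fs = FileSystem.get(new Configuration());
  Path p = new Path("/tmp/buffer-test.dat");  // hypothetical test path

  // Explicit 1 MB bufferSize instead of the 4096-byte default:
  FSDataOutputStream out = fs.create(p, true, 1 << 20);
  out.close();

  // The append overload accepts the same explicit bufferSize argument
  // (requires append support on the cluster):
  FSDataOutputStream out2 = fs.append(p, 1 << 20);
  out2.close();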

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Friday, January 24, 2014 9:35 AM

To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

I don't think that value is used either, except in the legacy block reader, which is turned off by default.
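
For anyone wanting to confirm that code path: the legacy reader is gated by a client-side flag that defaults to false. A one-line fragment, reusing the conf object sketched earlier:

  // dfs.client.use.legacy.blockreader defaults to false in 2.x clients.
  conf.setBoolean("dfs.client.use.legacy.blockreader", true);  // opt back in to the legacy reader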

On Fri, Jan 24, 2014 at 6:34 AM, John Lilley <jo...@redpoint.net> wrote:
Ah, I see... it is a constant
CommonConfigurationKeysPublic.java:  public static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;
Are there benefits to increasing this for large reads or writes?
john
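
A fragment showing the override path mentioned above; io.file.buffer.size is the key the convenience create() overload reads:

  Configuration conf = new Configuration();
  // Overrides IO_FILE_BUFFER_SIZE_DEFAULT (4096) for the create() overloads
  // that do not take an explicit bufferSize.
  conf.setInt("io.file.buffer.size", 64 * 1024);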

From: Arpit Agarwal [mailto:aagarwal@hortonworks.com]
Sent: Thursday, January 23, 2014 3:31 PM
To: user@hadoop.apache.org
Subject: Re: HDFS buffer sizes

HDFS does not appear to use dfs.stream-buffer-size.

On Thu, Jan 23, 2014 at 6:57 AM, John Lilley <jo...@redpoint.net> wrote:
What is the interaction between dfs.stream-buffer-size and dfs.client-write-packet-size?
I see that the default for dfs.stream-buffer-size is 4K.  Does anyone have experience using larger buffers to optimize large writes?
Thanks
John


