Posted to common-user@hadoop.apache.org by Muhuan Huang <mh...@cs.ucla.edu> on 2014/11/09 20:54:39 UTC

Does io.sort.mb count in the records or just the keys?

Hello everyone,

I have a question about the io.sort.mb property. The documentation says that
io.sort.mb is the total amount of buffer memory to use while sorting files.
Does that total include both the keys and the values of the records,
or just the keys (and perhaps some pointers to the values)?

More specifically, take terasort, where each record is 100 bytes but the key
is only 10 bytes. If io.sort.mb is set to 100, can the buffer hold a maximum
of about 1M records or about 10M records?

Thanks a lot!

Muhuan

Re: Does io.sort.mb count in the records or just the keys?

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
It accounts for both keys and values.

+Vinod
Hortonworks Inc.
http://hortonworks.com/
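
Since the buffer holds both keys and values, the terasort case above works out to roughly 1M records rather than 10M. The arithmetic can be sketched as below. This is only an estimate under stated assumptions: the function name is hypothetical, io.sort.mb is taken as MiB (100 * 1024 * 1024 bytes), and the optional per-record bookkeeping overhead of 16 bytes is an assumed figure that may differ between Hadoop versions. Serialization overhead is ignored, and the buffer typically starts spilling before it is completely full (see io.sort.spill.percent), so the real number is lower.

```python
def max_buffered_records(sort_mb, key_bytes, value_bytes, meta_bytes=0):
    """Rough upper bound on records that fit in the map-side sort buffer.

    sort_mb     -- value of io.sort.mb, interpreted as MiB
    key_bytes   -- size of one key in bytes
    value_bytes -- size of one value in bytes
    meta_bytes  -- assumed per-record bookkeeping overhead (hypothetical knob)
    """
    buffer_bytes = sort_mb * 1024 * 1024
    # Keys AND values both occupy the buffer, so both count per record.
    return buffer_bytes // (key_bytes + value_bytes + meta_bytes)


# terasort-style record: 10-byte key + 90-byte value = 100 bytes
print(max_buffered_records(100, 10, 90))                 # 1048576
# same record, assuming 16 bytes of per-record metadata
print(max_buffered_records(100, 10, 90, meta_bytes=16))  # 903944
```

Counting only the 10-byte keys would give about 10M, which is not what the buffer can actually hold.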


