Posted to mapreduce-user@hadoop.apache.org by 谢良 <xi...@xiaomi.com> on 2013/02/16 04:23:05 UTC

Re: why my test result on dfs short circuit read is slower?

Hi Raymond,

Did you enable the security feature in your cluster? If so, you will not see any obvious benefit.

Regards,
Liang
_______________________________________
From: Liu, Raymond [raymond.liu@intel.com]
Sent: February 16, 2013 11:10
To: user@hadoop.apache.org
Subject: why my test result on dfs short circuit read is slower?

Hi

        I tried to use short-circuit read to improve the MR scan performance of my HBase cluster.

        I have the following settings in hdfs-site.xml:

        dfs.client.read.shortcircuit set to true
        dfs.block.local-path-access.user set to the user that runs the MR jobs.

        The cluster has 1+4 nodes, and each data node has 16 CPUs and 4 HDDs. All HBase tables have been major compacted, so all data is local.
        I had hoped that short-circuit read would improve performance.

        However, the test results show that with short-circuit read enabled, performance actually dropped by 10-15%. For example, scanning a 50G table takes around 100s instead of 90s.

        My Hadoop version is 1.1.1. Any ideas? Thanks!

Best Regards,
Raymond Liu
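
For reference, the two settings described in the message above would look roughly like the fragment below in hdfs-site.xml. This is a minimal sketch, assuming the Hadoop 1.x property names introduced by HDFS-2246; the user name "mapred" is a hypothetical placeholder for whichever account actually runs the MR tasks, and is not taken from the thread.

```xml
<!-- hdfs-site.xml: illustrative sketch only.
     "mapred" is a hypothetical user name; substitute the account
     that runs the MR tasks (or HBase) on the cluster. -->
<configuration>
  <property>
    <!-- Ask the DFS client to read local block files directly -->
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- User allowed to ask the datanode for local block paths -->
    <name>dfs.block.local-path-access.user</name>
    <value>mapred</value>
  </property>
</configuration>
```

As far as I understand, the first property is read by the client and the second is checked by the datanode, so the fragment would need to be visible to both sides, and the datanodes restarted, before the setting takes effect.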

Re: why my test result on dfs short circuit read is slower?

Posted by 谢良 <xi...@xiaomi.com>.
I'm not entirely clear about your scenario; just a friendly reminder: "If security is on, the feature can be used only for user that has kerberos credentials at the client, therefore map reduce tasks cannot benefit from it in general". See the release note of HDFS-2246 for more info.
If you haven't enabled security at all, please ignore my comments :)

Regards,
Liang
________________________________________
From: Liu, Raymond [raymond.liu@intel.com]
Sent: February 16, 2013 11:40
To: user@hadoop.apache.org
Subject: RE: why my test result on dfs short circuit read is slower?

Hi Liang

Did you mean setting dfs.permissions to false?

Is that all I need to do to disable the security feature? It seems to me that setting dfs.permissions alone doesn't work without changing dfs.block.local-path-access.user: HBase still falls back to reading data through the datanode.


>
> Hi Raymond,
>
> did you enable security feature in your cluster?  there'll be no obvious benefit
> be found if so.
>
> Regards,
> Liang
> _______________________________________
> From: Liu, Raymond [raymond.liu@intel.com]
> Sent: February 16, 2013 11:10
> To: user@hadoop.apache.org
> Subject: why my test result on dfs short circuit read is slower?
>
> Hi
>
>         I tried to use short circuit read to improve my hbase cluster MR scan
> performance.
>
>         I have the following setting in hdfs-site.xml
>
>         dfs.client.read.shortcircuit set to true
>         dfs.block.local-path-access.user set to MR job runner.
>
>         The cluster is 1+4 node and each data node have 16cpu/4HDD, with
> all hbase table major compact thus all data is local.
>         I have hoped that the short circuit read will improve the
> performance.
>
>         While the test result is that with short circuit read enabled, the
> performance actually dropped 10-15%. Say scan a 50G table cost around 100s
> instead of 90s.
>
>         My hadoop version is 1.1.1, any idea on this? Thx!
>
> Best Regards,
> Raymond Liu

RE: why my test result on dfs short circuit read is slower?

Posted by "Liu, Raymond" <ra...@intel.com>.
Hi Liang

Did you mean setting dfs.permissions to false?

Is that all I need to do to disable the security feature? It seems to me that setting dfs.permissions alone doesn't work without changing dfs.block.local-path-access.user: HBase still falls back to reading data through the datanode.


> 
> Hi Raymond,
> 
> did you enable security feature in your cluster?  there'll be no obvious benefit
> be found if so.
> 
> Regards,
> Liang
> _______________________________________
> From: Liu, Raymond [raymond.liu@intel.com]
> Sent: February 16, 2013 11:10
> To: user@hadoop.apache.org
> Subject: why my test result on dfs short circuit read is slower?
> 
> Hi
> 
>         I tried to use short circuit read to improve my hbase cluster MR scan
> performance.
> 
>         I have the following setting in hdfs-site.xml
> 
>         dfs.client.read.shortcircuit set to true
>         dfs.block.local-path-access.user set to MR job runner.
> 
>         The cluster is 1+4 node and each data node have 16cpu/4HDD, with
> all hbase table major compact thus all data is local.
>         I have hoped that the short circuit read will improve the
> performance.
> 
>         While the test result is that with short circuit read enabled, the
> performance actually dropped 10-15%. Say scan a 50G table cost around 100s
> instead of 90s.
> 
>         My hadoop version is 1.1.1, any idea on this? Thx!
> 
> Best Regards,
> Raymond Liu
