Posted to common-issues@hadoop.apache.org by "Wei-Chiu Chuang (JIRA)" <ji...@apache.org> on 2017/04/11 16:30:41 UTC
[jira] [Comment Edited] (HADOOP-14297) Update the documentation about the new ec codecs config keys

    [ https://issues.apache.org/jira/browse/HADOOP-14297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964511#comment-15964511 ] 

Wei-Chiu Chuang edited comment on HADOOP-14297 at 4/11/17 4:30 PM:
-------------------------------------------------------------------

Hi [~lewuathe], I think we can make a bigger change than just updating the configuration keys:

In section "Architecture"
bq. The earlier factory is prior to followings in case of failure of creating raw coders. The default implementation classes which has the highest priority of RS and XOR codec are native codecs using Intel ISA-L to improve the performance. If the native library is not available, the codec should fallback to pure Java implementation. You can change the priority by changing these configuration keys.
How about we say "These codec factories are loaded in the order specified by the configuration values, until a codec is loaded successfully. The default RS and XOR codec configurations prefer the native implementation over the pure Java one. There is no native RS-LEGACY codec implementation, so its default is the pure Java implementation only."
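
To illustrate the proposed wording, here is a minimal Java sketch of "try each configured factory in order until one loads". It is not Hadoop's actual loader: the class {{RawCoderFallbackSketch}}, its methods, and the {{canLoad}} predicate are hypothetical names made up for this example, and the comma-separated value format for io.erasurecode.codec.rs.rawcoders is an assumption. {{RSRawErasureCoderFactory}} is likewise assumed to be the name of the pure Java factory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical sketch of ordered fallback; not Hadoop's real codec loader. */
class RawCoderFallbackSketch {

    /** Splits a comma-separated config value into trimmed, non-empty names, keeping their order. */
    static List<String> parseFactoryList(String configValue) {
        List<String> names = new ArrayList<>();
        for (String part : configValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /**
     * Returns the first factory name, in configured order, for which canLoad succeeds.
     * canLoad stands in for "creating the raw coder from this factory worked".
     */
    static Optional<String> firstLoadable(String configValue, Predicate<String> canLoad) {
        for (String name : parseFactoryList(configValue)) {
            if (canLoad.test(name)) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed value format: native factory first, pure Java factory second.
        String rsCoders = "NativeRSRawErasureCoderFactory, RSRawErasureCoderFactory";
        // Simulate a host without ISA-L: every factory whose name starts with "Native" fails.
        Predicate<String> noNativeLibrary = name -> !name.startsWith("Native");
        // Prints "RSRawErasureCoderFactory"
        System.out.println(firstLoadable(rsCoders, noNativeLibrary).orElse("none"));
    }
}
```

The point of the sketch is that the order of the configured value alone decides the priority, so putting the native factory first gives "native if available, otherwise pure Java" without any extra configuration step.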

In section "Enable Intel ISA-L"
bq. HDFS native implementation of default RS codec leverages Intel ISA-L library to improve the encoding and decoding calculation. To enable and use Intel ISA-L, there are three steps. 1. Build ISA-L library. Please refer to the official site “https://github.com/01org/isa-l/” for detail information. 2. Build Hadoop with ISA-L support. Please refer to “Intel ISA-L build options” section in “Build instructions for Hadoop” in (BUILDING.txt) in the source code. Use -Dbundle.isal to copy the contents of the isal.lib directory into the final tar file. Deploy Hadoop with the tar file. Make sure ISA-L is available on HDFS clients and DataNodes. 3. Configure the io.erasurecode.codec.rs.rawcoder key with value org.apache.hadoop.io.erasurecode.rawcoder.NativeRSRawErasureCoderFactory on HDFS clients and DataNodes.
The 3rd step is no longer needed. The 2nd step is rather long and could be split into two steps. It would also be nice to format the steps as a numbered list, like this:
{noformat}
1. ....
2. ...
{noformat}


bq. To enable the native implementation of the XOR codec, perform the same first two steps as above to build and deploy Hadoop with ISA-L support. Afterwards, configure the io.erasurecode.codec.xor.rawcoder key with org.apache.hadoop.io.erasurecode.rawcoder.NativeXORRawErasureCoderFactory on both HDFS client and DataNodes.
This paragraph can now be removed since the native codec is preferred by default.


was (Author: jojochuang):
Hi [~lewuathe], I think we can make a bigger change than just updating the configuration keys:

In section "Architecture"
bq. The earlier factory is prior to followings in case of failure of creating raw coders. The default implementation classes which has the highest priority of RS and XOR codec are native codecs using Intel ISA-L to improve the performance. If the native library is not available, the codec should fallback to pure Java implementation. You can change the priority by changing these configuration keys.
How about we say "These codec factories are loaded in the order specified by the configuration values, until a codec is loaded successfully. The default RS and XOR codec configurations prefer the native implementation over the pure Java one."

In section "Enable Intel ISA-L"
bq. HDFS native implementation of default RS codec leverages Intel ISA-L library to improve the encoding and decoding calculation. To enable and use Intel ISA-L, there are three steps. 1. Build ISA-L library. Please refer to the official site “https://github.com/01org/isa-l/” for detail information. 2. Build Hadoop with ISA-L support. Please refer to “Intel ISA-L build options” section in “Build instructions for Hadoop” in (BUILDING.txt) in the source code. Use -Dbundle.isal to copy the contents of the isal.lib directory into the final tar file. Deploy Hadoop with the tar file. Make sure ISA-L is available on HDFS clients and DataNodes. 3. Configure the io.erasurecode.codec.rs.rawcoder key with value org.apache.hadoop.io.erasurecode.rawcoder.NativeRSRawErasureCoderFactory on HDFS clients and DataNodes.
The 3rd step is no longer needed. The 2nd step is rather long and could be split into two steps. It would also be nice to format the steps as a numbered list, like this:
{noformat}
1. ....
2. ...
{noformat}


bq. To enable the native implementation of the XOR codec, perform the same first two steps as above to build and deploy Hadoop with ISA-L support. Afterwards, configure the io.erasurecode.codec.xor.rawcoder key with org.apache.hadoop.io.erasurecode.rawcoder.NativeXORRawErasureCoderFactory on both HDFS client and DataNodes.
This paragraph can now be removed since the native codec is preferred by default.

> Update the documentation about the new ec codecs config keys
> ------------------------------------------------------------
>
>                 Key: HADOOP-14297
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14297
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: documentation
>            Reporter: Kai Sasaki
>            Assignee: Kai Sasaki
>         Attachments: HADOOP-14297.01.patch, HADOOP-14297.02.patch
>
>
> As of HADOOP-13665, io.erasurecode.codec.{rs-legacy.rawcoder,rs.rawcoder,xor.rawcoder} are no longer used.
> {{HDFSErasureCoding.md}} needs to be updated to show the new config keys io.erasurecode.codec.{rs-legacy.rawcoders,rs.rawcoders,xor.rawcoders} instead.


