Posted to common-issues@hadoop.apache.org by "Kai Zheng (JIRA)" <ji...@apache.org> on 2016/05/25 22:07:12 UTC

[jira] [Commented] (HADOOP-13200) Seeking a better approach allowing to customize and configure erasure coders

    [ https://issues.apache.org/jira/browse/HADOOP-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300958#comment-15300958 ] 

Kai Zheng commented on HADOOP-13200:
------------------------------------

Copied from [here | https://issues.apache.org/jira/browse/HADOOP-13010?focusedCommentId=15289544&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15289544] by [~drankye]:
{quote}
Hi Colin,

Thanks for the comments. About the factories, I need to explain the real problem in detail, since the f2f discussion could not go into details due to time constraints. I hope this helps.

We may have the following codecs in the 1st level:
rs-legacy, rs-default (both belonging to RS)
xor,
hh or hitchhiker,
lrc,
...

Each codec may use one or more raw coders, and each such coder may have different implementations. For example, the rs-default codec has two coder implementations (the pure Java one and the ISA-L one). Users may also add their own coder implementation for a codec, for instance for better performance.

So that's why I would have a configuration key like this:
o.a.h.io.erasurecode.codec.(codec-name).rawcoder: (whatever value to be used to create or load the coder).
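
As a rough illustration of how such a per-codec key might be resolved, here is a minimal sketch. It uses java.util.Properties as a stand-in for the Hadoop configuration object, keeps the key prefix exactly as the shorthand written above, and treats the class name RawCoderConfig and the values "isa-l" and "java" as hypothetical placeholders rather than real Hadoop settings.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hadoop: resolves the raw coder configured for a codec.
final class RawCoderConfig {
    // Prefix and suffix copied from the shorthand in the comment above (assumption:
    // the real key name may differ once the shorthand is expanded).
    static final String PREFIX = "o.a.h.io.erasurecode.codec.";
    static final String SUFFIX = ".rawcoder";

    // Builds the configuration key for one codec, e.g. "rs-default" or "xor".
    static String keyFor(String codecName) {
        return PREFIX + codecName + SUFFIX;
    }

    // Returns the configured raw coder value for the codec, or the fallback if unset.
    static String rawCoderFor(Properties conf, String codecName, String fallback) {
        return conf.getProperty(keyFor(codecName), fallback);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(keyFor("rs-default"), "isa-l"); // hypothetical value
        System.out.println(rawCoderFor(conf, "rs-default", "java"));
        System.out.println(rawCoderFor(conf, "xor", "java"));
    }
}
```

In this sketch a codec with no explicit setting falls back to a default coder, so only codecs that need a non-default implementation have to be configured.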

Currently we configure a factory to create the encoder and decoder for a coder implementation. I agree there could be a better option here. While discussing this in detail with Andrew yesterday in the SF office, I wondered whether we could achieve the same effect without the factories by using the Java service loader.

First, we can add codec-name and coder-name to the raw coder, so each coder will have a codec-name and coder-name when it's created.

Then we have the built-in coders with fixed codec-name and coder-name. Customized coders will be loaded via the service loader.

Eventually we will have all the raw erasure coders loaded and created. Then we can set up a mapping from codec-name to coder-name, and from coder-name to the coder class or instance.

Does this sound good to you? If it works, then we might do this in a follow-on task?

Thanks again!
{quote}
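
The service-loader idea in the quoted comment could look roughly like the following sketch. Everything here is an assumption made for illustration: the interface NamedRawCoder, the classes BuiltInCoder and RawCoderCatalog, and the coder names are hypothetical and are not the Hadoop API. Only java.util.ServiceLoader is a real JDK facility.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.Set;

// Hypothetical interface: every raw coder knows its codec-name and coder-name.
interface NamedRawCoder {
    String codecName();
    String coderName();
}

// Hypothetical stand-in for a built-in coder with fixed names.
final class BuiltInCoder implements NamedRawCoder {
    private final String codec;
    private final String coder;

    BuiltInCoder(String codec, String coder) {
        this.codec = codec;
        this.coder = coder;
    }

    @Override public String codecName() { return codec; }
    @Override public String coderName() { return coder; }
}

final class RawCoderCatalog {
    // codec-name -> (coder-name -> coder instance)
    private final Map<String, Map<String, NamedRawCoder>> byCodec = new LinkedHashMap<>();

    // Adds a coder; a later coder with the same codec-name and coder-name replaces the earlier one.
    void register(NamedRawCoder coder) {
        byCodec.computeIfAbsent(coder.codecName(), k -> new LinkedHashMap<>())
               .put(coder.coderName(), coder);
    }

    // Registers the built-in coders first, then any customized coders found by the service loader.
    static RawCoderCatalog load() {
        RawCoderCatalog catalog = new RawCoderCatalog();
        catalog.register(new BuiltInCoder("rs-default", "java"));
        catalog.register(new BuiltInCoder("rs-default", "isa-l"));
        catalog.register(new BuiltInCoder("xor", "java"));
        // Providers are discovered from META-INF/services entries on the classpath, if any.
        for (NamedRawCoder custom : ServiceLoader.load(NamedRawCoder.class)) {
            catalog.register(custom);
        }
        return catalog;
    }

    // Returns the coder names known for a codec (empty if the codec is unknown).
    Set<String> coderNames(String codecName) {
        Map<String, NamedRawCoder> coders = byCodec.get(codecName);
        return coders == null ? Collections.<String>emptySet() : coders.keySet();
    }

    // Returns the coder for the given names, or null if none is registered.
    NamedRawCoder lookup(String codecName, String coderName) {
        Map<String, NamedRawCoder> coders = byCodec.get(codecName);
        return coders == null ? null : coders.get(coderName);
    }
}
```

With a catalog like this, the value of the per-codec configuration key could simply be a coder-name, and no factory class would need to be named in the configuration.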

> Seeking a better approach allowing to customize and configure erasure coders
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-13200
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13200
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>
> This is a follow-on task for HADOOP-13010, as discussed there. As [~cmccabe] suggested, there may be a better way to customize and configure erasure coders than the current raw coder factory. I will copy the relevant comments here to continue the discussion.


