Posted to common-issues@hadoop.apache.org by "Kai Zheng (JIRA)" <ji...@apache.org> on 2015/06/01 05:10:17 UTC

[jira] [Created] (HADOOP-12047) Ensure inputs not to be affected during coding in raw erasure coder

Kai Zheng created HADOOP-12047:
----------------------------------

             Summary: Ensure inputs not to be affected during coding in raw erasure coder
                 Key: HADOOP-12047
                 URL: https://issues.apache.org/jira/browse/HADOOP-12047
             Project: Hadoop Common
          Issue Type: Sub-task
            Reporter: Kai Zheng
            Assignee: Kai Zheng


It would be good to define, and enforce, that input buffers are not affected by the coding process in raw erasure coders. The following is copied from the discussion with [~jingzhao] in HDFS-8481:
bq. In that case we cannot reuse the source buffers, I guess? Then do we need to expose this information in the decoder?
bq. Good catch, Jing! Yes, in this case we can't reuse the source buffers here, as they need to be passed to callers/applications unchanged. I'm planning to re-implement the Java coders in HADOOP-12041 and related issues; once that is done, it will be possible to ensure the input buffers are not affected. Benefits of doing this in the coder layer: 1) a clearer contract between coder and caller, in a more general sense, for the inputs; 2) a concrete coder may apply its own optimizations in this respect, ideally with no copying of input data at all and, at worst, by making a copy, but in either case transparently to callers; 3) new coders (LRC, HH) can be layered on other primitive coders (RS, XOR) more easily.
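
To illustrate the contract discussed above, here is a minimal sketch of an XOR encoder that leaves its input buffers untouched. It is not the actual Hadoop raw coder API: the class name ReadOnlyInputXorCoder and the encode signature are hypothetical, chosen only for illustration. It reads inputs with absolute ByteBuffer.get(int), so neither their content nor their position/limit changes, and no input data is copied.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch; not the real Hadoop raw erasure coder interface. */
public class ReadOnlyInputXorCoder {

    /**
     * XORs the remaining bytes of all inputs into the output buffer.
     * Inputs are read with absolute get(int), so their position, limit
     * and content are left exactly as the caller passed them in.
     */
    public static void encode(ByteBuffer[] inputs, ByteBuffer output) {
        if (inputs.length == 0) {
            throw new IllegalArgumentException("at least one input is required");
        }
        int len = inputs[0].remaining();
        for (ByteBuffer in : inputs) {
            if (in.remaining() != len) {
                throw new IllegalArgumentException("inputs must have equal remaining length");
            }
        }
        if (output.remaining() < len) {
            throw new IllegalArgumentException("output buffer is too small");
        }
        for (int i = 0; i < len; i++) {
            byte parity = 0;
            for (ByteBuffer in : inputs) {
                // Absolute read: does not advance the input's position.
                parity ^= in.get(in.position() + i);
            }
            // Relative write: only the output buffer's position advances.
            output.put(parity);
        }
    }
}
```

With a coder written this way, the caller can hand the same source buffers to the application after encoding without restoring positions or making defensive copies. A coder that cannot avoid modifying its inputs internally would instead work on a private copy, which is the "worst case" mentioned in point 2.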



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)