You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2016/03/10 04:12:40 UTC

[jira] [Commented] (HADOOP-11996) Native erasure coder facilities based on ISA-L

    [ https://issues.apache.org/jira/browse/HADOOP-11996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188584#comment-15188584 ] 

Colin Patrick McCabe commented on HADOOP-11996:
-----------------------------------------------

Thanks, [~drankye].  This makes the overall structure much clearer.  It is nice to see the loading functions all in {{isal_load.c}}.

In {{isal_load.h}}, there is a definition for {{isaLoader}}:
{code}
IsaLibLoader* isaLoader;
{code}

Any globals in header files should be defined with the {{extern}} keyword, to indicate that they are declarations, not definitions.  Basically, the global has to reside in a certain translation unit (i.e. .c file).  A global can't actually reside in a .h file.  So pick a .c file to define this global in (probably {{isal_load.c}}?) and make the header file use {{extern}}.  You also don't need to have {{extern}} declarations of this global in every {{.c}} file if you have the {{extern}} declaration in the header file that you're including.  So, for example, this code in {{gf_util.c}} can go away:

{code}
extern IsaLibLoader* isaLoader;
{code}


More comments about headers: In general, we don't check in JNI-generated header files.  Given that this is a pre-existing problem, we could fix this in a follow-on JIRA, though.

{code}
void loadLib(JNIEnv *env) {
  char errMsg[1024];
  load_erasurecode_lib(errMsg, sizeof(errMsg));
  if (strlen(errMsg) > 0) {
    THROW(env, "java/lang/UnsatisfiedLinkError", errMsg);
  }
}
{code}
This seems like a duplicate of {{Java_org_apache_hadoop_io_erasurecode_ErasureCodeNative_loadLibrary}}.  Is there any reason for having it when that function already exists?

Is there a reason to keep {{coder_util.c}} and {{erasure_coder.c}} as separate files?  It seems like they both relate to coders, perhaps they could be merged into a file called {{coder.c}}?  Similarly, should the contents of {{erasure_code.c}} also be merged into {{coder.c}}?  If not, why not?

In general, it seems like we could have:
* {{encoder.c}}: encoding-related functions and types
* {{decoder.c}}: decoding-related functions and types
* {{coder.c}}: common code and types used by both encoding and decoding

The distinction between "util" and "coder" seems artificial to me.

The declarations for the {{dump.c}} functions are not in {{dump.h}} as expected.  Instead, they seem to be hiding in {{erasure_coder.h}}-- that was unexpected.

I don't think there is a good reason to have an {{include}} directory.  The header files just be in the same directory as the code, just as it looks like some (but not all) of them already are.

> Native erasure coder facilities based on ISA-L
> ----------------------------------------------
>
>                 Key: HADOOP-11996
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11996
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11996-initial.patch, HADOOP-11996-v2.patch, HADOOP-11996-v3.patch, HADOOP-11996-v4.patch, HADOOP-11996-v5.patch, HADOOP-11996-v6.patch, HADOOP-11996-v7.patch, HADOOP-11996-v8.patch
>
>
> While working on HADOOP-11540 and etc., it was found useful to write the basic facilities based on Intel ISA-L library separately from JNI stuff. It's also easy to debug and troubleshooting, as no JNI or Java stuffs are involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)