Posted to common-issues@hadoop.apache.org by "Binglin Chang (JIRA)" <ji...@apache.org> on 2014/04/01 08:28:19 UTC
[jira] [Commented] (HADOOP-10388) Pure native hadoop client
[ https://issues.apache.org/jira/browse/HADOOP-10388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956157#comment-13956157 ]
Binglin Chang commented on HADOOP-10388:
----------------------------------------
bq. We can even make the XML-reading code optional if you want.
Sure, if it is needed for compatibility, I guess adding XML support is fine. But to keep strict compatibility we may need to support all javax XML / Hadoop config features, and I'm afraid libexpat/libxml2 may not support all of those. A lot of effort may be spent on this, so I think it is better to make it optional and do it later.
bq. Thread pools and async I/O, I'm afraid, are something we can't live without.
I also prefer to use async I/O and threads for performance reasons. The code I published on GitHub already has a working HDFS client with read/write, and HDFSOutputstream uses an additional thread.
What I was saying is that the use of additional threads should be limited. In the Java client, simply reading/writing an HDFS file uses too many threads (RPC socket read/write, data transfer socket read/write, other misc executors, lease renewer, etc.). Since we use async I/O, the number of threads should be greatly reduced.
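To illustrate the point about async I/O and thread count, here is a minimal sketch (not taken from the code published on GitHub). It assumes a POSIX system and uses poll(2) so that one thread can service several descriptors, instead of one reader thread per RPC or data transfer socket. The name drain_all is hypothetical, and any readable descriptors (sockets or pipes) can stand in for the real connections.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: read every descriptor in `fds` to EOF from ONE thread.
// Returns the bytes received per descriptor, in the same order as `fds`.
std::vector<std::string> drain_all(const std::vector<int>& fds) {
    std::vector<std::string> received(fds.size());
    std::vector<pollfd> watch;
    for (int fd : fds) {
        pollfd p;
        p.fd = fd;
        p.events = POLLIN;
        p.revents = 0;
        watch.push_back(p);
    }

    std::size_t open_count = watch.size();
    while (open_count > 0) {
        // Block until at least one descriptor has data, EOF, or an error.
        if (poll(watch.data(), watch.size(), -1) < 0) break;  // give up on error
        for (std::size_t i = 0; i < watch.size(); ++i) {
            if (watch[i].fd < 0) continue;  // this descriptor already finished
            if (!(watch[i].revents & (POLLIN | POLLHUP | POLLERR | POLLNVAL))) continue;
            char buf[4096];
            ssize_t n = read(watch[i].fd, buf, sizeof buf);
            if (n > 0) {
                received[i].append(buf, static_cast<std::size_t>(n));
            } else {
                watch[i].fd = -1;  // EOF or error: poll() ignores negative fds
                --open_count;
            }
        }
    }
    return received;
}
```

A real client would also multiplex writes and timers (for example lease renewal) on the same loop, and would likely use epoll or a similar facility. The sketch only shows why the number of threads no longer has to grow with the number of connections.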
> Pure native hadoop client
> -------------------------
>
> Key: HADOOP-10388
> URL: https://issues.apache.org/jira/browse/HADOOP-10388
> Project: Hadoop Common
> Issue Type: New Feature
> Affects Versions: HADOOP-10388
> Reporter: Binglin Chang
> Assignee: Colin Patrick McCabe
>
> A pure native hadoop client has following use case/advantages:
> 1. writing Yarn applications using c++
> 2. direct access to HDFS, without extra proxy overhead, comparing to web/nfs interface.
> 3. wrap native library to support more languages, e.g. python
> 4. lightweight, small footprint compare to several hundred MB of JDK and hadoop library with various dependencies.
--
This message was sent by Atlassian JIRA
(v6.2#6252)