Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2007/01/22 19:57:29 UTC
[jira] Resolved: (HADOOP-913) dynamically loading C++ mapper/reducer classes in map/reduce jobs
[ https://issues.apache.org/jira/browse/HADOOP-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley resolved HADOOP-913.
----------------------------------
Resolution: Duplicate
Fix Version/s: 0.11.0
Duplicate of HADOOP-234.
> dynamically loading C++ mapper/reducer classes in map/reduce jobs
> ------------------------------------------------------------------
>
> Key: HADOOP-913
> URL: https://issues.apache.org/jira/browse/HADOOP-913
> Project: Hadoop
> Issue Type: New Feature
> Reporter: Runping Qi
> Fix For: 0.11.0
>
>
> It is highly desirable for the current map/reduce framework to be able to call functions in C++ (or other languages).
> I am proposing a generic extension to the current framework to achieve this goal.
> The extension is an application-level solution, similar in spirit to
> HadoopStreaming, and thus has no impact on Hadoop core.
> It will maintain the native map/reduce execution model.
> The basic idea is to use sockets/RPC to cross the language barrier.
> In particular, we can implement a generic mapper/reducer class in Java as a proxy for calling functions in another language.
> The configure function of the class will create a process that opens a user-specified shared library and acts as an RPC server.
> The map function of the class will just invoke an RPC call with the key/value pair.
> Such an RPC call is expected to return a list of key/value pairs. The map function can then emit the outputs.
> Below is a sketch of the generic class:
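> public class MapRedCPPAdapter implements Mapper, Reducer {
>     String sharedLibraryName;
>     RPCProxy theServer;
>
>     ...
>     // Start the helper process that loads the user's shared library.
>     public void configure(JobConf job) {
>         sharedLibraryName = job.get("shared.lib.name");
>         theServer = createServer(sharedLibraryName);
>     }
>     // Shut the helper process down when the task finishes.
>     public void close() {
>         theServer.stop();
>     }
>     // Forward one input record to the remote map function and emit its results.
>     public void map(key, value, output, reporter) {
>         ArrayList pairs = invokeRemoteMap(theServer, key, value);
>         emit(pairs);
>     }
>     // Forward one key and its values to the remote reduce function and emit its results.
>     public void reduce(key, values, output, reporter) {
>         ArrayList pairs = invokeRemoteReduce(theServer, key, values);
>         emit(pairs);
>     }
> }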
> The cons of this approach are the overhead associated with
> RPC calls and the creation of an additional process per mapper/reducer task.
> The pros are that the extension is clean, generic, and simple. It is applicable to other foreign languages too.
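
The round trip described in the proposal (send one key/value pair, get back a list of key/value pairs) can be illustrated with a small self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the issue or from Hadoop: the class name RpcMapSketch, the tab-separated line protocol with a blank line ending each response, and the toy server thread that stands in for the external process that would load the shared library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: none of these names come from Hadoop.
final class RpcMapSketch {

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(
                new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoflush=true so every println is sent immediately
        return new PrintWriter(
                new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Stand-in for the external process. For each "key<TAB>value" request it
    // answers with one "word<TAB>1" line per word, then a blank line.
    static void startToyServer(ServerSocket listener) {
        Thread t = new Thread(() -> {
            try (Socket s = listener.accept();
                 BufferedReader in = reader(s);
                 PrintWriter out = writer(s)) {
                String line;
                while ((line = in.readLine()) != null) {
                    String[] kv = line.split("\t", 2);
                    for (String word : kv[1].split(" ")) {
                        out.println(word + "\t1");
                    }
                    out.println(); // blank line marks the end of the response
                }
            } catch (IOException e) {
                // Toy server: simply stop serving on any I/O error.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Proxy side: send one pair, collect the returned pairs.
    static List<String[]> invokeRemoteMap(BufferedReader in, PrintWriter out,
                                          String key, String value) throws IOException {
        out.println(key + "\t" + value);
        List<String[]> pairs = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            pairs.add(line.split("\t", 2));
        }
        return pairs;
    }

    // Wires the two sides together over a loopback socket on an ephemeral port.
    static List<String[]> demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            startToyServer(listener);
            try (Socket s = new Socket(loopback, listener.getLocalPort());
                 BufferedReader in = reader(s);
                 PrintWriter out = writer(s)) {
                return invokeRemoteMap(in, out, "0", "hello world");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String[] kv : demo()) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

In a real adapter the server side would be a separate process started from configure, and every map call would pay for one such round trip, which is the RPC overhead listed among the cons above.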
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira