Posted to issues@spark.apache.org by "Saisai Shao (JIRA)" <ji...@apache.org> on 2018/07/06 08:26:00 UTC

[jira] [Comment Edited] (SPARK-24723) Discuss necessary info and access in barrier mode + YARN

    [ https://issues.apache.org/jira/browse/SPARK-24723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534565#comment-16534565 ] 

Saisai Shao edited comment on SPARK-24723 at 7/6/18 8:25 AM:
-------------------------------------------------------------

[~mengxr] [~jiangxb1987]

There's one solution that handles the password-less SSH problem for all cluster managers programmatically. It is borrowed from the MPI on YARN framework [https://github.com/alibaba/mpich2-yarn]

In this MPI on YARN framework, before launching the MPI job the application master (master) generates an SSH key pair and propagates the public key to all the containers (workers). During container start, each container appends the public key to its local authorized_keys file, so the MPI job started from the master node can then SSH to all the containers without a password. After the MPI job finishes, each container deletes this public key from its authorized_keys file to revert the environment.
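
For reference, the per-container bookkeeping boils down to appending and later removing one line in authorized_keys, something like the minimal sketch below (the object and method names here are mine, not the actual mpich2-yarn code):
{code:java}
import java.nio.file.{Files, Paths, StandardOpenOption}
import scala.collection.JavaConverters._

object AuthorizedKeys {
  private val path = Paths.get(sys.props("user.home"), ".ssh", "authorized_keys")

  // On container start: append the master's public key so it can SSH in.
  def add(publicKey: String): Unit =
    Files.write(path, (publicKey + "\n").getBytes("UTF-8"),
      StandardOpenOption.CREATE, StandardOpenOption.APPEND)

  // On teardown: drop exactly that key again to revert the environment.
  def remove(publicKey: String): Unit = {
    val kept = Files.readAllLines(path).asScala.filterNot(_.trim == publicKey.trim)
    Files.write(path, kept.asJava)
  }
}
{code}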

In our case we could do this in a similar way: before launching the MPI job, the 0-th task would also generate an SSH key pair and then propagate the public key to all the barrier tasks (maybe through BarrierTaskContext). The other tasks would receive the public key from the 0-th task and append it to their authorized_keys files (maybe via BarrierTaskContext as well). After this, password-less SSH is set up and mpirun can be started from the 0-th task without a password. After the MPI job finishes, all the barrier tasks would delete this public key from their authorized_keys files to revert the environment.

The example code would look like this:
{code:java}
rdd.barrier().mapPartitions { (iter, context) =>
  // Write iter to disk. ???
  // Wait until all tasks finish writing.
  context.barrier()
  // The 0-th task launches an MPI job.
  if (context.partitionId() == 0) {
    // Generate and propagate the SSH keys.
    // Wait for the keys to be set up in the other tasks.
    val hosts = context.getTaskInfos().map(_.host)
    // Set up the MPI machine file using the host info. ???
    // Launch the MPI job by calling mpirun. ???
  } else {
    // Receive and set up the public key.
    // Notify the 0-th task that the public key is set up.
  }

  // Wait until the MPI job finishes.
  context.barrier()

  // Delete the SSH key and revert the environment.
  // Collect output and return. ???
}
{code}
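
To make the "generate and propagate ssh keys" step concrete, the 0-th task could shell out to ssh-keygen, roughly as below. This is only a sketch: the key type, paths, and empty passphrase are my assumptions, and how the public key actually travels to the other tasks is exactly the open BarrierTaskContext question above.
{code:java}
import java.nio.file.{Files, Paths}
import scala.sys.process._

// Generate a throwaway RSA key pair with an empty passphrase; it only needs
// to live for the duration of the MPI job. Key type and location are assumptions.
val keyDir  = Files.createTempDirectory("barrier-ssh")
val keyPath = keyDir.resolve("id_rsa")
Seq("ssh-keygen", "-q", "-t", "rsa", "-N", "", "-f", keyPath.toString).!

// The public half is what the 0-th task would hand to the other barrier tasks
// (through whatever propagation API BarrierTaskContext ends up exposing).
val publicKey = new String(
  Files.readAllBytes(Paths.get(keyPath.toString + ".pub")), "UTF-8")
{code}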

What is your opinion about this solution?



> Discuss necessary info and access in barrier mode + YARN
> --------------------------------------------------------
>
>                 Key: SPARK-24723
>                 URL: https://issues.apache.org/jira/browse/SPARK-24723
>             Project: Spark
>          Issue Type: Story
>          Components: ML, Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Xiangrui Meng
>            Priority: Major
>
> In barrier mode, to run hybrid distributed DL training jobs, we need to provide users sufficient info and access so they can set up a hybrid distributed training job, e.g., using MPI.
> This ticket limits the scope of discussion to Spark + YARN. There were some past attempts from the Hadoop community. So we should find someone with good knowledge to lead the discussion here.


