Posted to dev@twill.apache.org by Terence Yim <ch...@gmail.com> on 2014/05/03 17:40:59 UTC

Re: Twill, GraphLab, and Distributed Barriers

Hi Erick,

Sorry for the late response. I also like the second option best, implemented the third way (the `SynchronizationService`), although it might take more time to address your immediate needs. Passing the Zookeeper connection string (zkstr) is a good starting point, as I believe that work could be ported into Twill later.

Terence 


> On Apr 24, 2014, at 8:44 AM, Erick Tryzelaar <er...@gmail.com> wrote:
> 
> Good morning all,
> 
> I wanted to run an idea by you all. I'm currently working on using Twill to
> schedule GraphLab (http://graphlab.org), which is a distributed graph
> analytics package written in C++. They currently use MPI, but it's only to
> coordinate the launch of a cluster, so it should be comparatively easy to
> migrate them over to YARN and Twill. In order to do this, I would like to
> add to Twill some mechanism that lets me:
> 
> 1. Request X containers
> 2. Wait for the first container to be assigned to me.
> 3. Wait Y seconds for the rest of the containers to be assigned to me.
> 4. If the number of containers allocated equals X continue, otherwise
> release my containers and go to step 1.
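The four steps above amount to an all-or-nothing ("gang") allocation loop. Here is a minimal sketch in Java, under stated assumptions: `ContainerAllocator` and `GangAllocator` are hypothetical names, not Twill or YARN APIs, and the `firstWait` and `maxAttempts` bounds are additions of the sketch so that the loop cannot block or retry forever (the steps as written wait indefinitely for the first container).

```java
import java.time.Duration;
import java.util.List;

// Hypothetical abstraction over the resource manager. Not a Twill or YARN API.
interface ContainerAllocator {
    // Asks the resource manager for `count` containers.
    void request(int count);

    // Waits up to `timeout` for at least `atLeast` containers, then returns
    // the IDs of all containers currently held (possibly fewer than asked).
    List<String> awaitContainers(int atLeast, Duration timeout);

    // Gives back every container currently held.
    void releaseAll();
}

final class GangAllocator {
    private GangAllocator() {}

    // Returns exactly x container IDs, or throws after maxAttempts rounds.
    static List<String> allocateAll(ContainerAllocator allocator, int x,
                                    Duration firstWait, Duration y,
                                    int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Step 1: request X containers.
            allocator.request(x);
            // Step 2: wait for the first container to be assigned.
            List<String> held = allocator.awaitContainers(1, firstWait);
            if (!held.isEmpty()) {
                // Step 3: wait Y for the rest of the containers.
                held = allocator.awaitContainers(x, y);
            }
            // Step 4: continue only if all X arrived.
            if (held.size() == x) {
                return held;
            }
            // Otherwise release everything and go back to step 1.
            allocator.releaseAll();
        }
        throw new IllegalStateException(
            "could not allocate " + x + " containers in " + maxAttempts + " attempts");
    }
}
```

Releasing a partial allocation before retrying keeps the application from sitting on containers it cannot use, at the cost of possibly losing them to other jobs between rounds.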
> 
> I can think of two main ways to implement this, one inside Twill itself,
> and one inside the application.
> 
> 1. Modify `YarnAMClient.doRun` to block launching the processes until all
> the containers have been allocated.
> 2. Add some sort of distributed barrier that the application could use to
> block until all the containers have been allocated.
> 
> I'm leaning towards the second option, as Zookeeper and Curator already
> implement distributed barriers, so all that's left is figuring out the
> right API to expose this. I have a couple of ideas:
> 
> 1. Pass the Zookeeper connection string to the `TwillRunnable`. This would
> be simplest as I wouldn't have to modify Twill, but then I would have a
> redundant connection to Zookeeper.
> 2. Expose the Zookeeper client to `TwillContext`. This would be simpler,
> but then we'd be tightly coupling the Twill API to only work with Zookeeper.
> 3. Draw inspiration from service discovery and add a
> `SynchronizationService`, a `Barrier` interface, and a
> `TwillContext.createBarrier(String)` method. It would use Zookeeper or
> Curator under the covers. This would be a bit more work, but could be
> useful for a lot of other applications. It also would be a nice place to
> put other synchronization primitives.
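To make the third idea concrete, here is a rough sketch of what the `Barrier` and `SynchronizationService` interfaces might look like. Only the names `SynchronizationService`, `Barrier`, and `createBarrier(String)` come from the proposal above; the method signatures, the `await` timeout, and the fixed party count are assumptions. The implementation shown is an in-memory, single-JVM stand-in built on `CountDownLatch`, useful mainly for exercising the API shape in tests. A real version would presumably sit on top of Zookeeper, for example by wrapping a Curator barrier recipe such as `DistributedDoubleBarrier`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Assumed shape of the proposed interface; signatures are hypothetical.
interface Barrier {
    // Registers this party's arrival, then waits for all other parties.
    // Returns true if everyone arrived in time, false on timeout or interrupt.
    // Each party is expected to call this once per barrier name.
    boolean await(long timeout, TimeUnit unit);
}

// Assumed shape of the proposed service; signatures are hypothetical.
interface SynchronizationService {
    Barrier createBarrier(String name);
}

// In-memory stand-in (single JVM only). Not a distributed implementation.
final class InMemorySynchronizationService implements SynchronizationService {
    private final int parties;
    private final ConcurrentHashMap<String, CountDownLatch> latches =
        new ConcurrentHashMap<>();

    // `parties` is the number of participants, e.g. the requested container count.
    InMemorySynchronizationService(int parties) {
        this.parties = parties;
    }

    @Override
    public Barrier createBarrier(String name) {
        // All callers that use the same name share one latch.
        CountDownLatch latch =
            latches.computeIfAbsent(name, n -> new CountDownLatch(parties));
        return (timeout, unit) -> {
            latch.countDown();
            try {
                return latch.await(timeout, unit);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and report failure.
                Thread.currentThread().interrupt();
                return false;
            }
        };
    }
}
```

Inside a runnable, the intended use would be something like `createBarrier("cluster-start").await(60, TimeUnit.SECONDS)` before launching the GraphLab process, treating a `false` result as the signal to give up and let the allocation be retried. The barrier name and the timeout here are illustrative.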
> 
> My plan right now is to start off with passing the Zookeeper connection
> string to the `TwillRunnable`. Once I get that working I'd like to try to
> implement the `SynchronizationService`. Does this sound like a good plan,
> or would any of you suggest a better approach for implementing this?
> 
> Thanks,
> Erick