Posted to user@zookeeper.apache.org by Thomas Koch <th...@koch.ro> on 2010/02/19 10:01:02 UTC

zookeeper for gearman?

CC to zookeeper-user

Hi,

I haven't kept up with gearman development over the last weeks 
(months), since I've been occupied with other duties, mostly the introduction 
of hadoop[1] in our company.
One subproject of hadoop is zookeeper[2], "a centralized service for 
maintaining configuration information, naming, providing distributed 
synchronization, and providing group services."
One of the documented use cases of zookeeper is a distributed queue[3]. 
(However, this document doesn't seem to be that well written.)
I wondered whether anyone from the gearman project has already heard of zookeeper 
and perhaps considered a gearman implementation on top of it? It shouldn't 
be that hard, and it would get you replication for free.
Maybe I'll try once my current project is done. :-)
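
For illustration only, the distributed-queue idea can be sketched as follows. 
This is a minimal in-memory model, not the ZooKeeper client API and not the 
gearman protocol: FakeQueueZnode, submit_job and take_job are hypothetical 
names. The only ZooKeeper behaviour it assumes is that sequential znodes get a 
zero-padded, 10-digit increasing counter appended to their name, so the lowest 
name is the oldest job.

```python
import itertools
import threading

_MISSING = object()  # sentinel: "another consumer already took this job"


class FakeQueueZnode:
    """In-memory stand-in for a parent znode holding sequential children.

    Hypothetical helper for illustration; a real deployment would talk to a
    ZooKeeper ensemble instead.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._sequence = itertools.count()
        self._children = {}

    def create_sequential(self, prefix, data):
        # Mimics sequential znode naming: prefix + 10-digit counter.
        with self._lock:
            name = f"{prefix}{next(self._sequence):010d}"
            self._children[name] = data
            return name

    def get_children(self):
        with self._lock:
            return list(self._children)

    def claim(self, name):
        # Atomically remove the child; only one caller can win.
        with self._lock:
            return self._children.pop(name, _MISSING)


def submit_job(queue, payload):
    """Producer side: enqueue a job as a new sequential child."""
    return queue.create_sequential("job-", payload)


def take_job(queue):
    """Consumer side: claim the child with the lowest sequence number."""
    for name in sorted(queue.get_children()):
        payload = queue.claim(name)
        if payload is not _MISSING:
            return name, payload
        # Lost the race for this child; try the next-oldest one.
    return None
```

Calling submit_job twice and then take_job returns the jobs in submission 
order. Against a real ensemble the "claim" step would be a delete that fails 
if another consumer got there first.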

[1] http://hadoop.apache.org/
[2] http://hadoop.apache.org/zookeeper/
[3] http://hadoop.apache.org/zookeeper/docs/current/zookeeperTutorial.html

Best regards,

Thomas Koch, http://www.koch.ro

Re: zookeeper for gearman?

Posted by Patrick Hunt <ph...@apache.org>.

Thomas, I've looked at integrating the two, so far as to download the 
gearman source and examine it a bit. I didn't see a huge near-term win 
implementing a plugin, as gearman already has support for 
drizzle/memcached/sqlite3/pq. While ZK could be used to provide highly 
reliable/available persistence it wouldn't really add anything over 
these other options (assuming these other options are configured to be 
highly reliable). Obviously if someone is already using ZK then there is 
the benefit of not having to add an additional persistence component, so 
that is one plus.

Longer term it did seem like ZK could benefit gearman by providing 
support for job server failover. If I understand the way gearman job 
servers work, even though the jobs are stored persistently, the failed 
gearman server must be restarted (or another must take its place) and 
re-read the persisted jobs before those jobs can be made available to 
workers again. ZK could facilitate this, perhaps even being used to 
redistribute the load between the available job servers (those still 
active). This was one concrete idea I had; I'm sure ZK could be applied 
in other areas as well.
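
As a rough sketch of the redistribution step only: assume each job server 
registers an ephemeral znode, so the set of live servers is simply the set of 
znodes that still exist, and assume some mapping of persisted jobs to their 
owning server is available. Both assumptions, and the function name 
reassign_orphaned_jobs, are hypothetical and not part of gearman or ZooKeeper. 
The function below hands jobs owned by dead servers to the least-loaded 
survivor.

```python
from collections import Counter


def reassign_orphaned_jobs(job_owner, live_servers):
    """Return a new job -> server mapping with no job left on a dead server.

    job_owner:    dict mapping job id -> server that currently owns it
    live_servers: set of servers still registered (assumed to come from
                  listing ephemeral znodes; not modelled here)
    """
    if not live_servers:
        raise ValueError("no live job servers to take over")

    live = sorted(live_servers)
    load = Counter({server: 0 for server in live})
    assignment = {}
    orphans = []

    # Jobs on live servers stay where they are; the rest are orphans.
    for job, owner in job_owner.items():
        if owner in live_servers:
            assignment[job] = owner
            load[owner] += 1
        else:
            orphans.append(job)

    # Give each orphan to the currently least-loaded live server,
    # breaking ties by server name so the result is deterministic.
    for job in sorted(orphans):
        target = min(live, key=lambda server: (load[server], server))
        assignment[job] = target
        load[target] += 1

    return assignment
```

With owners {"j1": "a", "j2": "b", "j3": "b", "j4": "c"} and live servers 
{"a", "c"}, the two jobs from "b" are split between "a" and "c". Detecting the 
failure and re-reading the persisted jobs are outside this sketch.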

Patrick
