Posted to commits@helix.apache.org by "kishore gopalakrishna (JIRA)" <ji...@apache.org> on 2013/02/12 08:49:13 UTC

[jira] [Created] (HELIX-45) Standalone helix agent

kishore gopalakrishna created HELIX-45:
------------------------------------------

             Summary: Standalone helix agent
                 Key: HELIX-45
                 URL: https://issues.apache.org/jira/browse/HELIX-45
             Project: Apache Helix
          Issue Type: Task
            Reporter: kishore gopalakrishna


Copy paste of email from Santiago.

I'm interested in implementing a Helix-based service capable of launching
processes in a distributed environment. I would like to get feedback
regarding the design and possible implementations as I understand there may
be interest in such a service from others.

The main idea is to have a Helix cluster (let's call it the AGENT cluster)
where each machine (or virtual machine) has a participant in said cluster
(let's call it the agent server). The agent servers would launch / kill
processes on their respective machines in response to state transitions
pushed by the helix controller.

With this design, requesting that processes be launched would basically
involve setting the ideal state for a new resource in the agent cluster. Each
resource would identify a process (or a set of processes) that should be
launched by the agents, and each partition of that resource would define
where that instance of the process should run via ideal state. The details
of how to run the process would be part of the resource configuration.

For instance, if we wanted to launch a daemon process on 5 machines, we would
create a resource with 5 partitions (let's call it "foo-daemon") and set its
ideal state to:

"instancesMap" : {
   "foo-daemon_1": {
     "machine_a": "ONLINE"
   },
   "foo-daemon_2": {
     "machine_b": "ONLINE"
   },
   "foo-daemon_3": {
     "machine_c": "ONLINE"
   },
   "foo-daemon_4": {
     "machine_d": "ONLINE"
   },
   "foo-daemon_5": {
     "machine_e": "ONLINE"
   }
}
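
As a rough illustration of how such a mapping could be generated rather than
written by hand, here is a minimal Python sketch. The function name
build_ideal_state is hypothetical and this is not a Helix API call; it only
produces the one-partition-per-machine structure shown above.

```python
def build_ideal_state(resource, machines, state="ONLINE"):
    """Map each partition "<resource>_<n>" to exactly one machine in `state`."""
    return {
        f"{resource}_{i}": {machine: state}
        for i, machine in enumerate(machines, start=1)
    }


# Machine names taken from the example above.
machines = ["machine_a", "machine_b", "machine_c", "machine_d", "machine_e"]
ideal_state = {"instancesMap": build_ideal_state("foo-daemon", machines)}
```

Generating the map this way keeps the partition numbering and the machine
list in sync when machines are added or removed.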

When transitioning to the ONLINE state for the foo-daemon resource, the
agent server will launch the process based on the configuration defined for
the resource and partition. When transitioning back to OFFLINE, it will kill
the process.
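
The two transitions could be sketched roughly as follows. This is only an
illustration in Python using the standard subprocess module, not Helix's
participant API: the class name AgentServer, the method names, and the shape
of resource_config (partition name mapped to a command line) are all
assumptions made for the example.

```python
import subprocess
import sys


class AgentServer:
    """Hypothetical agent: tracks the child process launched for each partition."""

    def __init__(self, resource_config):
        # Assumed config shape: partition name -> argv list to execute.
        self.resource_config = resource_config
        self.processes = {}

    def on_become_online(self, partition):
        # Launch only if no live process already serves this partition.
        proc = self.processes.get(partition)
        if proc is None or proc.poll() is not None:
            self.processes[partition] = subprocess.Popen(self.resource_config[partition])

    def on_become_offline(self, partition, timeout=5.0):
        proc = self.processes.pop(partition, None)
        if proc is None:
            return  # nothing was running for this partition
        proc.terminate()  # ask the process to stop first
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # force it if the request is ignored
            proc.wait()


# Stand-in command for the example; a real daemon command would go here.
EXAMPLE_CONFIG = {
    "foo-daemon_1": [sys.executable, "-c", "import time; time.sleep(60)"],
}
```

Making the ONLINE handler a no-op when the process is already running keeps a
repeated transition from starting a second copy.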

In the event of an agent being restarted, the agent should be able to
discover the processes launched by its previous instance and match them to
the state transitions pushed to avoid duplicating processes.
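
One possible way to do this, sketched under several assumptions, is to record
each launched process ID in a per-partition file and check it before
launching again. The names ensure_running and is_alive, and the pid-file
layout, are hypothetical. The liveness probe relies on POSIX signal 0, and a
real agent would likely also compare the command line or start time to guard
against PID reuse.

```python
import os
import subprocess
from pathlib import Path


def is_alive(pid):
    """POSIX probe: signal 0 checks that the PID exists without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(partition, argv, state_dir):
    """Return the PID serving `partition`, launching only if none is alive."""
    pid_file = Path(state_dir) / f"{partition}.pid"
    if pid_file.exists():
        pid = int(pid_file.read_text())
        if is_alive(pid):
            return pid  # adopt the process left behind by the previous agent
    proc = subprocess.Popen(argv)
    pid_file.write_text(str(proc.pid))
    return proc.pid
```

Because the record lives on disk rather than in the agent's memory, a
restarted agent that receives the same ONLINE transition finds the existing
process instead of starting a duplicate.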

The agent servers should also be responsible for monitoring their processes
and exporting metrics and signals about their health.
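
A very small sketch of what such a health report might look like is shown
below. The function name health_snapshot and the field names are assumptions
for illustration; how the result would be exported (JMX, logs, ZooKeeper) is
left open here.

```python
import subprocess
import time


def health_snapshot(processes):
    """Report liveness and exit code for each partition's subprocess.Popen."""
    snapshot = {}
    for partition, proc in processes.items():
        exit_code = proc.poll()  # None while the process is still running
        snapshot[partition] = {
            "alive": exit_code is None,
            "exit_code": exit_code,
            "checked_at": time.time(),
        }
    return snapshot
```

An agent could call this periodically and publish the result, or use a
non-None exit code as the signal that a process died unexpectedly.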

I'll try to provide more detail in this thread as the idea evolves and I
would welcome any and all feedback.
