Posted to dev@airavata.apache.org by DImuthu Upeksha <di...@gmail.com> on 2017/12/04 18:30:45 UTC

Async Agents to handle long running jobs

Hi folks,

I have implemented support for Async Job Submission with callback
workflows on top of the proposed task execution framework. This supports
both Async Job Submission on remote compute resources using Agents and
event-driven job monitoring (a rough, purely illustrative sketch of the
monitoring side follows the list below). Using this approach, I'm going to
address the following issues that we are facing today:

1. Resolve false DoS-attack detection on compute resources when doing
multiple SSH command executions in a short period of time.
2. Optimize resource utilization and robustness of the Airavata Task
Execution Framework when executing long-running jobs.
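
To make the event-driven part concrete, here is an illustrative sketch of
a listener consuming job-status events from Kafka. The topic name, broker
address, and plain String payloads are assumptions for this sketch only;
the actual design is described in [1] and implemented in [2]:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class JobEventListenerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka:9092");  // assumed broker address
            props.put("group.id", "async-event-listener");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            // "job-status-events" is a hypothetical topic name
            consumer.subscribe(Collections.singletonList("job-status-events"));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // The real listener reacts to the event instead of printing it
                    System.out.printf("job %s -> status %s%n",
                            record.key(), record.value());
                }
            }
        }
    }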

Design and implementation details can be found in [1].
Sources for the main components can be found in [2], [3], [4].

Please share your comments and suggestions.

[1]
https://docs.google.com/document/d/1DIjrkjxZZWo9XiwkKWq9WZiOX-uRD5WO-eB6TLxagAg/edit?usp=sharing
[2]
https://github.com/DImuthuUpe/airavata-sandbox/tree/master/airavata-kubernetes/modules/microservices/async-event-listener
[3]
https://github.com/DImuthuUpe/airavata-sandbox/tree/master/airavata-kubernetes/modules/microservices/tasks/async-command-monitor
[4]
https://github.com/DImuthuUpe/airavata-sandbox/tree/master/airavata-kubernetes/modules/microservices/tasks/async-command-task

Thanks
Dimuthu

Re: Async Agents to handle long running jobs

Posted by DImuthu Upeksha <di...@gmail.com>.
Hi Suresh,

Got your point. I'm referring to the Async Command Listener (shown in the
image below) as the Event Listener. As you can see, the Agents are the
components that are supposed to run inside the supercomputers, since they
are responsible for executing jobs. If that's the case, we would have to
port the Agents to Python. The Event Listeners just listen to the Kafka
event topic and process event messages, so they can (should?) be placed
outside the supercomputers. I'm saying this assuming that we deploy the
Airavata components (API Server, Scheduler, etc.) outside the
supercomputers. Please correct me if I have misunderstood the current
deployment.

[inline image: Async Command Listener diagram; not preserved in this plain-text archive]

Thanks
Dimuthu

On Tue, Dec 5, 2017 at 9:07 PM, Suresh Marru <sm...@apache.org> wrote:

> Hi Dimuthu,
>
> I just have some high-level observations, so I will top-post.
>
> * I am also +0 on running highly available services on Kubernetes; no deep
> thoughts for or against them. Just pondering at this point.
>
> * Regarding the Event Listener on compute machines: these are typically
> supercomputers, which do not allow any kernel modifications. Java is often
> foreign on these clusters because they are designed for high performance
> and typically only support low-level languages like C, C++ and FORTRAN.
> Python is increasingly ubiquitous as well.
>
> Suresh

Re: Async Agents to handle long running jobs

Posted by Suresh Marru <sm...@apache.org>.
Hi Dimuthu,

I just have some high-level observations, so I will top-post.

* I am also +0 on running highly available services on Kubernetes; no deep thoughts for or against them. Just pondering at this point.

* Regarding the Event Listener on compute machines: these are typically supercomputers, which do not allow any kernel modifications. Java is often foreign on these clusters because they are designed for high performance and typically only support low-level languages like C, C++ and FORTRAN. Python is increasingly ubiquitous as well.

Suresh



Re: Async Agents to handle long running jobs

Posted by DImuthu Upeksha <di...@gmail.com>.
Hi Suresh,

Thanks for the reply. Please find my responses to your questions inline.

On Tue, Dec 5, 2017 at 7:58 AM, Suresh Marru <sm...@apache.org> wrote:

> Hi Dimuthu,
>
> This is a neat design. A few questions to understand your implementation:
>
> * Since the Async Command Monitor needs to be a persistent, highly
> available service, is it advisable to run it as a Helix Participant, or
> should we run it outside of the Helix system, like an API gateway?
>

This design does not assume the Async Command Monitor is a persistent
service. It reads the status of the Agent and directs the message flow
along the correct path; in the Java world, it is like a switch case.
However, we do need to make it highly available. By making it a Helix
Participant and controlling the replication through Kubernetes, we can
fulfill that requirement and also keep it as a generic component in the
system.
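
To make the "switch case" analogy concrete, here is a minimal sketch of
the routing idea. The class, enum, and handler names are hypothetical,
chosen only for illustration; the actual monitor is in [3] of my first
mail:

    public class AsyncCommandMonitorSketch {

        enum AgentStatus { SUBMITTED, RUNNING, COMPLETED, FAILED }

        static class CommandEvent {
            final String jobId;
            final AgentStatus status;
            CommandEvent(String jobId, AgentStatus status) {
                this.jobId = jobId;
                this.status = status;
            }
        }

        // Directs the message flow along the correct path based on Agent status
        void route(CommandEvent event) {
            switch (event.status) {
                case COMPLETED:
                    onCompleted(event);
                    break;
                case FAILED:
                    onFailed(event);
                    break;
                default:
                    onInProgress(event);  // SUBMITTED / RUNNING: keep monitoring
            }
        }

        void onCompleted(CommandEvent e)  { /* trigger the callback workflow */ }
        void onFailed(CommandEvent e)     { /* route to the error-handling path */ }
        void onInProgress(CommandEvent e) { /* schedule the next status check */ }
    }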


> * On a related note, any thoughts on running the database also as part of
> the Kubernetes cluster? K8s has a MySQL example [1], but wondering about
> any other pragmatic experiences.
>

Good suggestion. I also had that idea, not only for MySQL but also for
Kafka and Zookeeper. There are a few challenges when we try to containerize
those applications.

1. Applications like Zookeeper have a static unique name for each node in
the Zookeeper quorum, and each node should be configured to know about the
other nodes before it starts. For example, each zoo.cfg file should contain
entries like this before the cluster starts:

server.1=node1.thegeekstuff.com:2888:3888
server.2=node2.thegeekstuff.com:2888:3888
server.3=node3.thegeekstuff.com:2888:3888

This is not container friendly. Containers are normally stateless, so it is
challenging to spin up a failed container with the same identity (both the
same host name and the same static configuration). Kubernetes solves this
with a concept called Stateful Sets, where the newly spawned pod keeps the
host name of the dead pod and the same persistent volume (a sketch follows
after point 3 below).

2. Databases like MySQL need a persistent data directory, so we have to
make sure that a newly spawned pod is placed on the same node (physical
machine) where the old one ran, because data directories are not replicated
among the nodes of the Kubernetes cluster. In this case, too, we should be
able to use Stateful Sets to solve the issue; the link you shared provides
good evidence for that.

3. The above point about data directories is also valid for Kafka brokers.
However, most of the issues that come up when containerizing Kafka brokers
are also solved using Stateful Sets [1].
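
As an illustration of the Stateful Sets idea behind all three points, a
trimmed manifest along these lines gives each Zookeeper pod a stable
identity (zk-0, zk-1, zk-2) and its own persistent volume. This is a
sketch under assumed names (zk, zk-headless), not a production
configuration:

    apiVersion: apps/v1beta2        # apps/v1 in newer clusters
    kind: StatefulSet
    metadata:
      name: zk
    spec:
      serviceName: zk-headless      # headless Service giving pods stable DNS names
      replicas: 3
      selector:
        matchLabels:
          app: zk
      template:
        metadata:
          labels:
            app: zk
        spec:
          containers:
          - name: zookeeper
            image: zookeeper:3.4
            ports:
            - containerPort: 2888   # quorum port
            - containerPort: 3888   # leader-election port
      volumeClaimTemplates:         # each replica keeps its own persistent volume
      - metadata:
          name: data
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 1Gi

With this in place, the zoo.cfg entries above can point at the stable DNS
names (server.1=zk-0.zk-headless:2888:3888 and so on) instead of fixed
physical hosts.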

So, in summary, we can more or less deploy all three applications in
Kubernetes in a highly available manner with auto-healing features. But we
have to think about the following facts as well:

1. These applications were not designed to run in containerized
environments. I would say we are using some "hacks" to make them container
friendly.

2. They are inherently highly available, so why do we need to introduce
another layer of high availability?

3. We can achieve auto healing in a Kubernetes cluster, where a failed pod
is automatically replaced by a new pod. But we cannot let the new pod be
placed on a different node (physical machine) because of the above
constraints. So if a node fails, we cannot use the auto-healing
functionality of Kubernetes in this case.

There are pros and cons to either approach. I think this should be open
for discussion so we can get the viewpoints of others as well. Personally,
I'm +0 for the Kubernetes approach :)


> * We need to write the event listener preferably in Python, since these
> typically run on a compute cluster where Java is not so well supported and
> Python is more ubiquitous.
>

That is possible. The Event Listener interacts with Kafka and invokes the
API server, so we could port it to Python easily. However, since we are
ultimately bundling these components as Docker containers, the language we
use should not be an issue: all the libraries that each language requires
are bundled into the same container image. We only need the kernel of the
host machine, with Docker and the Kubernetes agents installed on it. I'm
not sure I have completely understood your claim about Java not being
supported. Don't those compute machines support Java at the kernel level?
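
For example, a minimal Dockerfile along these lines bundles the JVM with
the listener, so the host needs only Docker. The base image and jar path
are assumptions for illustration:

    # JVM and all Java libraries ship inside the image, not on the host
    FROM openjdk:8-jre-alpine
    COPY target/async-event-listener.jar /opt/listener.jar
    ENTRYPOINT ["java", "-jar", "/opt/listener.jar"]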


> * What is your suggestion on the format of the job description (the
> message payload in your example)? Can we send a Thrift binary through
> Kafka and have the listener parse out the required information?
>

That should be possible, and it is a good suggestion. We can write custom
serializers and deserializers for Kafka message topics [2].
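
As a rough sketch of that idea, a Kafka serializer can wrap Thrift's
binary protocol. JobDescription is a hypothetical Thrift-generated class
here; only the Kafka Serializer and Thrift TSerializer APIs are real:

    import java.util.Map;
    import org.apache.kafka.common.serialization.Serializer;
    import org.apache.thrift.TException;
    import org.apache.thrift.TSerializer;
    import org.apache.thrift.protocol.TBinaryProtocol;

    // Turns a Thrift-generated JobDescription into the Kafka message payload
    public class JobDescriptionSerializer implements Serializer<JobDescription> {

        private final TSerializer serializer =
                new TSerializer(new TBinaryProtocol.Factory());

        @Override
        public void configure(Map<String, ?> configs, boolean isKey) { }

        @Override
        public byte[] serialize(String topic, JobDescription job) {
            try {
                return serializer.serialize(job);  // Thrift binary on the wire
            } catch (TException e) {
                throw new RuntimeException("Failed to serialize job description", e);
            }
        }

        @Override
        public void close() { }
    }

A matching Deserializer can use Thrift's TDeserializer the same way on the
listener side; both are registered through the producer's value.serializer
config and the consumer's value.deserializer config.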


> Suresh
>
> [1] - https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
[1] https://github.com/kubernetes/contrib/tree/master/statefulsets/kafka
[2] https://dzone.com/articles/kafka-sending-object-as-a-message

Thanks
Dimuthu

Re: Async Agents to handle long running jobs

Posted by Suresh Marru <sm...@apache.org>.
Hi Dimuthu,

This is a neat design. A few questions to understand your implementation:

* Since the Async Command Monitor needs to be a persistent, highly available service, is it advisable to run it as a Helix Participant, or should we run it outside of the Helix system, like an API gateway?

* On a related note, any thoughts on running the database also as part of the Kubernetes cluster? K8s has a MySQL example [1], but wondering about any other pragmatic experiences.

* We need to write the event listener preferably in Python, since these typically run on a compute cluster where Java is not so well supported and Python is more ubiquitous.

* What is your suggestion on the format of the job description (the message payload in your example)? Can we send a Thrift binary through Kafka and have the listener parse out the required information?

Suresh 

[1] - https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
