Posted to dev@airavata.apache.org by Amila Jayasekara <th...@gmail.com> on 2018/03/06 07:53:51 UTC

Re: Communicating with the Nodes in Intranet

Hi Hitesh,

Are you expecting ZooKeeper instances to run on the login nodes of the
supercomputers?
If so, I am not sure how feasible that is going to be. I guess Suresh
can give more details (using the obsolete GSI-SSH as an example).

Thanks
-AJ

On Thu, Feb 22, 2018 at 10:02 PM, Hitesh Kumar Dasika <hd...@umail.iu.edu>
wrote:

> Hello Devs
>
>
> Here is another document detailing the design specs of the problem
> discussed in previous emails. This document also contains answers to the
> questions asked by Dimuthu (Thank you!).
>
> Any corrections, criticism or appreciation will be greatly helpful.
>
>
>
> Google Doc Link:
> https://docs.google.com/document/d/1zVP7SSelxaGwTTOHL70ZOy4Zl3egkGx6r7TqBMqzapg/edit?usp=sharing
>
>
>
> On Mon, Feb 12, 2018 at 9:58 AM, DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
>> Hi Hitesh,
>>
>> Overall this is a good design. I have a few areas that need further
>> clarification.
>>
>> 1. This design basically supports one-way communication: Airavata sends
>> commands and agents execute them. But we have scenarios where agents should
>> respond to the commands. For example, Airavata sends a list-files command
>> and the agent should respond with the list of files. There could also
>> be cases where the response is asynchronous, so that Airavata does not
>> get the response immediately. How do you handle such scenarios?
>> 2. When you implement queues in the external server, do you keep
>> one queue per compute resource or do you use a single queue for all
>> compute resources?
>> 3. Can we have multiple external servers for high availability? If so, how
>> do you coordinate among the multiple external servers?
>> 4. Did you consider other queue implementations like Kafka? If so, what
>> advantage do you get by using RabbitMQ over them?
>> 5. We might have to write the same agents in different languages (Python, C,
>> Java) depending on what the compute resource supports. Please verify that
>> the client libraries you use for queue interactions support that.
>> 6. What is the process for registering or removing a compute resource
>> from the intranet (creating or deleting queues), and who is responsible for
>> that?
>>
>> Thanks
>> Dimuthu
>>
>> On Thu, Feb 8, 2018 at 6:03 PM, Hitesh Kumar Dasika <hdasika@umail.iu.edu
>> > wrote:
>>
>>> Dev,
>>>
>>> I am looking at a mechanism that can be used to establish a
>>> communication architecture between a set of *intranet* nodes in a
>>> cluster and Airavata.
>>>
>>> *Problem Introduction:*
>>>
>>> In some cases a cluster or an HPC system contains nodes or machines on
>>> an intranet that cannot be accessed directly through the HPC system's
>>> endpoints. However, these machines inside the intranet can still
>>> communicate with the external world or Internet. They are also
>>> valuable resources that can be used for job executions. Hence there needs
>>> to be a proper architecture in place to make use of those resources. Here
>>> is a brief architectural discussion of this particular problem.
>>>
>>>
>>> *Google Doc Link:*
>>> https://docs.google.com/document/d/11I5mboZmI_D_IocP-CfjJiNoD55qtVSLcpWGodAL0z0/edit?usp=sharing
>>>
>>
>>
>
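
Regarding question 1 in Dimuthu's list (an agent replying asynchronously to a
command such as list files), here is a minimal sketch of one common approach:
tag every command with a correlation id and have the agent echo it on a reply
queue. This is not taken from the design document. The in-process queue.Queue
objects are stand-ins for broker queues, and every name below (command_queues,
reply_queue, agent, send_command, fake_list_files, "resource-a") is
hypothetical.

```python
import queue
import threading
import uuid

# Hypothetical in-process stand-ins for broker queues: one command queue
# per compute resource, plus one shared reply queue read by Airavata.
command_queues = {"resource-a": queue.Queue()}
reply_queue = queue.Queue()


def agent(resource_id, list_files):
    """Hypothetical agent: take one command, run it, post the reply."""
    command = command_queues[resource_id].get()
    if command["action"] == "list_files":
        result = list_files(command["path"])
    else:
        result = None
    # Echo the correlation id so the sender can match reply to request.
    reply_queue.put(
        {"correlation_id": command["correlation_id"], "result": result}
    )


def send_command(resource_id, action, path):
    """Enqueue a command and return its correlation id without blocking."""
    correlation_id = str(uuid.uuid4())
    command_queues[resource_id].put(
        {"correlation_id": correlation_id, "action": action, "path": path}
    )
    return correlation_id


def fake_list_files(path):
    # Stand-in for a real directory listing on the compute resource.
    return [path + "/a.txt", path + "/b.txt"]


worker = threading.Thread(target=agent, args=("resource-a", fake_list_files))
worker.start()
request_id = send_command("resource-a", "list_files", "/scratch")
reply = reply_queue.get(timeout=5)  # arrives whenever the agent finishes
worker.join()
```

Because send_command returns immediately, the caller can keep a table of
pending correlation ids and match replies as they arrive, which also works
when several commands are in flight. Whether to use one queue per compute
resource (as assumed here) or a single shared queue is exactly question 2
and is left open.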