You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@airavata.apache.org by Lahiru Gunathilake <gl...@gmail.com> on 2014/06/11 20:27:26 UTC

Zookeeper in Airavata to achieve reliability

Hi All,

I did little research about Apache Zookeeper[1] and how to use it in
airavata. Its really a nice way to achieve fault tolerance and reliable
communication between our thrift services and clients. Zookeeper is a
distributed, fault tolerant system to do a reliable communication between
distributed applications. This is like an in-memory file system which has
nodes in a tree structure and each node can have small amount of data
associated with it and these nodes are called znodes. Clients can connect
to a zookeeper server and add/delete and update these znodes.

  In Apache Airavata we start multiple thrift services and these can go
down for maintenance or these can crash, if we use zookeeper to store these
configuration(thrift service configurations) we can achieve a very reliable
system. Basically thrift clients can dynamically discover available service
by using ephemeral znodes(Here we do not have to change the generated
thrift client code but we have to change the locations we are invoking
them). ephemeral znodes will be removed when the thrift service goes down
and zookeeper guarantee the atomicity between these operations. With this
approach we can have a node hierarchy for multiple of airavata,
orchestrator,appcatalog and gfac thrift services.

For specifically for gfac we can have different types of services for each
provider implementation. This can be achieved by using the hierarchical
support in zookeeper and providing some logic in gfac-thrift service to
register it to a defined path. Using the same logic orchestrator can
discover the provider specific gfac thrift service and route the message to
the correct thrift service.

With this approach I think we simply have write some client code in thrift
services and clients and zookeeper server installation can be done as a
separate process and it will be easier to keep the Zookeeper server
separate from Airavata because installation of Zookeeper server little
complex in production scenario. I think we have to make sure everything
works fine when there is no Zookeeper running, ex: enable.zookeeper=false
should works fine and users doesn't have to download and start zookeeper.



[1]http://zookeeper.apache.org/

Thanks
Lahiru
-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Marlon Pierce <ma...@iu.edu>.

+1

This is a dev list question, but what is the best way to package 
Zookeeper in an Airavata release?

Marlon

On 6/25/14, 2:46 PM, Amila Jayasekara wrote:
> Great Work !
>
> I would love to see a demo of this.
>
> Thanks
> Amila
>
>
> On Wed, Jun 25, 2014 at 2:09 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have finished the initial version of the ZK integration. Now we can start
>> multiple thrift gfac services (still the communication between orchestrator
>> and gfac is RPC) and orchestrator submit jobs to multiple gfac nodes.
>>
>> I can kill a gfac node and orchestrator will make sure jobs are not lost,
>> it simply take those jobs and re-submit to gfac. Since GFac is a generic
>> framework and we have multiple plugins developed for that framework
>> checkpointing the plugin is up to the plugin developers but gfac
>> checkpoints whether those plugins invoked or not.
>>
>> I have introduced a new interface for plugin development called Recoverable
>> (RecoverableHandlers and RecoverableProvider). So state-full plugins has to
>> implement their recover method and gfac framework will make sure it will be
>> invoked during a re-run scenario. If a plugin is not recoverable and
>> already ran(can be found using framework checkpointing) during the re-run
>> that plugin will not be invoked. For now I just implemented recoverability
>> to few plugins and I have tested submitting a job to trestles and let it
>> submit and come to monitoring state and kill that gfac instance. Now
>> Orchestrator pick that execution and re-submit to another gfac node and
>> that gfac node does not re-run that job to the computing resource, but
>> simply start monitoring once the job is done outputs are downloaded from
>> the original output location.
>>
>> When a particular experiment is finished all the ZK data is removed.
>>
>> At this point following things needs to be done,
>>
>> 1. Figure out all the state-full handlers/providers and implement
>> recoverability,
>>
>> Ex: Input handler is transfering 1000 files and with 500 files gfac
>> instance crashed, during the re-run it should be able to tranfer from 501
>> file. Same logic can be applied to a single huge file. Those things are
>> completely up to the plugin developer.
>>
>> 2. Then we have to do remove the RPC invocation and make gfac nodes as
>> worker nodes.
>>
>> Regards
>> Lahiru
>>
>>
>> On Wed, Jun 18, 2014 at 12:11 PM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>>
>>> Hi Eran,
>>>
>>>
>>> On Tue, Jun 17, 2014 at 4:06 PM, Eran Chinthaka Withana <
>>> eran.chinthaka@gmail.com> wrote:
>>>
>>>> Storm has a Kafka spout which manages the cursor location (pointer to
>> the
>>>> head of the queue representing the next message to be processed) inside
>>>> ZK.
>>>> Each storm spout instance uses this information to get the next item to
>>>> process. Storm kafka spout won't advance to the next message until it
>> gets
>>>> an ack from the storm topology.
>>>>
>>> If we have 10 jobs in the queue and 5 GFAC instances picked 1 at a time
>>> and successfully submitted and have to start taking rest of the jobs. But
>>> all 5 GFAC instances are responsible for initially picked  5 jobs because
>>> they are still running and gfac instances are monitoring them until its
>>> done but at the same time we have to move the cursor to pick other jobs
>>> too.
>>>
>>> If we Ack and moved the cursor just after submission without waiting
>> until
>>> the job is actually finished how are we going to know which gfac is
>>> monitoring which set of jobs ?
>>>
>>> I am not getting how achieve above requirement with this suggestion. May
>>> be I am missing something here.
>>>
>>> Regards
>>> Lahiru
>>>
>>>> So, if there is an exception in the topology and ack is sent only by the
>>>> last bolt, then storm bolt make sure all messages are processed since
>>>> exceptions won't generate acks.
>>>>
>>>> Thanks,
>>>> Eran Chinthaka Withana
>>>>
>>>>
>>>> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <glahiru@gmail.com
>>>> wrote:
>>>>
>>>>> Hi Eran,
>>>>>
>>>>> I think I should take back my last email. When I carefully look at
>>>> storm I
>>>>> have following question.
>>>>>
>>>>> How are we going to store the Job statuses  and relaunch the jobs
>> which
>>>> was
>>>>> running in failure nodes ? Its true that storm is starting new workers
>>>> but
>>>>> there should be a way to find missing jobs by someone in the system.
>>>> Since
>>>>> we are not having a data stream there is no use to start new workers
>>>> unless
>>>>> we handler the missing jobs. I think we need to have a better control
>> of
>>>>> our component and persist the states of jobs each GFAC node is
>> handling.
>>>>> Directly using zookeeper will let us to do a proper fault tolerance
>>>>> implementation.
>>>>>
>>>>> Regards
>>>>> Lahiru
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <
>> glahiru@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Supun,
>>>>>>
>>>>>> I think in this usecase we only use storm topology to do the
>>>>> communication
>>>>>> among workers and we are completely ignoring the stream processing
>>>> part.
>>>>>> Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes
>>>> in
>>>>> the
>>>>>> storm topology. But I think we can achieve extremely fault tolerance
>>>>> system
>>>>>> by directly using storm based on following statement in storm site
>>>> with
>>>>>> minimum changes in airavata.
>>>>>>
>>>>>> Additionally, the Nimbus daemon and Supervisor daemons are fail-fast
>>>> and
>>>>>> stateless; all state is kept in Zookeeper or on local disk. This
>> means
>>>>> you
>>>>>> can kill -9 Nimbus or the Supervisors and they’ll start back up like
>>>>>> nothing happened. This design leads to Storm clusters being
>> incredibly
>>>>>> stable.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <
>>>> supun06@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Eran,
>>>>>>>
>>>>>>> I'm using Storm every day and this is one of the strangest things
>>>> I've
>>>>>>> heard about using Storm. My be there are more use cases for Storm
>>>> other
>>>>>>> than Distributed Stream processing. AFAIK the Bolts, spouts are
>>>> built to
>>>>>>> handle a stream of events that doesn't take much time to process.
>> In
>>>>>>> Airavata we don't process the messages. Instead we run experiments
>>>> based
>>>>>>> on
>>>>>>> the commands given.
>>>>>>>
>>>>>>> If you want process isolation, distributed execution, cluster
>>>> resource
>>>>>>> management Yarn would be a better thing to explore.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Supun..
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
>>>>>>> eran.chinthaka@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Lahiru,
>>>>>>>>
>>>>>>>> good summarization. Thanks Lahiru.
>>>>>>>>
>>>>>>>> I think you are trying to stick to a model where Orchestrator
>>>>>>> distributing
>>>>>>>> to work for GFac worker and trying to do the impedance mismatch
>>>>> through
>>>>>>> a
>>>>>>>> messaging solution. If you step back and think, we don't even
>> want
>>>> the
>>>>>>>> orchestrator to handle everything. From its point of view, it
>>>> should
>>>>>>> submit
>>>>>>>> jobs to the framework, and will wait or get notified once the job
>>>> is
>>>>>>> done.
>>>>>>>> There are multiple ways of doing this. And here is one method.
>>>>>>>>
>>>>>>>> Orchestrator submits all its jobs to Job queue (implemented using
>>>> any
>>>>> MQ
>>>>>>>> impl like Rabbit or Kafka). A storm topology is implemented to
>>>> dequeue
>>>>>>>> messages, process them (i.e. submit those jobs and get those
>>>> executed)
>>>>>>> and
>>>>>>>> notify the Orchestrator with the status (either through another
>>>>>>>> JobCompletionQueue or direct invocation).
>>>>>>>>
>>>>>>>> With this approach, the MQ provider will help to match impedance
>>>>> between
>>>>>>>> job submission and consumption. Storm helps with worker
>>>> coordination,
>>>>>>> load
>>>>>>>> balancing, throttling on your job execution framework, worker
>> pool
>>>>>>>> management and fault tolerance.
>>>>>>>>
>>>>>>>> Of course, you can implement this based only on ZK and handle
>>>>> everything
>>>>>>>> else on your own but storm had done exactly that with the use of
>> ZK
>>>>>>>> underneath.
>>>>>>>>
>>>>>>>> Finally, if you go for a model like this, then even beyond job
>>>>>>> submission,
>>>>>>>> you can use the same model to do anything within the framework
>> for
>>>>>>> internal
>>>>>>>> communication. For example, the workflow engine will submit its
>>>> jobs
>>>>> to
>>>>>>>> queues based on what it has to do. Storm topologies exists for
>> each
>>>>>>> queues
>>>>>>>> to dequeue messages and carry out the work in a reliable manner.
>>>>>>> Consider
>>>>>>>> this as mini-workflows within a larger workflow framework.
>>>>>>>>
>>>>>>>> We can have a voice chat if its more convenient. But not at 7am
>>>> PST :)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Eran Chinthaka Withana
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
>>>>> glahiru@gmail.com
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> Ignoring the tool that we are going to use to implement fault
>>>>>>> tolerance I
>>>>>>>>> have summarized the model we have decided so far. I will use
>> the
>>>>> tool
>>>>>>>> name
>>>>>>>>> as X, we can use Zookeeper or some other implementation.
>>>> Following
>>>>>>> design
>>>>>>>>> assume tool X  and Registry have high availability.
>>>>>>>>>
>>>>>>>>> 1. Orchestrator and GFAC worker node communication is going to
>> be
>>>>>>> queue
>>>>>>>>> based and tool X is going to be used for this communication.
>> (We
>>>>> have
>>>>>>> to
>>>>>>>>> implement this with considering race condition between
>> different
>>>>> gfac
>>>>>>>>> workers).
>>>>>>>>> 2. We are having multiple instances of GFAC which are identical
>>>> (In
>>>>>>>> future
>>>>>>>>> we can group gfac workers). Existence of each worker node is
>>>>>>> identified
>>>>>>>>> using X. If node goes down orchestrator will be notified by X.
>>>>>>>>> 3. When a particular request comes and accepted by one gfac
>>>> worker
>>>>>>> that
>>>>>>>>> information will be replicated in tool X and a place where this
>>>>>>>> information
>>>>>>>>> is persisted even the worker failed.
>>>>>>>>> 4. When a job comes to a final state like failed or cancelled
>> or
>>>>>>>> completed
>>>>>>>>> above information will be removed. So at a given time
>>>> orchestrator
>>>>> can
>>>>>>>> poll
>>>>>>>>> active jobs in each worker by giving a worker ID.
>>>>>>>>> 5. Tool X will make sure that when a worker goes down it will
>>>> notify
>>>>>>>>> orchestrator. During a worker failure, based on step 3 and 4
>>>>>>> orchestrator
>>>>>>>>> can poll all the active jobs of that worker and do the same
>> thing
>>>>>>> like in
>>>>>>>>> step 1 (store the experiment ID to the queue) and gfac worker
>>>> will
>>>>>>> pick
>>>>>>>> the
>>>>>>>>> jobs.
>>>>>>>>>
>>>>>>>>> 6. When GFAC receive a job like in step 5 it have to carefully
>>>>>>> evaluate
>>>>>>>> the
>>>>>>>>> state from registry and decide what to be done (If the job is
>>>>> pending
>>>>>>>> then
>>>>>>>>> gfac just have to monitor, if job state is like input
>> transferred
>>>>> not
>>>>>>>> even
>>>>>>>>> submitted gfac has to execute rest of the chain and submit the
>>>> job
>>>>> to
>>>>>>> the
>>>>>>>>> resource and start monitoring).
>>>>>>>>>
>>>>>>>>> If we can find a tool X which supports all these features and
>>>> tool
>>>>>>> itself
>>>>>>>>> is fault tolerance and support atomicity, high availability and
>>>>> simply
>>>>>>>> API
>>>>>>>>> to implement we can use that tool.
>>>>>>>>>
>>>>>>>>> WDYT ?
>>>>>>>>>
>>>>>>>>> Lahiru
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
>>>>>>> supun06@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>
>>>>>>>>>> Before moving with an implementation it may be worth to
>>>> consider
>>>>>>> some
>>>>>>>> of
>>>>>>>>>> the following aspects as well.
>>>>>>>>>>
>>>>>>>>>> 1. How to report the progress of an experiment as state in
>>>>>>> ZooKeeper?
>>>>>>>>> What
>>>>>>>>>> happens if a GFac instance crashes while executing an
>>>> experiment?
>>>>>>> Are
>>>>>>>>> there
>>>>>>>>>> check-points we can save so that another GFac instance can
>> take
>>>>>>> over?
>>>>>>>>>> 2. What is the threading model of GFac instances? (I consider
>>>> this
>>>>>>> as a
>>>>>>>>>> very important aspect)
>>>>>>>>>> 3. What are the information needed to be stored in the
>>>> ZooKeeper?
>>>>>>> You
>>>>>>>> may
>>>>>>>>>> need to store other information about an experiment apart
>> from
>>>> its
>>>>>>>>>> experiment ID.
>>>>>>>>>> 4. How to report errors?
>>>>>>>>>> 5. For GFac weather you need a threading model or worker
>>>> process
>>>>>>> model?
>>>>>>>>>> Thanks,
>>>>>>>>>> Supun..
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
>>>>>>> glahiru@gmail.com
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I think the conclusion is like this,
>>>>>>>>>>>
>>>>>>>>>>> 1, We make the gfac as a worker not a thrift service and we
>>>> can
>>>>>>> start
>>>>>>>>>>> multiple workers either with bunch of providers and
>> handlers
>>>>>>>> configured
>>>>>>>>>> in
>>>>>>>>>>> each worker or provider specific  workers to handle the
>> class
>>>>> path
>>>>>>>>> issues
>>>>>>>>>>> (not the common scenario).
>>>>>>>>>>>
>>>>>>>>>>> 2. Gfac workers can be configured to watch for a given path
>>>> in
>>>>>>>>> zookeeper,
>>>>>>>>>>> and multiple workers can listen to the same path. Default
>>>> path
>>>>>>> can be
>>>>>>>>>>> /airavata/gfac or can configure paths like
>>>> /airavata/gfac/gsissh
>>>>>>>>>>> /airavata/gfac/bes.
>>>>>>>>>>>
>>>>>>>>>>> 3. Orchestrator can configure with a logic to store
>>>> experiment
>>>>>>> IDs in
>>>>>>>>>>> zookeeper with a path, and orchestrator can be configured
>> to
>>>>>>> provider
>>>>>>>>>>> specific path logic too. So when a new request come
>>>> orchestrator
>>>>>>>> store
>>>>>>>>>> the
>>>>>>>>>>> experimentID and these experiments IDs are stored in Zk as
>> a
>>>>>>> queue.
>>>>>>>>>>> 4. Since gfac workers are watching they will be notified
>> and
>>>> as
>>>>>>> supun
>>>>>>>>>>> suggested can use a leader selection algorithm[1] and one
>>>> gfac
>>>>>>> worker
>>>>>>>>>>   will
>>>>>>>>>>> take the leadership for each experiment. If there are gfac
>>>>>>> instances
>>>>>>>>> for
>>>>>>>>>>> each provider same logic will apply among those nodes with
>>>> same
>>>>>>>>> provider
>>>>>>>>>>> type.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>> http://curator.apache.org/curator-recipes/leader-election.html
>>>>>>>>>>> I would like to implement this if there are  no objections.
>>>>>>>>>>>
>>>>>>>>>>> Lahiru
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
>>>>>>>>> supun06@gmail.com
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Marlon,
>>>>>>>>>>>>
>>>>>>>>>>>> I think you are exactly correct.
>>>>>>>>>>>>
>>>>>>>>>>>> Supun..
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
>>>>>>> marpierc@iu.edu>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> Let me restate this, and please tell me if I'm wrong.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Orchestrator decides (somehow) that a particular job
>>>>> requires
>>>>>>>>>> JSDL/BES,
>>>>>>>>>>>> so
>>>>>>>>>>>>> it places the Experiment ID in Zookeeper's
>>>>>>>> /airavata/gfac/jsdl-bes
>>>>>>>>>>> node.
>>>>>>>>>>>>>   GFAC servers associated with this instance notice the
>>>>> update.
>>>>>>>>   The
>>>>>>>>>>> first
>>>>>>>>>>>>> GFAC to claim the job gets it, uses the Experiment ID
>> to
>>>> get
>>>>>>> the
>>>>>>>>>>> detailed
>>>>>>>>>>>>> information it needs from the Registry.  ZooKeeper
>>>> handles
>>>>> the
>>>>>>>>>> locking,
>>>>>>>>>>>> etc
>>>>>>>>>>>>> to make sure that only one GFAC at a time is trying to
>>>>> handle
>>>>>>> an
>>>>>>>>>>>> experiment.
>>>>>>>>>>>>> Marlon
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the clarification.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva
>> <
>>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   Hi Lahiru,
>>>>>>>>>>>>>>> My suggestion is that may be you don't need a Thrift
>>>>> service
>>>>>>>>>> between
>>>>>>>>>>>>>>> Orchestrator and the component executing the
>>>> experiment.
>>>>>>> When a
>>>>>>>>> new
>>>>>>>>>>>>>>> experiment is submitted, orchestrator decides who can
>>>>>>> execute
>>>>>>>>> this
>>>>>>>>>>> job.
>>>>>>>>>>>>>>> Then it put the information about this experiment
>>>>> execution
>>>>>>> in
>>>>>>>>>>>> ZooKeeper.
>>>>>>>>>>>>>>> The component which wants to executes the experiment
>> is
>>>>>>>> listening
>>>>>>>>>> to
>>>>>>>>>>>> this
>>>>>>>>>>>>>>> ZooKeeper path and when it sees the experiment it
>> will
>>>>>>> execute
>>>>>>>>> it.
>>>>>>>>>> So
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> the communication happens through an state change in
>>>>>>> ZooKeeper.
>>>>>>>>>> This
>>>>>>>>>>>> can
>>>>>>>>>>>>>>> potentially simply your architecture.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Supun.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake
>> <
>>>>>>>>>>>> glahiru@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   Hi Supun,
>>>>>>>>>>>>>>>> So your suggestion is to create a znode for each
>>>> thrift
>>>>>>>> service
>>>>>>>>> we
>>>>>>>>>>>> have
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> when the request comes that node gets modified with
>>>> input
>>>>>>> data
>>>>>>>>> for
>>>>>>>>>>>> that
>>>>>>>>>>>>>>>> request and thrift service is having a watch for
>> that
>>>>> node
>>>>>>> and
>>>>>>>>> it
>>>>>>>>>>> will
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>> notified because of the watch and it can read the
>>>> input
>>>>>>> from
>>>>>>>>>>> zookeeper
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> invoke the operation?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun
>> Kamburugamuva
>>>> <
>>>>>>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>   Hi all,
>>>>>>>>>>>>>>>>> Here is what I think about Airavata and ZooKeeper.
>> In
>>>>>>>> Airavata
>>>>>>>>>>> there
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>> many components and these components must be
>>>> stateless
>>>>> to
>>>>>>>>> achieve
>>>>>>>>>>>>>>>>> scalability and reliability.Also there must be a
>>>>>>> mechanism to
>>>>>>>>>>>>>>>> communicate
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> between the components. At the moment Airavata uses
>>>> RPC
>>>>>>> calls
>>>>>>>>>> based
>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>> Thrift for the communication.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ZooKeeper can be used both as a place to hold state
>>>> and
>>>>>>> as a
>>>>>>>>>>>>>>>> communication
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> layer between the components. I'm involved with a
>>>>> project
>>>>>>>> that
>>>>>>>>>> has
>>>>>>>>>>>> many
>>>>>>>>>>>>>>>>> distributed components like AIravata. Right now we
>>>> use
>>>>>>> Thrift
>>>>>>>>>>>> services
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> communicate among the components. But we find it
>>>>>>> difficult to
>>>>>>>>> use
>>>>>>>>>>> RPC
>>>>>>>>>>>>>>>> calls
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> and achieve stateless behaviour and thinking of
>>>>> replacing
>>>>>>>>> Thrift
>>>>>>>>>>>>>>>> services
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> with ZooKeeper based communication layer. So I
>> think
>>>> it
>>>>> is
>>>>>>>>> better
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> explore the possibility of removing the Thrift
>>>> services
>>>>>>>> between
>>>>>>>>>> the
>>>>>>>>>>>>>>>>> components and use ZooKeeper as a communication
>>>>> mechanism
>>>>>>>>> between
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> services. If you do this you will have to move the
>>>> state
>>>>>>> to
>>>>>>>>>>> ZooKeeper
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> will automatically achieve the stateless behaviour
>> in
>>>>> the
>>>>>>>>>>> components.
>>>>>>>>>>>>>>>>> Also I think trying to make ZooKeeper optional is a
>>>> bad
>>>>>>> idea.
>>>>>>>>> If
>>>>>>>>>> we
>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>> trying to integrate something fundamentally
>>>> important to
>>>>>>>>>>> architecture
>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>> how to store state, we shouldn't make it optional.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Supun..
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera
>>>> Rathnayaka <
>>>>>>>>>>>>>>>>> shameerainfo@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   Hi Lahiru,
>>>>>>>>>>>>>>>>>> As i understood,  not only reliability , you are
>>>> trying
>>>>>>> to
>>>>>>>>>> achieve
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>> other requirement by introducing zookeeper, like
>>>> health
>>>>>>>>>> monitoring
>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> services, categorization with service
>> implementation
>>>> etc
>>>>>>> ...
>>>>>>>> .
>>>>>>>>> In
>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> case, i think we can get use of zookeeper's
>> features
>>>>> but
>>>>>>> if
>>>>>>>> we
>>>>>>>>>>> only
>>>>>>>>>>>>>>>>> focus
>>>>>>>>>>>>>>>>> on reliability, i have little bit of concern, why
>>>> can't
>>>>> we
>>>>>>>> use
>>>>>>>>>>>>>>>>> clustering +
>>>>>>>>>>>>>>>>> LB ?
>>>>>>>>>>>>>>>>>> Yes it is better we add Zookeeper as a
>> prerequisite
>>>> if
>>>>>>> user
>>>>>>>>> need
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>    Shameera.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru
>> Gunathilake
>>>> <
>>>>>>>>>>>>>>>>>> glahiru@gmail.com
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>   Hi Gagan,
>>>>>>>>>>>>>>>>>>> I need to start another discussion about it, but
>> I
>>>> had
>>>>>>> an
>>>>>>>>>> offline
>>>>>>>>>>>>>>>>>>> discussion with Suresh about auto-scaling. I will
>>>>> start
>>>>>>>>> another
>>>>>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>>>> about this topic too.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>   Thanks Lahiru for pointing to nice library,
>> added
>>>> to
>>>>> my
>>>>>>>>>>> dictionary
>>>>>>>>>>>>>>>>>>> :).
>>>>>>>>>>>>>>>>>   I would like to know how are we planning to start
>>>>>>> multiple
>>>>>>>>>>> servers.
>>>>>>>>>>>>>>>>>>>> 1. Spawning new servers based on load? Some
>> times
>>>> we
>>>>>>> call
>>>>>>>> it
>>>>>>>>>> as
>>>>>>>>>>>> auto
>>>>>>>>>>>>>>>>>>>> scalable.
>>>>>>>>>>>>>>>>>>>> 2. To make some specific number of nodes
>> available
>>>>>>> such as
>>>>>>>>> we
>>>>>>>>>>>> want 2
>>>>>>>>>>>>>>>>>>>> servers to be available at any time so if one
>> goes
>>>>> down
>>>>>>>>> then I
>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>   spawn one new to make available servers count 2.
>>>>>>>>>>>>>>>>>>>> 3. Initially start all the servers.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In scenario 1 and 2 zookeeper does make sense
>> but
>>>> I
>>>>>>> don't
>>>>>>>>>>> believe
>>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> architecture support this?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
>>>>>>>>>> glahiru@gmail.com
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Gagan,
>>>>>>>>>>>>>>>>>>>>> Thanks for your response. Please see my inline
>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>   Hi Lahiru,
>>>>>>>>>>>>>>>>>>>>>> Just my 2 cents.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I am big fan of zookeeper but also against
>>>> adding
>>>>>>>> multiple
>>>>>>>>>>> hops
>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> system which can add unnecessary complexity.
>> Here
>>>> I
>>>>> am
>>>>>>> not
>>>>>>>>>> able
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> understand the requirement of zookeeper may be
>>>> I am
>>>>>>>> wrong
>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> less
>>>>>>>>>>>>>>>>>>>> knowledge of the airavata system in whole. So I
>>>> would
>>>>>>> like
>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> discuss
>>>>>>>>>>>>>>>>>   following point.
>>>>>>>>>>>>>>>>>>>>>> 1. How it will help us in making system more
>>>>>>> reliable.
>>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>> able to restart services. At max it can tell
>>>> whether
>>>>>>>> service
>>>>>>>>>> is
>>>>>>>>>>> up
>>>>>>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>>>>>> which could only be the case if airavata service
>>>> goes
>>>>>>> down
>>>>>>>>>>>>>>>>>>>>> gracefully and
>>>>>>>>>>>>>>>>>>>> we have any automated way to restart it. If this
>>>> is
>>>>>>> just
>>>>>>>>>> matter
>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> routing
>>>>>>>>>>>>>>>>>>>> client requests to the available thrift servers
>>>> then
>>>>>>> this
>>>>>>>>> can
>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> achieved
>>>>>>>>>>>>>>>>>>>> with the help of load balancer which I guess is
>>>>> already
>>>>>>>>> there
>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> thrift
>>>>>>>>>>>>>>>>>>>> wish list.
>>>>>>>>>>>>>>>>>>>>>>   We have multiple thrift services and
>> currently
>>>> we
>>>>>>> start
>>>>>>>>>> only
>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>>>>> of them and each thrift service is a stateless
>>>>>>> service. To
>>>>>>>>>> keep
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> high
>>>>>>>>>>>>>>>>>>>> availability we have to start multiple instances
>>>> of
>>>>>>> them
>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> production
>>>>>>>>>>>>>>>>>   scenario. So for clients to get an available
>> thrift
>>>>>>> service
>>>>>>>> we
>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>>>>   zookeeper znodes to represent each available
>>>> service.
>>>>>>> There
>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>   libraries which is doing similar[1] and I think we
>>>> can
>>>>>>> use
>>>>>>>>> them
>>>>>>>>>>>>>>>>>>>> directly.
>>>>>>>>>>>>>>>>>>>> 2. As far as registering of different providers
>> is
>>>>>>>> concerned
>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>   think for that we really need external store.
>>>>>>>>>>>>>>>>>>>>>>   Yes I think so, because its light weight and
>>>>>>> reliable
>>>>>>>> and
>>>>>>>>>> we
>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>> very minimal amount of work to achieve all these
>>>>>>> features
>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> Airavata
>>>>>>>>>>>>>>>>>   because zookeeper handle all the complexity.
>>>>>>>>>>>>>>>>>>>>>   I have seen people using zookeeper more for
>>>> state
>>>>>>>>> management
>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>> distributed environments.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   +1, we might not be the most effective users
>> of
>>>>>>>> zookeeper
>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>> our services are stateless services, but my
>> point
>>>> is
>>>>> to
>>>>>>>>>> achieve
>>>>>>>>>>>>>>>>>>>>> fault-tolerance we can use zookeeper and with
>>>>> minimal
>>>>>>>> work.
>>>>>>>>>>>>>>>>>>>>>     I would like to understand more how can we
>>>>> leverage
>>>>>>>>>>> zookeeper
>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>> airavata to make system reliable.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>   [1]
>>>> https://github.com/eirslett/thrift-zookeeper
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>   Regards,
>>>>>>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
>>>>>>>> marpierc@iu.edu
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>   Thanks for the summary, Lahiru. I'm cc'ing
>> the
>>>>>>>>> Architecture
>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>   additional comments.
>>>>>>>>>>>>>>>>>>>>>>> Marlon
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I did little research about Apache
>>>> Zookeeper[1]
>>>>> and
>>>>>>>> how
>>>>>>>>> to
>>>>>>>>>>> use
>>>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>   airavata. Its really a nice way to achieve
>> fault
>>>>>>>> tolerance
>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> reliable
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> communication between our thrift services
>> and
>>>>>>> clients.
>>>>>>>>>>>>>>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>>>>   distributed, fault tolerant system to do a
>>>> reliable
>>>>>>>>>>>>>>>>>>>>>>> communication
>>>>>>>>>>>>>>>>>   between
>>>>>>>>>>>>>>>>>>>>>>>> distributed applications. This is like an
>>>>> in-memory
>>>>>>>> file
>>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>   has
>>>>>>>>>>>>>>>>>>>>>>>> nodes in a tree structure and each node can
>>>> have
>>>>>>> small
>>>>>>>>>>> amount
>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> data
>>>>>>>>>>>>>>>>>>>>   associated with it and these nodes are called
>>>>> znodes.
>>>>>>>>> Clients
>>>>>>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>   connect
>>>>>>>>>>>>>>>>>>>>>>>> to a zookeeper server and add/delete and
>>>> update
>>>>>>> these
>>>>>>>>>>> znodes.
>>>>>>>>>>>>>>>>>>>>>>>>     In Apache Airavata we start multiple
>> thrift
>>>>>>>> services
>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>   go
>>>>>>>>>>>>>>>>>>>>>>>> down for maintenance or these can crash, if
>> we
>>>>> use
>>>>>>>>>> zookeeper
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>>>>>>>   these
>>>>>>>>>>>>>>>>>>>>>>>> configuration(thrift service configurations)
>>>> we
>>>>> can
>>>>>>>>>> achieve
>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>>>   reliable
>>>>>>>>>>>>>>>>>>>>>>>> system. Basically thrift clients can
>>>> dynamically
>>>>>>>>> discover
>>>>>>>>>>>>>>>>>>>>>>> available
>>>>>>>>>>>>>>>>>>>>   service
>>>>>>>>>>>>>>>>>>>>>>>> by using ephemeral znodes(Here we do not
>> have
>>>> to
>>>>>>>> change
>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> generated
>>>>>>>>>>>>>>>>>>>>   thrift client code but we have to change the
>>>>>>> locations we
>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>>> invoking
>>>>>>>>>>>>>>>>>>>>   them). ephemeral znodes will be removed when
>> the
>>>>>>> thrift
>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>>>>> goes
>>>>>>>>>>>>>>>>>>>>   down
>>>>>>>>>>>>>>>>>>>>>>>> and zookeeper guarantee the atomicity
>> between
>>>>> these
>>>>>>>>>>>> operations.
>>>>>>>>>>>>>>>>>>>>>>> With
>>>>>>>>>>>>>>>>>>>>   this
>>>>>>>>>>>>>>>>>>>>>>>> approach we can have a node hierarchy for
>>>>> multiple
>>>>>>> of
>>>>>>>>>>>> airavata,
>>>>>>>>>>>>>>>>>>>>>>>> orchestrator,appcatalog and gfac thrift
>>>> services.
>>>>>>>>>>>>>>>>>>>>>>>> For specifically for gfac we can have
>>>> different
>>>>>>> types
>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>> services
>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>   each
>>>>>>>>>>>>>>>>>>>>>>>> provider implementation. This can be
>> achieved
>>>> by
>>>>>>> using
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> hierarchical
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> support in zookeeper and providing some
>> logic
>>>> in
>>>>>>>>>> gfac-thrift
>>>>>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>>   to
>>>>>>>>>>>>>>>>>>>>>>>> register it to a defined path. Using the
>> same
>>>>> logic
>>>>>>>>>>>>>>>>>>>>>>> orchestrator
>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>   discover the provider specific gfac thrift
>>>> service
>>>>> and
>>>>>>>>> route
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>   message to
>>>>>>>>>>>>>>>>>>>>>>>> the correct thrift service.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> With this approach I think we simply have
>>>> write
>>>>>>> some
>>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>   thrift
>>>>>>>>>>>>>>>>>>>>>>>> services and clients and zookeeper server
>>>>>>> installation
>>>>>>>>> can
>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>> done as
>>>>>>>>>>>>>>>>>>>>   a
>>>>>>>>>>>>>>>>>>>>>>>> separate process and it will be easier to
>> keep
>>>>> the
>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>>>>>>>> server
>>>>>>>>>>>>>>>>>>>>   separate from Airavata because installation of
>>>>>>> Zookeeper
>>>>>>>>>> server
>>>>>>>>>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>>>>>>>   complex in production scenario. I think we have
>>>> to
>>>>>>> make
>>>>>>>>> sure
>>>>>>>>>>>>>>>>>>>>>>> everything
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> works fine when there is no Zookeeper
>> running,
>>>>> ex:
>>>>>>>>>>>>>>>>>>>>>>> enable.zookeeper=false
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> should works fine and users doesn't have to
>>>>>>> download
>>>>>>>> and
>>>>>>>>>>> start
>>>>>>>>>>>>>>>>>>>>>>> zookeeper.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>> Shameera Rathnayaka.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> email: shameera AT apache.org , shameerainfo AT
>>>>>>> gmail.com
>>>>>>>>>>>>>>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>>>>>>> Member, Apache Software Foundation;
>>>>> http://www.apache.org
>>>>>>>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369
>> 6762
>>>>>>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>>>>> Member, Apache Software Foundation;
>>>> http://www.apache.org
>>>>>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>> Member, Apache Software Foundation;
>> http://www.apache.org
>>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>> PTI Lab
>>>>>>>>>>> Indiana University
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> System Analyst Programmer
>>>>>>>>> PTI Lab
>>>>>>>>> Indiana University
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Supun Kamburugamuva
>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> System Analyst Programmer
>>>>>> PTI Lab
>>>>>> Indiana University
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> System Analyst Programmer
>>>>> PTI Lab
>>>>> Indiana University
>>>>>
>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

+1.


On Tue, Jun 17, 2014 at 1:27 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Also I believe the Tool X has to do some distributed coordination among the
> GFac instances. So I think we are talking about ZooKeeper or an equivalent
> :)
>
>
> On Tue, Jun 17, 2014 at 1:12 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > Ignoring the tool that we are going to use to implement fault tolerance I
> > have summarized the model we have decided so far. I will use the tool
> name
> > as X, we can use Zookeeper or some other implementation. Following design
> > assume tool X  and Registry have high availability.
> >
> > 1. Orchestrator and GFAC worker node communication is going to be queue
> > based and tool X is going to be used for this communication. (We have to
> > implement this with considering race condition between different gfac
> > workers).
> > 2. We are having multiple instances of GFAC which are identical (In
> future
> > we can group gfac workers). Existence of each worker node is identified
> > using X. If node goes down orchestrator will be notified by X.
> > 3. When a particular request comes and accepted by one gfac worker that
> > information will be replicated in tool X and a place where this
> information
> > is persisted even the worker failed.
> > 4. When a job comes to a final state like failed or cancelled or
> completed
> > above information will be removed. So at a given time orchestrator can
> poll
> > active jobs in each worker by giving a worker ID.
> > 5. Tool X will make sure that when a worker goes down it will notify
> > orchestrator. During a worker failure, based on step 3 and 4 orchestrator
> > can poll all the active jobs of that worker and do the same thing like in
> > step 1 (store the experiment ID to the queue) and gfac worker will pick
> the
> > jobs.
> >
> > 6. When GFAC receive a job like in step 5 it have to carefully evaluate
> the
> > state from registry and decide what to be done (If the job is pending
> then
> > gfac just have to monitor, if job state is like input transferred not
> even
> > submitted gfac has to execute rest of the chain and submit the job to the
> > resource and start monitoring).
> >
> > If we can find a tool X which supports all these features and tool itself
> > is fault tolerance and support atomicity, high availability and simply
> API
> > to implement we can use that tool.
> >
> > WDYT ?
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
> > wrote:
> >
> > > Hi Lahiru,
> > >
> > > Before moving with an implementation it may be worth to consider some
> of
> > > the following aspects as well.
> > >
> > > 1. How to report the progress of an experiment as state in ZooKeeper?
> > What
> > > happens if a GFac instance crashes while executing an experiment? Are
> > there
> > > check-points we can save so that another GFac instance can take over?
> > > 2. What is the threading model of GFac instances? (I consider this as a
> > > very important aspect)
> > > 3. What are the information needed to be stored in the ZooKeeper? You
> may
> > > need to store other information about an experiment apart from its
> > > experiment ID.
> > > 4. How to report errors?
> > > 5. For GFac weather you need a threading model or worker process model?
> > >
> > > Thanks,
> > > Supun..
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <glahiru@gmail.com
> >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I think the conclusion is like this,
> > > >
> > > > 1, We make the gfac as a worker not a thrift service and we can start
> > > > multiple workers either with bunch of providers and handlers
> configured
> > > in
> > > > each worker or provider specific  workers to handle the class path
> > issues
> > > > (not the common scenario).
> > > >
> > > > 2. Gfac workers can be configured to watch for a given path in
> > zookeeper,
> > > > and multiple workers can listen to the same path. Default path can be
> > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > > /airavata/gfac/bes.
> > > >
> > > > 3. Orchestrator can configure with a logic to store experiment IDs in
> > > > zookeeper with a path, and orchestrator can be configured to provider
> > > > specific path logic too. So when a new request come orchestrator
> store
> > > the
> > > > experimentID and these experiments IDs are stored in Zk as a queue.
> > > >
> > > > 4. Since gfac workers are watching they will be notified and as supun
> > > > suggested can use a leader selection algorithm[1] and one gfac worker
> > >  will
> > > > take the leadership for each experiment. If there are gfac instances
> > for
> > > > each provider same logic will apply among those nodes with same
> > provider
> > > > type.
> > > >
> > > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > > >
> > > > I would like to implement this if there are  no objections.
> > > >
> > > > Lahiru
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> > supun06@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Marlon,
> > > > >
> > > > > I think you are exactly correct.
> > > > >
> > > > > Supun..
> > > > >
> > > > >
> > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> > > wrote:
> > > > >
> > > > > > Let me restate this, and please tell me if I'm wrong.
> > > > > >
> > > > > > Orchestrator decides (somehow) that a particular job requires
> > > JSDL/BES,
> > > > > so
> > > > > > it places the Experiment ID in Zookeeper's
> /airavata/gfac/jsdl-bes
> > > > node.
> > > > > >  GFAC servers associated with this instance notice the update.
>  The
> > > > first
> > > > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > > > detailed
> > > > > > information it needs from the Registry.  ZooKeeper handles the
> > > locking,
> > > > > etc
> > > > > > to make sure that only one GFAC at a time is trying to handle an
> > > > > experiment.
> > > > > >
> > > > > > Marlon
> > > > > >
> > > > > >
> > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > > >
> > > > > >> Hi Supun,
> > > > > >>
> > > > > >> Thanks for the clarification.
> > > > > >>
> > > > > >> Regards
> > > > > >> Lahiru
> > > > > >>
> > > > > >>
> > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > > supun06@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>  Hi Lahiru,
> > > > > >>>
> > > > > >>> My suggestion is that may be you don't need a Thrift service
> > > between
> > > > > >>> Orchestrator and the component executing the experiment. When a
> > new
> > > > > >>> experiment is submitted, orchestrator decides who can execute
> > this
> > > > job.
> > > > > >>> Then it put the information about this experiment execution in
> > > > > ZooKeeper.
> > > > > >>> The component which wants to executes the experiment is
> listening
> > > to
> > > > > this
> > > > > >>> ZooKeeper path and when it sees the experiment it will execute
> > it.
> > > So
> > > > > >>> that
> > > > > >>> the communication happens through an state change in ZooKeeper.
> > > This
> > > > > can
> > > > > >>> potentially simply your architecture.
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Supun.
> > > > > >>>
> > > > > >>>
> > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > > glahiru@gmail.com>
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>>  Hi Supun,
> > > > > >>>>
> > > > > >>>> So your suggestion is to create a znode for each thrift
> service
> > we
> > > > > have
> > > > > >>>> and
> > > > > >>>> when the request comes that node gets modified with input data
> > for
> > > > > that
> > > > > >>>> request and thrift service is having a watch for that node and
> > it
> > > > will
> > > > > >>>> be
> > > > > >>>> notified because of the watch and it can read the input from
> > > > zookeeper
> > > > > >>>> and
> > > > > >>>> invoke the operation?
> > > > > >>>>
> > > > > >>>> Lahiru
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > > >>>> supun06@gmail.com>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>  Hi all,
> > > > > >>>>>
> > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
> Airavata
> > > > there
> > > > > >>>>> are
> > > > > >>>>> many components and these components must be stateless to
> > achieve
> > > > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > > > >>>>>
> > > > > >>>> communicate
> > > > > >>>>
> > > > > >>>>> between the components. At the moment Airavata uses RPC calls
> > > based
> > > > > on
> > > > > >>>>> Thrift for the communication.
> > > > > >>>>>
> > > > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > > > >>>>>
> > > > > >>>> communication
> > > > > >>>>
> > > > > >>>>> layer between the components. I'm involved with a project
> that
> > > has
> > > > > many
> > > > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > > > services
> > > > > >>>>>
> > > > > >>>> to
> > > > > >>>>
> > > > > >>>>> communicate among the components. But we find it difficult to
> > use
> > > > RPC
> > > > > >>>>>
> > > > > >>>> calls
> > > > > >>>>
> > > > > >>>>> and achieve stateless behaviour and thinking of replacing
> > Thrift
> > > > > >>>>>
> > > > > >>>> services
> > > > > >>>>
> > > > > >>>>> with ZooKeeper based communication layer. So I think it is
> > better
> > > > to
> > > > > >>>>> explore the possibility of removing the Thrift services
> between
> > > the
> > > > > >>>>> components and use ZooKeeper as a communication mechanism
> > between
> > > > the
> > > > > >>>>> services. If you do this you will have to move the state to
> > > > ZooKeeper
> > > > > >>>>>
> > > > > >>>> and
> > > > > >>>>
> > > > > >>>>> will automatically achieve the stateless behaviour in the
> > > > components.
> > > > > >>>>>
> > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea.
> > If
> > > we
> > > > > are
> > > > > >>>>> trying to integrate something fundamentally important to
> > > > architecture
> > > > > >>>>> as
> > > > > >>>>> how to store state, we shouldn't make it optional.
> > > > > >>>>>
> > > > > >>>>> Thanks,
> > > > > >>>>> Supun..
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > > >>>>>
> > > > > >>>>>  Hi Lahiru,
> > > > > >>>>>>
> > > > > >>>>>> As i understood,  not only reliability , you are trying to
> > > achieve
> > > > > >>>>>> some
> > > > > >>>>>> other requirement by introducing zookeeper, like health
> > > monitoring
> > > > > of
> > > > > >>>>>>
> > > > > >>>>> the
> > > > > >>>>
> > > > > >>>>> services, categorization with service implementation etc ...
> .
> > In
> > > > > that
> > > > > >>>>>> case, i think we can get use of zookeeper's features but if
> we
> > > > only
> > > > > >>>>>>
> > > > > >>>>> focus
> > > > > >>>>
> > > > > >>>>> on reliability, i have little bit of concern, why can't we
> use
> > > > > >>>>>>
> > > > > >>>>> clustering +
> > > > > >>>>
> > > > > >>>>> LB ?
> > > > > >>>>>>
> > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user
> > need
> > > > to
> > > > > >>>>>> use
> > > > > >>>>>> it.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>>   Shameera.
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > > >>>>>> glahiru@gmail.com
> > > > > >>>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>  Hi Gagan,
> > > > > >>>>>>>
> > > > > >>>>>>> I need to start another discussion about it, but I had an
> > > offline
> > > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> > another
> > > > > >>>>>>> thread
> > > > > >>>>>>> about this topic too.
> > > > > >>>>>>>
> > > > > >>>>>>> Regards
> > > > > >>>>>>> Lahiru
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > > >>>>>>>
> > > > > >>>>>> gagandeepjuneja@gmail.com
> > > > > >>>>
> > > > > >>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > > dictionary
> > > > > >>>>>>>>
> > > > > >>>>>>> :).
> > > > > >>>>
> > > > > >>>>>  I would like to know how are we planning to start multiple
> > > > servers.
> > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we call
> it
> > > as
> > > > > auto
> > > > > >>>>>>>> scalable.
> > > > > >>>>>>>> 2. To make some specific number of nodes available such as
> > we
> > > > > want 2
> > > > > >>>>>>>> servers to be available at any time so if one goes down
> > then I
> > > > > need
> > > > > >>>>>>>>
> > > > > >>>>>>> to
> > > > > >>>>
> > > > > >>>>>  spawn one new to make available servers count 2.
> > > > > >>>>>>>> 3. Initially start all the servers.
> > > > > >>>>>>>>
> > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > > > believe
> > > > > >>>>>>>>
> > > > > >>>>>>> existing
> > > > > >>>>>>>
> > > > > >>>>>>>> architecture support this?
> > > > > >>>>>>>>
> > > > > >>>>>>>> Regards,
> > > > > >>>>>>>> Gagan
> > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > > glahiru@gmail.com
> > > > >
> > > > > >>>>>>>>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi Gagan,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > > >>>>>>>>>
> > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > > >>>>>>>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  Hi Lahiru,
> > > > > >>>>>>>>>> Just my 2 cents.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
> multiple
> > > > hops
> > > > > in
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> the
> > > > > >>>>>>>
> > > > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> > > able
> > > > to
> > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am
> wrong
> > > > > because
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> of
> > > > > >>>>
> > > > > >>>>> less
> > > > > >>>>>>>
> > > > > >>>>>>>> knowledge of the airavata system in whole. So I would like
> > to
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> discuss
> > > > > >>>>
> > > > > >>>>>  following point.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > > Zookeeper
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> is
> > > > > >>>>
> > > > > >>>>> not
> > > > > >>>>>>>
> > > > > >>>>>>>> able to restart services. At max it can tell whether
> service
> > > is
> > > > up
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> or not
> > > > > >>>>>>>
> > > > > >>>>>>>> which could only be the case if airavata service goes down
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> gracefully and
> > > > > >>>>>>>
> > > > > >>>>>>>> we have any automated way to restart it. If this is just
> > > matter
> > > > of
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> routing
> > > > > >>>>>>>
> > > > > >>>>>>>> client requests to the available thrift servers then this
> > can
> > > be
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> achieved
> > > > > >>>>>>>
> > > > > >>>>>>>> with the help of load balancer which I guess is already
> > there
> > > in
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> thrift
> > > > > >>>>>>>
> > > > > >>>>>>>> wish list.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  We have multiple thrift services and currently we start
> > > only
> > > > > one
> > > > > >>>>>>>>>
> > > > > >>>>>>>> instance
> > > > > >>>>>>>
> > > > > >>>>>>>> of them and each thrift service is a stateless service. To
> > > keep
> > > > > the
> > > > > >>>>>>>>>
> > > > > >>>>>>>> high
> > > > > >>>>>>>
> > > > > >>>>>>>> availability we have to start multiple instances of them
> in
> > > > > >>>>>>>>>
> > > > > >>>>>>>> production
> > > > > >>>>
> > > > > >>>>>  scenario. So for clients to get an available thrift service
> we
> > > can
> > > > > >>>>>>>>>
> > > > > >>>>>>>> use
> > > > > >>>>
> > > > > >>>>>  zookeeper znodes to represent each available service. There
> > are
> > > > > >>>>>>>>>
> > > > > >>>>>>>> some
> > > > > >>>>
> > > > > >>>>>  libraries which is doing similar[1] and I think we can use
> > them
> > > > > >>>>>>>>>
> > > > > >>>>>>>> directly.
> > > > > >>>>>>>
> > > > > >>>>>>>> 2. As far as registering of different providers is
> concerned
> > > do
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> you
> > > > > >>>>
> > > > > >>>>>  think for that we really need external store.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable
> and
> > > we
> > > > > have
> > > > > >>>>>>>>>
> > > > > >>>>>>>> to
> > > > > >>>>
> > > > > >>>>> do
> > > > > >>>>>>>
> > > > > >>>>>>>> very minimal amount of work to achieve all these features
> to
> > > > > >>>>>>>>>
> > > > > >>>>>>>> Airavata
> > > > > >>>>
> > > > > >>>>>  because zookeeper handle all the complexity.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  I have seen people using zookeeper more for state
> > management
> > > > in
> > > > > >>>>>>>>>> distributed environments.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  +1, we might not be the most effective users of
> zookeeper
> > > > > because
> > > > > >>>>>>>>>
> > > > > >>>>>>>> all
> > > > > >>>>
> > > > > >>>>> of
> > > > > >>>>>>>
> > > > > >>>>>>>> our services are stateless services, but my point is to
> > > achieve
> > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal
> work.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > > zookeeper
> > > > > in
> > > > > >>>>>>>>>> airavata to make system reliable.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  Regards,
> > > > > >>>>>>>>>> Gagan
> > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> marpierc@iu.edu
> > >
> > > > > wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> > Architecture
> > > > > list
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>> for
> > > > > >>>>
> > > > > >>>>>  additional comments.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Marlon
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> Hi All,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and
> how
> > to
> > > > use
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> it
> > > > > >>>>
> > > > > >>>>> in
> > > > > >>>>>>>
> > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
> tolerance
> > > and
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> reliable
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> Zookeeper
> > > > > >>>>
> > > > > >>>>> is a
> > > > > >>>>>>>
> > > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> communication
> > > > > >>>>
> > > > > >>>>>  between
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> distributed applications. This is like an in-memory
> file
> > > > > system
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> which
> > > > > >>>>>>>
> > > > > >>>>>>>>  has
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > > > amount
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> of
> > > > > >>>>
> > > > > >>>>> data
> > > > > >>>>>>>
> > > > > >>>>>>>>  associated with it and these nodes are called znodes.
> > Clients
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> can
> > > > > >>>>
> > > > > >>>>>  connect
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > > > znodes.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
> services
> > > and
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> these
> > > > > >>>>
> > > > > >>>>> can
> > > > > >>>>>>>
> > > > > >>>>>>>>  go
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > > zookeeper
> > > > > to
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> store
> > > > > >>>>>>>
> > > > > >>>>>>>>  these
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > > achieve
> > > > a
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> very
> > > > > >>>>
> > > > > >>>>>  reliable
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> > discover
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> available
> > > > > >>>>>>>
> > > > > >>>>>>>>  service
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
> change
> > > the
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> generated
> > > > > >>>>>>>
> > > > > >>>>>>>>  thrift client code but we have to change the locations we
> > are
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> invoking
> > > > > >>>>>>>
> > > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > > service
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> goes
> > > > > >>>>>>>
> > > > > >>>>>>>>  down
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > > operations.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> With
> > > > > >>>>>>>
> > > > > >>>>>>>>  this
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > > > airavata,
> > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> For specifically for gfac we can have different types
> of
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> services
> > > > > >>>>
> > > > > >>>>> for
> > > > > >>>>>>>
> > > > > >>>>>>>>  each
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> provider implementation. This can be achieved by using
> > the
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> hierarchical
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > > gfac-thrift
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> service
> > > > > >>>>>>>
> > > > > >>>>>>>>  to
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> orchestrator
> > > > > >>>>
> > > > > >>>>> can
> > > > > >>>>>>>
> > > > > >>>>>>>>  discover the provider specific gfac thrift service and
> > route
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> the
> > > > > >>>>
> > > > > >>>>>  message to
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> the correct thrift service.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > > client
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> code
> > > > > >>>>
> > > > > >>>>> in
> > > > > >>>>>>>
> > > > > >>>>>>>>  thrift
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> services and clients and zookeeper server installation
> > can
> > > > be
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> done as
> > > > > >>>>>>>
> > > > > >>>>>>>>  a
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > > Zookeeper
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> server
> > > > > >>>>>>>
> > > > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> > > server
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> little
> > > > > >>>>>>>
> > > > > >>>>>>>>  complex in production scenario. I think we have to make
> > sure
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> everything
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> enable.zookeeper=false
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> should works fine and users doesn't have to download
> and
> > > > start
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> zookeeper.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks
> > > > > >>>>>>>>>>>> Lahiru
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>> --
> > > > > >>>>>>>>> System Analyst Programmer
> > > > > >>>>>>>>> PTI Lab
> > > > > >>>>>>>>> Indiana University
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>> --
> > > > > >>>>>>> System Analyst Programmer
> > > > > >>>>>>> PTI Lab
> > > > > >>>>>>> Indiana University
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>> --
> > > > > >>>>>> Best Regards,
> > > > > >>>>>> Shameera Rathnayaka.
> > > > > >>>>>>
> > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>> --
> > > > > >>>>> Supun Kamburugamuva
> > > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > >>>>> Blog: http://supunk.blogspot.com
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>> --
> > > > > >>>> System Analyst Programmer
> > > > > >>>> PTI Lab
> > > > > >>>> Indiana University
> > > > > >>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>> --
> > > > > >>> Supun Kamburugamuva
> > > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > >>> Blog: http://supunk.blogspot.com
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Supun Kamburugamuva
> > > > > Member, Apache Software Foundation; http://www.apache.org
> > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > Blog: http://supunk.blogspot.com
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > System Analyst Programmer
> > > > PTI Lab
> > > > Indiana University
> > > >
> > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Also I believe the Tool X has to do some distributed coordination among the
GFac instances. So I think we are talking about ZooKeeper or an equivalent
:)


On Tue, Jun 17, 2014 at 1:12 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi All,
>
> Ignoring the tool that we are going to use to implement fault tolerance I
> have summarized the model we have decided so far. I will use the tool name
> as X, we can use Zookeeper or some other implementation. Following design
> assume tool X  and Registry have high availability.
>
> 1. Orchestrator and GFAC worker node communication is going to be queue
> based and tool X is going to be used for this communication. (We have to
> implement this with considering race condition between different gfac
> workers).
> 2. We are having multiple instances of GFAC which are identical (In future
> we can group gfac workers). Existence of each worker node is identified
> using X. If node goes down orchestrator will be notified by X.
> 3. When a particular request comes and accepted by one gfac worker that
> information will be replicated in tool X and a place where this information
> is persisted even the worker failed.
> 4. When a job comes to a final state like failed or cancelled or completed
> above information will be removed. So at a given time orchestrator can poll
> active jobs in each worker by giving a worker ID.
> 5. Tool X will make sure that when a worker goes down it will notify
> orchestrator. During a worker failure, based on step 3 and 4 orchestrator
> can poll all the active jobs of that worker and do the same thing like in
> step 1 (store the experiment ID to the queue) and gfac worker will pick the
> jobs.
>
> 6. When GFAC receive a job like in step 5 it have to carefully evaluate the
> state from registry and decide what to be done (If the job is pending then
> gfac just have to monitor, if job state is like input transferred not even
> submitted gfac has to execute rest of the chain and submit the job to the
> resource and start monitoring).
>
> If we can find a tool X which supports all these features and tool itself
> is fault tolerance and support atomicity, high availability and simply API
> to implement we can use that tool.
>
> WDYT ?
>
> Lahiru
>
>
> On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi Lahiru,
> >
> > Before moving with an implementation it may be worth to consider some of
> > the following aspects as well.
> >
> > 1. How to report the progress of an experiment as state in ZooKeeper?
> What
> > happens if a GFac instance crashes while executing an experiment? Are
> there
> > check-points we can save so that another GFac instance can take over?
> > 2. What is the threading model of GFac instances? (I consider this as a
> > very important aspect)
> > 3. What are the information needed to be stored in the ZooKeeper? You may
> > need to store other information about an experiment apart from its
> > experiment ID.
> > 4. How to report errors?
> > 5. For GFac weather you need a threading model or worker process model?
> >
> > Thanks,
> > Supun..
> >
> >
> >
> >
> >
> > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <gl...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I think the conclusion is like this,
> > >
> > > 1, We make the gfac as a worker not a thrift service and we can start
> > > multiple workers either with bunch of providers and handlers configured
> > in
> > > each worker or provider specific  workers to handle the class path
> issues
> > > (not the common scenario).
> > >
> > > 2. Gfac workers can be configured to watch for a given path in
> zookeeper,
> > > and multiple workers can listen to the same path. Default path can be
> > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > /airavata/gfac/bes.
> > >
> > > 3. Orchestrator can configure with a logic to store experiment IDs in
> > > zookeeper with a path, and orchestrator can be configured to provider
> > > specific path logic too. So when a new request come orchestrator store
> > the
> > > experimentID and these experiments IDs are stored in Zk as a queue.
> > >
> > > 4. Since gfac workers are watching they will be notified and as supun
> > > suggested can use a leader selection algorithm[1] and one gfac worker
> >  will
> > > take the leadership for each experiment. If there are gfac instances
> for
> > > each provider same logic will apply among those nodes with same
> provider
> > > type.
> > >
> > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > >
> > > I would like to implement this if there are  no objections.
> > >
> > > Lahiru
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> supun06@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Marlon,
> > > >
> > > > I think you are exactly correct.
> > > >
> > > > Supun..
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> > wrote:
> > > >
> > > > > Let me restate this, and please tell me if I'm wrong.
> > > > >
> > > > > Orchestrator decides (somehow) that a particular job requires
> > JSDL/BES,
> > > > so
> > > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > > node.
> > > > >  GFAC servers associated with this instance notice the update.  The
> > > first
> > > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > > detailed
> > > > > information it needs from the Registry.  ZooKeeper handles the
> > locking,
> > > > etc
> > > > > to make sure that only one GFAC at a time is trying to handle an
> > > > experiment.
> > > > >
> > > > > Marlon
> > > > >
> > > > >
> > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > >
> > > > >> Hi Supun,
> > > > >>
> > > > >> Thanks for the clarification.
> > > > >>
> > > > >> Regards
> > > > >> Lahiru
> > > > >>
> > > > >>
> > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > supun06@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>  Hi Lahiru,
> > > > >>>
> > > > >>> My suggestion is that may be you don't need a Thrift service
> > between
> > > > >>> Orchestrator and the component executing the experiment. When a
> new
> > > > >>> experiment is submitted, orchestrator decides who can execute
> this
> > > job.
> > > > >>> Then it put the information about this experiment execution in
> > > > ZooKeeper.
> > > > >>> The component which wants to executes the experiment is listening
> > to
> > > > this
> > > > >>> ZooKeeper path and when it sees the experiment it will execute
> it.
> > So
> > > > >>> that
> > > > >>> the communication happens through an state change in ZooKeeper.
> > This
> > > > can
> > > > >>> potentially simply your architecture.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Supun.
> > > > >>>
> > > > >>>
> > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > glahiru@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>>  Hi Supun,
> > > > >>>>
> > > > >>>> So your suggestion is to create a znode for each thrift service
> we
> > > > have
> > > > >>>> and
> > > > >>>> when the request comes that node gets modified with input data
> for
> > > > that
> > > > >>>> request and thrift service is having a watch for that node and
> it
> > > will
> > > > >>>> be
> > > > >>>> notified because of the watch and it can read the input from
> > > zookeeper
> > > > >>>> and
> > > > >>>> invoke the operation?
> > > > >>>>
> > > > >>>> Lahiru
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > >>>> supun06@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>  Hi all,
> > > > >>>>>
> > > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > > there
> > > > >>>>> are
> > > > >>>>> many components and these components must be stateless to
> achieve
> > > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > > >>>>>
> > > > >>>> communicate
> > > > >>>>
> > > > >>>>> between the components. At the moment Airavata uses RPC calls
> > based
> > > > on
> > > > >>>>> Thrift for the communication.
> > > > >>>>>
> > > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > > >>>>>
> > > > >>>> communication
> > > > >>>>
> > > > >>>>> layer between the components. I'm involved with a project that
> > has
> > > > many
> > > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > > services
> > > > >>>>>
> > > > >>>> to
> > > > >>>>
> > > > >>>>> communicate among the components. But we find it difficult to
> use
> > > RPC
> > > > >>>>>
> > > > >>>> calls
> > > > >>>>
> > > > >>>>> and achieve stateless behaviour and thinking of replacing
> Thrift
> > > > >>>>>
> > > > >>>> services
> > > > >>>>
> > > > >>>>> with ZooKeeper based communication layer. So I think it is
> better
> > > to
> > > > >>>>> explore the possibility of removing the Thrift services between
> > the
> > > > >>>>> components and use ZooKeeper as a communication mechanism
> between
> > > the
> > > > >>>>> services. If you do this you will have to move the state to
> > > ZooKeeper
> > > > >>>>>
> > > > >>>> and
> > > > >>>>
> > > > >>>>> will automatically achieve the stateless behaviour in the
> > > components.
> > > > >>>>>
> > > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea.
> If
> > we
> > > > are
> > > > >>>>> trying to integrate something fundamentally important to
> > > architecture
> > > > >>>>> as
> > > > >>>>> how to store state, we shouldn't make it optional.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Supun..
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>  Hi Lahiru,
> > > > >>>>>>
> > > > >>>>>> As i understood,  not only reliability , you are trying to
> > achieve
> > > > >>>>>> some
> > > > >>>>>> other requirement by introducing zookeeper, like health
> > monitoring
> > > > of
> > > > >>>>>>
> > > > >>>>> the
> > > > >>>>
> > > > >>>>> services, categorization with service implementation etc ... .
> In
> > > > that
> > > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > > only
> > > > >>>>>>
> > > > >>>>> focus
> > > > >>>>
> > > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > > >>>>>>
> > > > >>>>> clustering +
> > > > >>>>
> > > > >>>>> LB ?
> > > > >>>>>>
> > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user
> need
> > > to
> > > > >>>>>> use
> > > > >>>>>> it.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>>   Shameera.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > >>>>>> glahiru@gmail.com
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>  Hi Gagan,
> > > > >>>>>>>
> > > > >>>>>>> I need to start another discussion about it, but I had an
> > offline
> > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> another
> > > > >>>>>>> thread
> > > > >>>>>>> about this topic too.
> > > > >>>>>>>
> > > > >>>>>>> Regards
> > > > >>>>>>> Lahiru
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > >>>>>>>
> > > > >>>>>> gagandeepjuneja@gmail.com
> > > > >>>>
> > > > >>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > dictionary
> > > > >>>>>>>>
> > > > >>>>>>> :).
> > > > >>>>
> > > > >>>>>  I would like to know how are we planning to start multiple
> > > servers.
> > > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> > as
> > > > auto
> > > > >>>>>>>> scalable.
> > > > >>>>>>>> 2. To make some specific number of nodes available such as
> we
> > > > want 2
> > > > >>>>>>>> servers to be available at any time so if one goes down
> then I
> > > > need
> > > > >>>>>>>>
> > > > >>>>>>> to
> > > > >>>>
> > > > >>>>>  spawn one new to make available servers count 2.
> > > > >>>>>>>> 3. Initially start all the servers.
> > > > >>>>>>>>
> > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > > believe
> > > > >>>>>>>>
> > > > >>>>>>> existing
> > > > >>>>>>>
> > > > >>>>>>>> architecture support this?
> > > > >>>>>>>>
> > > > >>>>>>>> Regards,
> > > > >>>>>>>> Gagan
> > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > glahiru@gmail.com
> > > >
> > > > >>>>>>>>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Gagan,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > >>>>>>>>>
> > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > >>>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Hi Lahiru,
> > > > >>>>>>>>>> Just my 2 cents.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > > hops
> > > > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> the
> > > > >>>>>>>
> > > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> > able
> > > to
> > > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > > because
> > > > >>>>>>>>>>
> > > > >>>>>>>>> of
> > > > >>>>
> > > > >>>>> less
> > > > >>>>>>>
> > > > >>>>>>>> knowledge of the airavata system in whole. So I would like
> to
> > > > >>>>>>>>>>
> > > > >>>>>>>>> discuss
> > > > >>>>
> > > > >>>>>  following point.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > Zookeeper
> > > > >>>>>>>>>>
> > > > >>>>>>>>> is
> > > > >>>>
> > > > >>>>> not
> > > > >>>>>>>
> > > > >>>>>>>> able to restart services. At max it can tell whether service
> > is
> > > up
> > > > >>>>>>>>>>
> > > > >>>>>>>>> or not
> > > > >>>>>>>
> > > > >>>>>>>> which could only be the case if airavata service goes down
> > > > >>>>>>>>>>
> > > > >>>>>>>>> gracefully and
> > > > >>>>>>>
> > > > >>>>>>>> we have any automated way to restart it. If this is just
> > matter
> > > of
> > > > >>>>>>>>>>
> > > > >>>>>>>>> routing
> > > > >>>>>>>
> > > > >>>>>>>> client requests to the available thrift servers then this
> can
> > be
> > > > >>>>>>>>>>
> > > > >>>>>>>>> achieved
> > > > >>>>>>>
> > > > >>>>>>>> with the help of load balancer which I guess is already
> there
> > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> thrift
> > > > >>>>>>>
> > > > >>>>>>>> wish list.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  We have multiple thrift services and currently we start
> > only
> > > > one
> > > > >>>>>>>>>
> > > > >>>>>>>> instance
> > > > >>>>>>>
> > > > >>>>>>>> of them and each thrift service is a stateless service. To
> > keep
> > > > the
> > > > >>>>>>>>>
> > > > >>>>>>>> high
> > > > >>>>>>>
> > > > >>>>>>>> availability we have to start multiple instances of them in
> > > > >>>>>>>>>
> > > > >>>>>>>> production
> > > > >>>>
> > > > >>>>>  scenario. So for clients to get an available thrift service we
> > can
> > > > >>>>>>>>>
> > > > >>>>>>>> use
> > > > >>>>
> > > > >>>>>  zookeeper znodes to represent each available service. There
> are
> > > > >>>>>>>>>
> > > > >>>>>>>> some
> > > > >>>>
> > > > >>>>>  libraries which is doing similar[1] and I think we can use
> them
> > > > >>>>>>>>>
> > > > >>>>>>>> directly.
> > > > >>>>>>>
> > > > >>>>>>>> 2. As far as registering of different providers is concerned
> > do
> > > > >>>>>>>>>>
> > > > >>>>>>>>> you
> > > > >>>>
> > > > >>>>>  think for that we really need external store.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> > we
> > > > have
> > > > >>>>>>>>>
> > > > >>>>>>>> to
> > > > >>>>
> > > > >>>>> do
> > > > >>>>>>>
> > > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > > >>>>>>>>>
> > > > >>>>>>>> Airavata
> > > > >>>>
> > > > >>>>>  because zookeeper handle all the complexity.
> > > > >>>>>>>>>
> > > > >>>>>>>>>  I have seen people using zookeeper more for state
> management
> > > in
> > > > >>>>>>>>>> distributed environments.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > > because
> > > > >>>>>>>>>
> > > > >>>>>>>> all
> > > > >>>>
> > > > >>>>> of
> > > > >>>>>>>
> > > > >>>>>>>> our services are stateless services, but my point is to
> > achieve
> > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > > >>>>>>>>>
> > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > zookeeper
> > > > in
> > > > >>>>>>>>>> airavata to make system reliable.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Regards,
> > > > >>>>>>>>>> Gagan
> > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <marpierc@iu.edu
> >
> > > > wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> Architecture
> > > > list
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> for
> > > > >>>>
> > > > >>>>>  additional comments.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Marlon
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi All,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how
> to
> > > use
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> it
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> Zookeeper
> > > > >>>>
> > > > >>>>> is a
> > > > >>>>>>>
> > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> communication
> > > > >>>>
> > > > >>>>>  between
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > > system
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> which
> > > > >>>>>>>
> > > > >>>>>>>>  has
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > > amount
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> of
> > > > >>>>
> > > > >>>>> data
> > > > >>>>>>>
> > > > >>>>>>>>  associated with it and these nodes are called znodes.
> Clients
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> can
> > > > >>>>
> > > > >>>>>  connect
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > > znodes.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> these
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  go
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > zookeeper
> > > > to
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> store
> > > > >>>>>>>
> > > > >>>>>>>>  these
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > achieve
> > > a
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> very
> > > > >>>>
> > > > >>>>>  reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> discover
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> available
> > > > >>>>>>>
> > > > >>>>>>>>  service
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> > the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> generated
> > > > >>>>>>>
> > > > >>>>>>>>  thrift client code but we have to change the locations we
> are
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> invoking
> > > > >>>>>>>
> > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > service
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> goes
> > > > >>>>>>>
> > > > >>>>>>>>  down
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > operations.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> With
> > > > >>>>>>>
> > > > >>>>>>>>  this
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > > airavata,
> > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> services
> > > > >>>>
> > > > >>>>> for
> > > > >>>>>>>
> > > > >>>>>>>>  each
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> provider implementation. This can be achieved by using
> the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> hierarchical
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > gfac-thrift
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> service
> > > > >>>>>>>
> > > > >>>>>>>>  to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> orchestrator
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  discover the provider specific gfac thrift service and
> route
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> the
> > > > >>>>
> > > > >>>>>  message to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> the correct thrift service.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > client
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> code
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  thrift
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> services and clients and zookeeper server installation
> can
> > > be
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> done as
> > > > >>>>>>>
> > > > >>>>>>>>  a
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > Zookeeper
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> server
> > > > >>>>>>>
> > > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> > server
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> little
> > > > >>>>>>>
> > > > >>>>>>>>  complex in production scenario. I think we have to make
> sure
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> everything
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> enable.zookeeper=false
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > > start
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> zookeeper.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>> Lahiru
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> System Analyst Programmer
> > > > >>>>>>>>> PTI Lab
> > > > >>>>>>>>> Indiana University
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>> System Analyst Programmer
> > > > >>>>>>> PTI Lab
> > > > >>>>>>> Indiana University
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>> Best Regards,
> > > > >>>>>> Shameera Rathnayaka.
> > > > >>>>>>
> > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> Supun Kamburugamuva
> > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>>>> Blog: http://supunk.blogspot.com
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>> --
> > > > >>>> System Analyst Programmer
> > > > >>>> PTI Lab
> > > > >>>> Indiana University
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>> --
> > > > >>> Supun Kamburugamuva
> > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>> Blog: http://supunk.blogspot.com
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > > > --
> > > > Supun Kamburugamuva
> > > > Member, Apache Software Foundation; http://www.apache.org
> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > Blog: http://supunk.blogspot.com
> > > >
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Jijoe Vurghese <ji...@gmail.com>.

Same for timeouts (i.e. processing takes too long once work item enters a Storm topology)…



—
Jijoe



> On Jun 17, 2014, at 1:06 PM, Eran Chinthaka Withana <er...@gmail.com> wrote:
> 
> 
> Storm has a Kafka spout which manages the cursor location (pointer to the
> head of the queue representing the next message to be processed) inside ZK.
> Each storm spout instance uses this information to get the next item to
> process. Storm kafka spout won't advance to the next message until it gets
> an ack from the storm topology.
> 
> 
> So, if there is an exception in the topology and ack is sent only by the
> last bolt, then storm bolt make sure all messages are processed since
> exceptions won't generate acks.
> 
> 
> Thanks,
> Eran Chinthaka Withana
> 
> 
> 
> 
> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
> 
> 
>> Hi Eran,
>> 
>> 
>> I think I should take back my last email. When I carefully look at storm I
>> have following question.
>> 
>> 
>> How are we going to store the Job statuses and relaunch the jobs which was
>> running in failure nodes ? Its true that storm is starting new workers but
>> there should be a way to find missing jobs by someone in the system. Since
>> we are not having a data stream there is no use to start new workers unless
>> we handler the missing jobs. I think we need to have a better control of
>> our component and persist the states of jobs each GFAC node is handling.
>> Directly using zookeeper will let us to do a proper fault tolerance
>> implementation.
>> 
>> 
>> Regards
>> Lahiru
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>> 
>> 
>>> Hi Supun,
>>> 
>>> 
>>> I think in this usecase we only use storm topology to do the
>>> 
>>> 
>>> communication
>> 
>> 
>>> among workers and we are completely ignoring the stream processing part.
>>> Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in
>>> 
>>> 
>>> the
>> 
>> 
>>> storm topology. But I think we can achieve extremely fault tolerance
>>> 
>>> 
>>> system
>> 
>> 
>>> by directly using storm based on following statement in storm site with
>>> minimum changes in airavata.
>>> 
>>> 
>>> Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and
>>> stateless; all state is kept in Zookeeper or on local disk. This means
>>> 
>>> 
>>> you
>> 
>> 
>>> can kill -9 Nimbus or the Supervisors and they’ll start back up like
>>> nothing happened. This design leads to Storm clusters being incredibly
>>> stable.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <su...@gmail.com>
>>> wrote:
>>> 
>>> 
>>>> Hi Eran,
>>>> 
>>>> 
>>>> I'm using Storm every day and this is one of the strangest things I've
>>>> heard about using Storm. My be there are more use cases for Storm other
>>>> than Distributed Stream processing. AFAIK the Bolts, spouts are built to
>>>> handle a stream of events that doesn't take much time to process. In
>>>> Airavata we don't process the messages. Instead we run experiments based
>>>> on
>>>> the commands given.
>>>> 
>>>> 
>>>> If you want process isolation, distributed execution, cluster resource
>>>> management Yarn would be a better thing to explore.
>>>> 
>>>> 
>>>> Thanks,
>>>> Supun..
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
>>>> eran.chinthaka@gmail.com> wrote:
>>>> 
>>>> 
>>>>> Hi Lahiru,
>>>>> 
>>>>> 
>>>>> good summarization. Thanks Lahiru.
>>>>> 
>>>>> 
>>>>> I think you are trying to stick to a model where Orchestrator
>>>>> 
>>>>> 
>>>>> distributing
>>>> 
>>>> 
>>>>> to work for GFac worker and trying to do the impedance mismatch
>>>>> 
>>>>> 
>>>>> through
>> 
>> 
>>>> a
>>>> 
>>>> 
>>>>> messaging solution. If you step back and think, we don't even want the
>>>>> orchestrator to handle everything. From its point of view, it should
>>>>> 
>>>>> 
>>>>> submit
>>>> 
>>>> 
>>>>> jobs to the framework, and will wait or get notified once the job is
>>>>> 
>>>>> 
>>>>> done.
>>>> 
>>>> 
>>>>> 
>>>>> 
>>>>> There are multiple ways of doing this. And here is one method.
>>>>> 
>>>>> 
>>>>> Orchestrator submits all its jobs to Job queue (implemented using any
>>>>> 
>>>>> 
>>>>> MQ
>> 
>> 
>>>>> impl like Rabbit or Kafka). A storm topology is implemented to dequeue
>>>>> messages, process them (i.e. submit those jobs and get those executed)
>>>>> 
>>>>> 
>>>>> and
>>>> 
>>>> 
>>>>> notify the Orchestrator with the status (either through another
>>>>> JobCompletionQueue or direct invocation).
>>>>> 
>>>>> 
>>>>> With this approach, the MQ provider will help to match impedance
>>>>> 
>>>>> 
>>>>> between
>> 
>> 
>>>>> job submission and consumption. Storm helps with worker coordination,
>>>>> 
>>>>> 
>>>>> load
>>>> 
>>>> 
>>>>> balancing, throttling on your job execution framework, worker pool
>>>>> management and fault tolerance.
>>>>> 
>>>>> 
>>>>> Of course, you can implement this based only on ZK and handle
>>>>> 
>>>>> 
>>>>> everything
>> 
>> 
>>>>> else on your own but storm had done exactly that with the use of ZK
>>>>> underneath.
>>>>> 
>>>>> 
>>>>> Finally, if you go for a model like this, then even beyond job
>>>>> 
>>>>> 
>>>>> submission,
>>>> 
>>>> 
>>>>> you can use the same model to do anything within the framework for
>>>>> 
>>>>> 
>>>>> internal
>>>> 
>>>> 
>>>>> communication. For example, the workflow engine will submit its jobs
>>>>> 
>>>>> 
>>>>> to
>> 
>> 
>>>>> queues based on what it has to do. Storm topologies exists for each
>>>>> 
>>>>> 
>>>>> queues
>>>> 
>>>> 
>>>>> to dequeue messages and carry out the work in a reliable manner.
>>>>> 
>>>>> 
>>>>> Consider
>>>> 
>>>> 
>>>>> this as mini-workflows within a larger workflow framework.
>>>>> 
>>>>> 
>>>>> We can have a voice chat if its more convenient. But not at 7am PST :)
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Eran Chinthaka Withana
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
>>>>> 
>>>>> 
>>>>> glahiru@gmail.com
>> 
>> 
>>>>> 
>>>>> 
>>>>> wrote:
>>>>> 
>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> 
>>>>>> Ignoring the tool that we are going to use to implement fault
>>>>>> 
>>>>>> 
>>>>>> tolerance I
>>>> 
>>>> 
>>>>>> have summarized the model we have decided so far. I will use the
>>>>>> 
>>>>>> 
>>>>>> tool
>> 
>> 
>>>>> name
>>>>> 
>>>>> 
>>>>>> as X, we can use Zookeeper or some other implementation. Following
>>>>>> 
>>>>>> 
>>>>>> design
>>>> 
>>>> 
>>>>>> assume tool X and Registry have high availability.
>>>>>> 
>>>>>> 
>>>>>> 1. Orchestrator and GFAC worker node communication is going to be
>>>>>> 
>>>>>> 
>>>>>> queue
>>>> 
>>>> 
>>>>>> based and tool X is going to be used for this communication. (We
>>>>>> 
>>>>>> 
>>>>>> have
>> 
>> 
>>>> to
>>>> 
>>>> 
>>>>>> implement this with considering race condition between different
>>>>>> 
>>>>>> 
>>>>>> gfac
>> 
>> 
>>>>>> workers).
>>>>>> 2. We are having multiple instances of GFAC which are identical (In
>>>>>> 
>>>>>> 
>>>>>> future
>>>>> 
>>>>> 
>>>>>> we can group gfac workers). Existence of each worker node is
>>>>>> 
>>>>>> 
>>>>>> identified
>>>> 
>>>> 
>>>>>> using X. If node goes down orchestrator will be notified by X.
>>>>>> 3. When a particular request comes and accepted by one gfac worker
>>>>>> 
>>>>>> 
>>>>>> that
>>>> 
>>>> 
>>>>>> information will be replicated in tool X and a place where this
>>>>>> 
>>>>>> 
>>>>>> information
>>>>> 
>>>>> 
>>>>>> is persisted even the worker failed.
>>>>>> 4. When a job comes to a final state like failed or cancelled or
>>>>>> 
>>>>>> 
>>>>>> completed
>>>>> 
>>>>> 
>>>>>> above information will be removed. So at a given time orchestrator
>>>>>> 
>>>>>> 
>>>>>> can
>> 
>> 
>>>>> poll
>>>>> 
>>>>> 
>>>>>> active jobs in each worker by giving a worker ID.
>>>>>> 5. Tool X will make sure that when a worker goes down it will notify
>>>>>> orchestrator. During a worker failure, based on step 3 and 4
>>>>>> 
>>>>>> 
>>>>>> orchestrator
>>>> 
>>>> 
>>>>>> can poll all the active jobs of that worker and do the same thing
>>>>>> 
>>>>>> 
>>>>>> like in
>>>> 
>>>> 
>>>>>> step 1 (store the experiment ID to the queue) and gfac worker will
>>>>>> 
>>>>>> 
>>>>>> pick
>>>> 
>>>> 
>>>>> the
>>>>> 
>>>>> 
>>>>>> jobs.
>>>>>> 
>>>>>> 
>>>>>> 6. When GFAC receive a job like in step 5 it have to carefully
>>>>>> 
>>>>>> 
>>>>>> evaluate
>>>> 
>>>> 
>>>>> the
>>>>> 
>>>>> 
>>>>>> state from registry and decide what to be done (If the job is
>>>>>> 
>>>>>> 
>>>>>> pending
>> 
>> 
>>>>> then
>>>>> 
>>>>> 
>>>>>> gfac just have to monitor, if job state is like input transferred
>>>>>> 
>>>>>> 
>>>>>> not
>> 
>> 
>>>>> even
>>>>> 
>>>>> 
>>>>>> submitted gfac has to execute rest of the chain and submit the job
>>>>>> 
>>>>>> 
>>>>>> to
>> 
>> 
>>>> the
>>>> 
>>>> 
>>>>>> resource and start monitoring).
>>>>>> 
>>>>>> 
>>>>>> If we can find a tool X which supports all these features and tool
>>>>>> 
>>>>>> 
>>>>>> itself
>>>> 
>>>> 
>>>>>> is fault tolerance and support atomicity, high availability and
>>>>>> 
>>>>>> 
>>>>>> simply
>> 
>> 
>>>>> API
>>>>> 
>>>>> 
>>>>>> to implement we can use that tool.
>>>>>> 
>>>>>> 
>>>>>> WDYT ?
>>>>>> 
>>>>>> 
>>>>>> Lahiru
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
>>>>>> 
>>>>>> 
>>>>>> supun06@gmail.com>
>>>> 
>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>> 
>>>>>>> Hi Lahiru,
>>>>>>> 
>>>>>>> 
>>>>>>> Before moving with an implementation it may be worth to consider
>>>>>>> 
>>>>>>> 
>>>>>>> some
>>>> 
>>>> 
>>>>> of
>>>>> 
>>>>> 
>>>>>>> the following aspects as well.
>>>>>>> 
>>>>>>> 
>>>>>>> 1. How to report the progress of an experiment as state in
>>>>>>> 
>>>>>>> 
>>>>>>> ZooKeeper?
>>>> 
>>>> 
>>>>>> What
>>>>>> 
>>>>>> 
>>>>>>> happens if a GFac instance crashes while executing an experiment?
>>>>>>> 
>>>>>>> 
>>>>>>> Are
>>>> 
>>>> 
>>>>>> there
>>>>>> 
>>>>>> 
>>>>>>> check-points we can save so that another GFac instance can take
>>>>>>> 
>>>>>>> 
>>>>>>> over?
>>>> 
>>>> 
>>>>>>> 2. What is the threading model of GFac instances? (I consider this
>>>>>>> 
>>>>>>> 
>>>>>>> as a
>>>> 
>>>> 
>>>>>>> very important aspect)
>>>>>>> 3. What are the information needed to be stored in the ZooKeeper?
>>>>>>> 
>>>>>>> 
>>>>>>> You
>>>> 
>>>> 
>>>>> may
>>>>> 
>>>>> 
>>>>>>> need to store other information about an experiment apart from its
>>>>>>> experiment ID.
>>>>>>> 4. How to report errors?
>>>>>>> 5. For GFac weather you need a threading model or worker process
>>>>>>> 
>>>>>>> 
>>>>>>> model?
>>>> 
>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Supun..
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
>>>>>>> 
>>>>>>> 
>>>>>>> glahiru@gmail.com
>>>> 
>>>> 
>>>>>> 
>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> Hi All,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I think the conclusion is like this,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 1, We make the gfac as a worker not a thrift service and we can
>>>>>>>> 
>>>>>>>> 
>>>>>>>> start
>>>> 
>>>> 
>>>>>>>> multiple workers either with bunch of providers and handlers
>>>>>>>> 
>>>>>>>> 
>>>>>>>> configured
>>>>> 
>>>>> 
>>>>>>> in
>>>>>>> 
>>>>>>> 
>>>>>>>> each worker or provider specific workers to handle the class
>>>>>>>> 
>>>>>>>> 
>>>>>>>> path
>> 
>> 
>>>>>> issues
>>>>>> 
>>>>>> 
>>>>>>>> (not the common scenario).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2. Gfac workers can be configured to watch for a given path in
>>>>>>>> 
>>>>>>>> 
>>>>>>>> zookeeper,
>>>>>> 
>>>>>> 
>>>>>>>> and multiple workers can listen to the same path. Default path
>>>>>>>> 
>>>>>>>> 
>>>>>>>> can be
>>>> 
>>>> 
>>>>>>>> /airavata/gfac or can configure paths like /airavata/gfac/gsissh
>>>>>>>> /airavata/gfac/bes.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 3. Orchestrator can configure with a logic to store experiment
>>>>>>>> 
>>>>>>>> 
>>>>>>>> IDs in
>>>> 
>>>> 
>>>>>>>> zookeeper with a path, and orchestrator can be configured to
>>>>>>>> 
>>>>>>>> 
>>>>>>>> provider
>>>> 
>>>> 
>>>>>>>> specific path logic too. So when a new request come orchestrator
>>>>>>>> 
>>>>>>>> 
>>>>>>>> store
>>>>> 
>>>>> 
>>>>>>> the
>>>>>>> 
>>>>>>> 
>>>>>>>> experimentID and these experiments IDs are stored in Zk as a
>>>>>>>> 
>>>>>>>> 
>>>>>>>> queue.
>>>> 
>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 4. Since gfac workers are watching they will be notified and as
>>>>>>>> 
>>>>>>>> 
>>>>>>>> supun
>>>> 
>>>> 
>>>>>>>> suggested can use a leader selection algorithm[1] and one gfac
>>>>>>>> 
>>>>>>>> 
>>>>>>>> worker
>>>> 
>>>> 
>>>>>>> will
>>>>>>> 
>>>>>>> 
>>>>>>>> take the leadership for each experiment. If there are gfac
>>>>>>>> 
>>>>>>>> 
>>>>>>>> instances
>>>> 
>>>> 
>>>>>> for
>>>>>> 
>>>>>> 
>>>>>>>> each provider same logic will apply among those nodes with same
>>>>>>>> 
>>>>>>>> 
>>>>>>>> provider
>>>>>> 
>>>>>> 
>>>>>>>> type.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> 
>>>>>>>> 
>>>>>>>> http://curator.apache.org/curator-recipes/leader-election.html
>> 
>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I would like to implement this if there are no objections.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Lahiru
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
>>>>>>>> 
>>>>>>>> 
>>>>>>>> supun06@gmail.com
>>>>>> 
>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> Hi Marlon,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I think you are exactly correct.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Supun..
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> marpierc@iu.edu>
>>>> 
>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Let me restate this, and please tell me if I'm wrong.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Orchestrator decides (somehow) that a particular job
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> requires
>> 
>> 
>>>>>>> JSDL/BES,
>>>>>>> 
>>>>>>> 
>>>>>>>>> so
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> it places the Experiment ID in Zookeeper's
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> /airavata/gfac/jsdl-bes
>>>>> 
>>>>> 
>>>>>>>> node.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> GFAC servers associated with this instance notice the
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> update.
>> 
>> 
>>>>> The
>>>>> 
>>>>> 
>>>>>>>> first
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> GFAC to claim the job gets it, uses the Experiment ID to get
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> the
>>>> 
>>>> 
>>>>>>>> detailed
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> information it needs from the Registry. ZooKeeper handles
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> the
>> 
>> 
>>>>>>> locking,
>>>>>>> 
>>>>>>> 
>>>>>>>>> etc
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> to make sure that only one GFAC at a time is trying to
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> handle
>> 
>> 
>>>> an
>>>> 
>>>> 
>>>>>>>>> experiment.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Marlon
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> Hi Supun,
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for the clarification.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Regards
>>>>>>>>>>> Lahiru
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> My suggestion is that may be you don't need a Thrift
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> service
>> 
>> 
>>>>>>> between
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>> Orchestrator and the component executing the experiment.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> When a
>>>> 
>>>> 
>>>>>> new
>>>>>> 
>>>>>> 
>>>>>>>>>>>> experiment is submitted, orchestrator decides who can
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> execute
>>>> 
>>>> 
>>>>>> this
>>>>>> 
>>>>>> 
>>>>>>>> job.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>> Then it put the information about this experiment
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> execution
>> 
>> 
>>>> in
>>>> 
>>>> 
>>>>>>>>> ZooKeeper.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>> The component which wants to executes the experiment is
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> listening
>>>>> 
>>>>> 
>>>>>>> to
>>>>>>> 
>>>>>>> 
>>>>>>>>> this
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>> ZooKeeper path and when it sees the experiment it will
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> execute
>>>> 
>>>> 
>>>>>> it.
>>>>>> 
>>>>>> 
>>>>>>> So
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>> that
>>>>>>>>>>>> the communication happens through an state change in
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ZooKeeper.
>>>> 
>>>> 
>>>>>>> This
>>>>>>> 
>>>>>>> 
>>>>>>>>> can
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>> potentially simply your architecture.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Supun.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> glahiru@gmail.com>
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> So your suggestion is to create a znode for each thrift
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> service
>>>>> 
>>>>> 
>>>>>> we
>>>>>> 
>>>>>> 
>>>>>>>>> have
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>> and
>>>>>>>>>>>>> when the request comes that node gets modified with input
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> data
>>>> 
>>>> 
>>>>>> for
>>>>>> 
>>>>>> 
>>>>>>>>> that
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>> request and thrift service is having a watch for that
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> node
>> 
>> 
>>>> and
>>>> 
>>>> 
>>>>>> it
>>>>>> 
>>>>>> 
>>>>>>>> will
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>> be
>>>>>>>>>>>>> notified because of the watch and it can read the input
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> from
>>>> 
>>>> 
>>>>>>>> zookeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>> and
>>>>>>>>>>>>> invoke the operation?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
>>>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Here is what I think about Airavata and ZooKeeper. In
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Airavata
>>>>> 
>>>>> 
>>>>>>>> there
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>> many components and these components must be stateless
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> to
>> 
>> 
>>>>>> achieve
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>> scalability and reliability.Also there must be a
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> mechanism to
>>>> 
>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> communicate
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> between the components. At the moment Airavata uses RPC
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> calls
>>>> 
>>>> 
>>>>>>> based
>>>>>>> 
>>>>>>> 
>>>>>>>>> on
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>> Thrift for the communication.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ZooKeeper can be used both as a place to hold state and
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> as a
>>>> 
>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> communication
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> layer between the components. I'm involved with a
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> project
>> 
>> 
>>>>> that
>>>>> 
>>>>> 
>>>>>>> has
>>>>>>> 
>>>>>>> 
>>>>>>>>> many
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>> distributed components like AIravata. Right now we use
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thrift
>>>> 
>>>> 
>>>>>>>>> services
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> to
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> communicate among the components. But we find it
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> difficult to
>>>> 
>>>> 
>>>>>> use
>>>>>> 
>>>>>> 
>>>>>>>> RPC
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> calls
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> and achieve stateless behaviour and thinking of
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> replacing
>> 
>> 
>>>>>> Thrift
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> services
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> with ZooKeeper based communication layer. So I think it
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> is
>> 
>> 
>>>>>> better
>>>>>> 
>>>>>> 
>>>>>>>> to
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> explore the possibility of removing the Thrift services
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> between
>>>>> 
>>>>> 
>>>>>>> the
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>> components and use ZooKeeper as a communication
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> mechanism
>> 
>> 
>>>>>> between
>>>>>> 
>>>>>> 
>>>>>>>> the
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> services. If you do this you will have to move the state
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> to
>>>> 
>>>> 
>>>>>>>> ZooKeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> will automatically achieve the stateless behaviour in
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> the
>> 
>> 
>>>>>>>> components.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Also I think trying to make ZooKeeper optional is a bad
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> idea.
>>>> 
>>>> 
>>>>>> If
>>>>>> 
>>>>>> 
>>>>>>> we
>>>>>>> 
>>>>>>> 
>>>>>>>>> are
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>> trying to integrate something fundamentally important to
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> architecture
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>> as
>>>>>>>>>>>>>> how to store state, we shouldn't make it optional.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Supun..
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>>>>>>>>>>>>>> shameerainfo@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> As i understood, not only reliability , you are trying
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> to
>>>> 
>>>> 
>>>>>>> achieve
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>> other requirement by introducing zookeeper, like health
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> monitoring
>>>>>>> 
>>>>>>> 
>>>>>>>>> of
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> services, categorization with service implementation etc
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ...
>>>> 
>>>> 
>>>>> .
>>>>> 
>>>>> 
>>>>>> In
>>>>>> 
>>>>>> 
>>>>>>>>> that
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>> case, i think we can get use of zookeeper's features
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> but
>> 
>> 
>>>> if
>>>> 
>>>> 
>>>>> we
>>>>> 
>>>>> 
>>>>>>>> only
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> focus
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> on reliability, i have little bit of concern, why can't
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> we
>> 
>> 
>>>>> use
>>>>> 
>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> clustering +
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> LB ?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Yes it is better we add Zookeeper as a prerequisite if
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> user
>>>> 
>>>> 
>>>>>> need
>>>>>> 
>>>>>> 
>>>>>>>> to
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Shameera.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
>>>>>>>>>>>>>>> glahiru@gmail.com
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Gagan,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I need to start another discussion about it, but I had
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> an
>>>> 
>>>> 
>>>>>>> offline
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>> discussion with Suresh about auto-scaling. I will
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> start
>> 
>> 
>>>>>> another
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>> about this topic too.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks Lahiru for pointing to nice library, added to
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> my
>> 
>> 
>>>>>>>> dictionary
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> :).
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I would like to know how are we planning to start
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> multiple
>>>> 
>>>> 
>>>>>>>> servers.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>> 1. Spawning new servers based on load? Some times we
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> call
>>>> 
>>>> 
>>>>> it
>>>>> 
>>>>> 
>>>>>>> as
>>>>>>> 
>>>>>>> 
>>>>>>>>> auto
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>> scalable.
>>>>>>>>>>>>>>>>> 2. To make some specific number of nodes available
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> such as
>>>> 
>>>> 
>>>>>> we
>>>>>> 
>>>>>> 
>>>>>>>>> want 2
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>> servers to be available at any time so if one goes
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> down
>> 
>> 
>>>>>> then I
>>>>>> 
>>>>>> 
>>>>>>>>> need
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> spawn one new to make available servers count 2.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 3. Initially start all the servers.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> In scenario 1 and 2 zookeeper does make sense but I
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> don't
>>>> 
>>>> 
>>>>>>>> believe
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> architecture support this?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> glahiru@gmail.com
>>>>>>> 
>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Gagan,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for your response. Please see my inline
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> comments.
>>>> 
>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com>
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Just my 2 cents.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I am big fan of zookeeper but also against adding
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> multiple
>>>>> 
>>>>> 
>>>>>>>> hops
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> in
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> system which can add unnecessary complexity. Here I
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> am
>> 
>> 
>>>> not
>>>> 
>>>> 
>>>>>>> able
>>>>>>> 
>>>>>>> 
>>>>>>>> to
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>> understand the requirement of zookeeper may be I am
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> wrong
>>>>> 
>>>>> 
>>>>>>>>> because
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> less
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> knowledge of the airavata system in whole. So I would
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> like
>>>> 
>>>> 
>>>>>> to
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> discuss
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> following point.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 1. How it will help us in making system more
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> reliable.
>>>> 
>>>> 
>>>>>>>> Zookeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> able to restart services. At max it can tell whether
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> service
>>>>> 
>>>>> 
>>>>>>> is
>>>>>>> 
>>>>>>> 
>>>>>>>> up
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> which could only be the case if airavata service goes
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> down
>>>> 
>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> gracefully and
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> we have any automated way to restart it. If this is
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> just
>>>> 
>>>> 
>>>>>>> matter
>>>>>>> 
>>>>>>> 
>>>>>>>> of
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> routing
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> client requests to the available thrift servers then
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> this
>>>> 
>>>> 
>>>>>> can
>>>>>> 
>>>>>> 
>>>>>>> be
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> achieved
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> with the help of load balancer which I guess is
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> already
>> 
>> 
>>>>>> there
>>>>>> 
>>>>>> 
>>>>>>> in
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> thrift
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> wish list.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> We have multiple thrift services and currently we
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> start
>>>> 
>>>> 
>>>>>>> only
>>>>>>> 
>>>>>>> 
>>>>>>>>> one
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> of them and each thrift service is a stateless
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> service. To
>>>> 
>>>> 
>>>>>>> keep
>>>>>>> 
>>>>>>> 
>>>>>>>>> the
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> high
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> availability we have to start multiple instances of
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> them
>>>> 
>>>> 
>>>>> in
>>>>> 
>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> production
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> scenario. So for clients to get an available thrift
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> service
>>>> 
>>>> 
>>>>> we
>>>>> 
>>>>> 
>>>>>>> can
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> zookeeper znodes to represent each available service.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> There
>>>> 
>>>> 
>>>>>> are
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> libraries which is doing similar[1] and I think we can
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> use
>>>> 
>>>> 
>>>>>> them
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> directly.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 2. As far as registering of different providers is
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> concerned
>>>>> 
>>>>> 
>>>>>>> do
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> think for that we really need external store.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Yes I think so, because its light weight and
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> reliable
>>>> 
>>>> 
>>>>> and
>>>>> 
>>>>> 
>>>>>>> we
>>>>>>> 
>>>>>>> 
>>>>>>>>> have
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> do
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> very minimal amount of work to achieve all these
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> features
>>>> 
>>>> 
>>>>> to
>>>>> 
>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Airavata
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> because zookeeper handle all the complexity.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I have seen people using zookeeper more for state
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> management
>>>>>> 
>>>>>> 
>>>>>>>> in
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>> distributed environments.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> +1, we might not be the most effective users of
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> zookeeper
>>>>> 
>>>>> 
>>>>>>>>> because
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> our services are stateless services, but my point is
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> to
>> 
>> 
>>>>>>> achieve
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>> fault-tolerance we can use zookeeper and with
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> minimal
>> 
>> 
>>>>> work.
>>>>> 
>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I would like to understand more how can we
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> leverage
>> 
>> 
>>>>>>>> zookeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> in
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> airavata to make system reliable.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [1]https://github.com/eirslett/thrift-zookeeper
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> marpierc@iu.edu
>>>>> 
>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks for the summary, Lahiru. I'm cc'ing the
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Architecture
>>>>>> 
>>>>>> 
>>>>>>>>> list
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> additional comments.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Marlon
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I did little research about Apache Zookeeper[1]
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> and
>> 
>> 
>>>>> how
>>>>> 
>>>>> 
>>>>>> to
>>>>>> 
>>>>>> 
>>>>>>>> use
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> airavata. Its really a nice way to achieve fault
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> tolerance
>>>>> 
>>>>> 
>>>>>>> and
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> reliable
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> communication between our thrift services and
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> clients.
>>>> 
>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> distributed, fault tolerant system to do a reliable
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> communication
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> between
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> distributed applications. This is like an
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> in-memory
>> 
>> 
>>>>> file
>>>>> 
>>>>> 
>>>>>>>>> system
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> nodes in a tree structure and each node can have
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> small
>>>> 
>>>> 
>>>>>>>> amount
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> data
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> associated with it and these nodes are called
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> znodes.
>> 
>> 
>>>>>> Clients
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> connect
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> to a zookeeper server and add/delete and update
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> these
>>>> 
>>>> 
>>>>>>>> znodes.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> In Apache Airavata we start multiple thrift
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> services
>>>>> 
>>>>> 
>>>>>>> and
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> can
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> go
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> down for maintenance or these can crash, if we
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> use
>> 
>> 
>>>>>>> zookeeper
>>>>>>> 
>>>>>>> 
>>>>>>>>> to
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> configuration(thrift service configurations) we
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> can
>> 
>> 
>>>>>>> achieve
>>>>>>> 
>>>>>>> 
>>>>>>>> a
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> reliable
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> system. Basically thrift clients can dynamically
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> discover
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> available
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> change
>>>>> 
>>>>> 
>>>>>>> the
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> generated
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> thrift client code but we have to change the
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> locations we
>>>> 
>>>> 
>>>>>> are
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> invoking
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> them). ephemeral znodes will be removed when the
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> thrift
>>>> 
>>>> 
>>>>>>> service
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> goes
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> down
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> and zookeeper guarantee the atomicity between
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> these
>> 
>> 
>>>>>>>>> operations.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> With
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> approach we can have a node hierarchy for
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> multiple
>> 
>> 
>>>> of
>>>> 
>>>> 
>>>>>>>>> airavata,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> For specifically for gfac we can have different
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> types
>>>> 
>>>> 
>>>>> of
>>>>> 
>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> services
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> provider implementation. This can be achieved by
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> using
>>>> 
>>>> 
>>>>>> the
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> hierarchical
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> support in zookeeper and providing some logic in
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> gfac-thrift
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> register it to a defined path. Using the same
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> logic
>> 
>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> orchestrator
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> can
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> discover the provider specific gfac thrift service
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> and
>> 
>> 
>>>>>> route
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> message to
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> the correct thrift service.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> With this approach I think we simply have write
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> some
>>>> 
>>>> 
>>>>>>> client
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> thrift
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> services and clients and zookeeper server
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> installation
>>>> 
>>>> 
>>>>>> can
>>>>>> 
>>>>>> 
>>>>>>>> be
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> done as
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> separate process and it will be easier to keep
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> the
>> 
>> 
>>>>>>> Zookeeper
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> server
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> separate from Airavata because installation of
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Zookeeper
>>>> 
>>>> 
>>>>>>> server
>>>>>>> 
>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> complex in production scenario. I think we have to
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> make
>>>> 
>>>> 
>>>>>> sure
>>>>>> 
>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> everything
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> works fine when there is no Zookeeper running,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> ex:
>> 
>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> enable.zookeeper=false
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> should works fine and users doesn't have to
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> download
>>>> 
>>>> 
>>>>> and
>>>>> 
>>>>> 
>>>>>>>> start
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> zookeeper.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>> Shameera Rathnayaka.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> email: shameera AT apache.org , shameerainfo AT
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> gmail.com
>>>> 
>>>> 
>>>>>>>>>>>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>>>> Member, Apache Software Foundation;
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> http://www.apache.org
>> 
>> 
>>>>>>>>>>>>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>>>>>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Supun Kamburugamuva
>>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> System Analyst Programmer
>>>>>>>> PTI Lab
>>>>>>>> Indiana University
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Supun Kamburugamuva
>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> System Analyst Programmer
>>>>>> PTI Lab
>>>>>> Indiana University
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Supun Kamburugamuva
>>>> Member, Apache Software Foundation; http://www.apache.org
>>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>>> Blog: http://supunk.blogspot.com
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>> 
>>

Re: Zookeeper in Airavata to achieve reliability

Posted by Suresh Marru <sm...@apache.org>.

Lahiru,

Awesome, glad to see the comprehensive thought in these use cases. Once we nail these kind of scenarios with an approach like ZK, then it will be a good starting point to explore other options. 

I really like the idea of framework level and provider level checkpointing and also having Recoverable handlers and provider implementations.  

Suresh

 
On Jun 25, 2014, at 2:09 PM, Lahiru Gunathilake <gl...@gmail.com> wrote:

> Hi All,
> 
> I have finished the initial version of the ZK integration. Now we can start
> multiple thrift gfac services (still the communication between orchestrator
> and gfac is RPC) and orchestrator submit jobs to multiple gfac nodes.
> 
> I can kill a gfac node and orchestrator will make sure jobs are not lost,
> it simply take those jobs and re-submit to gfac. Since GFac is a generic
> framework and we have multiple plugins developed for that framework
> checkpointing the plugin is up to the plugin developers but gfac
> checkpoints whether those plugins invoked or not.
> 
> I have introduced a new interface for plugin development called Recoverable
> (RecoverableHandlers and RecoverableProvider). So state-full plugins has to
> implement their recover method and gfac framework will make sure it will be
> invoked during a re-run scenario. If a plugin is not recoverable and
> already ran(can be found using framework checkpointing) during the re-run
> that plugin will not be invoked. For now I just implemented recoverability
> to few plugins and I have tested submitting a job to trestles and let it
> submit and come to monitoring state and kill that gfac instance. Now
> Orchestrator pick that execution and re-submit to another gfac node and
> that gfac node does not re-run that job to the computing resource, but
> simply start monitoring once the job is done outputs are downloaded from
> the original output location.
> 
> When a particular experiment is finished all the ZK data is removed.
> 
> At this point following things needs to be done,
> 
> 1. Figure out all the state-full handlers/providers and implement
> recoverability,
> 
> Ex: Input handler is transfering 1000 files and with 500 files gfac
> instance crashed, during the re-run it should be able to tranfer from 501
> file. Same logic can be applied to a single huge file. Those things are
> completely up to the plugin developer.
> 
> 2. Then we have to do remove the RPC invocation and make gfac nodes as
> worker nodes.
> 
> Regards
> Lahiru
> 
> 
> On Wed, Jun 18, 2014 at 12:11 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
> 
>> Hi Eran,
>> 
>> 
>> On Tue, Jun 17, 2014 at 4:06 PM, Eran Chinthaka Withana <
>> eran.chinthaka@gmail.com> wrote:
>> 
>>> Storm has a Kafka spout which manages the cursor location (pointer to the
>>> head of the queue representing the next message to be processed) inside
>>> ZK.
>>> Each storm spout instance uses this information to get the next item to
>>> process. Storm kafka spout won't advance to the next message until it gets
>>> an ack from the storm topology.
>>> 
>> If we have 10 jobs in the queue and 5 GFAC instances picked 1 at a time
>> and successfully submitted and have to start taking rest of the jobs. But
>> all 5 GFAC instances are responsible for initially picked  5 jobs because
>> they are still running and gfac instances are monitoring them until its
>> done but at the same time we have to move the cursor to pick other jobs
>> too.
>> 
>> If we Ack and moved the cursor just after submission without waiting until
>> the job is actually finished how are we going to know which gfac is
>> monitoring which set of jobs ?
>> 
>> I am not getting how achieve above requirement with this suggestion. May
>> be I am missing something here.
>> 
>> Regards
>> Lahiru
>> 
>>> 
>>> So, if there is an exception in the topology and ack is sent only by the
>>> last bolt, then storm bolt make sure all messages are processed since
>>> exceptions won't generate acks.
>>> 
>>> Thanks,
>>> Eran Chinthaka Withana
>>> 
>>> 
>>> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <gl...@gmail.com>
>>> wrote:
>>> 
>>>> Hi Eran,
>>>> 
>>>> I think I should take back my last email. When I carefully look at
>>> storm I
>>>> have following question.
>>>> 
>>>> How are we going to store the Job statuses  and relaunch the jobs which
>>> was
>>>> running in failure nodes ? Its true that storm is starting new workers
>>> but
>>>> there should be a way to find missing jobs by someone in the system.
>>> Since
>>>> we are not having a data stream there is no use to start new workers
>>> unless
>>>> we handler the missing jobs. I think we need to have a better control of
>>>> our component and persist the states of jobs each GFAC node is handling.
>>>> Directly using zookeeper will let us to do a proper fault tolerance
>>>> implementation.
>>>> 
>>>> Regards
>>>> Lahiru
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi Supun,
>>>>> 
>>>>> I think in this usecase we only use storm topology to do the
>>>> communication
>>>>> among workers and we are completely ignoring the stream processing
>>> part.
>>>>> Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes
>>> in
>>>> the
>>>>> storm topology. But I think we can achieve extremely fault tolerance
>>>> system
>>>>> by directly using storm based on following statement in storm site
>>> with
>>>>> minimum changes in airavata.
>>>>> 
>>>>> Additionally, the Nimbus daemon and Supervisor daemons are fail-fast
>>> and
>>>>> stateless; all state is kept in Zookeeper or on local disk. This means
>>>> you
>>>>> can kill -9 Nimbus or the Supervisors and they’ll start back up like
>>>>> nothing happened. This design leads to Storm clusters being incredibly
>>>>> stable.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <
>>> supun06@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi Eran,
>>>>>> 
>>>>>> I'm using Storm every day and this is one of the strangest things
>>> I've
>>>>>> heard about using Storm. My be there are more use cases for Storm
>>> other
>>>>>> than Distributed Stream processing. AFAIK the Bolts, spouts are
>>> built to
>>>>>> handle a stream of events that doesn't take much time to process. In
>>>>>> Airavata we don't process the messages. Instead we run experiments
>>> based
>>>>>> on
>>>>>> the commands given.
>>>>>> 
>>>>>> If you want process isolation, distributed execution, cluster
>>> resource
>>>>>> management Yarn would be a better thing to explore.
>>>>>> 
>>>>>> Thanks,
>>>>>> Supun..
>>>>>> 
>>>>>> 
>>>>>> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
>>>>>> eran.chinthaka@gmail.com> wrote:
>>>>>> 
>>>>>>> Hi Lahiru,
>>>>>>> 
>>>>>>> good summarization. Thanks Lahiru.
>>>>>>> 
>>>>>>> I think you are trying to stick to a model where Orchestrator
>>>>>> distributing
>>>>>>> to work for GFac worker and trying to do the impedance mismatch
>>>> through
>>>>>> a
>>>>>>> messaging solution. If you step back and think, we don't even want
>>> the
>>>>>>> orchestrator to handle everything. From its point of view, it
>>> should
>>>>>> submit
>>>>>>> jobs to the framework, and will wait or get notified once the job
>>> is
>>>>>> done.
>>>>>>> 
>>>>>>> There are multiple ways of doing this. And here is one method.
>>>>>>> 
>>>>>>> Orchestrator submits all its jobs to Job queue (implemented using
>>> any
>>>> MQ
>>>>>>> impl like Rabbit or Kafka). A storm topology is implemented to
>>> dequeue
>>>>>>> messages, process them (i.e. submit those jobs and get those
>>> executed)
>>>>>> and
>>>>>>> notify the Orchestrator with the status (either through another
>>>>>>> JobCompletionQueue or direct invocation).
>>>>>>> 
>>>>>>> With this approach, the MQ provider will help to match impedance
>>>> between
>>>>>>> job submission and consumption. Storm helps with worker
>>> coordination,
>>>>>> load
>>>>>>> balancing, throttling on your job execution framework, worker pool
>>>>>>> management and fault tolerance.
>>>>>>> 
>>>>>>> Of course, you can implement this based only on ZK and handle
>>>> everything
>>>>>>> else on your own but storm had done exactly that with the use of ZK
>>>>>>> underneath.
>>>>>>> 
>>>>>>> Finally, if you go for a model like this, then even beyond job
>>>>>> submission,
>>>>>>> you can use the same model to do anything within the framework for
>>>>>> internal
>>>>>>> communication. For example, the workflow engine will submit its
>>> jobs
>>>> to
>>>>>>> queues based on what it has to do. Storm topologies exists for each
>>>>>> queues
>>>>>>> to dequeue messages and carry out the work in a reliable manner.
>>>>>> Consider
>>>>>>> this as mini-workflows within a larger workflow framework.
>>>>>>> 
>>>>>>> We can have a voice chat if its more convenient. But not at 7am
>>> PST :)
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Eran Chinthaka Withana
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
>>>> glahiru@gmail.com
>>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi All,
>>>>>>>> 
>>>>>>>> Ignoring the tool that we are going to use to implement fault
>>>>>> tolerance I
>>>>>>>> have summarized the model we have decided so far. I will use the
>>>> tool
>>>>>>> name
>>>>>>>> as X, we can use Zookeeper or some other implementation.
>>> Following
>>>>>> design
>>>>>>>> assume tool X  and Registry have high availability.
>>>>>>>> 
>>>>>>>> 1. Orchestrator and GFAC worker node communication is going to be
>>>>>> queue
>>>>>>>> based and tool X is going to be used for this communication. (We
>>>> have
>>>>>> to
>>>>>>>> implement this with considering race condition between different
>>>> gfac
>>>>>>>> workers).
>>>>>>>> 2. We are having multiple instances of GFAC which are identical
>>> (In
>>>>>>> future
>>>>>>>> we can group gfac workers). Existence of each worker node is
>>>>>> identified
>>>>>>>> using X. If node goes down orchestrator will be notified by X.
>>>>>>>> 3. When a particular request comes and accepted by one gfac
>>> worker
>>>>>> that
>>>>>>>> information will be replicated in tool X and a place where this
>>>>>>> information
>>>>>>>> is persisted even the worker failed.
>>>>>>>> 4. When a job comes to a final state like failed or cancelled or
>>>>>>> completed
>>>>>>>> above information will be removed. So at a given time
>>> orchestrator
>>>> can
>>>>>>> poll
>>>>>>>> active jobs in each worker by giving a worker ID.
>>>>>>>> 5. Tool X will make sure that when a worker goes down it will
>>> notify
>>>>>>>> orchestrator. During a worker failure, based on step 3 and 4
>>>>>> orchestrator
>>>>>>>> can poll all the active jobs of that worker and do the same thing
>>>>>> like in
>>>>>>>> step 1 (store the experiment ID to the queue) and gfac worker
>>> will
>>>>>> pick
>>>>>>> the
>>>>>>>> jobs.
>>>>>>>> 
>>>>>>>> 6. When GFAC receive a job like in step 5 it have to carefully
>>>>>> evaluate
>>>>>>> the
>>>>>>>> state from registry and decide what to be done (If the job is
>>>> pending
>>>>>>> then
>>>>>>>> gfac just have to monitor, if job state is like input transferred
>>>> not
>>>>>>> even
>>>>>>>> submitted gfac has to execute rest of the chain and submit the
>>> job
>>>> to
>>>>>> the
>>>>>>>> resource and start monitoring).
>>>>>>>> 
>>>>>>>> If we can find a tool X which supports all these features and
>>> tool
>>>>>> itself
>>>>>>>> is fault tolerance and support atomicity, high availability and
>>>> simply
>>>>>>> API
>>>>>>>> to implement we can use that tool.
>>>>>>>> 
>>>>>>>> WDYT ?
>>>>>>>> 
>>>>>>>> Lahiru
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
>>>>>> supun06@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Lahiru,
>>>>>>>>> 
>>>>>>>>> Before moving with an implementation it may be worth to
>>> consider
>>>>>> some
>>>>>>> of
>>>>>>>>> the following aspects as well.
>>>>>>>>> 
>>>>>>>>> 1. How to report the progress of an experiment as state in
>>>>>> ZooKeeper?
>>>>>>>> What
>>>>>>>>> happens if a GFac instance crashes while executing an
>>> experiment?
>>>>>> Are
>>>>>>>> there
>>>>>>>>> check-points we can save so that another GFac instance can take
>>>>>> over?
>>>>>>>>> 2. What is the threading model of GFac instances? (I consider
>>> this
>>>>>> as a
>>>>>>>>> very important aspect)
>>>>>>>>> 3. What are the information needed to be stored in the
>>> ZooKeeper?
>>>>>> You
>>>>>>> may
>>>>>>>>> need to store other information about an experiment apart from
>>> its
>>>>>>>>> experiment ID.
>>>>>>>>> 4. How to report errors?
>>>>>>>>> 5. For GFac weather you need a threading model or worker
>>> process
>>>>>> model?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Supun..
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
>>>>>> glahiru@gmail.com
>>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi All,
>>>>>>>>>> 
>>>>>>>>>> I think the conclusion is like this,
>>>>>>>>>> 
>>>>>>>>>> 1, We make the gfac as a worker not a thrift service and we
>>> can
>>>>>> start
>>>>>>>>>> multiple workers either with bunch of providers and handlers
>>>>>>> configured
>>>>>>>>> in
>>>>>>>>>> each worker or provider specific  workers to handle the class
>>>> path
>>>>>>>> issues
>>>>>>>>>> (not the common scenario).
>>>>>>>>>> 
>>>>>>>>>> 2. Gfac workers can be configured to watch for a given path
>>> in
>>>>>>>> zookeeper,
>>>>>>>>>> and multiple workers can listen to the same path. Default
>>> path
>>>>>> can be
>>>>>>>>>> /airavata/gfac or can configure paths like
>>> /airavata/gfac/gsissh
>>>>>>>>>> /airavata/gfac/bes.
>>>>>>>>>> 
>>>>>>>>>> 3. Orchestrator can configure with a logic to store
>>> experiment
>>>>>> IDs in
>>>>>>>>>> zookeeper with a path, and orchestrator can be configured to
>>>>>> provider
>>>>>>>>>> specific path logic too. So when a new request come
>>> orchestrator
>>>>>>> store
>>>>>>>>> the
>>>>>>>>>> experimentID and these experiments IDs are stored in Zk as a
>>>>>> queue.
>>>>>>>>>> 
>>>>>>>>>> 4. Since gfac workers are watching they will be notified and
>>> as
>>>>>> supun
>>>>>>>>>> suggested can use a leader selection algorithm[1] and one
>>> gfac
>>>>>> worker
>>>>>>>>> will
>>>>>>>>>> take the leadership for each experiment. If there are gfac
>>>>>> instances
>>>>>>>> for
>>>>>>>>>> each provider same logic will apply among those nodes with
>>> same
>>>>>>>> provider
>>>>>>>>>> type.
>>>>>>>>>> 
>>>>>>>>>> [1]
>>>> http://curator.apache.org/curator-recipes/leader-election.html
>>>>>>>>>> 
>>>>>>>>>> I would like to implement this if there are  no objections.
>>>>>>>>>> 
>>>>>>>>>> Lahiru
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
>>>>>>>> supun06@gmail.com
>>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Marlon,
>>>>>>>>>>> 
>>>>>>>>>>> I think you are exactly correct.
>>>>>>>>>>> 
>>>>>>>>>>> Supun..
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
>>>>>> marpierc@iu.edu>
>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Let me restate this, and please tell me if I'm wrong.
>>>>>>>>>>>> 
>>>>>>>>>>>> Orchestrator decides (somehow) that a particular job
>>>> requires
>>>>>>>>> JSDL/BES,
>>>>>>>>>>> so
>>>>>>>>>>>> it places the Experiment ID in Zookeeper's
>>>>>>> /airavata/gfac/jsdl-bes
>>>>>>>>>> node.
>>>>>>>>>>>> GFAC servers associated with this instance notice the
>>>> update.
>>>>>>> The
>>>>>>>>>> first
>>>>>>>>>>>> GFAC to claim the job gets it, uses the Experiment ID to
>>> get
>>>>>> the
>>>>>>>>>> detailed
>>>>>>>>>>>> information it needs from the Registry.  ZooKeeper
>>> handles
>>>> the
>>>>>>>>> locking,
>>>>>>>>>>> etc
>>>>>>>>>>>> to make sure that only one GFAC at a time is trying to
>>>> handle
>>>>>> an
>>>>>>>>>>> experiment.
>>>>>>>>>>>> 
>>>>>>>>>>>> Marlon
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the clarification.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> My suggestion is that may be you don't need a Thrift
>>>> service
>>>>>>>>> between
>>>>>>>>>>>>>> Orchestrator and the component executing the
>>> experiment.
>>>>>> When a
>>>>>>>> new
>>>>>>>>>>>>>> experiment is submitted, orchestrator decides who can
>>>>>> execute
>>>>>>>> this
>>>>>>>>>> job.
>>>>>>>>>>>>>> Then it put the information about this experiment
>>>> execution
>>>>>> in
>>>>>>>>>>> ZooKeeper.
>>>>>>>>>>>>>> The component which wants to executes the experiment is
>>>>>>> listening
>>>>>>>>> to
>>>>>>>>>>> this
>>>>>>>>>>>>>> ZooKeeper path and when it sees the experiment it will
>>>>>> execute
>>>>>>>> it.
>>>>>>>>> So
>>>>>>>>>>>>>> that
>>>>>>>>>>>>>> the communication happens through an state change in
>>>>>> ZooKeeper.
>>>>>>>>> This
>>>>>>>>>>> can
>>>>>>>>>>>>>> potentially simply your architecture.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Supun.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
>>>>>>>>>>> glahiru@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So your suggestion is to create a znode for each
>>> thrift
>>>>>>> service
>>>>>>>> we
>>>>>>>>>>> have
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> when the request comes that node gets modified with
>>> input
>>>>>> data
>>>>>>>> for
>>>>>>>>>>> that
>>>>>>>>>>>>>>> request and thrift service is having a watch for that
>>>> node
>>>>>> and
>>>>>>>> it
>>>>>>>>>> will
>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>> notified because of the watch and it can read the
>>> input
>>>>>> from
>>>>>>>>>> zookeeper
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> invoke the operation?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva
>>> <
>>>>>>>>>>>>>>> supun06@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Here is what I think about Airavata and ZooKeeper. In
>>>>>>> Airavata
>>>>>>>>>> there
>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>> many components and these components must be
>>> stateless
>>>> to
>>>>>>>> achieve
>>>>>>>>>>>>>>>> scalability and reliability.Also there must be a
>>>>>> mechanism to
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> communicate
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> between the components. At the moment Airavata uses
>>> RPC
>>>>>> calls
>>>>>>>>> based
>>>>>>>>>>> on
>>>>>>>>>>>>>>>> Thrift for the communication.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> ZooKeeper can be used both as a place to hold state
>>> and
>>>>>> as a
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> communication
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> layer between the components. I'm involved with a
>>>> project
>>>>>>> that
>>>>>>>>> has
>>>>>>>>>>> many
>>>>>>>>>>>>>>>> distributed components like AIravata. Right now we
>>> use
>>>>>> Thrift
>>>>>>>>>>> services
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> communicate among the components. But we find it
>>>>>> difficult to
>>>>>>>> use
>>>>>>>>>> RPC
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> calls
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> and achieve stateless behaviour and thinking of
>>>> replacing
>>>>>>>> Thrift
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> services
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> with ZooKeeper based communication layer. So I think
>>> it
>>>> is
>>>>>>>> better
>>>>>>>>>> to
>>>>>>>>>>>>>>>> explore the possibility of removing the Thrift
>>> services
>>>>>>> between
>>>>>>>>> the
>>>>>>>>>>>>>>>> components and use ZooKeeper as a communication
>>>> mechanism
>>>>>>>> between
>>>>>>>>>> the
>>>>>>>>>>>>>>>> services. If you do this you will have to move the
>>> state
>>>>>> to
>>>>>>>>>> ZooKeeper
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> will automatically achieve the stateless behaviour in
>>>> the
>>>>>>>>>> components.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Also I think trying to make ZooKeeper optional is a
>>> bad
>>>>>> idea.
>>>>>>>> If
>>>>>>>>> we
>>>>>>>>>>> are
>>>>>>>>>>>>>>>> trying to integrate something fundamentally
>>> important to
>>>>>>>>>> architecture
>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>> how to store state, we shouldn't make it optional.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Supun..
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera
>>> Rathnayaka <
>>>>>>>>>>>>>>>> shameerainfo@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> As i understood,  not only reliability , you are
>>> trying
>>>>>> to
>>>>>>>>> achieve
>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>> other requirement by introducing zookeeper, like
>>> health
>>>>>>>>> monitoring
>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> services, categorization with service implementation
>>> etc
>>>>>> ...
>>>>>>> .
>>>>>>>> In
>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> case, i think we can get use of zookeeper's features
>>>> but
>>>>>> if
>>>>>>> we
>>>>>>>>>> only
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> focus
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> on reliability, i have little bit of concern, why
>>> can't
>>>> we
>>>>>>> use
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> clustering +
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> LB ?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Yes it is better we add Zookeeper as a prerequisite
>>> if
>>>>>> user
>>>>>>>> need
>>>>>>>>>> to
>>>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>  Shameera.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake
>>> <
>>>>>>>>>>>>>>>>> glahiru@gmail.com
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Gagan,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I need to start another discussion about it, but I
>>> had
>>>>>> an
>>>>>>>>> offline
>>>>>>>>>>>>>>>>>> discussion with Suresh about auto-scaling. I will
>>>> start
>>>>>>>> another
>>>>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>>> about this topic too.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks Lahiru for pointing to nice library, added
>>> to
>>>> my
>>>>>>>>>> dictionary
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> :).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I would like to know how are we planning to start
>>>>>> multiple
>>>>>>>>>> servers.
>>>>>>>>>>>>>>>>>>> 1. Spawning new servers based on load? Some times
>>> we
>>>>>> call
>>>>>>> it
>>>>>>>>> as
>>>>>>>>>>> auto
>>>>>>>>>>>>>>>>>>> scalable.
>>>>>>>>>>>>>>>>>>> 2. To make some specific number of nodes available
>>>>>> such as
>>>>>>>> we
>>>>>>>>>>> want 2
>>>>>>>>>>>>>>>>>>> servers to be available at any time so if one goes
>>>> down
>>>>>>>> then I
>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> spawn one new to make available servers count 2.
>>>>>>>>>>>>>>>>>>> 3. Initially start all the servers.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> In scenario 1 and 2 zookeeper does make sense but
>>> I
>>>>>> don't
>>>>>>>>>> believe
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> architecture support this?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
>>>>>>>>> glahiru@gmail.com
>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Gagan,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks for your response. Please see my inline
>>>>>> comments.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> gagandeepjuneja@gmail.com>
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi Lahiru,
>>>>>>>>>>>>>>>>>>>>> Just my 2 cents.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I am big fan of zookeeper but also against
>>> adding
>>>>>>> multiple
>>>>>>>>>> hops
>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> system which can add unnecessary complexity. Here
>>> I
>>>> am
>>>>>> not
>>>>>>>>> able
>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> understand the requirement of zookeeper may be
>>> I am
>>>>>>> wrong
>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> less
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> knowledge of the airavata system in whole. So I
>>> would
>>>>>> like
>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> discuss
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> following point.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 1. How it will help us in making system more
>>>>>> reliable.
>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> able to restart services. At max it can tell
>>> whether
>>>>>>> service
>>>>>>>>> is
>>>>>>>>>> up
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> or not
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> which could only be the case if airavata service
>>> goes
>>>>>> down
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> gracefully and
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> we have any automated way to restart it. If this
>>> is
>>>>>> just
>>>>>>>>> matter
>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> routing
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> client requests to the available thrift servers
>>> then
>>>>>> this
>>>>>>>> can
>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> achieved
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> with the help of load balancer which I guess is
>>>> already
>>>>>>>> there
>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> thrift
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> wish list.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> We have multiple thrift services and currently
>>> we
>>>>>> start
>>>>>>>>> only
>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> of them and each thrift service is a stateless
>>>>>> service. To
>>>>>>>>> keep
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> high
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> availability we have to start multiple instances
>>> of
>>>>>> them
>>>>>>> in
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> production
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> scenario. So for clients to get an available thrift
>>>>>> service
>>>>>>> we
>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> use
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> zookeeper znodes to represent each available
>>> service.
>>>>>> There
>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> libraries which is doing similar[1] and I think we
>>> can
>>>>>> use
>>>>>>>> them
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> directly.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 2. As far as registering of different providers is
>>>>>>> concerned
>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> think for that we really need external store.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Yes I think so, because its light weight and
>>>>>> reliable
>>>>>>> and
>>>>>>>>> we
>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> very minimal amount of work to achieve all these
>>>>>> features
>>>>>>> to
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Airavata
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> because zookeeper handle all the complexity.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I have seen people using zookeeper more for
>>> state
>>>>>>>> management
>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> distributed environments.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> +1, we might not be the most effective users of
>>>>>>> zookeeper
>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> our services are stateless services, but my point
>>> is
>>>> to
>>>>>>>>> achieve
>>>>>>>>>>>>>>>>>>>> fault-tolerance we can use zookeeper and with
>>>> minimal
>>>>>>> work.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>   I would like to understand more how can we
>>>> leverage
>>>>>>>>>> zookeeper
>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> airavata to make system reliable.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> [1]
>>> https://github.com/eirslett/thrift-zookeeper
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>> Gagan
>>>>>>>>>>>>>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
>>>>>>> marpierc@iu.edu
>>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks for the summary, Lahiru. I'm cc'ing the
>>>>>>>> Architecture
>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> additional comments.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Marlon
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I did little research about Apache
>>> Zookeeper[1]
>>>> and
>>>>>>> how
>>>>>>>> to
>>>>>>>>>> use
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> airavata. Its really a nice way to achieve fault
>>>>>>> tolerance
>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> reliable
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> communication between our thrift services and
>>>>>> clients.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> distributed, fault tolerant system to do a
>>> reliable
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> communication
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> distributed applications. This is like an
>>>> in-memory
>>>>>>> file
>>>>>>>>>>> system
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> nodes in a tree structure and each node can
>>> have
>>>>>> small
>>>>>>>>>> amount
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> data
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> associated with it and these nodes are called
>>>> znodes.
>>>>>>>> Clients
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> connect
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> to a zookeeper server and add/delete and
>>> update
>>>>>> these
>>>>>>>>>> znodes.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>   In Apache Airavata we start multiple thrift
>>>>>>> services
>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> go
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> down for maintenance or these can crash, if we
>>>> use
>>>>>>>>> zookeeper
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> these
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> configuration(thrift service configurations)
>>> we
>>>> can
>>>>>>>>> achieve
>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> reliable
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> system. Basically thrift clients can
>>> dynamically
>>>>>>>> discover
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> available
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> by using ephemeral znodes(Here we do not have
>>> to
>>>>>>> change
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> generated
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> thrift client code but we have to change the
>>>>>> locations we
>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> invoking
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> them). ephemeral znodes will be removed when the
>>>>>> thrift
>>>>>>>>> service
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> goes
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> down
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> and zookeeper guarantee the atomicity between
>>>> these
>>>>>>>>>>> operations.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> With
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> approach we can have a node hierarchy for
>>>> multiple
>>>>>> of
>>>>>>>>>>> airavata,
>>>>>>>>>>>>>>>>>>>>>>> orchestrator,appcatalog and gfac thrift
>>> services.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> For specifically for gfac we can have
>>> different
>>>>>> types
>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> services
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> provider implementation. This can be achieved
>>> by
>>>>>> using
>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> hierarchical
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> support in zookeeper and providing some logic
>>> in
>>>>>>>>> gfac-thrift
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> service
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> register it to a defined path. Using the same
>>>> logic
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> orchestrator
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> discover the provider specific gfac thrift
>>> service
>>>> and
>>>>>>>> route
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> message to
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> the correct thrift service.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> With this approach I think we simply have
>>> write
>>>>>> some
>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> thrift
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> services and clients and zookeeper server
>>>>>> installation
>>>>>>>> can
>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> done as
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> separate process and it will be easier to keep
>>>> the
>>>>>>>>> Zookeeper
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> server
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> separate from Airavata because installation of
>>>>>> Zookeeper
>>>>>>>>> server
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> little
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> complex in production scenario. I think we have
>>> to
>>>>>> make
>>>>>>>> sure
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> everything
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> works fine when there is no Zookeeper running,
>>>> ex:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> enable.zookeeper=false
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> should works fine and users doesn't have to
>>>>>> download
>>>>>>> and
>>>>>>>>>> start
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> zookeeper.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>> Shameera Rathnayaka.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> email: shameera AT apache.org , shameerainfo AT
>>>>>> gmail.com
>>>>>>>>>>>>>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>>>>>> Member, Apache Software Foundation;
>>>> http://www.apache.org
>>>>>>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> System Analyst Programmer
>>>>>>>>>>>>>>> PTI Lab
>>>>>>>>>>>>>>> Indiana University
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>>>>> Member, Apache Software Foundation;
>>> http://www.apache.org
>>>>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Supun Kamburugamuva
>>>>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> System Analyst Programmer
>>>>>>>>>> PTI Lab
>>>>>>>>>> Indiana University
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Supun Kamburugamuva
>>>>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>>>>> Blog: http://supunk.blogspot.com
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> System Analyst Programmer
>>>>>>>> PTI Lab
>>>>>>>> Indiana University
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Supun Kamburugamuva
>>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>>> Blog: http://supunk.blogspot.com
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> System Analyst Programmer
>>>>> PTI Lab
>>>>> Indiana University
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> System Analyst Programmer
>>>> PTI Lab
>>>> Indiana University
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>> 
> 
> 
> 
> -- 
> System Analyst Programmer
> PTI Lab
> Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Amila Jayasekara <th...@gmail.com>.

Great Work !

I would love to see a demo of this.

Thanks
Amila


On Wed, Jun 25, 2014 at 2:09 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi All,
>
> I have finished the initial version of the ZK integration. Now we can start
> multiple thrift gfac services (still the communication between orchestrator
> and gfac is RPC) and orchestrator submit jobs to multiple gfac nodes.
>
> I can kill a gfac node and orchestrator will make sure jobs are not lost,
> it simply take those jobs and re-submit to gfac. Since GFac is a generic
> framework and we have multiple plugins developed for that framework
> checkpointing the plugin is up to the plugin developers but gfac
> checkpoints whether those plugins invoked or not.
>
> I have introduced a new interface for plugin development called Recoverable
> (RecoverableHandlers and RecoverableProvider). So state-full plugins has to
> implement their recover method and gfac framework will make sure it will be
> invoked during a re-run scenario. If a plugin is not recoverable and
> already ran(can be found using framework checkpointing) during the re-run
> that plugin will not be invoked. For now I just implemented recoverability
> to few plugins and I have tested submitting a job to trestles and let it
> submit and come to monitoring state and kill that gfac instance. Now
> Orchestrator pick that execution and re-submit to another gfac node and
> that gfac node does not re-run that job to the computing resource, but
> simply start monitoring once the job is done outputs are downloaded from
> the original output location.
>
> When a particular experiment is finished all the ZK data is removed.
>
> At this point following things needs to be done,
>
> 1. Figure out all the state-full handlers/providers and implement
> recoverability,
>
> Ex: Input handler is transfering 1000 files and with 500 files gfac
> instance crashed, during the re-run it should be able to tranfer from 501
> file. Same logic can be applied to a single huge file. Those things are
> completely up to the plugin developer.
>
> 2. Then we have to do remove the RPC invocation and make gfac nodes as
> worker nodes.
>
> Regards
> Lahiru
>
>
> On Wed, Jun 18, 2014 at 12:11 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi Eran,
> >
> >
> > On Tue, Jun 17, 2014 at 4:06 PM, Eran Chinthaka Withana <
> > eran.chinthaka@gmail.com> wrote:
> >
> >> Storm has a Kafka spout which manages the cursor location (pointer to
> the
> >> head of the queue representing the next message to be processed) inside
> >> ZK.
> >> Each storm spout instance uses this information to get the next item to
> >> process. Storm kafka spout won't advance to the next message until it
> gets
> >> an ack from the storm topology.
> >>
> > If we have 10 jobs in the queue and 5 GFAC instances picked 1 at a time
> > and successfully submitted and have to start taking rest of the jobs. But
> > all 5 GFAC instances are responsible for initially picked  5 jobs because
> > they are still running and gfac instances are monitoring them until its
> > done but at the same time we have to move the cursor to pick other jobs
> > too.
> >
> > If we Ack and moved the cursor just after submission without waiting
> until
> > the job is actually finished how are we going to know which gfac is
> > monitoring which set of jobs ?
> >
> > I am not getting how achieve above requirement with this suggestion. May
> > be I am missing something here.
> >
> > Regards
> > Lahiru
> >
> >>
> >> So, if there is an exception in the topology and ack is sent only by the
> >> last bolt, then storm bolt make sure all messages are processed since
> >> exceptions won't generate acks.
> >>
> >> Thanks,
> >> Eran Chinthaka Withana
> >>
> >>
> >> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <glahiru@gmail.com
> >
> >> wrote:
> >>
> >> > Hi Eran,
> >> >
> >> > I think I should take back my last email. When I carefully look at
> >> storm I
> >> > have following question.
> >> >
> >> > How are we going to store the Job statuses  and relaunch the jobs
> which
> >> was
> >> > running in failure nodes ? Its true that storm is starting new workers
> >> but
> >> > there should be a way to find missing jobs by someone in the system.
> >> Since
> >> > we are not having a data stream there is no use to start new workers
> >> unless
> >> > we handler the missing jobs. I think we need to have a better control
> of
> >> > our component and persist the states of jobs each GFAC node is
> handling.
> >> > Directly using zookeeper will let us to do a proper fault tolerance
> >> > implementation.
> >> >
> >> > Regards
> >> > Lahiru
> >> >
> >> >
> >> >
> >> > On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <
> glahiru@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Supun,
> >> > >
> >> > > I think in this usecase we only use storm topology to do the
> >> > communication
> >> > > among workers and we are completely ignoring the stream processing
> >> part.
> >> > > Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes
> >> in
> >> > the
> >> > > storm topology. But I think we can achieve extremely fault tolerance
> >> > system
> >> > > by directly using storm based on following statement in storm site
> >> with
> >> > > minimum changes in airavata.
> >> > >
> >> > > Additionally, the Nimbus daemon and Supervisor daemons are fail-fast
> >> and
> >> > > stateless; all state is kept in Zookeeper or on local disk. This
> means
> >> > you
> >> > > can kill -9 Nimbus or the Supervisors and they’ll start back up like
> >> > > nothing happened. This design leads to Storm clusters being
> incredibly
> >> > > stable.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <
> >> supun06@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> Hi Eran,
> >> > >>
> >> > >> I'm using Storm every day and this is one of the strangest things
> >> I've
> >> > >> heard about using Storm. My be there are more use cases for Storm
> >> other
> >> > >> than Distributed Stream processing. AFAIK the Bolts, spouts are
> >> built to
> >> > >> handle a stream of events that doesn't take much time to process.
> In
> >> > >> Airavata we don't process the messages. Instead we run experiments
> >> based
> >> > >> on
> >> > >> the commands given.
> >> > >>
> >> > >> If you want process isolation, distributed execution, cluster
> >> resource
> >> > >> management Yarn would be a better thing to explore.
> >> > >>
> >> > >> Thanks,
> >> > >> Supun..
> >> > >>
> >> > >>
> >> > >> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
> >> > >> eran.chinthaka@gmail.com> wrote:
> >> > >>
> >> > >> > Hi Lahiru,
> >> > >> >
> >> > >> > good summarization. Thanks Lahiru.
> >> > >> >
> >> > >> > I think you are trying to stick to a model where Orchestrator
> >> > >> distributing
> >> > >> > to work for GFac worker and trying to do the impedance mismatch
> >> > through
> >> > >> a
> >> > >> > messaging solution. If you step back and think, we don't even
> want
> >> the
> >> > >> > orchestrator to handle everything. From its point of view, it
> >> should
> >> > >> submit
> >> > >> > jobs to the framework, and will wait or get notified once the job
> >> is
> >> > >> done.
> >> > >> >
> >> > >> > There are multiple ways of doing this. And here is one method.
> >> > >> >
> >> > >> > Orchestrator submits all its jobs to Job queue (implemented using
> >> any
> >> > MQ
> >> > >> > impl like Rabbit or Kafka). A storm topology is implemented to
> >> dequeue
> >> > >> > messages, process them (i.e. submit those jobs and get those
> >> executed)
> >> > >> and
> >> > >> > notify the Orchestrator with the status (either through another
> >> > >> > JobCompletionQueue or direct invocation).
> >> > >> >
> >> > >> > With this approach, the MQ provider will help to match impedance
> >> > between
> >> > >> > job submission and consumption. Storm helps with worker
> >> coordination,
> >> > >> load
> >> > >> > balancing, throttling on your job execution framework, worker
> pool
> >> > >> > management and fault tolerance.
> >> > >> >
> >> > >> > Of course, you can implement this based only on ZK and handle
> >> > everything
> >> > >> > else on your own but storm had done exactly that with the use of
> ZK
> >> > >> > underneath.
> >> > >> >
> >> > >> > Finally, if you go for a model like this, then even beyond job
> >> > >> submission,
> >> > >> > you can use the same model to do anything within the framework
> for
> >> > >> internal
> >> > >> > communication. For example, the workflow engine will submit its
> >> jobs
> >> > to
> >> > >> > queues based on what it has to do. Storm topologies exists for
> each
> >> > >> queues
> >> > >> > to dequeue messages and carry out the work in a reliable manner.
> >> > >> Consider
> >> > >> > this as mini-workflows within a larger workflow framework.
> >> > >> >
> >> > >> > We can have a voice chat if its more convenient. But not at 7am
> >> PST :)
> >> > >> >
> >> > >> >
> >> > >> > Thanks,
> >> > >> > Eran Chinthaka Withana
> >> > >> >
> >> > >> >
> >> > >> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
> >> > glahiru@gmail.com
> >> > >> >
> >> > >> > wrote:
> >> > >> >
> >> > >> > > Hi All,
> >> > >> > >
> >> > >> > > Ignoring the tool that we are going to use to implement fault
> >> > >> tolerance I
> >> > >> > > have summarized the model we have decided so far. I will use
> the
> >> > tool
> >> > >> > name
> >> > >> > > as X, we can use Zookeeper or some other implementation.
> >> Following
> >> > >> design
> >> > >> > > assume tool X  and Registry have high availability.
> >> > >> > >
> >> > >> > > 1. Orchestrator and GFAC worker node communication is going to
> be
> >> > >> queue
> >> > >> > > based and tool X is going to be used for this communication.
> (We
> >> > have
> >> > >> to
> >> > >> > > implement this with considering race condition between
> different
> >> > gfac
> >> > >> > > workers).
> >> > >> > > 2. We are having multiple instances of GFAC which are identical
> >> (In
> >> > >> > future
> >> > >> > > we can group gfac workers). Existence of each worker node is
> >> > >> identified
> >> > >> > > using X. If node goes down orchestrator will be notified by X.
> >> > >> > > 3. When a particular request comes and accepted by one gfac
> >> worker
> >> > >> that
> >> > >> > > information will be replicated in tool X and a place where this
> >> > >> > information
> >> > >> > > is persisted even the worker failed.
> >> > >> > > 4. When a job comes to a final state like failed or cancelled
> or
> >> > >> > completed
> >> > >> > > above information will be removed. So at a given time
> >> orchestrator
> >> > can
> >> > >> > poll
> >> > >> > > active jobs in each worker by giving a worker ID.
> >> > >> > > 5. Tool X will make sure that when a worker goes down it will
> >> notify
> >> > >> > > orchestrator. During a worker failure, based on step 3 and 4
> >> > >> orchestrator
> >> > >> > > can poll all the active jobs of that worker and do the same
> thing
> >> > >> like in
> >> > >> > > step 1 (store the experiment ID to the queue) and gfac worker
> >> will
> >> > >> pick
> >> > >> > the
> >> > >> > > jobs.
> >> > >> > >
> >> > >> > > 6. When GFAC receive a job like in step 5 it have to carefully
> >> > >> evaluate
> >> > >> > the
> >> > >> > > state from registry and decide what to be done (If the job is
> >> > pending
> >> > >> > then
> >> > >> > > gfac just have to monitor, if job state is like input
> transferred
> >> > not
> >> > >> > even
> >> > >> > > submitted gfac has to execute rest of the chain and submit the
> >> job
> >> > to
> >> > >> the
> >> > >> > > resource and start monitoring).
> >> > >> > >
> >> > >> > > If we can find a tool X which supports all these features and
> >> tool
> >> > >> itself
> >> > >> > > is fault tolerance and support atomicity, high availability and
> >> > simply
> >> > >> > API
> >> > >> > > to implement we can use that tool.
> >> > >> > >
> >> > >> > > WDYT ?
> >> > >> > >
> >> > >> > > Lahiru
> >> > >> > >
> >> > >> > >
> >> > >> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
> >> > >> supun06@gmail.com>
> >> > >> > > wrote:
> >> > >> > >
> >> > >> > > > Hi Lahiru,
> >> > >> > > >
> >> > >> > > > Before moving with an implementation it may be worth to
> >> consider
> >> > >> some
> >> > >> > of
> >> > >> > > > the following aspects as well.
> >> > >> > > >
> >> > >> > > > 1. How to report the progress of an experiment as state in
> >> > >> ZooKeeper?
> >> > >> > > What
> >> > >> > > > happens if a GFac instance crashes while executing an
> >> experiment?
> >> > >> Are
> >> > >> > > there
> >> > >> > > > check-points we can save so that another GFac instance can
> take
> >> > >> over?
> >> > >> > > > 2. What is the threading model of GFac instances? (I consider
> >> this
> >> > >> as a
> >> > >> > > > very important aspect)
> >> > >> > > > 3. What are the information needed to be stored in the
> >> ZooKeeper?
> >> > >> You
> >> > >> > may
> >> > >> > > > need to store other information about an experiment apart
> from
> >> its
> >> > >> > > > experiment ID.
> >> > >> > > > 4. How to report errors?
> >> > >> > > > 5. For GFac weather you need a threading model or worker
> >> process
> >> > >> model?
> >> > >> > > >
> >> > >> > > > Thanks,
> >> > >> > > > Supun..
> >> > >> > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
> >> > >> glahiru@gmail.com
> >> > >> > >
> >> > >> > > > wrote:
> >> > >> > > >
> >> > >> > > > > Hi All,
> >> > >> > > > >
> >> > >> > > > > I think the conclusion is like this,
> >> > >> > > > >
> >> > >> > > > > 1, We make the gfac as a worker not a thrift service and we
> >> can
> >> > >> start
> >> > >> > > > > multiple workers either with bunch of providers and
> handlers
> >> > >> > configured
> >> > >> > > > in
> >> > >> > > > > each worker or provider specific  workers to handle the
> class
> >> > path
> >> > >> > > issues
> >> > >> > > > > (not the common scenario).
> >> > >> > > > >
> >> > >> > > > > 2. Gfac workers can be configured to watch for a given path
> >> in
> >> > >> > > zookeeper,
> >> > >> > > > > and multiple workers can listen to the same path. Default
> >> path
> >> > >> can be
> >> > >> > > > > /airavata/gfac or can configure paths like
> >> /airavata/gfac/gsissh
> >> > >> > > > > /airavata/gfac/bes.
> >> > >> > > > >
> >> > >> > > > > 3. Orchestrator can configure with a logic to store
> >> experiment
> >> > >> IDs in
> >> > >> > > > > zookeeper with a path, and orchestrator can be configured
> to
> >> > >> provider
> >> > >> > > > > specific path logic too. So when a new request come
> >> orchestrator
> >> > >> > store
> >> > >> > > > the
> >> > >> > > > > experimentID and these experiments IDs are stored in Zk as
> a
> >> > >> queue.
> >> > >> > > > >
> >> > >> > > > > 4. Since gfac workers are watching they will be notified
> and
> >> as
> >> > >> supun
> >> > >> > > > > suggested can use a leader selection algorithm[1] and one
> >> gfac
> >> > >> worker
> >> > >> > > >  will
> >> > >> > > > > take the leadership for each experiment. If there are gfac
> >> > >> instances
> >> > >> > > for
> >> > >> > > > > each provider same logic will apply among those nodes with
> >> same
> >> > >> > > provider
> >> > >> > > > > type.
> >> > >> > > > >
> >> > >> > > > > [1]
> >> > http://curator.apache.org/curator-recipes/leader-election.html
> >> > >> > > > >
> >> > >> > > > > I would like to implement this if there are  no objections.
> >> > >> > > > >
> >> > >> > > > > Lahiru
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> >> > >> > > supun06@gmail.com
> >> > >> > > > >
> >> > >> > > > > wrote:
> >> > >> > > > >
> >> > >> > > > > > Hi Marlon,
> >> > >> > > > > >
> >> > >> > > > > > I think you are exactly correct.
> >> > >> > > > > >
> >> > >> > > > > > Supun..
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
> >> > >> marpierc@iu.edu>
> >> > >> > > > wrote:
> >> > >> > > > > >
> >> > >> > > > > > > Let me restate this, and please tell me if I'm wrong.
> >> > >> > > > > > >
> >> > >> > > > > > > Orchestrator decides (somehow) that a particular job
> >> > requires
> >> > >> > > > JSDL/BES,
> >> > >> > > > > > so
> >> > >> > > > > > > it places the Experiment ID in Zookeeper's
> >> > >> > /airavata/gfac/jsdl-bes
> >> > >> > > > > node.
> >> > >> > > > > > >  GFAC servers associated with this instance notice the
> >> > update.
> >> > >> >  The
> >> > >> > > > > first
> >> > >> > > > > > > GFAC to claim the job gets it, uses the Experiment ID
> to
> >> get
> >> > >> the
> >> > >> > > > > detailed
> >> > >> > > > > > > information it needs from the Registry.  ZooKeeper
> >> handles
> >> > the
> >> > >> > > > locking,
> >> > >> > > > > > etc
> >> > >> > > > > > > to make sure that only one GFAC at a time is trying to
> >> > handle
> >> > >> an
> >> > >> > > > > > experiment.
> >> > >> > > > > > >
> >> > >> > > > > > > Marlon
> >> > >> > > > > > >
> >> > >> > > > > > >
> >> > >> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> >> > >> > > > > > >
> >> > >> > > > > > >> Hi Supun,
> >> > >> > > > > > >>
> >> > >> > > > > > >> Thanks for the clarification.
> >> > >> > > > > > >>
> >> > >> > > > > > >> Regards
> >> > >> > > > > > >> Lahiru
> >> > >> > > > > > >>
> >> > >> > > > > > >>
> >> > >> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva
> <
> >> > >> > > > > > supun06@gmail.com>
> >> > >> > > > > > >> wrote:
> >> > >> > > > > > >>
> >> > >> > > > > > >>  Hi Lahiru,
> >> > >> > > > > > >>>
> >> > >> > > > > > >>> My suggestion is that may be you don't need a Thrift
> >> > service
> >> > >> > > > between
> >> > >> > > > > > >>> Orchestrator and the component executing the
> >> experiment.
> >> > >> When a
> >> > >> > > new
> >> > >> > > > > > >>> experiment is submitted, orchestrator decides who can
> >> > >> execute
> >> > >> > > this
> >> > >> > > > > job.
> >> > >> > > > > > >>> Then it put the information about this experiment
> >> > execution
> >> > >> in
> >> > >> > > > > > ZooKeeper.
> >> > >> > > > > > >>> The component which wants to executes the experiment
> is
> >> > >> > listening
> >> > >> > > > to
> >> > >> > > > > > this
> >> > >> > > > > > >>> ZooKeeper path and when it sees the experiment it
> will
> >> > >> execute
> >> > >> > > it.
> >> > >> > > > So
> >> > >> > > > > > >>> that
> >> > >> > > > > > >>> the communication happens through an state change in
> >> > >> ZooKeeper.
> >> > >> > > > This
> >> > >> > > > > > can
> >> > >> > > > > > >>> potentially simply your architecture.
> >> > >> > > > > > >>>
> >> > >> > > > > > >>> Thanks,
> >> > >> > > > > > >>> Supun.
> >> > >> > > > > > >>>
> >> > >> > > > > > >>>
> >> > >> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake
> <
> >> > >> > > > > > glahiru@gmail.com>
> >> > >> > > > > > >>> wrote:
> >> > >> > > > > > >>>
> >> > >> > > > > > >>>  Hi Supun,
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>> So your suggestion is to create a znode for each
> >> thrift
> >> > >> > service
> >> > >> > > we
> >> > >> > > > > > have
> >> > >> > > > > > >>>> and
> >> > >> > > > > > >>>> when the request comes that node gets modified with
> >> input
> >> > >> data
> >> > >> > > for
> >> > >> > > > > > that
> >> > >> > > > > > >>>> request and thrift service is having a watch for
> that
> >> > node
> >> > >> and
> >> > >> > > it
> >> > >> > > > > will
> >> > >> > > > > > >>>> be
> >> > >> > > > > > >>>> notified because of the watch and it can read the
> >> input
> >> > >> from
> >> > >> > > > > zookeeper
> >> > >> > > > > > >>>> and
> >> > >> > > > > > >>>> invoke the operation?
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>> Lahiru
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun
> Kamburugamuva
> >> <
> >> > >> > > > > > >>>> supun06@gmail.com>
> >> > >> > > > > > >>>> wrote:
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>  Hi all,
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper.
> In
> >> > >> > Airavata
> >> > >> > > > > there
> >> > >> > > > > > >>>>> are
> >> > >> > > > > > >>>>> many components and these components must be
> >> stateless
> >> > to
> >> > >> > > achieve
> >> > >> > > > > > >>>>> scalability and reliability.Also there must be a
> >> > >> mechanism to
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> communicate
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> between the components. At the moment Airavata uses
> >> RPC
> >> > >> calls
> >> > >> > > > based
> >> > >> > > > > > on
> >> > >> > > > > > >>>>> Thrift for the communication.
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> ZooKeeper can be used both as a place to hold state
> >> and
> >> > >> as a
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> communication
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> layer between the components. I'm involved with a
> >> > project
> >> > >> > that
> >> > >> > > > has
> >> > >> > > > > > many
> >> > >> > > > > > >>>>> distributed components like AIravata. Right now we
> >> use
> >> > >> Thrift
> >> > >> > > > > > services
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> to
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> communicate among the components. But we find it
> >> > >> difficult to
> >> > >> > > use
> >> > >> > > > > RPC
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> calls
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> and achieve stateless behaviour and thinking of
> >> > replacing
> >> > >> > > Thrift
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> services
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> with ZooKeeper based communication layer. So I
> think
> >> it
> >> > is
> >> > >> > > better
> >> > >> > > > > to
> >> > >> > > > > > >>>>> explore the possibility of removing the Thrift
> >> services
> >> > >> > between
> >> > >> > > > the
> >> > >> > > > > > >>>>> components and use ZooKeeper as a communication
> >> > mechanism
> >> > >> > > between
> >> > >> > > > > the
> >> > >> > > > > > >>>>> services. If you do this you will have to move the
> >> state
> >> > >> to
> >> > >> > > > > ZooKeeper
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> and
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> will automatically achieve the stateless behaviour
> in
> >> > the
> >> > >> > > > > components.
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a
> >> bad
> >> > >> idea.
> >> > >> > > If
> >> > >> > > > we
> >> > >> > > > > > are
> >> > >> > > > > > >>>>> trying to integrate something fundamentally
> >> important to
> >> > >> > > > > architecture
> >> > >> > > > > > >>>>> as
> >> > >> > > > > > >>>>> how to store state, we shouldn't make it optional.
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> Thanks,
> >> > >> > > > > > >>>>> Supun..
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera
> >> Rathnayaka <
> >> > >> > > > > > >>>>> shameerainfo@gmail.com> wrote:
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>>  Hi Lahiru,
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> As i understood,  not only reliability , you are
> >> trying
> >> > >> to
> >> > >> > > > achieve
> >> > >> > > > > > >>>>>> some
> >> > >> > > > > > >>>>>> other requirement by introducing zookeeper, like
> >> health
> >> > >> > > > monitoring
> >> > >> > > > > > of
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>> the
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> services, categorization with service
> implementation
> >> etc
> >> > >> ...
> >> > >> > .
> >> > >> > > In
> >> > >> > > > > > that
> >> > >> > > > > > >>>>>> case, i think we can get use of zookeeper's
> features
> >> > but
> >> > >> if
> >> > >> > we
> >> > >> > > > > only
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>> focus
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> on reliability, i have little bit of concern, why
> >> can't
> >> > we
> >> > >> > use
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>> clustering +
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> LB ?
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> Yes it is better we add Zookeeper as a
> prerequisite
> >> if
> >> > >> user
> >> > >> > > need
> >> > >> > > > > to
> >> > >> > > > > > >>>>>> use
> >> > >> > > > > > >>>>>> it.
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> Thanks,
> >> > >> > > > > > >>>>>>   Shameera.
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru
> Gunathilake
> >> <
> >> > >> > > > > > >>>>>> glahiru@gmail.com
> >> > >> > > > > > >>>>>> wrote:
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>>  Hi Gagan,
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>> I need to start another discussion about it, but
> I
> >> had
> >> > >> an
> >> > >> > > > offline
> >> > >> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will
> >> > start
> >> > >> > > another
> >> > >> > > > > > >>>>>>> thread
> >> > >> > > > > > >>>>>>> about this topic too.
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>> Regards
> >> > >> > > > > > >>>>>>> Lahiru
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>> gagandeepjuneja@gmail.com
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> wrote:
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library,
> added
> >> to
> >> > my
> >> > >> > > > > dictionary
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>> :).
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  I would like to know how are we planning to start
> >> > >> multiple
> >> > >> > > > > servers.
> >> > >> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some
> times
> >> we
> >> > >> call
> >> > >> > it
> >> > >> > > > as
> >> > >> > > > > > auto
> >> > >> > > > > > >>>>>>>> scalable.
> >> > >> > > > > > >>>>>>>> 2. To make some specific number of nodes
> available
> >> > >> such as
> >> > >> > > we
> >> > >> > > > > > want 2
> >> > >> > > > > > >>>>>>>> servers to be available at any time so if one
> goes
> >> > down
> >> > >> > > then I
> >> > >> > > > > > need
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>> to
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  spawn one new to make available servers count 2.
> >> > >> > > > > > >>>>>>>> 3. Initially start all the servers.
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense
> but
> >> I
> >> > >> don't
> >> > >> > > > > believe
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>> existing
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> architecture support this?
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>>> Regards,
> >> > >> > > > > > >>>>>>>> Gagan
> >> > >> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> >> > >> > > > glahiru@gmail.com
> >> > >> > > > > >
> >> > >> > > > > > >>>>>>>>
> >> > >> > > > > > >>>>>>> wrote:
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> Hi Gagan,
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
> >> > >> comments.
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> wrote:
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>  Hi Lahiru,
> >> > >> > > > > > >>>>>>>>>> Just my 2 cents.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against
> >> adding
> >> > >> > multiple
> >> > >> > > > > hops
> >> > >> > > > > > in
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> the
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> system which can add unnecessary complexity.
> Here
> >> I
> >> > am
> >> > >> not
> >> > >> > > > able
> >> > >> > > > > to
> >> > >> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be
> >> I am
> >> > >> > wrong
> >> > >> > > > > > because
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> of
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> less
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I
> >> would
> >> > >> like
> >> > >> > > to
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> discuss
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  following point.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>> 1. How it will help us in making system more
> >> > >> reliable.
> >> > >> > > > > Zookeeper
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> is
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> not
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> able to restart services. At max it can tell
> >> whether
> >> > >> > service
> >> > >> > > > is
> >> > >> > > > > up
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> or not
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> which could only be the case if airavata service
> >> goes
> >> > >> down
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> gracefully and
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> we have any automated way to restart it. If this
> >> is
> >> > >> just
> >> > >> > > > matter
> >> > >> > > > > of
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> routing
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> client requests to the available thrift servers
> >> then
> >> > >> this
> >> > >> > > can
> >> > >> > > > be
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> achieved
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> with the help of load balancer which I guess is
> >> > already
> >> > >> > > there
> >> > >> > > > in
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> thrift
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> wish list.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>  We have multiple thrift services and
> currently
> >> we
> >> > >> start
> >> > >> > > > only
> >> > >> > > > > > one
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> instance
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> of them and each thrift service is a stateless
> >> > >> service. To
> >> > >> > > > keep
> >> > >> > > > > > the
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> high
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> availability we have to start multiple instances
> >> of
> >> > >> them
> >> > >> > in
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> production
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  scenario. So for clients to get an available
> thrift
> >> > >> service
> >> > >> > we
> >> > >> > > > can
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> use
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  zookeeper znodes to represent each available
> >> service.
> >> > >> There
> >> > >> > > are
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> some
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  libraries which is doing similar[1] and I think we
> >> can
> >> > >> use
> >> > >> > > them
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> directly.
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> 2. As far as registering of different providers
> is
> >> > >> > concerned
> >> > >> > > > do
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> you
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  think for that we really need external store.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and
> >> > >> reliable
> >> > >> > and
> >> > >> > > > we
> >> > >> > > > > > have
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> to
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> do
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> very minimal amount of work to achieve all these
> >> > >> features
> >> > >> > to
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> Airavata
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  because zookeeper handle all the complexity.
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>  I have seen people using zookeeper more for
> >> state
> >> > >> > > management
> >> > >> > > > > in
> >> > >> > > > > > >>>>>>>>>> distributed environments.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>  +1, we might not be the most effective users
> of
> >> > >> > zookeeper
> >> > >> > > > > > because
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>> all
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> of
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>> our services are stateless services, but my
> point
> >> is
> >> > to
> >> > >> > > > achieve
> >> > >> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with
> >> > minimal
> >> > >> > work.
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>    I would like to understand more how can we
> >> > leverage
> >> > >> > > > > zookeeper
> >> > >> > > > > > in
> >> > >> > > > > > >>>>>>>>>> airavata to make system reliable.
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>  [1]
> >> https://github.com/eirslett/thrift-zookeeper
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>  Regards,
> >> > >> > > > > > >>>>>>>>>> Gagan
> >> > >> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> >> > >> > marpierc@iu.edu
> >> > >> > > >
> >> > >> > > > > > wrote:
> >> > >> > > > > > >>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing
> the
> >> > >> > > Architecture
> >> > >> > > > > > list
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>> for
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  additional comments.
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> Marlon
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> Hi All,
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> I did little research about Apache
> >> Zookeeper[1]
> >> > and
> >> > >> > how
> >> > >> > > to
> >> > >> > > > > use
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> it
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> in
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve
> fault
> >> > >> > tolerance
> >> > >> > > > and
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> reliable
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> communication between our thrift services
> and
> >> > >> clients.
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> Zookeeper
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> is a
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  distributed, fault tolerant system to do a
> >> reliable
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> communication
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  between
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> distributed applications. This is like an
> >> > in-memory
> >> > >> > file
> >> > >> > > > > > system
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> which
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  has
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can
> >> have
> >> > >> small
> >> > >> > > > > amount
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> of
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> data
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  associated with it and these nodes are called
> >> > znodes.
> >> > >> > > Clients
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> can
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  connect
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and
> >> update
> >> > >> these
> >> > >> > > > > znodes.
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple
> thrift
> >> > >> > services
> >> > >> > > > and
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> these
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> can
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  go
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if
> we
> >> > use
> >> > >> > > > zookeeper
> >> > >> > > > > > to
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> store
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  these
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> configuration(thrift service configurations)
> >> we
> >> > can
> >> > >> > > > achieve
> >> > >> > > > > a
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> very
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  reliable
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> system. Basically thrift clients can
> >> dynamically
> >> > >> > > discover
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> available
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  service
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not
> have
> >> to
> >> > >> > change
> >> > >> > > > the
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> generated
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  thrift client code but we have to change the
> >> > >> locations we
> >> > >> > > are
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> invoking
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when
> the
> >> > >> thrift
> >> > >> > > > service
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> goes
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  down
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity
> between
> >> > these
> >> > >> > > > > > operations.
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> With
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  this
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for
> >> > multiple
> >> > >> of
> >> > >> > > > > > airavata,
> >> > >> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift
> >> services.
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> For specifically for gfac we can have
> >> different
> >> > >> types
> >> > >> > of
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> services
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> for
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  each
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> provider implementation. This can be
> achieved
> >> by
> >> > >> using
> >> > >> > > the
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> hierarchical
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> support in zookeeper and providing some
> logic
> >> in
> >> > >> > > > gfac-thrift
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> service
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  to
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> register it to a defined path. Using the
> same
> >> > logic
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> orchestrator
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> can
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  discover the provider specific gfac thrift
> >> service
> >> > and
> >> > >> > > route
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> the
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>>  message to
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> the correct thrift service.
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> With this approach I think we simply have
> >> write
> >> > >> some
> >> > >> > > > client
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> code
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>> in
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  thrift
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
> >> > >> installation
> >> > >> > > can
> >> > >> > > > > be
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> done as
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  a
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> separate process and it will be easier to
> keep
> >> > the
> >> > >> > > > Zookeeper
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> server
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  separate from Airavata because installation of
> >> > >> Zookeeper
> >> > >> > > > server
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> little
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>>  complex in production scenario. I think we have
> >> to
> >> > >> make
> >> > >> > > sure
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> everything
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper
> running,
> >> > ex:
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> enable.zookeeper=false
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to
> >> > >> download
> >> > >> > and
> >> > >> > > > > start
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>> zookeeper.
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>> Thanks
> >> > >> > > > > > >>>>>>>>>>>> Lahiru
> >> > >> > > > > > >>>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>>>>
> >> > >> > > > > > >>>>>>>>> --
> >> > >> > > > > > >>>>>>>>> System Analyst Programmer
> >> > >> > > > > > >>>>>>>>> PTI Lab
> >> > >> > > > > > >>>>>>>>> Indiana University
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>>>>
> >> > >> > > > > > >>>>>>> --
> >> > >> > > > > > >>>>>>> System Analyst Programmer
> >> > >> > > > > > >>>>>>> PTI Lab
> >> > >> > > > > > >>>>>>> Indiana University
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>>
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> --
> >> > >> > > > > > >>>>>> Best Regards,
> >> > >> > > > > > >>>>>> Shameera Rathnayaka.
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT
> >> > >> gmail.com
> >> > >> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>>
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>> --
> >> > >> > > > > > >>>>> Supun Kamburugamuva
> >> > >> > > > > > >>>>> Member, Apache Software Foundation;
> >> > http://www.apache.org
> >> > >> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369
> 6762
> >> > >> > > > > > >>>>> Blog: http://supunk.blogspot.com
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>>>
> >> > >> > > > > > >>>> --
> >> > >> > > > > > >>>> System Analyst Programmer
> >> > >> > > > > > >>>> PTI Lab
> >> > >> > > > > > >>>> Indiana University
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>>
> >> > >> > > > > > >>>
> >> > >> > > > > > >>> --
> >> > >> > > > > > >>> Supun Kamburugamuva
> >> > >> > > > > > >>> Member, Apache Software Foundation;
> >> http://www.apache.org
> >> > >> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > >> > > > > > >>> Blog: http://supunk.blogspot.com
> >> > >> > > > > > >>>
> >> > >> > > > > > >>>
> >> > >> > > > > > >>>
> >> > >> > > > > > >>
> >> > >> > > > > > >
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > --
> >> > >> > > > > > Supun Kamburugamuva
> >> > >> > > > > > Member, Apache Software Foundation;
> http://www.apache.org
> >> > >> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > >> > > > > > Blog: http://supunk.blogspot.com
> >> > >> > > > > >
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > > --
> >> > >> > > > > System Analyst Programmer
> >> > >> > > > > PTI Lab
> >> > >> > > > > Indiana University
> >> > >> > > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > >
> >> > >> > > > --
> >> > >> > > > Supun Kamburugamuva
> >> > >> > > > Member, Apache Software Foundation; http://www.apache.org
> >> > >> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > >> > > > Blog: http://supunk.blogspot.com
> >> > >> > > >
> >> > >> > >
> >> > >> > >
> >> > >> > >
> >> > >> > > --
> >> > >> > > System Analyst Programmer
> >> > >> > > PTI Lab
> >> > >> > > Indiana University
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >>
> >> > >>
> >> > >> --
> >> > >> Supun Kamburugamuva
> >> > >> Member, Apache Software Foundation; http://www.apache.org
> >> > >> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > >> Blog: http://supunk.blogspot.com
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > System Analyst Programmer
> >> > > PTI Lab
> >> > > Indiana University
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > System Analyst Programmer
> >> > PTI Lab
> >> > Indiana University
> >> >
> >>
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi All,

I have finished the initial version of the ZK integration. Now we can start
multiple thrift gfac services (still the communication between orchestrator
and gfac is RPC) and orchestrator submit jobs to multiple gfac nodes.

I can kill a gfac node and orchestrator will make sure jobs are not lost,
it simply take those jobs and re-submit to gfac. Since GFac is a generic
framework and we have multiple plugins developed for that framework
checkpointing the plugin is up to the plugin developers but gfac
checkpoints whether those plugins invoked or not.

I have introduced a new interface for plugin development called Recoverable
(RecoverableHandlers and RecoverableProvider). So state-full plugins has to
implement their recover method and gfac framework will make sure it will be
invoked during a re-run scenario. If a plugin is not recoverable and
already ran(can be found using framework checkpointing) during the re-run
that plugin will not be invoked. For now I just implemented recoverability
to few plugins and I have tested submitting a job to trestles and let it
submit and come to monitoring state and kill that gfac instance. Now
Orchestrator pick that execution and re-submit to another gfac node and
that gfac node does not re-run that job to the computing resource, but
simply start monitoring once the job is done outputs are downloaded from
the original output location.

When a particular experiment is finished all the ZK data is removed.

At this point following things needs to be done,

1. Figure out all the state-full handlers/providers and implement
recoverability,

Ex: Input handler is transfering 1000 files and with 500 files gfac
instance crashed, during the re-run it should be able to tranfer from 501
file. Same logic can be applied to a single huge file. Those things are
completely up to the plugin developer.

2. Then we have to do remove the RPC invocation and make gfac nodes as
worker nodes.

Regards
Lahiru


On Wed, Jun 18, 2014 at 12:11 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Eran,
>
>
> On Tue, Jun 17, 2014 at 4:06 PM, Eran Chinthaka Withana <
> eran.chinthaka@gmail.com> wrote:
>
>> Storm has a Kafka spout which manages the cursor location (pointer to the
>> head of the queue representing the next message to be processed) inside
>> ZK.
>> Each storm spout instance uses this information to get the next item to
>> process. Storm kafka spout won't advance to the next message until it gets
>> an ack from the storm topology.
>>
> If we have 10 jobs in the queue and 5 GFAC instances picked 1 at a time
> and successfully submitted and have to start taking rest of the jobs. But
> all 5 GFAC instances are responsible for initially picked  5 jobs because
> they are still running and gfac instances are monitoring them until its
> done but at the same time we have to move the cursor to pick other jobs
> too.
>
> If we Ack and moved the cursor just after submission without waiting until
> the job is actually finished how are we going to know which gfac is
> monitoring which set of jobs ?
>
> I am not getting how achieve above requirement with this suggestion. May
> be I am missing something here.
>
> Regards
> Lahiru
>
>>
>> So, if there is an exception in the topology and ack is sent only by the
>> last bolt, then storm bolt make sure all messages are processed since
>> exceptions won't generate acks.
>>
>> Thanks,
>> Eran Chinthaka Withana
>>
>>
>> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>>
>> > Hi Eran,
>> >
>> > I think I should take back my last email. When I carefully look at
>> storm I
>> > have following question.
>> >
>> > How are we going to store the Job statuses  and relaunch the jobs which
>> was
>> > running in failure nodes ? Its true that storm is starting new workers
>> but
>> > there should be a way to find missing jobs by someone in the system.
>> Since
>> > we are not having a data stream there is no use to start new workers
>> unless
>> > we handler the missing jobs. I think we need to have a better control of
>> > our component and persist the states of jobs each GFAC node is handling.
>> > Directly using zookeeper will let us to do a proper fault tolerance
>> > implementation.
>> >
>> > Regards
>> > Lahiru
>> >
>> >
>> >
>> > On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
>> > wrote:
>> >
>> > > Hi Supun,
>> > >
>> > > I think in this usecase we only use storm topology to do the
>> > communication
>> > > among workers and we are completely ignoring the stream processing
>> part.
>> > > Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes
>> in
>> > the
>> > > storm topology. But I think we can achieve extremely fault tolerance
>> > system
>> > > by directly using storm based on following statement in storm site
>> with
>> > > minimum changes in airavata.
>> > >
>> > > Additionally, the Nimbus daemon and Supervisor daemons are fail-fast
>> and
>> > > stateless; all state is kept in Zookeeper or on local disk. This means
>> > you
>> > > can kill -9 Nimbus or the Supervisors and they’ll start back up like
>> > > nothing happened. This design leads to Storm clusters being incredibly
>> > > stable.
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <
>> supun06@gmail.com>
>> > > wrote:
>> > >
>> > >> Hi Eran,
>> > >>
>> > >> I'm using Storm every day and this is one of the strangest things
>> I've
>> > >> heard about using Storm. My be there are more use cases for Storm
>> other
>> > >> than Distributed Stream processing. AFAIK the Bolts, spouts are
>> built to
>> > >> handle a stream of events that doesn't take much time to process. In
>> > >> Airavata we don't process the messages. Instead we run experiments
>> based
>> > >> on
>> > >> the commands given.
>> > >>
>> > >> If you want process isolation, distributed execution, cluster
>> resource
>> > >> management Yarn would be a better thing to explore.
>> > >>
>> > >> Thanks,
>> > >> Supun..
>> > >>
>> > >>
>> > >> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
>> > >> eran.chinthaka@gmail.com> wrote:
>> > >>
>> > >> > Hi Lahiru,
>> > >> >
>> > >> > good summarization. Thanks Lahiru.
>> > >> >
>> > >> > I think you are trying to stick to a model where Orchestrator
>> > >> distributing
>> > >> > to work for GFac worker and trying to do the impedance mismatch
>> > through
>> > >> a
>> > >> > messaging solution. If you step back and think, we don't even want
>> the
>> > >> > orchestrator to handle everything. From its point of view, it
>> should
>> > >> submit
>> > >> > jobs to the framework, and will wait or get notified once the job
>> is
>> > >> done.
>> > >> >
>> > >> > There are multiple ways of doing this. And here is one method.
>> > >> >
>> > >> > Orchestrator submits all its jobs to Job queue (implemented using
>> any
>> > MQ
>> > >> > impl like Rabbit or Kafka). A storm topology is implemented to
>> dequeue
>> > >> > messages, process them (i.e. submit those jobs and get those
>> executed)
>> > >> and
>> > >> > notify the Orchestrator with the status (either through another
>> > >> > JobCompletionQueue or direct invocation).
>> > >> >
>> > >> > With this approach, the MQ provider will help to match impedance
>> > between
>> > >> > job submission and consumption. Storm helps with worker
>> coordination,
>> > >> load
>> > >> > balancing, throttling on your job execution framework, worker pool
>> > >> > management and fault tolerance.
>> > >> >
>> > >> > Of course, you can implement this based only on ZK and handle
>> > everything
>> > >> > else on your own but storm had done exactly that with the use of ZK
>> > >> > underneath.
>> > >> >
>> > >> > Finally, if you go for a model like this, then even beyond job
>> > >> submission,
>> > >> > you can use the same model to do anything within the framework for
>> > >> internal
>> > >> > communication. For example, the workflow engine will submit its
>> jobs
>> > to
>> > >> > queues based on what it has to do. Storm topologies exists for each
>> > >> queues
>> > >> > to dequeue messages and carry out the work in a reliable manner.
>> > >> Consider
>> > >> > this as mini-workflows within a larger workflow framework.
>> > >> >
>> > >> > We can have a voice chat if its more convenient. But not at 7am
>> PST :)
>> > >> >
>> > >> >
>> > >> > Thanks,
>> > >> > Eran Chinthaka Withana
>> > >> >
>> > >> >
>> > >> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
>> > glahiru@gmail.com
>> > >> >
>> > >> > wrote:
>> > >> >
>> > >> > > Hi All,
>> > >> > >
>> > >> > > Ignoring the tool that we are going to use to implement fault
>> > >> tolerance I
>> > >> > > have summarized the model we have decided so far. I will use the
>> > tool
>> > >> > name
>> > >> > > as X, we can use Zookeeper or some other implementation.
>> Following
>> > >> design
>> > >> > > assume tool X  and Registry have high availability.
>> > >> > >
>> > >> > > 1. Orchestrator and GFAC worker node communication is going to be
>> > >> queue
>> > >> > > based and tool X is going to be used for this communication. (We
>> > have
>> > >> to
>> > >> > > implement this with considering race condition between different
>> > gfac
>> > >> > > workers).
>> > >> > > 2. We are having multiple instances of GFAC which are identical
>> (In
>> > >> > future
>> > >> > > we can group gfac workers). Existence of each worker node is
>> > >> identified
>> > >> > > using X. If node goes down orchestrator will be notified by X.
>> > >> > > 3. When a particular request comes and accepted by one gfac
>> worker
>> > >> that
>> > >> > > information will be replicated in tool X and a place where this
>> > >> > information
>> > >> > > is persisted even the worker failed.
>> > >> > > 4. When a job comes to a final state like failed or cancelled or
>> > >> > completed
>> > >> > > above information will be removed. So at a given time
>> orchestrator
>> > can
>> > >> > poll
>> > >> > > active jobs in each worker by giving a worker ID.
>> > >> > > 5. Tool X will make sure that when a worker goes down it will
>> notify
>> > >> > > orchestrator. During a worker failure, based on step 3 and 4
>> > >> orchestrator
>> > >> > > can poll all the active jobs of that worker and do the same thing
>> > >> like in
>> > >> > > step 1 (store the experiment ID to the queue) and gfac worker
>> will
>> > >> pick
>> > >> > the
>> > >> > > jobs.
>> > >> > >
>> > >> > > 6. When GFAC receive a job like in step 5 it have to carefully
>> > >> evaluate
>> > >> > the
>> > >> > > state from registry and decide what to be done (If the job is
>> > pending
>> > >> > then
>> > >> > > gfac just have to monitor, if job state is like input transferred
>> > not
>> > >> > even
>> > >> > > submitted gfac has to execute rest of the chain and submit the
>> job
>> > to
>> > >> the
>> > >> > > resource and start monitoring).
>> > >> > >
>> > >> > > If we can find a tool X which supports all these features and
>> tool
>> > >> itself
>> > >> > > is fault tolerance and support atomicity, high availability and
>> > simply
>> > >> > API
>> > >> > > to implement we can use that tool.
>> > >> > >
>> > >> > > WDYT ?
>> > >> > >
>> > >> > > Lahiru
>> > >> > >
>> > >> > >
>> > >> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
>> > >> supun06@gmail.com>
>> > >> > > wrote:
>> > >> > >
>> > >> > > > Hi Lahiru,
>> > >> > > >
>> > >> > > > Before moving with an implementation it may be worth to
>> consider
>> > >> some
>> > >> > of
>> > >> > > > the following aspects as well.
>> > >> > > >
>> > >> > > > 1. How to report the progress of an experiment as state in
>> > >> ZooKeeper?
>> > >> > > What
>> > >> > > > happens if a GFac instance crashes while executing an
>> experiment?
>> > >> Are
>> > >> > > there
>> > >> > > > check-points we can save so that another GFac instance can take
>> > >> over?
>> > >> > > > 2. What is the threading model of GFac instances? (I consider
>> this
>> > >> as a
>> > >> > > > very important aspect)
>> > >> > > > 3. What are the information needed to be stored in the
>> ZooKeeper?
>> > >> You
>> > >> > may
>> > >> > > > need to store other information about an experiment apart from
>> its
>> > >> > > > experiment ID.
>> > >> > > > 4. How to report errors?
>> > >> > > > 5. For GFac weather you need a threading model or worker
>> process
>> > >> model?
>> > >> > > >
>> > >> > > > Thanks,
>> > >> > > > Supun..
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
>> > >> glahiru@gmail.com
>> > >> > >
>> > >> > > > wrote:
>> > >> > > >
>> > >> > > > > Hi All,
>> > >> > > > >
>> > >> > > > > I think the conclusion is like this,
>> > >> > > > >
>> > >> > > > > 1, We make the gfac as a worker not a thrift service and we
>> can
>> > >> start
>> > >> > > > > multiple workers either with bunch of providers and handlers
>> > >> > configured
>> > >> > > > in
>> > >> > > > > each worker or provider specific  workers to handle the class
>> > path
>> > >> > > issues
>> > >> > > > > (not the common scenario).
>> > >> > > > >
>> > >> > > > > 2. Gfac workers can be configured to watch for a given path
>> in
>> > >> > > zookeeper,
>> > >> > > > > and multiple workers can listen to the same path. Default
>> path
>> > >> can be
>> > >> > > > > /airavata/gfac or can configure paths like
>> /airavata/gfac/gsissh
>> > >> > > > > /airavata/gfac/bes.
>> > >> > > > >
>> > >> > > > > 3. Orchestrator can configure with a logic to store
>> experiment
>> > >> IDs in
>> > >> > > > > zookeeper with a path, and orchestrator can be configured to
>> > >> provider
>> > >> > > > > specific path logic too. So when a new request come
>> orchestrator
>> > >> > store
>> > >> > > > the
>> > >> > > > > experimentID and these experiments IDs are stored in Zk as a
>> > >> queue.
>> > >> > > > >
>> > >> > > > > 4. Since gfac workers are watching they will be notified and
>> as
>> > >> supun
>> > >> > > > > suggested can use a leader selection algorithm[1] and one
>> gfac
>> > >> worker
>> > >> > > >  will
>> > >> > > > > take the leadership for each experiment. If there are gfac
>> > >> instances
>> > >> > > for
>> > >> > > > > each provider same logic will apply among those nodes with
>> same
>> > >> > > provider
>> > >> > > > > type.
>> > >> > > > >
>> > >> > > > > [1]
>> > http://curator.apache.org/curator-recipes/leader-election.html
>> > >> > > > >
>> > >> > > > > I would like to implement this if there are  no objections.
>> > >> > > > >
>> > >> > > > > Lahiru
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
>> > >> > > supun06@gmail.com
>> > >> > > > >
>> > >> > > > > wrote:
>> > >> > > > >
>> > >> > > > > > Hi Marlon,
>> > >> > > > > >
>> > >> > > > > > I think you are exactly correct.
>> > >> > > > > >
>> > >> > > > > > Supun..
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
>> > >> marpierc@iu.edu>
>> > >> > > > wrote:
>> > >> > > > > >
>> > >> > > > > > > Let me restate this, and please tell me if I'm wrong.
>> > >> > > > > > >
>> > >> > > > > > > Orchestrator decides (somehow) that a particular job
>> > requires
>> > >> > > > JSDL/BES,
>> > >> > > > > > so
>> > >> > > > > > > it places the Experiment ID in Zookeeper's
>> > >> > /airavata/gfac/jsdl-bes
>> > >> > > > > node.
>> > >> > > > > > >  GFAC servers associated with this instance notice the
>> > update.
>> > >> >  The
>> > >> > > > > first
>> > >> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to
>> get
>> > >> the
>> > >> > > > > detailed
>> > >> > > > > > > information it needs from the Registry.  ZooKeeper
>> handles
>> > the
>> > >> > > > locking,
>> > >> > > > > > etc
>> > >> > > > > > > to make sure that only one GFAC at a time is trying to
>> > handle
>> > >> an
>> > >> > > > > > experiment.
>> > >> > > > > > >
>> > >> > > > > > > Marlon
>> > >> > > > > > >
>> > >> > > > > > >
>> > >> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>> > >> > > > > > >
>> > >> > > > > > >> Hi Supun,
>> > >> > > > > > >>
>> > >> > > > > > >> Thanks for the clarification.
>> > >> > > > > > >>
>> > >> > > > > > >> Regards
>> > >> > > > > > >> Lahiru
>> > >> > > > > > >>
>> > >> > > > > > >>
>> > >> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
>> > >> > > > > > supun06@gmail.com>
>> > >> > > > > > >> wrote:
>> > >> > > > > > >>
>> > >> > > > > > >>  Hi Lahiru,
>> > >> > > > > > >>>
>> > >> > > > > > >>> My suggestion is that may be you don't need a Thrift
>> > service
>> > >> > > > between
>> > >> > > > > > >>> Orchestrator and the component executing the
>> experiment.
>> > >> When a
>> > >> > > new
>> > >> > > > > > >>> experiment is submitted, orchestrator decides who can
>> > >> execute
>> > >> > > this
>> > >> > > > > job.
>> > >> > > > > > >>> Then it put the information about this experiment
>> > execution
>> > >> in
>> > >> > > > > > ZooKeeper.
>> > >> > > > > > >>> The component which wants to executes the experiment is
>> > >> > listening
>> > >> > > > to
>> > >> > > > > > this
>> > >> > > > > > >>> ZooKeeper path and when it sees the experiment it will
>> > >> execute
>> > >> > > it.
>> > >> > > > So
>> > >> > > > > > >>> that
>> > >> > > > > > >>> the communication happens through an state change in
>> > >> ZooKeeper.
>> > >> > > > This
>> > >> > > > > > can
>> > >> > > > > > >>> potentially simply your architecture.
>> > >> > > > > > >>>
>> > >> > > > > > >>> Thanks,
>> > >> > > > > > >>> Supun.
>> > >> > > > > > >>>
>> > >> > > > > > >>>
>> > >> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
>> > >> > > > > > glahiru@gmail.com>
>> > >> > > > > > >>> wrote:
>> > >> > > > > > >>>
>> > >> > > > > > >>>  Hi Supun,
>> > >> > > > > > >>>>
>> > >> > > > > > >>>> So your suggestion is to create a znode for each
>> thrift
>> > >> > service
>> > >> > > we
>> > >> > > > > > have
>> > >> > > > > > >>>> and
>> > >> > > > > > >>>> when the request comes that node gets modified with
>> input
>> > >> data
>> > >> > > for
>> > >> > > > > > that
>> > >> > > > > > >>>> request and thrift service is having a watch for that
>> > node
>> > >> and
>> > >> > > it
>> > >> > > > > will
>> > >> > > > > > >>>> be
>> > >> > > > > > >>>> notified because of the watch and it can read the
>> input
>> > >> from
>> > >> > > > > zookeeper
>> > >> > > > > > >>>> and
>> > >> > > > > > >>>> invoke the operation?
>> > >> > > > > > >>>>
>> > >> > > > > > >>>> Lahiru
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>
>> > >> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva
>> <
>> > >> > > > > > >>>> supun06@gmail.com>
>> > >> > > > > > >>>> wrote:
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>  Hi all,
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
>> > >> > Airavata
>> > >> > > > > there
>> > >> > > > > > >>>>> are
>> > >> > > > > > >>>>> many components and these components must be
>> stateless
>> > to
>> > >> > > achieve
>> > >> > > > > > >>>>> scalability and reliability.Also there must be a
>> > >> mechanism to
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> communicate
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> between the components. At the moment Airavata uses
>> RPC
>> > >> calls
>> > >> > > > based
>> > >> > > > > > on
>> > >> > > > > > >>>>> Thrift for the communication.
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> ZooKeeper can be used both as a place to hold state
>> and
>> > >> as a
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> communication
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> layer between the components. I'm involved with a
>> > project
>> > >> > that
>> > >> > > > has
>> > >> > > > > > many
>> > >> > > > > > >>>>> distributed components like AIravata. Right now we
>> use
>> > >> Thrift
>> > >> > > > > > services
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> to
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> communicate among the components. But we find it
>> > >> difficult to
>> > >> > > use
>> > >> > > > > RPC
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> calls
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> and achieve stateless behaviour and thinking of
>> > replacing
>> > >> > > Thrift
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> services
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> with ZooKeeper based communication layer. So I think
>> it
>> > is
>> > >> > > better
>> > >> > > > > to
>> > >> > > > > > >>>>> explore the possibility of removing the Thrift
>> services
>> > >> > between
>> > >> > > > the
>> > >> > > > > > >>>>> components and use ZooKeeper as a communication
>> > mechanism
>> > >> > > between
>> > >> > > > > the
>> > >> > > > > > >>>>> services. If you do this you will have to move the
>> state
>> > >> to
>> > >> > > > > ZooKeeper
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> and
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> will automatically achieve the stateless behaviour in
>> > the
>> > >> > > > > components.
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a
>> bad
>> > >> idea.
>> > >> > > If
>> > >> > > > we
>> > >> > > > > > are
>> > >> > > > > > >>>>> trying to integrate something fundamentally
>> important to
>> > >> > > > > architecture
>> > >> > > > > > >>>>> as
>> > >> > > > > > >>>>> how to store state, we shouldn't make it optional.
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> Thanks,
>> > >> > > > > > >>>>> Supun..
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera
>> Rathnayaka <
>> > >> > > > > > >>>>> shameerainfo@gmail.com> wrote:
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>>  Hi Lahiru,
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> As i understood,  not only reliability , you are
>> trying
>> > >> to
>> > >> > > > achieve
>> > >> > > > > > >>>>>> some
>> > >> > > > > > >>>>>> other requirement by introducing zookeeper, like
>> health
>> > >> > > > monitoring
>> > >> > > > > > of
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>> the
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> services, categorization with service implementation
>> etc
>> > >> ...
>> > >> > .
>> > >> > > In
>> > >> > > > > > that
>> > >> > > > > > >>>>>> case, i think we can get use of zookeeper's features
>> > but
>> > >> if
>> > >> > we
>> > >> > > > > only
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>> focus
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> on reliability, i have little bit of concern, why
>> can't
>> > we
>> > >> > use
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>> clustering +
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> LB ?
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite
>> if
>> > >> user
>> > >> > > need
>> > >> > > > > to
>> > >> > > > > > >>>>>> use
>> > >> > > > > > >>>>>> it.
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> Thanks,
>> > >> > > > > > >>>>>>   Shameera.
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake
>> <
>> > >> > > > > > >>>>>> glahiru@gmail.com
>> > >> > > > > > >>>>>> wrote:
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>>  Hi Gagan,
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>> I need to start another discussion about it, but I
>> had
>> > >> an
>> > >> > > > offline
>> > >> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will
>> > start
>> > >> > > another
>> > >> > > > > > >>>>>>> thread
>> > >> > > > > > >>>>>>> about this topic too.
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>> Regards
>> > >> > > > > > >>>>>>> Lahiru
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>> gagandeepjuneja@gmail.com
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> wrote:
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added
>> to
>> > my
>> > >> > > > > dictionary
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>> :).
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  I would like to know how are we planning to start
>> > >> multiple
>> > >> > > > > servers.
>> > >> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times
>> we
>> > >> call
>> > >> > it
>> > >> > > > as
>> > >> > > > > > auto
>> > >> > > > > > >>>>>>>> scalable.
>> > >> > > > > > >>>>>>>> 2. To make some specific number of nodes available
>> > >> such as
>> > >> > > we
>> > >> > > > > > want 2
>> > >> > > > > > >>>>>>>> servers to be available at any time so if one goes
>> > down
>> > >> > > then I
>> > >> > > > > > need
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>> to
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  spawn one new to make available servers count 2.
>> > >> > > > > > >>>>>>>> 3. Initially start all the servers.
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but
>> I
>> > >> don't
>> > >> > > > > believe
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>> existing
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> architecture support this?
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>>> Regards,
>> > >> > > > > > >>>>>>>> Gagan
>> > >> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
>> > >> > > > glahiru@gmail.com
>> > >> > > > > >
>> > >> > > > > > >>>>>>>>
>> > >> > > > > > >>>>>>> wrote:
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> Hi Gagan,
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
>> > >> comments.
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> wrote:
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>  Hi Lahiru,
>> > >> > > > > > >>>>>>>>>> Just my 2 cents.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against
>> adding
>> > >> > multiple
>> > >> > > > > hops
>> > >> > > > > > in
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> the
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> system which can add unnecessary complexity. Here
>> I
>> > am
>> > >> not
>> > >> > > > able
>> > >> > > > > to
>> > >> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be
>> I am
>> > >> > wrong
>> > >> > > > > > because
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> of
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> less
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I
>> would
>> > >> like
>> > >> > > to
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> discuss
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  following point.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>> 1. How it will help us in making system more
>> > >> reliable.
>> > >> > > > > Zookeeper
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> is
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> not
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> able to restart services. At max it can tell
>> whether
>> > >> > service
>> > >> > > > is
>> > >> > > > > up
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> or not
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> which could only be the case if airavata service
>> goes
>> > >> down
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> gracefully and
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> we have any automated way to restart it. If this
>> is
>> > >> just
>> > >> > > > matter
>> > >> > > > > of
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> routing
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> client requests to the available thrift servers
>> then
>> > >> this
>> > >> > > can
>> > >> > > > be
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> achieved
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> with the help of load balancer which I guess is
>> > already
>> > >> > > there
>> > >> > > > in
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> thrift
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> wish list.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>  We have multiple thrift services and currently
>> we
>> > >> start
>> > >> > > > only
>> > >> > > > > > one
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> instance
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> of them and each thrift service is a stateless
>> > >> service. To
>> > >> > > > keep
>> > >> > > > > > the
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> high
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> availability we have to start multiple instances
>> of
>> > >> them
>> > >> > in
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> production
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  scenario. So for clients to get an available thrift
>> > >> service
>> > >> > we
>> > >> > > > can
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> use
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  zookeeper znodes to represent each available
>> service.
>> > >> There
>> > >> > > are
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> some
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  libraries which is doing similar[1] and I think we
>> can
>> > >> use
>> > >> > > them
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> directly.
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> 2. As far as registering of different providers is
>> > >> > concerned
>> > >> > > > do
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>> you
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  think for that we really need external store.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and
>> > >> reliable
>> > >> > and
>> > >> > > > we
>> > >> > > > > > have
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> to
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> do
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> very minimal amount of work to achieve all these
>> > >> features
>> > >> > to
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> Airavata
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  because zookeeper handle all the complexity.
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>  I have seen people using zookeeper more for
>> state
>> > >> > > management
>> > >> > > > > in
>> > >> > > > > > >>>>>>>>>> distributed environments.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>  +1, we might not be the most effective users of
>> > >> > zookeeper
>> > >> > > > > > because
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>> all
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> of
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>> our services are stateless services, but my point
>> is
>> > to
>> > >> > > > achieve
>> > >> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with
>> > minimal
>> > >> > work.
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>    I would like to understand more how can we
>> > leverage
>> > >> > > > > zookeeper
>> > >> > > > > > in
>> > >> > > > > > >>>>>>>>>> airavata to make system reliable.
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>  [1]
>> https://github.com/eirslett/thrift-zookeeper
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>  Regards,
>> > >> > > > > > >>>>>>>>>> Gagan
>> > >> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
>> > >> > marpierc@iu.edu
>> > >> > > >
>> > >> > > > > > wrote:
>> > >> > > > > > >>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
>> > >> > > Architecture
>> > >> > > > > > list
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>> for
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  additional comments.
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> Marlon
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> Hi All,
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> I did little research about Apache
>> Zookeeper[1]
>> > and
>> > >> > how
>> > >> > > to
>> > >> > > > > use
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> it
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> in
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
>> > >> > tolerance
>> > >> > > > and
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> reliable
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> communication between our thrift services and
>> > >> clients.
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> Zookeeper
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> is a
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  distributed, fault tolerant system to do a
>> reliable
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> communication
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  between
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> distributed applications. This is like an
>> > in-memory
>> > >> > file
>> > >> > > > > > system
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> which
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  has
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can
>> have
>> > >> small
>> > >> > > > > amount
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> of
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> data
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  associated with it and these nodes are called
>> > znodes.
>> > >> > > Clients
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> can
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  connect
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and
>> update
>> > >> these
>> > >> > > > > znodes.
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
>> > >> > services
>> > >> > > > and
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> these
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> can
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  go
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we
>> > use
>> > >> > > > zookeeper
>> > >> > > > > > to
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> store
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  these
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> configuration(thrift service configurations)
>> we
>> > can
>> > >> > > > achieve
>> > >> > > > > a
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> very
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  reliable
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> system. Basically thrift clients can
>> dynamically
>> > >> > > discover
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> available
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  service
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have
>> to
>> > >> > change
>> > >> > > > the
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> generated
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  thrift client code but we have to change the
>> > >> locations we
>> > >> > > are
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> invoking
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when the
>> > >> thrift
>> > >> > > > service
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> goes
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  down
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between
>> > these
>> > >> > > > > > operations.
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> With
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  this
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for
>> > multiple
>> > >> of
>> > >> > > > > > airavata,
>> > >> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift
>> services.
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> For specifically for gfac we can have
>> different
>> > >> types
>> > >> > of
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> services
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> for
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  each
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved
>> by
>> > >> using
>> > >> > > the
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> hierarchical
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic
>> in
>> > >> > > > gfac-thrift
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> service
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  to
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same
>> > logic
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> orchestrator
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> can
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  discover the provider specific gfac thrift
>> service
>> > and
>> > >> > > route
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> the
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>>  message to
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> the correct thrift service.
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> With this approach I think we simply have
>> write
>> > >> some
>> > >> > > > client
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> code
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>> in
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  thrift
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
>> > >> installation
>> > >> > > can
>> > >> > > > > be
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> done as
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  a
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep
>> > the
>> > >> > > > Zookeeper
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> server
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  separate from Airavata because installation of
>> > >> Zookeeper
>> > >> > > > server
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> little
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>>  complex in production scenario. I think we have
>> to
>> > >> make
>> > >> > > sure
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> everything
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running,
>> > ex:
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> enable.zookeeper=false
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to
>> > >> download
>> > >> > and
>> > >> > > > > start
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>> zookeeper.
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>> Thanks
>> > >> > > > > > >>>>>>>>>>>> Lahiru
>> > >> > > > > > >>>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>>>>
>> > >> > > > > > >>>>>>>>> --
>> > >> > > > > > >>>>>>>>> System Analyst Programmer
>> > >> > > > > > >>>>>>>>> PTI Lab
>> > >> > > > > > >>>>>>>>> Indiana University
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>>>>
>> > >> > > > > > >>>>>>> --
>> > >> > > > > > >>>>>>> System Analyst Programmer
>> > >> > > > > > >>>>>>> PTI Lab
>> > >> > > > > > >>>>>>> Indiana University
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>>
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> --
>> > >> > > > > > >>>>>> Best Regards,
>> > >> > > > > > >>>>>> Shameera Rathnayaka.
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT
>> > >> gmail.com
>> > >> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>>
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>> --
>> > >> > > > > > >>>>> Supun Kamburugamuva
>> > >> > > > > > >>>>> Member, Apache Software Foundation;
>> > http://www.apache.org
>> > >> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > >> > > > > > >>>>> Blog: http://supunk.blogspot.com
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>>>
>> > >> > > > > > >>>> --
>> > >> > > > > > >>>> System Analyst Programmer
>> > >> > > > > > >>>> PTI Lab
>> > >> > > > > > >>>> Indiana University
>> > >> > > > > > >>>>
>> > >> > > > > > >>>>
>> > >> > > > > > >>>
>> > >> > > > > > >>> --
>> > >> > > > > > >>> Supun Kamburugamuva
>> > >> > > > > > >>> Member, Apache Software Foundation;
>> http://www.apache.org
>> > >> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > >> > > > > > >>> Blog: http://supunk.blogspot.com
>> > >> > > > > > >>>
>> > >> > > > > > >>>
>> > >> > > > > > >>>
>> > >> > > > > > >>
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > --
>> > >> > > > > > Supun Kamburugamuva
>> > >> > > > > > Member, Apache Software Foundation; http://www.apache.org
>> > >> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > >> > > > > > Blog: http://supunk.blogspot.com
>> > >> > > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > --
>> > >> > > > > System Analyst Programmer
>> > >> > > > > PTI Lab
>> > >> > > > > Indiana University
>> > >> > > > >
>> > >> > > >
>> > >> > > >
>> > >> > > >
>> > >> > > > --
>> > >> > > > Supun Kamburugamuva
>> > >> > > > Member, Apache Software Foundation; http://www.apache.org
>> > >> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > >> > > > Blog: http://supunk.blogspot.com
>> > >> > > >
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > --
>> > >> > > System Analyst Programmer
>> > >> > > PTI Lab
>> > >> > > Indiana University
>> > >> > >
>> > >> >
>> > >>
>> > >>
>> > >>
>> > >> --
>> > >> Supun Kamburugamuva
>> > >> Member, Apache Software Foundation; http://www.apache.org
>> > >> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > >> Blog: http://supunk.blogspot.com
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > System Analyst Programmer
>> > > PTI Lab
>> > > Indiana University
>> > >
>> >
>> >
>> >
>> > --
>> > System Analyst Programmer
>> > PTI Lab
>> > Indiana University
>> >
>>
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Eran,


On Tue, Jun 17, 2014 at 4:06 PM, Eran Chinthaka Withana <
eran.chinthaka@gmail.com> wrote:

> Storm has a Kafka spout which manages the cursor location (pointer to the
> head of the queue representing the next message to be processed) inside ZK.
> Each storm spout instance uses this information to get the next item to
> process. Storm kafka spout won't advance to the next message until it gets
> an ack from the storm topology.
>
If we have 10 jobs in the queue and 5 GFAC instances picked 1 at a time and
successfully submitted and have to start taking rest of the jobs. But all 5
GFAC instances are responsible for initially picked  5 jobs because they
are still running and gfac instances are monitoring them until its done but
at the same time we have to move the cursor to pick other jobs too.

If we Ack and moved the cursor just after submission without waiting until
the job is actually finished how are we going to know which gfac is
monitoring which set of jobs ?

I am not getting how achieve above requirement with this suggestion. May be
I am missing something here.

Regards
Lahiru

>
> So, if there is an exception in the topology and ack is sent only by the
> last bolt, then storm bolt make sure all messages are processed since
> exceptions won't generate acks.
>
> Thanks,
> Eran Chinthaka Withana
>
>
> On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi Eran,
> >
> > I think I should take back my last email. When I carefully look at storm
> I
> > have following question.
> >
> > How are we going to store the Job statuses  and relaunch the jobs which
> was
> > running in failure nodes ? Its true that storm is starting new workers
> but
> > there should be a way to find missing jobs by someone in the system.
> Since
> > we are not having a data stream there is no use to start new workers
> unless
> > we handler the missing jobs. I think we need to have a better control of
> > our component and persist the states of jobs each GFAC node is handling.
> > Directly using zookeeper will let us to do a proper fault tolerance
> > implementation.
> >
> > Regards
> > Lahiru
> >
> >
> >
> > On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
> > wrote:
> >
> > > Hi Supun,
> > >
> > > I think in this usecase we only use storm topology to do the
> > communication
> > > among workers and we are completely ignoring the stream processing
> part.
> > > Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in
> > the
> > > storm topology. But I think we can achieve extremely fault tolerance
> > system
> > > by directly using storm based on following statement in storm site with
> > > minimum changes in airavata.
> > >
> > > Additionally, the Nimbus daemon and Supervisor daemons are fail-fast
> and
> > > stateless; all state is kept in Zookeeper or on local disk. This means
> > you
> > > can kill -9 Nimbus or the Supervisors and they’ll start back up like
> > > nothing happened. This design leads to Storm clusters being incredibly
> > > stable.
> > >
> > >
> > >
> > >
> > > On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <
> supun06@gmail.com>
> > > wrote:
> > >
> > >> Hi Eran,
> > >>
> > >> I'm using Storm every day and this is one of the strangest things I've
> > >> heard about using Storm. My be there are more use cases for Storm
> other
> > >> than Distributed Stream processing. AFAIK the Bolts, spouts are built
> to
> > >> handle a stream of events that doesn't take much time to process. In
> > >> Airavata we don't process the messages. Instead we run experiments
> based
> > >> on
> > >> the commands given.
> > >>
> > >> If you want process isolation, distributed execution, cluster resource
> > >> management Yarn would be a better thing to explore.
> > >>
> > >> Thanks,
> > >> Supun..
> > >>
> > >>
> > >> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
> > >> eran.chinthaka@gmail.com> wrote:
> > >>
> > >> > Hi Lahiru,
> > >> >
> > >> > good summarization. Thanks Lahiru.
> > >> >
> > >> > I think you are trying to stick to a model where Orchestrator
> > >> distributing
> > >> > to work for GFac worker and trying to do the impedance mismatch
> > through
> > >> a
> > >> > messaging solution. If you step back and think, we don't even want
> the
> > >> > orchestrator to handle everything. From its point of view, it should
> > >> submit
> > >> > jobs to the framework, and will wait or get notified once the job is
> > >> done.
> > >> >
> > >> > There are multiple ways of doing this. And here is one method.
> > >> >
> > >> > Orchestrator submits all its jobs to Job queue (implemented using
> any
> > MQ
> > >> > impl like Rabbit or Kafka). A storm topology is implemented to
> dequeue
> > >> > messages, process them (i.e. submit those jobs and get those
> executed)
> > >> and
> > >> > notify the Orchestrator with the status (either through another
> > >> > JobCompletionQueue or direct invocation).
> > >> >
> > >> > With this approach, the MQ provider will help to match impedance
> > between
> > >> > job submission and consumption. Storm helps with worker
> coordination,
> > >> load
> > >> > balancing, throttling on your job execution framework, worker pool
> > >> > management and fault tolerance.
> > >> >
> > >> > Of course, you can implement this based only on ZK and handle
> > everything
> > >> > else on your own but storm had done exactly that with the use of ZK
> > >> > underneath.
> > >> >
> > >> > Finally, if you go for a model like this, then even beyond job
> > >> submission,
> > >> > you can use the same model to do anything within the framework for
> > >> internal
> > >> > communication. For example, the workflow engine will submit its jobs
> > to
> > >> > queues based on what it has to do. Storm topologies exists for each
> > >> queues
> > >> > to dequeue messages and carry out the work in a reliable manner.
> > >> Consider
> > >> > this as mini-workflows within a larger workflow framework.
> > >> >
> > >> > We can have a voice chat if its more convenient. But not at 7am PST
> :)
> > >> >
> > >> >
> > >> > Thanks,
> > >> > Eran Chinthaka Withana
> > >> >
> > >> >
> > >> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
> > glahiru@gmail.com
> > >> >
> > >> > wrote:
> > >> >
> > >> > > Hi All,
> > >> > >
> > >> > > Ignoring the tool that we are going to use to implement fault
> > >> tolerance I
> > >> > > have summarized the model we have decided so far. I will use the
> > tool
> > >> > name
> > >> > > as X, we can use Zookeeper or some other implementation. Following
> > >> design
> > >> > > assume tool X  and Registry have high availability.
> > >> > >
> > >> > > 1. Orchestrator and GFAC worker node communication is going to be
> > >> queue
> > >> > > based and tool X is going to be used for this communication. (We
> > have
> > >> to
> > >> > > implement this with considering race condition between different
> > gfac
> > >> > > workers).
> > >> > > 2. We are having multiple instances of GFAC which are identical
> (In
> > >> > future
> > >> > > we can group gfac workers). Existence of each worker node is
> > >> identified
> > >> > > using X. If node goes down orchestrator will be notified by X.
> > >> > > 3. When a particular request comes and accepted by one gfac worker
> > >> that
> > >> > > information will be replicated in tool X and a place where this
> > >> > information
> > >> > > is persisted even the worker failed.
> > >> > > 4. When a job comes to a final state like failed or cancelled or
> > >> > completed
> > >> > > above information will be removed. So at a given time orchestrator
> > can
> > >> > poll
> > >> > > active jobs in each worker by giving a worker ID.
> > >> > > 5. Tool X will make sure that when a worker goes down it will
> notify
> > >> > > orchestrator. During a worker failure, based on step 3 and 4
> > >> orchestrator
> > >> > > can poll all the active jobs of that worker and do the same thing
> > >> like in
> > >> > > step 1 (store the experiment ID to the queue) and gfac worker will
> > >> pick
> > >> > the
> > >> > > jobs.
> > >> > >
> > >> > > 6. When GFAC receive a job like in step 5 it have to carefully
> > >> evaluate
> > >> > the
> > >> > > state from registry and decide what to be done (If the job is
> > pending
> > >> > then
> > >> > > gfac just have to monitor, if job state is like input transferred
> > not
> > >> > even
> > >> > > submitted gfac has to execute rest of the chain and submit the job
> > to
> > >> the
> > >> > > resource and start monitoring).
> > >> > >
> > >> > > If we can find a tool X which supports all these features and tool
> > >> itself
> > >> > > is fault tolerance and support atomicity, high availability and
> > simply
> > >> > API
> > >> > > to implement we can use that tool.
> > >> > >
> > >> > > WDYT ?
> > >> > >
> > >> > > Lahiru
> > >> > >
> > >> > >
> > >> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
> > >> supun06@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Lahiru,
> > >> > > >
> > >> > > > Before moving with an implementation it may be worth to consider
> > >> some
> > >> > of
> > >> > > > the following aspects as well.
> > >> > > >
> > >> > > > 1. How to report the progress of an experiment as state in
> > >> ZooKeeper?
> > >> > > What
> > >> > > > happens if a GFac instance crashes while executing an
> experiment?
> > >> Are
> > >> > > there
> > >> > > > check-points we can save so that another GFac instance can take
> > >> over?
> > >> > > > 2. What is the threading model of GFac instances? (I consider
> this
> > >> as a
> > >> > > > very important aspect)
> > >> > > > 3. What are the information needed to be stored in the
> ZooKeeper?
> > >> You
> > >> > may
> > >> > > > need to store other information about an experiment apart from
> its
> > >> > > > experiment ID.
> > >> > > > 4. How to report errors?
> > >> > > > 5. For GFac weather you need a threading model or worker process
> > >> model?
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Supun..
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
> > >> glahiru@gmail.com
> > >> > >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hi All,
> > >> > > > >
> > >> > > > > I think the conclusion is like this,
> > >> > > > >
> > >> > > > > 1, We make the gfac as a worker not a thrift service and we
> can
> > >> start
> > >> > > > > multiple workers either with bunch of providers and handlers
> > >> > configured
> > >> > > > in
> > >> > > > > each worker or provider specific  workers to handle the class
> > path
> > >> > > issues
> > >> > > > > (not the common scenario).
> > >> > > > >
> > >> > > > > 2. Gfac workers can be configured to watch for a given path in
> > >> > > zookeeper,
> > >> > > > > and multiple workers can listen to the same path. Default path
> > >> can be
> > >> > > > > /airavata/gfac or can configure paths like
> /airavata/gfac/gsissh
> > >> > > > > /airavata/gfac/bes.
> > >> > > > >
> > >> > > > > 3. Orchestrator can configure with a logic to store experiment
> > >> IDs in
> > >> > > > > zookeeper with a path, and orchestrator can be configured to
> > >> provider
> > >> > > > > specific path logic too. So when a new request come
> orchestrator
> > >> > store
> > >> > > > the
> > >> > > > > experimentID and these experiments IDs are stored in Zk as a
> > >> queue.
> > >> > > > >
> > >> > > > > 4. Since gfac workers are watching they will be notified and
> as
> > >> supun
> > >> > > > > suggested can use a leader selection algorithm[1] and one gfac
> > >> worker
> > >> > > >  will
> > >> > > > > take the leadership for each experiment. If there are gfac
> > >> instances
> > >> > > for
> > >> > > > > each provider same logic will apply among those nodes with
> same
> > >> > > provider
> > >> > > > > type.
> > >> > > > >
> > >> > > > > [1]
> > http://curator.apache.org/curator-recipes/leader-election.html
> > >> > > > >
> > >> > > > > I would like to implement this if there are  no objections.
> > >> > > > >
> > >> > > > > Lahiru
> > >> > > > >
> > >> > > > >
> > >> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> > >> > > supun06@gmail.com
> > >> > > > >
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hi Marlon,
> > >> > > > > >
> > >> > > > > > I think you are exactly correct.
> > >> > > > > >
> > >> > > > > > Supun..
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
> > >> marpierc@iu.edu>
> > >> > > > wrote:
> > >> > > > > >
> > >> > > > > > > Let me restate this, and please tell me if I'm wrong.
> > >> > > > > > >
> > >> > > > > > > Orchestrator decides (somehow) that a particular job
> > requires
> > >> > > > JSDL/BES,
> > >> > > > > > so
> > >> > > > > > > it places the Experiment ID in Zookeeper's
> > >> > /airavata/gfac/jsdl-bes
> > >> > > > > node.
> > >> > > > > > >  GFAC servers associated with this instance notice the
> > update.
> > >> >  The
> > >> > > > > first
> > >> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to
> get
> > >> the
> > >> > > > > detailed
> > >> > > > > > > information it needs from the Registry.  ZooKeeper handles
> > the
> > >> > > > locking,
> > >> > > > > > etc
> > >> > > > > > > to make sure that only one GFAC at a time is trying to
> > handle
> > >> an
> > >> > > > > > experiment.
> > >> > > > > > >
> > >> > > > > > > Marlon
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > >> > > > > > >
> > >> > > > > > >> Hi Supun,
> > >> > > > > > >>
> > >> > > > > > >> Thanks for the clarification.
> > >> > > > > > >>
> > >> > > > > > >> Regards
> > >> > > > > > >> Lahiru
> > >> > > > > > >>
> > >> > > > > > >>
> > >> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > >> > > > > > supun06@gmail.com>
> > >> > > > > > >> wrote:
> > >> > > > > > >>
> > >> > > > > > >>  Hi Lahiru,
> > >> > > > > > >>>
> > >> > > > > > >>> My suggestion is that may be you don't need a Thrift
> > service
> > >> > > > between
> > >> > > > > > >>> Orchestrator and the component executing the experiment.
> > >> When a
> > >> > > new
> > >> > > > > > >>> experiment is submitted, orchestrator decides who can
> > >> execute
> > >> > > this
> > >> > > > > job.
> > >> > > > > > >>> Then it put the information about this experiment
> > execution
> > >> in
> > >> > > > > > ZooKeeper.
> > >> > > > > > >>> The component which wants to executes the experiment is
> > >> > listening
> > >> > > > to
> > >> > > > > > this
> > >> > > > > > >>> ZooKeeper path and when it sees the experiment it will
> > >> execute
> > >> > > it.
> > >> > > > So
> > >> > > > > > >>> that
> > >> > > > > > >>> the communication happens through an state change in
> > >> ZooKeeper.
> > >> > > > This
> > >> > > > > > can
> > >> > > > > > >>> potentially simply your architecture.
> > >> > > > > > >>>
> > >> > > > > > >>> Thanks,
> > >> > > > > > >>> Supun.
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > >> > > > > > glahiru@gmail.com>
> > >> > > > > > >>> wrote:
> > >> > > > > > >>>
> > >> > > > > > >>>  Hi Supun,
> > >> > > > > > >>>>
> > >> > > > > > >>>> So your suggestion is to create a znode for each thrift
> > >> > service
> > >> > > we
> > >> > > > > > have
> > >> > > > > > >>>> and
> > >> > > > > > >>>> when the request comes that node gets modified with
> input
> > >> data
> > >> > > for
> > >> > > > > > that
> > >> > > > > > >>>> request and thrift service is having a watch for that
> > node
> > >> and
> > >> > > it
> > >> > > > > will
> > >> > > > > > >>>> be
> > >> > > > > > >>>> notified because of the watch and it can read the input
> > >> from
> > >> > > > > zookeeper
> > >> > > > > > >>>> and
> > >> > > > > > >>>> invoke the operation?
> > >> > > > > > >>>>
> > >> > > > > > >>>> Lahiru
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > >> > > > > > >>>> supun06@gmail.com>
> > >> > > > > > >>>> wrote:
> > >> > > > > > >>>>
> > >> > > > > > >>>>  Hi all,
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
> > >> > Airavata
> > >> > > > > there
> > >> > > > > > >>>>> are
> > >> > > > > > >>>>> many components and these components must be stateless
> > to
> > >> > > achieve
> > >> > > > > > >>>>> scalability and reliability.Also there must be a
> > >> mechanism to
> > >> > > > > > >>>>>
> > >> > > > > > >>>> communicate
> > >> > > > > > >>>>
> > >> > > > > > >>>>> between the components. At the moment Airavata uses
> RPC
> > >> calls
> > >> > > > based
> > >> > > > > > on
> > >> > > > > > >>>>> Thrift for the communication.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> ZooKeeper can be used both as a place to hold state
> and
> > >> as a
> > >> > > > > > >>>>>
> > >> > > > > > >>>> communication
> > >> > > > > > >>>>
> > >> > > > > > >>>>> layer between the components. I'm involved with a
> > project
> > >> > that
> > >> > > > has
> > >> > > > > > many
> > >> > > > > > >>>>> distributed components like AIravata. Right now we use
> > >> Thrift
> > >> > > > > > services
> > >> > > > > > >>>>>
> > >> > > > > > >>>> to
> > >> > > > > > >>>>
> > >> > > > > > >>>>> communicate among the components. But we find it
> > >> difficult to
> > >> > > use
> > >> > > > > RPC
> > >> > > > > > >>>>>
> > >> > > > > > >>>> calls
> > >> > > > > > >>>>
> > >> > > > > > >>>>> and achieve stateless behaviour and thinking of
> > replacing
> > >> > > Thrift
> > >> > > > > > >>>>>
> > >> > > > > > >>>> services
> > >> > > > > > >>>>
> > >> > > > > > >>>>> with ZooKeeper based communication layer. So I think
> it
> > is
> > >> > > better
> > >> > > > > to
> > >> > > > > > >>>>> explore the possibility of removing the Thrift
> services
> > >> > between
> > >> > > > the
> > >> > > > > > >>>>> components and use ZooKeeper as a communication
> > mechanism
> > >> > > between
> > >> > > > > the
> > >> > > > > > >>>>> services. If you do this you will have to move the
> state
> > >> to
> > >> > > > > ZooKeeper
> > >> > > > > > >>>>>
> > >> > > > > > >>>> and
> > >> > > > > > >>>>
> > >> > > > > > >>>>> will automatically achieve the stateless behaviour in
> > the
> > >> > > > > components.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a
> bad
> > >> idea.
> > >> > > If
> > >> > > > we
> > >> > > > > > are
> > >> > > > > > >>>>> trying to integrate something fundamentally important
> to
> > >> > > > > architecture
> > >> > > > > > >>>>> as
> > >> > > > > > >>>>> how to store state, we shouldn't make it optional.
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> Thanks,
> > >> > > > > > >>>>> Supun..
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka
> <
> > >> > > > > > >>>>> shameerainfo@gmail.com> wrote:
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>  Hi Lahiru,
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> As i understood,  not only reliability , you are
> trying
> > >> to
> > >> > > > achieve
> > >> > > > > > >>>>>> some
> > >> > > > > > >>>>>> other requirement by introducing zookeeper, like
> health
> > >> > > > monitoring
> > >> > > > > > of
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>> the
> > >> > > > > > >>>>
> > >> > > > > > >>>>> services, categorization with service implementation
> etc
> > >> ...
> > >> > .
> > >> > > In
> > >> > > > > > that
> > >> > > > > > >>>>>> case, i think we can get use of zookeeper's features
> > but
> > >> if
> > >> > we
> > >> > > > > only
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>> focus
> > >> > > > > > >>>>
> > >> > > > > > >>>>> on reliability, i have little bit of concern, why
> can't
> > we
> > >> > use
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>> clustering +
> > >> > > > > > >>>>
> > >> > > > > > >>>>> LB ?
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite
> if
> > >> user
> > >> > > need
> > >> > > > > to
> > >> > > > > > >>>>>> use
> > >> > > > > > >>>>>> it.
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> Thanks,
> > >> > > > > > >>>>>>   Shameera.
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > >> > > > > > >>>>>> glahiru@gmail.com
> > >> > > > > > >>>>>> wrote:
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>>  Hi Gagan,
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> I need to start another discussion about it, but I
> had
> > >> an
> > >> > > > offline
> > >> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will
> > start
> > >> > > another
> > >> > > > > > >>>>>>> thread
> > >> > > > > > >>>>>>> about this topic too.
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> Regards
> > >> > > > > > >>>>>>> Lahiru
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>> gagandeepjuneja@gmail.com
> > >> > > > > > >>>>
> > >> > > > > > >>>>> wrote:
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added
> to
> > my
> > >> > > > > dictionary
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>> :).
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  I would like to know how are we planning to start
> > >> multiple
> > >> > > > > servers.
> > >> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times
> we
> > >> call
> > >> > it
> > >> > > > as
> > >> > > > > > auto
> > >> > > > > > >>>>>>>> scalable.
> > >> > > > > > >>>>>>>> 2. To make some specific number of nodes available
> > >> such as
> > >> > > we
> > >> > > > > > want 2
> > >> > > > > > >>>>>>>> servers to be available at any time so if one goes
> > down
> > >> > > then I
> > >> > > > > > need
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>> to
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  spawn one new to make available servers count 2.
> > >> > > > > > >>>>>>>> 3. Initially start all the servers.
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I
> > >> don't
> > >> > > > > believe
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>> existing
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> architecture support this?
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>>> Regards,
> > >> > > > > > >>>>>>>> Gagan
> > >> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > >> > > > glahiru@gmail.com
> > >> > > > > >
> > >> > > > > > >>>>>>>>
> > >> > > > > > >>>>>>> wrote:
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> Hi Gagan,
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
> > >> comments.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> wrote:
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>  Hi Lahiru,
> > >> > > > > > >>>>>>>>>> Just my 2 cents.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
> > >> > multiple
> > >> > > > > hops
> > >> > > > > > in
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> the
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> system which can add unnecessary complexity. Here I
> > am
> > >> not
> > >> > > > able
> > >> > > > > to
> > >> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be I
> am
> > >> > wrong
> > >> > > > > > because
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> of
> > >> > > > > > >>>>
> > >> > > > > > >>>>> less
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I
> would
> > >> like
> > >> > > to
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> discuss
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  following point.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>> 1. How it will help us in making system more
> > >> reliable.
> > >> > > > > Zookeeper
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> is
> > >> > > > > > >>>>
> > >> > > > > > >>>>> not
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> able to restart services. At max it can tell
> whether
> > >> > service
> > >> > > > is
> > >> > > > > up
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> or not
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> which could only be the case if airavata service
> goes
> > >> down
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> gracefully and
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> we have any automated way to restart it. If this is
> > >> just
> > >> > > > matter
> > >> > > > > of
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> routing
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> client requests to the available thrift servers
> then
> > >> this
> > >> > > can
> > >> > > > be
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> achieved
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> with the help of load balancer which I guess is
> > already
> > >> > > there
> > >> > > > in
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> thrift
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> wish list.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>  We have multiple thrift services and currently
> we
> > >> start
> > >> > > > only
> > >> > > > > > one
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> instance
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> of them and each thrift service is a stateless
> > >> service. To
> > >> > > > keep
> > >> > > > > > the
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> high
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> availability we have to start multiple instances of
> > >> them
> > >> > in
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> production
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  scenario. So for clients to get an available thrift
> > >> service
> > >> > we
> > >> > > > can
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> use
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  zookeeper znodes to represent each available service.
> > >> There
> > >> > > are
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> some
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  libraries which is doing similar[1] and I think we
> can
> > >> use
> > >> > > them
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> directly.
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> 2. As far as registering of different providers is
> > >> > concerned
> > >> > > > do
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>> you
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  think for that we really need external store.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and
> > >> reliable
> > >> > and
> > >> > > > we
> > >> > > > > > have
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> to
> > >> > > > > > >>>>
> > >> > > > > > >>>>> do
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> very minimal amount of work to achieve all these
> > >> features
> > >> > to
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> Airavata
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  because zookeeper handle all the complexity.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>  I have seen people using zookeeper more for state
> > >> > > management
> > >> > > > > in
> > >> > > > > > >>>>>>>>>> distributed environments.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>  +1, we might not be the most effective users of
> > >> > zookeeper
> > >> > > > > > because
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>> all
> > >> > > > > > >>>>
> > >> > > > > > >>>>> of
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>> our services are stateless services, but my point
> is
> > to
> > >> > > > achieve
> > >> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with
> > minimal
> > >> > work.
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>    I would like to understand more how can we
> > leverage
> > >> > > > > zookeeper
> > >> > > > > > in
> > >> > > > > > >>>>>>>>>> airavata to make system reliable.
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>  Regards,
> > >> > > > > > >>>>>>>>>> Gagan
> > >> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> > >> > marpierc@iu.edu
> > >> > > >
> > >> > > > > > wrote:
> > >> > > > > > >>>>>>>>>>
> > >> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> > >> > > Architecture
> > >> > > > > > list
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>> for
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  additional comments.
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> Marlon
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Hi All,
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1]
> > and
> > >> > how
> > >> > > to
> > >> > > > > use
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> it
> > >> > > > > > >>>>
> > >> > > > > > >>>>> in
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
> > >> > tolerance
> > >> > > > and
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> reliable
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> communication between our thrift services and
> > >> clients.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> Zookeeper
> > >> > > > > > >>>>
> > >> > > > > > >>>>> is a
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  distributed, fault tolerant system to do a
> reliable
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> communication
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  between
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> distributed applications. This is like an
> > in-memory
> > >> > file
> > >> > > > > > system
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> which
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  has
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can
> have
> > >> small
> > >> > > > > amount
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> of
> > >> > > > > > >>>>
> > >> > > > > > >>>>> data
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  associated with it and these nodes are called
> > znodes.
> > >> > > Clients
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> can
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  connect
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update
> > >> these
> > >> > > > > znodes.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
> > >> > services
> > >> > > > and
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> these
> > >> > > > > > >>>>
> > >> > > > > > >>>>> can
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  go
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we
> > use
> > >> > > > zookeeper
> > >> > > > > > to
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> store
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  these
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> configuration(thrift service configurations) we
> > can
> > >> > > > achieve
> > >> > > > > a
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> very
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  reliable
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> system. Basically thrift clients can
> dynamically
> > >> > > discover
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> available
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  service
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have
> to
> > >> > change
> > >> > > > the
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> generated
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  thrift client code but we have to change the
> > >> locations we
> > >> > > are
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> invoking
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when the
> > >> thrift
> > >> > > > service
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> goes
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  down
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between
> > these
> > >> > > > > > operations.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> With
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  this
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for
> > multiple
> > >> of
> > >> > > > > > airavata,
> > >> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift
> services.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> For specifically for gfac we can have different
> > >> types
> > >> > of
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> services
> > >> > > > > > >>>>
> > >> > > > > > >>>>> for
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  each
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved
> by
> > >> using
> > >> > > the
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> hierarchical
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic
> in
> > >> > > > gfac-thrift
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> service
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  to
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same
> > logic
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> orchestrator
> > >> > > > > > >>>>
> > >> > > > > > >>>>> can
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  discover the provider specific gfac thrift service
> > and
> > >> > > route
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> the
> > >> > > > > > >>>>
> > >> > > > > > >>>>>  message to
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> the correct thrift service.
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> With this approach I think we simply have write
> > >> some
> > >> > > > client
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> code
> > >> > > > > > >>>>
> > >> > > > > > >>>>> in
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  thrift
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
> > >> installation
> > >> > > can
> > >> > > > > be
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> done as
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  a
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep
> > the
> > >> > > > Zookeeper
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> server
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  separate from Airavata because installation of
> > >> Zookeeper
> > >> > > > server
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> little
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>>  complex in production scenario. I think we have to
> > >> make
> > >> > > sure
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> everything
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running,
> > ex:
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> enable.zookeeper=false
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to
> > >> download
> > >> > and
> > >> > > > > start
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>> zookeeper.
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>> Thanks
> > >> > > > > > >>>>>>>>>>>> Lahiru
> > >> > > > > > >>>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>>>>
> > >> > > > > > >>>>>>>>> --
> > >> > > > > > >>>>>>>>> System Analyst Programmer
> > >> > > > > > >>>>>>>>> PTI Lab
> > >> > > > > > >>>>>>>>> Indiana University
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>>>>
> > >> > > > > > >>>>>>> --
> > >> > > > > > >>>>>>> System Analyst Programmer
> > >> > > > > > >>>>>>> PTI Lab
> > >> > > > > > >>>>>>> Indiana University
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>>
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> --
> > >> > > > > > >>>>>> Best Regards,
> > >> > > > > > >>>>>> Shameera Rathnayaka.
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT
> > >> gmail.com
> > >> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>> --
> > >> > > > > > >>>>> Supun Kamburugamuva
> > >> > > > > > >>>>> Member, Apache Software Foundation;
> > http://www.apache.org
> > >> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >> > > > > > >>>>> Blog: http://supunk.blogspot.com
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>>>
> > >> > > > > > >>>> --
> > >> > > > > > >>>> System Analyst Programmer
> > >> > > > > > >>>> PTI Lab
> > >> > > > > > >>>> Indiana University
> > >> > > > > > >>>>
> > >> > > > > > >>>>
> > >> > > > > > >>>
> > >> > > > > > >>> --
> > >> > > > > > >>> Supun Kamburugamuva
> > >> > > > > > >>> Member, Apache Software Foundation;
> http://www.apache.org
> > >> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >> > > > > > >>> Blog: http://supunk.blogspot.com
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>>
> > >> > > > > > >>
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Supun Kamburugamuva
> > >> > > > > > Member, Apache Software Foundation; http://www.apache.org
> > >> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >> > > > > > Blog: http://supunk.blogspot.com
> > >> > > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > --
> > >> > > > > System Analyst Programmer
> > >> > > > > PTI Lab
> > >> > > > > Indiana University
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Supun Kamburugamuva
> > >> > > > Member, Apache Software Foundation; http://www.apache.org
> > >> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >> > > > Blog: http://supunk.blogspot.com
> > >> > > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > System Analyst Programmer
> > >> > > PTI Lab
> > >> > > Indiana University
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Supun Kamburugamuva
> > >> Member, Apache Software Foundation; http://www.apache.org
> > >> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >> Blog: http://supunk.blogspot.com
> > >>
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Eran Chinthaka Withana <er...@gmail.com>.

Storm has a Kafka spout which manages the cursor location (pointer to the
head of the queue representing the next message to be processed) inside ZK.
Each storm spout instance uses this information to get the next item to
process. Storm kafka spout won't advance to the next message until it gets
an ack from the storm topology.

So, if there is an exception in the topology and ack is sent only by the
last bolt, then storm bolt make sure all messages are processed since
exceptions won't generate acks.

Thanks,
Eran Chinthaka Withana


On Tue, Jun 17, 2014 at 12:30 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Eran,
>
> I think I should take back my last email. When I carefully look at storm I
> have following question.
>
> How are we going to store the Job statuses  and relaunch the jobs which was
> running in failure nodes ? Its true that storm is starting new workers but
> there should be a way to find missing jobs by someone in the system. Since
> we are not having a data stream there is no use to start new workers unless
> we handler the missing jobs. I think we need to have a better control of
> our component and persist the states of jobs each GFAC node is handling.
> Directly using zookeeper will let us to do a proper fault tolerance
> implementation.
>
> Regards
> Lahiru
>
>
>
> On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi Supun,
> >
> > I think in this usecase we only use storm topology to do the
> communication
> > among workers and we are completely ignoring the stream processing part.
> > Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in
> the
> > storm topology. But I think we can achieve extremely fault tolerance
> system
> > by directly using storm based on following statement in storm site with
> > minimum changes in airavata.
> >
> > Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and
> > stateless; all state is kept in Zookeeper or on local disk. This means
> you
> > can kill -9 Nimbus or the Supervisors and they’ll start back up like
> > nothing happened. This design leads to Storm clusters being incredibly
> > stable.
> >
> >
> >
> >
> > On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <su...@gmail.com>
> > wrote:
> >
> >> Hi Eran,
> >>
> >> I'm using Storm every day and this is one of the strangest things I've
> >> heard about using Storm. My be there are more use cases for Storm other
> >> than Distributed Stream processing. AFAIK the Bolts, spouts are built to
> >> handle a stream of events that doesn't take much time to process. In
> >> Airavata we don't process the messages. Instead we run experiments based
> >> on
> >> the commands given.
> >>
> >> If you want process isolation, distributed execution, cluster resource
> >> management Yarn would be a better thing to explore.
> >>
> >> Thanks,
> >> Supun..
> >>
> >>
> >> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
> >> eran.chinthaka@gmail.com> wrote:
> >>
> >> > Hi Lahiru,
> >> >
> >> > good summarization. Thanks Lahiru.
> >> >
> >> > I think you are trying to stick to a model where Orchestrator
> >> distributing
> >> > to work for GFac worker and trying to do the impedance mismatch
> through
> >> a
> >> > messaging solution. If you step back and think, we don't even want the
> >> > orchestrator to handle everything. From its point of view, it should
> >> submit
> >> > jobs to the framework, and will wait or get notified once the job is
> >> done.
> >> >
> >> > There are multiple ways of doing this. And here is one method.
> >> >
> >> > Orchestrator submits all its jobs to Job queue (implemented using any
> MQ
> >> > impl like Rabbit or Kafka). A storm topology is implemented to dequeue
> >> > messages, process them (i.e. submit those jobs and get those executed)
> >> and
> >> > notify the Orchestrator with the status (either through another
> >> > JobCompletionQueue or direct invocation).
> >> >
> >> > With this approach, the MQ provider will help to match impedance
> between
> >> > job submission and consumption. Storm helps with worker coordination,
> >> load
> >> > balancing, throttling on your job execution framework, worker pool
> >> > management and fault tolerance.
> >> >
> >> > Of course, you can implement this based only on ZK and handle
> everything
> >> > else on your own but storm had done exactly that with the use of ZK
> >> > underneath.
> >> >
> >> > Finally, if you go for a model like this, then even beyond job
> >> submission,
> >> > you can use the same model to do anything within the framework for
> >> internal
> >> > communication. For example, the workflow engine will submit its jobs
> to
> >> > queues based on what it has to do. Storm topologies exists for each
> >> queues
> >> > to dequeue messages and carry out the work in a reliable manner.
> >> Consider
> >> > this as mini-workflows within a larger workflow framework.
> >> >
> >> > We can have a voice chat if its more convenient. But not at 7am PST :)
> >> >
> >> >
> >> > Thanks,
> >> > Eran Chinthaka Withana
> >> >
> >> >
> >> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <
> glahiru@gmail.com
> >> >
> >> > wrote:
> >> >
> >> > > Hi All,
> >> > >
> >> > > Ignoring the tool that we are going to use to implement fault
> >> tolerance I
> >> > > have summarized the model we have decided so far. I will use the
> tool
> >> > name
> >> > > as X, we can use Zookeeper or some other implementation. Following
> >> design
> >> > > assume tool X  and Registry have high availability.
> >> > >
> >> > > 1. Orchestrator and GFAC worker node communication is going to be
> >> queue
> >> > > based and tool X is going to be used for this communication. (We
> have
> >> to
> >> > > implement this with considering race condition between different
> gfac
> >> > > workers).
> >> > > 2. We are having multiple instances of GFAC which are identical (In
> >> > future
> >> > > we can group gfac workers). Existence of each worker node is
> >> identified
> >> > > using X. If node goes down orchestrator will be notified by X.
> >> > > 3. When a particular request comes and accepted by one gfac worker
> >> that
> >> > > information will be replicated in tool X and a place where this
> >> > information
> >> > > is persisted even the worker failed.
> >> > > 4. When a job comes to a final state like failed or cancelled or
> >> > completed
> >> > > above information will be removed. So at a given time orchestrator
> can
> >> > poll
> >> > > active jobs in each worker by giving a worker ID.
> >> > > 5. Tool X will make sure that when a worker goes down it will notify
> >> > > orchestrator. During a worker failure, based on step 3 and 4
> >> orchestrator
> >> > > can poll all the active jobs of that worker and do the same thing
> >> like in
> >> > > step 1 (store the experiment ID to the queue) and gfac worker will
> >> pick
> >> > the
> >> > > jobs.
> >> > >
> >> > > 6. When GFAC receive a job like in step 5 it have to carefully
> >> evaluate
> >> > the
> >> > > state from registry and decide what to be done (If the job is
> pending
> >> > then
> >> > > gfac just have to monitor, if job state is like input transferred
> not
> >> > even
> >> > > submitted gfac has to execute rest of the chain and submit the job
> to
> >> the
> >> > > resource and start monitoring).
> >> > >
> >> > > If we can find a tool X which supports all these features and tool
> >> itself
> >> > > is fault tolerance and support atomicity, high availability and
> simply
> >> > API
> >> > > to implement we can use that tool.
> >> > >
> >> > > WDYT ?
> >> > >
> >> > > Lahiru
> >> > >
> >> > >
> >> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
> >> supun06@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Lahiru,
> >> > > >
> >> > > > Before moving with an implementation it may be worth to consider
> >> some
> >> > of
> >> > > > the following aspects as well.
> >> > > >
> >> > > > 1. How to report the progress of an experiment as state in
> >> ZooKeeper?
> >> > > What
> >> > > > happens if a GFac instance crashes while executing an experiment?
> >> Are
> >> > > there
> >> > > > check-points we can save so that another GFac instance can take
> >> over?
> >> > > > 2. What is the threading model of GFac instances? (I consider this
> >> as a
> >> > > > very important aspect)
> >> > > > 3. What are the information needed to be stored in the ZooKeeper?
> >> You
> >> > may
> >> > > > need to store other information about an experiment apart from its
> >> > > > experiment ID.
> >> > > > 4. How to report errors?
> >> > > > 5. For GFac weather you need a threading model or worker process
> >> model?
> >> > > >
> >> > > > Thanks,
> >> > > > Supun..
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
> >> glahiru@gmail.com
> >> > >
> >> > > > wrote:
> >> > > >
> >> > > > > Hi All,
> >> > > > >
> >> > > > > I think the conclusion is like this,
> >> > > > >
> >> > > > > 1, We make the gfac as a worker not a thrift service and we can
> >> start
> >> > > > > multiple workers either with bunch of providers and handlers
> >> > configured
> >> > > > in
> >> > > > > each worker or provider specific  workers to handle the class
> path
> >> > > issues
> >> > > > > (not the common scenario).
> >> > > > >
> >> > > > > 2. Gfac workers can be configured to watch for a given path in
> >> > > zookeeper,
> >> > > > > and multiple workers can listen to the same path. Default path
> >> can be
> >> > > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> >> > > > > /airavata/gfac/bes.
> >> > > > >
> >> > > > > 3. Orchestrator can configure with a logic to store experiment
> >> IDs in
> >> > > > > zookeeper with a path, and orchestrator can be configured to
> >> provider
> >> > > > > specific path logic too. So when a new request come orchestrator
> >> > store
> >> > > > the
> >> > > > > experimentID and these experiments IDs are stored in Zk as a
> >> queue.
> >> > > > >
> >> > > > > 4. Since gfac workers are watching they will be notified and as
> >> supun
> >> > > > > suggested can use a leader selection algorithm[1] and one gfac
> >> worker
> >> > > >  will
> >> > > > > take the leadership for each experiment. If there are gfac
> >> instances
> >> > > for
> >> > > > > each provider same logic will apply among those nodes with same
> >> > > provider
> >> > > > > type.
> >> > > > >
> >> > > > > [1]
> http://curator.apache.org/curator-recipes/leader-election.html
> >> > > > >
> >> > > > > I would like to implement this if there are  no objections.
> >> > > > >
> >> > > > > Lahiru
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> >> > > supun06@gmail.com
> >> > > > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi Marlon,
> >> > > > > >
> >> > > > > > I think you are exactly correct.
> >> > > > > >
> >> > > > > > Supun..
> >> > > > > >
> >> > > > > >
> >> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
> >> marpierc@iu.edu>
> >> > > > wrote:
> >> > > > > >
> >> > > > > > > Let me restate this, and please tell me if I'm wrong.
> >> > > > > > >
> >> > > > > > > Orchestrator decides (somehow) that a particular job
> requires
> >> > > > JSDL/BES,
> >> > > > > > so
> >> > > > > > > it places the Experiment ID in Zookeeper's
> >> > /airavata/gfac/jsdl-bes
> >> > > > > node.
> >> > > > > > >  GFAC servers associated with this instance notice the
> update.
> >> >  The
> >> > > > > first
> >> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to get
> >> the
> >> > > > > detailed
> >> > > > > > > information it needs from the Registry.  ZooKeeper handles
> the
> >> > > > locking,
> >> > > > > > etc
> >> > > > > > > to make sure that only one GFAC at a time is trying to
> handle
> >> an
> >> > > > > > experiment.
> >> > > > > > >
> >> > > > > > > Marlon
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> >> > > > > > >
> >> > > > > > >> Hi Supun,
> >> > > > > > >>
> >> > > > > > >> Thanks for the clarification.
> >> > > > > > >>
> >> > > > > > >> Regards
> >> > > > > > >> Lahiru
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> >> > > > > > supun06@gmail.com>
> >> > > > > > >> wrote:
> >> > > > > > >>
> >> > > > > > >>  Hi Lahiru,
> >> > > > > > >>>
> >> > > > > > >>> My suggestion is that may be you don't need a Thrift
> service
> >> > > > between
> >> > > > > > >>> Orchestrator and the component executing the experiment.
> >> When a
> >> > > new
> >> > > > > > >>> experiment is submitted, orchestrator decides who can
> >> execute
> >> > > this
> >> > > > > job.
> >> > > > > > >>> Then it put the information about this experiment
> execution
> >> in
> >> > > > > > ZooKeeper.
> >> > > > > > >>> The component which wants to executes the experiment is
> >> > listening
> >> > > > to
> >> > > > > > this
> >> > > > > > >>> ZooKeeper path and when it sees the experiment it will
> >> execute
> >> > > it.
> >> > > > So
> >> > > > > > >>> that
> >> > > > > > >>> the communication happens through an state change in
> >> ZooKeeper.
> >> > > > This
> >> > > > > > can
> >> > > > > > >>> potentially simply your architecture.
> >> > > > > > >>>
> >> > > > > > >>> Thanks,
> >> > > > > > >>> Supun.
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> >> > > > > > glahiru@gmail.com>
> >> > > > > > >>> wrote:
> >> > > > > > >>>
> >> > > > > > >>>  Hi Supun,
> >> > > > > > >>>>
> >> > > > > > >>>> So your suggestion is to create a znode for each thrift
> >> > service
> >> > > we
> >> > > > > > have
> >> > > > > > >>>> and
> >> > > > > > >>>> when the request comes that node gets modified with input
> >> data
> >> > > for
> >> > > > > > that
> >> > > > > > >>>> request and thrift service is having a watch for that
> node
> >> and
> >> > > it
> >> > > > > will
> >> > > > > > >>>> be
> >> > > > > > >>>> notified because of the watch and it can read the input
> >> from
> >> > > > > zookeeper
> >> > > > > > >>>> and
> >> > > > > > >>>> invoke the operation?
> >> > > > > > >>>>
> >> > > > > > >>>> Lahiru
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> >> > > > > > >>>> supun06@gmail.com>
> >> > > > > > >>>> wrote:
> >> > > > > > >>>>
> >> > > > > > >>>>  Hi all,
> >> > > > > > >>>>>
> >> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
> >> > Airavata
> >> > > > > there
> >> > > > > > >>>>> are
> >> > > > > > >>>>> many components and these components must be stateless
> to
> >> > > achieve
> >> > > > > > >>>>> scalability and reliability.Also there must be a
> >> mechanism to
> >> > > > > > >>>>>
> >> > > > > > >>>> communicate
> >> > > > > > >>>>
> >> > > > > > >>>>> between the components. At the moment Airavata uses RPC
> >> calls
> >> > > > based
> >> > > > > > on
> >> > > > > > >>>>> Thrift for the communication.
> >> > > > > > >>>>>
> >> > > > > > >>>>> ZooKeeper can be used both as a place to hold state and
> >> as a
> >> > > > > > >>>>>
> >> > > > > > >>>> communication
> >> > > > > > >>>>
> >> > > > > > >>>>> layer between the components. I'm involved with a
> project
> >> > that
> >> > > > has
> >> > > > > > many
> >> > > > > > >>>>> distributed components like AIravata. Right now we use
> >> Thrift
> >> > > > > > services
> >> > > > > > >>>>>
> >> > > > > > >>>> to
> >> > > > > > >>>>
> >> > > > > > >>>>> communicate among the components. But we find it
> >> difficult to
> >> > > use
> >> > > > > RPC
> >> > > > > > >>>>>
> >> > > > > > >>>> calls
> >> > > > > > >>>>
> >> > > > > > >>>>> and achieve stateless behaviour and thinking of
> replacing
> >> > > Thrift
> >> > > > > > >>>>>
> >> > > > > > >>>> services
> >> > > > > > >>>>
> >> > > > > > >>>>> with ZooKeeper based communication layer. So I think it
> is
> >> > > better
> >> > > > > to
> >> > > > > > >>>>> explore the possibility of removing the Thrift services
> >> > between
> >> > > > the
> >> > > > > > >>>>> components and use ZooKeeper as a communication
> mechanism
> >> > > between
> >> > > > > the
> >> > > > > > >>>>> services. If you do this you will have to move the state
> >> to
> >> > > > > ZooKeeper
> >> > > > > > >>>>>
> >> > > > > > >>>> and
> >> > > > > > >>>>
> >> > > > > > >>>>> will automatically achieve the stateless behaviour in
> the
> >> > > > > components.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad
> >> idea.
> >> > > If
> >> > > > we
> >> > > > > > are
> >> > > > > > >>>>> trying to integrate something fundamentally important to
> >> > > > > architecture
> >> > > > > > >>>>> as
> >> > > > > > >>>>> how to store state, we shouldn't make it optional.
> >> > > > > > >>>>>
> >> > > > > > >>>>> Thanks,
> >> > > > > > >>>>> Supun..
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> >> > > > > > >>>>> shameerainfo@gmail.com> wrote:
> >> > > > > > >>>>>
> >> > > > > > >>>>>  Hi Lahiru,
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> As i understood,  not only reliability , you are trying
> >> to
> >> > > > achieve
> >> > > > > > >>>>>> some
> >> > > > > > >>>>>> other requirement by introducing zookeeper, like health
> >> > > > monitoring
> >> > > > > > of
> >> > > > > > >>>>>>
> >> > > > > > >>>>> the
> >> > > > > > >>>>
> >> > > > > > >>>>> services, categorization with service implementation etc
> >> ...
> >> > .
> >> > > In
> >> > > > > > that
> >> > > > > > >>>>>> case, i think we can get use of zookeeper's features
> but
> >> if
> >> > we
> >> > > > > only
> >> > > > > > >>>>>>
> >> > > > > > >>>>> focus
> >> > > > > > >>>>
> >> > > > > > >>>>> on reliability, i have little bit of concern, why can't
> we
> >> > use
> >> > > > > > >>>>>>
> >> > > > > > >>>>> clustering +
> >> > > > > > >>>>
> >> > > > > > >>>>> LB ?
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if
> >> user
> >> > > need
> >> > > > > to
> >> > > > > > >>>>>> use
> >> > > > > > >>>>>> it.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> Thanks,
> >> > > > > > >>>>>>   Shameera.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> >> > > > > > >>>>>> glahiru@gmail.com
> >> > > > > > >>>>>> wrote:
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>  Hi Gagan,
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> I need to start another discussion about it, but I had
> >> an
> >> > > > offline
> >> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will
> start
> >> > > another
> >> > > > > > >>>>>>> thread
> >> > > > > > >>>>>>> about this topic too.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> Regards
> >> > > > > > >>>>>>> Lahiru
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>> gagandeepjuneja@gmail.com
> >> > > > > > >>>>
> >> > > > > > >>>>> wrote:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to
> my
> >> > > > > dictionary
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>> :).
> >> > > > > > >>>>
> >> > > > > > >>>>>  I would like to know how are we planning to start
> >> multiple
> >> > > > > servers.
> >> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we
> >> call
> >> > it
> >> > > > as
> >> > > > > > auto
> >> > > > > > >>>>>>>> scalable.
> >> > > > > > >>>>>>>> 2. To make some specific number of nodes available
> >> such as
> >> > > we
> >> > > > > > want 2
> >> > > > > > >>>>>>>> servers to be available at any time so if one goes
> down
> >> > > then I
> >> > > > > > need
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>> to
> >> > > > > > >>>>
> >> > > > > > >>>>>  spawn one new to make available servers count 2.
> >> > > > > > >>>>>>>> 3. Initially start all the servers.
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I
> >> don't
> >> > > > > believe
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>> existing
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> architecture support this?
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>>> Regards,
> >> > > > > > >>>>>>>> Gagan
> >> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> >> > > > glahiru@gmail.com
> >> > > > > >
> >> > > > > > >>>>>>>>
> >> > > > > > >>>>>>> wrote:
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> Hi Gagan,
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
> >> comments.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> wrote:
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>  Hi Lahiru,
> >> > > > > > >>>>>>>>>> Just my 2 cents.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
> >> > multiple
> >> > > > > hops
> >> > > > > > in
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> the
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> system which can add unnecessary complexity. Here I
> am
> >> not
> >> > > > able
> >> > > > > to
> >> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am
> >> > wrong
> >> > > > > > because
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> of
> >> > > > > > >>>>
> >> > > > > > >>>>> less
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I would
> >> like
> >> > > to
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> discuss
> >> > > > > > >>>>
> >> > > > > > >>>>>  following point.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>> 1. How it will help us in making system more
> >> reliable.
> >> > > > > Zookeeper
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> is
> >> > > > > > >>>>
> >> > > > > > >>>>> not
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> able to restart services. At max it can tell whether
> >> > service
> >> > > > is
> >> > > > > up
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> or not
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> which could only be the case if airavata service goes
> >> down
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> gracefully and
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> we have any automated way to restart it. If this is
> >> just
> >> > > > matter
> >> > > > > of
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> routing
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> client requests to the available thrift servers then
> >> this
> >> > > can
> >> > > > be
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> achieved
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> with the help of load balancer which I guess is
> already
> >> > > there
> >> > > > in
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> thrift
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> wish list.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>  We have multiple thrift services and currently we
> >> start
> >> > > > only
> >> > > > > > one
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> instance
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> of them and each thrift service is a stateless
> >> service. To
> >> > > > keep
> >> > > > > > the
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> high
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> availability we have to start multiple instances of
> >> them
> >> > in
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> production
> >> > > > > > >>>>
> >> > > > > > >>>>>  scenario. So for clients to get an available thrift
> >> service
> >> > we
> >> > > > can
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> use
> >> > > > > > >>>>
> >> > > > > > >>>>>  zookeeper znodes to represent each available service.
> >> There
> >> > > are
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> some
> >> > > > > > >>>>
> >> > > > > > >>>>>  libraries which is doing similar[1] and I think we can
> >> use
> >> > > them
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> directly.
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> 2. As far as registering of different providers is
> >> > concerned
> >> > > > do
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>> you
> >> > > > > > >>>>
> >> > > > > > >>>>>  think for that we really need external store.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and
> >> reliable
> >> > and
> >> > > > we
> >> > > > > > have
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> to
> >> > > > > > >>>>
> >> > > > > > >>>>> do
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> very minimal amount of work to achieve all these
> >> features
> >> > to
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> Airavata
> >> > > > > > >>>>
> >> > > > > > >>>>>  because zookeeper handle all the complexity.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>  I have seen people using zookeeper more for state
> >> > > management
> >> > > > > in
> >> > > > > > >>>>>>>>>> distributed environments.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>  +1, we might not be the most effective users of
> >> > zookeeper
> >> > > > > > because
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>> all
> >> > > > > > >>>>
> >> > > > > > >>>>> of
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>> our services are stateless services, but my point is
> to
> >> > > > achieve
> >> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with
> minimal
> >> > work.
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>    I would like to understand more how can we
> leverage
> >> > > > > zookeeper
> >> > > > > > in
> >> > > > > > >>>>>>>>>> airavata to make system reliable.
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>  Regards,
> >> > > > > > >>>>>>>>>> Gagan
> >> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> >> > marpierc@iu.edu
> >> > > >
> >> > > > > > wrote:
> >> > > > > > >>>>>>>>>>
> >> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> >> > > Architecture
> >> > > > > > list
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>> for
> >> > > > > > >>>>
> >> > > > > > >>>>>  additional comments.
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> Marlon
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Hi All,
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1]
> and
> >> > how
> >> > > to
> >> > > > > use
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> it
> >> > > > > > >>>>
> >> > > > > > >>>>> in
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
> >> > tolerance
> >> > > > and
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> reliable
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> communication between our thrift services and
> >> clients.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> Zookeeper
> >> > > > > > >>>>
> >> > > > > > >>>>> is a
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> communication
> >> > > > > > >>>>
> >> > > > > > >>>>>  between
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> distributed applications. This is like an
> in-memory
> >> > file
> >> > > > > > system
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> which
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  has
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have
> >> small
> >> > > > > amount
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> of
> >> > > > > > >>>>
> >> > > > > > >>>>> data
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  associated with it and these nodes are called
> znodes.
> >> > > Clients
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> can
> >> > > > > > >>>>
> >> > > > > > >>>>>  connect
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update
> >> these
> >> > > > > znodes.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
> >> > services
> >> > > > and
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> these
> >> > > > > > >>>>
> >> > > > > > >>>>> can
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  go
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we
> use
> >> > > > zookeeper
> >> > > > > > to
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> store
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  these
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> configuration(thrift service configurations) we
> can
> >> > > > achieve
> >> > > > > a
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> very
> >> > > > > > >>>>
> >> > > > > > >>>>>  reliable
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> >> > > discover
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> available
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  service
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
> >> > change
> >> > > > the
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> generated
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  thrift client code but we have to change the
> >> locations we
> >> > > are
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> invoking
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when the
> >> thrift
> >> > > > service
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> goes
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  down
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between
> these
> >> > > > > > operations.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> With
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  this
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for
> multiple
> >> of
> >> > > > > > airavata,
> >> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> For specifically for gfac we can have different
> >> types
> >> > of
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> services
> >> > > > > > >>>>
> >> > > > > > >>>>> for
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  each
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved by
> >> using
> >> > > the
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> hierarchical
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> >> > > > gfac-thrift
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> service
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  to
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same
> logic
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> orchestrator
> >> > > > > > >>>>
> >> > > > > > >>>>> can
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  discover the provider specific gfac thrift service
> and
> >> > > route
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> the
> >> > > > > > >>>>
> >> > > > > > >>>>>  message to
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> the correct thrift service.
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> With this approach I think we simply have write
> >> some
> >> > > > client
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> code
> >> > > > > > >>>>
> >> > > > > > >>>>> in
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  thrift
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
> >> installation
> >> > > can
> >> > > > > be
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> done as
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  a
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep
> the
> >> > > > Zookeeper
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> server
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  separate from Airavata because installation of
> >> Zookeeper
> >> > > > server
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> little
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>>  complex in production scenario. I think we have to
> >> make
> >> > > sure
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> everything
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running,
> ex:
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> enable.zookeeper=false
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to
> >> download
> >> > and
> >> > > > > start
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>> zookeeper.
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>> Thanks
> >> > > > > > >>>>>>>>>>>> Lahiru
> >> > > > > > >>>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>>>>
> >> > > > > > >>>>>>>>> --
> >> > > > > > >>>>>>>>> System Analyst Programmer
> >> > > > > > >>>>>>>>> PTI Lab
> >> > > > > > >>>>>>>>> Indiana University
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>>>>
> >> > > > > > >>>>>>> --
> >> > > > > > >>>>>>> System Analyst Programmer
> >> > > > > > >>>>>>> PTI Lab
> >> > > > > > >>>>>>> Indiana University
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>>
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> --
> >> > > > > > >>>>>> Best Regards,
> >> > > > > > >>>>>> Shameera Rathnayaka.
> >> > > > > > >>>>>>
> >> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT
> >> gmail.com
> >> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> >> > > > > > >>>>>>
> >> > > > > > >>>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>> --
> >> > > > > > >>>>> Supun Kamburugamuva
> >> > > > > > >>>>> Member, Apache Software Foundation;
> http://www.apache.org
> >> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > > > > > >>>>> Blog: http://supunk.blogspot.com
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>>>
> >> > > > > > >>>> --
> >> > > > > > >>>> System Analyst Programmer
> >> > > > > > >>>> PTI Lab
> >> > > > > > >>>> Indiana University
> >> > > > > > >>>>
> >> > > > > > >>>>
> >> > > > > > >>>
> >> > > > > > >>> --
> >> > > > > > >>> Supun Kamburugamuva
> >> > > > > > >>> Member, Apache Software Foundation; http://www.apache.org
> >> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > > > > > >>> Blog: http://supunk.blogspot.com
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>>
> >> > > > > > >>
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Supun Kamburugamuva
> >> > > > > > Member, Apache Software Foundation; http://www.apache.org
> >> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > > > > > Blog: http://supunk.blogspot.com
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > System Analyst Programmer
> >> > > > > PTI Lab
> >> > > > > Indiana University
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Supun Kamburugamuva
> >> > > > Member, Apache Software Foundation; http://www.apache.org
> >> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> > > > Blog: http://supunk.blogspot.com
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > System Analyst Programmer
> >> > > PTI Lab
> >> > > Indiana University
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> Supun Kamburugamuva
> >> Member, Apache Software Foundation; http://www.apache.org
> >> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >> Blog: http://supunk.blogspot.com
> >>
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Eran,

I think I should take back my last email. When I carefully look at storm I
have following question.

How are we going to store the Job statuses  and relaunch the jobs which was
running in failure nodes ? Its true that storm is starting new workers but
there should be a way to find missing jobs by someone in the system. Since
we are not having a data stream there is no use to start new workers unless
we handler the missing jobs. I think we need to have a better control of
our component and persist the states of jobs each GFAC node is handling.
Directly using zookeeper will let us to do a proper fault tolerance
implementation.

Regards
Lahiru



On Tue, Jun 17, 2014 at 3:14 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Supun,
>
> I think in this usecase we only use storm topology to do the communication
> among workers and we are completely ignoring the stream processing part.
> Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in the
> storm topology. But I think we can achieve extremely fault tolerance system
> by directly using storm based on following statement in storm site with
> minimum changes in airavata.
>
> Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and
> stateless; all state is kept in Zookeeper or on local disk. This means you
> can kill -9 Nimbus or the Supervisors and they’ll start back up like
> nothing happened. This design leads to Storm clusters being incredibly
> stable.
>
>
>
>
> On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
>> Hi Eran,
>>
>> I'm using Storm every day and this is one of the strangest things I've
>> heard about using Storm. My be there are more use cases for Storm other
>> than Distributed Stream processing. AFAIK the Bolts, spouts are built to
>> handle a stream of events that doesn't take much time to process. In
>> Airavata we don't process the messages. Instead we run experiments based
>> on
>> the commands given.
>>
>> If you want process isolation, distributed execution, cluster resource
>> management Yarn would be a better thing to explore.
>>
>> Thanks,
>> Supun..
>>
>>
>> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
>> eran.chinthaka@gmail.com> wrote:
>>
>> > Hi Lahiru,
>> >
>> > good summarization. Thanks Lahiru.
>> >
>> > I think you are trying to stick to a model where Orchestrator
>> distributing
>> > to work for GFac worker and trying to do the impedance mismatch through
>> a
>> > messaging solution. If you step back and think, we don't even want the
>> > orchestrator to handle everything. From its point of view, it should
>> submit
>> > jobs to the framework, and will wait or get notified once the job is
>> done.
>> >
>> > There are multiple ways of doing this. And here is one method.
>> >
>> > Orchestrator submits all its jobs to Job queue (implemented using any MQ
>> > impl like Rabbit or Kafka). A storm topology is implemented to dequeue
>> > messages, process them (i.e. submit those jobs and get those executed)
>> and
>> > notify the Orchestrator with the status (either through another
>> > JobCompletionQueue or direct invocation).
>> >
>> > With this approach, the MQ provider will help to match impedance between
>> > job submission and consumption. Storm helps with worker coordination,
>> load
>> > balancing, throttling on your job execution framework, worker pool
>> > management and fault tolerance.
>> >
>> > Of course, you can implement this based only on ZK and handle everything
>> > else on your own but storm had done exactly that with the use of ZK
>> > underneath.
>> >
>> > Finally, if you go for a model like this, then even beyond job
>> submission,
>> > you can use the same model to do anything within the framework for
>> internal
>> > communication. For example, the workflow engine will submit its jobs to
>> > queues based on what it has to do. Storm topologies exists for each
>> queues
>> > to dequeue messages and carry out the work in a reliable manner.
>> Consider
>> > this as mini-workflows within a larger workflow framework.
>> >
>> > We can have a voice chat if its more convenient. But not at 7am PST :)
>> >
>> >
>> > Thanks,
>> > Eran Chinthaka Withana
>> >
>> >
>> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <glahiru@gmail.com
>> >
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > Ignoring the tool that we are going to use to implement fault
>> tolerance I
>> > > have summarized the model we have decided so far. I will use the tool
>> > name
>> > > as X, we can use Zookeeper or some other implementation. Following
>> design
>> > > assume tool X  and Registry have high availability.
>> > >
>> > > 1. Orchestrator and GFAC worker node communication is going to be
>> queue
>> > > based and tool X is going to be used for this communication. (We have
>> to
>> > > implement this with considering race condition between different gfac
>> > > workers).
>> > > 2. We are having multiple instances of GFAC which are identical (In
>> > future
>> > > we can group gfac workers). Existence of each worker node is
>> identified
>> > > using X. If node goes down orchestrator will be notified by X.
>> > > 3. When a particular request comes and accepted by one gfac worker
>> that
>> > > information will be replicated in tool X and a place where this
>> > information
>> > > is persisted even the worker failed.
>> > > 4. When a job comes to a final state like failed or cancelled or
>> > completed
>> > > above information will be removed. So at a given time orchestrator can
>> > poll
>> > > active jobs in each worker by giving a worker ID.
>> > > 5. Tool X will make sure that when a worker goes down it will notify
>> > > orchestrator. During a worker failure, based on step 3 and 4
>> orchestrator
>> > > can poll all the active jobs of that worker and do the same thing
>> like in
>> > > step 1 (store the experiment ID to the queue) and gfac worker will
>> pick
>> > the
>> > > jobs.
>> > >
>> > > 6. When GFAC receive a job like in step 5 it have to carefully
>> evaluate
>> > the
>> > > state from registry and decide what to be done (If the job is pending
>> > then
>> > > gfac just have to monitor, if job state is like input transferred not
>> > even
>> > > submitted gfac has to execute rest of the chain and submit the job to
>> the
>> > > resource and start monitoring).
>> > >
>> > > If we can find a tool X which supports all these features and tool
>> itself
>> > > is fault tolerance and support atomicity, high availability and simply
>> > API
>> > > to implement we can use that tool.
>> > >
>> > > WDYT ?
>> > >
>> > > Lahiru
>> > >
>> > >
>> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
>> supun06@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Lahiru,
>> > > >
>> > > > Before moving with an implementation it may be worth to consider
>> some
>> > of
>> > > > the following aspects as well.
>> > > >
>> > > > 1. How to report the progress of an experiment as state in
>> ZooKeeper?
>> > > What
>> > > > happens if a GFac instance crashes while executing an experiment?
>> Are
>> > > there
>> > > > check-points we can save so that another GFac instance can take
>> over?
>> > > > 2. What is the threading model of GFac instances? (I consider this
>> as a
>> > > > very important aspect)
>> > > > 3. What are the information needed to be stored in the ZooKeeper?
>> You
>> > may
>> > > > need to store other information about an experiment apart from its
>> > > > experiment ID.
>> > > > 4. How to report errors?
>> > > > 5. For GFac weather you need a threading model or worker process
>> model?
>> > > >
>> > > > Thanks,
>> > > > Supun..
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
>> glahiru@gmail.com
>> > >
>> > > > wrote:
>> > > >
>> > > > > Hi All,
>> > > > >
>> > > > > I think the conclusion is like this,
>> > > > >
>> > > > > 1, We make the gfac as a worker not a thrift service and we can
>> start
>> > > > > multiple workers either with bunch of providers and handlers
>> > configured
>> > > > in
>> > > > > each worker or provider specific  workers to handle the class path
>> > > issues
>> > > > > (not the common scenario).
>> > > > >
>> > > > > 2. Gfac workers can be configured to watch for a given path in
>> > > zookeeper,
>> > > > > and multiple workers can listen to the same path. Default path
>> can be
>> > > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
>> > > > > /airavata/gfac/bes.
>> > > > >
>> > > > > 3. Orchestrator can configure with a logic to store experiment
>> IDs in
>> > > > > zookeeper with a path, and orchestrator can be configured to
>> provider
>> > > > > specific path logic too. So when a new request come orchestrator
>> > store
>> > > > the
>> > > > > experimentID and these experiments IDs are stored in Zk as a
>> queue.
>> > > > >
>> > > > > 4. Since gfac workers are watching they will be notified and as
>> supun
>> > > > > suggested can use a leader selection algorithm[1] and one gfac
>> worker
>> > > >  will
>> > > > > take the leadership for each experiment. If there are gfac
>> instances
>> > > for
>> > > > > each provider same logic will apply among those nodes with same
>> > > provider
>> > > > > type.
>> > > > >
>> > > > > [1]http://curator.apache.org/curator-recipes/leader-election.html
>> > > > >
>> > > > > I would like to implement this if there are  no objections.
>> > > > >
>> > > > > Lahiru
>> > > > >
>> > > > >
>> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
>> > > supun06@gmail.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Marlon,
>> > > > > >
>> > > > > > I think you are exactly correct.
>> > > > > >
>> > > > > > Supun..
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <
>> marpierc@iu.edu>
>> > > > wrote:
>> > > > > >
>> > > > > > > Let me restate this, and please tell me if I'm wrong.
>> > > > > > >
>> > > > > > > Orchestrator decides (somehow) that a particular job requires
>> > > > JSDL/BES,
>> > > > > > so
>> > > > > > > it places the Experiment ID in Zookeeper's
>> > /airavata/gfac/jsdl-bes
>> > > > > node.
>> > > > > > >  GFAC servers associated with this instance notice the update.
>> >  The
>> > > > > first
>> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to get
>> the
>> > > > > detailed
>> > > > > > > information it needs from the Registry.  ZooKeeper handles the
>> > > > locking,
>> > > > > > etc
>> > > > > > > to make sure that only one GFAC at a time is trying to handle
>> an
>> > > > > > experiment.
>> > > > > > >
>> > > > > > > Marlon
>> > > > > > >
>> > > > > > >
>> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>> > > > > > >
>> > > > > > >> Hi Supun,
>> > > > > > >>
>> > > > > > >> Thanks for the clarification.
>> > > > > > >>
>> > > > > > >> Regards
>> > > > > > >> Lahiru
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
>> > > > > > supun06@gmail.com>
>> > > > > > >> wrote:
>> > > > > > >>
>> > > > > > >>  Hi Lahiru,
>> > > > > > >>>
>> > > > > > >>> My suggestion is that may be you don't need a Thrift service
>> > > > between
>> > > > > > >>> Orchestrator and the component executing the experiment.
>> When a
>> > > new
>> > > > > > >>> experiment is submitted, orchestrator decides who can
>> execute
>> > > this
>> > > > > job.
>> > > > > > >>> Then it put the information about this experiment execution
>> in
>> > > > > > ZooKeeper.
>> > > > > > >>> The component which wants to executes the experiment is
>> > listening
>> > > > to
>> > > > > > this
>> > > > > > >>> ZooKeeper path and when it sees the experiment it will
>> execute
>> > > it.
>> > > > So
>> > > > > > >>> that
>> > > > > > >>> the communication happens through an state change in
>> ZooKeeper.
>> > > > This
>> > > > > > can
>> > > > > > >>> potentially simply your architecture.
>> > > > > > >>>
>> > > > > > >>> Thanks,
>> > > > > > >>> Supun.
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
>> > > > > > glahiru@gmail.com>
>> > > > > > >>> wrote:
>> > > > > > >>>
>> > > > > > >>>  Hi Supun,
>> > > > > > >>>>
>> > > > > > >>>> So your suggestion is to create a znode for each thrift
>> > service
>> > > we
>> > > > > > have
>> > > > > > >>>> and
>> > > > > > >>>> when the request comes that node gets modified with input
>> data
>> > > for
>> > > > > > that
>> > > > > > >>>> request and thrift service is having a watch for that node
>> and
>> > > it
>> > > > > will
>> > > > > > >>>> be
>> > > > > > >>>> notified because of the watch and it can read the input
>> from
>> > > > > zookeeper
>> > > > > > >>>> and
>> > > > > > >>>> invoke the operation?
>> > > > > > >>>>
>> > > > > > >>>> Lahiru
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
>> > > > > > >>>> supun06@gmail.com>
>> > > > > > >>>> wrote:
>> > > > > > >>>>
>> > > > > > >>>>  Hi all,
>> > > > > > >>>>>
>> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
>> > Airavata
>> > > > > there
>> > > > > > >>>>> are
>> > > > > > >>>>> many components and these components must be stateless to
>> > > achieve
>> > > > > > >>>>> scalability and reliability.Also there must be a
>> mechanism to
>> > > > > > >>>>>
>> > > > > > >>>> communicate
>> > > > > > >>>>
>> > > > > > >>>>> between the components. At the moment Airavata uses RPC
>> calls
>> > > > based
>> > > > > > on
>> > > > > > >>>>> Thrift for the communication.
>> > > > > > >>>>>
>> > > > > > >>>>> ZooKeeper can be used both as a place to hold state and
>> as a
>> > > > > > >>>>>
>> > > > > > >>>> communication
>> > > > > > >>>>
>> > > > > > >>>>> layer between the components. I'm involved with a project
>> > that
>> > > > has
>> > > > > > many
>> > > > > > >>>>> distributed components like AIravata. Right now we use
>> Thrift
>> > > > > > services
>> > > > > > >>>>>
>> > > > > > >>>> to
>> > > > > > >>>>
>> > > > > > >>>>> communicate among the components. But we find it
>> difficult to
>> > > use
>> > > > > RPC
>> > > > > > >>>>>
>> > > > > > >>>> calls
>> > > > > > >>>>
>> > > > > > >>>>> and achieve stateless behaviour and thinking of replacing
>> > > Thrift
>> > > > > > >>>>>
>> > > > > > >>>> services
>> > > > > > >>>>
>> > > > > > >>>>> with ZooKeeper based communication layer. So I think it is
>> > > better
>> > > > > to
>> > > > > > >>>>> explore the possibility of removing the Thrift services
>> > between
>> > > > the
>> > > > > > >>>>> components and use ZooKeeper as a communication mechanism
>> > > between
>> > > > > the
>> > > > > > >>>>> services. If you do this you will have to move the state
>> to
>> > > > > ZooKeeper
>> > > > > > >>>>>
>> > > > > > >>>> and
>> > > > > > >>>>
>> > > > > > >>>>> will automatically achieve the stateless behaviour in the
>> > > > > components.
>> > > > > > >>>>>
>> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad
>> idea.
>> > > If
>> > > > we
>> > > > > > are
>> > > > > > >>>>> trying to integrate something fundamentally important to
>> > > > > architecture
>> > > > > > >>>>> as
>> > > > > > >>>>> how to store state, we shouldn't make it optional.
>> > > > > > >>>>>
>> > > > > > >>>>> Thanks,
>> > > > > > >>>>> Supun..
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>> > > > > > >>>>> shameerainfo@gmail.com> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>>  Hi Lahiru,
>> > > > > > >>>>>>
>> > > > > > >>>>>> As i understood,  not only reliability , you are trying
>> to
>> > > > achieve
>> > > > > > >>>>>> some
>> > > > > > >>>>>> other requirement by introducing zookeeper, like health
>> > > > monitoring
>> > > > > > of
>> > > > > > >>>>>>
>> > > > > > >>>>> the
>> > > > > > >>>>
>> > > > > > >>>>> services, categorization with service implementation etc
>> ...
>> > .
>> > > In
>> > > > > > that
>> > > > > > >>>>>> case, i think we can get use of zookeeper's features but
>> if
>> > we
>> > > > > only
>> > > > > > >>>>>>
>> > > > > > >>>>> focus
>> > > > > > >>>>
>> > > > > > >>>>> on reliability, i have little bit of concern, why can't we
>> > use
>> > > > > > >>>>>>
>> > > > > > >>>>> clustering +
>> > > > > > >>>>
>> > > > > > >>>>> LB ?
>> > > > > > >>>>>>
>> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if
>> user
>> > > need
>> > > > > to
>> > > > > > >>>>>> use
>> > > > > > >>>>>> it.
>> > > > > > >>>>>>
>> > > > > > >>>>>> Thanks,
>> > > > > > >>>>>>   Shameera.
>> > > > > > >>>>>>
>> > > > > > >>>>>>
>> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
>> > > > > > >>>>>> glahiru@gmail.com
>> > > > > > >>>>>> wrote:
>> > > > > > >>>>>>
>> > > > > > >>>>>>  Hi Gagan,
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> I need to start another discussion about it, but I had
>> an
>> > > > offline
>> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
>> > > another
>> > > > > > >>>>>>> thread
>> > > > > > >>>>>>> about this topic too.
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> Regards
>> > > > > > >>>>>>> Lahiru
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>> > > > > > >>>>>>>
>> > > > > > >>>>>> gagandeepjuneja@gmail.com
>> > > > > > >>>>
>> > > > > > >>>>> wrote:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
>> > > > > dictionary
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>> :).
>> > > > > > >>>>
>> > > > > > >>>>>  I would like to know how are we planning to start
>> multiple
>> > > > > servers.
>> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we
>> call
>> > it
>> > > > as
>> > > > > > auto
>> > > > > > >>>>>>>> scalable.
>> > > > > > >>>>>>>> 2. To make some specific number of nodes available
>> such as
>> > > we
>> > > > > > want 2
>> > > > > > >>>>>>>> servers to be available at any time so if one goes down
>> > > then I
>> > > > > > need
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>> to
>> > > > > > >>>>
>> > > > > > >>>>>  spawn one new to make available servers count 2.
>> > > > > > >>>>>>>> 3. Initially start all the servers.
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I
>> don't
>> > > > > believe
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>> existing
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> architecture support this?
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>>> Regards,
>> > > > > > >>>>>>>> Gagan
>> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
>> > > > glahiru@gmail.com
>> > > > > >
>> > > > > > >>>>>>>>
>> > > > > > >>>>>>> wrote:
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> Hi Gagan,
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
>> comments.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> wrote:
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>  Hi Lahiru,
>> > > > > > >>>>>>>>>> Just my 2 cents.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
>> > multiple
>> > > > > hops
>> > > > > > in
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> the
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> system which can add unnecessary complexity. Here I am
>> not
>> > > > able
>> > > > > to
>> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am
>> > wrong
>> > > > > > because
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> of
>> > > > > > >>>>
>> > > > > > >>>>> less
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I would
>> like
>> > > to
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> discuss
>> > > > > > >>>>
>> > > > > > >>>>>  following point.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>> 1. How it will help us in making system more
>> reliable.
>> > > > > Zookeeper
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> is
>> > > > > > >>>>
>> > > > > > >>>>> not
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> able to restart services. At max it can tell whether
>> > service
>> > > > is
>> > > > > up
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> or not
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> which could only be the case if airavata service goes
>> down
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> gracefully and
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> we have any automated way to restart it. If this is
>> just
>> > > > matter
>> > > > > of
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> routing
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> client requests to the available thrift servers then
>> this
>> > > can
>> > > > be
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> achieved
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> with the help of load balancer which I guess is already
>> > > there
>> > > > in
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> thrift
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> wish list.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>  We have multiple thrift services and currently we
>> start
>> > > > only
>> > > > > > one
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> instance
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> of them and each thrift service is a stateless
>> service. To
>> > > > keep
>> > > > > > the
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> high
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> availability we have to start multiple instances of
>> them
>> > in
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> production
>> > > > > > >>>>
>> > > > > > >>>>>  scenario. So for clients to get an available thrift
>> service
>> > we
>> > > > can
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> use
>> > > > > > >>>>
>> > > > > > >>>>>  zookeeper znodes to represent each available service.
>> There
>> > > are
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> some
>> > > > > > >>>>
>> > > > > > >>>>>  libraries which is doing similar[1] and I think we can
>> use
>> > > them
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> directly.
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> 2. As far as registering of different providers is
>> > concerned
>> > > > do
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>> you
>> > > > > > >>>>
>> > > > > > >>>>>  think for that we really need external store.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and
>> reliable
>> > and
>> > > > we
>> > > > > > have
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> to
>> > > > > > >>>>
>> > > > > > >>>>> do
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> very minimal amount of work to achieve all these
>> features
>> > to
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> Airavata
>> > > > > > >>>>
>> > > > > > >>>>>  because zookeeper handle all the complexity.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>  I have seen people using zookeeper more for state
>> > > management
>> > > > > in
>> > > > > > >>>>>>>>>> distributed environments.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>  +1, we might not be the most effective users of
>> > zookeeper
>> > > > > > because
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>> all
>> > > > > > >>>>
>> > > > > > >>>>> of
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>> our services are stateless services, but my point is to
>> > > > achieve
>> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal
>> > work.
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>    I would like to understand more how can we leverage
>> > > > > zookeeper
>> > > > > > in
>> > > > > > >>>>>>>>>> airavata to make system reliable.
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>  Regards,
>> > > > > > >>>>>>>>>> Gagan
>> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
>> > marpierc@iu.edu
>> > > >
>> > > > > > wrote:
>> > > > > > >>>>>>>>>>
>> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
>> > > Architecture
>> > > > > > list
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>> for
>> > > > > > >>>>
>> > > > > > >>>>>  additional comments.
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> Marlon
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Hi All,
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and
>> > how
>> > > to
>> > > > > use
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> it
>> > > > > > >>>>
>> > > > > > >>>>> in
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
>> > tolerance
>> > > > and
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> reliable
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> communication between our thrift services and
>> clients.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> Zookeeper
>> > > > > > >>>>
>> > > > > > >>>>> is a
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> communication
>> > > > > > >>>>
>> > > > > > >>>>>  between
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> distributed applications. This is like an in-memory
>> > file
>> > > > > > system
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> which
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  has
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have
>> small
>> > > > > amount
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> of
>> > > > > > >>>>
>> > > > > > >>>>> data
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  associated with it and these nodes are called znodes.
>> > > Clients
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> can
>> > > > > > >>>>
>> > > > > > >>>>>  connect
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update
>> these
>> > > > > znodes.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
>> > services
>> > > > and
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> these
>> > > > > > >>>>
>> > > > > > >>>>> can
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  go
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
>> > > > zookeeper
>> > > > > > to
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> store
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  these
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
>> > > > achieve
>> > > > > a
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> very
>> > > > > > >>>>
>> > > > > > >>>>>  reliable
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
>> > > discover
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> available
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  service
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
>> > change
>> > > > the
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> generated
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  thrift client code but we have to change the
>> locations we
>> > > are
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> invoking
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when the
>> thrift
>> > > > service
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> goes
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  down
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
>> > > > > > operations.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> With
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  this
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple
>> of
>> > > > > > airavata,
>> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> For specifically for gfac we can have different
>> types
>> > of
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> services
>> > > > > > >>>>
>> > > > > > >>>>> for
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  each
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved by
>> using
>> > > the
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> hierarchical
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
>> > > > gfac-thrift
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> service
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  to
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> orchestrator
>> > > > > > >>>>
>> > > > > > >>>>> can
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  discover the provider specific gfac thrift service and
>> > > route
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> the
>> > > > > > >>>>
>> > > > > > >>>>>  message to
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> the correct thrift service.
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> With this approach I think we simply have write
>> some
>> > > > client
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> code
>> > > > > > >>>>
>> > > > > > >>>>> in
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  thrift
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
>> installation
>> > > can
>> > > > > be
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> done as
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  a
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep the
>> > > > Zookeeper
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> server
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  separate from Airavata because installation of
>> Zookeeper
>> > > > server
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> little
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>>  complex in production scenario. I think we have to
>> make
>> > > sure
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> everything
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> enable.zookeeper=false
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to
>> download
>> > and
>> > > > > start
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>> zookeeper.
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>> Thanks
>> > > > > > >>>>>>>>>>>> Lahiru
>> > > > > > >>>>>>>>>>>>
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>>>>
>> > > > > > >>>>>>>>> --
>> > > > > > >>>>>>>>> System Analyst Programmer
>> > > > > > >>>>>>>>> PTI Lab
>> > > > > > >>>>>>>>> Indiana University
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>>>>
>> > > > > > >>>>>>> --
>> > > > > > >>>>>>> System Analyst Programmer
>> > > > > > >>>>>>> PTI Lab
>> > > > > > >>>>>>> Indiana University
>> > > > > > >>>>>>>
>> > > > > > >>>>>>>
>> > > > > > >>>>>>
>> > > > > > >>>>>> --
>> > > > > > >>>>>> Best Regards,
>> > > > > > >>>>>> Shameera Rathnayaka.
>> > > > > > >>>>>>
>> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT
>> gmail.com
>> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>> > > > > > >>>>>>
>> > > > > > >>>>>>
>> > > > > > >>>>>
>> > > > > > >>>>> --
>> > > > > > >>>>> Supun Kamburugamuva
>> > > > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
>> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > > > > > >>>>> Blog: http://supunk.blogspot.com
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>>>
>> > > > > > >>>> --
>> > > > > > >>>> System Analyst Programmer
>> > > > > > >>>> PTI Lab
>> > > > > > >>>> Indiana University
>> > > > > > >>>>
>> > > > > > >>>>
>> > > > > > >>>
>> > > > > > >>> --
>> > > > > > >>> Supun Kamburugamuva
>> > > > > > >>> Member, Apache Software Foundation; http://www.apache.org
>> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > > > > > >>> Blog: http://supunk.blogspot.com
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Supun Kamburugamuva
>> > > > > > Member, Apache Software Foundation; http://www.apache.org
>> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > > > > > Blog: http://supunk.blogspot.com
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > System Analyst Programmer
>> > > > > PTI Lab
>> > > > > Indiana University
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Supun Kamburugamuva
>> > > > Member, Apache Software Foundation; http://www.apache.org
>> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > > > Blog: http://supunk.blogspot.com
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > System Analyst Programmer
>> > > PTI Lab
>> > > Indiana University
>> > >
>> >
>>
>>
>>
>> --
>> Supun Kamburugamuva
>> Member, Apache Software Foundation; http://www.apache.org
>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> Blog: http://supunk.blogspot.com
>>
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Supun,

I think in this usecase we only use storm topology to do the communication
among workers and we are completely ignoring the stream processing part.
Orchestrator will talk to Nimbus and GFAC nodes will be Worker nodes in the
storm topology. But I think we can achieve extremely fault tolerance system
by directly using storm based on following statement in storm site with
minimum changes in airavata.

Additionally, the Nimbus daemon and Supervisor daemons are fail-fast and
stateless; all state is kept in Zookeeper or on local disk. This means you
can kill -9 Nimbus or the Supervisors and they’ll start back up like
nothing happened. This design leads to Storm clusters being incredibly
stable.




On Tue, Jun 17, 2014 at 3:02 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Eran,
>
> I'm using Storm every day and this is one of the strangest things I've
> heard about using Storm. My be there are more use cases for Storm other
> than Distributed Stream processing. AFAIK the Bolts, spouts are built to
> handle a stream of events that doesn't take much time to process. In
> Airavata we don't process the messages. Instead we run experiments based on
> the commands given.
>
> If you want process isolation, distributed execution, cluster resource
> management Yarn would be a better thing to explore.
>
> Thanks,
> Supun..
>
>
> On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
> eran.chinthaka@gmail.com> wrote:
>
> > Hi Lahiru,
> >
> > good summarization. Thanks Lahiru.
> >
> > I think you are trying to stick to a model where Orchestrator
> distributing
> > to work for GFac worker and trying to do the impedance mismatch through a
> > messaging solution. If you step back and think, we don't even want the
> > orchestrator to handle everything. From its point of view, it should
> submit
> > jobs to the framework, and will wait or get notified once the job is
> done.
> >
> > There are multiple ways of doing this. And here is one method.
> >
> > Orchestrator submits all its jobs to Job queue (implemented using any MQ
> > impl like Rabbit or Kafka). A storm topology is implemented to dequeue
> > messages, process them (i.e. submit those jobs and get those executed)
> and
> > notify the Orchestrator with the status (either through another
> > JobCompletionQueue or direct invocation).
> >
> > With this approach, the MQ provider will help to match impedance between
> > job submission and consumption. Storm helps with worker coordination,
> load
> > balancing, throttling on your job execution framework, worker pool
> > management and fault tolerance.
> >
> > Of course, you can implement this based only on ZK and handle everything
> > else on your own but storm had done exactly that with the use of ZK
> > underneath.
> >
> > Finally, if you go for a model like this, then even beyond job
> submission,
> > you can use the same model to do anything within the framework for
> internal
> > communication. For example, the workflow engine will submit its jobs to
> > queues based on what it has to do. Storm topologies exists for each
> queues
> > to dequeue messages and carry out the work in a reliable manner. Consider
> > this as mini-workflows within a larger workflow framework.
> >
> > We can have a voice chat if its more convenient. But not at 7am PST :)
> >
> >
> > Thanks,
> > Eran Chinthaka Withana
> >
> >
> > On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <gl...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Ignoring the tool that we are going to use to implement fault
> tolerance I
> > > have summarized the model we have decided so far. I will use the tool
> > name
> > > as X, we can use Zookeeper or some other implementation. Following
> design
> > > assume tool X  and Registry have high availability.
> > >
> > > 1. Orchestrator and GFAC worker node communication is going to be queue
> > > based and tool X is going to be used for this communication. (We have
> to
> > > implement this with considering race condition between different gfac
> > > workers).
> > > 2. We are having multiple instances of GFAC which are identical (In
> > future
> > > we can group gfac workers). Existence of each worker node is identified
> > > using X. If node goes down orchestrator will be notified by X.
> > > 3. When a particular request comes and accepted by one gfac worker that
> > > information will be replicated in tool X and a place where this
> > information
> > > is persisted even the worker failed.
> > > 4. When a job comes to a final state like failed or cancelled or
> > completed
> > > above information will be removed. So at a given time orchestrator can
> > poll
> > > active jobs in each worker by giving a worker ID.
> > > 5. Tool X will make sure that when a worker goes down it will notify
> > > orchestrator. During a worker failure, based on step 3 and 4
> orchestrator
> > > can poll all the active jobs of that worker and do the same thing like
> in
> > > step 1 (store the experiment ID to the queue) and gfac worker will pick
> > the
> > > jobs.
> > >
> > > 6. When GFAC receive a job like in step 5 it have to carefully evaluate
> > the
> > > state from registry and decide what to be done (If the job is pending
> > then
> > > gfac just have to monitor, if job state is like input transferred not
> > even
> > > submitted gfac has to execute rest of the chain and submit the job to
> the
> > > resource and start monitoring).
> > >
> > > If we can find a tool X which supports all these features and tool
> itself
> > > is fault tolerance and support atomicity, high availability and simply
> > API
> > > to implement we can use that tool.
> > >
> > > WDYT ?
> > >
> > > Lahiru
> > >
> > >
> > > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <
> supun06@gmail.com>
> > > wrote:
> > >
> > > > Hi Lahiru,
> > > >
> > > > Before moving with an implementation it may be worth to consider some
> > of
> > > > the following aspects as well.
> > > >
> > > > 1. How to report the progress of an experiment as state in ZooKeeper?
> > > What
> > > > happens if a GFac instance crashes while executing an experiment? Are
> > > there
> > > > check-points we can save so that another GFac instance can take over?
> > > > 2. What is the threading model of GFac instances? (I consider this
> as a
> > > > very important aspect)
> > > > 3. What are the information needed to be stored in the ZooKeeper? You
> > may
> > > > need to store other information about an experiment apart from its
> > > > experiment ID.
> > > > 4. How to report errors?
> > > > 5. For GFac weather you need a threading model or worker process
> model?
> > > >
> > > > Thanks,
> > > > Supun..
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <
> glahiru@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I think the conclusion is like this,
> > > > >
> > > > > 1, We make the gfac as a worker not a thrift service and we can
> start
> > > > > multiple workers either with bunch of providers and handlers
> > configured
> > > > in
> > > > > each worker or provider specific  workers to handle the class path
> > > issues
> > > > > (not the common scenario).
> > > > >
> > > > > 2. Gfac workers can be configured to watch for a given path in
> > > zookeeper,
> > > > > and multiple workers can listen to the same path. Default path can
> be
> > > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > > > /airavata/gfac/bes.
> > > > >
> > > > > 3. Orchestrator can configure with a logic to store experiment IDs
> in
> > > > > zookeeper with a path, and orchestrator can be configured to
> provider
> > > > > specific path logic too. So when a new request come orchestrator
> > store
> > > > the
> > > > > experimentID and these experiments IDs are stored in Zk as a queue.
> > > > >
> > > > > 4. Since gfac workers are watching they will be notified and as
> supun
> > > > > suggested can use a leader selection algorithm[1] and one gfac
> worker
> > > >  will
> > > > > take the leadership for each experiment. If there are gfac
> instances
> > > for
> > > > > each provider same logic will apply among those nodes with same
> > > provider
> > > > > type.
> > > > >
> > > > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > > > >
> > > > > I would like to implement this if there are  no objections.
> > > > >
> > > > > Lahiru
> > > > >
> > > > >
> > > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> > > supun06@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Marlon,
> > > > > >
> > > > > > I think you are exactly correct.
> > > > > >
> > > > > > Supun..
> > > > > >
> > > > > >
> > > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <marpierc@iu.edu
> >
> > > > wrote:
> > > > > >
> > > > > > > Let me restate this, and please tell me if I'm wrong.
> > > > > > >
> > > > > > > Orchestrator decides (somehow) that a particular job requires
> > > > JSDL/BES,
> > > > > > so
> > > > > > > it places the Experiment ID in Zookeeper's
> > /airavata/gfac/jsdl-bes
> > > > > node.
> > > > > > >  GFAC servers associated with this instance notice the update.
> >  The
> > > > > first
> > > > > > > GFAC to claim the job gets it, uses the Experiment ID to get
> the
> > > > > detailed
> > > > > > > information it needs from the Registry.  ZooKeeper handles the
> > > > locking,
> > > > > > etc
> > > > > > > to make sure that only one GFAC at a time is trying to handle
> an
> > > > > > experiment.
> > > > > > >
> > > > > > > Marlon
> > > > > > >
> > > > > > >
> > > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > > > >
> > > > > > >> Hi Supun,
> > > > > > >>
> > > > > > >> Thanks for the clarification.
> > > > > > >>
> > > > > > >> Regards
> > > > > > >> Lahiru
> > > > > > >>
> > > > > > >>
> > > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > > > supun06@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>  Hi Lahiru,
> > > > > > >>>
> > > > > > >>> My suggestion is that may be you don't need a Thrift service
> > > > between
> > > > > > >>> Orchestrator and the component executing the experiment.
> When a
> > > new
> > > > > > >>> experiment is submitted, orchestrator decides who can execute
> > > this
> > > > > job.
> > > > > > >>> Then it put the information about this experiment execution
> in
> > > > > > ZooKeeper.
> > > > > > >>> The component which wants to executes the experiment is
> > listening
> > > > to
> > > > > > this
> > > > > > >>> ZooKeeper path and when it sees the experiment it will
> execute
> > > it.
> > > > So
> > > > > > >>> that
> > > > > > >>> the communication happens through an state change in
> ZooKeeper.
> > > > This
> > > > > > can
> > > > > > >>> potentially simply your architecture.
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Supun.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > > > glahiru@gmail.com>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>>  Hi Supun,
> > > > > > >>>>
> > > > > > >>>> So your suggestion is to create a znode for each thrift
> > service
> > > we
> > > > > > have
> > > > > > >>>> and
> > > > > > >>>> when the request comes that node gets modified with input
> data
> > > for
> > > > > > that
> > > > > > >>>> request and thrift service is having a watch for that node
> and
> > > it
> > > > > will
> > > > > > >>>> be
> > > > > > >>>> notified because of the watch and it can read the input from
> > > > > zookeeper
> > > > > > >>>> and
> > > > > > >>>> invoke the operation?
> > > > > > >>>>
> > > > > > >>>> Lahiru
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > > > >>>> supun06@gmail.com>
> > > > > > >>>> wrote:
> > > > > > >>>>
> > > > > > >>>>  Hi all,
> > > > > > >>>>>
> > > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
> > Airavata
> > > > > there
> > > > > > >>>>> are
> > > > > > >>>>> many components and these components must be stateless to
> > > achieve
> > > > > > >>>>> scalability and reliability.Also there must be a mechanism
> to
> > > > > > >>>>>
> > > > > > >>>> communicate
> > > > > > >>>>
> > > > > > >>>>> between the components. At the moment Airavata uses RPC
> calls
> > > > based
> > > > > > on
> > > > > > >>>>> Thrift for the communication.
> > > > > > >>>>>
> > > > > > >>>>> ZooKeeper can be used both as a place to hold state and as
> a
> > > > > > >>>>>
> > > > > > >>>> communication
> > > > > > >>>>
> > > > > > >>>>> layer between the components. I'm involved with a project
> > that
> > > > has
> > > > > > many
> > > > > > >>>>> distributed components like AIravata. Right now we use
> Thrift
> > > > > > services
> > > > > > >>>>>
> > > > > > >>>> to
> > > > > > >>>>
> > > > > > >>>>> communicate among the components. But we find it difficult
> to
> > > use
> > > > > RPC
> > > > > > >>>>>
> > > > > > >>>> calls
> > > > > > >>>>
> > > > > > >>>>> and achieve stateless behaviour and thinking of replacing
> > > Thrift
> > > > > > >>>>>
> > > > > > >>>> services
> > > > > > >>>>
> > > > > > >>>>> with ZooKeeper based communication layer. So I think it is
> > > better
> > > > > to
> > > > > > >>>>> explore the possibility of removing the Thrift services
> > between
> > > > the
> > > > > > >>>>> components and use ZooKeeper as a communication mechanism
> > > between
> > > > > the
> > > > > > >>>>> services. If you do this you will have to move the state to
> > > > > ZooKeeper
> > > > > > >>>>>
> > > > > > >>>> and
> > > > > > >>>>
> > > > > > >>>>> will automatically achieve the stateless behaviour in the
> > > > > components.
> > > > > > >>>>>
> > > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad
> idea.
> > > If
> > > > we
> > > > > > are
> > > > > > >>>>> trying to integrate something fundamentally important to
> > > > > architecture
> > > > > > >>>>> as
> > > > > > >>>>> how to store state, we shouldn't make it optional.
> > > > > > >>>>>
> > > > > > >>>>> Thanks,
> > > > > > >>>>> Supun..
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > > > >>>>>
> > > > > > >>>>>  Hi Lahiru,
> > > > > > >>>>>>
> > > > > > >>>>>> As i understood,  not only reliability , you are trying to
> > > > achieve
> > > > > > >>>>>> some
> > > > > > >>>>>> other requirement by introducing zookeeper, like health
> > > > monitoring
> > > > > > of
> > > > > > >>>>>>
> > > > > > >>>>> the
> > > > > > >>>>
> > > > > > >>>>> services, categorization with service implementation etc
> ...
> > .
> > > In
> > > > > > that
> > > > > > >>>>>> case, i think we can get use of zookeeper's features but
> if
> > we
> > > > > only
> > > > > > >>>>>>
> > > > > > >>>>> focus
> > > > > > >>>>
> > > > > > >>>>> on reliability, i have little bit of concern, why can't we
> > use
> > > > > > >>>>>>
> > > > > > >>>>> clustering +
> > > > > > >>>>
> > > > > > >>>>> LB ?
> > > > > > >>>>>>
> > > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if
> user
> > > need
> > > > > to
> > > > > > >>>>>> use
> > > > > > >>>>>> it.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>>   Shameera.
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > > > >>>>>> glahiru@gmail.com
> > > > > > >>>>>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>  Hi Gagan,
> > > > > > >>>>>>>
> > > > > > >>>>>>> I need to start another discussion about it, but I had an
> > > > offline
> > > > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> > > another
> > > > > > >>>>>>> thread
> > > > > > >>>>>>> about this topic too.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Regards
> > > > > > >>>>>>> Lahiru
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > > > >>>>>>>
> > > > > > >>>>>> gagandeepjuneja@gmail.com
> > > > > > >>>>
> > > > > > >>>>> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > > > dictionary
> > > > > > >>>>>>>>
> > > > > > >>>>>>> :).
> > > > > > >>>>
> > > > > > >>>>>  I would like to know how are we planning to start multiple
> > > > > servers.
> > > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we
> call
> > it
> > > > as
> > > > > > auto
> > > > > > >>>>>>>> scalable.
> > > > > > >>>>>>>> 2. To make some specific number of nodes available such
> as
> > > we
> > > > > > want 2
> > > > > > >>>>>>>> servers to be available at any time so if one goes down
> > > then I
> > > > > > need
> > > > > > >>>>>>>>
> > > > > > >>>>>>> to
> > > > > > >>>>
> > > > > > >>>>>  spawn one new to make available servers count 2.
> > > > > > >>>>>>>> 3. Initially start all the servers.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I
> don't
> > > > > believe
> > > > > > >>>>>>>>
> > > > > > >>>>>>> existing
> > > > > > >>>>>>>
> > > > > > >>>>>>>> architecture support this?
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Regards,
> > > > > > >>>>>>>> Gagan
> > > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > > > glahiru@gmail.com
> > > > > >
> > > > > > >>>>>>>>
> > > > > > >>>>>>> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>> Hi Gagan,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Thanks for your response. Please see my inline
> comments.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > > > >>>>>>>
> > > > > > >>>>>>>> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>  Hi Lahiru,
> > > > > > >>>>>>>>>> Just my 2 cents.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
> > multiple
> > > > > hops
> > > > > > in
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> the
> > > > > > >>>>>>>
> > > > > > >>>>>>>> system which can add unnecessary complexity. Here I am
> not
> > > > able
> > > > > to
> > > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am
> > wrong
> > > > > > because
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> of
> > > > > > >>>>
> > > > > > >>>>> less
> > > > > > >>>>>>>
> > > > > > >>>>>>>> knowledge of the airavata system in whole. So I would
> like
> > > to
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> discuss
> > > > > > >>>>
> > > > > > >>>>>  following point.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > > > Zookeeper
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> is
> > > > > > >>>>
> > > > > > >>>>> not
> > > > > > >>>>>>>
> > > > > > >>>>>>>> able to restart services. At max it can tell whether
> > service
> > > > is
> > > > > up
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> or not
> > > > > > >>>>>>>
> > > > > > >>>>>>>> which could only be the case if airavata service goes
> down
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> gracefully and
> > > > > > >>>>>>>
> > > > > > >>>>>>>> we have any automated way to restart it. If this is just
> > > > matter
> > > > > of
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> routing
> > > > > > >>>>>>>
> > > > > > >>>>>>>> client requests to the available thrift servers then
> this
> > > can
> > > > be
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> achieved
> > > > > > >>>>>>>
> > > > > > >>>>>>>> with the help of load balancer which I guess is already
> > > there
> > > > in
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> thrift
> > > > > > >>>>>>>
> > > > > > >>>>>>>> wish list.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>  We have multiple thrift services and currently we
> start
> > > > only
> > > > > > one
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> instance
> > > > > > >>>>>>>
> > > > > > >>>>>>>> of them and each thrift service is a stateless service.
> To
> > > > keep
> > > > > > the
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> high
> > > > > > >>>>>>>
> > > > > > >>>>>>>> availability we have to start multiple instances of them
> > in
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> production
> > > > > > >>>>
> > > > > > >>>>>  scenario. So for clients to get an available thrift
> service
> > we
> > > > can
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> use
> > > > > > >>>>
> > > > > > >>>>>  zookeeper znodes to represent each available service.
> There
> > > are
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> some
> > > > > > >>>>
> > > > > > >>>>>  libraries which is doing similar[1] and I think we can use
> > > them
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> directly.
> > > > > > >>>>>>>
> > > > > > >>>>>>>> 2. As far as registering of different providers is
> > concerned
> > > > do
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>> you
> > > > > > >>>>
> > > > > > >>>>>  think for that we really need external store.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable
> > and
> > > > we
> > > > > > have
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> to
> > > > > > >>>>
> > > > > > >>>>> do
> > > > > > >>>>>>>
> > > > > > >>>>>>>> very minimal amount of work to achieve all these
> features
> > to
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> Airavata
> > > > > > >>>>
> > > > > > >>>>>  because zookeeper handle all the complexity.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>  I have seen people using zookeeper more for state
> > > management
> > > > > in
> > > > > > >>>>>>>>>> distributed environments.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>  +1, we might not be the most effective users of
> > zookeeper
> > > > > > because
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>> all
> > > > > > >>>>
> > > > > > >>>>> of
> > > > > > >>>>>>>
> > > > > > >>>>>>>> our services are stateless services, but my point is to
> > > > achieve
> > > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal
> > work.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > > > zookeeper
> > > > > > in
> > > > > > >>>>>>>>>> airavata to make system reliable.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>  Regards,
> > > > > > >>>>>>>>>> Gagan
> > > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> > marpierc@iu.edu
> > > >
> > > > > > wrote:
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> > > Architecture
> > > > > > list
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>> for
> > > > > > >>>>
> > > > > > >>>>>  additional comments.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Marlon
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Hi All,
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and
> > how
> > > to
> > > > > use
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> it
> > > > > > >>>>
> > > > > > >>>>> in
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
> > tolerance
> > > > and
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> reliable
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> communication between our thrift services and
> clients.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> Zookeeper
> > > > > > >>>>
> > > > > > >>>>> is a
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> communication
> > > > > > >>>>
> > > > > > >>>>>  between
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> distributed applications. This is like an in-memory
> > file
> > > > > > system
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> which
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  has
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have
> small
> > > > > amount
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> of
> > > > > > >>>>
> > > > > > >>>>> data
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  associated with it and these nodes are called znodes.
> > > Clients
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> can
> > > > > > >>>>
> > > > > > >>>>>  connect
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update
> these
> > > > > znodes.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
> > services
> > > > and
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> these
> > > > > > >>>>
> > > > > > >>>>> can
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  go
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > > > zookeeper
> > > > > > to
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> store
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  these
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > > > achieve
> > > > > a
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> very
> > > > > > >>>>
> > > > > > >>>>>  reliable
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> > > discover
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> available
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  service
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
> > change
> > > > the
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> generated
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  thrift client code but we have to change the locations
> we
> > > are
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> invoking
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > > > service
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> goes
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  down
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > > > operations.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> With
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  this
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple
> of
> > > > > > airavata,
> > > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> For specifically for gfac we can have different
> types
> > of
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> services
> > > > > > >>>>
> > > > > > >>>>> for
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  each
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> provider implementation. This can be achieved by
> using
> > > the
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> hierarchical
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > > > gfac-thrift
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> service
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  to
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> orchestrator
> > > > > > >>>>
> > > > > > >>>>> can
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  discover the provider specific gfac thrift service and
> > > route
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> the
> > > > > > >>>>
> > > > > > >>>>>  message to
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> the correct thrift service.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > > > client
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> code
> > > > > > >>>>
> > > > > > >>>>> in
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  thrift
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> services and clients and zookeeper server
> installation
> > > can
> > > > > be
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> done as
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  a
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > > > Zookeeper
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> server
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  separate from Airavata because installation of
> Zookeeper
> > > > server
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> little
> > > > > > >>>>>>>
> > > > > > >>>>>>>>  complex in production scenario. I think we have to make
> > > sure
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> everything
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> enable.zookeeper=false
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>> should works fine and users doesn't have to download
> > and
> > > > > start
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>> zookeeper.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Thanks
> > > > > > >>>>>>>>>>>> Lahiru
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>> --
> > > > > > >>>>>>>>> System Analyst Programmer
> > > > > > >>>>>>>>> PTI Lab
> > > > > > >>>>>>>>> Indiana University
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>> --
> > > > > > >>>>>>> System Analyst Programmer
> > > > > > >>>>>>> PTI Lab
> > > > > > >>>>>>> Indiana University
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> --
> > > > > > >>>>>> Best Regards,
> > > > > > >>>>>> Shameera Rathnayaka.
> > > > > > >>>>>>
> > > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>> --
> > > > > > >>>>> Supun Kamburugamuva
> > > > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > > >>>>> Blog: http://supunk.blogspot.com
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>> --
> > > > > > >>>> System Analyst Programmer
> > > > > > >>>> PTI Lab
> > > > > > >>>> Indiana University
> > > > > > >>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>> Supun Kamburugamuva
> > > > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > > >>> Blog: http://supunk.blogspot.com
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Supun Kamburugamuva
> > > > > > Member, Apache Software Foundation; http://www.apache.org
> > > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > > Blog: http://supunk.blogspot.com
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > System Analyst Programmer
> > > > > PTI Lab
> > > > > Indiana University
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Supun Kamburugamuva
> > > > Member, Apache Software Foundation; http://www.apache.org
> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > Blog: http://supunk.blogspot.com
> > > >
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Eran,

I'm using Storm every day and this is one of the strangest things I've
heard about using Storm. My be there are more use cases for Storm other
than Distributed Stream processing. AFAIK the Bolts, spouts are built to
handle a stream of events that doesn't take much time to process. In
Airavata we don't process the messages. Instead we run experiments based on
the commands given.

If you want process isolation, distributed execution, cluster resource
management Yarn would be a better thing to explore.

Thanks,
Supun..


On Tue, Jun 17, 2014 at 2:27 PM, Eran Chinthaka Withana <
eran.chinthaka@gmail.com> wrote:

> Hi Lahiru,
>
> good summarization. Thanks Lahiru.
>
> I think you are trying to stick to a model where Orchestrator distributing
> to work for GFac worker and trying to do the impedance mismatch through a
> messaging solution. If you step back and think, we don't even want the
> orchestrator to handle everything. From its point of view, it should submit
> jobs to the framework, and will wait or get notified once the job is done.
>
> There are multiple ways of doing this. And here is one method.
>
> Orchestrator submits all its jobs to Job queue (implemented using any MQ
> impl like Rabbit or Kafka). A storm topology is implemented to dequeue
> messages, process them (i.e. submit those jobs and get those executed) and
> notify the Orchestrator with the status (either through another
> JobCompletionQueue or direct invocation).
>
> With this approach, the MQ provider will help to match impedance between
> job submission and consumption. Storm helps with worker coordination, load
> balancing, throttling on your job execution framework, worker pool
> management and fault tolerance.
>
> Of course, you can implement this based only on ZK and handle everything
> else on your own but storm had done exactly that with the use of ZK
> underneath.
>
> Finally, if you go for a model like this, then even beyond job submission,
> you can use the same model to do anything within the framework for internal
> communication. For example, the workflow engine will submit its jobs to
> queues based on what it has to do. Storm topologies exists for each queues
> to dequeue messages and carry out the work in a reliable manner. Consider
> this as mini-workflows within a larger workflow framework.
>
> We can have a voice chat if its more convenient. But not at 7am PST :)
>
>
> Thanks,
> Eran Chinthaka Withana
>
>
> On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > Ignoring the tool that we are going to use to implement fault tolerance I
> > have summarized the model we have decided so far. I will use the tool
> name
> > as X, we can use Zookeeper or some other implementation. Following design
> > assume tool X  and Registry have high availability.
> >
> > 1. Orchestrator and GFAC worker node communication is going to be queue
> > based and tool X is going to be used for this communication. (We have to
> > implement this with considering race condition between different gfac
> > workers).
> > 2. We are having multiple instances of GFAC which are identical (In
> future
> > we can group gfac workers). Existence of each worker node is identified
> > using X. If node goes down orchestrator will be notified by X.
> > 3. When a particular request comes and accepted by one gfac worker that
> > information will be replicated in tool X and a place where this
> information
> > is persisted even the worker failed.
> > 4. When a job comes to a final state like failed or cancelled or
> completed
> > above information will be removed. So at a given time orchestrator can
> poll
> > active jobs in each worker by giving a worker ID.
> > 5. Tool X will make sure that when a worker goes down it will notify
> > orchestrator. During a worker failure, based on step 3 and 4 orchestrator
> > can poll all the active jobs of that worker and do the same thing like in
> > step 1 (store the experiment ID to the queue) and gfac worker will pick
> the
> > jobs.
> >
> > 6. When GFAC receive a job like in step 5 it have to carefully evaluate
> the
> > state from registry and decide what to be done (If the job is pending
> then
> > gfac just have to monitor, if job state is like input transferred not
> even
> > submitted gfac has to execute rest of the chain and submit the job to the
> > resource and start monitoring).
> >
> > If we can find a tool X which supports all these features and tool itself
> > is fault tolerance and support atomicity, high availability and simply
> API
> > to implement we can use that tool.
> >
> > WDYT ?
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
> > wrote:
> >
> > > Hi Lahiru,
> > >
> > > Before moving with an implementation it may be worth to consider some
> of
> > > the following aspects as well.
> > >
> > > 1. How to report the progress of an experiment as state in ZooKeeper?
> > What
> > > happens if a GFac instance crashes while executing an experiment? Are
> > there
> > > check-points we can save so that another GFac instance can take over?
> > > 2. What is the threading model of GFac instances? (I consider this as a
> > > very important aspect)
> > > 3. What are the information needed to be stored in the ZooKeeper? You
> may
> > > need to store other information about an experiment apart from its
> > > experiment ID.
> > > 4. How to report errors?
> > > 5. For GFac weather you need a threading model or worker process model?
> > >
> > > Thanks,
> > > Supun..
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <glahiru@gmail.com
> >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I think the conclusion is like this,
> > > >
> > > > 1, We make the gfac as a worker not a thrift service and we can start
> > > > multiple workers either with bunch of providers and handlers
> configured
> > > in
> > > > each worker or provider specific  workers to handle the class path
> > issues
> > > > (not the common scenario).
> > > >
> > > > 2. Gfac workers can be configured to watch for a given path in
> > zookeeper,
> > > > and multiple workers can listen to the same path. Default path can be
> > > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > > /airavata/gfac/bes.
> > > >
> > > > 3. Orchestrator can configure with a logic to store experiment IDs in
> > > > zookeeper with a path, and orchestrator can be configured to provider
> > > > specific path logic too. So when a new request come orchestrator
> store
> > > the
> > > > experimentID and these experiments IDs are stored in Zk as a queue.
> > > >
> > > > 4. Since gfac workers are watching they will be notified and as supun
> > > > suggested can use a leader selection algorithm[1] and one gfac worker
> > >  will
> > > > take the leadership for each experiment. If there are gfac instances
> > for
> > > > each provider same logic will apply among those nodes with same
> > provider
> > > > type.
> > > >
> > > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > > >
> > > > I would like to implement this if there are  no objections.
> > > >
> > > > Lahiru
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> > supun06@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Marlon,
> > > > >
> > > > > I think you are exactly correct.
> > > > >
> > > > > Supun..
> > > > >
> > > > >
> > > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> > > wrote:
> > > > >
> > > > > > Let me restate this, and please tell me if I'm wrong.
> > > > > >
> > > > > > Orchestrator decides (somehow) that a particular job requires
> > > JSDL/BES,
> > > > > so
> > > > > > it places the Experiment ID in Zookeeper's
> /airavata/gfac/jsdl-bes
> > > > node.
> > > > > >  GFAC servers associated with this instance notice the update.
>  The
> > > > first
> > > > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > > > detailed
> > > > > > information it needs from the Registry.  ZooKeeper handles the
> > > locking,
> > > > > etc
> > > > > > to make sure that only one GFAC at a time is trying to handle an
> > > > > experiment.
> > > > > >
> > > > > > Marlon
> > > > > >
> > > > > >
> > > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > > >
> > > > > >> Hi Supun,
> > > > > >>
> > > > > >> Thanks for the clarification.
> > > > > >>
> > > > > >> Regards
> > > > > >> Lahiru
> > > > > >>
> > > > > >>
> > > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > > supun06@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>  Hi Lahiru,
> > > > > >>>
> > > > > >>> My suggestion is that may be you don't need a Thrift service
> > > between
> > > > > >>> Orchestrator and the component executing the experiment. When a
> > new
> > > > > >>> experiment is submitted, orchestrator decides who can execute
> > this
> > > > job.
> > > > > >>> Then it put the information about this experiment execution in
> > > > > ZooKeeper.
> > > > > >>> The component which wants to executes the experiment is
> listening
> > > to
> > > > > this
> > > > > >>> ZooKeeper path and when it sees the experiment it will execute
> > it.
> > > So
> > > > > >>> that
> > > > > >>> the communication happens through an state change in ZooKeeper.
> > > This
> > > > > can
> > > > > >>> potentially simply your architecture.
> > > > > >>>
> > > > > >>> Thanks,
> > > > > >>> Supun.
> > > > > >>>
> > > > > >>>
> > > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > > glahiru@gmail.com>
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>>  Hi Supun,
> > > > > >>>>
> > > > > >>>> So your suggestion is to create a znode for each thrift
> service
> > we
> > > > > have
> > > > > >>>> and
> > > > > >>>> when the request comes that node gets modified with input data
> > for
> > > > > that
> > > > > >>>> request and thrift service is having a watch for that node and
> > it
> > > > will
> > > > > >>>> be
> > > > > >>>> notified because of the watch and it can read the input from
> > > > zookeeper
> > > > > >>>> and
> > > > > >>>> invoke the operation?
> > > > > >>>>
> > > > > >>>> Lahiru
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > > >>>> supun06@gmail.com>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>  Hi all,
> > > > > >>>>>
> > > > > >>>>> Here is what I think about Airavata and ZooKeeper. In
> Airavata
> > > > there
> > > > > >>>>> are
> > > > > >>>>> many components and these components must be stateless to
> > achieve
> > > > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > > > >>>>>
> > > > > >>>> communicate
> > > > > >>>>
> > > > > >>>>> between the components. At the moment Airavata uses RPC calls
> > > based
> > > > > on
> > > > > >>>>> Thrift for the communication.
> > > > > >>>>>
> > > > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > > > >>>>>
> > > > > >>>> communication
> > > > > >>>>
> > > > > >>>>> layer between the components. I'm involved with a project
> that
> > > has
> > > > > many
> > > > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > > > services
> > > > > >>>>>
> > > > > >>>> to
> > > > > >>>>
> > > > > >>>>> communicate among the components. But we find it difficult to
> > use
> > > > RPC
> > > > > >>>>>
> > > > > >>>> calls
> > > > > >>>>
> > > > > >>>>> and achieve stateless behaviour and thinking of replacing
> > Thrift
> > > > > >>>>>
> > > > > >>>> services
> > > > > >>>>
> > > > > >>>>> with ZooKeeper based communication layer. So I think it is
> > better
> > > > to
> > > > > >>>>> explore the possibility of removing the Thrift services
> between
> > > the
> > > > > >>>>> components and use ZooKeeper as a communication mechanism
> > between
> > > > the
> > > > > >>>>> services. If you do this you will have to move the state to
> > > > ZooKeeper
> > > > > >>>>>
> > > > > >>>> and
> > > > > >>>>
> > > > > >>>>> will automatically achieve the stateless behaviour in the
> > > > components.
> > > > > >>>>>
> > > > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea.
> > If
> > > we
> > > > > are
> > > > > >>>>> trying to integrate something fundamentally important to
> > > > architecture
> > > > > >>>>> as
> > > > > >>>>> how to store state, we shouldn't make it optional.
> > > > > >>>>>
> > > > > >>>>> Thanks,
> > > > > >>>>> Supun..
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > > >>>>>
> > > > > >>>>>  Hi Lahiru,
> > > > > >>>>>>
> > > > > >>>>>> As i understood,  not only reliability , you are trying to
> > > achieve
> > > > > >>>>>> some
> > > > > >>>>>> other requirement by introducing zookeeper, like health
> > > monitoring
> > > > > of
> > > > > >>>>>>
> > > > > >>>>> the
> > > > > >>>>
> > > > > >>>>> services, categorization with service implementation etc ...
> .
> > In
> > > > > that
> > > > > >>>>>> case, i think we can get use of zookeeper's features but if
> we
> > > > only
> > > > > >>>>>>
> > > > > >>>>> focus
> > > > > >>>>
> > > > > >>>>> on reliability, i have little bit of concern, why can't we
> use
> > > > > >>>>>>
> > > > > >>>>> clustering +
> > > > > >>>>
> > > > > >>>>> LB ?
> > > > > >>>>>>
> > > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user
> > need
> > > > to
> > > > > >>>>>> use
> > > > > >>>>>> it.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>>   Shameera.
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > > >>>>>> glahiru@gmail.com
> > > > > >>>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>  Hi Gagan,
> > > > > >>>>>>>
> > > > > >>>>>>> I need to start another discussion about it, but I had an
> > > offline
> > > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> > another
> > > > > >>>>>>> thread
> > > > > >>>>>>> about this topic too.
> > > > > >>>>>>>
> > > > > >>>>>>> Regards
> > > > > >>>>>>> Lahiru
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > > >>>>>>>
> > > > > >>>>>> gagandeepjuneja@gmail.com
> > > > > >>>>
> > > > > >>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > > dictionary
> > > > > >>>>>>>>
> > > > > >>>>>>> :).
> > > > > >>>>
> > > > > >>>>>  I would like to know how are we planning to start multiple
> > > > servers.
> > > > > >>>>>>>> 1. Spawning new servers based on load? Some times we call
> it
> > > as
> > > > > auto
> > > > > >>>>>>>> scalable.
> > > > > >>>>>>>> 2. To make some specific number of nodes available such as
> > we
> > > > > want 2
> > > > > >>>>>>>> servers to be available at any time so if one goes down
> > then I
> > > > > need
> > > > > >>>>>>>>
> > > > > >>>>>>> to
> > > > > >>>>
> > > > > >>>>>  spawn one new to make available servers count 2.
> > > > > >>>>>>>> 3. Initially start all the servers.
> > > > > >>>>>>>>
> > > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > > > believe
> > > > > >>>>>>>>
> > > > > >>>>>>> existing
> > > > > >>>>>>>
> > > > > >>>>>>>> architecture support this?
> > > > > >>>>>>>>
> > > > > >>>>>>>> Regards,
> > > > > >>>>>>>> Gagan
> > > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > > glahiru@gmail.com
> > > > >
> > > > > >>>>>>>>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Hi Gagan,
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > > >>>>>>>>>
> > > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > > >>>>>>>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  Hi Lahiru,
> > > > > >>>>>>>>>> Just my 2 cents.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I am big fan of zookeeper but also against adding
> multiple
> > > > hops
> > > > > in
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> the
> > > > > >>>>>>>
> > > > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> > > able
> > > > to
> > > > > >>>>>>>>>> understand the requirement of zookeeper may be I am
> wrong
> > > > > because
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> of
> > > > > >>>>
> > > > > >>>>> less
> > > > > >>>>>>>
> > > > > >>>>>>>> knowledge of the airavata system in whole. So I would like
> > to
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> discuss
> > > > > >>>>
> > > > > >>>>>  following point.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > > Zookeeper
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> is
> > > > > >>>>
> > > > > >>>>> not
> > > > > >>>>>>>
> > > > > >>>>>>>> able to restart services. At max it can tell whether
> service
> > > is
> > > > up
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> or not
> > > > > >>>>>>>
> > > > > >>>>>>>> which could only be the case if airavata service goes down
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> gracefully and
> > > > > >>>>>>>
> > > > > >>>>>>>> we have any automated way to restart it. If this is just
> > > matter
> > > > of
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> routing
> > > > > >>>>>>>
> > > > > >>>>>>>> client requests to the available thrift servers then this
> > can
> > > be
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> achieved
> > > > > >>>>>>>
> > > > > >>>>>>>> with the help of load balancer which I guess is already
> > there
> > > in
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> thrift
> > > > > >>>>>>>
> > > > > >>>>>>>> wish list.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  We have multiple thrift services and currently we start
> > > only
> > > > > one
> > > > > >>>>>>>>>
> > > > > >>>>>>>> instance
> > > > > >>>>>>>
> > > > > >>>>>>>> of them and each thrift service is a stateless service. To
> > > keep
> > > > > the
> > > > > >>>>>>>>>
> > > > > >>>>>>>> high
> > > > > >>>>>>>
> > > > > >>>>>>>> availability we have to start multiple instances of them
> in
> > > > > >>>>>>>>>
> > > > > >>>>>>>> production
> > > > > >>>>
> > > > > >>>>>  scenario. So for clients to get an available thrift service
> we
> > > can
> > > > > >>>>>>>>>
> > > > > >>>>>>>> use
> > > > > >>>>
> > > > > >>>>>  zookeeper znodes to represent each available service. There
> > are
> > > > > >>>>>>>>>
> > > > > >>>>>>>> some
> > > > > >>>>
> > > > > >>>>>  libraries which is doing similar[1] and I think we can use
> > them
> > > > > >>>>>>>>>
> > > > > >>>>>>>> directly.
> > > > > >>>>>>>
> > > > > >>>>>>>> 2. As far as registering of different providers is
> concerned
> > > do
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> you
> > > > > >>>>
> > > > > >>>>>  think for that we really need external store.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable
> and
> > > we
> > > > > have
> > > > > >>>>>>>>>
> > > > > >>>>>>>> to
> > > > > >>>>
> > > > > >>>>> do
> > > > > >>>>>>>
> > > > > >>>>>>>> very minimal amount of work to achieve all these features
> to
> > > > > >>>>>>>>>
> > > > > >>>>>>>> Airavata
> > > > > >>>>
> > > > > >>>>>  because zookeeper handle all the complexity.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  I have seen people using zookeeper more for state
> > management
> > > > in
> > > > > >>>>>>>>>> distributed environments.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  +1, we might not be the most effective users of
> zookeeper
> > > > > because
> > > > > >>>>>>>>>
> > > > > >>>>>>>> all
> > > > > >>>>
> > > > > >>>>> of
> > > > > >>>>>>>
> > > > > >>>>>>>> our services are stateless services, but my point is to
> > > achieve
> > > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal
> work.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > > zookeeper
> > > > > in
> > > > > >>>>>>>>>> airavata to make system reliable.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>  Regards,
> > > > > >>>>>>>>>> Gagan
> > > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <
> marpierc@iu.edu
> > >
> > > > > wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> > Architecture
> > > > > list
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>> for
> > > > > >>>>
> > > > > >>>>>  additional comments.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Marlon
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> Hi All,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and
> how
> > to
> > > > use
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> it
> > > > > >>>>
> > > > > >>>>> in
> > > > > >>>>>>>
> > > > > >>>>>>>>  airavata. Its really a nice way to achieve fault
> tolerance
> > > and
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> reliable
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> Zookeeper
> > > > > >>>>
> > > > > >>>>> is a
> > > > > >>>>>>>
> > > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> communication
> > > > > >>>>
> > > > > >>>>>  between
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> distributed applications. This is like an in-memory
> file
> > > > > system
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> which
> > > > > >>>>>>>
> > > > > >>>>>>>>  has
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > > > amount
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> of
> > > > > >>>>
> > > > > >>>>> data
> > > > > >>>>>>>
> > > > > >>>>>>>>  associated with it and these nodes are called znodes.
> > Clients
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> can
> > > > > >>>>
> > > > > >>>>>  connect
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > > > znodes.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift
> services
> > > and
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> these
> > > > > >>>>
> > > > > >>>>> can
> > > > > >>>>>>>
> > > > > >>>>>>>>  go
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > > zookeeper
> > > > > to
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> store
> > > > > >>>>>>>
> > > > > >>>>>>>>  these
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > > achieve
> > > > a
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> very
> > > > > >>>>
> > > > > >>>>>  reliable
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> > discover
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> available
> > > > > >>>>>>>
> > > > > >>>>>>>>  service
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to
> change
> > > the
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> generated
> > > > > >>>>>>>
> > > > > >>>>>>>>  thrift client code but we have to change the locations we
> > are
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> invoking
> > > > > >>>>>>>
> > > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > > service
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> goes
> > > > > >>>>>>>
> > > > > >>>>>>>>  down
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > > operations.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> With
> > > > > >>>>>>>
> > > > > >>>>>>>>  this
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > > > airavata,
> > > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> For specifically for gfac we can have different types
> of
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> services
> > > > > >>>>
> > > > > >>>>> for
> > > > > >>>>>>>
> > > > > >>>>>>>>  each
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> provider implementation. This can be achieved by using
> > the
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> hierarchical
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > > gfac-thrift
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> service
> > > > > >>>>>>>
> > > > > >>>>>>>>  to
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> orchestrator
> > > > > >>>>
> > > > > >>>>> can
> > > > > >>>>>>>
> > > > > >>>>>>>>  discover the provider specific gfac thrift service and
> > route
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> the
> > > > > >>>>
> > > > > >>>>>  message to
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> the correct thrift service.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > > client
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> code
> > > > > >>>>
> > > > > >>>>> in
> > > > > >>>>>>>
> > > > > >>>>>>>>  thrift
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> services and clients and zookeeper server installation
> > can
> > > > be
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> done as
> > > > > >>>>>>>
> > > > > >>>>>>>>  a
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > > Zookeeper
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> server
> > > > > >>>>>>>
> > > > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> > > server
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> little
> > > > > >>>>>>>
> > > > > >>>>>>>>  complex in production scenario. I think we have to make
> > sure
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> everything
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> enable.zookeeper=false
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> should works fine and users doesn't have to download
> and
> > > > start
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>> zookeeper.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks
> > > > > >>>>>>>>>>>> Lahiru
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>> --
> > > > > >>>>>>>>> System Analyst Programmer
> > > > > >>>>>>>>> PTI Lab
> > > > > >>>>>>>>> Indiana University
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>> --
> > > > > >>>>>>> System Analyst Programmer
> > > > > >>>>>>> PTI Lab
> > > > > >>>>>>> Indiana University
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>> --
> > > > > >>>>>> Best Regards,
> > > > > >>>>>> Shameera Rathnayaka.
> > > > > >>>>>>
> > > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > > >>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>> --
> > > > > >>>>> Supun Kamburugamuva
> > > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > >>>>> Blog: http://supunk.blogspot.com
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>> --
> > > > > >>>> System Analyst Programmer
> > > > > >>>> PTI Lab
> > > > > >>>> Indiana University
> > > > > >>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>> --
> > > > > >>> Supun Kamburugamuva
> > > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > >>> Blog: http://supunk.blogspot.com
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Supun Kamburugamuva
> > > > > Member, Apache Software Foundation; http://www.apache.org
> > > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > > Blog: http://supunk.blogspot.com
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > System Analyst Programmer
> > > > PTI Lab
> > > > Indiana University
> > > >
> > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Eran Chinthaka Withana <er...@gmail.com>.

Hi Lahiru,

good summarization. Thanks Lahiru.

I think you are trying to stick to a model where Orchestrator distributing
to work for GFac worker and trying to do the impedance mismatch through a
messaging solution. If you step back and think, we don't even want the
orchestrator to handle everything. From its point of view, it should submit
jobs to the framework, and will wait or get notified once the job is done.

There are multiple ways of doing this. And here is one method.

Orchestrator submits all its jobs to Job queue (implemented using any MQ
impl like Rabbit or Kafka). A storm topology is implemented to dequeue
messages, process them (i.e. submit those jobs and get those executed) and
notify the Orchestrator with the status (either through another
JobCompletionQueue or direct invocation).

With this approach, the MQ provider will help to match impedance between
job submission and consumption. Storm helps with worker coordination, load
balancing, throttling on your job execution framework, worker pool
management and fault tolerance.

Of course, you can implement this based only on ZK and handle everything
else on your own but storm had done exactly that with the use of ZK
underneath.

Finally, if you go for a model like this, then even beyond job submission,
you can use the same model to do anything within the framework for internal
communication. For example, the workflow engine will submit its jobs to
queues based on what it has to do. Storm topologies exists for each queues
to dequeue messages and carry out the work in a reliable manner. Consider
this as mini-workflows within a larger workflow framework.

We can have a voice chat if its more convenient. But not at 7am PST :)


Thanks,
Eran Chinthaka Withana


On Tue, Jun 17, 2014 at 10:12 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi All,
>
> Ignoring the tool that we are going to use to implement fault tolerance I
> have summarized the model we have decided so far. I will use the tool name
> as X, we can use Zookeeper or some other implementation. Following design
> assume tool X  and Registry have high availability.
>
> 1. Orchestrator and GFAC worker node communication is going to be queue
> based and tool X is going to be used for this communication. (We have to
> implement this with considering race condition between different gfac
> workers).
> 2. We are having multiple instances of GFAC which are identical (In future
> we can group gfac workers). Existence of each worker node is identified
> using X. If node goes down orchestrator will be notified by X.
> 3. When a particular request comes and accepted by one gfac worker that
> information will be replicated in tool X and a place where this information
> is persisted even the worker failed.
> 4. When a job comes to a final state like failed or cancelled or completed
> above information will be removed. So at a given time orchestrator can poll
> active jobs in each worker by giving a worker ID.
> 5. Tool X will make sure that when a worker goes down it will notify
> orchestrator. During a worker failure, based on step 3 and 4 orchestrator
> can poll all the active jobs of that worker and do the same thing like in
> step 1 (store the experiment ID to the queue) and gfac worker will pick the
> jobs.
>
> 6. When GFAC receive a job like in step 5 it have to carefully evaluate the
> state from registry and decide what to be done (If the job is pending then
> gfac just have to monitor, if job state is like input transferred not even
> submitted gfac has to execute rest of the chain and submit the job to the
> resource and start monitoring).
>
> If we can find a tool X which supports all these features and tool itself
> is fault tolerance and support atomicity, high availability and simply API
> to implement we can use that tool.
>
> WDYT ?
>
> Lahiru
>
>
> On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi Lahiru,
> >
> > Before moving with an implementation it may be worth to consider some of
> > the following aspects as well.
> >
> > 1. How to report the progress of an experiment as state in ZooKeeper?
> What
> > happens if a GFac instance crashes while executing an experiment? Are
> there
> > check-points we can save so that another GFac instance can take over?
> > 2. What is the threading model of GFac instances? (I consider this as a
> > very important aspect)
> > 3. What are the information needed to be stored in the ZooKeeper? You may
> > need to store other information about an experiment apart from its
> > experiment ID.
> > 4. How to report errors?
> > 5. For GFac weather you need a threading model or worker process model?
> >
> > Thanks,
> > Supun..
> >
> >
> >
> >
> >
> > On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <gl...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I think the conclusion is like this,
> > >
> > > 1, We make the gfac as a worker not a thrift service and we can start
> > > multiple workers either with bunch of providers and handlers configured
> > in
> > > each worker or provider specific  workers to handle the class path
> issues
> > > (not the common scenario).
> > >
> > > 2. Gfac workers can be configured to watch for a given path in
> zookeeper,
> > > and multiple workers can listen to the same path. Default path can be
> > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > /airavata/gfac/bes.
> > >
> > > 3. Orchestrator can configure with a logic to store experiment IDs in
> > > zookeeper with a path, and orchestrator can be configured to provider
> > > specific path logic too. So when a new request come orchestrator store
> > the
> > > experimentID and these experiments IDs are stored in Zk as a queue.
> > >
> > > 4. Since gfac workers are watching they will be notified and as supun
> > > suggested can use a leader selection algorithm[1] and one gfac worker
> >  will
> > > take the leadership for each experiment. If there are gfac instances
> for
> > > each provider same logic will apply among those nodes with same
> provider
> > > type.
> > >
> > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > >
> > > I would like to implement this if there are  no objections.
> > >
> > > Lahiru
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> supun06@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Marlon,
> > > >
> > > > I think you are exactly correct.
> > > >
> > > > Supun..
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> > wrote:
> > > >
> > > > > Let me restate this, and please tell me if I'm wrong.
> > > > >
> > > > > Orchestrator decides (somehow) that a particular job requires
> > JSDL/BES,
> > > > so
> > > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > > node.
> > > > >  GFAC servers associated with this instance notice the update.  The
> > > first
> > > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > > detailed
> > > > > information it needs from the Registry.  ZooKeeper handles the
> > locking,
> > > > etc
> > > > > to make sure that only one GFAC at a time is trying to handle an
> > > > experiment.
> > > > >
> > > > > Marlon
> > > > >
> > > > >
> > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > >
> > > > >> Hi Supun,
> > > > >>
> > > > >> Thanks for the clarification.
> > > > >>
> > > > >> Regards
> > > > >> Lahiru
> > > > >>
> > > > >>
> > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > supun06@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>  Hi Lahiru,
> > > > >>>
> > > > >>> My suggestion is that may be you don't need a Thrift service
> > between
> > > > >>> Orchestrator and the component executing the experiment. When a
> new
> > > > >>> experiment is submitted, orchestrator decides who can execute
> this
> > > job.
> > > > >>> Then it put the information about this experiment execution in
> > > > ZooKeeper.
> > > > >>> The component which wants to executes the experiment is listening
> > to
> > > > this
> > > > >>> ZooKeeper path and when it sees the experiment it will execute
> it.
> > So
> > > > >>> that
> > > > >>> the communication happens through an state change in ZooKeeper.
> > This
> > > > can
> > > > >>> potentially simply your architecture.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Supun.
> > > > >>>
> > > > >>>
> > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > glahiru@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>>  Hi Supun,
> > > > >>>>
> > > > >>>> So your suggestion is to create a znode for each thrift service
> we
> > > > have
> > > > >>>> and
> > > > >>>> when the request comes that node gets modified with input data
> for
> > > > that
> > > > >>>> request and thrift service is having a watch for that node and
> it
> > > will
> > > > >>>> be
> > > > >>>> notified because of the watch and it can read the input from
> > > zookeeper
> > > > >>>> and
> > > > >>>> invoke the operation?
> > > > >>>>
> > > > >>>> Lahiru
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > >>>> supun06@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>  Hi all,
> > > > >>>>>
> > > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > > there
> > > > >>>>> are
> > > > >>>>> many components and these components must be stateless to
> achieve
> > > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > > >>>>>
> > > > >>>> communicate
> > > > >>>>
> > > > >>>>> between the components. At the moment Airavata uses RPC calls
> > based
> > > > on
> > > > >>>>> Thrift for the communication.
> > > > >>>>>
> > > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > > >>>>>
> > > > >>>> communication
> > > > >>>>
> > > > >>>>> layer between the components. I'm involved with a project that
> > has
> > > > many
> > > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > > services
> > > > >>>>>
> > > > >>>> to
> > > > >>>>
> > > > >>>>> communicate among the components. But we find it difficult to
> use
> > > RPC
> > > > >>>>>
> > > > >>>> calls
> > > > >>>>
> > > > >>>>> and achieve stateless behaviour and thinking of replacing
> Thrift
> > > > >>>>>
> > > > >>>> services
> > > > >>>>
> > > > >>>>> with ZooKeeper based communication layer. So I think it is
> better
> > > to
> > > > >>>>> explore the possibility of removing the Thrift services between
> > the
> > > > >>>>> components and use ZooKeeper as a communication mechanism
> between
> > > the
> > > > >>>>> services. If you do this you will have to move the state to
> > > ZooKeeper
> > > > >>>>>
> > > > >>>> and
> > > > >>>>
> > > > >>>>> will automatically achieve the stateless behaviour in the
> > > components.
> > > > >>>>>
> > > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea.
> If
> > we
> > > > are
> > > > >>>>> trying to integrate something fundamentally important to
> > > architecture
> > > > >>>>> as
> > > > >>>>> how to store state, we shouldn't make it optional.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Supun..
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>  Hi Lahiru,
> > > > >>>>>>
> > > > >>>>>> As i understood,  not only reliability , you are trying to
> > achieve
> > > > >>>>>> some
> > > > >>>>>> other requirement by introducing zookeeper, like health
> > monitoring
> > > > of
> > > > >>>>>>
> > > > >>>>> the
> > > > >>>>
> > > > >>>>> services, categorization with service implementation etc ... .
> In
> > > > that
> > > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > > only
> > > > >>>>>>
> > > > >>>>> focus
> > > > >>>>
> > > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > > >>>>>>
> > > > >>>>> clustering +
> > > > >>>>
> > > > >>>>> LB ?
> > > > >>>>>>
> > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user
> need
> > > to
> > > > >>>>>> use
> > > > >>>>>> it.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>>   Shameera.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > >>>>>> glahiru@gmail.com
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>  Hi Gagan,
> > > > >>>>>>>
> > > > >>>>>>> I need to start another discussion about it, but I had an
> > offline
> > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> another
> > > > >>>>>>> thread
> > > > >>>>>>> about this topic too.
> > > > >>>>>>>
> > > > >>>>>>> Regards
> > > > >>>>>>> Lahiru
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > >>>>>>>
> > > > >>>>>> gagandeepjuneja@gmail.com
> > > > >>>>
> > > > >>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > dictionary
> > > > >>>>>>>>
> > > > >>>>>>> :).
> > > > >>>>
> > > > >>>>>  I would like to know how are we planning to start multiple
> > > servers.
> > > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> > as
> > > > auto
> > > > >>>>>>>> scalable.
> > > > >>>>>>>> 2. To make some specific number of nodes available such as
> we
> > > > want 2
> > > > >>>>>>>> servers to be available at any time so if one goes down
> then I
> > > > need
> > > > >>>>>>>>
> > > > >>>>>>> to
> > > > >>>>
> > > > >>>>>  spawn one new to make available servers count 2.
> > > > >>>>>>>> 3. Initially start all the servers.
> > > > >>>>>>>>
> > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > > believe
> > > > >>>>>>>>
> > > > >>>>>>> existing
> > > > >>>>>>>
> > > > >>>>>>>> architecture support this?
> > > > >>>>>>>>
> > > > >>>>>>>> Regards,
> > > > >>>>>>>> Gagan
> > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > glahiru@gmail.com
> > > >
> > > > >>>>>>>>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Gagan,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > >>>>>>>>>
> > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > >>>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Hi Lahiru,
> > > > >>>>>>>>>> Just my 2 cents.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > > hops
> > > > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> the
> > > > >>>>>>>
> > > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> > able
> > > to
> > > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > > because
> > > > >>>>>>>>>>
> > > > >>>>>>>>> of
> > > > >>>>
> > > > >>>>> less
> > > > >>>>>>>
> > > > >>>>>>>> knowledge of the airavata system in whole. So I would like
> to
> > > > >>>>>>>>>>
> > > > >>>>>>>>> discuss
> > > > >>>>
> > > > >>>>>  following point.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > Zookeeper
> > > > >>>>>>>>>>
> > > > >>>>>>>>> is
> > > > >>>>
> > > > >>>>> not
> > > > >>>>>>>
> > > > >>>>>>>> able to restart services. At max it can tell whether service
> > is
> > > up
> > > > >>>>>>>>>>
> > > > >>>>>>>>> or not
> > > > >>>>>>>
> > > > >>>>>>>> which could only be the case if airavata service goes down
> > > > >>>>>>>>>>
> > > > >>>>>>>>> gracefully and
> > > > >>>>>>>
> > > > >>>>>>>> we have any automated way to restart it. If this is just
> > matter
> > > of
> > > > >>>>>>>>>>
> > > > >>>>>>>>> routing
> > > > >>>>>>>
> > > > >>>>>>>> client requests to the available thrift servers then this
> can
> > be
> > > > >>>>>>>>>>
> > > > >>>>>>>>> achieved
> > > > >>>>>>>
> > > > >>>>>>>> with the help of load balancer which I guess is already
> there
> > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> thrift
> > > > >>>>>>>
> > > > >>>>>>>> wish list.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  We have multiple thrift services and currently we start
> > only
> > > > one
> > > > >>>>>>>>>
> > > > >>>>>>>> instance
> > > > >>>>>>>
> > > > >>>>>>>> of them and each thrift service is a stateless service. To
> > keep
> > > > the
> > > > >>>>>>>>>
> > > > >>>>>>>> high
> > > > >>>>>>>
> > > > >>>>>>>> availability we have to start multiple instances of them in
> > > > >>>>>>>>>
> > > > >>>>>>>> production
> > > > >>>>
> > > > >>>>>  scenario. So for clients to get an available thrift service we
> > can
> > > > >>>>>>>>>
> > > > >>>>>>>> use
> > > > >>>>
> > > > >>>>>  zookeeper znodes to represent each available service. There
> are
> > > > >>>>>>>>>
> > > > >>>>>>>> some
> > > > >>>>
> > > > >>>>>  libraries which is doing similar[1] and I think we can use
> them
> > > > >>>>>>>>>
> > > > >>>>>>>> directly.
> > > > >>>>>>>
> > > > >>>>>>>> 2. As far as registering of different providers is concerned
> > do
> > > > >>>>>>>>>>
> > > > >>>>>>>>> you
> > > > >>>>
> > > > >>>>>  think for that we really need external store.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> > we
> > > > have
> > > > >>>>>>>>>
> > > > >>>>>>>> to
> > > > >>>>
> > > > >>>>> do
> > > > >>>>>>>
> > > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > > >>>>>>>>>
> > > > >>>>>>>> Airavata
> > > > >>>>
> > > > >>>>>  because zookeeper handle all the complexity.
> > > > >>>>>>>>>
> > > > >>>>>>>>>  I have seen people using zookeeper more for state
> management
> > > in
> > > > >>>>>>>>>> distributed environments.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > > because
> > > > >>>>>>>>>
> > > > >>>>>>>> all
> > > > >>>>
> > > > >>>>> of
> > > > >>>>>>>
> > > > >>>>>>>> our services are stateless services, but my point is to
> > achieve
> > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > > >>>>>>>>>
> > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > zookeeper
> > > > in
> > > > >>>>>>>>>> airavata to make system reliable.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Regards,
> > > > >>>>>>>>>> Gagan
> > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <marpierc@iu.edu
> >
> > > > wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> Architecture
> > > > list
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> for
> > > > >>>>
> > > > >>>>>  additional comments.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Marlon
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi All,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how
> to
> > > use
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> it
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> Zookeeper
> > > > >>>>
> > > > >>>>> is a
> > > > >>>>>>>
> > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> communication
> > > > >>>>
> > > > >>>>>  between
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > > system
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> which
> > > > >>>>>>>
> > > > >>>>>>>>  has
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > > amount
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> of
> > > > >>>>
> > > > >>>>> data
> > > > >>>>>>>
> > > > >>>>>>>>  associated with it and these nodes are called znodes.
> Clients
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> can
> > > > >>>>
> > > > >>>>>  connect
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > > znodes.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> these
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  go
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > zookeeper
> > > > to
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> store
> > > > >>>>>>>
> > > > >>>>>>>>  these
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > achieve
> > > a
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> very
> > > > >>>>
> > > > >>>>>  reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> discover
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> available
> > > > >>>>>>>
> > > > >>>>>>>>  service
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> > the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> generated
> > > > >>>>>>>
> > > > >>>>>>>>  thrift client code but we have to change the locations we
> are
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> invoking
> > > > >>>>>>>
> > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > service
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> goes
> > > > >>>>>>>
> > > > >>>>>>>>  down
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > operations.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> With
> > > > >>>>>>>
> > > > >>>>>>>>  this
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > > airavata,
> > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> services
> > > > >>>>
> > > > >>>>> for
> > > > >>>>>>>
> > > > >>>>>>>>  each
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> provider implementation. This can be achieved by using
> the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> hierarchical
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > gfac-thrift
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> service
> > > > >>>>>>>
> > > > >>>>>>>>  to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> orchestrator
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  discover the provider specific gfac thrift service and
> route
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> the
> > > > >>>>
> > > > >>>>>  message to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> the correct thrift service.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > client
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> code
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  thrift
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> services and clients and zookeeper server installation
> can
> > > be
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> done as
> > > > >>>>>>>
> > > > >>>>>>>>  a
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > Zookeeper
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> server
> > > > >>>>>>>
> > > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> > server
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> little
> > > > >>>>>>>
> > > > >>>>>>>>  complex in production scenario. I think we have to make
> sure
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> everything
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> enable.zookeeper=false
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > > start
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> zookeeper.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>> Lahiru
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> System Analyst Programmer
> > > > >>>>>>>>> PTI Lab
> > > > >>>>>>>>> Indiana University
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>> System Analyst Programmer
> > > > >>>>>>> PTI Lab
> > > > >>>>>>> Indiana University
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>> Best Regards,
> > > > >>>>>> Shameera Rathnayaka.
> > > > >>>>>>
> > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> Supun Kamburugamuva
> > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>>>> Blog: http://supunk.blogspot.com
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>> --
> > > > >>>> System Analyst Programmer
> > > > >>>> PTI Lab
> > > > >>>> Indiana University
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>> --
> > > > >>> Supun Kamburugamuva
> > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>> Blog: http://supunk.blogspot.com
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > > > --
> > > > Supun Kamburugamuva
> > > > Member, Apache Software Foundation; http://www.apache.org
> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > Blog: http://supunk.blogspot.com
> > > >
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi All,

Ignoring the tool that we are going to use to implement fault tolerance I
have summarized the model we have decided so far. I will use the tool name
as X, we can use Zookeeper or some other implementation. Following design
assume tool X  and Registry have high availability.

1. Orchestrator and GFAC worker node communication is going to be queue
based and tool X is going to be used for this communication. (We have to
implement this with considering race condition between different gfac
workers).
2. We are having multiple instances of GFAC which are identical (In future
we can group gfac workers). Existence of each worker node is identified
using X. If node goes down orchestrator will be notified by X.
3. When a particular request comes and accepted by one gfac worker that
information will be replicated in tool X and a place where this information
is persisted even the worker failed.
4. When a job comes to a final state like failed or cancelled or completed
above information will be removed. So at a given time orchestrator can poll
active jobs in each worker by giving a worker ID.
5. Tool X will make sure that when a worker goes down it will notify
orchestrator. During a worker failure, based on step 3 and 4 orchestrator
can poll all the active jobs of that worker and do the same thing like in
step 1 (store the experiment ID to the queue) and gfac worker will pick the
jobs.

6. When GFAC receive a job like in step 5 it have to carefully evaluate the
state from registry and decide what to be done (If the job is pending then
gfac just have to monitor, if job state is like input transferred not even
submitted gfac has to execute rest of the chain and submit the job to the
resource and start monitoring).

If we can find a tool X which supports all these features and tool itself
is fault tolerance and support atomicity, high availability and simply API
to implement we can use that tool.

WDYT ?

Lahiru


On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Lahiru,
>
> Before moving with an implementation it may be worth to consider some of
> the following aspects as well.
>
> 1. How to report the progress of an experiment as state in ZooKeeper? What
> happens if a GFac instance crashes while executing an experiment? Are there
> check-points we can save so that another GFac instance can take over?
> 2. What is the threading model of GFac instances? (I consider this as a
> very important aspect)
> 3. What are the information needed to be stored in the ZooKeeper? You may
> need to store other information about an experiment apart from its
> experiment ID.
> 4. How to report errors?
> 5. For GFac weather you need a threading model or worker process model?
>
> Thanks,
> Supun..
>
>
>
>
>
> On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I think the conclusion is like this,
> >
> > 1, We make the gfac as a worker not a thrift service and we can start
> > multiple workers either with bunch of providers and handlers configured
> in
> > each worker or provider specific  workers to handle the class path issues
> > (not the common scenario).
> >
> > 2. Gfac workers can be configured to watch for a given path in zookeeper,
> > and multiple workers can listen to the same path. Default path can be
> > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > /airavata/gfac/bes.
> >
> > 3. Orchestrator can configure with a logic to store experiment IDs in
> > zookeeper with a path, and orchestrator can be configured to provider
> > specific path logic too. So when a new request come orchestrator store
> the
> > experimentID and these experiments IDs are stored in Zk as a queue.
> >
> > 4. Since gfac workers are watching they will be notified and as supun
> > suggested can use a leader selection algorithm[1] and one gfac worker
>  will
> > take the leadership for each experiment. If there are gfac instances for
> > each provider same logic will apply among those nodes with same provider
> > type.
> >
> > [1]http://curator.apache.org/curator-recipes/leader-election.html
> >
> > I would like to implement this if there are  no objections.
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <supun06@gmail.com
> >
> > wrote:
> >
> > > Hi Marlon,
> > >
> > > I think you are exactly correct.
> > >
> > > Supun..
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> wrote:
> > >
> > > > Let me restate this, and please tell me if I'm wrong.
> > > >
> > > > Orchestrator decides (somehow) that a particular job requires
> JSDL/BES,
> > > so
> > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > node.
> > > >  GFAC servers associated with this instance notice the update.  The
> > first
> > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > detailed
> > > > information it needs from the Registry.  ZooKeeper handles the
> locking,
> > > etc
> > > > to make sure that only one GFAC at a time is trying to handle an
> > > experiment.
> > > >
> > > > Marlon
> > > >
> > > >
> > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > >
> > > >> Hi Supun,
> > > >>
> > > >> Thanks for the clarification.
> > > >>
> > > >> Regards
> > > >> Lahiru
> > > >>
> > > >>
> > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > supun06@gmail.com>
> > > >> wrote:
> > > >>
> > > >>  Hi Lahiru,
> > > >>>
> > > >>> My suggestion is that may be you don't need a Thrift service
> between
> > > >>> Orchestrator and the component executing the experiment. When a new
> > > >>> experiment is submitted, orchestrator decides who can execute this
> > job.
> > > >>> Then it put the information about this experiment execution in
> > > ZooKeeper.
> > > >>> The component which wants to executes the experiment is listening
> to
> > > this
> > > >>> ZooKeeper path and when it sees the experiment it will execute it.
> So
> > > >>> that
> > > >>> the communication happens through an state change in ZooKeeper.
> This
> > > can
> > > >>> potentially simply your architecture.
> > > >>>
> > > >>> Thanks,
> > > >>> Supun.
> > > >>>
> > > >>>
> > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > glahiru@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>  Hi Supun,
> > > >>>>
> > > >>>> So your suggestion is to create a znode for each thrift service we
> > > have
> > > >>>> and
> > > >>>> when the request comes that node gets modified with input data for
> > > that
> > > >>>> request and thrift service is having a watch for that node and it
> > will
> > > >>>> be
> > > >>>> notified because of the watch and it can read the input from
> > zookeeper
> > > >>>> and
> > > >>>> invoke the operation?
> > > >>>>
> > > >>>> Lahiru
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > >>>> supun06@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>  Hi all,
> > > >>>>>
> > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > there
> > > >>>>> are
> > > >>>>> many components and these components must be stateless to achieve
> > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > >>>>>
> > > >>>> communicate
> > > >>>>
> > > >>>>> between the components. At the moment Airavata uses RPC calls
> based
> > > on
> > > >>>>> Thrift for the communication.
> > > >>>>>
> > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > >>>>>
> > > >>>> communication
> > > >>>>
> > > >>>>> layer between the components. I'm involved with a project that
> has
> > > many
> > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > services
> > > >>>>>
> > > >>>> to
> > > >>>>
> > > >>>>> communicate among the components. But we find it difficult to use
> > RPC
> > > >>>>>
> > > >>>> calls
> > > >>>>
> > > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > > >>>>>
> > > >>>> services
> > > >>>>
> > > >>>>> with ZooKeeper based communication layer. So I think it is better
> > to
> > > >>>>> explore the possibility of removing the Thrift services between
> the
> > > >>>>> components and use ZooKeeper as a communication mechanism between
> > the
> > > >>>>> services. If you do this you will have to move the state to
> > ZooKeeper
> > > >>>>>
> > > >>>> and
> > > >>>>
> > > >>>>> will automatically achieve the stateless behaviour in the
> > components.
> > > >>>>>
> > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If
> we
> > > are
> > > >>>>> trying to integrate something fundamentally important to
> > architecture
> > > >>>>> as
> > > >>>>> how to store state, we shouldn't make it optional.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Supun..
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > >>>>> shameerainfo@gmail.com> wrote:
> > > >>>>>
> > > >>>>>  Hi Lahiru,
> > > >>>>>>
> > > >>>>>> As i understood,  not only reliability , you are trying to
> achieve
> > > >>>>>> some
> > > >>>>>> other requirement by introducing zookeeper, like health
> monitoring
> > > of
> > > >>>>>>
> > > >>>>> the
> > > >>>>
> > > >>>>> services, categorization with service implementation etc ... . In
> > > that
> > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > only
> > > >>>>>>
> > > >>>>> focus
> > > >>>>
> > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > >>>>>>
> > > >>>>> clustering +
> > > >>>>
> > > >>>>> LB ?
> > > >>>>>>
> > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> > to
> > > >>>>>> use
> > > >>>>>> it.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>   Shameera.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > >>>>>> glahiru@gmail.com
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>  Hi Gagan,
> > > >>>>>>>
> > > >>>>>>> I need to start another discussion about it, but I had an
> offline
> > > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > > >>>>>>> thread
> > > >>>>>>> about this topic too.
> > > >>>>>>>
> > > >>>>>>> Regards
> > > >>>>>>> Lahiru
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > >>>>>>>
> > > >>>>>> gagandeepjuneja@gmail.com
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > dictionary
> > > >>>>>>>>
> > > >>>>>>> :).
> > > >>>>
> > > >>>>>  I would like to know how are we planning to start multiple
> > servers.
> > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> as
> > > auto
> > > >>>>>>>> scalable.
> > > >>>>>>>> 2. To make some specific number of nodes available such as we
> > > want 2
> > > >>>>>>>> servers to be available at any time so if one goes down then I
> > > need
> > > >>>>>>>>
> > > >>>>>>> to
> > > >>>>
> > > >>>>>  spawn one new to make available servers count 2.
> > > >>>>>>>> 3. Initially start all the servers.
> > > >>>>>>>>
> > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > believe
> > > >>>>>>>>
> > > >>>>>>> existing
> > > >>>>>>>
> > > >>>>>>>> architecture support this?
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Gagan
> > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> glahiru@gmail.com
> > >
> > > >>>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gagan,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > >>>>>>>>>
> > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > >>>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>  Hi Lahiru,
> > > >>>>>>>>>> Just my 2 cents.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > hops
> > > in
> > > >>>>>>>>>>
> > > >>>>>>>>> the
> > > >>>>>>>
> > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> able
> > to
> > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > because
> > > >>>>>>>>>>
> > > >>>>>>>>> of
> > > >>>>
> > > >>>>> less
> > > >>>>>>>
> > > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > > >>>>>>>>>>
> > > >>>>>>>>> discuss
> > > >>>>
> > > >>>>>  following point.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > Zookeeper
> > > >>>>>>>>>>
> > > >>>>>>>>> is
> > > >>>>
> > > >>>>> not
> > > >>>>>>>
> > > >>>>>>>> able to restart services. At max it can tell whether service
> is
> > up
> > > >>>>>>>>>>
> > > >>>>>>>>> or not
> > > >>>>>>>
> > > >>>>>>>> which could only be the case if airavata service goes down
> > > >>>>>>>>>>
> > > >>>>>>>>> gracefully and
> > > >>>>>>>
> > > >>>>>>>> we have any automated way to restart it. If this is just
> matter
> > of
> > > >>>>>>>>>>
> > > >>>>>>>>> routing
> > > >>>>>>>
> > > >>>>>>>> client requests to the available thrift servers then this can
> be
> > > >>>>>>>>>>
> > > >>>>>>>>> achieved
> > > >>>>>>>
> > > >>>>>>>> with the help of load balancer which I guess is already there
> in
> > > >>>>>>>>>>
> > > >>>>>>>>> thrift
> > > >>>>>>>
> > > >>>>>>>> wish list.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  We have multiple thrift services and currently we start
> only
> > > one
> > > >>>>>>>>>
> > > >>>>>>>> instance
> > > >>>>>>>
> > > >>>>>>>> of them and each thrift service is a stateless service. To
> keep
> > > the
> > > >>>>>>>>>
> > > >>>>>>>> high
> > > >>>>>>>
> > > >>>>>>>> availability we have to start multiple instances of them in
> > > >>>>>>>>>
> > > >>>>>>>> production
> > > >>>>
> > > >>>>>  scenario. So for clients to get an available thrift service we
> can
> > > >>>>>>>>>
> > > >>>>>>>> use
> > > >>>>
> > > >>>>>  zookeeper znodes to represent each available service. There are
> > > >>>>>>>>>
> > > >>>>>>>> some
> > > >>>>
> > > >>>>>  libraries which is doing similar[1] and I think we can use them
> > > >>>>>>>>>
> > > >>>>>>>> directly.
> > > >>>>>>>
> > > >>>>>>>> 2. As far as registering of different providers is concerned
> do
> > > >>>>>>>>>>
> > > >>>>>>>>> you
> > > >>>>
> > > >>>>>  think for that we really need external store.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> we
> > > have
> > > >>>>>>>>>
> > > >>>>>>>> to
> > > >>>>
> > > >>>>> do
> > > >>>>>>>
> > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > >>>>>>>>>
> > > >>>>>>>> Airavata
> > > >>>>
> > > >>>>>  because zookeeper handle all the complexity.
> > > >>>>>>>>>
> > > >>>>>>>>>  I have seen people using zookeeper more for state management
> > in
> > > >>>>>>>>>> distributed environments.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > because
> > > >>>>>>>>>
> > > >>>>>>>> all
> > > >>>>
> > > >>>>> of
> > > >>>>>>>
> > > >>>>>>>> our services are stateless services, but my point is to
> achieve
> > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > >>>>>>>>>
> > > >>>>>>>>>    I would like to understand more how can we leverage
> > zookeeper
> > > in
> > > >>>>>>>>>> airavata to make system reliable.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>  Regards,
> > > >>>>>>>>>> Gagan
> > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > > wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > > list
> > > >>>>>>>>>>>
> > > >>>>>>>>>> for
> > > >>>>
> > > >>>>>  additional comments.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Marlon
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> > use
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> it
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> Zookeeper
> > > >>>>
> > > >>>>> is a
> > > >>>>>>>
> > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> communication
> > > >>>>
> > > >>>>>  between
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > system
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> which
> > > >>>>>>>
> > > >>>>>>>>  has
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > amount
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> of
> > > >>>>
> > > >>>>> data
> > > >>>>>>>
> > > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> can
> > > >>>>
> > > >>>>>  connect
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > znodes.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> these
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  go
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> zookeeper
> > > to
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> store
> > > >>>>>>>
> > > >>>>>>>>  these
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> achieve
> > a
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> very
> > > >>>>
> > > >>>>>  reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> available
> > > >>>>>>>
> > > >>>>>>>>  service
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> generated
> > > >>>>>>>
> > > >>>>>>>>  thrift client code but we have to change the locations we are
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> invoking
> > > >>>>>>>
> > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> service
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> goes
> > > >>>>>>>
> > > >>>>>>>>  down
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > operations.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> With
> > > >>>>>>>
> > > >>>>>>>>  this
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > airavata,
> > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> services
> > > >>>>
> > > >>>>> for
> > > >>>>>>>
> > > >>>>>>>>  each
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> hierarchical
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> gfac-thrift
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> service
> > > >>>>>>>
> > > >>>>>>>>  to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> orchestrator
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  discover the provider specific gfac thrift service and route
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> the
> > > >>>>
> > > >>>>>  message to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> the correct thrift service.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> With this approach I think we simply have write some
> client
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> code
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  thrift
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> services and clients and zookeeper server installation can
> > be
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> done as
> > > >>>>>>>
> > > >>>>>>>>  a
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> separate process and it will be easier to keep the
> Zookeeper
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> server
> > > >>>>>>>
> > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> server
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> little
> > > >>>>>>>
> > > >>>>>>>>  complex in production scenario. I think we have to make sure
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> everything
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> enable.zookeeper=false
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > start
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> zookeeper.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks
> > > >>>>>>>>>>>> Lahiru
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> System Analyst Programmer
> > > >>>>>>>>> PTI Lab
> > > >>>>>>>>> Indiana University
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>> --
> > > >>>>>>> System Analyst Programmer
> > > >>>>>>> PTI Lab
> > > >>>>>>> Indiana University
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Best Regards,
> > > >>>>>> Shameera Rathnayaka.
> > > >>>>>>
> > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Supun Kamburugamuva
> > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>>>> Blog: http://supunk.blogspot.com
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>> --
> > > >>>> System Analyst Programmer
> > > >>>> PTI Lab
> > > >>>> Indiana University
> > > >>>>
> > > >>>>
> > > >>>
> > > >>> --
> > > >>> Supun Kamburugamuva
> > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>> Blog: http://supunk.blogspot.com
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Thilina,


On Mon, Jun 16, 2014 at 2:55 PM, Thilina Gunarathne <cs...@gmail.com>
wrote:

> >
> > In addition to this, I have a more abstract question. Isn't this simply a
> > pub-sub system we are talking about? Orchestrator, acting as the
> publisher
> > will put a job (experiment) to the queue. Worker, acting as subscribers,
> > get the work and execute it. So the main question is why are we trying to
> > use zookeeper to act as a queue. I'm not saying its bad but there are
> other
> > scalable and proven ways of doing this (like a persistence messaging
> > solutions) but state shared using ZK.
> >
> Going through this thread, the same question came to my mind. Why are not
> consider one of the many *MQ solutions? What does ZooKeeper give us more
> than them? Are we already using ZooKeeper in Airavata deployments for any
> other use case?
>
Not now but that was the plan. We initially started looking in to ZK to
store configuration of gfac and other components, We have multiple
communication layers between several thrift services. Each service can be
clustered and each node details can be stored in ZK and read by clients.

>
> thanks,
> Thilina
>
>
>
> >
> >
> > Thanks,
> > Eran Chinthaka Withana
> >
> >
> > On Mon, Jun 16, 2014 at 11:22 AM, Lahiru Gunathilake <gl...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > I think the conclusion is like this,
> > >
> > > 1, We make the gfac as a worker not a thrift service and we can start
> > > multiple workers either with bunch of providers and handlers configured
> > in
> > > each worker or provider specific  workers to handle the class path
> issues
> > > (not the common scenario).
> > >
> > > 2. Gfac workers can be configured to watch for a given path in
> zookeeper,
> > > and multiple workers can listen to the same path. Default path can be
> > > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > > /airavata/gfac/bes.
> > >
> > > 3. Orchestrator can configure with a logic to store experiment IDs in
> > > zookeeper with a path, and orchestrator can be configured to provider
> > > specific path logic too. So when a new request come orchestrator store
> > the
> > > experimentID and these experiments IDs are stored in Zk as a queue.
> > >
> > > 4. Since gfac workers are watching they will be notified and as supun
> > > suggested can use a leader selection algorithm[1] and one gfac worker
> >  will
> > > take the leadership for each experiment. If there are gfac instances
> for
> > > each provider same logic will apply among those nodes with same
> provider
> > > type.
> > >
> > > [1]http://curator.apache.org/curator-recipes/leader-election.html
> > >
> > > I would like to implement this if there are  no objections.
> > >
> > > Lahiru
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <
> supun06@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Marlon,
> > > >
> > > > I think you are exactly correct.
> > > >
> > > > Supun..
> > > >
> > > >
> > > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> > wrote:
> > > >
> > > > > Let me restate this, and please tell me if I'm wrong.
> > > > >
> > > > > Orchestrator decides (somehow) that a particular job requires
> > JSDL/BES,
> > > > so
> > > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > > node.
> > > > >  GFAC servers associated with this instance notice the update.  The
> > > first
> > > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > > detailed
> > > > > information it needs from the Registry.  ZooKeeper handles the
> > locking,
> > > > etc
> > > > > to make sure that only one GFAC at a time is trying to handle an
> > > > experiment.
> > > > >
> > > > > Marlon
> > > > >
> > > > >
> > > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > > >
> > > > >> Hi Supun,
> > > > >>
> > > > >> Thanks for the clarification.
> > > > >>
> > > > >> Regards
> > > > >> Lahiru
> > > > >>
> > > > >>
> > > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > > supun06@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>  Hi Lahiru,
> > > > >>>
> > > > >>> My suggestion is that may be you don't need a Thrift service
> > between
> > > > >>> Orchestrator and the component executing the experiment. When a
> new
> > > > >>> experiment is submitted, orchestrator decides who can execute
> this
> > > job.
> > > > >>> Then it put the information about this experiment execution in
> > > > ZooKeeper.
> > > > >>> The component which wants to executes the experiment is listening
> > to
> > > > this
> > > > >>> ZooKeeper path and when it sees the experiment it will execute
> it.
> > So
> > > > >>> that
> > > > >>> the communication happens through an state change in ZooKeeper.
> > This
> > > > can
> > > > >>> potentially simply your architecture.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Supun.
> > > > >>>
> > > > >>>
> > > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > > glahiru@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>>  Hi Supun,
> > > > >>>>
> > > > >>>> So your suggestion is to create a znode for each thrift service
> we
> > > > have
> > > > >>>> and
> > > > >>>> when the request comes that node gets modified with input data
> for
> > > > that
> > > > >>>> request and thrift service is having a watch for that node and
> it
> > > will
> > > > >>>> be
> > > > >>>> notified because of the watch and it can read the input from
> > > zookeeper
> > > > >>>> and
> > > > >>>> invoke the operation?
> > > > >>>>
> > > > >>>> Lahiru
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > > >>>> supun06@gmail.com>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>  Hi all,
> > > > >>>>>
> > > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > > there
> > > > >>>>> are
> > > > >>>>> many components and these components must be stateless to
> achieve
> > > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > > >>>>>
> > > > >>>> communicate
> > > > >>>>
> > > > >>>>> between the components. At the moment Airavata uses RPC calls
> > based
> > > > on
> > > > >>>>> Thrift for the communication.
> > > > >>>>>
> > > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > > >>>>>
> > > > >>>> communication
> > > > >>>>
> > > > >>>>> layer between the components. I'm involved with a project that
> > has
> > > > many
> > > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > > services
> > > > >>>>>
> > > > >>>> to
> > > > >>>>
> > > > >>>>> communicate among the components. But we find it difficult to
> use
> > > RPC
> > > > >>>>>
> > > > >>>> calls
> > > > >>>>
> > > > >>>>> and achieve stateless behaviour and thinking of replacing
> Thrift
> > > > >>>>>
> > > > >>>> services
> > > > >>>>
> > > > >>>>> with ZooKeeper based communication layer. So I think it is
> better
> > > to
> > > > >>>>> explore the possibility of removing the Thrift services between
> > the
> > > > >>>>> components and use ZooKeeper as a communication mechanism
> between
> > > the
> > > > >>>>> services. If you do this you will have to move the state to
> > > ZooKeeper
> > > > >>>>>
> > > > >>>> and
> > > > >>>>
> > > > >>>>> will automatically achieve the stateless behaviour in the
> > > components.
> > > > >>>>>
> > > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea.
> If
> > we
> > > > are
> > > > >>>>> trying to integrate something fundamentally important to
> > > architecture
> > > > >>>>> as
> > > > >>>>> how to store state, we shouldn't make it optional.
> > > > >>>>>
> > > > >>>>> Thanks,
> > > > >>>>> Supun..
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > > >>>>> shameerainfo@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>  Hi Lahiru,
> > > > >>>>>>
> > > > >>>>>> As i understood,  not only reliability , you are trying to
> > achieve
> > > > >>>>>> some
> > > > >>>>>> other requirement by introducing zookeeper, like health
> > monitoring
> > > > of
> > > > >>>>>>
> > > > >>>>> the
> > > > >>>>
> > > > >>>>> services, categorization with service implementation etc ... .
> In
> > > > that
> > > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > > only
> > > > >>>>>>
> > > > >>>>> focus
> > > > >>>>
> > > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > > >>>>>>
> > > > >>>>> clustering +
> > > > >>>>
> > > > >>>>> LB ?
> > > > >>>>>>
> > > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user
> need
> > > to
> > > > >>>>>> use
> > > > >>>>>> it.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>>   Shameera.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > > >>>>>> glahiru@gmail.com
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>  Hi Gagan,
> > > > >>>>>>>
> > > > >>>>>>> I need to start another discussion about it, but I had an
> > offline
> > > > >>>>>>> discussion with Suresh about auto-scaling. I will start
> another
> > > > >>>>>>> thread
> > > > >>>>>>> about this topic too.
> > > > >>>>>>>
> > > > >>>>>>> Regards
> > > > >>>>>>> Lahiru
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > > >>>>>>>
> > > > >>>>>> gagandeepjuneja@gmail.com
> > > > >>>>
> > > > >>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > > dictionary
> > > > >>>>>>>>
> > > > >>>>>>> :).
> > > > >>>>
> > > > >>>>>  I would like to know how are we planning to start multiple
> > > servers.
> > > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> > as
> > > > auto
> > > > >>>>>>>> scalable.
> > > > >>>>>>>> 2. To make some specific number of nodes available such as
> we
> > > > want 2
> > > > >>>>>>>> servers to be available at any time so if one goes down
> then I
> > > > need
> > > > >>>>>>>>
> > > > >>>>>>> to
> > > > >>>>
> > > > >>>>>  spawn one new to make available servers count 2.
> > > > >>>>>>>> 3. Initially start all the servers.
> > > > >>>>>>>>
> > > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > > believe
> > > > >>>>>>>>
> > > > >>>>>>> existing
> > > > >>>>>>>
> > > > >>>>>>>> architecture support this?
> > > > >>>>>>>>
> > > > >>>>>>>> Regards,
> > > > >>>>>>>> Gagan
> > > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> > glahiru@gmail.com
> > > >
> > > > >>>>>>>>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Gagan,
> > > > >>>>>>>>>
> > > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > > >>>>>>>>>
> > > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > > >>>>>>>
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Hi Lahiru,
> > > > >>>>>>>>>> Just my 2 cents.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > > hops
> > > > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> the
> > > > >>>>>>>
> > > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> > able
> > > to
> > > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > > because
> > > > >>>>>>>>>>
> > > > >>>>>>>>> of
> > > > >>>>
> > > > >>>>> less
> > > > >>>>>>>
> > > > >>>>>>>> knowledge of the airavata system in whole. So I would like
> to
> > > > >>>>>>>>>>
> > > > >>>>>>>>> discuss
> > > > >>>>
> > > > >>>>>  following point.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > > Zookeeper
> > > > >>>>>>>>>>
> > > > >>>>>>>>> is
> > > > >>>>
> > > > >>>>> not
> > > > >>>>>>>
> > > > >>>>>>>> able to restart services. At max it can tell whether service
> > is
> > > up
> > > > >>>>>>>>>>
> > > > >>>>>>>>> or not
> > > > >>>>>>>
> > > > >>>>>>>> which could only be the case if airavata service goes down
> > > > >>>>>>>>>>
> > > > >>>>>>>>> gracefully and
> > > > >>>>>>>
> > > > >>>>>>>> we have any automated way to restart it. If this is just
> > matter
> > > of
> > > > >>>>>>>>>>
> > > > >>>>>>>>> routing
> > > > >>>>>>>
> > > > >>>>>>>> client requests to the available thrift servers then this
> can
> > be
> > > > >>>>>>>>>>
> > > > >>>>>>>>> achieved
> > > > >>>>>>>
> > > > >>>>>>>> with the help of load balancer which I guess is already
> there
> > in
> > > > >>>>>>>>>>
> > > > >>>>>>>>> thrift
> > > > >>>>>>>
> > > > >>>>>>>> wish list.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  We have multiple thrift services and currently we start
> > only
> > > > one
> > > > >>>>>>>>>
> > > > >>>>>>>> instance
> > > > >>>>>>>
> > > > >>>>>>>> of them and each thrift service is a stateless service. To
> > keep
> > > > the
> > > > >>>>>>>>>
> > > > >>>>>>>> high
> > > > >>>>>>>
> > > > >>>>>>>> availability we have to start multiple instances of them in
> > > > >>>>>>>>>
> > > > >>>>>>>> production
> > > > >>>>
> > > > >>>>>  scenario. So for clients to get an available thrift service we
> > can
> > > > >>>>>>>>>
> > > > >>>>>>>> use
> > > > >>>>
> > > > >>>>>  zookeeper znodes to represent each available service. There
> are
> > > > >>>>>>>>>
> > > > >>>>>>>> some
> > > > >>>>
> > > > >>>>>  libraries which is doing similar[1] and I think we can use
> them
> > > > >>>>>>>>>
> > > > >>>>>>>> directly.
> > > > >>>>>>>
> > > > >>>>>>>> 2. As far as registering of different providers is concerned
> > do
> > > > >>>>>>>>>>
> > > > >>>>>>>>> you
> > > > >>>>
> > > > >>>>>  think for that we really need external store.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> > we
> > > > have
> > > > >>>>>>>>>
> > > > >>>>>>>> to
> > > > >>>>
> > > > >>>>> do
> > > > >>>>>>>
> > > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > > >>>>>>>>>
> > > > >>>>>>>> Airavata
> > > > >>>>
> > > > >>>>>  because zookeeper handle all the complexity.
> > > > >>>>>>>>>
> > > > >>>>>>>>>  I have seen people using zookeeper more for state
> management
> > > in
> > > > >>>>>>>>>> distributed environments.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > > because
> > > > >>>>>>>>>
> > > > >>>>>>>> all
> > > > >>>>
> > > > >>>>> of
> > > > >>>>>>>
> > > > >>>>>>>> our services are stateless services, but my point is to
> > achieve
> > > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > > >>>>>>>>>
> > > > >>>>>>>>>    I would like to understand more how can we leverage
> > > zookeeper
> > > > in
> > > > >>>>>>>>>> airavata to make system reliable.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>>>>  Regards,
> > > > >>>>>>>>>> Gagan
> > > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <marpierc@iu.edu
> >
> > > > wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the
> Architecture
> > > > list
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> for
> > > > >>>>
> > > > >>>>>  additional comments.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Marlon
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> Hi All,
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how
> to
> > > use
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> it
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> Zookeeper
> > > > >>>>
> > > > >>>>> is a
> > > > >>>>>>>
> > > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> communication
> > > > >>>>
> > > > >>>>>  between
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > > system
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> which
> > > > >>>>>>>
> > > > >>>>>>>>  has
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > > amount
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> of
> > > > >>>>
> > > > >>>>> data
> > > > >>>>>>>
> > > > >>>>>>>>  associated with it and these nodes are called znodes.
> Clients
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> can
> > > > >>>>
> > > > >>>>>  connect
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > > znodes.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> > and
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> these
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  go
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> > zookeeper
> > > > to
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> store
> > > > >>>>>>>
> > > > >>>>>>>>  these
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> > achieve
> > > a
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> very
> > > > >>>>
> > > > >>>>>  reliable
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> system. Basically thrift clients can dynamically
> discover
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> available
> > > > >>>>>>>
> > > > >>>>>>>>  service
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> > the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> generated
> > > > >>>>>>>
> > > > >>>>>>>>  thrift client code but we have to change the locations we
> are
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> invoking
> > > > >>>>>>>
> > > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> > service
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> goes
> > > > >>>>>>>
> > > > >>>>>>>>  down
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > > operations.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> With
> > > > >>>>>>>
> > > > >>>>>>>>  this
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > > airavata,
> > > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> services
> > > > >>>>
> > > > >>>>> for
> > > > >>>>>>>
> > > > >>>>>>>>  each
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> provider implementation. This can be achieved by using
> the
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> hierarchical
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> > gfac-thrift
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> service
> > > > >>>>>>>
> > > > >>>>>>>>  to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> orchestrator
> > > > >>>>
> > > > >>>>> can
> > > > >>>>>>>
> > > > >>>>>>>>  discover the provider specific gfac thrift service and
> route
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> the
> > > > >>>>
> > > > >>>>>  message to
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> the correct thrift service.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> With this approach I think we simply have write some
> > client
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> code
> > > > >>>>
> > > > >>>>> in
> > > > >>>>>>>
> > > > >>>>>>>>  thrift
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> services and clients and zookeeper server installation
> can
> > > be
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> done as
> > > > >>>>>>>
> > > > >>>>>>>>  a
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> separate process and it will be easier to keep the
> > Zookeeper
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> server
> > > > >>>>>>>
> > > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> > server
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> little
> > > > >>>>>>>
> > > > >>>>>>>>  complex in production scenario. I think we have to make
> sure
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> everything
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> enable.zookeeper=false
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > > start
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>> zookeeper.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Thanks
> > > > >>>>>>>>>>>> Lahiru
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> System Analyst Programmer
> > > > >>>>>>>>> PTI Lab
> > > > >>>>>>>>> Indiana University
> > > > >>>>>>>>>
> > > > >>>>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>> System Analyst Programmer
> > > > >>>>>>> PTI Lab
> > > > >>>>>>> Indiana University
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>> --
> > > > >>>>>> Best Regards,
> > > > >>>>>> Shameera Rathnayaka.
> > > > >>>>>>
> > > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>> --
> > > > >>>>> Supun Kamburugamuva
> > > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>>>> Blog: http://supunk.blogspot.com
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>> --
> > > > >>>> System Analyst Programmer
> > > > >>>> PTI Lab
> > > > >>>> Indiana University
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>> --
> > > > >>> Supun Kamburugamuva
> > > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > >>> Blog: http://supunk.blogspot.com
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > > >
> > > > --
> > > > Supun Kamburugamuva
> > > > Member, Apache Software Foundation; http://www.apache.org
> > > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > > Blog: http://supunk.blogspot.com
> > > >
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
>
>
>
> --
> https://www.cs.indiana.edu/~tgunarat/
> http://www.linkedin.com/in/thilina
> http://thilina.gunarathne.org
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Thilina Gunarathne <cs...@gmail.com>.

>
> In addition to this, I have a more abstract question. Isn't this simply a
> pub-sub system we are talking about? Orchestrator, acting as the publisher
> will put a job (experiment) to the queue. Worker, acting as subscribers,
> get the work and execute it. So the main question is why are we trying to
> use zookeeper to act as a queue. I'm not saying its bad but there are other
> scalable and proven ways of doing this (like a persistence messaging
> solutions) but state shared using ZK.
>
Going through this thread, the same question came to my mind. Why are not
consider one of the many *MQ solutions? What does ZooKeeper give us more
than them? Are we already using ZooKeeper in Airavata deployments for any
other use case?

thanks,
Thilina



>
>
> Thanks,
> Eran Chinthaka Withana
>
>
> On Mon, Jun 16, 2014 at 11:22 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I think the conclusion is like this,
> >
> > 1, We make the gfac as a worker not a thrift service and we can start
> > multiple workers either with bunch of providers and handlers configured
> in
> > each worker or provider specific  workers to handle the class path issues
> > (not the common scenario).
> >
> > 2. Gfac workers can be configured to watch for a given path in zookeeper,
> > and multiple workers can listen to the same path. Default path can be
> > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > /airavata/gfac/bes.
> >
> > 3. Orchestrator can configure with a logic to store experiment IDs in
> > zookeeper with a path, and orchestrator can be configured to provider
> > specific path logic too. So when a new request come orchestrator store
> the
> > experimentID and these experiments IDs are stored in Zk as a queue.
> >
> > 4. Since gfac workers are watching they will be notified and as supun
> > suggested can use a leader selection algorithm[1] and one gfac worker
>  will
> > take the leadership for each experiment. If there are gfac instances for
> > each provider same logic will apply among those nodes with same provider
> > type.
> >
> > [1]http://curator.apache.org/curator-recipes/leader-election.html
> >
> > I would like to implement this if there are  no objections.
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <supun06@gmail.com
> >
> > wrote:
> >
> > > Hi Marlon,
> > >
> > > I think you are exactly correct.
> > >
> > > Supun..
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> wrote:
> > >
> > > > Let me restate this, and please tell me if I'm wrong.
> > > >
> > > > Orchestrator decides (somehow) that a particular job requires
> JSDL/BES,
> > > so
> > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > node.
> > > >  GFAC servers associated with this instance notice the update.  The
> > first
> > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > detailed
> > > > information it needs from the Registry.  ZooKeeper handles the
> locking,
> > > etc
> > > > to make sure that only one GFAC at a time is trying to handle an
> > > experiment.
> > > >
> > > > Marlon
> > > >
> > > >
> > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > >
> > > >> Hi Supun,
> > > >>
> > > >> Thanks for the clarification.
> > > >>
> > > >> Regards
> > > >> Lahiru
> > > >>
> > > >>
> > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > supun06@gmail.com>
> > > >> wrote:
> > > >>
> > > >>  Hi Lahiru,
> > > >>>
> > > >>> My suggestion is that may be you don't need a Thrift service
> between
> > > >>> Orchestrator and the component executing the experiment. When a new
> > > >>> experiment is submitted, orchestrator decides who can execute this
> > job.
> > > >>> Then it put the information about this experiment execution in
> > > ZooKeeper.
> > > >>> The component which wants to executes the experiment is listening
> to
> > > this
> > > >>> ZooKeeper path and when it sees the experiment it will execute it.
> So
> > > >>> that
> > > >>> the communication happens through an state change in ZooKeeper.
> This
> > > can
> > > >>> potentially simply your architecture.
> > > >>>
> > > >>> Thanks,
> > > >>> Supun.
> > > >>>
> > > >>>
> > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > glahiru@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>  Hi Supun,
> > > >>>>
> > > >>>> So your suggestion is to create a znode for each thrift service we
> > > have
> > > >>>> and
> > > >>>> when the request comes that node gets modified with input data for
> > > that
> > > >>>> request and thrift service is having a watch for that node and it
> > will
> > > >>>> be
> > > >>>> notified because of the watch and it can read the input from
> > zookeeper
> > > >>>> and
> > > >>>> invoke the operation?
> > > >>>>
> > > >>>> Lahiru
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > >>>> supun06@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>  Hi all,
> > > >>>>>
> > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > there
> > > >>>>> are
> > > >>>>> many components and these components must be stateless to achieve
> > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > >>>>>
> > > >>>> communicate
> > > >>>>
> > > >>>>> between the components. At the moment Airavata uses RPC calls
> based
> > > on
> > > >>>>> Thrift for the communication.
> > > >>>>>
> > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > >>>>>
> > > >>>> communication
> > > >>>>
> > > >>>>> layer between the components. I'm involved with a project that
> has
> > > many
> > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > services
> > > >>>>>
> > > >>>> to
> > > >>>>
> > > >>>>> communicate among the components. But we find it difficult to use
> > RPC
> > > >>>>>
> > > >>>> calls
> > > >>>>
> > > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > > >>>>>
> > > >>>> services
> > > >>>>
> > > >>>>> with ZooKeeper based communication layer. So I think it is better
> > to
> > > >>>>> explore the possibility of removing the Thrift services between
> the
> > > >>>>> components and use ZooKeeper as a communication mechanism between
> > the
> > > >>>>> services. If you do this you will have to move the state to
> > ZooKeeper
> > > >>>>>
> > > >>>> and
> > > >>>>
> > > >>>>> will automatically achieve the stateless behaviour in the
> > components.
> > > >>>>>
> > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If
> we
> > > are
> > > >>>>> trying to integrate something fundamentally important to
> > architecture
> > > >>>>> as
> > > >>>>> how to store state, we shouldn't make it optional.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Supun..
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > >>>>> shameerainfo@gmail.com> wrote:
> > > >>>>>
> > > >>>>>  Hi Lahiru,
> > > >>>>>>
> > > >>>>>> As i understood,  not only reliability , you are trying to
> achieve
> > > >>>>>> some
> > > >>>>>> other requirement by introducing zookeeper, like health
> monitoring
> > > of
> > > >>>>>>
> > > >>>>> the
> > > >>>>
> > > >>>>> services, categorization with service implementation etc ... . In
> > > that
> > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > only
> > > >>>>>>
> > > >>>>> focus
> > > >>>>
> > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > >>>>>>
> > > >>>>> clustering +
> > > >>>>
> > > >>>>> LB ?
> > > >>>>>>
> > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> > to
> > > >>>>>> use
> > > >>>>>> it.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>   Shameera.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > >>>>>> glahiru@gmail.com
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>  Hi Gagan,
> > > >>>>>>>
> > > >>>>>>> I need to start another discussion about it, but I had an
> offline
> > > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > > >>>>>>> thread
> > > >>>>>>> about this topic too.
> > > >>>>>>>
> > > >>>>>>> Regards
> > > >>>>>>> Lahiru
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > >>>>>>>
> > > >>>>>> gagandeepjuneja@gmail.com
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > dictionary
> > > >>>>>>>>
> > > >>>>>>> :).
> > > >>>>
> > > >>>>>  I would like to know how are we planning to start multiple
> > servers.
> > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> as
> > > auto
> > > >>>>>>>> scalable.
> > > >>>>>>>> 2. To make some specific number of nodes available such as we
> > > want 2
> > > >>>>>>>> servers to be available at any time so if one goes down then I
> > > need
> > > >>>>>>>>
> > > >>>>>>> to
> > > >>>>
> > > >>>>>  spawn one new to make available servers count 2.
> > > >>>>>>>> 3. Initially start all the servers.
> > > >>>>>>>>
> > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > believe
> > > >>>>>>>>
> > > >>>>>>> existing
> > > >>>>>>>
> > > >>>>>>>> architecture support this?
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Gagan
> > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> glahiru@gmail.com
> > >
> > > >>>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gagan,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > >>>>>>>>>
> > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > >>>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>  Hi Lahiru,
> > > >>>>>>>>>> Just my 2 cents.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > hops
> > > in
> > > >>>>>>>>>>
> > > >>>>>>>>> the
> > > >>>>>>>
> > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> able
> > to
> > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > because
> > > >>>>>>>>>>
> > > >>>>>>>>> of
> > > >>>>
> > > >>>>> less
> > > >>>>>>>
> > > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > > >>>>>>>>>>
> > > >>>>>>>>> discuss
> > > >>>>
> > > >>>>>  following point.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > Zookeeper
> > > >>>>>>>>>>
> > > >>>>>>>>> is
> > > >>>>
> > > >>>>> not
> > > >>>>>>>
> > > >>>>>>>> able to restart services. At max it can tell whether service
> is
> > up
> > > >>>>>>>>>>
> > > >>>>>>>>> or not
> > > >>>>>>>
> > > >>>>>>>> which could only be the case if airavata service goes down
> > > >>>>>>>>>>
> > > >>>>>>>>> gracefully and
> > > >>>>>>>
> > > >>>>>>>> we have any automated way to restart it. If this is just
> matter
> > of
> > > >>>>>>>>>>
> > > >>>>>>>>> routing
> > > >>>>>>>
> > > >>>>>>>> client requests to the available thrift servers then this can
> be
> > > >>>>>>>>>>
> > > >>>>>>>>> achieved
> > > >>>>>>>
> > > >>>>>>>> with the help of load balancer which I guess is already there
> in
> > > >>>>>>>>>>
> > > >>>>>>>>> thrift
> > > >>>>>>>
> > > >>>>>>>> wish list.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  We have multiple thrift services and currently we start
> only
> > > one
> > > >>>>>>>>>
> > > >>>>>>>> instance
> > > >>>>>>>
> > > >>>>>>>> of them and each thrift service is a stateless service. To
> keep
> > > the
> > > >>>>>>>>>
> > > >>>>>>>> high
> > > >>>>>>>
> > > >>>>>>>> availability we have to start multiple instances of them in
> > > >>>>>>>>>
> > > >>>>>>>> production
> > > >>>>
> > > >>>>>  scenario. So for clients to get an available thrift service we
> can
> > > >>>>>>>>>
> > > >>>>>>>> use
> > > >>>>
> > > >>>>>  zookeeper znodes to represent each available service. There are
> > > >>>>>>>>>
> > > >>>>>>>> some
> > > >>>>
> > > >>>>>  libraries which is doing similar[1] and I think we can use them
> > > >>>>>>>>>
> > > >>>>>>>> directly.
> > > >>>>>>>
> > > >>>>>>>> 2. As far as registering of different providers is concerned
> do
> > > >>>>>>>>>>
> > > >>>>>>>>> you
> > > >>>>
> > > >>>>>  think for that we really need external store.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> we
> > > have
> > > >>>>>>>>>
> > > >>>>>>>> to
> > > >>>>
> > > >>>>> do
> > > >>>>>>>
> > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > >>>>>>>>>
> > > >>>>>>>> Airavata
> > > >>>>
> > > >>>>>  because zookeeper handle all the complexity.
> > > >>>>>>>>>
> > > >>>>>>>>>  I have seen people using zookeeper more for state management
> > in
> > > >>>>>>>>>> distributed environments.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > because
> > > >>>>>>>>>
> > > >>>>>>>> all
> > > >>>>
> > > >>>>> of
> > > >>>>>>>
> > > >>>>>>>> our services are stateless services, but my point is to
> achieve
> > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > >>>>>>>>>
> > > >>>>>>>>>    I would like to understand more how can we leverage
> > zookeeper
> > > in
> > > >>>>>>>>>> airavata to make system reliable.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>  Regards,
> > > >>>>>>>>>> Gagan
> > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > > wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > > list
> > > >>>>>>>>>>>
> > > >>>>>>>>>> for
> > > >>>>
> > > >>>>>  additional comments.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Marlon
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> > use
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> it
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> Zookeeper
> > > >>>>
> > > >>>>> is a
> > > >>>>>>>
> > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> communication
> > > >>>>
> > > >>>>>  between
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > system
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> which
> > > >>>>>>>
> > > >>>>>>>>  has
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > amount
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> of
> > > >>>>
> > > >>>>> data
> > > >>>>>>>
> > > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> can
> > > >>>>
> > > >>>>>  connect
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > znodes.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> these
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  go
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> zookeeper
> > > to
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> store
> > > >>>>>>>
> > > >>>>>>>>  these
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> achieve
> > a
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> very
> > > >>>>
> > > >>>>>  reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> available
> > > >>>>>>>
> > > >>>>>>>>  service
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> generated
> > > >>>>>>>
> > > >>>>>>>>  thrift client code but we have to change the locations we are
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> invoking
> > > >>>>>>>
> > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> service
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> goes
> > > >>>>>>>
> > > >>>>>>>>  down
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > operations.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> With
> > > >>>>>>>
> > > >>>>>>>>  this
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > airavata,
> > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> services
> > > >>>>
> > > >>>>> for
> > > >>>>>>>
> > > >>>>>>>>  each
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> hierarchical
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> gfac-thrift
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> service
> > > >>>>>>>
> > > >>>>>>>>  to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> orchestrator
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  discover the provider specific gfac thrift service and route
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> the
> > > >>>>
> > > >>>>>  message to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> the correct thrift service.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> With this approach I think we simply have write some
> client
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> code
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  thrift
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> services and clients and zookeeper server installation can
> > be
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> done as
> > > >>>>>>>
> > > >>>>>>>>  a
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> separate process and it will be easier to keep the
> Zookeeper
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> server
> > > >>>>>>>
> > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> server
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> little
> > > >>>>>>>
> > > >>>>>>>>  complex in production scenario. I think we have to make sure
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> everything
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> enable.zookeeper=false
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > start
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> zookeeper.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks
> > > >>>>>>>>>>>> Lahiru
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> System Analyst Programmer
> > > >>>>>>>>> PTI Lab
> > > >>>>>>>>> Indiana University
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>> --
> > > >>>>>>> System Analyst Programmer
> > > >>>>>>> PTI Lab
> > > >>>>>>> Indiana University
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Best Regards,
> > > >>>>>> Shameera Rathnayaka.
> > > >>>>>>
> > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Supun Kamburugamuva
> > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>>>> Blog: http://supunk.blogspot.com
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>> --
> > > >>>> System Analyst Programmer
> > > >>>> PTI Lab
> > > >>>> Indiana University
> > > >>>>
> > > >>>>
> > > >>>
> > > >>> --
> > > >>> Supun Kamburugamuva
> > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>> Blog: http://supunk.blogspot.com
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>



-- 
https://www.cs.indiana.edu/~tgunarat/
http://www.linkedin.com/in/thilina
http://thilina.gunarathne.org

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi All,

Currently GFAC  saves states of jobs in to Registry and orchestrator has to
modify to handle failed jobs  and relaunch, We were planning to run a
thread in orchestrator to find failed jobs and submit to gfac. So
Orchestrator is responsible for handing gfac failures. All the job states
is saving to registry frequently by GFAC.

I see good usecases with ZK to handle gfac worker failures. At a given
point Orchestrator can figure out what experiments are run by each GFAc
instance(We can either use ZK or registry to keep this data). If GFAC
crashes orchestrator can easily find out with ephemeral znodes so all the
jobs related to this node can be evaluated and relaunched again by
orchestrator.

Threading model in GFAC is like each experiment will run in a separate
thread until the job is submitted. Once it submit the job will put in to a
queue and there is another thread to do the monitoring for all the jobs for
that instance. IMO if we use ZK we can solve lot of fault tolerance issues
we have in our components by using a single ZK cluster.


Lahiru


On Mon, Jun 16, 2014 at 2:38 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Lahiru,
>
> Before moving with an implementation it may be worth to consider some of
> the following aspects as well.
>
> 1. How to report the progress of an experiment as state in ZooKeeper? What
> happens if a GFac instance crashes while executing an experiment? Are there
> check-points we can save so that another GFac instance can take over?
> 2. What is the threading model of GFac instances? (I consider this as a
> very important aspect)
> 3. What are the information needed to be stored in the ZooKeeper? You may
> need to store other information about an experiment apart from its
> experiment ID.
> 4. How to report errors?
> 5. For GFac weather you need a threading model or worker process model?
>
> Thanks,
> Supun..
>
>
>
>
>
> On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I think the conclusion is like this,
> >
> > 1, We make the gfac as a worker not a thrift service and we can start
> > multiple workers either with bunch of providers and handlers configured
> in
> > each worker or provider specific  workers to handle the class path issues
> > (not the common scenario).
> >
> > 2. Gfac workers can be configured to watch for a given path in zookeeper,
> > and multiple workers can listen to the same path. Default path can be
> > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > /airavata/gfac/bes.
> >
> > 3. Orchestrator can configure with a logic to store experiment IDs in
> > zookeeper with a path, and orchestrator can be configured to provider
> > specific path logic too. So when a new request come orchestrator store
> the
> > experimentID and these experiments IDs are stored in Zk as a queue.
> >
> > 4. Since gfac workers are watching they will be notified and as supun
> > suggested can use a leader selection algorithm[1] and one gfac worker
>  will
> > take the leadership for each experiment. If there are gfac instances for
> > each provider same logic will apply among those nodes with same provider
> > type.
> >
> > [1]http://curator.apache.org/curator-recipes/leader-election.html
> >
> > I would like to implement this if there are  no objections.
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <supun06@gmail.com
> >
> > wrote:
> >
> > > Hi Marlon,
> > >
> > > I think you are exactly correct.
> > >
> > > Supun..
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> wrote:
> > >
> > > > Let me restate this, and please tell me if I'm wrong.
> > > >
> > > > Orchestrator decides (somehow) that a particular job requires
> JSDL/BES,
> > > so
> > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > node.
> > > >  GFAC servers associated with this instance notice the update.  The
> > first
> > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > detailed
> > > > information it needs from the Registry.  ZooKeeper handles the
> locking,
> > > etc
> > > > to make sure that only one GFAC at a time is trying to handle an
> > > experiment.
> > > >
> > > > Marlon
> > > >
> > > >
> > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > >
> > > >> Hi Supun,
> > > >>
> > > >> Thanks for the clarification.
> > > >>
> > > >> Regards
> > > >> Lahiru
> > > >>
> > > >>
> > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > supun06@gmail.com>
> > > >> wrote:
> > > >>
> > > >>  Hi Lahiru,
> > > >>>
> > > >>> My suggestion is that may be you don't need a Thrift service
> between
> > > >>> Orchestrator and the component executing the experiment. When a new
> > > >>> experiment is submitted, orchestrator decides who can execute this
> > job.
> > > >>> Then it put the information about this experiment execution in
> > > ZooKeeper.
> > > >>> The component which wants to executes the experiment is listening
> to
> > > this
> > > >>> ZooKeeper path and when it sees the experiment it will execute it.
> So
> > > >>> that
> > > >>> the communication happens through an state change in ZooKeeper.
> This
> > > can
> > > >>> potentially simply your architecture.
> > > >>>
> > > >>> Thanks,
> > > >>> Supun.
> > > >>>
> > > >>>
> > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > glahiru@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>  Hi Supun,
> > > >>>>
> > > >>>> So your suggestion is to create a znode for each thrift service we
> > > have
> > > >>>> and
> > > >>>> when the request comes that node gets modified with input data for
> > > that
> > > >>>> request and thrift service is having a watch for that node and it
> > will
> > > >>>> be
> > > >>>> notified because of the watch and it can read the input from
> > zookeeper
> > > >>>> and
> > > >>>> invoke the operation?
> > > >>>>
> > > >>>> Lahiru
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > >>>> supun06@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>  Hi all,
> > > >>>>>
> > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > there
> > > >>>>> are
> > > >>>>> many components and these components must be stateless to achieve
> > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > >>>>>
> > > >>>> communicate
> > > >>>>
> > > >>>>> between the components. At the moment Airavata uses RPC calls
> based
> > > on
> > > >>>>> Thrift for the communication.
> > > >>>>>
> > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > >>>>>
> > > >>>> communication
> > > >>>>
> > > >>>>> layer between the components. I'm involved with a project that
> has
> > > many
> > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > services
> > > >>>>>
> > > >>>> to
> > > >>>>
> > > >>>>> communicate among the components. But we find it difficult to use
> > RPC
> > > >>>>>
> > > >>>> calls
> > > >>>>
> > > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > > >>>>>
> > > >>>> services
> > > >>>>
> > > >>>>> with ZooKeeper based communication layer. So I think it is better
> > to
> > > >>>>> explore the possibility of removing the Thrift services between
> the
> > > >>>>> components and use ZooKeeper as a communication mechanism between
> > the
> > > >>>>> services. If you do this you will have to move the state to
> > ZooKeeper
> > > >>>>>
> > > >>>> and
> > > >>>>
> > > >>>>> will automatically achieve the stateless behaviour in the
> > components.
> > > >>>>>
> > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If
> we
> > > are
> > > >>>>> trying to integrate something fundamentally important to
> > architecture
> > > >>>>> as
> > > >>>>> how to store state, we shouldn't make it optional.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Supun..
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > >>>>> shameerainfo@gmail.com> wrote:
> > > >>>>>
> > > >>>>>  Hi Lahiru,
> > > >>>>>>
> > > >>>>>> As i understood,  not only reliability , you are trying to
> achieve
> > > >>>>>> some
> > > >>>>>> other requirement by introducing zookeeper, like health
> monitoring
> > > of
> > > >>>>>>
> > > >>>>> the
> > > >>>>
> > > >>>>> services, categorization with service implementation etc ... . In
> > > that
> > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > only
> > > >>>>>>
> > > >>>>> focus
> > > >>>>
> > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > >>>>>>
> > > >>>>> clustering +
> > > >>>>
> > > >>>>> LB ?
> > > >>>>>>
> > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> > to
> > > >>>>>> use
> > > >>>>>> it.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>   Shameera.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > >>>>>> glahiru@gmail.com
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>  Hi Gagan,
> > > >>>>>>>
> > > >>>>>>> I need to start another discussion about it, but I had an
> offline
> > > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > > >>>>>>> thread
> > > >>>>>>> about this topic too.
> > > >>>>>>>
> > > >>>>>>> Regards
> > > >>>>>>> Lahiru
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > >>>>>>>
> > > >>>>>> gagandeepjuneja@gmail.com
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > dictionary
> > > >>>>>>>>
> > > >>>>>>> :).
> > > >>>>
> > > >>>>>  I would like to know how are we planning to start multiple
> > servers.
> > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> as
> > > auto
> > > >>>>>>>> scalable.
> > > >>>>>>>> 2. To make some specific number of nodes available such as we
> > > want 2
> > > >>>>>>>> servers to be available at any time so if one goes down then I
> > > need
> > > >>>>>>>>
> > > >>>>>>> to
> > > >>>>
> > > >>>>>  spawn one new to make available servers count 2.
> > > >>>>>>>> 3. Initially start all the servers.
> > > >>>>>>>>
> > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > believe
> > > >>>>>>>>
> > > >>>>>>> existing
> > > >>>>>>>
> > > >>>>>>>> architecture support this?
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Gagan
> > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> glahiru@gmail.com
> > >
> > > >>>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gagan,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > >>>>>>>>>
> > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > >>>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>  Hi Lahiru,
> > > >>>>>>>>>> Just my 2 cents.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > hops
> > > in
> > > >>>>>>>>>>
> > > >>>>>>>>> the
> > > >>>>>>>
> > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> able
> > to
> > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > because
> > > >>>>>>>>>>
> > > >>>>>>>>> of
> > > >>>>
> > > >>>>> less
> > > >>>>>>>
> > > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > > >>>>>>>>>>
> > > >>>>>>>>> discuss
> > > >>>>
> > > >>>>>  following point.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > Zookeeper
> > > >>>>>>>>>>
> > > >>>>>>>>> is
> > > >>>>
> > > >>>>> not
> > > >>>>>>>
> > > >>>>>>>> able to restart services. At max it can tell whether service
> is
> > up
> > > >>>>>>>>>>
> > > >>>>>>>>> or not
> > > >>>>>>>
> > > >>>>>>>> which could only be the case if airavata service goes down
> > > >>>>>>>>>>
> > > >>>>>>>>> gracefully and
> > > >>>>>>>
> > > >>>>>>>> we have any automated way to restart it. If this is just
> matter
> > of
> > > >>>>>>>>>>
> > > >>>>>>>>> routing
> > > >>>>>>>
> > > >>>>>>>> client requests to the available thrift servers then this can
> be
> > > >>>>>>>>>>
> > > >>>>>>>>> achieved
> > > >>>>>>>
> > > >>>>>>>> with the help of load balancer which I guess is already there
> in
> > > >>>>>>>>>>
> > > >>>>>>>>> thrift
> > > >>>>>>>
> > > >>>>>>>> wish list.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  We have multiple thrift services and currently we start
> only
> > > one
> > > >>>>>>>>>
> > > >>>>>>>> instance
> > > >>>>>>>
> > > >>>>>>>> of them and each thrift service is a stateless service. To
> keep
> > > the
> > > >>>>>>>>>
> > > >>>>>>>> high
> > > >>>>>>>
> > > >>>>>>>> availability we have to start multiple instances of them in
> > > >>>>>>>>>
> > > >>>>>>>> production
> > > >>>>
> > > >>>>>  scenario. So for clients to get an available thrift service we
> can
> > > >>>>>>>>>
> > > >>>>>>>> use
> > > >>>>
> > > >>>>>  zookeeper znodes to represent each available service. There are
> > > >>>>>>>>>
> > > >>>>>>>> some
> > > >>>>
> > > >>>>>  libraries which is doing similar[1] and I think we can use them
> > > >>>>>>>>>
> > > >>>>>>>> directly.
> > > >>>>>>>
> > > >>>>>>>> 2. As far as registering of different providers is concerned
> do
> > > >>>>>>>>>>
> > > >>>>>>>>> you
> > > >>>>
> > > >>>>>  think for that we really need external store.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> we
> > > have
> > > >>>>>>>>>
> > > >>>>>>>> to
> > > >>>>
> > > >>>>> do
> > > >>>>>>>
> > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > >>>>>>>>>
> > > >>>>>>>> Airavata
> > > >>>>
> > > >>>>>  because zookeeper handle all the complexity.
> > > >>>>>>>>>
> > > >>>>>>>>>  I have seen people using zookeeper more for state management
> > in
> > > >>>>>>>>>> distributed environments.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > because
> > > >>>>>>>>>
> > > >>>>>>>> all
> > > >>>>
> > > >>>>> of
> > > >>>>>>>
> > > >>>>>>>> our services are stateless services, but my point is to
> achieve
> > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > >>>>>>>>>
> > > >>>>>>>>>    I would like to understand more how can we leverage
> > zookeeper
> > > in
> > > >>>>>>>>>> airavata to make system reliable.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>  Regards,
> > > >>>>>>>>>> Gagan
> > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > > wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > > list
> > > >>>>>>>>>>>
> > > >>>>>>>>>> for
> > > >>>>
> > > >>>>>  additional comments.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Marlon
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> > use
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> it
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> Zookeeper
> > > >>>>
> > > >>>>> is a
> > > >>>>>>>
> > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> communication
> > > >>>>
> > > >>>>>  between
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > system
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> which
> > > >>>>>>>
> > > >>>>>>>>  has
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > amount
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> of
> > > >>>>
> > > >>>>> data
> > > >>>>>>>
> > > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> can
> > > >>>>
> > > >>>>>  connect
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > znodes.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> these
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  go
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> zookeeper
> > > to
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> store
> > > >>>>>>>
> > > >>>>>>>>  these
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> achieve
> > a
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> very
> > > >>>>
> > > >>>>>  reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> available
> > > >>>>>>>
> > > >>>>>>>>  service
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> generated
> > > >>>>>>>
> > > >>>>>>>>  thrift client code but we have to change the locations we are
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> invoking
> > > >>>>>>>
> > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> service
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> goes
> > > >>>>>>>
> > > >>>>>>>>  down
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > operations.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> With
> > > >>>>>>>
> > > >>>>>>>>  this
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > airavata,
> > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> services
> > > >>>>
> > > >>>>> for
> > > >>>>>>>
> > > >>>>>>>>  each
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> hierarchical
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> gfac-thrift
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> service
> > > >>>>>>>
> > > >>>>>>>>  to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> orchestrator
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  discover the provider specific gfac thrift service and route
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> the
> > > >>>>
> > > >>>>>  message to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> the correct thrift service.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> With this approach I think we simply have write some
> client
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> code
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  thrift
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> services and clients and zookeeper server installation can
> > be
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> done as
> > > >>>>>>>
> > > >>>>>>>>  a
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> separate process and it will be easier to keep the
> Zookeeper
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> server
> > > >>>>>>>
> > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> server
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> little
> > > >>>>>>>
> > > >>>>>>>>  complex in production scenario. I think we have to make sure
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> everything
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> enable.zookeeper=false
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > start
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> zookeeper.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks
> > > >>>>>>>>>>>> Lahiru
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> System Analyst Programmer
> > > >>>>>>>>> PTI Lab
> > > >>>>>>>>> Indiana University
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>> --
> > > >>>>>>> System Analyst Programmer
> > > >>>>>>> PTI Lab
> > > >>>>>>> Indiana University
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Best Regards,
> > > >>>>>> Shameera Rathnayaka.
> > > >>>>>>
> > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Supun Kamburugamuva
> > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>>>> Blog: http://supunk.blogspot.com
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>> --
> > > >>>> System Analyst Programmer
> > > >>>> PTI Lab
> > > >>>> Indiana University
> > > >>>>
> > > >>>>
> > > >>>
> > > >>> --
> > > >>> Supun Kamburugamuva
> > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>> Blog: http://supunk.blogspot.com
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Lahiru,

Before moving with an implementation it may be worth to consider some of
the following aspects as well.

1. How to report the progress of an experiment as state in ZooKeeper? What
happens if a GFac instance crashes while executing an experiment? Are there
check-points we can save so that another GFac instance can take over?
2. What is the threading model of GFac instances? (I consider this as a
very important aspect)
3. What are the information needed to be stored in the ZooKeeper? You may
need to store other information about an experiment apart from its
experiment ID.
4. How to report errors?
5. For GFac weather you need a threading model or worker process model?

Thanks,
Supun..





On Mon, Jun 16, 2014 at 2:22 PM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi All,
>
> I think the conclusion is like this,
>
> 1, We make the gfac as a worker not a thrift service and we can start
> multiple workers either with bunch of providers and handlers configured in
> each worker or provider specific  workers to handle the class path issues
> (not the common scenario).
>
> 2. Gfac workers can be configured to watch for a given path in zookeeper,
> and multiple workers can listen to the same path. Default path can be
> /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> /airavata/gfac/bes.
>
> 3. Orchestrator can configure with a logic to store experiment IDs in
> zookeeper with a path, and orchestrator can be configured to provider
> specific path logic too. So when a new request come orchestrator store the
> experimentID and these experiments IDs are stored in Zk as a queue.
>
> 4. Since gfac workers are watching they will be notified and as supun
> suggested can use a leader selection algorithm[1] and one gfac worker  will
> take the leadership for each experiment. If there are gfac instances for
> each provider same logic will apply among those nodes with same provider
> type.
>
> [1]http://curator.apache.org/curator-recipes/leader-election.html
>
> I would like to implement this if there are  no objections.
>
> Lahiru
>
>
> On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi Marlon,
> >
> > I think you are exactly correct.
> >
> > Supun..
> >
> >
> > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu> wrote:
> >
> > > Let me restate this, and please tell me if I'm wrong.
> > >
> > > Orchestrator decides (somehow) that a particular job requires JSDL/BES,
> > so
> > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> node.
> > >  GFAC servers associated with this instance notice the update.  The
> first
> > > GFAC to claim the job gets it, uses the Experiment ID to get the
> detailed
> > > information it needs from the Registry.  ZooKeeper handles the locking,
> > etc
> > > to make sure that only one GFAC at a time is trying to handle an
> > experiment.
> > >
> > > Marlon
> > >
> > >
> > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > >
> > >> Hi Supun,
> > >>
> > >> Thanks for the clarification.
> > >>
> > >> Regards
> > >> Lahiru
> > >>
> > >>
> > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > supun06@gmail.com>
> > >> wrote:
> > >>
> > >>  Hi Lahiru,
> > >>>
> > >>> My suggestion is that may be you don't need a Thrift service between
> > >>> Orchestrator and the component executing the experiment. When a new
> > >>> experiment is submitted, orchestrator decides who can execute this
> job.
> > >>> Then it put the information about this experiment execution in
> > ZooKeeper.
> > >>> The component which wants to executes the experiment is listening to
> > this
> > >>> ZooKeeper path and when it sees the experiment it will execute it. So
> > >>> that
> > >>> the communication happens through an state change in ZooKeeper. This
> > can
> > >>> potentially simply your architecture.
> > >>>
> > >>> Thanks,
> > >>> Supun.
> > >>>
> > >>>
> > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > glahiru@gmail.com>
> > >>> wrote:
> > >>>
> > >>>  Hi Supun,
> > >>>>
> > >>>> So your suggestion is to create a znode for each thrift service we
> > have
> > >>>> and
> > >>>> when the request comes that node gets modified with input data for
> > that
> > >>>> request and thrift service is having a watch for that node and it
> will
> > >>>> be
> > >>>> notified because of the watch and it can read the input from
> zookeeper
> > >>>> and
> > >>>> invoke the operation?
> > >>>>
> > >>>> Lahiru
> > >>>>
> > >>>>
> > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > >>>> supun06@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>  Hi all,
> > >>>>>
> > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> there
> > >>>>> are
> > >>>>> many components and these components must be stateless to achieve
> > >>>>> scalability and reliability.Also there must be a mechanism to
> > >>>>>
> > >>>> communicate
> > >>>>
> > >>>>> between the components. At the moment Airavata uses RPC calls based
> > on
> > >>>>> Thrift for the communication.
> > >>>>>
> > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > >>>>>
> > >>>> communication
> > >>>>
> > >>>>> layer between the components. I'm involved with a project that has
> > many
> > >>>>> distributed components like AIravata. Right now we use Thrift
> > services
> > >>>>>
> > >>>> to
> > >>>>
> > >>>>> communicate among the components. But we find it difficult to use
> RPC
> > >>>>>
> > >>>> calls
> > >>>>
> > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > >>>>>
> > >>>> services
> > >>>>
> > >>>>> with ZooKeeper based communication layer. So I think it is better
> to
> > >>>>> explore the possibility of removing the Thrift services between the
> > >>>>> components and use ZooKeeper as a communication mechanism between
> the
> > >>>>> services. If you do this you will have to move the state to
> ZooKeeper
> > >>>>>
> > >>>> and
> > >>>>
> > >>>>> will automatically achieve the stateless behaviour in the
> components.
> > >>>>>
> > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If we
> > are
> > >>>>> trying to integrate something fundamentally important to
> architecture
> > >>>>> as
> > >>>>> how to store state, we shouldn't make it optional.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Supun..
> > >>>>>
> > >>>>>
> > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > >>>>> shameerainfo@gmail.com> wrote:
> > >>>>>
> > >>>>>  Hi Lahiru,
> > >>>>>>
> > >>>>>> As i understood,  not only reliability , you are trying to achieve
> > >>>>>> some
> > >>>>>> other requirement by introducing zookeeper, like health monitoring
> > of
> > >>>>>>
> > >>>>> the
> > >>>>
> > >>>>> services, categorization with service implementation etc ... . In
> > that
> > >>>>>> case, i think we can get use of zookeeper's features but if we
> only
> > >>>>>>
> > >>>>> focus
> > >>>>
> > >>>>> on reliability, i have little bit of concern, why can't we use
> > >>>>>>
> > >>>>> clustering +
> > >>>>
> > >>>>> LB ?
> > >>>>>>
> > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> to
> > >>>>>> use
> > >>>>>> it.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>>   Shameera.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > >>>>>> glahiru@gmail.com
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>  Hi Gagan,
> > >>>>>>>
> > >>>>>>> I need to start another discussion about it, but I had an offline
> > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > >>>>>>> thread
> > >>>>>>> about this topic too.
> > >>>>>>>
> > >>>>>>> Regards
> > >>>>>>> Lahiru
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > >>>>>>>
> > >>>>>> gagandeepjuneja@gmail.com
> > >>>>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> dictionary
> > >>>>>>>>
> > >>>>>>> :).
> > >>>>
> > >>>>>  I would like to know how are we planning to start multiple
> servers.
> > >>>>>>>> 1. Spawning new servers based on load? Some times we call it as
> > auto
> > >>>>>>>> scalable.
> > >>>>>>>> 2. To make some specific number of nodes available such as we
> > want 2
> > >>>>>>>> servers to be available at any time so if one goes down then I
> > need
> > >>>>>>>>
> > >>>>>>> to
> > >>>>
> > >>>>>  spawn one new to make available servers count 2.
> > >>>>>>>> 3. Initially start all the servers.
> > >>>>>>>>
> > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> believe
> > >>>>>>>>
> > >>>>>>> existing
> > >>>>>>>
> > >>>>>>>> architecture support this?
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Gagan
> > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <glahiru@gmail.com
> >
> > >>>>>>>>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Gagan,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > >>>>>>>>>
> > >>>>>>>> gagandeepjuneja@gmail.com>
> > >>>>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>  Hi Lahiru,
> > >>>>>>>>>> Just my 2 cents.
> > >>>>>>>>>>
> > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> hops
> > in
> > >>>>>>>>>>
> > >>>>>>>>> the
> > >>>>>>>
> > >>>>>>>> system which can add unnecessary complexity. Here I am not able
> to
> > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > because
> > >>>>>>>>>>
> > >>>>>>>>> of
> > >>>>
> > >>>>> less
> > >>>>>>>
> > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > >>>>>>>>>>
> > >>>>>>>>> discuss
> > >>>>
> > >>>>>  following point.
> > >>>>>>>>>>
> > >>>>>>>>>> 1. How it will help us in making system more reliable.
> Zookeeper
> > >>>>>>>>>>
> > >>>>>>>>> is
> > >>>>
> > >>>>> not
> > >>>>>>>
> > >>>>>>>> able to restart services. At max it can tell whether service is
> up
> > >>>>>>>>>>
> > >>>>>>>>> or not
> > >>>>>>>
> > >>>>>>>> which could only be the case if airavata service goes down
> > >>>>>>>>>>
> > >>>>>>>>> gracefully and
> > >>>>>>>
> > >>>>>>>> we have any automated way to restart it. If this is just matter
> of
> > >>>>>>>>>>
> > >>>>>>>>> routing
> > >>>>>>>
> > >>>>>>>> client requests to the available thrift servers then this can be
> > >>>>>>>>>>
> > >>>>>>>>> achieved
> > >>>>>>>
> > >>>>>>>> with the help of load balancer which I guess is already there in
> > >>>>>>>>>>
> > >>>>>>>>> thrift
> > >>>>>>>
> > >>>>>>>> wish list.
> > >>>>>>>>>>
> > >>>>>>>>>>  We have multiple thrift services and currently we start only
> > one
> > >>>>>>>>>
> > >>>>>>>> instance
> > >>>>>>>
> > >>>>>>>> of them and each thrift service is a stateless service. To keep
> > the
> > >>>>>>>>>
> > >>>>>>>> high
> > >>>>>>>
> > >>>>>>>> availability we have to start multiple instances of them in
> > >>>>>>>>>
> > >>>>>>>> production
> > >>>>
> > >>>>>  scenario. So for clients to get an available thrift service we can
> > >>>>>>>>>
> > >>>>>>>> use
> > >>>>
> > >>>>>  zookeeper znodes to represent each available service. There are
> > >>>>>>>>>
> > >>>>>>>> some
> > >>>>
> > >>>>>  libraries which is doing similar[1] and I think we can use them
> > >>>>>>>>>
> > >>>>>>>> directly.
> > >>>>>>>
> > >>>>>>>> 2. As far as registering of different providers is concerned do
> > >>>>>>>>>>
> > >>>>>>>>> you
> > >>>>
> > >>>>>  think for that we really need external store.
> > >>>>>>>>>>
> > >>>>>>>>>>  Yes I think so, because its light weight and reliable and we
> > have
> > >>>>>>>>>
> > >>>>>>>> to
> > >>>>
> > >>>>> do
> > >>>>>>>
> > >>>>>>>> very minimal amount of work to achieve all these features to
> > >>>>>>>>>
> > >>>>>>>> Airavata
> > >>>>
> > >>>>>  because zookeeper handle all the complexity.
> > >>>>>>>>>
> > >>>>>>>>>  I have seen people using zookeeper more for state management
> in
> > >>>>>>>>>> distributed environments.
> > >>>>>>>>>>
> > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > because
> > >>>>>>>>>
> > >>>>>>>> all
> > >>>>
> > >>>>> of
> > >>>>>>>
> > >>>>>>>> our services are stateless services, but my point is to achieve
> > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > >>>>>>>>>
> > >>>>>>>>>    I would like to understand more how can we leverage
> zookeeper
> > in
> > >>>>>>>>>> airavata to make system reliable.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>  Regards,
> > >>>>>>>>>> Gagan
> > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > list
> > >>>>>>>>>>>
> > >>>>>>>>>> for
> > >>>>
> > >>>>>  additional comments.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Marlon
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi All,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> use
> > >>>>>>>>>>>>
> > >>>>>>>>>>> it
> > >>>>
> > >>>>> in
> > >>>>>>>
> > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance and
> > >>>>>>>>>>>>
> > >>>>>>>>>>> reliable
> > >>>>>>>>>>>
> > >>>>>>>>>>>> communication between our thrift services and clients.
> > >>>>>>>>>>>>
> > >>>>>>>>>>> Zookeeper
> > >>>>
> > >>>>> is a
> > >>>>>>>
> > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > >>>>>>>>>>>>
> > >>>>>>>>>>> communication
> > >>>>
> > >>>>>  between
> > >>>>>>>>>>>
> > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > system
> > >>>>>>>>>>>>
> > >>>>>>>>>>> which
> > >>>>>>>
> > >>>>>>>>  has
> > >>>>>>>>>>>
> > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> amount
> > >>>>>>>>>>>>
> > >>>>>>>>>>> of
> > >>>>
> > >>>>> data
> > >>>>>>>
> > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > >>>>>>>>>>>>
> > >>>>>>>>>>> can
> > >>>>
> > >>>>>  connect
> > >>>>>>>>>>>
> > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> znodes.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services and
> > >>>>>>>>>>>>
> > >>>>>>>>>>> these
> > >>>>
> > >>>>> can
> > >>>>>>>
> > >>>>>>>>  go
> > >>>>>>>>>>>
> > >>>>>>>>>>>> down for maintenance or these can crash, if we use zookeeper
> > to
> > >>>>>>>>>>>>
> > >>>>>>>>>>> store
> > >>>>>>>
> > >>>>>>>>  these
> > >>>>>>>>>>>
> > >>>>>>>>>>>> configuration(thrift service configurations) we can achieve
> a
> > >>>>>>>>>>>>
> > >>>>>>>>>>> very
> > >>>>
> > >>>>>  reliable
> > >>>>>>>>>>>
> > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > >>>>>>>>>>>>
> > >>>>>>>>>>> available
> > >>>>>>>
> > >>>>>>>>  service
> > >>>>>>>>>>>
> > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change the
> > >>>>>>>>>>>>
> > >>>>>>>>>>> generated
> > >>>>>>>
> > >>>>>>>>  thrift client code but we have to change the locations we are
> > >>>>>>>>>>>>
> > >>>>>>>>>>> invoking
> > >>>>>>>
> > >>>>>>>>  them). ephemeral znodes will be removed when the thrift service
> > >>>>>>>>>>>>
> > >>>>>>>>>>> goes
> > >>>>>>>
> > >>>>>>>>  down
> > >>>>>>>>>>>
> > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > operations.
> > >>>>>>>>>>>>
> > >>>>>>>>>>> With
> > >>>>>>>
> > >>>>>>>>  this
> > >>>>>>>>>>>
> > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > airavata,
> > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > >>>>>>>>>>>>
> > >>>>>>>>>>> services
> > >>>>
> > >>>>> for
> > >>>>>>>
> > >>>>>>>>  each
> > >>>>>>>>>>>
> > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > >>>>>>>>>>>>
> > >>>>>>>>>>> hierarchical
> > >>>>>>>>>>>
> > >>>>>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
> > >>>>>>>>>>>>
> > >>>>>>>>>>> service
> > >>>>>>>
> > >>>>>>>>  to
> > >>>>>>>>>>>
> > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > >>>>>>>>>>>>
> > >>>>>>>>>>> orchestrator
> > >>>>
> > >>>>> can
> > >>>>>>>
> > >>>>>>>>  discover the provider specific gfac thrift service and route
> > >>>>>>>>>>>>
> > >>>>>>>>>>> the
> > >>>>
> > >>>>>  message to
> > >>>>>>>>>>>
> > >>>>>>>>>>>> the correct thrift service.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> With this approach I think we simply have write some client
> > >>>>>>>>>>>>
> > >>>>>>>>>>> code
> > >>>>
> > >>>>> in
> > >>>>>>>
> > >>>>>>>>  thrift
> > >>>>>>>>>>>
> > >>>>>>>>>>>> services and clients and zookeeper server installation can
> be
> > >>>>>>>>>>>>
> > >>>>>>>>>>> done as
> > >>>>>>>
> > >>>>>>>>  a
> > >>>>>>>>>>>
> > >>>>>>>>>>>> separate process and it will be easier to keep the Zookeeper
> > >>>>>>>>>>>>
> > >>>>>>>>>>> server
> > >>>>>>>
> > >>>>>>>>  separate from Airavata because installation of Zookeeper server
> > >>>>>>>>>>>>
> > >>>>>>>>>>> little
> > >>>>>>>
> > >>>>>>>>  complex in production scenario. I think we have to make sure
> > >>>>>>>>>>>>
> > >>>>>>>>>>> everything
> > >>>>>>>>>>>
> > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > >>>>>>>>>>>>
> > >>>>>>>>>>> enable.zookeeper=false
> > >>>>>>>>>>>
> > >>>>>>>>>>>> should works fine and users doesn't have to download and
> start
> > >>>>>>>>>>>>
> > >>>>>>>>>>> zookeeper.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks
> > >>>>>>>>>>>> Lahiru
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>> System Analyst Programmer
> > >>>>>>>>> PTI Lab
> > >>>>>>>>> Indiana University
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>> --
> > >>>>>>> System Analyst Programmer
> > >>>>>>> PTI Lab
> > >>>>>>> Indiana University
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Best Regards,
> > >>>>>> Shameera Rathnayaka.
> > >>>>>>
> > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Supun Kamburugamuva
> > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >>>>> Blog: http://supunk.blogspot.com
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>> --
> > >>>> System Analyst Programmer
> > >>>> PTI Lab
> > >>>> Indiana University
> > >>>>
> > >>>>
> > >>>
> > >>> --
> > >>> Supun Kamburugamuva
> > >>> Member, Apache Software Foundation; http://www.apache.org
> > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >>> Blog: http://supunk.blogspot.com
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Gagan Juneja <ga...@gmail.com>.

+1.

There are multiple ways to handle failure in above mentioned scenarios.
What is happening today if any GFac server fails while serving?

And these cases need to be taken care any which path we opt either
zookeeper or persistent messaging technique.

Kafka also handles same leader selection concept using zookeeper.

Regards,
Gagan
On 17-Jun-2014 12:11 am, "Eran Chinthaka Withana" <er...@gmail.com>
wrote:

> Hi Lahiru
>
> As you explained, ZK is good at maintaining ephemeral nodes. And you can
> also use ZK to maintain your queues. But I have few questions here (didn't
> get the chance to read the whole thread and the following questions are
> based on your summary)
>
> 1. How do you handle failures in workers? IOW, you have mentioned that
> through a leader election a GFac worker will be assigned an experiment and
> he will run with it. What if it fails? What if takes too much time?
> 2. How do you load balance between the workers? How do you maintain the
> worker pool?
> 3. Can more than one GFac worker be used to run an experiment? If yes, how
> will they communicate?
>
> The reason I raised all these questions is storm is implemented just to do
> what you are trying to do and it had encountered and addressed all these
> problems. May be GFac use case is not as complicated as I'm thinking. I'm
> not suggesting to use storm if GFac use case is too light but there are
> lessons we can learn from it.
>
> In addition to this, I have a more abstract question. Isn't this simply a
> pub-sub system we are talking about? Orchestrator, acting as the publisher
> will put a job (experiment) to the queue. Worker, acting as subscribers,
> get the work and execute it. So the main question is why are we trying to
> use zookeeper to act as a queue. I'm not saying its bad but there are other
> scalable and proven ways of doing this (like a persistence messaging
> solutions) but state shared using ZK.
>
>
> Thanks,
> Eran Chinthaka Withana
>
>
> On Mon, Jun 16, 2014 at 11:22 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I think the conclusion is like this,
> >
> > 1, We make the gfac as a worker not a thrift service and we can start
> > multiple workers either with bunch of providers and handlers configured
> in
> > each worker or provider specific  workers to handle the class path issues
> > (not the common scenario).
> >
> > 2. Gfac workers can be configured to watch for a given path in zookeeper,
> > and multiple workers can listen to the same path. Default path can be
> > /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> > /airavata/gfac/bes.
> >
> > 3. Orchestrator can configure with a logic to store experiment IDs in
> > zookeeper with a path, and orchestrator can be configured to provider
> > specific path logic too. So when a new request come orchestrator store
> the
> > experimentID and these experiments IDs are stored in Zk as a queue.
> >
> > 4. Since gfac workers are watching they will be notified and as supun
> > suggested can use a leader selection algorithm[1] and one gfac worker
>  will
> > take the leadership for each experiment. If there are gfac instances for
> > each provider same logic will apply among those nodes with same provider
> > type.
> >
> > [1]http://curator.apache.org/curator-recipes/leader-election.html
> >
> > I would like to implement this if there are  no objections.
> >
> > Lahiru
> >
> >
> > On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <supun06@gmail.com
> >
> > wrote:
> >
> > > Hi Marlon,
> > >
> > > I think you are exactly correct.
> > >
> > > Supun..
> > >
> > >
> > > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu>
> wrote:
> > >
> > > > Let me restate this, and please tell me if I'm wrong.
> > > >
> > > > Orchestrator decides (somehow) that a particular job requires
> JSDL/BES,
> > > so
> > > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> > node.
> > > >  GFAC servers associated with this instance notice the update.  The
> > first
> > > > GFAC to claim the job gets it, uses the Experiment ID to get the
> > detailed
> > > > information it needs from the Registry.  ZooKeeper handles the
> locking,
> > > etc
> > > > to make sure that only one GFAC at a time is trying to handle an
> > > experiment.
> > > >
> > > > Marlon
> > > >
> > > >
> > > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > > >
> > > >> Hi Supun,
> > > >>
> > > >> Thanks for the clarification.
> > > >>
> > > >> Regards
> > > >> Lahiru
> > > >>
> > > >>
> > > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > > supun06@gmail.com>
> > > >> wrote:
> > > >>
> > > >>  Hi Lahiru,
> > > >>>
> > > >>> My suggestion is that may be you don't need a Thrift service
> between
> > > >>> Orchestrator and the component executing the experiment. When a new
> > > >>> experiment is submitted, orchestrator decides who can execute this
> > job.
> > > >>> Then it put the information about this experiment execution in
> > > ZooKeeper.
> > > >>> The component which wants to executes the experiment is listening
> to
> > > this
> > > >>> ZooKeeper path and when it sees the experiment it will execute it.
> So
> > > >>> that
> > > >>> the communication happens through an state change in ZooKeeper.
> This
> > > can
> > > >>> potentially simply your architecture.
> > > >>>
> > > >>> Thanks,
> > > >>> Supun.
> > > >>>
> > > >>>
> > > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > > glahiru@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>  Hi Supun,
> > > >>>>
> > > >>>> So your suggestion is to create a znode for each thrift service we
> > > have
> > > >>>> and
> > > >>>> when the request comes that node gets modified with input data for
> > > that
> > > >>>> request and thrift service is having a watch for that node and it
> > will
> > > >>>> be
> > > >>>> notified because of the watch and it can read the input from
> > zookeeper
> > > >>>> and
> > > >>>> invoke the operation?
> > > >>>>
> > > >>>> Lahiru
> > > >>>>
> > > >>>>
> > > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > > >>>> supun06@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>  Hi all,
> > > >>>>>
> > > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> > there
> > > >>>>> are
> > > >>>>> many components and these components must be stateless to achieve
> > > >>>>> scalability and reliability.Also there must be a mechanism to
> > > >>>>>
> > > >>>> communicate
> > > >>>>
> > > >>>>> between the components. At the moment Airavata uses RPC calls
> based
> > > on
> > > >>>>> Thrift for the communication.
> > > >>>>>
> > > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > > >>>>>
> > > >>>> communication
> > > >>>>
> > > >>>>> layer between the components. I'm involved with a project that
> has
> > > many
> > > >>>>> distributed components like AIravata. Right now we use Thrift
> > > services
> > > >>>>>
> > > >>>> to
> > > >>>>
> > > >>>>> communicate among the components. But we find it difficult to use
> > RPC
> > > >>>>>
> > > >>>> calls
> > > >>>>
> > > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > > >>>>>
> > > >>>> services
> > > >>>>
> > > >>>>> with ZooKeeper based communication layer. So I think it is better
> > to
> > > >>>>> explore the possibility of removing the Thrift services between
> the
> > > >>>>> components and use ZooKeeper as a communication mechanism between
> > the
> > > >>>>> services. If you do this you will have to move the state to
> > ZooKeeper
> > > >>>>>
> > > >>>> and
> > > >>>>
> > > >>>>> will automatically achieve the stateless behaviour in the
> > components.
> > > >>>>>
> > > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If
> we
> > > are
> > > >>>>> trying to integrate something fundamentally important to
> > architecture
> > > >>>>> as
> > > >>>>> how to store state, we shouldn't make it optional.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Supun..
> > > >>>>>
> > > >>>>>
> > > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > > >>>>> shameerainfo@gmail.com> wrote:
> > > >>>>>
> > > >>>>>  Hi Lahiru,
> > > >>>>>>
> > > >>>>>> As i understood,  not only reliability , you are trying to
> achieve
> > > >>>>>> some
> > > >>>>>> other requirement by introducing zookeeper, like health
> monitoring
> > > of
> > > >>>>>>
> > > >>>>> the
> > > >>>>
> > > >>>>> services, categorization with service implementation etc ... . In
> > > that
> > > >>>>>> case, i think we can get use of zookeeper's features but if we
> > only
> > > >>>>>>
> > > >>>>> focus
> > > >>>>
> > > >>>>> on reliability, i have little bit of concern, why can't we use
> > > >>>>>>
> > > >>>>> clustering +
> > > >>>>
> > > >>>>> LB ?
> > > >>>>>>
> > > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> > to
> > > >>>>>> use
> > > >>>>>> it.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>   Shameera.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > > >>>>>> glahiru@gmail.com
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>  Hi Gagan,
> > > >>>>>>>
> > > >>>>>>> I need to start another discussion about it, but I had an
> offline
> > > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > > >>>>>>> thread
> > > >>>>>>> about this topic too.
> > > >>>>>>>
> > > >>>>>>> Regards
> > > >>>>>>> Lahiru
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > > >>>>>>>
> > > >>>>>> gagandeepjuneja@gmail.com
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> > dictionary
> > > >>>>>>>>
> > > >>>>>>> :).
> > > >>>>
> > > >>>>>  I would like to know how are we planning to start multiple
> > servers.
> > > >>>>>>>> 1. Spawning new servers based on load? Some times we call it
> as
> > > auto
> > > >>>>>>>> scalable.
> > > >>>>>>>> 2. To make some specific number of nodes available such as we
> > > want 2
> > > >>>>>>>> servers to be available at any time so if one goes down then I
> > > need
> > > >>>>>>>>
> > > >>>>>>> to
> > > >>>>
> > > >>>>>  spawn one new to make available servers count 2.
> > > >>>>>>>> 3. Initially start all the servers.
> > > >>>>>>>>
> > > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> > believe
> > > >>>>>>>>
> > > >>>>>>> existing
> > > >>>>>>>
> > > >>>>>>>> architecture support this?
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Gagan
> > > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <
> glahiru@gmail.com
> > >
> > > >>>>>>>>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Gagan,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > > >>>>>>>>>
> > > >>>>>>>> gagandeepjuneja@gmail.com>
> > > >>>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>  Hi Lahiru,
> > > >>>>>>>>>> Just my 2 cents.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> > hops
> > > in
> > > >>>>>>>>>>
> > > >>>>>>>>> the
> > > >>>>>>>
> > > >>>>>>>> system which can add unnecessary complexity. Here I am not
> able
> > to
> > > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > > because
> > > >>>>>>>>>>
> > > >>>>>>>>> of
> > > >>>>
> > > >>>>> less
> > > >>>>>>>
> > > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > > >>>>>>>>>>
> > > >>>>>>>>> discuss
> > > >>>>
> > > >>>>>  following point.
> > > >>>>>>>>>>
> > > >>>>>>>>>> 1. How it will help us in making system more reliable.
> > Zookeeper
> > > >>>>>>>>>>
> > > >>>>>>>>> is
> > > >>>>
> > > >>>>> not
> > > >>>>>>>
> > > >>>>>>>> able to restart services. At max it can tell whether service
> is
> > up
> > > >>>>>>>>>>
> > > >>>>>>>>> or not
> > > >>>>>>>
> > > >>>>>>>> which could only be the case if airavata service goes down
> > > >>>>>>>>>>
> > > >>>>>>>>> gracefully and
> > > >>>>>>>
> > > >>>>>>>> we have any automated way to restart it. If this is just
> matter
> > of
> > > >>>>>>>>>>
> > > >>>>>>>>> routing
> > > >>>>>>>
> > > >>>>>>>> client requests to the available thrift servers then this can
> be
> > > >>>>>>>>>>
> > > >>>>>>>>> achieved
> > > >>>>>>>
> > > >>>>>>>> with the help of load balancer which I guess is already there
> in
> > > >>>>>>>>>>
> > > >>>>>>>>> thrift
> > > >>>>>>>
> > > >>>>>>>> wish list.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  We have multiple thrift services and currently we start
> only
> > > one
> > > >>>>>>>>>
> > > >>>>>>>> instance
> > > >>>>>>>
> > > >>>>>>>> of them and each thrift service is a stateless service. To
> keep
> > > the
> > > >>>>>>>>>
> > > >>>>>>>> high
> > > >>>>>>>
> > > >>>>>>>> availability we have to start multiple instances of them in
> > > >>>>>>>>>
> > > >>>>>>>> production
> > > >>>>
> > > >>>>>  scenario. So for clients to get an available thrift service we
> can
> > > >>>>>>>>>
> > > >>>>>>>> use
> > > >>>>
> > > >>>>>  zookeeper znodes to represent each available service. There are
> > > >>>>>>>>>
> > > >>>>>>>> some
> > > >>>>
> > > >>>>>  libraries which is doing similar[1] and I think we can use them
> > > >>>>>>>>>
> > > >>>>>>>> directly.
> > > >>>>>>>
> > > >>>>>>>> 2. As far as registering of different providers is concerned
> do
> > > >>>>>>>>>>
> > > >>>>>>>>> you
> > > >>>>
> > > >>>>>  think for that we really need external store.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Yes I think so, because its light weight and reliable and
> we
> > > have
> > > >>>>>>>>>
> > > >>>>>>>> to
> > > >>>>
> > > >>>>> do
> > > >>>>>>>
> > > >>>>>>>> very minimal amount of work to achieve all these features to
> > > >>>>>>>>>
> > > >>>>>>>> Airavata
> > > >>>>
> > > >>>>>  because zookeeper handle all the complexity.
> > > >>>>>>>>>
> > > >>>>>>>>>  I have seen people using zookeeper more for state management
> > in
> > > >>>>>>>>>> distributed environments.
> > > >>>>>>>>>>
> > > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > > because
> > > >>>>>>>>>
> > > >>>>>>>> all
> > > >>>>
> > > >>>>> of
> > > >>>>>>>
> > > >>>>>>>> our services are stateless services, but my point is to
> achieve
> > > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > > >>>>>>>>>
> > > >>>>>>>>>    I would like to understand more how can we leverage
> > zookeeper
> > > in
> > > >>>>>>>>>> airavata to make system reliable.
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>  Regards,
> > > >>>>>>>>>> Gagan
> > > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > > wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > > list
> > > >>>>>>>>>>>
> > > >>>>>>>>>> for
> > > >>>>
> > > >>>>>  additional comments.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Marlon
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi All,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> > use
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> it
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> communication between our thrift services and clients.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> Zookeeper
> > > >>>>
> > > >>>>> is a
> > > >>>>>>>
> > > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> communication
> > > >>>>
> > > >>>>>  between
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > > system
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> which
> > > >>>>>>>
> > > >>>>>>>>  has
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> > amount
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> of
> > > >>>>
> > > >>>>> data
> > > >>>>>>>
> > > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> can
> > > >>>>
> > > >>>>>  connect
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> > znodes.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services
> and
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> these
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  go
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> down for maintenance or these can crash, if we use
> zookeeper
> > > to
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> store
> > > >>>>>>>
> > > >>>>>>>>  these
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> configuration(thrift service configurations) we can
> achieve
> > a
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> very
> > > >>>>
> > > >>>>>  reliable
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> available
> > > >>>>>>>
> > > >>>>>>>>  service
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change
> the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> generated
> > > >>>>>>>
> > > >>>>>>>>  thrift client code but we have to change the locations we are
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> invoking
> > > >>>>>>>
> > > >>>>>>>>  them). ephemeral znodes will be removed when the thrift
> service
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> goes
> > > >>>>>>>
> > > >>>>>>>>  down
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > > operations.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> With
> > > >>>>>>>
> > > >>>>>>>>  this
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > > airavata,
> > > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> services
> > > >>>>
> > > >>>>> for
> > > >>>>>>>
> > > >>>>>>>>  each
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> hierarchical
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> support in zookeeper and providing some logic in
> gfac-thrift
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> service
> > > >>>>>>>
> > > >>>>>>>>  to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> orchestrator
> > > >>>>
> > > >>>>> can
> > > >>>>>>>
> > > >>>>>>>>  discover the provider specific gfac thrift service and route
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> the
> > > >>>>
> > > >>>>>  message to
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> the correct thrift service.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> With this approach I think we simply have write some
> client
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> code
> > > >>>>
> > > >>>>> in
> > > >>>>>>>
> > > >>>>>>>>  thrift
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> services and clients and zookeeper server installation can
> > be
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> done as
> > > >>>>>>>
> > > >>>>>>>>  a
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> separate process and it will be easier to keep the
> Zookeeper
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> server
> > > >>>>>>>
> > > >>>>>>>>  separate from Airavata because installation of Zookeeper
> server
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> little
> > > >>>>>>>
> > > >>>>>>>>  complex in production scenario. I think we have to make sure
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> everything
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> enable.zookeeper=false
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> should works fine and users doesn't have to download and
> > start
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>> zookeeper.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thanks
> > > >>>>>>>>>>>> Lahiru
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>> --
> > > >>>>>>>>> System Analyst Programmer
> > > >>>>>>>>> PTI Lab
> > > >>>>>>>>> Indiana University
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>> --
> > > >>>>>>> System Analyst Programmer
> > > >>>>>>> PTI Lab
> > > >>>>>>> Indiana University
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Best Regards,
> > > >>>>>> Shameera Rathnayaka.
> > > >>>>>>
> > > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Supun Kamburugamuva
> > > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>>>> Blog: http://supunk.blogspot.com
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>> --
> > > >>>> System Analyst Programmer
> > > >>>> PTI Lab
> > > >>>> Indiana University
> > > >>>>
> > > >>>>
> > > >>>
> > > >>> --
> > > >>> Supun Kamburugamuva
> > > >>> Member, Apache Software Foundation; http://www.apache.org
> > > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > >>> Blog: http://supunk.blogspot.com
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Eran Chinthaka Withana <er...@gmail.com>.

Hi Lahiru

As you explained, ZK is good at maintaining ephemeral nodes. And you can
also use ZK to maintain your queues. But I have few questions here (didn't
get the chance to read the whole thread and the following questions are
based on your summary)

1. How do you handle failures in workers? IOW, you have mentioned that
through a leader election a GFac worker will be assigned an experiment and
he will run with it. What if it fails? What if takes too much time?
2. How do you load balance between the workers? How do you maintain the
worker pool?
3. Can more than one GFac worker be used to run an experiment? If yes, how
will they communicate?

The reason I raised all these questions is storm is implemented just to do
what you are trying to do and it had encountered and addressed all these
problems. May be GFac use case is not as complicated as I'm thinking. I'm
not suggesting to use storm if GFac use case is too light but there are
lessons we can learn from it.

In addition to this, I have a more abstract question. Isn't this simply a
pub-sub system we are talking about? Orchestrator, acting as the publisher
will put a job (experiment) to the queue. Worker, acting as subscribers,
get the work and execute it. So the main question is why are we trying to
use zookeeper to act as a queue. I'm not saying its bad but there are other
scalable and proven ways of doing this (like a persistence messaging
solutions) but state shared using ZK.


Thanks,
Eran Chinthaka Withana


On Mon, Jun 16, 2014 at 11:22 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi All,
>
> I think the conclusion is like this,
>
> 1, We make the gfac as a worker not a thrift service and we can start
> multiple workers either with bunch of providers and handlers configured in
> each worker or provider specific  workers to handle the class path issues
> (not the common scenario).
>
> 2. Gfac workers can be configured to watch for a given path in zookeeper,
> and multiple workers can listen to the same path. Default path can be
> /airavata/gfac or can configure paths like /airavata/gfac/gsissh
> /airavata/gfac/bes.
>
> 3. Orchestrator can configure with a logic to store experiment IDs in
> zookeeper with a path, and orchestrator can be configured to provider
> specific path logic too. So when a new request come orchestrator store the
> experimentID and these experiments IDs are stored in Zk as a queue.
>
> 4. Since gfac workers are watching they will be notified and as supun
> suggested can use a leader selection algorithm[1] and one gfac worker  will
> take the leadership for each experiment. If there are gfac instances for
> each provider same logic will apply among those nodes with same provider
> type.
>
> [1]http://curator.apache.org/curator-recipes/leader-election.html
>
> I would like to implement this if there are  no objections.
>
> Lahiru
>
>
> On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi Marlon,
> >
> > I think you are exactly correct.
> >
> > Supun..
> >
> >
> > On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu> wrote:
> >
> > > Let me restate this, and please tell me if I'm wrong.
> > >
> > > Orchestrator decides (somehow) that a particular job requires JSDL/BES,
> > so
> > > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes
> node.
> > >  GFAC servers associated with this instance notice the update.  The
> first
> > > GFAC to claim the job gets it, uses the Experiment ID to get the
> detailed
> > > information it needs from the Registry.  ZooKeeper handles the locking,
> > etc
> > > to make sure that only one GFAC at a time is trying to handle an
> > experiment.
> > >
> > > Marlon
> > >
> > >
> > > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> > >
> > >> Hi Supun,
> > >>
> > >> Thanks for the clarification.
> > >>
> > >> Regards
> > >> Lahiru
> > >>
> > >>
> > >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> > supun06@gmail.com>
> > >> wrote:
> > >>
> > >>  Hi Lahiru,
> > >>>
> > >>> My suggestion is that may be you don't need a Thrift service between
> > >>> Orchestrator and the component executing the experiment. When a new
> > >>> experiment is submitted, orchestrator decides who can execute this
> job.
> > >>> Then it put the information about this experiment execution in
> > ZooKeeper.
> > >>> The component which wants to executes the experiment is listening to
> > this
> > >>> ZooKeeper path and when it sees the experiment it will execute it. So
> > >>> that
> > >>> the communication happens through an state change in ZooKeeper. This
> > can
> > >>> potentially simply your architecture.
> > >>>
> > >>> Thanks,
> > >>> Supun.
> > >>>
> > >>>
> > >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> > glahiru@gmail.com>
> > >>> wrote:
> > >>>
> > >>>  Hi Supun,
> > >>>>
> > >>>> So your suggestion is to create a znode for each thrift service we
> > have
> > >>>> and
> > >>>> when the request comes that node gets modified with input data for
> > that
> > >>>> request and thrift service is having a watch for that node and it
> will
> > >>>> be
> > >>>> notified because of the watch and it can read the input from
> zookeeper
> > >>>> and
> > >>>> invoke the operation?
> > >>>>
> > >>>> Lahiru
> > >>>>
> > >>>>
> > >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> > >>>> supun06@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>  Hi all,
> > >>>>>
> > >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata
> there
> > >>>>> are
> > >>>>> many components and these components must be stateless to achieve
> > >>>>> scalability and reliability.Also there must be a mechanism to
> > >>>>>
> > >>>> communicate
> > >>>>
> > >>>>> between the components. At the moment Airavata uses RPC calls based
> > on
> > >>>>> Thrift for the communication.
> > >>>>>
> > >>>>> ZooKeeper can be used both as a place to hold state and as a
> > >>>>>
> > >>>> communication
> > >>>>
> > >>>>> layer between the components. I'm involved with a project that has
> > many
> > >>>>> distributed components like AIravata. Right now we use Thrift
> > services
> > >>>>>
> > >>>> to
> > >>>>
> > >>>>> communicate among the components. But we find it difficult to use
> RPC
> > >>>>>
> > >>>> calls
> > >>>>
> > >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> > >>>>>
> > >>>> services
> > >>>>
> > >>>>> with ZooKeeper based communication layer. So I think it is better
> to
> > >>>>> explore the possibility of removing the Thrift services between the
> > >>>>> components and use ZooKeeper as a communication mechanism between
> the
> > >>>>> services. If you do this you will have to move the state to
> ZooKeeper
> > >>>>>
> > >>>> and
> > >>>>
> > >>>>> will automatically achieve the stateless behaviour in the
> components.
> > >>>>>
> > >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If we
> > are
> > >>>>> trying to integrate something fundamentally important to
> architecture
> > >>>>> as
> > >>>>> how to store state, we shouldn't make it optional.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Supun..
> > >>>>>
> > >>>>>
> > >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > >>>>> shameerainfo@gmail.com> wrote:
> > >>>>>
> > >>>>>  Hi Lahiru,
> > >>>>>>
> > >>>>>> As i understood,  not only reliability , you are trying to achieve
> > >>>>>> some
> > >>>>>> other requirement by introducing zookeeper, like health monitoring
> > of
> > >>>>>>
> > >>>>> the
> > >>>>
> > >>>>> services, categorization with service implementation etc ... . In
> > that
> > >>>>>> case, i think we can get use of zookeeper's features but if we
> only
> > >>>>>>
> > >>>>> focus
> > >>>>
> > >>>>> on reliability, i have little bit of concern, why can't we use
> > >>>>>>
> > >>>>> clustering +
> > >>>>
> > >>>>> LB ?
> > >>>>>>
> > >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need
> to
> > >>>>>> use
> > >>>>>> it.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>>   Shameera.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> > >>>>>> glahiru@gmail.com
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>  Hi Gagan,
> > >>>>>>>
> > >>>>>>> I need to start another discussion about it, but I had an offline
> > >>>>>>> discussion with Suresh about auto-scaling. I will start another
> > >>>>>>> thread
> > >>>>>>> about this topic too.
> > >>>>>>>
> > >>>>>>> Regards
> > >>>>>>> Lahiru
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> > >>>>>>>
> > >>>>>> gagandeepjuneja@gmail.com
> > >>>>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>  Thanks Lahiru for pointing to nice library, added to my
> dictionary
> > >>>>>>>>
> > >>>>>>> :).
> > >>>>
> > >>>>>  I would like to know how are we planning to start multiple
> servers.
> > >>>>>>>> 1. Spawning new servers based on load? Some times we call it as
> > auto
> > >>>>>>>> scalable.
> > >>>>>>>> 2. To make some specific number of nodes available such as we
> > want 2
> > >>>>>>>> servers to be available at any time so if one goes down then I
> > need
> > >>>>>>>>
> > >>>>>>> to
> > >>>>
> > >>>>>  spawn one new to make available servers count 2.
> > >>>>>>>> 3. Initially start all the servers.
> > >>>>>>>>
> > >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't
> believe
> > >>>>>>>>
> > >>>>>>> existing
> > >>>>>>>
> > >>>>>>>> architecture support this?
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Gagan
> > >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <glahiru@gmail.com
> >
> > >>>>>>>>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Gagan,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for your response. Please see my inline comments.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > >>>>>>>>>
> > >>>>>>>> gagandeepjuneja@gmail.com>
> > >>>>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>  Hi Lahiru,
> > >>>>>>>>>> Just my 2 cents.
> > >>>>>>>>>>
> > >>>>>>>>>> I am big fan of zookeeper but also against adding multiple
> hops
> > in
> > >>>>>>>>>>
> > >>>>>>>>> the
> > >>>>>>>
> > >>>>>>>> system which can add unnecessary complexity. Here I am not able
> to
> > >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> > because
> > >>>>>>>>>>
> > >>>>>>>>> of
> > >>>>
> > >>>>> less
> > >>>>>>>
> > >>>>>>>> knowledge of the airavata system in whole. So I would like to
> > >>>>>>>>>>
> > >>>>>>>>> discuss
> > >>>>
> > >>>>>  following point.
> > >>>>>>>>>>
> > >>>>>>>>>> 1. How it will help us in making system more reliable.
> Zookeeper
> > >>>>>>>>>>
> > >>>>>>>>> is
> > >>>>
> > >>>>> not
> > >>>>>>>
> > >>>>>>>> able to restart services. At max it can tell whether service is
> up
> > >>>>>>>>>>
> > >>>>>>>>> or not
> > >>>>>>>
> > >>>>>>>> which could only be the case if airavata service goes down
> > >>>>>>>>>>
> > >>>>>>>>> gracefully and
> > >>>>>>>
> > >>>>>>>> we have any automated way to restart it. If this is just matter
> of
> > >>>>>>>>>>
> > >>>>>>>>> routing
> > >>>>>>>
> > >>>>>>>> client requests to the available thrift servers then this can be
> > >>>>>>>>>>
> > >>>>>>>>> achieved
> > >>>>>>>
> > >>>>>>>> with the help of load balancer which I guess is already there in
> > >>>>>>>>>>
> > >>>>>>>>> thrift
> > >>>>>>>
> > >>>>>>>> wish list.
> > >>>>>>>>>>
> > >>>>>>>>>>  We have multiple thrift services and currently we start only
> > one
> > >>>>>>>>>
> > >>>>>>>> instance
> > >>>>>>>
> > >>>>>>>> of them and each thrift service is a stateless service. To keep
> > the
> > >>>>>>>>>
> > >>>>>>>> high
> > >>>>>>>
> > >>>>>>>> availability we have to start multiple instances of them in
> > >>>>>>>>>
> > >>>>>>>> production
> > >>>>
> > >>>>>  scenario. So for clients to get an available thrift service we can
> > >>>>>>>>>
> > >>>>>>>> use
> > >>>>
> > >>>>>  zookeeper znodes to represent each available service. There are
> > >>>>>>>>>
> > >>>>>>>> some
> > >>>>
> > >>>>>  libraries which is doing similar[1] and I think we can use them
> > >>>>>>>>>
> > >>>>>>>> directly.
> > >>>>>>>
> > >>>>>>>> 2. As far as registering of different providers is concerned do
> > >>>>>>>>>>
> > >>>>>>>>> you
> > >>>>
> > >>>>>  think for that we really need external store.
> > >>>>>>>>>>
> > >>>>>>>>>>  Yes I think so, because its light weight and reliable and we
> > have
> > >>>>>>>>>
> > >>>>>>>> to
> > >>>>
> > >>>>> do
> > >>>>>>>
> > >>>>>>>> very minimal amount of work to achieve all these features to
> > >>>>>>>>>
> > >>>>>>>> Airavata
> > >>>>
> > >>>>>  because zookeeper handle all the complexity.
> > >>>>>>>>>
> > >>>>>>>>>  I have seen people using zookeeper more for state management
> in
> > >>>>>>>>>> distributed environments.
> > >>>>>>>>>>
> > >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> > because
> > >>>>>>>>>
> > >>>>>>>> all
> > >>>>
> > >>>>> of
> > >>>>>>>
> > >>>>>>>> our services are stateless services, but my point is to achieve
> > >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> > >>>>>>>>>
> > >>>>>>>>>    I would like to understand more how can we leverage
> zookeeper
> > in
> > >>>>>>>>>> airavata to make system reliable.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>  Regards,
> > >>>>>>>>>> Gagan
> > >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> > wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> > list
> > >>>>>>>>>>>
> > >>>>>>>>>> for
> > >>>>
> > >>>>>  additional comments.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Marlon
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi All,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to
> use
> > >>>>>>>>>>>>
> > >>>>>>>>>>> it
> > >>>>
> > >>>>> in
> > >>>>>>>
> > >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance and
> > >>>>>>>>>>>>
> > >>>>>>>>>>> reliable
> > >>>>>>>>>>>
> > >>>>>>>>>>>> communication between our thrift services and clients.
> > >>>>>>>>>>>>
> > >>>>>>>>>>> Zookeeper
> > >>>>
> > >>>>> is a
> > >>>>>>>
> > >>>>>>>>  distributed, fault tolerant system to do a reliable
> > >>>>>>>>>>>>
> > >>>>>>>>>>> communication
> > >>>>
> > >>>>>  between
> > >>>>>>>>>>>
> > >>>>>>>>>>>> distributed applications. This is like an in-memory file
> > system
> > >>>>>>>>>>>>
> > >>>>>>>>>>> which
> > >>>>>>>
> > >>>>>>>>  has
> > >>>>>>>>>>>
> > >>>>>>>>>>>> nodes in a tree structure and each node can have small
> amount
> > >>>>>>>>>>>>
> > >>>>>>>>>>> of
> > >>>>
> > >>>>> data
> > >>>>>>>
> > >>>>>>>>  associated with it and these nodes are called znodes. Clients
> > >>>>>>>>>>>>
> > >>>>>>>>>>> can
> > >>>>
> > >>>>>  connect
> > >>>>>>>>>>>
> > >>>>>>>>>>>> to a zookeeper server and add/delete and update these
> znodes.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services and
> > >>>>>>>>>>>>
> > >>>>>>>>>>> these
> > >>>>
> > >>>>> can
> > >>>>>>>
> > >>>>>>>>  go
> > >>>>>>>>>>>
> > >>>>>>>>>>>> down for maintenance or these can crash, if we use zookeeper
> > to
> > >>>>>>>>>>>>
> > >>>>>>>>>>> store
> > >>>>>>>
> > >>>>>>>>  these
> > >>>>>>>>>>>
> > >>>>>>>>>>>> configuration(thrift service configurations) we can achieve
> a
> > >>>>>>>>>>>>
> > >>>>>>>>>>> very
> > >>>>
> > >>>>>  reliable
> > >>>>>>>>>>>
> > >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> > >>>>>>>>>>>>
> > >>>>>>>>>>> available
> > >>>>>>>
> > >>>>>>>>  service
> > >>>>>>>>>>>
> > >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change the
> > >>>>>>>>>>>>
> > >>>>>>>>>>> generated
> > >>>>>>>
> > >>>>>>>>  thrift client code but we have to change the locations we are
> > >>>>>>>>>>>>
> > >>>>>>>>>>> invoking
> > >>>>>>>
> > >>>>>>>>  them). ephemeral znodes will be removed when the thrift service
> > >>>>>>>>>>>>
> > >>>>>>>>>>> goes
> > >>>>>>>
> > >>>>>>>>  down
> > >>>>>>>>>>>
> > >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> > operations.
> > >>>>>>>>>>>>
> > >>>>>>>>>>> With
> > >>>>>>>
> > >>>>>>>>  this
> > >>>>>>>>>>>
> > >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> > airavata,
> > >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> For specifically for gfac we can have different types of
> > >>>>>>>>>>>>
> > >>>>>>>>>>> services
> > >>>>
> > >>>>> for
> > >>>>>>>
> > >>>>>>>>  each
> > >>>>>>>>>>>
> > >>>>>>>>>>>> provider implementation. This can be achieved by using the
> > >>>>>>>>>>>>
> > >>>>>>>>>>> hierarchical
> > >>>>>>>>>>>
> > >>>>>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
> > >>>>>>>>>>>>
> > >>>>>>>>>>> service
> > >>>>>>>
> > >>>>>>>>  to
> > >>>>>>>>>>>
> > >>>>>>>>>>>> register it to a defined path. Using the same logic
> > >>>>>>>>>>>>
> > >>>>>>>>>>> orchestrator
> > >>>>
> > >>>>> can
> > >>>>>>>
> > >>>>>>>>  discover the provider specific gfac thrift service and route
> > >>>>>>>>>>>>
> > >>>>>>>>>>> the
> > >>>>
> > >>>>>  message to
> > >>>>>>>>>>>
> > >>>>>>>>>>>> the correct thrift service.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> With this approach I think we simply have write some client
> > >>>>>>>>>>>>
> > >>>>>>>>>>> code
> > >>>>
> > >>>>> in
> > >>>>>>>
> > >>>>>>>>  thrift
> > >>>>>>>>>>>
> > >>>>>>>>>>>> services and clients and zookeeper server installation can
> be
> > >>>>>>>>>>>>
> > >>>>>>>>>>> done as
> > >>>>>>>
> > >>>>>>>>  a
> > >>>>>>>>>>>
> > >>>>>>>>>>>> separate process and it will be easier to keep the Zookeeper
> > >>>>>>>>>>>>
> > >>>>>>>>>>> server
> > >>>>>>>
> > >>>>>>>>  separate from Airavata because installation of Zookeeper server
> > >>>>>>>>>>>>
> > >>>>>>>>>>> little
> > >>>>>>>
> > >>>>>>>>  complex in production scenario. I think we have to make sure
> > >>>>>>>>>>>>
> > >>>>>>>>>>> everything
> > >>>>>>>>>>>
> > >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> > >>>>>>>>>>>>
> > >>>>>>>>>>> enable.zookeeper=false
> > >>>>>>>>>>>
> > >>>>>>>>>>>> should works fine and users doesn't have to download and
> start
> > >>>>>>>>>>>>
> > >>>>>>>>>>> zookeeper.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks
> > >>>>>>>>>>>> Lahiru
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>> System Analyst Programmer
> > >>>>>>>>> PTI Lab
> > >>>>>>>>> Indiana University
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>> --
> > >>>>>>> System Analyst Programmer
> > >>>>>>> PTI Lab
> > >>>>>>> Indiana University
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Best Regards,
> > >>>>>> Shameera Rathnayaka.
> > >>>>>>
> > >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> > >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Supun Kamburugamuva
> > >>>>> Member, Apache Software Foundation; http://www.apache.org
> > >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >>>>> Blog: http://supunk.blogspot.com
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>> --
> > >>>> System Analyst Programmer
> > >>>> PTI Lab
> > >>>> Indiana University
> > >>>>
> > >>>>
> > >>>
> > >>> --
> > >>> Supun Kamburugamuva
> > >>> Member, Apache Software Foundation; http://www.apache.org
> > >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > >>> Blog: http://supunk.blogspot.com
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi All,

I think the conclusion is like this,

1, We make the gfac as a worker not a thrift service and we can start
multiple workers either with bunch of providers and handlers configured in
each worker or provider specific  workers to handle the class path issues
(not the common scenario).

2. Gfac workers can be configured to watch for a given path in zookeeper,
and multiple workers can listen to the same path. Default path can be
/airavata/gfac or can configure paths like /airavata/gfac/gsissh
/airavata/gfac/bes.

3. Orchestrator can configure with a logic to store experiment IDs in
zookeeper with a path, and orchestrator can be configured to provider
specific path logic too. So when a new request come orchestrator store the
experimentID and these experiments IDs are stored in Zk as a queue.

4. Since gfac workers are watching they will be notified and as supun
suggested can use a leader selection algorithm[1] and one gfac worker  will
take the leadership for each experiment. If there are gfac instances for
each provider same logic will apply among those nodes with same provider
type.

[1]http://curator.apache.org/curator-recipes/leader-election.html

I would like to implement this if there are  no objections.

Lahiru


On Mon, Jun 16, 2014 at 11:51 AM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Marlon,
>
> I think you are exactly correct.
>
> Supun..
>
>
> On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu> wrote:
>
> > Let me restate this, and please tell me if I'm wrong.
> >
> > Orchestrator decides (somehow) that a particular job requires JSDL/BES,
> so
> > it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes node.
> >  GFAC servers associated with this instance notice the update.  The first
> > GFAC to claim the job gets it, uses the Experiment ID to get the detailed
> > information it needs from the Registry.  ZooKeeper handles the locking,
> etc
> > to make sure that only one GFAC at a time is trying to handle an
> experiment.
> >
> > Marlon
> >
> >
> > On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> >
> >> Hi Supun,
> >>
> >> Thanks for the clarification.
> >>
> >> Regards
> >> Lahiru
> >>
> >>
> >> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <
> supun06@gmail.com>
> >> wrote:
> >>
> >>  Hi Lahiru,
> >>>
> >>> My suggestion is that may be you don't need a Thrift service between
> >>> Orchestrator and the component executing the experiment. When a new
> >>> experiment is submitted, orchestrator decides who can execute this job.
> >>> Then it put the information about this experiment execution in
> ZooKeeper.
> >>> The component which wants to executes the experiment is listening to
> this
> >>> ZooKeeper path and when it sees the experiment it will execute it. So
> >>> that
> >>> the communication happens through an state change in ZooKeeper. This
> can
> >>> potentially simply your architecture.
> >>>
> >>> Thanks,
> >>> Supun.
> >>>
> >>>
> >>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <
> glahiru@gmail.com>
> >>> wrote:
> >>>
> >>>  Hi Supun,
> >>>>
> >>>> So your suggestion is to create a znode for each thrift service we
> have
> >>>> and
> >>>> when the request comes that node gets modified with input data for
> that
> >>>> request and thrift service is having a watch for that node and it will
> >>>> be
> >>>> notified because of the watch and it can read the input from zookeeper
> >>>> and
> >>>> invoke the operation?
> >>>>
> >>>> Lahiru
> >>>>
> >>>>
> >>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
> >>>> supun06@gmail.com>
> >>>> wrote:
> >>>>
> >>>>  Hi all,
> >>>>>
> >>>>> Here is what I think about Airavata and ZooKeeper. In Airavata there
> >>>>> are
> >>>>> many components and these components must be stateless to achieve
> >>>>> scalability and reliability.Also there must be a mechanism to
> >>>>>
> >>>> communicate
> >>>>
> >>>>> between the components. At the moment Airavata uses RPC calls based
> on
> >>>>> Thrift for the communication.
> >>>>>
> >>>>> ZooKeeper can be used both as a place to hold state and as a
> >>>>>
> >>>> communication
> >>>>
> >>>>> layer between the components. I'm involved with a project that has
> many
> >>>>> distributed components like AIravata. Right now we use Thrift
> services
> >>>>>
> >>>> to
> >>>>
> >>>>> communicate among the components. But we find it difficult to use RPC
> >>>>>
> >>>> calls
> >>>>
> >>>>> and achieve stateless behaviour and thinking of replacing Thrift
> >>>>>
> >>>> services
> >>>>
> >>>>> with ZooKeeper based communication layer. So I think it is better to
> >>>>> explore the possibility of removing the Thrift services between the
> >>>>> components and use ZooKeeper as a communication mechanism between the
> >>>>> services. If you do this you will have to move the state to ZooKeeper
> >>>>>
> >>>> and
> >>>>
> >>>>> will automatically achieve the stateless behaviour in the components.
> >>>>>
> >>>>> Also I think trying to make ZooKeeper optional is a bad idea. If we
> are
> >>>>> trying to integrate something fundamentally important to architecture
> >>>>> as
> >>>>> how to store state, we shouldn't make it optional.
> >>>>>
> >>>>> Thanks,
> >>>>> Supun..
> >>>>>
> >>>>>
> >>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> >>>>> shameerainfo@gmail.com> wrote:
> >>>>>
> >>>>>  Hi Lahiru,
> >>>>>>
> >>>>>> As i understood,  not only reliability , you are trying to achieve
> >>>>>> some
> >>>>>> other requirement by introducing zookeeper, like health monitoring
> of
> >>>>>>
> >>>>> the
> >>>>
> >>>>> services, categorization with service implementation etc ... . In
> that
> >>>>>> case, i think we can get use of zookeeper's features but if we only
> >>>>>>
> >>>>> focus
> >>>>
> >>>>> on reliability, i have little bit of concern, why can't we use
> >>>>>>
> >>>>> clustering +
> >>>>
> >>>>> LB ?
> >>>>>>
> >>>>>> Yes it is better we add Zookeeper as a prerequisite if user need to
> >>>>>> use
> >>>>>> it.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>   Shameera.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
> >>>>>> glahiru@gmail.com
> >>>>>> wrote:
> >>>>>>
> >>>>>>  Hi Gagan,
> >>>>>>>
> >>>>>>> I need to start another discussion about it, but I had an offline
> >>>>>>> discussion with Suresh about auto-scaling. I will start another
> >>>>>>> thread
> >>>>>>> about this topic too.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Lahiru
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> >>>>>>>
> >>>>>> gagandeepjuneja@gmail.com
> >>>>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>  Thanks Lahiru for pointing to nice library, added to my dictionary
> >>>>>>>>
> >>>>>>> :).
> >>>>
> >>>>>  I would like to know how are we planning to start multiple servers.
> >>>>>>>> 1. Spawning new servers based on load? Some times we call it as
> auto
> >>>>>>>> scalable.
> >>>>>>>> 2. To make some specific number of nodes available such as we
> want 2
> >>>>>>>> servers to be available at any time so if one goes down then I
> need
> >>>>>>>>
> >>>>>>> to
> >>>>
> >>>>>  spawn one new to make available servers count 2.
> >>>>>>>> 3. Initially start all the servers.
> >>>>>>>>
> >>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't believe
> >>>>>>>>
> >>>>>>> existing
> >>>>>>>
> >>>>>>>> architecture support this?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Gagan
> >>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
> >>>>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Gagan,
> >>>>>>>>>
> >>>>>>>>> Thanks for your response. Please see my inline comments.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> >>>>>>>>>
> >>>>>>>> gagandeepjuneja@gmail.com>
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>  Hi Lahiru,
> >>>>>>>>>> Just my 2 cents.
> >>>>>>>>>>
> >>>>>>>>>> I am big fan of zookeeper but also against adding multiple hops
> in
> >>>>>>>>>>
> >>>>>>>>> the
> >>>>>>>
> >>>>>>>> system which can add unnecessary complexity. Here I am not able to
> >>>>>>>>>> understand the requirement of zookeeper may be I am wrong
> because
> >>>>>>>>>>
> >>>>>>>>> of
> >>>>
> >>>>> less
> >>>>>>>
> >>>>>>>> knowledge of the airavata system in whole. So I would like to
> >>>>>>>>>>
> >>>>>>>>> discuss
> >>>>
> >>>>>  following point.
> >>>>>>>>>>
> >>>>>>>>>> 1. How it will help us in making system more reliable. Zookeeper
> >>>>>>>>>>
> >>>>>>>>> is
> >>>>
> >>>>> not
> >>>>>>>
> >>>>>>>> able to restart services. At max it can tell whether service is up
> >>>>>>>>>>
> >>>>>>>>> or not
> >>>>>>>
> >>>>>>>> which could only be the case if airavata service goes down
> >>>>>>>>>>
> >>>>>>>>> gracefully and
> >>>>>>>
> >>>>>>>> we have any automated way to restart it. If this is just matter of
> >>>>>>>>>>
> >>>>>>>>> routing
> >>>>>>>
> >>>>>>>> client requests to the available thrift servers then this can be
> >>>>>>>>>>
> >>>>>>>>> achieved
> >>>>>>>
> >>>>>>>> with the help of load balancer which I guess is already there in
> >>>>>>>>>>
> >>>>>>>>> thrift
> >>>>>>>
> >>>>>>>> wish list.
> >>>>>>>>>>
> >>>>>>>>>>  We have multiple thrift services and currently we start only
> one
> >>>>>>>>>
> >>>>>>>> instance
> >>>>>>>
> >>>>>>>> of them and each thrift service is a stateless service. To keep
> the
> >>>>>>>>>
> >>>>>>>> high
> >>>>>>>
> >>>>>>>> availability we have to start multiple instances of them in
> >>>>>>>>>
> >>>>>>>> production
> >>>>
> >>>>>  scenario. So for clients to get an available thrift service we can
> >>>>>>>>>
> >>>>>>>> use
> >>>>
> >>>>>  zookeeper znodes to represent each available service. There are
> >>>>>>>>>
> >>>>>>>> some
> >>>>
> >>>>>  libraries which is doing similar[1] and I think we can use them
> >>>>>>>>>
> >>>>>>>> directly.
> >>>>>>>
> >>>>>>>> 2. As far as registering of different providers is concerned do
> >>>>>>>>>>
> >>>>>>>>> you
> >>>>
> >>>>>  think for that we really need external store.
> >>>>>>>>>>
> >>>>>>>>>>  Yes I think so, because its light weight and reliable and we
> have
> >>>>>>>>>
> >>>>>>>> to
> >>>>
> >>>>> do
> >>>>>>>
> >>>>>>>> very minimal amount of work to achieve all these features to
> >>>>>>>>>
> >>>>>>>> Airavata
> >>>>
> >>>>>  because zookeeper handle all the complexity.
> >>>>>>>>>
> >>>>>>>>>  I have seen people using zookeeper more for state management in
> >>>>>>>>>> distributed environments.
> >>>>>>>>>>
> >>>>>>>>>>  +1, we might not be the most effective users of zookeeper
> because
> >>>>>>>>>
> >>>>>>>> all
> >>>>
> >>>>> of
> >>>>>>>
> >>>>>>>> our services are stateless services, but my point is to achieve
> >>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
> >>>>>>>>>
> >>>>>>>>>    I would like to understand more how can we leverage zookeeper
> in
> >>>>>>>>>> airavata to make system reliable.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>  Regards,
> >>>>>>>>>> Gagan
> >>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture
> list
> >>>>>>>>>>>
> >>>>>>>>>> for
> >>>>
> >>>>>  additional comments.
> >>>>>>>>>>>
> >>>>>>>>>>> Marlon
> >>>>>>>>>>>
> >>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to use
> >>>>>>>>>>>>
> >>>>>>>>>>> it
> >>>>
> >>>>> in
> >>>>>>>
> >>>>>>>>  airavata. Its really a nice way to achieve fault tolerance and
> >>>>>>>>>>>>
> >>>>>>>>>>> reliable
> >>>>>>>>>>>
> >>>>>>>>>>>> communication between our thrift services and clients.
> >>>>>>>>>>>>
> >>>>>>>>>>> Zookeeper
> >>>>
> >>>>> is a
> >>>>>>>
> >>>>>>>>  distributed, fault tolerant system to do a reliable
> >>>>>>>>>>>>
> >>>>>>>>>>> communication
> >>>>
> >>>>>  between
> >>>>>>>>>>>
> >>>>>>>>>>>> distributed applications. This is like an in-memory file
> system
> >>>>>>>>>>>>
> >>>>>>>>>>> which
> >>>>>>>
> >>>>>>>>  has
> >>>>>>>>>>>
> >>>>>>>>>>>> nodes in a tree structure and each node can have small amount
> >>>>>>>>>>>>
> >>>>>>>>>>> of
> >>>>
> >>>>> data
> >>>>>>>
> >>>>>>>>  associated with it and these nodes are called znodes. Clients
> >>>>>>>>>>>>
> >>>>>>>>>>> can
> >>>>
> >>>>>  connect
> >>>>>>>>>>>
> >>>>>>>>>>>> to a zookeeper server and add/delete and update these znodes.
> >>>>>>>>>>>>
> >>>>>>>>>>>>    In Apache Airavata we start multiple thrift services and
> >>>>>>>>>>>>
> >>>>>>>>>>> these
> >>>>
> >>>>> can
> >>>>>>>
> >>>>>>>>  go
> >>>>>>>>>>>
> >>>>>>>>>>>> down for maintenance or these can crash, if we use zookeeper
> to
> >>>>>>>>>>>>
> >>>>>>>>>>> store
> >>>>>>>
> >>>>>>>>  these
> >>>>>>>>>>>
> >>>>>>>>>>>> configuration(thrift service configurations) we can achieve a
> >>>>>>>>>>>>
> >>>>>>>>>>> very
> >>>>
> >>>>>  reliable
> >>>>>>>>>>>
> >>>>>>>>>>>> system. Basically thrift clients can dynamically discover
> >>>>>>>>>>>>
> >>>>>>>>>>> available
> >>>>>>>
> >>>>>>>>  service
> >>>>>>>>>>>
> >>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change the
> >>>>>>>>>>>>
> >>>>>>>>>>> generated
> >>>>>>>
> >>>>>>>>  thrift client code but we have to change the locations we are
> >>>>>>>>>>>>
> >>>>>>>>>>> invoking
> >>>>>>>
> >>>>>>>>  them). ephemeral znodes will be removed when the thrift service
> >>>>>>>>>>>>
> >>>>>>>>>>> goes
> >>>>>>>
> >>>>>>>>  down
> >>>>>>>>>>>
> >>>>>>>>>>>> and zookeeper guarantee the atomicity between these
> operations.
> >>>>>>>>>>>>
> >>>>>>>>>>> With
> >>>>>>>
> >>>>>>>>  this
> >>>>>>>>>>>
> >>>>>>>>>>>> approach we can have a node hierarchy for multiple of
> airavata,
> >>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For specifically for gfac we can have different types of
> >>>>>>>>>>>>
> >>>>>>>>>>> services
> >>>>
> >>>>> for
> >>>>>>>
> >>>>>>>>  each
> >>>>>>>>>>>
> >>>>>>>>>>>> provider implementation. This can be achieved by using the
> >>>>>>>>>>>>
> >>>>>>>>>>> hierarchical
> >>>>>>>>>>>
> >>>>>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
> >>>>>>>>>>>>
> >>>>>>>>>>> service
> >>>>>>>
> >>>>>>>>  to
> >>>>>>>>>>>
> >>>>>>>>>>>> register it to a defined path. Using the same logic
> >>>>>>>>>>>>
> >>>>>>>>>>> orchestrator
> >>>>
> >>>>> can
> >>>>>>>
> >>>>>>>>  discover the provider specific gfac thrift service and route
> >>>>>>>>>>>>
> >>>>>>>>>>> the
> >>>>
> >>>>>  message to
> >>>>>>>>>>>
> >>>>>>>>>>>> the correct thrift service.
> >>>>>>>>>>>>
> >>>>>>>>>>>> With this approach I think we simply have write some client
> >>>>>>>>>>>>
> >>>>>>>>>>> code
> >>>>
> >>>>> in
> >>>>>>>
> >>>>>>>>  thrift
> >>>>>>>>>>>
> >>>>>>>>>>>> services and clients and zookeeper server installation can be
> >>>>>>>>>>>>
> >>>>>>>>>>> done as
> >>>>>>>
> >>>>>>>>  a
> >>>>>>>>>>>
> >>>>>>>>>>>> separate process and it will be easier to keep the Zookeeper
> >>>>>>>>>>>>
> >>>>>>>>>>> server
> >>>>>>>
> >>>>>>>>  separate from Airavata because installation of Zookeeper server
> >>>>>>>>>>>>
> >>>>>>>>>>> little
> >>>>>>>
> >>>>>>>>  complex in production scenario. I think we have to make sure
> >>>>>>>>>>>>
> >>>>>>>>>>> everything
> >>>>>>>>>>>
> >>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
> >>>>>>>>>>>>
> >>>>>>>>>>> enable.zookeeper=false
> >>>>>>>>>>>
> >>>>>>>>>>>> should works fine and users doesn't have to download and start
> >>>>>>>>>>>>
> >>>>>>>>>>> zookeeper.
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]http://zookeeper.apache.org/
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> Lahiru
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> System Analyst Programmer
> >>>>>>>>> PTI Lab
> >>>>>>>>> Indiana University
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>> --
> >>>>>>> System Analyst Programmer
> >>>>>>> PTI Lab
> >>>>>>> Indiana University
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best Regards,
> >>>>>> Shameera Rathnayaka.
> >>>>>>
> >>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
> >>>>>> Blog : http://shameerarathnayaka.blogspot.com/
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Supun Kamburugamuva
> >>>>> Member, Apache Software Foundation; http://www.apache.org
> >>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >>>>> Blog: http://supunk.blogspot.com
> >>>>>
> >>>>>
> >>>>>
> >>>> --
> >>>> System Analyst Programmer
> >>>> PTI Lab
> >>>> Indiana University
> >>>>
> >>>>
> >>>
> >>> --
> >>> Supun Kamburugamuva
> >>> Member, Apache Software Foundation; http://www.apache.org
> >>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> >>> Blog: http://supunk.blogspot.com
> >>>
> >>>
> >>>
> >>
> >
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Marlon,

I think you are exactly correct.

Supun..


On Mon, Jun 16, 2014 at 11:48 AM, Marlon Pierce <ma...@iu.edu> wrote:

> Let me restate this, and please tell me if I'm wrong.
>
> Orchestrator decides (somehow) that a particular job requires JSDL/BES, so
> it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes node.
>  GFAC servers associated with this instance notice the update.  The first
> GFAC to claim the job gets it, uses the Experiment ID to get the detailed
> information it needs from the Registry.  ZooKeeper handles the locking, etc
> to make sure that only one GFAC at a time is trying to handle an experiment.
>
> Marlon
>
>
> On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
>
>> Hi Supun,
>>
>> Thanks for the clarification.
>>
>> Regards
>> Lahiru
>>
>>
>> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <su...@gmail.com>
>> wrote:
>>
>>  Hi Lahiru,
>>>
>>> My suggestion is that may be you don't need a Thrift service between
>>> Orchestrator and the component executing the experiment. When a new
>>> experiment is submitted, orchestrator decides who can execute this job.
>>> Then it put the information about this experiment execution in ZooKeeper.
>>> The component which wants to executes the experiment is listening to this
>>> ZooKeeper path and when it sees the experiment it will execute it. So
>>> that
>>> the communication happens through an state change in ZooKeeper. This can
>>> potentially simply your architecture.
>>>
>>> Thanks,
>>> Supun.
>>>
>>>
>>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
>>> wrote:
>>>
>>>  Hi Supun,
>>>>
>>>> So your suggestion is to create a znode for each thrift service we have
>>>> and
>>>> when the request comes that node gets modified with input data for that
>>>> request and thrift service is having a watch for that node and it will
>>>> be
>>>> notified because of the watch and it can read the input from zookeeper
>>>> and
>>>> invoke the operation?
>>>>
>>>> Lahiru
>>>>
>>>>
>>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <
>>>> supun06@gmail.com>
>>>> wrote:
>>>>
>>>>  Hi all,
>>>>>
>>>>> Here is what I think about Airavata and ZooKeeper. In Airavata there
>>>>> are
>>>>> many components and these components must be stateless to achieve
>>>>> scalability and reliability.Also there must be a mechanism to
>>>>>
>>>> communicate
>>>>
>>>>> between the components. At the moment Airavata uses RPC calls based on
>>>>> Thrift for the communication.
>>>>>
>>>>> ZooKeeper can be used both as a place to hold state and as a
>>>>>
>>>> communication
>>>>
>>>>> layer between the components. I'm involved with a project that has many
>>>>> distributed components like AIravata. Right now we use Thrift services
>>>>>
>>>> to
>>>>
>>>>> communicate among the components. But we find it difficult to use RPC
>>>>>
>>>> calls
>>>>
>>>>> and achieve stateless behaviour and thinking of replacing Thrift
>>>>>
>>>> services
>>>>
>>>>> with ZooKeeper based communication layer. So I think it is better to
>>>>> explore the possibility of removing the Thrift services between the
>>>>> components and use ZooKeeper as a communication mechanism between the
>>>>> services. If you do this you will have to move the state to ZooKeeper
>>>>>
>>>> and
>>>>
>>>>> will automatically achieve the stateless behaviour in the components.
>>>>>
>>>>> Also I think trying to make ZooKeeper optional is a bad idea. If we are
>>>>> trying to integrate something fundamentally important to architecture
>>>>> as
>>>>> how to store state, we shouldn't make it optional.
>>>>>
>>>>> Thanks,
>>>>> Supun..
>>>>>
>>>>>
>>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>>>>> shameerainfo@gmail.com> wrote:
>>>>>
>>>>>  Hi Lahiru,
>>>>>>
>>>>>> As i understood,  not only reliability , you are trying to achieve
>>>>>> some
>>>>>> other requirement by introducing zookeeper, like health monitoring of
>>>>>>
>>>>> the
>>>>
>>>>> services, categorization with service implementation etc ... . In that
>>>>>> case, i think we can get use of zookeeper's features but if we only
>>>>>>
>>>>> focus
>>>>
>>>>> on reliability, i have little bit of concern, why can't we use
>>>>>>
>>>>> clustering +
>>>>
>>>>> LB ?
>>>>>>
>>>>>> Yes it is better we add Zookeeper as a prerequisite if user need to
>>>>>> use
>>>>>> it.
>>>>>>
>>>>>> Thanks,
>>>>>>   Shameera.
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <
>>>>>> glahiru@gmail.com
>>>>>> wrote:
>>>>>>
>>>>>>  Hi Gagan,
>>>>>>>
>>>>>>> I need to start another discussion about it, but I had an offline
>>>>>>> discussion with Suresh about auto-scaling. I will start another
>>>>>>> thread
>>>>>>> about this topic too.
>>>>>>>
>>>>>>> Regards
>>>>>>> Lahiru
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>>>>>>>
>>>>>> gagandeepjuneja@gmail.com
>>>>
>>>>> wrote:
>>>>>>>
>>>>>>>  Thanks Lahiru for pointing to nice library, added to my dictionary
>>>>>>>>
>>>>>>> :).
>>>>
>>>>>  I would like to know how are we planning to start multiple servers.
>>>>>>>> 1. Spawning new servers based on load? Some times we call it as auto
>>>>>>>> scalable.
>>>>>>>> 2. To make some specific number of nodes available such as we want 2
>>>>>>>> servers to be available at any time so if one goes down then I need
>>>>>>>>
>>>>>>> to
>>>>
>>>>>  spawn one new to make available servers count 2.
>>>>>>>> 3. Initially start all the servers.
>>>>>>>>
>>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't believe
>>>>>>>>
>>>>>>> existing
>>>>>>>
>>>>>>>> architecture support this?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Gagan
>>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>>>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Gagan,
>>>>>>>>>
>>>>>>>>> Thanks for your response. Please see my inline comments.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>>>>>>
>>>>>>>> gagandeepjuneja@gmail.com>
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>  Hi Lahiru,
>>>>>>>>>> Just my 2 cents.
>>>>>>>>>>
>>>>>>>>>> I am big fan of zookeeper but also against adding multiple hops in
>>>>>>>>>>
>>>>>>>>> the
>>>>>>>
>>>>>>>> system which can add unnecessary complexity. Here I am not able to
>>>>>>>>>> understand the requirement of zookeeper may be I am wrong because
>>>>>>>>>>
>>>>>>>>> of
>>>>
>>>>> less
>>>>>>>
>>>>>>>> knowledge of the airavata system in whole. So I would like to
>>>>>>>>>>
>>>>>>>>> discuss
>>>>
>>>>>  following point.
>>>>>>>>>>
>>>>>>>>>> 1. How it will help us in making system more reliable. Zookeeper
>>>>>>>>>>
>>>>>>>>> is
>>>>
>>>>> not
>>>>>>>
>>>>>>>> able to restart services. At max it can tell whether service is up
>>>>>>>>>>
>>>>>>>>> or not
>>>>>>>
>>>>>>>> which could only be the case if airavata service goes down
>>>>>>>>>>
>>>>>>>>> gracefully and
>>>>>>>
>>>>>>>> we have any automated way to restart it. If this is just matter of
>>>>>>>>>>
>>>>>>>>> routing
>>>>>>>
>>>>>>>> client requests to the available thrift servers then this can be
>>>>>>>>>>
>>>>>>>>> achieved
>>>>>>>
>>>>>>>> with the help of load balancer which I guess is already there in
>>>>>>>>>>
>>>>>>>>> thrift
>>>>>>>
>>>>>>>> wish list.
>>>>>>>>>>
>>>>>>>>>>  We have multiple thrift services and currently we start only one
>>>>>>>>>
>>>>>>>> instance
>>>>>>>
>>>>>>>> of them and each thrift service is a stateless service. To keep the
>>>>>>>>>
>>>>>>>> high
>>>>>>>
>>>>>>>> availability we have to start multiple instances of them in
>>>>>>>>>
>>>>>>>> production
>>>>
>>>>>  scenario. So for clients to get an available thrift service we can
>>>>>>>>>
>>>>>>>> use
>>>>
>>>>>  zookeeper znodes to represent each available service. There are
>>>>>>>>>
>>>>>>>> some
>>>>
>>>>>  libraries which is doing similar[1] and I think we can use them
>>>>>>>>>
>>>>>>>> directly.
>>>>>>>
>>>>>>>> 2. As far as registering of different providers is concerned do
>>>>>>>>>>
>>>>>>>>> you
>>>>
>>>>>  think for that we really need external store.
>>>>>>>>>>
>>>>>>>>>>  Yes I think so, because its light weight and reliable and we have
>>>>>>>>>
>>>>>>>> to
>>>>
>>>>> do
>>>>>>>
>>>>>>>> very minimal amount of work to achieve all these features to
>>>>>>>>>
>>>>>>>> Airavata
>>>>
>>>>>  because zookeeper handle all the complexity.
>>>>>>>>>
>>>>>>>>>  I have seen people using zookeeper more for state management in
>>>>>>>>>> distributed environments.
>>>>>>>>>>
>>>>>>>>>>  +1, we might not be the most effective users of zookeeper because
>>>>>>>>>
>>>>>>>> all
>>>>
>>>>> of
>>>>>>>
>>>>>>>> our services are stateless services, but my point is to achieve
>>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
>>>>>>>>>
>>>>>>>>>    I would like to understand more how can we leverage zookeeper in
>>>>>>>>>> airavata to make system reliable.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  [1]https://github.com/eirslett/thrift-zookeeper
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  Regards,
>>>>>>>>>> Gagan
>>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>  Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
>>>>>>>>>>>
>>>>>>>>>> for
>>>>
>>>>>  additional comments.
>>>>>>>>>>>
>>>>>>>>>>> Marlon
>>>>>>>>>>>
>>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to use
>>>>>>>>>>>>
>>>>>>>>>>> it
>>>>
>>>>> in
>>>>>>>
>>>>>>>>  airavata. Its really a nice way to achieve fault tolerance and
>>>>>>>>>>>>
>>>>>>>>>>> reliable
>>>>>>>>>>>
>>>>>>>>>>>> communication between our thrift services and clients.
>>>>>>>>>>>>
>>>>>>>>>>> Zookeeper
>>>>
>>>>> is a
>>>>>>>
>>>>>>>>  distributed, fault tolerant system to do a reliable
>>>>>>>>>>>>
>>>>>>>>>>> communication
>>>>
>>>>>  between
>>>>>>>>>>>
>>>>>>>>>>>> distributed applications. This is like an in-memory file system
>>>>>>>>>>>>
>>>>>>>>>>> which
>>>>>>>
>>>>>>>>  has
>>>>>>>>>>>
>>>>>>>>>>>> nodes in a tree structure and each node can have small amount
>>>>>>>>>>>>
>>>>>>>>>>> of
>>>>
>>>>> data
>>>>>>>
>>>>>>>>  associated with it and these nodes are called znodes. Clients
>>>>>>>>>>>>
>>>>>>>>>>> can
>>>>
>>>>>  connect
>>>>>>>>>>>
>>>>>>>>>>>> to a zookeeper server and add/delete and update these znodes.
>>>>>>>>>>>>
>>>>>>>>>>>>    In Apache Airavata we start multiple thrift services and
>>>>>>>>>>>>
>>>>>>>>>>> these
>>>>
>>>>> can
>>>>>>>
>>>>>>>>  go
>>>>>>>>>>>
>>>>>>>>>>>> down for maintenance or these can crash, if we use zookeeper to
>>>>>>>>>>>>
>>>>>>>>>>> store
>>>>>>>
>>>>>>>>  these
>>>>>>>>>>>
>>>>>>>>>>>> configuration(thrift service configurations) we can achieve a
>>>>>>>>>>>>
>>>>>>>>>>> very
>>>>
>>>>>  reliable
>>>>>>>>>>>
>>>>>>>>>>>> system. Basically thrift clients can dynamically discover
>>>>>>>>>>>>
>>>>>>>>>>> available
>>>>>>>
>>>>>>>>  service
>>>>>>>>>>>
>>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change the
>>>>>>>>>>>>
>>>>>>>>>>> generated
>>>>>>>
>>>>>>>>  thrift client code but we have to change the locations we are
>>>>>>>>>>>>
>>>>>>>>>>> invoking
>>>>>>>
>>>>>>>>  them). ephemeral znodes will be removed when the thrift service
>>>>>>>>>>>>
>>>>>>>>>>> goes
>>>>>>>
>>>>>>>>  down
>>>>>>>>>>>
>>>>>>>>>>>> and zookeeper guarantee the atomicity between these operations.
>>>>>>>>>>>>
>>>>>>>>>>> With
>>>>>>>
>>>>>>>>  this
>>>>>>>>>>>
>>>>>>>>>>>> approach we can have a node hierarchy for multiple of airavata,
>>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
>>>>>>>>>>>>
>>>>>>>>>>>> For specifically for gfac we can have different types of
>>>>>>>>>>>>
>>>>>>>>>>> services
>>>>
>>>>> for
>>>>>>>
>>>>>>>>  each
>>>>>>>>>>>
>>>>>>>>>>>> provider implementation. This can be achieved by using the
>>>>>>>>>>>>
>>>>>>>>>>> hierarchical
>>>>>>>>>>>
>>>>>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
>>>>>>>>>>>>
>>>>>>>>>>> service
>>>>>>>
>>>>>>>>  to
>>>>>>>>>>>
>>>>>>>>>>>> register it to a defined path. Using the same logic
>>>>>>>>>>>>
>>>>>>>>>>> orchestrator
>>>>
>>>>> can
>>>>>>>
>>>>>>>>  discover the provider specific gfac thrift service and route
>>>>>>>>>>>>
>>>>>>>>>>> the
>>>>
>>>>>  message to
>>>>>>>>>>>
>>>>>>>>>>>> the correct thrift service.
>>>>>>>>>>>>
>>>>>>>>>>>> With this approach I think we simply have write some client
>>>>>>>>>>>>
>>>>>>>>>>> code
>>>>
>>>>> in
>>>>>>>
>>>>>>>>  thrift
>>>>>>>>>>>
>>>>>>>>>>>> services and clients and zookeeper server installation can be
>>>>>>>>>>>>
>>>>>>>>>>> done as
>>>>>>>
>>>>>>>>  a
>>>>>>>>>>>
>>>>>>>>>>>> separate process and it will be easier to keep the Zookeeper
>>>>>>>>>>>>
>>>>>>>>>>> server
>>>>>>>
>>>>>>>>  separate from Airavata because installation of Zookeeper server
>>>>>>>>>>>>
>>>>>>>>>>> little
>>>>>>>
>>>>>>>>  complex in production scenario. I think we have to make sure
>>>>>>>>>>>>
>>>>>>>>>>> everything
>>>>>>>>>>>
>>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
>>>>>>>>>>>>
>>>>>>>>>>> enable.zookeeper=false
>>>>>>>>>>>
>>>>>>>>>>>> should works fine and users doesn't have to download and start
>>>>>>>>>>>>
>>>>>>>>>>> zookeeper.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Lahiru
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> System Analyst Programmer
>>>>>>>>> PTI Lab
>>>>>>>>> Indiana University
>>>>>>>>>
>>>>>>>>>
>>>>>>> --
>>>>>>> System Analyst Programmer
>>>>>>> PTI Lab
>>>>>>> Indiana University
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Shameera Rathnayaka.
>>>>>>
>>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Supun Kamburugamuva
>>>>> Member, Apache Software Foundation; http://www.apache.org
>>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>>> Blog: http://supunk.blogspot.com
>>>>>
>>>>>
>>>>>
>>>> --
>>>> System Analyst Programmer
>>>> PTI Lab
>>>> Indiana University
>>>>
>>>>
>>>
>>> --
>>> Supun Kamburugamuva
>>> Member, Apache Software Foundation; http://www.apache.org
>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>> Blog: http://supunk.blogspot.com
>>>
>>>
>>>
>>
>


-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Marlon Pierce <ma...@iu.edu>.

Let me restate this, and please tell me if I'm wrong.

Orchestrator decides (somehow) that a particular job requires JSDL/BES, 
so it places the Experiment ID in Zookeeper's /airavata/gfac/jsdl-bes 
node.  GFAC servers associated with this instance notice the update.  
The first GFAC to claim the job gets it, uses the Experiment ID to get 
the detailed information it needs from the Registry.  ZooKeeper handles 
the locking, etc to make sure that only one GFAC at a time is trying to 
handle an experiment.

Marlon

On 6/16/14, 11:42 AM, Lahiru Gunathilake wrote:
> Hi Supun,
>
> Thanks for the clarification.
>
> Regards
> Lahiru
>
>
> On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
>> Hi Lahiru,
>>
>> My suggestion is that may be you don't need a Thrift service between
>> Orchestrator and the component executing the experiment. When a new
>> experiment is submitted, orchestrator decides who can execute this job.
>> Then it put the information about this experiment execution in ZooKeeper.
>> The component which wants to executes the experiment is listening to this
>> ZooKeeper path and when it sees the experiment it will execute it. So that
>> the communication happens through an state change in ZooKeeper. This can
>> potentially simply your architecture.
>>
>> Thanks,
>> Supun.
>>
>>
>> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>>
>>> Hi Supun,
>>>
>>> So your suggestion is to create a znode for each thrift service we have
>>> and
>>> when the request comes that node gets modified with input data for that
>>> request and thrift service is having a watch for that node and it will be
>>> notified because of the watch and it can read the input from zookeeper and
>>> invoke the operation?
>>>
>>> Lahiru
>>>
>>>
>>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Here is what I think about Airavata and ZooKeeper. In Airavata there are
>>>> many components and these components must be stateless to achieve
>>>> scalability and reliability.Also there must be a mechanism to
>>> communicate
>>>> between the components. At the moment Airavata uses RPC calls based on
>>>> Thrift for the communication.
>>>>
>>>> ZooKeeper can be used both as a place to hold state and as a
>>> communication
>>>> layer between the components. I'm involved with a project that has many
>>>> distributed components like AIravata. Right now we use Thrift services
>>> to
>>>> communicate among the components. But we find it difficult to use RPC
>>> calls
>>>> and achieve stateless behaviour and thinking of replacing Thrift
>>> services
>>>> with ZooKeeper based communication layer. So I think it is better to
>>>> explore the possibility of removing the Thrift services between the
>>>> components and use ZooKeeper as a communication mechanism between the
>>>> services. If you do this you will have to move the state to ZooKeeper
>>> and
>>>> will automatically achieve the stateless behaviour in the components.
>>>>
>>>> Also I think trying to make ZooKeeper optional is a bad idea. If we are
>>>> trying to integrate something fundamentally important to architecture as
>>>> how to store state, we shouldn't make it optional.
>>>>
>>>> Thanks,
>>>> Supun..
>>>>
>>>>
>>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>>>> shameerainfo@gmail.com> wrote:
>>>>
>>>>> Hi Lahiru,
>>>>>
>>>>> As i understood,  not only reliability , you are trying to achieve some
>>>>> other requirement by introducing zookeeper, like health monitoring of
>>> the
>>>>> services, categorization with service implementation etc ... . In that
>>>>> case, i think we can get use of zookeeper's features but if we only
>>> focus
>>>>> on reliability, i have little bit of concern, why can't we use
>>> clustering +
>>>>> LB ?
>>>>>
>>>>> Yes it is better we add Zookeeper as a prerequisite if user need to use
>>>>> it.
>>>>>
>>>>> Thanks,
>>>>>   Shameera.
>>>>>
>>>>>
>>>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <glahiru@gmail.com
>>>>> wrote:
>>>>>
>>>>>> Hi Gagan,
>>>>>>
>>>>>> I need to start another discussion about it, but I had an offline
>>>>>> discussion with Suresh about auto-scaling. I will start another thread
>>>>>> about this topic too.
>>>>>>
>>>>>> Regards
>>>>>> Lahiru
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>>> gagandeepjuneja@gmail.com
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Lahiru for pointing to nice library, added to my dictionary
>>> :).
>>>>>>> I would like to know how are we planning to start multiple servers.
>>>>>>> 1. Spawning new servers based on load? Some times we call it as auto
>>>>>>> scalable.
>>>>>>> 2. To make some specific number of nodes available such as we want 2
>>>>>>> servers to be available at any time so if one goes down then I need
>>> to
>>>>>>> spawn one new to make available servers count 2.
>>>>>>> 3. Initially start all the servers.
>>>>>>>
>>>>>>> In scenario 1 and 2 zookeeper does make sense but I don't believe
>>>>>> existing
>>>>>>> architecture support this?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Gagan
>>>>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>>>>>> wrote:
>>>>>>>> Hi Gagan,
>>>>>>>>
>>>>>>>> Thanks for your response. Please see my inline comments.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>>> gagandeepjuneja@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Lahiru,
>>>>>>>>> Just my 2 cents.
>>>>>>>>>
>>>>>>>>> I am big fan of zookeeper but also against adding multiple hops in
>>>>>> the
>>>>>>>>> system which can add unnecessary complexity. Here I am not able to
>>>>>>>>> understand the requirement of zookeeper may be I am wrong because
>>> of
>>>>>> less
>>>>>>>>> knowledge of the airavata system in whole. So I would like to
>>> discuss
>>>>>>>>> following point.
>>>>>>>>>
>>>>>>>>> 1. How it will help us in making system more reliable. Zookeeper
>>> is
>>>>>> not
>>>>>>>>> able to restart services. At max it can tell whether service is up
>>>>>> or not
>>>>>>>>> which could only be the case if airavata service goes down
>>>>>> gracefully and
>>>>>>>>> we have any automated way to restart it. If this is just matter of
>>>>>> routing
>>>>>>>>> client requests to the available thrift servers then this can be
>>>>>> achieved
>>>>>>>>> with the help of load balancer which I guess is already there in
>>>>>> thrift
>>>>>>>>> wish list.
>>>>>>>>>
>>>>>>>> We have multiple thrift services and currently we start only one
>>>>>> instance
>>>>>>>> of them and each thrift service is a stateless service. To keep the
>>>>>> high
>>>>>>>> availability we have to start multiple instances of them in
>>> production
>>>>>>>> scenario. So for clients to get an available thrift service we can
>>> use
>>>>>>>> zookeeper znodes to represent each available service. There are
>>> some
>>>>>>>> libraries which is doing similar[1] and I think we can use them
>>>>>> directly.
>>>>>>>>> 2. As far as registering of different providers is concerned do
>>> you
>>>>>>>>> think for that we really need external store.
>>>>>>>>>
>>>>>>>> Yes I think so, because its light weight and reliable and we have
>>> to
>>>>>> do
>>>>>>>> very minimal amount of work to achieve all these features to
>>> Airavata
>>>>>>>> because zookeeper handle all the complexity.
>>>>>>>>
>>>>>>>>> I have seen people using zookeeper more for state management in
>>>>>>>>> distributed environments.
>>>>>>>>>
>>>>>>>> +1, we might not be the most effective users of zookeeper because
>>> all
>>>>>> of
>>>>>>>> our services are stateless services, but my point is to achieve
>>>>>>>> fault-tolerance we can use zookeeper and with minimal work.
>>>>>>>>
>>>>>>>>>   I would like to understand more how can we leverage zookeeper in
>>>>>>>>> airavata to make system reliable.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> [1]https://github.com/eirslett/thrift-zookeeper
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Gagan
>>>>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
>>> for
>>>>>>>>>> additional comments.
>>>>>>>>>>
>>>>>>>>>> Marlon
>>>>>>>>>>
>>>>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I did little research about Apache Zookeeper[1] and how to use
>>> it
>>>>>> in
>>>>>>>>>>> airavata. Its really a nice way to achieve fault tolerance and
>>>>>>>>>> reliable
>>>>>>>>>>> communication between our thrift services and clients.
>>> Zookeeper
>>>>>> is a
>>>>>>>>>>> distributed, fault tolerant system to do a reliable
>>> communication
>>>>>>>>>> between
>>>>>>>>>>> distributed applications. This is like an in-memory file system
>>>>>> which
>>>>>>>>>> has
>>>>>>>>>>> nodes in a tree structure and each node can have small amount
>>> of
>>>>>> data
>>>>>>>>>>> associated with it and these nodes are called znodes. Clients
>>> can
>>>>>>>>>> connect
>>>>>>>>>>> to a zookeeper server and add/delete and update these znodes.
>>>>>>>>>>>
>>>>>>>>>>>    In Apache Airavata we start multiple thrift services and
>>> these
>>>>>> can
>>>>>>>>>> go
>>>>>>>>>>> down for maintenance or these can crash, if we use zookeeper to
>>>>>> store
>>>>>>>>>> these
>>>>>>>>>>> configuration(thrift service configurations) we can achieve a
>>> very
>>>>>>>>>> reliable
>>>>>>>>>>> system. Basically thrift clients can dynamically discover
>>>>>> available
>>>>>>>>>> service
>>>>>>>>>>> by using ephemeral znodes(Here we do not have to change the
>>>>>> generated
>>>>>>>>>>> thrift client code but we have to change the locations we are
>>>>>> invoking
>>>>>>>>>>> them). ephemeral znodes will be removed when the thrift service
>>>>>> goes
>>>>>>>>>> down
>>>>>>>>>>> and zookeeper guarantee the atomicity between these operations.
>>>>>> With
>>>>>>>>>> this
>>>>>>>>>>> approach we can have a node hierarchy for multiple of airavata,
>>>>>>>>>>> orchestrator,appcatalog and gfac thrift services.
>>>>>>>>>>>
>>>>>>>>>>> For specifically for gfac we can have different types of
>>> services
>>>>>> for
>>>>>>>>>> each
>>>>>>>>>>> provider implementation. This can be achieved by using the
>>>>>>>>>> hierarchical
>>>>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
>>>>>> service
>>>>>>>>>> to
>>>>>>>>>>> register it to a defined path. Using the same logic
>>> orchestrator
>>>>>> can
>>>>>>>>>>> discover the provider specific gfac thrift service and route
>>> the
>>>>>>>>>> message to
>>>>>>>>>>> the correct thrift service.
>>>>>>>>>>>
>>>>>>>>>>> With this approach I think we simply have write some client
>>> code
>>>>>> in
>>>>>>>>>> thrift
>>>>>>>>>>> services and clients and zookeeper server installation can be
>>>>>> done as
>>>>>>>>>> a
>>>>>>>>>>> separate process and it will be easier to keep the Zookeeper
>>>>>> server
>>>>>>>>>>> separate from Airavata because installation of Zookeeper server
>>>>>> little
>>>>>>>>>>> complex in production scenario. I think we have to make sure
>>>>>>>>>> everything
>>>>>>>>>>> works fine when there is no Zookeeper running, ex:
>>>>>>>>>> enable.zookeeper=false
>>>>>>>>>>> should works fine and users doesn't have to download and start
>>>>>>>>>> zookeeper.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Lahiru
>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> System Analyst Programmer
>>>>>>>> PTI Lab
>>>>>>>> Indiana University
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> System Analyst Programmer
>>>>>> PTI Lab
>>>>>> Indiana University
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Shameera Rathnayaka.
>>>>>
>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>
>>>>
>>>>
>>>> --
>>>> Supun Kamburugamuva
>>>> Member, Apache Software Foundation; http://www.apache.org
>>>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>>>> Blog: http://supunk.blogspot.com
>>>>
>>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>
>>
>> --
>> Supun Kamburugamuva
>> Member, Apache Software Foundation; http://www.apache.org
>> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> Blog: http://supunk.blogspot.com
>>
>>
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Supun,

Thanks for the clarification.

Regards
Lahiru


On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Lahiru,
>
> My suggestion is that may be you don't need a Thrift service between
> Orchestrator and the component executing the experiment. When a new
> experiment is submitted, orchestrator decides who can execute this job.
> Then it put the information about this experiment execution in ZooKeeper.
> The component which wants to executes the experiment is listening to this
> ZooKeeper path and when it sees the experiment it will execute it. So that
> the communication happens through an state change in ZooKeeper. This can
> potentially simply your architecture.
>
> Thanks,
> Supun.
>
>
> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
>> Hi Supun,
>>
>> So your suggestion is to create a znode for each thrift service we have
>> and
>> when the request comes that node gets modified with input data for that
>> request and thrift service is having a watch for that node and it will be
>> notified because of the watch and it can read the input from zookeeper and
>> invoke the operation?
>>
>> Lahiru
>>
>>
>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
>> > many components and these components must be stateless to achieve
>> > scalability and reliability.Also there must be a mechanism to
>> communicate
>> > between the components. At the moment Airavata uses RPC calls based on
>> > Thrift for the communication.
>> >
>> > ZooKeeper can be used both as a place to hold state and as a
>> communication
>> > layer between the components. I'm involved with a project that has many
>> > distributed components like AIravata. Right now we use Thrift services
>> to
>> > communicate among the components. But we find it difficult to use RPC
>> calls
>> > and achieve stateless behaviour and thinking of replacing Thrift
>> services
>> > with ZooKeeper based communication layer. So I think it is better to
>> > explore the possibility of removing the Thrift services between the
>> > components and use ZooKeeper as a communication mechanism between the
>> > services. If you do this you will have to move the state to ZooKeeper
>> and
>> > will automatically achieve the stateless behaviour in the components.
>> >
>> > Also I think trying to make ZooKeeper optional is a bad idea. If we are
>> > trying to integrate something fundamentally important to architecture as
>> > how to store state, we shouldn't make it optional.
>> >
>> > Thanks,
>> > Supun..
>> >
>> >
>> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>> > shameerainfo@gmail.com> wrote:
>> >
>> >> Hi Lahiru,
>> >>
>> >> As i understood,  not only reliability , you are trying to achieve some
>> >> other requirement by introducing zookeeper, like health monitoring of
>> the
>> >> services, categorization with service implementation etc ... . In that
>> >> case, i think we can get use of zookeeper's features but if we only
>> focus
>> >> on reliability, i have little bit of concern, why can't we use
>> clustering +
>> >> LB ?
>> >>
>> >> Yes it is better we add Zookeeper as a prerequisite if user need to use
>> >> it.
>> >>
>> >> Thanks,
>> >>  Shameera.
>> >>
>> >>
>> >> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <glahiru@gmail.com
>> >
>> >> wrote:
>> >>
>> >>> Hi Gagan,
>> >>>
>> >>> I need to start another discussion about it, but I had an offline
>> >>> discussion with Suresh about auto-scaling. I will start another thread
>> >>> about this topic too.
>> >>>
>> >>> Regards
>> >>> Lahiru
>> >>>
>> >>>
>> >>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>> gagandeepjuneja@gmail.com
>> >>> >
>> >>> wrote:
>> >>>
>> >>> > Thanks Lahiru for pointing to nice library, added to my dictionary
>> :).
>> >>> >
>> >>> > I would like to know how are we planning to start multiple servers.
>> >>> > 1. Spawning new servers based on load? Some times we call it as auto
>> >>> > scalable.
>> >>> > 2. To make some specific number of nodes available such as we want 2
>> >>> > servers to be available at any time so if one goes down then I need
>> to
>> >>> > spawn one new to make available servers count 2.
>> >>> > 3. Initially start all the servers.
>> >>> >
>> >>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>> >>> existing
>> >>> > architecture support this?
>> >>> >
>> >>> > Regards,
>> >>> > Gagan
>> >>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >> Hi Gagan,
>> >>> >>
>> >>> >> Thanks for your response. Please see my inline comments.
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> >>> gagandeepjuneja@gmail.com>
>> >>> >> wrote:
>> >>> >>
>> >>> >>> Hi Lahiru,
>> >>> >>> Just my 2 cents.
>> >>> >>>
>> >>> >>> I am big fan of zookeeper but also against adding multiple hops in
>> >>> the
>> >>> >>> system which can add unnecessary complexity. Here I am not able to
>> >>> >>> understand the requirement of zookeeper may be I am wrong because
>> of
>> >>> less
>> >>> >>> knowledge of the airavata system in whole. So I would like to
>> discuss
>> >>> >>> following point.
>> >>> >>>
>> >>> >>> 1. How it will help us in making system more reliable. Zookeeper
>> is
>> >>> not
>> >>> >>> able to restart services. At max it can tell whether service is up
>> >>> or not
>> >>> >>> which could only be the case if airavata service goes down
>> >>> gracefully and
>> >>> >>> we have any automated way to restart it. If this is just matter of
>> >>> routing
>> >>> >>> client requests to the available thrift servers then this can be
>> >>> achieved
>> >>> >>> with the help of load balancer which I guess is already there in
>> >>> thrift
>> >>> >>> wish list.
>> >>> >>>
>> >>> >> We have multiple thrift services and currently we start only one
>> >>> instance
>> >>> >> of them and each thrift service is a stateless service. To keep the
>> >>> high
>> >>> >> availability we have to start multiple instances of them in
>> production
>> >>> >> scenario. So for clients to get an available thrift service we can
>> use
>> >>> >> zookeeper znodes to represent each available service. There are
>> some
>> >>> >> libraries which is doing similar[1] and I think we can use them
>> >>> directly.
>> >>> >>
>> >>> >>> 2. As far as registering of different providers is concerned do
>> you
>> >>> >>> think for that we really need external store.
>> >>> >>>
>> >>> >> Yes I think so, because its light weight and reliable and we have
>> to
>> >>> do
>> >>> >> very minimal amount of work to achieve all these features to
>> Airavata
>> >>> >> because zookeeper handle all the complexity.
>> >>> >>
>> >>> >>> I have seen people using zookeeper more for state management in
>> >>> >>> distributed environments.
>> >>> >>>
>> >>> >> +1, we might not be the most effective users of zookeeper because
>> all
>> >>> of
>> >>> >> our services are stateless services, but my point is to achieve
>> >>> >> fault-tolerance we can use zookeeper and with minimal work.
>> >>> >>
>> >>> >>>  I would like to understand more how can we leverage zookeeper in
>> >>> >>> airavata to make system reliable.
>> >>> >>>
>> >>> >>>
>> >>> >> [1]https://github.com/eirslett/thrift-zookeeper
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>> Regards,
>> >>> >>> Gagan
>> >>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>> >>> >>>
>> >>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
>> for
>> >>> >>>> additional comments.
>> >>> >>>>
>> >>> >>>> Marlon
>> >>> >>>>
>> >>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> >>> >>>> > Hi All,
>> >>> >>>> >
>> >>> >>>> > I did little research about Apache Zookeeper[1] and how to use
>> it
>> >>> in
>> >>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>> >>> >>>> reliable
>> >>> >>>> > communication between our thrift services and clients.
>> Zookeeper
>> >>> is a
>> >>> >>>> > distributed, fault tolerant system to do a reliable
>> communication
>> >>> >>>> between
>> >>> >>>> > distributed applications. This is like an in-memory file system
>> >>> which
>> >>> >>>> has
>> >>> >>>> > nodes in a tree structure and each node can have small amount
>> of
>> >>> data
>> >>> >>>> > associated with it and these nodes are called znodes. Clients
>> can
>> >>> >>>> connect
>> >>> >>>> > to a zookeeper server and add/delete and update these znodes.
>> >>> >>>> >
>> >>> >>>> >   In Apache Airavata we start multiple thrift services and
>> these
>> >>> can
>> >>> >>>> go
>> >>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>> >>> store
>> >>> >>>> these
>> >>> >>>> > configuration(thrift service configurations) we can achieve a
>> very
>> >>> >>>> reliable
>> >>> >>>> > system. Basically thrift clients can dynamically discover
>> >>> available
>> >>> >>>> service
>> >>> >>>> > by using ephemeral znodes(Here we do not have to change the
>> >>> generated
>> >>> >>>> > thrift client code but we have to change the locations we are
>> >>> invoking
>> >>> >>>> > them). ephemeral znodes will be removed when the thrift service
>> >>> goes
>> >>> >>>> down
>> >>> >>>> > and zookeeper guarantee the atomicity between these operations.
>> >>> With
>> >>> >>>> this
>> >>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>> >>> >>>> > orchestrator,appcatalog and gfac thrift services.
>> >>> >>>> >
>> >>> >>>> > For specifically for gfac we can have different types of
>> services
>> >>> for
>> >>> >>>> each
>> >>> >>>> > provider implementation. This can be achieved by using the
>> >>> >>>> hierarchical
>> >>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>> >>> service
>> >>> >>>> to
>> >>> >>>> > register it to a defined path. Using the same logic
>> orchestrator
>> >>> can
>> >>> >>>> > discover the provider specific gfac thrift service and route
>> the
>> >>> >>>> message to
>> >>> >>>> > the correct thrift service.
>> >>> >>>> >
>> >>> >>>> > With this approach I think we simply have write some client
>> code
>> >>> in
>> >>> >>>> thrift
>> >>> >>>> > services and clients and zookeeper server installation can be
>> >>> done as
>> >>> >>>> a
>> >>> >>>> > separate process and it will be easier to keep the Zookeeper
>> >>> server
>> >>> >>>> > separate from Airavata because installation of Zookeeper server
>> >>> little
>> >>> >>>> > complex in production scenario. I think we have to make sure
>> >>> >>>> everything
>> >>> >>>> > works fine when there is no Zookeeper running, ex:
>> >>> >>>> enable.zookeeper=false
>> >>> >>>> > should works fine and users doesn't have to download and start
>> >>> >>>> zookeeper.
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> > [1]http://zookeeper.apache.org/
>> >>> >>>> >
>> >>> >>>> > Thanks
>> >>> >>>> > Lahiru
>> >>> >>>>
>> >>> >>>>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> System Analyst Programmer
>> >>> >> PTI Lab
>> >>> >> Indiana University
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >>> --
>> >>> System Analyst Programmer
>> >>> PTI Lab
>> >>> Indiana University
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Shameera Rathnayaka.
>> >>
>> >> email: shameera AT apache.org , shameerainfo AT gmail.com
>> >> Blog : http://shameerarathnayaka.blogspot.com/
>> >>
>> >
>> >
>> >
>> > --
>> > Supun Kamburugamuva
>> > Member, Apache Software Foundation; http://www.apache.org
>> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > Blog: http://supunk.blogspot.com
>> >
>> >
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Supun,

Thanks for the clarification.

Regards
Lahiru


On Mon, Jun 16, 2014 at 11:38 AM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Lahiru,
>
> My suggestion is that may be you don't need a Thrift service between
> Orchestrator and the component executing the experiment. When a new
> experiment is submitted, orchestrator decides who can execute this job.
> Then it put the information about this experiment execution in ZooKeeper.
> The component which wants to executes the experiment is listening to this
> ZooKeeper path and when it sees the experiment it will execute it. So that
> the communication happens through an state change in ZooKeeper. This can
> potentially simply your architecture.
>
> Thanks,
> Supun.
>
>
> On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
>> Hi Supun,
>>
>> So your suggestion is to create a znode for each thrift service we have
>> and
>> when the request comes that node gets modified with input data for that
>> request and thrift service is having a watch for that node and it will be
>> notified because of the watch and it can read the input from zookeeper and
>> invoke the operation?
>>
>> Lahiru
>>
>>
>> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
>> > many components and these components must be stateless to achieve
>> > scalability and reliability.Also there must be a mechanism to
>> communicate
>> > between the components. At the moment Airavata uses RPC calls based on
>> > Thrift for the communication.
>> >
>> > ZooKeeper can be used both as a place to hold state and as a
>> communication
>> > layer between the components. I'm involved with a project that has many
>> > distributed components like AIravata. Right now we use Thrift services
>> to
>> > communicate among the components. But we find it difficult to use RPC
>> calls
>> > and achieve stateless behaviour and thinking of replacing Thrift
>> services
>> > with ZooKeeper based communication layer. So I think it is better to
>> > explore the possibility of removing the Thrift services between the
>> > components and use ZooKeeper as a communication mechanism between the
>> > services. If you do this you will have to move the state to ZooKeeper
>> and
>> > will automatically achieve the stateless behaviour in the components.
>> >
>> > Also I think trying to make ZooKeeper optional is a bad idea. If we are
>> > trying to integrate something fundamentally important to architecture as
>> > how to store state, we shouldn't make it optional.
>> >
>> > Thanks,
>> > Supun..
>> >
>> >
>> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>> > shameerainfo@gmail.com> wrote:
>> >
>> >> Hi Lahiru,
>> >>
>> >> As i understood,  not only reliability , you are trying to achieve some
>> >> other requirement by introducing zookeeper, like health monitoring of
>> the
>> >> services, categorization with service implementation etc ... . In that
>> >> case, i think we can get use of zookeeper's features but if we only
>> focus
>> >> on reliability, i have little bit of concern, why can't we use
>> clustering +
>> >> LB ?
>> >>
>> >> Yes it is better we add Zookeeper as a prerequisite if user need to use
>> >> it.
>> >>
>> >> Thanks,
>> >>  Shameera.
>> >>
>> >>
>> >> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <glahiru@gmail.com
>> >
>> >> wrote:
>> >>
>> >>> Hi Gagan,
>> >>>
>> >>> I need to start another discussion about it, but I had an offline
>> >>> discussion with Suresh about auto-scaling. I will start another thread
>> >>> about this topic too.
>> >>>
>> >>> Regards
>> >>> Lahiru
>> >>>
>> >>>
>> >>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
>> gagandeepjuneja@gmail.com
>> >>> >
>> >>> wrote:
>> >>>
>> >>> > Thanks Lahiru for pointing to nice library, added to my dictionary
>> :).
>> >>> >
>> >>> > I would like to know how are we planning to start multiple servers.
>> >>> > 1. Spawning new servers based on load? Some times we call it as auto
>> >>> > scalable.
>> >>> > 2. To make some specific number of nodes available such as we want 2
>> >>> > servers to be available at any time so if one goes down then I need
>> to
>> >>> > spawn one new to make available servers count 2.
>> >>> > 3. Initially start all the servers.
>> >>> >
>> >>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>> >>> existing
>> >>> > architecture support this?
>> >>> >
>> >>> > Regards,
>> >>> > Gagan
>> >>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >> Hi Gagan,
>> >>> >>
>> >>> >> Thanks for your response. Please see my inline comments.
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> >>> gagandeepjuneja@gmail.com>
>> >>> >> wrote:
>> >>> >>
>> >>> >>> Hi Lahiru,
>> >>> >>> Just my 2 cents.
>> >>> >>>
>> >>> >>> I am big fan of zookeeper but also against adding multiple hops in
>> >>> the
>> >>> >>> system which can add unnecessary complexity. Here I am not able to
>> >>> >>> understand the requirement of zookeeper may be I am wrong because
>> of
>> >>> less
>> >>> >>> knowledge of the airavata system in whole. So I would like to
>> discuss
>> >>> >>> following point.
>> >>> >>>
>> >>> >>> 1. How it will help us in making system more reliable. Zookeeper
>> is
>> >>> not
>> >>> >>> able to restart services. At max it can tell whether service is up
>> >>> or not
>> >>> >>> which could only be the case if airavata service goes down
>> >>> gracefully and
>> >>> >>> we have any automated way to restart it. If this is just matter of
>> >>> routing
>> >>> >>> client requests to the available thrift servers then this can be
>> >>> achieved
>> >>> >>> with the help of load balancer which I guess is already there in
>> >>> thrift
>> >>> >>> wish list.
>> >>> >>>
>> >>> >> We have multiple thrift services and currently we start only one
>> >>> instance
>> >>> >> of them and each thrift service is a stateless service. To keep the
>> >>> high
>> >>> >> availability we have to start multiple instances of them in
>> production
>> >>> >> scenario. So for clients to get an available thrift service we can
>> use
>> >>> >> zookeeper znodes to represent each available service. There are
>> some
>> >>> >> libraries which is doing similar[1] and I think we can use them
>> >>> directly.
>> >>> >>
>> >>> >>> 2. As far as registering of different providers is concerned do
>> you
>> >>> >>> think for that we really need external store.
>> >>> >>>
>> >>> >> Yes I think so, because its light weight and reliable and we have
>> to
>> >>> do
>> >>> >> very minimal amount of work to achieve all these features to
>> Airavata
>> >>> >> because zookeeper handle all the complexity.
>> >>> >>
>> >>> >>> I have seen people using zookeeper more for state management in
>> >>> >>> distributed environments.
>> >>> >>>
>> >>> >> +1, we might not be the most effective users of zookeeper because
>> all
>> >>> of
>> >>> >> our services are stateless services, but my point is to achieve
>> >>> >> fault-tolerance we can use zookeeper and with minimal work.
>> >>> >>
>> >>> >>>  I would like to understand more how can we leverage zookeeper in
>> >>> >>> airavata to make system reliable.
>> >>> >>>
>> >>> >>>
>> >>> >> [1]https://github.com/eirslett/thrift-zookeeper
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>> Regards,
>> >>> >>> Gagan
>> >>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>> >>> >>>
>> >>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
>> for
>> >>> >>>> additional comments.
>> >>> >>>>
>> >>> >>>> Marlon
>> >>> >>>>
>> >>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> >>> >>>> > Hi All,
>> >>> >>>> >
>> >>> >>>> > I did little research about Apache Zookeeper[1] and how to use
>> it
>> >>> in
>> >>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>> >>> >>>> reliable
>> >>> >>>> > communication between our thrift services and clients.
>> Zookeeper
>> >>> is a
>> >>> >>>> > distributed, fault tolerant system to do a reliable
>> communication
>> >>> >>>> between
>> >>> >>>> > distributed applications. This is like an in-memory file system
>> >>> which
>> >>> >>>> has
>> >>> >>>> > nodes in a tree structure and each node can have small amount
>> of
>> >>> data
>> >>> >>>> > associated with it and these nodes are called znodes. Clients
>> can
>> >>> >>>> connect
>> >>> >>>> > to a zookeeper server and add/delete and update these znodes.
>> >>> >>>> >
>> >>> >>>> >   In Apache Airavata we start multiple thrift services and
>> these
>> >>> can
>> >>> >>>> go
>> >>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>> >>> store
>> >>> >>>> these
>> >>> >>>> > configuration(thrift service configurations) we can achieve a
>> very
>> >>> >>>> reliable
>> >>> >>>> > system. Basically thrift clients can dynamically discover
>> >>> available
>> >>> >>>> service
>> >>> >>>> > by using ephemeral znodes(Here we do not have to change the
>> >>> generated
>> >>> >>>> > thrift client code but we have to change the locations we are
>> >>> invoking
>> >>> >>>> > them). ephemeral znodes will be removed when the thrift service
>> >>> goes
>> >>> >>>> down
>> >>> >>>> > and zookeeper guarantee the atomicity between these operations.
>> >>> With
>> >>> >>>> this
>> >>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>> >>> >>>> > orchestrator,appcatalog and gfac thrift services.
>> >>> >>>> >
>> >>> >>>> > For specifically for gfac we can have different types of
>> services
>> >>> for
>> >>> >>>> each
>> >>> >>>> > provider implementation. This can be achieved by using the
>> >>> >>>> hierarchical
>> >>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>> >>> service
>> >>> >>>> to
>> >>> >>>> > register it to a defined path. Using the same logic
>> orchestrator
>> >>> can
>> >>> >>>> > discover the provider specific gfac thrift service and route
>> the
>> >>> >>>> message to
>> >>> >>>> > the correct thrift service.
>> >>> >>>> >
>> >>> >>>> > With this approach I think we simply have write some client
>> code
>> >>> in
>> >>> >>>> thrift
>> >>> >>>> > services and clients and zookeeper server installation can be
>> >>> done as
>> >>> >>>> a
>> >>> >>>> > separate process and it will be easier to keep the Zookeeper
>> >>> server
>> >>> >>>> > separate from Airavata because installation of Zookeeper server
>> >>> little
>> >>> >>>> > complex in production scenario. I think we have to make sure
>> >>> >>>> everything
>> >>> >>>> > works fine when there is no Zookeeper running, ex:
>> >>> >>>> enable.zookeeper=false
>> >>> >>>> > should works fine and users doesn't have to download and start
>> >>> >>>> zookeeper.
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> >
>> >>> >>>> > [1]http://zookeeper.apache.org/
>> >>> >>>> >
>> >>> >>>> > Thanks
>> >>> >>>> > Lahiru
>> >>> >>>>
>> >>> >>>>
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> System Analyst Programmer
>> >>> >> PTI Lab
>> >>> >> Indiana University
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >>> --
>> >>> System Analyst Programmer
>> >>> PTI Lab
>> >>> Indiana University
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Shameera Rathnayaka.
>> >>
>> >> email: shameera AT apache.org , shameerainfo AT gmail.com
>> >> Blog : http://shameerarathnayaka.blogspot.com/
>> >>
>> >
>> >
>> >
>> > --
>> > Supun Kamburugamuva
>> > Member, Apache Software Foundation; http://www.apache.org
>> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
>> > Blog: http://supunk.blogspot.com
>> >
>> >
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Lahiru,

My suggestion is that may be you don't need a Thrift service between
Orchestrator and the component executing the experiment. When a new
experiment is submitted, orchestrator decides who can execute this job.
Then it put the information about this experiment execution in ZooKeeper.
The component which wants to executes the experiment is listening to this
ZooKeeper path and when it sees the experiment it will execute it. So that
the communication happens through an state change in ZooKeeper. This can
potentially simply your architecture.

Thanks,
Supun.


On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Supun,
>
> So your suggestion is to create a znode for each thrift service we have and
> when the request comes that node gets modified with input data for that
> request and thrift service is having a watch for that node and it will be
> notified because of the watch and it can read the input from zookeeper and
> invoke the operation?
>
> Lahiru
>
>
> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
> > many components and these components must be stateless to achieve
> > scalability and reliability.Also there must be a mechanism to communicate
> > between the components. At the moment Airavata uses RPC calls based on
> > Thrift for the communication.
> >
> > ZooKeeper can be used both as a place to hold state and as a
> communication
> > layer between the components. I'm involved with a project that has many
> > distributed components like AIravata. Right now we use Thrift services to
> > communicate among the components. But we find it difficult to use RPC
> calls
> > and achieve stateless behaviour and thinking of replacing Thrift services
> > with ZooKeeper based communication layer. So I think it is better to
> > explore the possibility of removing the Thrift services between the
> > components and use ZooKeeper as a communication mechanism between the
> > services. If you do this you will have to move the state to ZooKeeper and
> > will automatically achieve the stateless behaviour in the components.
> >
> > Also I think trying to make ZooKeeper optional is a bad idea. If we are
> > trying to integrate something fundamentally important to architecture as
> > how to store state, we shouldn't make it optional.
> >
> > Thanks,
> > Supun..
> >
> >
> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > shameerainfo@gmail.com> wrote:
> >
> >> Hi Lahiru,
> >>
> >> As i understood,  not only reliability , you are trying to achieve some
> >> other requirement by introducing zookeeper, like health monitoring of
> the
> >> services, categorization with service implementation etc ... . In that
> >> case, i think we can get use of zookeeper's features but if we only
> focus
> >> on reliability, i have little bit of concern, why can't we use
> clustering +
> >> LB ?
> >>
> >> Yes it is better we add Zookeeper as a prerequisite if user need to use
> >> it.
> >>
> >> Thanks,
> >>  Shameera.
> >>
> >>
> >> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
> >> wrote:
> >>
> >>> Hi Gagan,
> >>>
> >>> I need to start another discussion about it, but I had an offline
> >>> discussion with Suresh about auto-scaling. I will start another thread
> >>> about this topic too.
> >>>
> >>> Regards
> >>> Lahiru
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com
> >>> >
> >>> wrote:
> >>>
> >>> > Thanks Lahiru for pointing to nice library, added to my dictionary
> :).
> >>> >
> >>> > I would like to know how are we planning to start multiple servers.
> >>> > 1. Spawning new servers based on load? Some times we call it as auto
> >>> > scalable.
> >>> > 2. To make some specific number of nodes available such as we want 2
> >>> > servers to be available at any time so if one goes down then I need
> to
> >>> > spawn one new to make available servers count 2.
> >>> > 3. Initially start all the servers.
> >>> >
> >>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
> >>> existing
> >>> > architecture support this?
> >>> >
> >>> > Regards,
> >>> > Gagan
> >>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
> >>> wrote:
> >>> >
> >>> >> Hi Gagan,
> >>> >>
> >>> >> Thanks for your response. Please see my inline comments.
> >>> >>
> >>> >>
> >>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> >>> gagandeepjuneja@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> Hi Lahiru,
> >>> >>> Just my 2 cents.
> >>> >>>
> >>> >>> I am big fan of zookeeper but also against adding multiple hops in
> >>> the
> >>> >>> system which can add unnecessary complexity. Here I am not able to
> >>> >>> understand the requirement of zookeeper may be I am wrong because
> of
> >>> less
> >>> >>> knowledge of the airavata system in whole. So I would like to
> discuss
> >>> >>> following point.
> >>> >>>
> >>> >>> 1. How it will help us in making system more reliable. Zookeeper is
> >>> not
> >>> >>> able to restart services. At max it can tell whether service is up
> >>> or not
> >>> >>> which could only be the case if airavata service goes down
> >>> gracefully and
> >>> >>> we have any automated way to restart it. If this is just matter of
> >>> routing
> >>> >>> client requests to the available thrift servers then this can be
> >>> achieved
> >>> >>> with the help of load balancer which I guess is already there in
> >>> thrift
> >>> >>> wish list.
> >>> >>>
> >>> >> We have multiple thrift services and currently we start only one
> >>> instance
> >>> >> of them and each thrift service is a stateless service. To keep the
> >>> high
> >>> >> availability we have to start multiple instances of them in
> production
> >>> >> scenario. So for clients to get an available thrift service we can
> use
> >>> >> zookeeper znodes to represent each available service. There are some
> >>> >> libraries which is doing similar[1] and I think we can use them
> >>> directly.
> >>> >>
> >>> >>> 2. As far as registering of different providers is concerned do you
> >>> >>> think for that we really need external store.
> >>> >>>
> >>> >> Yes I think so, because its light weight and reliable and we have to
> >>> do
> >>> >> very minimal amount of work to achieve all these features to
> Airavata
> >>> >> because zookeeper handle all the complexity.
> >>> >>
> >>> >>> I have seen people using zookeeper more for state management in
> >>> >>> distributed environments.
> >>> >>>
> >>> >> +1, we might not be the most effective users of zookeeper because
> all
> >>> of
> >>> >> our services are stateless services, but my point is to achieve
> >>> >> fault-tolerance we can use zookeeper and with minimal work.
> >>> >>
> >>> >>>  I would like to understand more how can we leverage zookeeper in
> >>> >>> airavata to make system reliable.
> >>> >>>
> >>> >>>
> >>> >> [1]https://github.com/eirslett/thrift-zookeeper
> >>> >>
> >>> >>
> >>> >>
> >>> >>> Regards,
> >>> >>> Gagan
> >>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> >>> >>>
> >>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
> for
> >>> >>>> additional comments.
> >>> >>>>
> >>> >>>> Marlon
> >>> >>>>
> >>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>> >>>> > Hi All,
> >>> >>>> >
> >>> >>>> > I did little research about Apache Zookeeper[1] and how to use
> it
> >>> in
> >>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
> >>> >>>> reliable
> >>> >>>> > communication between our thrift services and clients. Zookeeper
> >>> is a
> >>> >>>> > distributed, fault tolerant system to do a reliable
> communication
> >>> >>>> between
> >>> >>>> > distributed applications. This is like an in-memory file system
> >>> which
> >>> >>>> has
> >>> >>>> > nodes in a tree structure and each node can have small amount of
> >>> data
> >>> >>>> > associated with it and these nodes are called znodes. Clients
> can
> >>> >>>> connect
> >>> >>>> > to a zookeeper server and add/delete and update these znodes.
> >>> >>>> >
> >>> >>>> >   In Apache Airavata we start multiple thrift services and these
> >>> can
> >>> >>>> go
> >>> >>>> > down for maintenance or these can crash, if we use zookeeper to
> >>> store
> >>> >>>> these
> >>> >>>> > configuration(thrift service configurations) we can achieve a
> very
> >>> >>>> reliable
> >>> >>>> > system. Basically thrift clients can dynamically discover
> >>> available
> >>> >>>> service
> >>> >>>> > by using ephemeral znodes(Here we do not have to change the
> >>> generated
> >>> >>>> > thrift client code but we have to change the locations we are
> >>> invoking
> >>> >>>> > them). ephemeral znodes will be removed when the thrift service
> >>> goes
> >>> >>>> down
> >>> >>>> > and zookeeper guarantee the atomicity between these operations.
> >>> With
> >>> >>>> this
> >>> >>>> > approach we can have a node hierarchy for multiple of airavata,
> >>> >>>> > orchestrator,appcatalog and gfac thrift services.
> >>> >>>> >
> >>> >>>> > For specifically for gfac we can have different types of
> services
> >>> for
> >>> >>>> each
> >>> >>>> > provider implementation. This can be achieved by using the
> >>> >>>> hierarchical
> >>> >>>> > support in zookeeper and providing some logic in gfac-thrift
> >>> service
> >>> >>>> to
> >>> >>>> > register it to a defined path. Using the same logic orchestrator
> >>> can
> >>> >>>> > discover the provider specific gfac thrift service and route the
> >>> >>>> message to
> >>> >>>> > the correct thrift service.
> >>> >>>> >
> >>> >>>> > With this approach I think we simply have write some client code
> >>> in
> >>> >>>> thrift
> >>> >>>> > services and clients and zookeeper server installation can be
> >>> done as
> >>> >>>> a
> >>> >>>> > separate process and it will be easier to keep the Zookeeper
> >>> server
> >>> >>>> > separate from Airavata because installation of Zookeeper server
> >>> little
> >>> >>>> > complex in production scenario. I think we have to make sure
> >>> >>>> everything
> >>> >>>> > works fine when there is no Zookeeper running, ex:
> >>> >>>> enable.zookeeper=false
> >>> >>>> > should works fine and users doesn't have to download and start
> >>> >>>> zookeeper.
> >>> >>>> >
> >>> >>>> >
> >>> >>>> >
> >>> >>>> > [1]http://zookeeper.apache.org/
> >>> >>>> >
> >>> >>>> > Thanks
> >>> >>>> > Lahiru
> >>> >>>>
> >>> >>>>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> System Analyst Programmer
> >>> >> PTI Lab
> >>> >> Indiana University
> >>> >>
> >>> >
> >>>
> >>>
> >>> --
> >>> System Analyst Programmer
> >>> PTI Lab
> >>> Indiana University
> >>>
> >>
> >>
> >>
> >> --
> >> Best Regards,
> >> Shameera Rathnayaka.
> >>
> >> email: shameera AT apache.org , shameerainfo AT gmail.com
> >> Blog : http://shameerarathnayaka.blogspot.com/
> >>
> >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
> >
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Lahiru,

My suggestion is that may be you don't need a Thrift service between
Orchestrator and the component executing the experiment. When a new
experiment is submitted, orchestrator decides who can execute this job.
Then it put the information about this experiment execution in ZooKeeper.
The component which wants to executes the experiment is listening to this
ZooKeeper path and when it sees the experiment it will execute it. So that
the communication happens through an state change in ZooKeeper. This can
potentially simply your architecture.

Thanks,
Supun.


On Mon, Jun 16, 2014 at 11:14 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Supun,
>
> So your suggestion is to create a znode for each thrift service we have and
> when the request comes that node gets modified with input data for that
> request and thrift service is having a watch for that node and it will be
> notified because of the watch and it can read the input from zookeeper and
> invoke the operation?
>
> Lahiru
>
>
> On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
> > many components and these components must be stateless to achieve
> > scalability and reliability.Also there must be a mechanism to communicate
> > between the components. At the moment Airavata uses RPC calls based on
> > Thrift for the communication.
> >
> > ZooKeeper can be used both as a place to hold state and as a
> communication
> > layer between the components. I'm involved with a project that has many
> > distributed components like AIravata. Right now we use Thrift services to
> > communicate among the components. But we find it difficult to use RPC
> calls
> > and achieve stateless behaviour and thinking of replacing Thrift services
> > with ZooKeeper based communication layer. So I think it is better to
> > explore the possibility of removing the Thrift services between the
> > components and use ZooKeeper as a communication mechanism between the
> > services. If you do this you will have to move the state to ZooKeeper and
> > will automatically achieve the stateless behaviour in the components.
> >
> > Also I think trying to make ZooKeeper optional is a bad idea. If we are
> > trying to integrate something fundamentally important to architecture as
> > how to store state, we shouldn't make it optional.
> >
> > Thanks,
> > Supun..
> >
> >
> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > shameerainfo@gmail.com> wrote:
> >
> >> Hi Lahiru,
> >>
> >> As i understood,  not only reliability , you are trying to achieve some
> >> other requirement by introducing zookeeper, like health monitoring of
> the
> >> services, categorization with service implementation etc ... . In that
> >> case, i think we can get use of zookeeper's features but if we only
> focus
> >> on reliability, i have little bit of concern, why can't we use
> clustering +
> >> LB ?
> >>
> >> Yes it is better we add Zookeeper as a prerequisite if user need to use
> >> it.
> >>
> >> Thanks,
> >>  Shameera.
> >>
> >>
> >> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
> >> wrote:
> >>
> >>> Hi Gagan,
> >>>
> >>> I need to start another discussion about it, but I had an offline
> >>> discussion with Suresh about auto-scaling. I will start another thread
> >>> about this topic too.
> >>>
> >>> Regards
> >>> Lahiru
> >>>
> >>>
> >>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com
> >>> >
> >>> wrote:
> >>>
> >>> > Thanks Lahiru for pointing to nice library, added to my dictionary
> :).
> >>> >
> >>> > I would like to know how are we planning to start multiple servers.
> >>> > 1. Spawning new servers based on load? Some times we call it as auto
> >>> > scalable.
> >>> > 2. To make some specific number of nodes available such as we want 2
> >>> > servers to be available at any time so if one goes down then I need
> to
> >>> > spawn one new to make available servers count 2.
> >>> > 3. Initially start all the servers.
> >>> >
> >>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
> >>> existing
> >>> > architecture support this?
> >>> >
> >>> > Regards,
> >>> > Gagan
> >>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
> >>> wrote:
> >>> >
> >>> >> Hi Gagan,
> >>> >>
> >>> >> Thanks for your response. Please see my inline comments.
> >>> >>
> >>> >>
> >>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> >>> gagandeepjuneja@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> Hi Lahiru,
> >>> >>> Just my 2 cents.
> >>> >>>
> >>> >>> I am big fan of zookeeper but also against adding multiple hops in
> >>> the
> >>> >>> system which can add unnecessary complexity. Here I am not able to
> >>> >>> understand the requirement of zookeeper may be I am wrong because
> of
> >>> less
> >>> >>> knowledge of the airavata system in whole. So I would like to
> discuss
> >>> >>> following point.
> >>> >>>
> >>> >>> 1. How it will help us in making system more reliable. Zookeeper is
> >>> not
> >>> >>> able to restart services. At max it can tell whether service is up
> >>> or not
> >>> >>> which could only be the case if airavata service goes down
> >>> gracefully and
> >>> >>> we have any automated way to restart it. If this is just matter of
> >>> routing
> >>> >>> client requests to the available thrift servers then this can be
> >>> achieved
> >>> >>> with the help of load balancer which I guess is already there in
> >>> thrift
> >>> >>> wish list.
> >>> >>>
> >>> >> We have multiple thrift services and currently we start only one
> >>> instance
> >>> >> of them and each thrift service is a stateless service. To keep the
> >>> high
> >>> >> availability we have to start multiple instances of them in
> production
> >>> >> scenario. So for clients to get an available thrift service we can
> use
> >>> >> zookeeper znodes to represent each available service. There are some
> >>> >> libraries which is doing similar[1] and I think we can use them
> >>> directly.
> >>> >>
> >>> >>> 2. As far as registering of different providers is concerned do you
> >>> >>> think for that we really need external store.
> >>> >>>
> >>> >> Yes I think so, because its light weight and reliable and we have to
> >>> do
> >>> >> very minimal amount of work to achieve all these features to
> Airavata
> >>> >> because zookeeper handle all the complexity.
> >>> >>
> >>> >>> I have seen people using zookeeper more for state management in
> >>> >>> distributed environments.
> >>> >>>
> >>> >> +1, we might not be the most effective users of zookeeper because
> all
> >>> of
> >>> >> our services are stateless services, but my point is to achieve
> >>> >> fault-tolerance we can use zookeeper and with minimal work.
> >>> >>
> >>> >>>  I would like to understand more how can we leverage zookeeper in
> >>> >>> airavata to make system reliable.
> >>> >>>
> >>> >>>
> >>> >> [1]https://github.com/eirslett/thrift-zookeeper
> >>> >>
> >>> >>
> >>> >>
> >>> >>> Regards,
> >>> >>> Gagan
> >>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> >>> >>>
> >>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
> for
> >>> >>>> additional comments.
> >>> >>>>
> >>> >>>> Marlon
> >>> >>>>
> >>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>> >>>> > Hi All,
> >>> >>>> >
> >>> >>>> > I did little research about Apache Zookeeper[1] and how to use
> it
> >>> in
> >>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
> >>> >>>> reliable
> >>> >>>> > communication between our thrift services and clients. Zookeeper
> >>> is a
> >>> >>>> > distributed, fault tolerant system to do a reliable
> communication
> >>> >>>> between
> >>> >>>> > distributed applications. This is like an in-memory file system
> >>> which
> >>> >>>> has
> >>> >>>> > nodes in a tree structure and each node can have small amount of
> >>> data
> >>> >>>> > associated with it and these nodes are called znodes. Clients
> can
> >>> >>>> connect
> >>> >>>> > to a zookeeper server and add/delete and update these znodes.
> >>> >>>> >
> >>> >>>> >   In Apache Airavata we start multiple thrift services and these
> >>> can
> >>> >>>> go
> >>> >>>> > down for maintenance or these can crash, if we use zookeeper to
> >>> store
> >>> >>>> these
> >>> >>>> > configuration(thrift service configurations) we can achieve a
> very
> >>> >>>> reliable
> >>> >>>> > system. Basically thrift clients can dynamically discover
> >>> available
> >>> >>>> service
> >>> >>>> > by using ephemeral znodes(Here we do not have to change the
> >>> generated
> >>> >>>> > thrift client code but we have to change the locations we are
> >>> invoking
> >>> >>>> > them). ephemeral znodes will be removed when the thrift service
> >>> goes
> >>> >>>> down
> >>> >>>> > and zookeeper guarantee the atomicity between these operations.
> >>> With
> >>> >>>> this
> >>> >>>> > approach we can have a node hierarchy for multiple of airavata,
> >>> >>>> > orchestrator,appcatalog and gfac thrift services.
> >>> >>>> >
> >>> >>>> > For specifically for gfac we can have different types of
> services
> >>> for
> >>> >>>> each
> >>> >>>> > provider implementation. This can be achieved by using the
> >>> >>>> hierarchical
> >>> >>>> > support in zookeeper and providing some logic in gfac-thrift
> >>> service
> >>> >>>> to
> >>> >>>> > register it to a defined path. Using the same logic orchestrator
> >>> can
> >>> >>>> > discover the provider specific gfac thrift service and route the
> >>> >>>> message to
> >>> >>>> > the correct thrift service.
> >>> >>>> >
> >>> >>>> > With this approach I think we simply have write some client code
> >>> in
> >>> >>>> thrift
> >>> >>>> > services and clients and zookeeper server installation can be
> >>> done as
> >>> >>>> a
> >>> >>>> > separate process and it will be easier to keep the Zookeeper
> >>> server
> >>> >>>> > separate from Airavata because installation of Zookeeper server
> >>> little
> >>> >>>> > complex in production scenario. I think we have to make sure
> >>> >>>> everything
> >>> >>>> > works fine when there is no Zookeeper running, ex:
> >>> >>>> enable.zookeeper=false
> >>> >>>> > should works fine and users doesn't have to download and start
> >>> >>>> zookeeper.
> >>> >>>> >
> >>> >>>> >
> >>> >>>> >
> >>> >>>> > [1]http://zookeeper.apache.org/
> >>> >>>> >
> >>> >>>> > Thanks
> >>> >>>> > Lahiru
> >>> >>>>
> >>> >>>>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> System Analyst Programmer
> >>> >> PTI Lab
> >>> >> Indiana University
> >>> >>
> >>> >
> >>>
> >>>
> >>> --
> >>> System Analyst Programmer
> >>> PTI Lab
> >>> Indiana University
> >>>
> >>
> >>
> >>
> >> --
> >> Best Regards,
> >> Shameera Rathnayaka.
> >>
> >> email: shameera AT apache.org , shameerainfo AT gmail.com
> >> Blog : http://shameerarathnayaka.blogspot.com/
> >>
> >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
> >
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Supun,

So your suggestion is to create a znode for each thrift service we have and
when the request comes that node gets modified with input data for that
request and thrift service is having a watch for that node and it will be
notified because of the watch and it can read the input from zookeeper and
invoke the operation?

Lahiru


On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi all,
>
> Here is what I think about Airavata and ZooKeeper. In Airavata there are
> many components and these components must be stateless to achieve
> scalability and reliability.Also there must be a mechanism to communicate
> between the components. At the moment Airavata uses RPC calls based on
> Thrift for the communication.
>
> ZooKeeper can be used both as a place to hold state and as a communication
> layer between the components. I'm involved with a project that has many
> distributed components like AIravata. Right now we use Thrift services to
> communicate among the components. But we find it difficult to use RPC calls
> and achieve stateless behaviour and thinking of replacing Thrift services
> with ZooKeeper based communication layer. So I think it is better to
> explore the possibility of removing the Thrift services between the
> components and use ZooKeeper as a communication mechanism between the
> services. If you do this you will have to move the state to ZooKeeper and
> will automatically achieve the stateless behaviour in the components.
>
> Also I think trying to make ZooKeeper optional is a bad idea. If we are
> trying to integrate something fundamentally important to architecture as
> how to store state, we shouldn't make it optional.
>
> Thanks,
> Supun..
>
>
> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> shameerainfo@gmail.com> wrote:
>
>> Hi Lahiru,
>>
>> As i understood,  not only reliability , you are trying to achieve some
>> other requirement by introducing zookeeper, like health monitoring of the
>> services, categorization with service implementation etc ... . In that
>> case, i think we can get use of zookeeper's features but if we only focus
>> on reliability, i have little bit of concern, why can't we use clustering +
>> LB ?
>>
>> Yes it is better we add Zookeeper as a prerequisite if user need to use
>> it.
>>
>> Thanks,
>>  Shameera.
>>
>>
>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>>
>>> Hi Gagan,
>>>
>>> I need to start another discussion about it, but I had an offline
>>> discussion with Suresh about auto-scaling. I will start another thread
>>> about this topic too.
>>>
>>> Regards
>>> Lahiru
>>>
>>>
>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <gagandeepjuneja@gmail.com
>>> >
>>> wrote:
>>>
>>> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
>>> >
>>> > I would like to know how are we planning to start multiple servers.
>>> > 1. Spawning new servers based on load? Some times we call it as auto
>>> > scalable.
>>> > 2. To make some specific number of nodes available such as we want 2
>>> > servers to be available at any time so if one goes down then I need to
>>> > spawn one new to make available servers count 2.
>>> > 3. Initially start all the servers.
>>> >
>>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>>> existing
>>> > architecture support this?
>>> >
>>> > Regards,
>>> > Gagan
>>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi Gagan,
>>> >>
>>> >> Thanks for your response. Please see my inline comments.
>>> >>
>>> >>
>>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>> gagandeepjuneja@gmail.com>
>>> >> wrote:
>>> >>
>>> >>> Hi Lahiru,
>>> >>> Just my 2 cents.
>>> >>>
>>> >>> I am big fan of zookeeper but also against adding multiple hops in
>>> the
>>> >>> system which can add unnecessary complexity. Here I am not able to
>>> >>> understand the requirement of zookeeper may be I am wrong because of
>>> less
>>> >>> knowledge of the airavata system in whole. So I would like to discuss
>>> >>> following point.
>>> >>>
>>> >>> 1. How it will help us in making system more reliable. Zookeeper is
>>> not
>>> >>> able to restart services. At max it can tell whether service is up
>>> or not
>>> >>> which could only be the case if airavata service goes down
>>> gracefully and
>>> >>> we have any automated way to restart it. If this is just matter of
>>> routing
>>> >>> client requests to the available thrift servers then this can be
>>> achieved
>>> >>> with the help of load balancer which I guess is already there in
>>> thrift
>>> >>> wish list.
>>> >>>
>>> >> We have multiple thrift services and currently we start only one
>>> instance
>>> >> of them and each thrift service is a stateless service. To keep the
>>> high
>>> >> availability we have to start multiple instances of them in production
>>> >> scenario. So for clients to get an available thrift service we can use
>>> >> zookeeper znodes to represent each available service. There are some
>>> >> libraries which is doing similar[1] and I think we can use them
>>> directly.
>>> >>
>>> >>> 2. As far as registering of different providers is concerned do you
>>> >>> think for that we really need external store.
>>> >>>
>>> >> Yes I think so, because its light weight and reliable and we have to
>>> do
>>> >> very minimal amount of work to achieve all these features to Airavata
>>> >> because zookeeper handle all the complexity.
>>> >>
>>> >>> I have seen people using zookeeper more for state management in
>>> >>> distributed environments.
>>> >>>
>>> >> +1, we might not be the most effective users of zookeeper because all
>>> of
>>> >> our services are stateless services, but my point is to achieve
>>> >> fault-tolerance we can use zookeeper and with minimal work.
>>> >>
>>> >>>  I would like to understand more how can we leverage zookeeper in
>>> >>> airavata to make system reliable.
>>> >>>
>>> >>>
>>> >> [1]https://github.com/eirslett/thrift-zookeeper
>>> >>
>>> >>
>>> >>
>>> >>> Regards,
>>> >>> Gagan
>>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>> >>>
>>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>> >>>> additional comments.
>>> >>>>
>>> >>>> Marlon
>>> >>>>
>>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>> >>>> > Hi All,
>>> >>>> >
>>> >>>> > I did little research about Apache Zookeeper[1] and how to use it
>>> in
>>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>>> >>>> reliable
>>> >>>> > communication between our thrift services and clients. Zookeeper
>>> is a
>>> >>>> > distributed, fault tolerant system to do a reliable communication
>>> >>>> between
>>> >>>> > distributed applications. This is like an in-memory file system
>>> which
>>> >>>> has
>>> >>>> > nodes in a tree structure and each node can have small amount of
>>> data
>>> >>>> > associated with it and these nodes are called znodes. Clients can
>>> >>>> connect
>>> >>>> > to a zookeeper server and add/delete and update these znodes.
>>> >>>> >
>>> >>>> >   In Apache Airavata we start multiple thrift services and these
>>> can
>>> >>>> go
>>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>>> store
>>> >>>> these
>>> >>>> > configuration(thrift service configurations) we can achieve a very
>>> >>>> reliable
>>> >>>> > system. Basically thrift clients can dynamically discover
>>> available
>>> >>>> service
>>> >>>> > by using ephemeral znodes(Here we do not have to change the
>>> generated
>>> >>>> > thrift client code but we have to change the locations we are
>>> invoking
>>> >>>> > them). ephemeral znodes will be removed when the thrift service
>>> goes
>>> >>>> down
>>> >>>> > and zookeeper guarantee the atomicity between these operations.
>>> With
>>> >>>> this
>>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>>> >>>> > orchestrator,appcatalog and gfac thrift services.
>>> >>>> >
>>> >>>> > For specifically for gfac we can have different types of services
>>> for
>>> >>>> each
>>> >>>> > provider implementation. This can be achieved by using the
>>> >>>> hierarchical
>>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>>> service
>>> >>>> to
>>> >>>> > register it to a defined path. Using the same logic orchestrator
>>> can
>>> >>>> > discover the provider specific gfac thrift service and route the
>>> >>>> message to
>>> >>>> > the correct thrift service.
>>> >>>> >
>>> >>>> > With this approach I think we simply have write some client code
>>> in
>>> >>>> thrift
>>> >>>> > services and clients and zookeeper server installation can be
>>> done as
>>> >>>> a
>>> >>>> > separate process and it will be easier to keep the Zookeeper
>>> server
>>> >>>> > separate from Airavata because installation of Zookeeper server
>>> little
>>> >>>> > complex in production scenario. I think we have to make sure
>>> >>>> everything
>>> >>>> > works fine when there is no Zookeeper running, ex:
>>> >>>> enable.zookeeper=false
>>> >>>> > should works fine and users doesn't have to download and start
>>> >>>> zookeeper.
>>> >>>> >
>>> >>>> >
>>> >>>> >
>>> >>>> > [1]http://zookeeper.apache.org/
>>> >>>> >
>>> >>>> > Thanks
>>> >>>> > Lahiru
>>> >>>>
>>> >>>>
>>> >>
>>> >>
>>> >> --
>>> >> System Analyst Programmer
>>> >> PTI Lab
>>> >> Indiana University
>>> >>
>>> >
>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>
>>
>>
>> --
>> Best Regards,
>> Shameera Rathnayaka.
>>
>> email: shameera AT apache.org , shameerainfo AT gmail.com
>> Blog : http://shameerarathnayaka.blogspot.com/
>>
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Supun,

So your suggestion is to create a znode for each thrift service we have and
when the request comes that node gets modified with input data for that
request and thrift service is having a watch for that node and it will be
notified because of the watch and it can read the input from zookeeper and
invoke the operation?

Lahiru


On Thu, Jun 12, 2014 at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi all,
>
> Here is what I think about Airavata and ZooKeeper. In Airavata there are
> many components and these components must be stateless to achieve
> scalability and reliability.Also there must be a mechanism to communicate
> between the components. At the moment Airavata uses RPC calls based on
> Thrift for the communication.
>
> ZooKeeper can be used both as a place to hold state and as a communication
> layer between the components. I'm involved with a project that has many
> distributed components like AIravata. Right now we use Thrift services to
> communicate among the components. But we find it difficult to use RPC calls
> and achieve stateless behaviour and thinking of replacing Thrift services
> with ZooKeeper based communication layer. So I think it is better to
> explore the possibility of removing the Thrift services between the
> components and use ZooKeeper as a communication mechanism between the
> services. If you do this you will have to move the state to ZooKeeper and
> will automatically achieve the stateless behaviour in the components.
>
> Also I think trying to make ZooKeeper optional is a bad idea. If we are
> trying to integrate something fundamentally important to architecture as
> how to store state, we shouldn't make it optional.
>
> Thanks,
> Supun..
>
>
> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> shameerainfo@gmail.com> wrote:
>
>> Hi Lahiru,
>>
>> As i understood,  not only reliability , you are trying to achieve some
>> other requirement by introducing zookeeper, like health monitoring of the
>> services, categorization with service implementation etc ... . In that
>> case, i think we can get use of zookeeper's features but if we only focus
>> on reliability, i have little bit of concern, why can't we use clustering +
>> LB ?
>>
>> Yes it is better we add Zookeeper as a prerequisite if user need to use
>> it.
>>
>> Thanks,
>>  Shameera.
>>
>>
>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
>> wrote:
>>
>>> Hi Gagan,
>>>
>>> I need to start another discussion about it, but I had an offline
>>> discussion with Suresh about auto-scaling. I will start another thread
>>> about this topic too.
>>>
>>> Regards
>>> Lahiru
>>>
>>>
>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <gagandeepjuneja@gmail.com
>>> >
>>> wrote:
>>>
>>> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
>>> >
>>> > I would like to know how are we planning to start multiple servers.
>>> > 1. Spawning new servers based on load? Some times we call it as auto
>>> > scalable.
>>> > 2. To make some specific number of nodes available such as we want 2
>>> > servers to be available at any time so if one goes down then I need to
>>> > spawn one new to make available servers count 2.
>>> > 3. Initially start all the servers.
>>> >
>>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>>> existing
>>> > architecture support this?
>>> >
>>> > Regards,
>>> > Gagan
>>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi Gagan,
>>> >>
>>> >> Thanks for your response. Please see my inline comments.
>>> >>
>>> >>
>>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>> gagandeepjuneja@gmail.com>
>>> >> wrote:
>>> >>
>>> >>> Hi Lahiru,
>>> >>> Just my 2 cents.
>>> >>>
>>> >>> I am big fan of zookeeper but also against adding multiple hops in
>>> the
>>> >>> system which can add unnecessary complexity. Here I am not able to
>>> >>> understand the requirement of zookeeper may be I am wrong because of
>>> less
>>> >>> knowledge of the airavata system in whole. So I would like to discuss
>>> >>> following point.
>>> >>>
>>> >>> 1. How it will help us in making system more reliable. Zookeeper is
>>> not
>>> >>> able to restart services. At max it can tell whether service is up
>>> or not
>>> >>> which could only be the case if airavata service goes down
>>> gracefully and
>>> >>> we have any automated way to restart it. If this is just matter of
>>> routing
>>> >>> client requests to the available thrift servers then this can be
>>> achieved
>>> >>> with the help of load balancer which I guess is already there in
>>> thrift
>>> >>> wish list.
>>> >>>
>>> >> We have multiple thrift services and currently we start only one
>>> instance
>>> >> of them and each thrift service is a stateless service. To keep the
>>> high
>>> >> availability we have to start multiple instances of them in production
>>> >> scenario. So for clients to get an available thrift service we can use
>>> >> zookeeper znodes to represent each available service. There are some
>>> >> libraries which is doing similar[1] and I think we can use them
>>> directly.
>>> >>
>>> >>> 2. As far as registering of different providers is concerned do you
>>> >>> think for that we really need external store.
>>> >>>
>>> >> Yes I think so, because its light weight and reliable and we have to
>>> do
>>> >> very minimal amount of work to achieve all these features to Airavata
>>> >> because zookeeper handle all the complexity.
>>> >>
>>> >>> I have seen people using zookeeper more for state management in
>>> >>> distributed environments.
>>> >>>
>>> >> +1, we might not be the most effective users of zookeeper because all
>>> of
>>> >> our services are stateless services, but my point is to achieve
>>> >> fault-tolerance we can use zookeeper and with minimal work.
>>> >>
>>> >>>  I would like to understand more how can we leverage zookeeper in
>>> >>> airavata to make system reliable.
>>> >>>
>>> >>>
>>> >> [1]https://github.com/eirslett/thrift-zookeeper
>>> >>
>>> >>
>>> >>
>>> >>> Regards,
>>> >>> Gagan
>>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>> >>>
>>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>> >>>> additional comments.
>>> >>>>
>>> >>>> Marlon
>>> >>>>
>>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>> >>>> > Hi All,
>>> >>>> >
>>> >>>> > I did little research about Apache Zookeeper[1] and how to use it
>>> in
>>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>>> >>>> reliable
>>> >>>> > communication between our thrift services and clients. Zookeeper
>>> is a
>>> >>>> > distributed, fault tolerant system to do a reliable communication
>>> >>>> between
>>> >>>> > distributed applications. This is like an in-memory file system
>>> which
>>> >>>> has
>>> >>>> > nodes in a tree structure and each node can have small amount of
>>> data
>>> >>>> > associated with it and these nodes are called znodes. Clients can
>>> >>>> connect
>>> >>>> > to a zookeeper server and add/delete and update these znodes.
>>> >>>> >
>>> >>>> >   In Apache Airavata we start multiple thrift services and these
>>> can
>>> >>>> go
>>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>>> store
>>> >>>> these
>>> >>>> > configuration(thrift service configurations) we can achieve a very
>>> >>>> reliable
>>> >>>> > system. Basically thrift clients can dynamically discover
>>> available
>>> >>>> service
>>> >>>> > by using ephemeral znodes(Here we do not have to change the
>>> generated
>>> >>>> > thrift client code but we have to change the locations we are
>>> invoking
>>> >>>> > them). ephemeral znodes will be removed when the thrift service
>>> goes
>>> >>>> down
>>> >>>> > and zookeeper guarantee the atomicity between these operations.
>>> With
>>> >>>> this
>>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>>> >>>> > orchestrator,appcatalog and gfac thrift services.
>>> >>>> >
>>> >>>> > For specifically for gfac we can have different types of services
>>> for
>>> >>>> each
>>> >>>> > provider implementation. This can be achieved by using the
>>> >>>> hierarchical
>>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>>> service
>>> >>>> to
>>> >>>> > register it to a defined path. Using the same logic orchestrator
>>> can
>>> >>>> > discover the provider specific gfac thrift service and route the
>>> >>>> message to
>>> >>>> > the correct thrift service.
>>> >>>> >
>>> >>>> > With this approach I think we simply have write some client code
>>> in
>>> >>>> thrift
>>> >>>> > services and clients and zookeeper server installation can be
>>> done as
>>> >>>> a
>>> >>>> > separate process and it will be easier to keep the Zookeeper
>>> server
>>> >>>> > separate from Airavata because installation of Zookeeper server
>>> little
>>> >>>> > complex in production scenario. I think we have to make sure
>>> >>>> everything
>>> >>>> > works fine when there is no Zookeeper running, ex:
>>> >>>> enable.zookeeper=false
>>> >>>> > should works fine and users doesn't have to download and start
>>> >>>> zookeeper.
>>> >>>> >
>>> >>>> >
>>> >>>> >
>>> >>>> > [1]http://zookeeper.apache.org/
>>> >>>> >
>>> >>>> > Thanks
>>> >>>> > Lahiru
>>> >>>>
>>> >>>>
>>> >>
>>> >>
>>> >> --
>>> >> System Analyst Programmer
>>> >> PTI Lab
>>> >> Indiana University
>>> >>
>>> >
>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>
>>
>>
>> --
>> Best Regards,
>> Shameera Rathnayaka.
>>
>> email: shameera AT apache.org , shameerainfo AT gmail.com
>> Blog : http://shameerarathnayaka.blogspot.com/
>>
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Internal communication in micro-service architectures

Posted by Eran Chinthaka Withana <er...@gmail.com>.

What you've explained is distributed coordination.

When Suresh mentioned about internal communication I thought its about how
internal services communicate to get the work done. Lets see what the
usecases are. May be I misunderstood the requirements.

Thanks,
Eran Chinthaka Withana


On Sat, Jun 14, 2014 at 5:39 PM, Supun Kamburugamuva <su...@gmail.com>
wrote:

> Hi Eran,
>
> Yes, I agree there can be many internal communications. The communication I
> was referring to was the communication involved with running jobs in a
> distributed manner. For example there can be many GFac instances capable of
> running an experiment. I believe at the moment Airavata uses or plan to use
> a Thrift call to ask the GFac to execute the experiment. This communication
> can be done through ZooKeeper using a state change.
>
> Going back to Storm, Nimbus uses ZooKeeper to communicate the things like
> worker node assignments to typologies through the ZooKeeper [1]. In Storm
> there is nothing between Nimbus and Supervisors other than ZooKeeper. The
> state is saved in the ZooKeeper and state change is being used as the
> communication. If you look at how Storm is using ZooKeeper it is very close
> to what Airavata needs.
>
> [1]
>
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/cluster.clj
>
> Regards,
> Supun..
>
>
> On Sat, Jun 14, 2014 at 7:42 PM, Eran Chinthaka Withana <
> eran.chinthaka@gmail.com> wrote:
>
> > Hi
> >
> > On Fri, Jun 13, 2014 at 9:27 PM, Supun Kamburugamuva <su...@gmail.com>
> >  wrote:
> >
> > > Best example of using ZooKeeper as a communication + state holding
> place
> > is
> > > Apache Storm.
> > >
> > > http://storm.incubator.apache.org/
> >
> >
> > Supun, IIRC, storm uses ZK to keep status and coordination between
> workers
> > (bolts and spouts). The fact that multiple workers in storm uses that
> > information to take the decisions is, IMO, not considered as internal
> > communication. Its distributed coordination.
> >
> > I felt like the meaning of "internal communication" is overloaded here.
> >
> > *Suresh *we desperately need an example here for internal communication
> you
> > are talking about :)
> >
> > -- Eran
> >
>
>
>
> --
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>

Re: Internal communication in micro-service architectures

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Eran,

Yes, I agree there can be many internal communications. The communication I
was referring to was the communication involved with running jobs in a
distributed manner. For example there can be many GFac instances capable of
running an experiment. I believe at the moment Airavata uses or plan to use
a Thrift call to ask the GFac to execute the experiment. This communication
can be done through ZooKeeper using a state change.

Going back to Storm, Nimbus uses ZooKeeper to communicate the things like
worker node assignments to typologies through the ZooKeeper [1]. In Storm
there is nothing between Nimbus and Supervisors other than ZooKeeper. The
state is saved in the ZooKeeper and state change is being used as the
communication. If you look at how Storm is using ZooKeeper it is very close
to what Airavata needs.

[1]
https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/cluster.clj

Regards,
Supun..

On Sat, Jun 14, 2014 at 7:42 PM, Eran Chinthaka Withana <
eran.chinthaka@gmail.com> wrote:

> Hi
>
> On Fri, Jun 13, 2014 at 9:27 PM, Supun Kamburugamuva <su...@gmail.com>
>  wrote:
>
> > Best example of using ZooKeeper as a communication + state holding place
> is
> > Apache Storm.
> >
> > http://storm.incubator.apache.org/
>
>
> Supun, IIRC, storm uses ZK to keep status and coordination between workers
> (bolts and spouts). The fact that multiple workers in storm uses that
> information to take the decisions is, IMO, not considered as internal
> communication. Its distributed coordination.
>
> I felt like the meaning of "internal communication" is overloaded here.
>
> *Suresh *we desperately need an example here for internal communication you
> are talking about :)
>
> -- Eran
>

-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Internal communication in micro-service architectures

Posted by Eran Chinthaka Withana <er...@gmail.com>.

Hi

On Fri, Jun 13, 2014 at 9:27 PM, Supun Kamburugamuva <su...@gmail.com>
 wrote:

> Best example of using ZooKeeper as a communication + state holding place is
> Apache Storm.
>
> http://storm.incubator.apache.org/

Supun, IIRC, storm uses ZK to keep status and coordination between workers
(bolts and spouts). The fact that multiple workers in storm uses that
information to take the decisions is, IMO, not considered as internal
communication. Its distributed coordination.

I felt like the meaning of "internal communication" is overloaded here.

*Suresh *we desperately need an example here for internal communication you
are talking about :)

-- Eran

Re: Internal communication in micro-service architectures

Posted by Supun Kamburugamuva <su...@gmail.com>.

Best example of using ZooKeeper as a communication + state holding place is
Apache Storm.

http://storm.incubator.apache.org/

Thanks,
Supun..


On Sat, Jun 14, 2014 at 12:24 AM, Eran Chinthaka Withana <
eran.chinthaka@gmail.com> wrote:

> Hi,
>
> First, it was very surprising for me to hear that ZK was used as the
> communication mechanism. The closest I have heard is recipes:
> http://zookeeper.apache.org/doc/trunk/recipes.html. Also, since thrift
> services forces us to deploy stateless services the load balancing and
> scaling is pretty easy. Its just a matter of bringing up new instances. Can
> you please explain one use case where you use zookeeper as the
> communication mechanism?
>
> Anyway, *Suresh*, why do you want to pick something else for internal
> communication? Why can't we use thrift even for that? At some point, you
> won't be able to define whats internal and external. May be I'm not
> understanding the requirement/usecases here. Can you please elaborate with
> an example?
>
> Thanks,
> Eran Chinthaka Withana
>
>
> On Fri, Jun 13, 2014 at 4:43 AM, Suresh Marru <sm...@apache.org> wrote:
>
> > Thanks Supun these are great thoughts and is interesting to know how you
> > are using zookeeper for state management and communication. Since we are
> > diverging the discussion into a different direction (this is good
> important
> > topic though), let me start a new thread and continue to discuss
> zookeeper
> > in other thread.
> >
> > Thrift is serving well (atleast for now, barring some workable
> > limitations) for Airavata client facing API. Its helping with the
> > complexity in the data model and polygot clients. For justified reasons,
> we
> > are slowly breaking down and expanding the services within Airavata
> leading
> > to a micro-service architectures, may be towards more reactive
> architecture.
> >
> > What will be the good internal communication mechanism? The architectures
> > I have come across have used zookeeper for mainly load balancing,
> > distributed configuration management, change notifications and so forth.
> > What are the alternatives?
> >
> > Since Eran, Patanachani, Jijoe and Samir have gone through similar
> > iteration, it will be great to hear their opinions.
> >
> > Suresh
> >
> > On Jun 12, 2014, at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Here is what I think about Airavata and ZooKeeper. In Airavata there
> are
> > many components and these components must be stateless to achieve
> > scalability and reliability.Also there must be a mechanism to communicate
> > between the components. At the moment Airavata uses RPC calls based on
> > Thrift for the communication.
> > >
> > > ZooKeeper can be used both as a place to hold state and as a
> > communication layer between the components. I'm involved with a project
> > that has many distributed components like AIravata. Right now we use
> Thrift
> > services to communicate among the components. But we find it difficult to
> > use RPC calls and achieve stateless behaviour and thinking of replacing
> > Thrift services with ZooKeeper based communication layer. So I think it
> is
> > better to explore the possibility of removing the Thrift services between
> > the components and use ZooKeeper as a communication mechanism between the
> > services. If you do this you will have to move the state to ZooKeeper and
> > will automatically achieve the stateless behaviour in the components.
> > >
> > > Also I think trying to make ZooKeeper optional is a bad idea. If we are
> > trying to integrate something fundamentally important to architecture as
> > how to store state, we shouldn't make it optional.
> > >
> > > Thanks,
> > > Supun..
> > >
> > >
> > > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> > shameerainfo@gmail.com> wrote:
> > > Hi Lahiru,
> > >
> > > As i understood,  not only reliability , you are trying to achieve some
> > other requirement by introducing zookeeper, like health monitoring of the
> > services, categorization with service implementation etc ... . In that
> > case, i think we can get use of zookeeper's features but if we only focus
> > on reliability, i have little bit of concern, why can't we use
> clustering +
> > LB ?
> > >
> > > Yes it is better we add Zookeeper as a prerequisite if user need to use
> > it.
> > >
> > > Thanks,
> > > Shameera.
> > >
> > >
> > > On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <glahiru@gmail.com
> >
> > wrote:
> > > Hi Gagan,
> > >
> > > I need to start another discussion about it, but I had an offline
> > > discussion with Suresh about auto-scaling. I will start another thread
> > > about this topic too.
> > >
> > > Regards
> > > Lahiru
> > >
> > >
> > > On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com
> > >
> > > wrote:
> > >
> > > > Thanks Lahiru for pointing to nice library, added to my dictionary
> :).
> > > >
> > > > I would like to know how are we planning to start multiple servers.
> > > > 1. Spawning new servers based on load? Some times we call it as auto
> > > > scalable.
> > > > 2. To make some specific number of nodes available such as we want 2
> > > > servers to be available at any time so if one goes down then I need
> to
> > > > spawn one new to make available servers count 2.
> > > > 3. Initially start all the servers.
> > > >
> > > > In scenario 1 and 2 zookeeper does make sense but I don't believe
> > existing
> > > > architecture support this?
> > > >
> > > > Regards,
> > > > Gagan
> > > > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
> > wrote:
> > > >
> > > >> Hi Gagan,
> > > >>
> > > >> Thanks for your response. Please see my inline comments.
> > > >>
> > > >>
> > > >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> > gagandeepjuneja@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Hi Lahiru,
> > > >>> Just my 2 cents.
> > > >>>
> > > >>> I am big fan of zookeeper but also against adding multiple hops in
> > the
> > > >>> system which can add unnecessary complexity. Here I am not able to
> > > >>> understand the requirement of zookeeper may be I am wrong because
> of
> > less
> > > >>> knowledge of the airavata system in whole. So I would like to
> discuss
> > > >>> following point.
> > > >>>
> > > >>> 1. How it will help us in making system more reliable. Zookeeper is
> > not
> > > >>> able to restart services. At max it can tell whether service is up
> > or not
> > > >>> which could only be the case if airavata service goes down
> > gracefully and
> > > >>> we have any automated way to restart it. If this is just matter of
> > routing
> > > >>> client requests to the available thrift servers then this can be
> > achieved
> > > >>> with the help of load balancer which I guess is already there in
> > thrift
> > > >>> wish list.
> > > >>>
> > > >> We have multiple thrift services and currently we start only one
> > instance
> > > >> of them and each thrift service is a stateless service. To keep the
> > high
> > > >> availability we have to start multiple instances of them in
> production
> > > >> scenario. So for clients to get an available thrift service we can
> use
> > > >> zookeeper znodes to represent each available service. There are some
> > > >> libraries which is doing similar[1] and I think we can use them
> > directly.
> > > >>
> > > >>> 2. As far as registering of different providers is concerned do you
> > > >>> think for that we really need external store.
> > > >>>
> > > >> Yes I think so, because its light weight and reliable and we have to
> > do
> > > >> very minimal amount of work to achieve all these features to
> Airavata
> > > >> because zookeeper handle all the complexity.
> > > >>
> > > >>> I have seen people using zookeeper more for state management in
> > > >>> distributed environments.
> > > >>>
> > > >> +1, we might not be the most effective users of zookeeper because
> all
> > of
> > > >> our services are stateless services, but my point is to achieve
> > > >> fault-tolerance we can use zookeeper and with minimal work.
> > > >>
> > > >>>  I would like to understand more how can we leverage zookeeper in
> > > >>> airavata to make system reliable.
> > > >>>
> > > >>>
> > > >> [1]https://github.com/eirslett/thrift-zookeeper
> > > >>
> > > >>
> > > >>
> > > >>> Regards,
> > > >>> Gagan
> > > >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> > > >>>
> > > >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list
> for
> > > >>>> additional comments.
> > > >>>>
> > > >>>> Marlon
> > > >>>>
> > > >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > > >>>> > Hi All,
> > > >>>> >
> > > >>>> > I did little research about Apache Zookeeper[1] and how to use
> it
> > in
> > > >>>> > airavata. Its really a nice way to achieve fault tolerance and
> > > >>>> reliable
> > > >>>> > communication between our thrift services and clients. Zookeeper
> > is a
> > > >>>> > distributed, fault tolerant system to do a reliable
> communication
> > > >>>> between
> > > >>>> > distributed applications. This is like an in-memory file system
> > which
> > > >>>> has
> > > >>>> > nodes in a tree structure and each node can have small amount of
> > data
> > > >>>> > associated with it and these nodes are called znodes. Clients
> can
> > > >>>> connect
> > > >>>> > to a zookeeper server and add/delete and update these znodes.
> > > >>>> >
> > > >>>> >   In Apache Airavata we start multiple thrift services and these
> > can
> > > >>>> go
> > > >>>> > down for maintenance or these can crash, if we use zookeeper to
> > store
> > > >>>> these
> > > >>>> > configuration(thrift service configurations) we can achieve a
> very
> > > >>>> reliable
> > > >>>> > system. Basically thrift clients can dynamically discover
> > available
> > > >>>> service
> > > >>>> > by using ephemeral znodes(Here we do not have to change the
> > generated
> > > >>>> > thrift client code but we have to change the locations we are
> > invoking
> > > >>>> > them). ephemeral znodes will be removed when the thrift service
> > goes
> > > >>>> down
> > > >>>> > and zookeeper guarantee the atomicity between these operations.
> > With
> > > >>>> this
> > > >>>> > approach we can have a node hierarchy for multiple of airavata,
> > > >>>> > orchestrator,appcatalog and gfac thrift services.
> > > >>>> >
> > > >>>> > For specifically for gfac we can have different types of
> services
> > for
> > > >>>> each
> > > >>>> > provider implementation. This can be achieved by using the
> > > >>>> hierarchical
> > > >>>> > support in zookeeper and providing some logic in gfac-thrift
> > service
> > > >>>> to
> > > >>>> > register it to a defined path. Using the same logic orchestrator
> > can
> > > >>>> > discover the provider specific gfac thrift service and route the
> > > >>>> message to
> > > >>>> > the correct thrift service.
> > > >>>> >
> > > >>>> > With this approach I think we simply have write some client code
> > in
> > > >>>> thrift
> > > >>>> > services and clients and zookeeper server installation can be
> > done as
> > > >>>> a
> > > >>>> > separate process and it will be easier to keep the Zookeeper
> > server
> > > >>>> > separate from Airavata because installation of Zookeeper server
> > little
> > > >>>> > complex in production scenario. I think we have to make sure
> > > >>>> everything
> > > >>>> > works fine when there is no Zookeeper running, ex:
> > > >>>> enable.zookeeper=false
> > > >>>> > should works fine and users doesn't have to download and start
> > > >>>> zookeeper.
> > > >>>> >
> > > >>>> >
> > > >>>> >
> > > >>>> > [1]http://zookeeper.apache.org/
> > > >>>> >
> > > >>>> > Thanks
> > > >>>> > Lahiru
> > > >>>>
> > > >>>>
> > > >>
> > > >>
> > > >> --
> > > >> System Analyst Programmer
> > > >> PTI Lab
> > > >> Indiana University
> > > >>
> > > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> > >
> > >
> > > --
> > > Best Regards,
> > > Shameera Rathnayaka.
> > >
> > > email: shameera AT apache.org , shameerainfo AT gmail.com
> > > Blog : http://shameerarathnayaka.blogspot.com/
> > >
> > >
> > >
> > > --
> > > Supun Kamburugamuva
> > > Member, Apache Software Foundation; http://www.apache.org
> > > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > > Blog: http://supunk.blogspot.com
> > >
> >
> >
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Internal communication in micro-service architectures

Posted by Jijoe Vurghese <ji...@gmail.com>.

+1 on striving for internal communications to be the same as external - it has advantages like forcing “us” to be clients to our own micro services (dog fooding), helping uncover weaknesses and bugs before external clients do. Also keeps things simpler for both external and internal communications - due to same-ness. 


>From Supun’s email, I see…


But we find it difficult to use RPC calls and achieve stateless behavior and thinking of replacing Thrift services with ZooKeeper based communication layer.


Curious to understand the difficulty at a deeper level. As Eran may have mentioned we have a micro services architecture with services depending on each other in a stateless manner (we’re up to 3 depths of dependencies now). We haven’t run into any difficulties. Hence, my curiosity to understand if you’ve learnt something we haven’t hit yet. Please clarify with some examples.


On ZK use, I don’t understand what it means to use ZK as a communication mechanism. As Eran pointed out, I’ve heard of ZK recipes for implementing various distributed consensus and coordination scenarios (Netlfix implemented some of these ZK recipes via http://curator.apache.org ). So, yes, a concrete example would be helpful to understand this option.


Thanks,


—Jijoe



> On Jun 13, 2014, at 9:24 PM, Eran Chinthaka Withana <er...@gmail.com> wrote:
> 
> 
> Hi,
> 
> 
> First, it was very surprising for me to hear that ZK was used as the
> communication mechanism. The closest I have heard is recipes:
> http://zookeeper.apache.org/doc/trunk/recipes.html. Also, since thrift
> services forces us to deploy stateless services the load balancing and
> scaling is pretty easy. Its just a matter of bringing up new instances. Can
> you please explain one use case where you use zookeeper as the
> communication mechanism?
> 
> 
> Anyway, *Suresh*, why do you want to pick something else for internal
> communication? Why can't we use thrift even for that? At some point, you
> won't be able to define whats internal and external. May be I'm not
> understanding the requirement/usecases here. Can you please elaborate with
> an example?
> 
> 
> Thanks,
> Eran Chinthaka Withana
> 
> 
> 
> 
> On Fri, Jun 13, 2014 at 4:43 AM, Suresh Marru <sm...@apache.org> wrote:
> 
> 
>> Thanks Supun these are great thoughts and is interesting to know how you
>> are using zookeeper for state management and communication. Since we are
>> diverging the discussion into a different direction (this is good important
>> topic though), let me start a new thread and continue to discuss zookeeper
>> in other thread.
>> 
>> 
>> Thrift is serving well (atleast for now, barring some workable
>> limitations) for Airavata client facing API. Its helping with the
>> complexity in the data model and polygot clients. For justified reasons, we
>> are slowly breaking down and expanding the services within Airavata leading
>> to a micro-service architectures, may be towards more reactive architecture.
>> 
>> 
>> What will be the good internal communication mechanism? The architectures
>> I have come across have used zookeeper for mainly load balancing,
>> distributed configuration management, change notifications and so forth.
>> What are the alternatives?
>> 
>> 
>> Since Eran, Patanachani, Jijoe and Samir have gone through similar
>> iteration, it will be great to hear their opinions.
>> 
>> 
>> Suresh
>> 
>> 
>> On Jun 12, 2014, at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
>> wrote:
>> 
>> 
>>> Hi all,
>>> 
>>> 
>>> Here is what I think about Airavata and ZooKeeper. In Airavata there are
>>> 
>>> 
>>> many components and these components must be stateless to achieve
>> scalability and reliability.Also there must be a mechanism to communicate
>> between the components. At the moment Airavata uses RPC calls based on
>> Thrift for the communication.
>> 
>> 
>>> 
>>> 
>>> ZooKeeper can be used both as a place to hold state and as a
>>> 
>>> 
>>> communication layer between the components. I'm involved with a project
>> that has many distributed components like AIravata. Right now we use Thrift
>> services to communicate among the components. But we find it difficult to
>> use RPC calls and achieve stateless behaviour and thinking of replacing
>> Thrift services with ZooKeeper based communication layer. So I think it is
>> better to explore the possibility of removing the Thrift services between
>> the components and use ZooKeeper as a communication mechanism between the
>> services. If you do this you will have to move the state to ZooKeeper and
>> will automatically achieve the stateless behaviour in the components.
>> 
>> 
>>> 
>>> 
>>> Also I think trying to make ZooKeeper optional is a bad idea. If we are
>>> 
>>> 
>>> trying to integrate something fundamentally important to architecture as
>> how to store state, we shouldn't make it optional.
>> 
>> 
>>> 
>>> 
>>> Thanks,
>>> Supun..
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
>>> 
>>> 
>>> shameerainfo@gmail.com> wrote:
>> 
>> 
>>> Hi Lahiru,
>>> 
>>> 
>>> As i understood, not only reliability , you are trying to achieve some
>>> 
>>> 
>>> other requirement by introducing zookeeper, like health monitoring of the
>> services, categorization with service implementation etc ... . In that
>> case, i think we can get use of zookeeper's features but if we only focus
>> on reliability, i have little bit of concern, why can't we use clustering +
>> LB ?
>> 
>> 
>>> 
>>> 
>>> Yes it is better we add Zookeeper as a prerequisite if user need to use
>>> 
>>> 
>>> it.
>> 
>> 
>>> 
>>> 
>>> Thanks,
>>> Shameera.
>>> 
>>> 
>>> 
>>> 
>>> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
>>> 
>>> 
>>> wrote:
>> 
>> 
>>> Hi Gagan,
>>> 
>>> 
>>> I need to start another discussion about it, but I had an offline
>>> discussion with Suresh about auto-scaling. I will start another thread
>>> about this topic too.
>>> 
>>> 
>>> Regards
>>> Lahiru
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <gagandeepjuneja@gmail.com
>>> 
>>> 
>>> wrote:
>>> 
>>> 
>>>> Thanks Lahiru for pointing to nice library, added to my dictionary :).
>>>> 
>>>> 
>>>> I would like to know how are we planning to start multiple servers.
>>>> 1. Spawning new servers based on load? Some times we call it as auto
>>>> scalable.
>>>> 2. To make some specific number of nodes available such as we want 2
>>>> servers to be available at any time so if one goes down then I need to
>>>> spawn one new to make available servers count 2.
>>>> 3. Initially start all the servers.
>>>> 
>>>> 
>>>> In scenario 1 and 2 zookeeper does make sense but I don't believe
>>>> 
>>>> 
>>>> existing
>> 
>> 
>>>> architecture support this?
>>>> 
>>>> 
>>>> Regards,
>>>> Gagan
>>>> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
>>>> 
>>>> 
>>>> wrote:
>> 
>> 
>>>> 
>>>> 
>>>>> Hi Gagan,
>>>>> 
>>>>> 
>>>>> Thanks for your response. Please see my inline comments.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>>>>> 
>>>>> 
>>>>> gagandeepjuneja@gmail.com>
>> 
>> 
>>>>> wrote:
>>>>> 
>>>>> 
>>>>>> Hi Lahiru,
>>>>>> Just my 2 cents.
>>>>>> 
>>>>>> 
>>>>>> I am big fan of zookeeper but also against adding multiple hops in
>>>>>> 
>>>>>> 
>>>>>> the
>> 
>> 
>>>>>> system which can add unnecessary complexity. Here I am not able to
>>>>>> understand the requirement of zookeeper may be I am wrong because of
>>>>>> 
>>>>>> 
>>>>>> less
>> 
>> 
>>>>>> knowledge of the airavata system in whole. So I would like to discuss
>>>>>> following point.
>>>>>> 
>>>>>> 
>>>>>> 1. How it will help us in making system more reliable. Zookeeper is
>>>>>> 
>>>>>> 
>>>>>> not
>> 
>> 
>>>>>> able to restart services. At max it can tell whether service is up
>>>>>> 
>>>>>> 
>>>>>> or not
>> 
>> 
>>>>>> which could only be the case if airavata service goes down
>>>>>> 
>>>>>> 
>>>>>> gracefully and
>> 
>> 
>>>>>> we have any automated way to restart it. If this is just matter of
>>>>>> 
>>>>>> 
>>>>>> routing
>> 
>> 
>>>>>> client requests to the available thrift servers then this can be
>>>>>> 
>>>>>> 
>>>>>> achieved
>> 
>> 
>>>>>> with the help of load balancer which I guess is already there in
>>>>>> 
>>>>>> 
>>>>>> thrift
>> 
>> 
>>>>>> wish list.
>>>>>> 
>>>>>> 
>>>>>> We have multiple thrift services and currently we start only one
>>>>> 
>>>>> 
>>>>> instance
>> 
>> 
>>>>> of them and each thrift service is a stateless service. To keep the
>>>>> 
>>>>> 
>>>>> high
>> 
>> 
>>>>> availability we have to start multiple instances of them in production
>>>>> scenario. So for clients to get an available thrift service we can use
>>>>> zookeeper znodes to represent each available service. There are some
>>>>> libraries which is doing similar[1] and I think we can use them
>>>>> 
>>>>> 
>>>>> directly.
>> 
>> 
>>>>> 
>>>>> 
>>>>>> 2. As far as registering of different providers is concerned do you
>>>>>> think for that we really need external store.
>>>>>> 
>>>>>> 
>>>>>> Yes I think so, because its light weight and reliable and we have to
>>>>> 
>>>>> 
>>>>> do
>> 
>> 
>>>>> very minimal amount of work to achieve all these features to Airavata
>>>>> because zookeeper handle all the complexity.
>>>>> 
>>>>> 
>>>>>> I have seen people using zookeeper more for state management in
>>>>>> distributed environments.
>>>>>> 
>>>>>> 
>>>>>> +1, we might not be the most effective users of zookeeper because all
>>>>> 
>>>>> 
>>>>> of
>> 
>> 
>>>>> our services are stateless services, but my point is to achieve
>>>>> fault-tolerance we can use zookeeper and with minimal work.
>>>>> 
>>>>> 
>>>>>> I would like to understand more how can we leverage zookeeper in
>>>>>> airavata to make system reliable.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> [1]https://github.com/eirslett/thrift-zookeeper
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> Regards,
>>>>>> Gagan
>>>>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>>>>> 
>>>>>> 
>>>>>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>>>>>> additional comments.
>>>>>>> 
>>>>>>> 
>>>>>>> Marlon
>>>>>>> 
>>>>>>> 
>>>>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> Hi All,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I did little research about Apache Zookeeper[1] and how to use it
>>>>>>>> 
>>>>>>>> 
>>>>>>>> in
>> 
>> 
>>>>>>>> airavata. Its really a nice way to achieve fault tolerance and
>>>>>>>> 
>>>>>>>> 
>>>>>>>> reliable
>>>>>>> 
>>>>>>> 
>>>>>>>> communication between our thrift services and clients. Zookeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>> is a
>> 
>> 
>>>>>>>> distributed, fault tolerant system to do a reliable communication
>>>>>>>> 
>>>>>>>> 
>>>>>>>> between
>>>>>>> 
>>>>>>> 
>>>>>>>> distributed applications. This is like an in-memory file system
>>>>>>>> 
>>>>>>>> 
>>>>>>>> which
>> 
>> 
>>>>>>> has
>>>>>>> 
>>>>>>> 
>>>>>>>> nodes in a tree structure and each node can have small amount of
>>>>>>>> 
>>>>>>>> 
>>>>>>>> data
>> 
>> 
>>>>>>>> associated with it and these nodes are called znodes. Clients can
>>>>>>>> 
>>>>>>>> 
>>>>>>>> connect
>>>>>>> 
>>>>>>> 
>>>>>>>> to a zookeeper server and add/delete and update these znodes.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> In Apache Airavata we start multiple thrift services and these
>>>>>>>> 
>>>>>>>> 
>>>>>>>> can
>> 
>> 
>>>>>>> go
>>>>>>> 
>>>>>>> 
>>>>>>>> down for maintenance or these can crash, if we use zookeeper to
>>>>>>>> 
>>>>>>>> 
>>>>>>>> store
>> 
>> 
>>>>>>> these
>>>>>>> 
>>>>>>> 
>>>>>>>> configuration(thrift service configurations) we can achieve a very
>>>>>>>> 
>>>>>>>> 
>>>>>>>> reliable
>>>>>>> 
>>>>>>> 
>>>>>>>> system. Basically thrift clients can dynamically discover
>>>>>>>> 
>>>>>>>> 
>>>>>>>> available
>> 
>> 
>>>>>>> service
>>>>>>> 
>>>>>>> 
>>>>>>>> by using ephemeral znodes(Here we do not have to change the
>>>>>>>> 
>>>>>>>> 
>>>>>>>> generated
>> 
>> 
>>>>>>>> thrift client code but we have to change the locations we are
>>>>>>>> 
>>>>>>>> 
>>>>>>>> invoking
>> 
>> 
>>>>>>>> them). ephemeral znodes will be removed when the thrift service
>>>>>>>> 
>>>>>>>> 
>>>>>>>> goes
>> 
>> 
>>>>>>> down
>>>>>>> 
>>>>>>> 
>>>>>>>> and zookeeper guarantee the atomicity between these operations.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> With
>> 
>> 
>>>>>>> this
>>>>>>> 
>>>>>>> 
>>>>>>>> approach we can have a node hierarchy for multiple of airavata,
>>>>>>>> orchestrator,appcatalog and gfac thrift services.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> For specifically for gfac we can have different types of services
>>>>>>>> 
>>>>>>>> 
>>>>>>>> for
>> 
>> 
>>>>>>> each
>>>>>>> 
>>>>>>> 
>>>>>>>> provider implementation. This can be achieved by using the
>>>>>>>> 
>>>>>>>> 
>>>>>>>> hierarchical
>>>>>>> 
>>>>>>> 
>>>>>>>> support in zookeeper and providing some logic in gfac-thrift
>>>>>>>> 
>>>>>>>> 
>>>>>>>> service
>> 
>> 
>>>>>>> to
>>>>>>> 
>>>>>>> 
>>>>>>>> register it to a defined path. Using the same logic orchestrator
>>>>>>>> 
>>>>>>>> 
>>>>>>>> can
>> 
>> 
>>>>>>>> discover the provider specific gfac thrift service and route the
>>>>>>>> 
>>>>>>>> 
>>>>>>>> message to
>>>>>>> 
>>>>>>> 
>>>>>>>> the correct thrift service.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> With this approach I think we simply have write some client code
>>>>>>>> 
>>>>>>>> 
>>>>>>>> in
>> 
>> 
>>>>>>> thrift
>>>>>>> 
>>>>>>> 
>>>>>>>> services and clients and zookeeper server installation can be
>>>>>>>> 
>>>>>>>> 
>>>>>>>> done as
>> 
>> 
>>>>>>> a
>>>>>>> 
>>>>>>> 
>>>>>>>> separate process and it will be easier to keep the Zookeeper
>>>>>>>> 
>>>>>>>> 
>>>>>>>> server
>> 
>> 
>>>>>>>> separate from Airavata because installation of Zookeeper server
>>>>>>>> 
>>>>>>>> 
>>>>>>>> little
>> 
>> 
>>>>>>>> complex in production scenario. I think we have to make sure
>>>>>>>> 
>>>>>>>> 
>>>>>>>> everything
>>>>>>> 
>>>>>>> 
>>>>>>>> works fine when there is no Zookeeper running, ex:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> enable.zookeeper=false
>>>>>>> 
>>>>>>> 
>>>>>>>> should works fine and users doesn't have to download and start
>>>>>>>> 
>>>>>>>> 
>>>>>>>> zookeeper.
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> [1]http://zookeeper.apache.org/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Lahiru
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> System Analyst Programmer
>>>>> PTI Lab
>>>>> Indiana University
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best Regards,
>>> Shameera Rathnayaka.
>>> 
>>> 
>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>> Blog : http://shameerarathnayaka.blogspot.com/
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Supun Kamburugamuva
>>> Member, Apache Software Foundation; http://www.apache.org
>>> E-mail: supun06@gmail.com; Mobile: +1 812 369 6762
>>> Blog: http://supunk.blogspot.com
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>>

Re: Internal communication in micro-service architectures

Posted by Eran Chinthaka Withana <er...@gmail.com>.

Hi,

First, it was very surprising for me to hear that ZK was used as the
communication mechanism. The closest I have heard is recipes:
http://zookeeper.apache.org/doc/trunk/recipes.html. Also, since thrift
services forces us to deploy stateless services the load balancing and
scaling is pretty easy. Its just a matter of bringing up new instances. Can
you please explain one use case where you use zookeeper as the
communication mechanism?

Anyway, *Suresh*, why do you want to pick something else for internal
communication? Why can't we use thrift even for that? At some point, you
won't be able to define whats internal and external. May be I'm not
understanding the requirement/usecases here. Can you please elaborate with
an example?

Thanks,
Eran Chinthaka Withana


On Fri, Jun 13, 2014 at 4:43 AM, Suresh Marru <sm...@apache.org> wrote:

> Thanks Supun these are great thoughts and is interesting to know how you
> are using zookeeper for state management and communication. Since we are
> diverging the discussion into a different direction (this is good important
> topic though), let me start a new thread and continue to discuss zookeeper
> in other thread.
>
> Thrift is serving well (atleast for now, barring some workable
> limitations) for Airavata client facing API. Its helping with the
> complexity in the data model and polygot clients. For justified reasons, we
> are slowly breaking down and expanding the services within Airavata leading
> to a micro-service architectures, may be towards more reactive architecture.
>
> What will be the good internal communication mechanism? The architectures
> I have come across have used zookeeper for mainly load balancing,
> distributed configuration management, change notifications and so forth.
> What are the alternatives?
>
> Since Eran, Patanachani, Jijoe and Samir have gone through similar
> iteration, it will be great to hear their opinions.
>
> Suresh
>
> On Jun 12, 2014, at 11:50 PM, Supun Kamburugamuva <su...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Here is what I think about Airavata and ZooKeeper. In Airavata there are
> many components and these components must be stateless to achieve
> scalability and reliability.Also there must be a mechanism to communicate
> between the components. At the moment Airavata uses RPC calls based on
> Thrift for the communication.
> >
> > ZooKeeper can be used both as a place to hold state and as a
> communication layer between the components. I'm involved with a project
> that has many distributed components like AIravata. Right now we use Thrift
> services to communicate among the components. But we find it difficult to
> use RPC calls and achieve stateless behaviour and thinking of replacing
> Thrift services with ZooKeeper based communication layer. So I think it is
> better to explore the possibility of removing the Thrift services between
> the components and use ZooKeeper as a communication mechanism between the
> services. If you do this you will have to move the state to ZooKeeper and
> will automatically achieve the stateless behaviour in the components.
> >
> > Also I think trying to make ZooKeeper optional is a bad idea. If we are
> trying to integrate something fundamentally important to architecture as
> how to store state, we shouldn't make it optional.
> >
> > Thanks,
> > Supun..
> >
> >
> > On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
> shameerainfo@gmail.com> wrote:
> > Hi Lahiru,
> >
> > As i understood,  not only reliability , you are trying to achieve some
> other requirement by introducing zookeeper, like health monitoring of the
> services, categorization with service implementation etc ... . In that
> case, i think we can get use of zookeeper's features but if we only focus
> on reliability, i have little bit of concern, why can't we use clustering +
> LB ?
> >
> > Yes it is better we add Zookeeper as a prerequisite if user need to use
> it.
> >
> > Thanks,
> > Shameera.
> >
> >
> > On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
> > Hi Gagan,
> >
> > I need to start another discussion about it, but I had an offline
> > discussion with Suresh about auto-scaling. I will start another thread
> > about this topic too.
> >
> > Regards
> > Lahiru
> >
> >
> > On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <gagandeepjuneja@gmail.com
> >
> > wrote:
> >
> > > Thanks Lahiru for pointing to nice library, added to my dictionary :).
> > >
> > > I would like to know how are we planning to start multiple servers.
> > > 1. Spawning new servers based on load? Some times we call it as auto
> > > scalable.
> > > 2. To make some specific number of nodes available such as we want 2
> > > servers to be available at any time so if one goes down then I need to
> > > spawn one new to make available servers count 2.
> > > 3. Initially start all the servers.
> > >
> > > In scenario 1 and 2 zookeeper does make sense but I don't believe
> existing
> > > architecture support this?
> > >
> > > Regards,
> > > Gagan
> > > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com>
> wrote:
> > >
> > >> Hi Gagan,
> > >>
> > >> Thanks for your response. Please see my inline comments.
> > >>
> > >>
> > >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com>
> > >> wrote:
> > >>
> > >>> Hi Lahiru,
> > >>> Just my 2 cents.
> > >>>
> > >>> I am big fan of zookeeper but also against adding multiple hops in
> the
> > >>> system which can add unnecessary complexity. Here I am not able to
> > >>> understand the requirement of zookeeper may be I am wrong because of
> less
> > >>> knowledge of the airavata system in whole. So I would like to discuss
> > >>> following point.
> > >>>
> > >>> 1. How it will help us in making system more reliable. Zookeeper is
> not
> > >>> able to restart services. At max it can tell whether service is up
> or not
> > >>> which could only be the case if airavata service goes down
> gracefully and
> > >>> we have any automated way to restart it. If this is just matter of
> routing
> > >>> client requests to the available thrift servers then this can be
> achieved
> > >>> with the help of load balancer which I guess is already there in
> thrift
> > >>> wish list.
> > >>>
> > >> We have multiple thrift services and currently we start only one
> instance
> > >> of them and each thrift service is a stateless service. To keep the
> high
> > >> availability we have to start multiple instances of them in production
> > >> scenario. So for clients to get an available thrift service we can use
> > >> zookeeper znodes to represent each available service. There are some
> > >> libraries which is doing similar[1] and I think we can use them
> directly.
> > >>
> > >>> 2. As far as registering of different providers is concerned do you
> > >>> think for that we really need external store.
> > >>>
> > >> Yes I think so, because its light weight and reliable and we have to
> do
> > >> very minimal amount of work to achieve all these features to Airavata
> > >> because zookeeper handle all the complexity.
> > >>
> > >>> I have seen people using zookeeper more for state management in
> > >>> distributed environments.
> > >>>
> > >> +1, we might not be the most effective users of zookeeper because all
> of
> > >> our services are stateless services, but my point is to achieve
> > >> fault-tolerance we can use zookeeper and with minimal work.
> > >>
> > >>>  I would like to understand more how can we leverage zookeeper in
> > >>> airavata to make system reliable.
> > >>>
> > >>>
> > >> [1]https://github.com/eirslett/thrift-zookeeper
> > >>
> > >>
> > >>
> > >>> Regards,
> > >>> Gagan
> > >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> > >>>
> > >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> > >>>> additional comments.
> > >>>>
> > >>>> Marlon
> > >>>>
> > >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > >>>> > Hi All,
> > >>>> >
> > >>>> > I did little research about Apache Zookeeper[1] and how to use it
> in
> > >>>> > airavata. Its really a nice way to achieve fault tolerance and
> > >>>> reliable
> > >>>> > communication between our thrift services and clients. Zookeeper
> is a
> > >>>> > distributed, fault tolerant system to do a reliable communication
> > >>>> between
> > >>>> > distributed applications. This is like an in-memory file system
> which
> > >>>> has
> > >>>> > nodes in a tree structure and each node can have small amount of
> data
> > >>>> > associated with it and these nodes are called znodes. Clients can
> > >>>> connect
> > >>>> > to a zookeeper server and add/delete and update these znodes.
> > >>>> >
> > >>>> >   In Apache Airavata we start multiple thrift services and these
> can
> > >>>> go
> > >>>> > down for maintenance or these can crash, if we use zookeeper to
> store
> > >>>> these
> > >>>> > configuration(thrift service configurations) we can achieve a very
> > >>>> reliable
> > >>>> > system. Basically thrift clients can dynamically discover
> available
> > >>>> service
> > >>>> > by using ephemeral znodes(Here we do not have to change the
> generated
> > >>>> > thrift client code but we have to change the locations we are
> invoking
> > >>>> > them). ephemeral znodes will be removed when the thrift service
> goes
> > >>>> down
> > >>>> > and zookeeper guarantee the atomicity between these operations.
> With
> > >>>> this
> > >>>> > approach we can have a node hierarchy for multiple of airavata,
> > >>>> > orchestrator,appcatalog and gfac thrift services.
> > >>>> >
> > >>>> > For specifically for gfac we can have different types of services
> for
> > >>>> each
> > >>>> > provider implementation. This can be achieved by using the
> > >>>> hierarchical
> > >>>> > support in zookeeper and providing some logic in gfac-thrift
> service
> > >>>> to
> > >>>> > register it to a defined path. Using the same logic orchestrator
> can
> > >>>> > discover the provider specific gfac thrift service and route the
> > >>>> message to
> > >>>> > the correct thrift service.
> > >>>> >
> > >>>> > With this approach I think we simply have write some client code
> in
> > >>>> thrift
> > >>>> > services and clients and zookeeper server installation can be
> done as
> > >>>> a
> > >>>> > separate process and it will be easier to keep the Zookeeper
> server
> > >>>> > separate from Airavata because installation of Zookeeper server
> little
> > >>>> > complex in production scenario. I think we have to make sure
> > >>>> everything
> > >>>> > works fine when there is no Zookeeper running, ex:
> > >>>> enable.zookeeper=false
> > >>>> > should works fine and users doesn't have to download and start
> > >>>> zookeeper.
> > >>>> >
> > >>>> >
> > >>>> >
> > >>>> > [1]http://zookeeper.apache.org/
> > >>>> >
> > >>>> > Thanks
> > >>>> > Lahiru
> > >>>>
> > >>>>
> > >>
> > >>
> > >> --
> > >> System Analyst Programmer
> > >> PTI Lab
> > >> Indiana University
> > >>
> > >
> >
> >
> > --
> > System Analyst Programmer
> > PTI Lab
> > Indiana University
> >
> >
> >
> > --
> > Best Regards,
> > Shameera Rathnayaka.
> >
> > email: shameera AT apache.org , shameerainfo AT gmail.com
> > Blog : http://shameerarathnayaka.blogspot.com/
> >
> >
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
> >
>
>

Internal communication in micro-service architectures

Posted by Suresh Marru <sm...@apache.org>.

Thanks Supun these are great thoughts and is interesting to know how you are using zookeeper for state management and communication. Since we are diverging the discussion into a different direction (this is good important topic though), let me start a new thread and continue to discuss zookeeper in other thread. 

Thrift is serving well (atleast for now, barring some workable limitations) for Airavata client facing API. Its helping with the complexity in the data model and polygot clients. For justified reasons, we are slowly breaking down and expanding the services within Airavata leading to a micro-service architectures, may be towards more reactive architecture.

What will be the good internal communication mechanism? The architectures I have come across have used zookeeper for mainly load balancing, distributed configuration management, change notifications and so forth. What are the alternatives? 

Since Eran, Patanachani, Jijoe and Samir have gone through similar iteration, it will be great to hear their opinions. 

Suresh

On Jun 12, 2014, at 11:50 PM, Supun Kamburugamuva <su...@gmail.com> wrote:

> Hi all,
> 
> Here is what I think about Airavata and ZooKeeper. In Airavata there are many components and these components must be stateless to achieve scalability and reliability.Also there must be a mechanism to communicate between the components. At the moment Airavata uses RPC calls based on Thrift for the communication. 
> 
> ZooKeeper can be used both as a place to hold state and as a communication layer between the components. I'm involved with a project that has many distributed components like AIravata. Right now we use Thrift services to communicate among the components. But we find it difficult to use RPC calls and achieve stateless behaviour and thinking of replacing Thrift services with ZooKeeper based communication layer. So I think it is better to explore the possibility of removing the Thrift services between the components and use ZooKeeper as a communication mechanism between the services. If you do this you will have to move the state to ZooKeeper and will automatically achieve the stateless behaviour in the components.
> 
> Also I think trying to make ZooKeeper optional is a bad idea. If we are trying to integrate something fundamentally important to architecture as how to store state, we shouldn't make it optional.
> 
> Thanks,
> Supun..
> 
> 
> On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <sh...@gmail.com> wrote:
> Hi Lahiru, 
> 
> As i understood,  not only reliability , you are trying to achieve some other requirement by introducing zookeeper, like health monitoring of the services, categorization with service implementation etc ... . In that case, i think we can get use of zookeeper's features but if we only focus on reliability, i have little bit of concern, why can't we use clustering + LB ?
> 
> Yes it is better we add Zookeeper as a prerequisite if user need to use it. 
> 
> Thanks, 
> Shameera.
> 
> 
> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com> wrote:
> Hi Gagan,
> 
> I need to start another discussion about it, but I had an offline
> discussion with Suresh about auto-scaling. I will start another thread
> about this topic too.
> 
> Regards
> Lahiru
> 
> 
> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
> wrote:
> 
> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
> >
> > I would like to know how are we planning to start multiple servers.
> > 1. Spawning new servers based on load? Some times we call it as auto
> > scalable.
> > 2. To make some specific number of nodes available such as we want 2
> > servers to be available at any time so if one goes down then I need to
> > spawn one new to make available servers count 2.
> > 3. Initially start all the servers.
> >
> > In scenario 1 and 2 zookeeper does make sense but I don't believe existing
> > architecture support this?
> >
> > Regards,
> > Gagan
> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
> >
> >> Hi Gagan,
> >>
> >> Thanks for your response. Please see my inline comments.
> >>
> >>
> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
> >> wrote:
> >>
> >>> Hi Lahiru,
> >>> Just my 2 cents.
> >>>
> >>> I am big fan of zookeeper but also against adding multiple hops in the
> >>> system which can add unnecessary complexity. Here I am not able to
> >>> understand the requirement of zookeeper may be I am wrong because of less
> >>> knowledge of the airavata system in whole. So I would like to discuss
> >>> following point.
> >>>
> >>> 1. How it will help us in making system more reliable. Zookeeper is not
> >>> able to restart services. At max it can tell whether service is up or not
> >>> which could only be the case if airavata service goes down gracefully and
> >>> we have any automated way to restart it. If this is just matter of routing
> >>> client requests to the available thrift servers then this can be achieved
> >>> with the help of load balancer which I guess is already there in thrift
> >>> wish list.
> >>>
> >> We have multiple thrift services and currently we start only one instance
> >> of them and each thrift service is a stateless service. To keep the high
> >> availability we have to start multiple instances of them in production
> >> scenario. So for clients to get an available thrift service we can use
> >> zookeeper znodes to represent each available service. There are some
> >> libraries which is doing similar[1] and I think we can use them directly.
> >>
> >>> 2. As far as registering of different providers is concerned do you
> >>> think for that we really need external store.
> >>>
> >> Yes I think so, because its light weight and reliable and we have to do
> >> very minimal amount of work to achieve all these features to Airavata
> >> because zookeeper handle all the complexity.
> >>
> >>> I have seen people using zookeeper more for state management in
> >>> distributed environments.
> >>>
> >> +1, we might not be the most effective users of zookeeper because all of
> >> our services are stateless services, but my point is to achieve
> >> fault-tolerance we can use zookeeper and with minimal work.
> >>
> >>>  I would like to understand more how can we leverage zookeeper in
> >>> airavata to make system reliable.
> >>>
> >>>
> >> [1]https://github.com/eirslett/thrift-zookeeper
> >>
> >>
> >>
> >>> Regards,
> >>> Gagan
> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> >>>
> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> >>>> additional comments.
> >>>>
> >>>> Marlon
> >>>>
> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>>> > Hi All,
> >>>> >
> >>>> > I did little research about Apache Zookeeper[1] and how to use it in
> >>>> > airavata. Its really a nice way to achieve fault tolerance and
> >>>> reliable
> >>>> > communication between our thrift services and clients. Zookeeper is a
> >>>> > distributed, fault tolerant system to do a reliable communication
> >>>> between
> >>>> > distributed applications. This is like an in-memory file system which
> >>>> has
> >>>> > nodes in a tree structure and each node can have small amount of data
> >>>> > associated with it and these nodes are called znodes. Clients can
> >>>> connect
> >>>> > to a zookeeper server and add/delete and update these znodes.
> >>>> >
> >>>> >   In Apache Airavata we start multiple thrift services and these can
> >>>> go
> >>>> > down for maintenance or these can crash, if we use zookeeper to store
> >>>> these
> >>>> > configuration(thrift service configurations) we can achieve a very
> >>>> reliable
> >>>> > system. Basically thrift clients can dynamically discover available
> >>>> service
> >>>> > by using ephemeral znodes(Here we do not have to change the generated
> >>>> > thrift client code but we have to change the locations we are invoking
> >>>> > them). ephemeral znodes will be removed when the thrift service goes
> >>>> down
> >>>> > and zookeeper guarantee the atomicity between these operations. With
> >>>> this
> >>>> > approach we can have a node hierarchy for multiple of airavata,
> >>>> > orchestrator,appcatalog and gfac thrift services.
> >>>> >
> >>>> > For specifically for gfac we can have different types of services for
> >>>> each
> >>>> > provider implementation. This can be achieved by using the
> >>>> hierarchical
> >>>> > support in zookeeper and providing some logic in gfac-thrift service
> >>>> to
> >>>> > register it to a defined path. Using the same logic orchestrator can
> >>>> > discover the provider specific gfac thrift service and route the
> >>>> message to
> >>>> > the correct thrift service.
> >>>> >
> >>>> > With this approach I think we simply have write some client code in
> >>>> thrift
> >>>> > services and clients and zookeeper server installation can be done as
> >>>> a
> >>>> > separate process and it will be easier to keep the Zookeeper server
> >>>> > separate from Airavata because installation of Zookeeper server little
> >>>> > complex in production scenario. I think we have to make sure
> >>>> everything
> >>>> > works fine when there is no Zookeeper running, ex:
> >>>> enable.zookeeper=false
> >>>> > should works fine and users doesn't have to download and start
> >>>> zookeeper.
> >>>> >
> >>>> >
> >>>> >
> >>>> > [1]http://zookeeper.apache.org/
> >>>> >
> >>>> > Thanks
> >>>> > Lahiru
> >>>>
> >>>>
> >>
> >>
> >> --
> >> System Analyst Programmer
> >> PTI Lab
> >> Indiana University
> >>
> >
> 
> 
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
> 
> 
> 
> -- 
> Best Regards,
> Shameera Rathnayaka.
> 
> email: shameera AT apache.org , shameerainfo AT gmail.com
> Blog : http://shameerarathnayaka.blogspot.com/
> 
> 
> 
> -- 
> Supun Kamburugamuva
> Member, Apache Software Foundation; http://www.apache.org
> E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
> Blog: http://supunk.blogspot.com
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi all,

Here is what I think about Airavata and ZooKeeper. In Airavata there are
many components and these components must be stateless to achieve
scalability and reliability.Also there must be a mechanism to communicate
between the components. At the moment Airavata uses RPC calls based on
Thrift for the communication.

ZooKeeper can be used both as a place to hold state and as a communication
layer between the components. I'm involved with a project that has many
distributed components like AIravata. Right now we use Thrift services to
communicate among the components. But we find it difficult to use RPC calls
and achieve stateless behaviour and thinking of replacing Thrift services
with ZooKeeper based communication layer. So I think it is better to
explore the possibility of removing the Thrift services between the
components and use ZooKeeper as a communication mechanism between the
services. If you do this you will have to move the state to ZooKeeper and
will automatically achieve the stateless behaviour in the components.

Also I think trying to make ZooKeeper optional is a bad idea. If we are
trying to integrate something fundamentally important to architecture as
how to store state, we shouldn't make it optional.

Thanks,
Supun..


On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
shameerainfo@gmail.com> wrote:

> Hi Lahiru,
>
> As i understood,  not only reliability , you are trying to achieve some
> other requirement by introducing zookeeper, like health monitoring of the
> services, categorization with service implementation etc ... . In that
> case, i think we can get use of zookeeper's features but if we only focus
> on reliability, i have little bit of concern, why can't we use clustering +
> LB ?
>
> Yes it is better we add Zookeeper as a prerequisite if user need to use
> it.
>
> Thanks,
>  Shameera.
>
>
> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
>> Hi Gagan,
>>
>> I need to start another discussion about it, but I had an offline
>> discussion with Suresh about auto-scaling. I will start another thread
>> about this topic too.
>>
>> Regards
>> Lahiru
>>
>>
>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
>> wrote:
>>
>> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
>> >
>> > I would like to know how are we planning to start multiple servers.
>> > 1. Spawning new servers based on load? Some times we call it as auto
>> > scalable.
>> > 2. To make some specific number of nodes available such as we want 2
>> > servers to be available at any time so if one goes down then I need to
>> > spawn one new to make available servers count 2.
>> > 3. Initially start all the servers.
>> >
>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>> existing
>> > architecture support this?
>> >
>> > Regards,
>> > Gagan
>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
>> >
>> >> Hi Gagan,
>> >>
>> >> Thanks for your response. Please see my inline comments.
>> >>
>> >>
>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> gagandeepjuneja@gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Lahiru,
>> >>> Just my 2 cents.
>> >>>
>> >>> I am big fan of zookeeper but also against adding multiple hops in the
>> >>> system which can add unnecessary complexity. Here I am not able to
>> >>> understand the requirement of zookeeper may be I am wrong because of
>> less
>> >>> knowledge of the airavata system in whole. So I would like to discuss
>> >>> following point.
>> >>>
>> >>> 1. How it will help us in making system more reliable. Zookeeper is
>> not
>> >>> able to restart services. At max it can tell whether service is up or
>> not
>> >>> which could only be the case if airavata service goes down gracefully
>> and
>> >>> we have any automated way to restart it. If this is just matter of
>> routing
>> >>> client requests to the available thrift servers then this can be
>> achieved
>> >>> with the help of load balancer which I guess is already there in
>> thrift
>> >>> wish list.
>> >>>
>> >> We have multiple thrift services and currently we start only one
>> instance
>> >> of them and each thrift service is a stateless service. To keep the
>> high
>> >> availability we have to start multiple instances of them in production
>> >> scenario. So for clients to get an available thrift service we can use
>> >> zookeeper znodes to represent each available service. There are some
>> >> libraries which is doing similar[1] and I think we can use them
>> directly.
>> >>
>> >>> 2. As far as registering of different providers is concerned do you
>> >>> think for that we really need external store.
>> >>>
>> >> Yes I think so, because its light weight and reliable and we have to do
>> >> very minimal amount of work to achieve all these features to Airavata
>> >> because zookeeper handle all the complexity.
>> >>
>> >>> I have seen people using zookeeper more for state management in
>> >>> distributed environments.
>> >>>
>> >> +1, we might not be the most effective users of zookeeper because all
>> of
>> >> our services are stateless services, but my point is to achieve
>> >> fault-tolerance we can use zookeeper and with minimal work.
>> >>
>> >>>  I would like to understand more how can we leverage zookeeper in
>> >>> airavata to make system reliable.
>> >>>
>> >>>
>> >> [1]https://github.com/eirslett/thrift-zookeeper
>> >>
>> >>
>> >>
>> >>> Regards,
>> >>> Gagan
>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>> >>>
>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>> >>>> additional comments.
>> >>>>
>> >>>> Marlon
>> >>>>
>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> >>>> > Hi All,
>> >>>> >
>> >>>> > I did little research about Apache Zookeeper[1] and how to use it
>> in
>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>> >>>> reliable
>> >>>> > communication between our thrift services and clients. Zookeeper
>> is a
>> >>>> > distributed, fault tolerant system to do a reliable communication
>> >>>> between
>> >>>> > distributed applications. This is like an in-memory file system
>> which
>> >>>> has
>> >>>> > nodes in a tree structure and each node can have small amount of
>> data
>> >>>> > associated with it and these nodes are called znodes. Clients can
>> >>>> connect
>> >>>> > to a zookeeper server and add/delete and update these znodes.
>> >>>> >
>> >>>> >   In Apache Airavata we start multiple thrift services and these
>> can
>> >>>> go
>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>> store
>> >>>> these
>> >>>> > configuration(thrift service configurations) we can achieve a very
>> >>>> reliable
>> >>>> > system. Basically thrift clients can dynamically discover available
>> >>>> service
>> >>>> > by using ephemeral znodes(Here we do not have to change the
>> generated
>> >>>> > thrift client code but we have to change the locations we are
>> invoking
>> >>>> > them). ephemeral znodes will be removed when the thrift service
>> goes
>> >>>> down
>> >>>> > and zookeeper guarantee the atomicity between these operations.
>> With
>> >>>> this
>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>> >>>> > orchestrator,appcatalog and gfac thrift services.
>> >>>> >
>> >>>> > For specifically for gfac we can have different types of services
>> for
>> >>>> each
>> >>>> > provider implementation. This can be achieved by using the
>> >>>> hierarchical
>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>> service
>> >>>> to
>> >>>> > register it to a defined path. Using the same logic orchestrator
>> can
>> >>>> > discover the provider specific gfac thrift service and route the
>> >>>> message to
>> >>>> > the correct thrift service.
>> >>>> >
>> >>>> > With this approach I think we simply have write some client code in
>> >>>> thrift
>> >>>> > services and clients and zookeeper server installation can be done
>> as
>> >>>> a
>> >>>> > separate process and it will be easier to keep the Zookeeper server
>> >>>> > separate from Airavata because installation of Zookeeper server
>> little
>> >>>> > complex in production scenario. I think we have to make sure
>> >>>> everything
>> >>>> > works fine when there is no Zookeeper running, ex:
>> >>>> enable.zookeeper=false
>> >>>> > should works fine and users doesn't have to download and start
>> >>>> zookeeper.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > [1]http://zookeeper.apache.org/
>> >>>> >
>> >>>> > Thanks
>> >>>> > Lahiru
>> >>>>
>> >>>>
>> >>
>> >>
>> >> --
>> >> System Analyst Programmer
>> >> PTI Lab
>> >> Indiana University
>> >>
>> >
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>
>
> --
> Best Regards,
> Shameera Rathnayaka.
>
> email: shameera AT apache.org , shameerainfo AT gmail.com
> Blog : http://shameerarathnayaka.blogspot.com/
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi all,

Here is what I think about Airavata and ZooKeeper. In Airavata there are
many components and these components must be stateless to achieve
scalability and reliability.Also there must be a mechanism to communicate
between the components. At the moment Airavata uses RPC calls based on
Thrift for the communication.

ZooKeeper can be used both as a place to hold state and as a communication
layer between the components. I'm involved with a project that has many
distributed components like AIravata. Right now we use Thrift services to
communicate among the components. But we find it difficult to use RPC calls
and achieve stateless behaviour and thinking of replacing Thrift services
with ZooKeeper based communication layer. So I think it is better to
explore the possibility of removing the Thrift services between the
components and use ZooKeeper as a communication mechanism between the
services. If you do this you will have to move the state to ZooKeeper and
will automatically achieve the stateless behaviour in the components.

Also I think trying to make ZooKeeper optional is a bad idea. If we are
trying to integrate something fundamentally important to architecture as
how to store state, we shouldn't make it optional.

Thanks,
Supun..


On Thu, Jun 12, 2014 at 10:57 PM, Shameera Rathnayaka <
shameerainfo@gmail.com> wrote:

> Hi Lahiru,
>
> As i understood,  not only reliability , you are trying to achieve some
> other requirement by introducing zookeeper, like health monitoring of the
> services, categorization with service implementation etc ... . In that
> case, i think we can get use of zookeeper's features but if we only focus
> on reliability, i have little bit of concern, why can't we use clustering +
> LB ?
>
> Yes it is better we add Zookeeper as a prerequisite if user need to use
> it.
>
> Thanks,
>  Shameera.
>
>
> On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
> wrote:
>
>> Hi Gagan,
>>
>> I need to start another discussion about it, but I had an offline
>> discussion with Suresh about auto-scaling. I will start another thread
>> about this topic too.
>>
>> Regards
>> Lahiru
>>
>>
>> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
>> wrote:
>>
>> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
>> >
>> > I would like to know how are we planning to start multiple servers.
>> > 1. Spawning new servers based on load? Some times we call it as auto
>> > scalable.
>> > 2. To make some specific number of nodes available such as we want 2
>> > servers to be available at any time so if one goes down then I need to
>> > spawn one new to make available servers count 2.
>> > 3. Initially start all the servers.
>> >
>> > In scenario 1 and 2 zookeeper does make sense but I don't believe
>> existing
>> > architecture support this?
>> >
>> > Regards,
>> > Gagan
>> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
>> >
>> >> Hi Gagan,
>> >>
>> >> Thanks for your response. Please see my inline comments.
>> >>
>> >>
>> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
>> gagandeepjuneja@gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Lahiru,
>> >>> Just my 2 cents.
>> >>>
>> >>> I am big fan of zookeeper but also against adding multiple hops in the
>> >>> system which can add unnecessary complexity. Here I am not able to
>> >>> understand the requirement of zookeeper may be I am wrong because of
>> less
>> >>> knowledge of the airavata system in whole. So I would like to discuss
>> >>> following point.
>> >>>
>> >>> 1. How it will help us in making system more reliable. Zookeeper is
>> not
>> >>> able to restart services. At max it can tell whether service is up or
>> not
>> >>> which could only be the case if airavata service goes down gracefully
>> and
>> >>> we have any automated way to restart it. If this is just matter of
>> routing
>> >>> client requests to the available thrift servers then this can be
>> achieved
>> >>> with the help of load balancer which I guess is already there in
>> thrift
>> >>> wish list.
>> >>>
>> >> We have multiple thrift services and currently we start only one
>> instance
>> >> of them and each thrift service is a stateless service. To keep the
>> high
>> >> availability we have to start multiple instances of them in production
>> >> scenario. So for clients to get an available thrift service we can use
>> >> zookeeper znodes to represent each available service. There are some
>> >> libraries which is doing similar[1] and I think we can use them
>> directly.
>> >>
>> >>> 2. As far as registering of different providers is concerned do you
>> >>> think for that we really need external store.
>> >>>
>> >> Yes I think so, because its light weight and reliable and we have to do
>> >> very minimal amount of work to achieve all these features to Airavata
>> >> because zookeeper handle all the complexity.
>> >>
>> >>> I have seen people using zookeeper more for state management in
>> >>> distributed environments.
>> >>>
>> >> +1, we might not be the most effective users of zookeeper because all
>> of
>> >> our services are stateless services, but my point is to achieve
>> >> fault-tolerance we can use zookeeper and with minimal work.
>> >>
>> >>>  I would like to understand more how can we leverage zookeeper in
>> >>> airavata to make system reliable.
>> >>>
>> >>>
>> >> [1]https://github.com/eirslett/thrift-zookeeper
>> >>
>> >>
>> >>
>> >>> Regards,
>> >>> Gagan
>> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>> >>>
>> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>> >>>> additional comments.
>> >>>>
>> >>>> Marlon
>> >>>>
>> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> >>>> > Hi All,
>> >>>> >
>> >>>> > I did little research about Apache Zookeeper[1] and how to use it
>> in
>> >>>> > airavata. Its really a nice way to achieve fault tolerance and
>> >>>> reliable
>> >>>> > communication between our thrift services and clients. Zookeeper
>> is a
>> >>>> > distributed, fault tolerant system to do a reliable communication
>> >>>> between
>> >>>> > distributed applications. This is like an in-memory file system
>> which
>> >>>> has
>> >>>> > nodes in a tree structure and each node can have small amount of
>> data
>> >>>> > associated with it and these nodes are called znodes. Clients can
>> >>>> connect
>> >>>> > to a zookeeper server and add/delete and update these znodes.
>> >>>> >
>> >>>> >   In Apache Airavata we start multiple thrift services and these
>> can
>> >>>> go
>> >>>> > down for maintenance or these can crash, if we use zookeeper to
>> store
>> >>>> these
>> >>>> > configuration(thrift service configurations) we can achieve a very
>> >>>> reliable
>> >>>> > system. Basically thrift clients can dynamically discover available
>> >>>> service
>> >>>> > by using ephemeral znodes(Here we do not have to change the
>> generated
>> >>>> > thrift client code but we have to change the locations we are
>> invoking
>> >>>> > them). ephemeral znodes will be removed when the thrift service
>> goes
>> >>>> down
>> >>>> > and zookeeper guarantee the atomicity between these operations.
>> With
>> >>>> this
>> >>>> > approach we can have a node hierarchy for multiple of airavata,
>> >>>> > orchestrator,appcatalog and gfac thrift services.
>> >>>> >
>> >>>> > For specifically for gfac we can have different types of services
>> for
>> >>>> each
>> >>>> > provider implementation. This can be achieved by using the
>> >>>> hierarchical
>> >>>> > support in zookeeper and providing some logic in gfac-thrift
>> service
>> >>>> to
>> >>>> > register it to a defined path. Using the same logic orchestrator
>> can
>> >>>> > discover the provider specific gfac thrift service and route the
>> >>>> message to
>> >>>> > the correct thrift service.
>> >>>> >
>> >>>> > With this approach I think we simply have write some client code in
>> >>>> thrift
>> >>>> > services and clients and zookeeper server installation can be done
>> as
>> >>>> a
>> >>>> > separate process and it will be easier to keep the Zookeeper server
>> >>>> > separate from Airavata because installation of Zookeeper server
>> little
>> >>>> > complex in production scenario. I think we have to make sure
>> >>>> everything
>> >>>> > works fine when there is no Zookeeper running, ex:
>> >>>> enable.zookeeper=false
>> >>>> > should works fine and users doesn't have to download and start
>> >>>> zookeeper.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > [1]http://zookeeper.apache.org/
>> >>>> >
>> >>>> > Thanks
>> >>>> > Lahiru
>> >>>>
>> >>>>
>> >>
>> >>
>> >> --
>> >> System Analyst Programmer
>> >> PTI Lab
>> >> Indiana University
>> >>
>> >
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>
>
> --
> Best Regards,
> Shameera Rathnayaka.
>
> email: shameera AT apache.org , shameerainfo AT gmail.com
> Blog : http://shameerarathnayaka.blogspot.com/
>



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: Zookeeper in Airavata to achieve reliability

Posted by Shameera Rathnayaka <sh...@gmail.com>.

Hi Lahiru,

As i understood,  not only reliability , you are trying to achieve some
other requirement by introducing zookeeper, like health monitoring of the
services, categorization with service implementation etc ... . In that
case, i think we can get use of zookeeper's features but if we only focus
on reliability, i have little bit of concern, why can't we use clustering +
LB ?

Yes it is better we add Zookeeper as a prerequisite if user need to use it.

Thanks,
Shameera.


On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Gagan,
>
> I need to start another discussion about it, but I had an offline
> discussion with Suresh about auto-scaling. I will start another thread
> about this topic too.
>
> Regards
> Lahiru
>
>
> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
> wrote:
>
> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
> >
> > I would like to know how are we planning to start multiple servers.
> > 1. Spawning new servers based on load? Some times we call it as auto
> > scalable.
> > 2. To make some specific number of nodes available such as we want 2
> > servers to be available at any time so if one goes down then I need to
> > spawn one new to make available servers count 2.
> > 3. Initially start all the servers.
> >
> > In scenario 1 and 2 zookeeper does make sense but I don't believe
> existing
> > architecture support this?
> >
> > Regards,
> > Gagan
> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
> >
> >> Hi Gagan,
> >>
> >> Thanks for your response. Please see my inline comments.
> >>
> >>
> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com>
> >> wrote:
> >>
> >>> Hi Lahiru,
> >>> Just my 2 cents.
> >>>
> >>> I am big fan of zookeeper but also against adding multiple hops in the
> >>> system which can add unnecessary complexity. Here I am not able to
> >>> understand the requirement of zookeeper may be I am wrong because of
> less
> >>> knowledge of the airavata system in whole. So I would like to discuss
> >>> following point.
> >>>
> >>> 1. How it will help us in making system more reliable. Zookeeper is not
> >>> able to restart services. At max it can tell whether service is up or
> not
> >>> which could only be the case if airavata service goes down gracefully
> and
> >>> we have any automated way to restart it. If this is just matter of
> routing
> >>> client requests to the available thrift servers then this can be
> achieved
> >>> with the help of load balancer which I guess is already there in thrift
> >>> wish list.
> >>>
> >> We have multiple thrift services and currently we start only one
> instance
> >> of them and each thrift service is a stateless service. To keep the high
> >> availability we have to start multiple instances of them in production
> >> scenario. So for clients to get an available thrift service we can use
> >> zookeeper znodes to represent each available service. There are some
> >> libraries which is doing similar[1] and I think we can use them
> directly.
> >>
> >>> 2. As far as registering of different providers is concerned do you
> >>> think for that we really need external store.
> >>>
> >> Yes I think so, because its light weight and reliable and we have to do
> >> very minimal amount of work to achieve all these features to Airavata
> >> because zookeeper handle all the complexity.
> >>
> >>> I have seen people using zookeeper more for state management in
> >>> distributed environments.
> >>>
> >> +1, we might not be the most effective users of zookeeper because all of
> >> our services are stateless services, but my point is to achieve
> >> fault-tolerance we can use zookeeper and with minimal work.
> >>
> >>>  I would like to understand more how can we leverage zookeeper in
> >>> airavata to make system reliable.
> >>>
> >>>
> >> [1]https://github.com/eirslett/thrift-zookeeper
> >>
> >>
> >>
> >>> Regards,
> >>> Gagan
> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> >>>
> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> >>>> additional comments.
> >>>>
> >>>> Marlon
> >>>>
> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>>> > Hi All,
> >>>> >
> >>>> > I did little research about Apache Zookeeper[1] and how to use it in
> >>>> > airavata. Its really a nice way to achieve fault tolerance and
> >>>> reliable
> >>>> > communication between our thrift services and clients. Zookeeper is
> a
> >>>> > distributed, fault tolerant system to do a reliable communication
> >>>> between
> >>>> > distributed applications. This is like an in-memory file system
> which
> >>>> has
> >>>> > nodes in a tree structure and each node can have small amount of
> data
> >>>> > associated with it and these nodes are called znodes. Clients can
> >>>> connect
> >>>> > to a zookeeper server and add/delete and update these znodes.
> >>>> >
> >>>> >   In Apache Airavata we start multiple thrift services and these can
> >>>> go
> >>>> > down for maintenance or these can crash, if we use zookeeper to
> store
> >>>> these
> >>>> > configuration(thrift service configurations) we can achieve a very
> >>>> reliable
> >>>> > system. Basically thrift clients can dynamically discover available
> >>>> service
> >>>> > by using ephemeral znodes(Here we do not have to change the
> generated
> >>>> > thrift client code but we have to change the locations we are
> invoking
> >>>> > them). ephemeral znodes will be removed when the thrift service goes
> >>>> down
> >>>> > and zookeeper guarantee the atomicity between these operations. With
> >>>> this
> >>>> > approach we can have a node hierarchy for multiple of airavata,
> >>>> > orchestrator,appcatalog and gfac thrift services.
> >>>> >
> >>>> > For specifically for gfac we can have different types of services
> for
> >>>> each
> >>>> > provider implementation. This can be achieved by using the
> >>>> hierarchical
> >>>> > support in zookeeper and providing some logic in gfac-thrift service
> >>>> to
> >>>> > register it to a defined path. Using the same logic orchestrator can
> >>>> > discover the provider specific gfac thrift service and route the
> >>>> message to
> >>>> > the correct thrift service.
> >>>> >
> >>>> > With this approach I think we simply have write some client code in
> >>>> thrift
> >>>> > services and clients and zookeeper server installation can be done
> as
> >>>> a
> >>>> > separate process and it will be easier to keep the Zookeeper server
> >>>> > separate from Airavata because installation of Zookeeper server
> little
> >>>> > complex in production scenario. I think we have to make sure
> >>>> everything
> >>>> > works fine when there is no Zookeeper running, ex:
> >>>> enable.zookeeper=false
> >>>> > should works fine and users doesn't have to download and start
> >>>> zookeeper.
> >>>> >
> >>>> >
> >>>> >
> >>>> > [1]http://zookeeper.apache.org/
> >>>> >
> >>>> > Thanks
> >>>> > Lahiru
> >>>>
> >>>>
> >>
> >>
> >> --
> >> System Analyst Programmer
> >> PTI Lab
> >> Indiana University
> >>
> >
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/

Re: Zookeeper in Airavata to achieve reliability

Posted by Shameera Rathnayaka <sh...@gmail.com>.

Hi Lahiru,

As i understood,  not only reliability , you are trying to achieve some
other requirement by introducing zookeeper, like health monitoring of the
services, categorization with service implementation etc ... . In that
case, i think we can get use of zookeeper's features but if we only focus
on reliability, i have little bit of concern, why can't we use clustering +
LB ?

Yes it is better we add Zookeeper as a prerequisite if user need to use it.

Thanks,
Shameera.


On Thu, Jun 12, 2014 at 5:19 AM, Lahiru Gunathilake <gl...@gmail.com>
wrote:

> Hi Gagan,
>
> I need to start another discussion about it, but I had an offline
> discussion with Suresh about auto-scaling. I will start another thread
> about this topic too.
>
> Regards
> Lahiru
>
>
> On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
> wrote:
>
> > Thanks Lahiru for pointing to nice library, added to my dictionary :).
> >
> > I would like to know how are we planning to start multiple servers.
> > 1. Spawning new servers based on load? Some times we call it as auto
> > scalable.
> > 2. To make some specific number of nodes available such as we want 2
> > servers to be available at any time so if one goes down then I need to
> > spawn one new to make available servers count 2.
> > 3. Initially start all the servers.
> >
> > In scenario 1 and 2 zookeeper does make sense but I don't believe
> existing
> > architecture support this?
> >
> > Regards,
> > Gagan
> > On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
> >
> >> Hi Gagan,
> >>
> >> Thanks for your response. Please see my inline comments.
> >>
> >>
> >> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <
> gagandeepjuneja@gmail.com>
> >> wrote:
> >>
> >>> Hi Lahiru,
> >>> Just my 2 cents.
> >>>
> >>> I am big fan of zookeeper but also against adding multiple hops in the
> >>> system which can add unnecessary complexity. Here I am not able to
> >>> understand the requirement of zookeeper may be I am wrong because of
> less
> >>> knowledge of the airavata system in whole. So I would like to discuss
> >>> following point.
> >>>
> >>> 1. How it will help us in making system more reliable. Zookeeper is not
> >>> able to restart services. At max it can tell whether service is up or
> not
> >>> which could only be the case if airavata service goes down gracefully
> and
> >>> we have any automated way to restart it. If this is just matter of
> routing
> >>> client requests to the available thrift servers then this can be
> achieved
> >>> with the help of load balancer which I guess is already there in thrift
> >>> wish list.
> >>>
> >> We have multiple thrift services and currently we start only one
> instance
> >> of them and each thrift service is a stateless service. To keep the high
> >> availability we have to start multiple instances of them in production
> >> scenario. So for clients to get an available thrift service we can use
> >> zookeeper znodes to represent each available service. There are some
> >> libraries which is doing similar[1] and I think we can use them
> directly.
> >>
> >>> 2. As far as registering of different providers is concerned do you
> >>> think for that we really need external store.
> >>>
> >> Yes I think so, because its light weight and reliable and we have to do
> >> very minimal amount of work to achieve all these features to Airavata
> >> because zookeeper handle all the complexity.
> >>
> >>> I have seen people using zookeeper more for state management in
> >>> distributed environments.
> >>>
> >> +1, we might not be the most effective users of zookeeper because all of
> >> our services are stateless services, but my point is to achieve
> >> fault-tolerance we can use zookeeper and with minimal work.
> >>
> >>>  I would like to understand more how can we leverage zookeeper in
> >>> airavata to make system reliable.
> >>>
> >>>
> >> [1]https://github.com/eirslett/thrift-zookeeper
> >>
> >>
> >>
> >>> Regards,
> >>> Gagan
> >>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
> >>>
> >>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> >>>> additional comments.
> >>>>
> >>>> Marlon
> >>>>
> >>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> >>>> > Hi All,
> >>>> >
> >>>> > I did little research about Apache Zookeeper[1] and how to use it in
> >>>> > airavata. Its really a nice way to achieve fault tolerance and
> >>>> reliable
> >>>> > communication between our thrift services and clients. Zookeeper is
> a
> >>>> > distributed, fault tolerant system to do a reliable communication
> >>>> between
> >>>> > distributed applications. This is like an in-memory file system
> which
> >>>> has
> >>>> > nodes in a tree structure and each node can have small amount of
> data
> >>>> > associated with it and these nodes are called znodes. Clients can
> >>>> connect
> >>>> > to a zookeeper server and add/delete and update these znodes.
> >>>> >
> >>>> >   In Apache Airavata we start multiple thrift services and these can
> >>>> go
> >>>> > down for maintenance or these can crash, if we use zookeeper to
> store
> >>>> these
> >>>> > configuration(thrift service configurations) we can achieve a very
> >>>> reliable
> >>>> > system. Basically thrift clients can dynamically discover available
> >>>> service
> >>>> > by using ephemeral znodes(Here we do not have to change the
> generated
> >>>> > thrift client code but we have to change the locations we are
> invoking
> >>>> > them). ephemeral znodes will be removed when the thrift service goes
> >>>> down
> >>>> > and zookeeper guarantee the atomicity between these operations. With
> >>>> this
> >>>> > approach we can have a node hierarchy for multiple of airavata,
> >>>> > orchestrator,appcatalog and gfac thrift services.
> >>>> >
> >>>> > For specifically for gfac we can have different types of services
> for
> >>>> each
> >>>> > provider implementation. This can be achieved by using the
> >>>> hierarchical
> >>>> > support in zookeeper and providing some logic in gfac-thrift service
> >>>> to
> >>>> > register it to a defined path. Using the same logic orchestrator can
> >>>> > discover the provider specific gfac thrift service and route the
> >>>> message to
> >>>> > the correct thrift service.
> >>>> >
> >>>> > With this approach I think we simply have write some client code in
> >>>> thrift
> >>>> > services and clients and zookeeper server installation can be done
> as
> >>>> a
> >>>> > separate process and it will be easier to keep the Zookeeper server
> >>>> > separate from Airavata because installation of Zookeeper server
> little
> >>>> > complex in production scenario. I think we have to make sure
> >>>> everything
> >>>> > works fine when there is no Zookeeper running, ex:
> >>>> enable.zookeeper=false
> >>>> > should works fine and users doesn't have to download and start
> >>>> zookeeper.
> >>>> >
> >>>> >
> >>>> >
> >>>> > [1]http://zookeeper.apache.org/
> >>>> >
> >>>> > Thanks
> >>>> > Lahiru
> >>>>
> >>>>
> >>
> >>
> >> --
> >> System Analyst Programmer
> >> PTI Lab
> >> Indiana University
> >>
> >
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>



-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Gagan,

I need to start another discussion about it, but I had an offline
discussion with Suresh about auto-scaling. I will start another thread
about this topic too.

Regards
Lahiru


On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
wrote:

> Thanks Lahiru for pointing to nice library, added to my dictionary :).
>
> I would like to know how are we planning to start multiple servers.
> 1. Spawning new servers based on load? Some times we call it as auto
> scalable.
> 2. To make some specific number of nodes available such as we want 2
> servers to be available at any time so if one goes down then I need to
> spawn one new to make available servers count 2.
> 3. Initially start all the servers.
>
> In scenario 1 and 2 zookeeper does make sense but I don't believe existing
> architecture support this?
>
> Regards,
> Gagan
> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
>
>> Hi Gagan,
>>
>> Thanks for your response. Please see my inline comments.
>>
>>
>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
>> wrote:
>>
>>> Hi Lahiru,
>>> Just my 2 cents.
>>>
>>> I am big fan of zookeeper but also against adding multiple hops in the
>>> system which can add unnecessary complexity. Here I am not able to
>>> understand the requirement of zookeeper may be I am wrong because of less
>>> knowledge of the airavata system in whole. So I would like to discuss
>>> following point.
>>>
>>> 1. How it will help us in making system more reliable. Zookeeper is not
>>> able to restart services. At max it can tell whether service is up or not
>>> which could only be the case if airavata service goes down gracefully and
>>> we have any automated way to restart it. If this is just matter of routing
>>> client requests to the available thrift servers then this can be achieved
>>> with the help of load balancer which I guess is already there in thrift
>>> wish list.
>>>
>> We have multiple thrift services and currently we start only one instance
>> of them and each thrift service is a stateless service. To keep the high
>> availability we have to start multiple instances of them in production
>> scenario. So for clients to get an available thrift service we can use
>> zookeeper znodes to represent each available service. There are some
>> libraries which is doing similar[1] and I think we can use them directly.
>>
>>> 2. As far as registering of different providers is concerned do you
>>> think for that we really need external store.
>>>
>> Yes I think so, because its light weight and reliable and we have to do
>> very minimal amount of work to achieve all these features to Airavata
>> because zookeeper handle all the complexity.
>>
>>> I have seen people using zookeeper more for state management in
>>> distributed environments.
>>>
>> +1, we might not be the most effective users of zookeeper because all of
>> our services are stateless services, but my point is to achieve
>> fault-tolerance we can use zookeeper and with minimal work.
>>
>>>  I would like to understand more how can we leverage zookeeper in
>>> airavata to make system reliable.
>>>
>>>
>> [1]https://github.com/eirslett/thrift-zookeeper
>>
>>
>>
>>> Regards,
>>> Gagan
>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>>
>>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>>> additional comments.
>>>>
>>>> Marlon
>>>>
>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>> > Hi All,
>>>> >
>>>> > I did little research about Apache Zookeeper[1] and how to use it in
>>>> > airavata. Its really a nice way to achieve fault tolerance and
>>>> reliable
>>>> > communication between our thrift services and clients. Zookeeper is a
>>>> > distributed, fault tolerant system to do a reliable communication
>>>> between
>>>> > distributed applications. This is like an in-memory file system which
>>>> has
>>>> > nodes in a tree structure and each node can have small amount of data
>>>> > associated with it and these nodes are called znodes. Clients can
>>>> connect
>>>> > to a zookeeper server and add/delete and update these znodes.
>>>> >
>>>> >   In Apache Airavata we start multiple thrift services and these can
>>>> go
>>>> > down for maintenance or these can crash, if we use zookeeper to store
>>>> these
>>>> > configuration(thrift service configurations) we can achieve a very
>>>> reliable
>>>> > system. Basically thrift clients can dynamically discover available
>>>> service
>>>> > by using ephemeral znodes(Here we do not have to change the generated
>>>> > thrift client code but we have to change the locations we are invoking
>>>> > them). ephemeral znodes will be removed when the thrift service goes
>>>> down
>>>> > and zookeeper guarantee the atomicity between these operations. With
>>>> this
>>>> > approach we can have a node hierarchy for multiple of airavata,
>>>> > orchestrator,appcatalog and gfac thrift services.
>>>> >
>>>> > For specifically for gfac we can have different types of services for
>>>> each
>>>> > provider implementation. This can be achieved by using the
>>>> hierarchical
>>>> > support in zookeeper and providing some logic in gfac-thrift service
>>>> to
>>>> > register it to a defined path. Using the same logic orchestrator can
>>>> > discover the provider specific gfac thrift service and route the
>>>> message to
>>>> > the correct thrift service.
>>>> >
>>>> > With this approach I think we simply have write some client code in
>>>> thrift
>>>> > services and clients and zookeeper server installation can be done as
>>>> a
>>>> > separate process and it will be easier to keep the Zookeeper server
>>>> > separate from Airavata because installation of Zookeeper server little
>>>> > complex in production scenario. I think we have to make sure
>>>> everything
>>>> > works fine when there is no Zookeeper running, ex:
>>>> enable.zookeeper=false
>>>> > should works fine and users doesn't have to download and start
>>>> zookeeper.
>>>> >
>>>> >
>>>> >
>>>> > [1]http://zookeeper.apache.org/
>>>> >
>>>> > Thanks
>>>> > Lahiru
>>>>
>>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Gagan,

I need to start another discussion about it, but I had an offline
discussion with Suresh about auto-scaling. I will start another thread
about this topic too.

Regards
Lahiru


On Wed, Jun 11, 2014 at 4:10 PM, Gagan Juneja <ga...@gmail.com>
wrote:

> Thanks Lahiru for pointing to nice library, added to my dictionary :).
>
> I would like to know how are we planning to start multiple servers.
> 1. Spawning new servers based on load? Some times we call it as auto
> scalable.
> 2. To make some specific number of nodes available such as we want 2
> servers to be available at any time so if one goes down then I need to
> spawn one new to make available servers count 2.
> 3. Initially start all the servers.
>
> In scenario 1 and 2 zookeeper does make sense but I don't believe existing
> architecture support this?
>
> Regards,
> Gagan
> On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:
>
>> Hi Gagan,
>>
>> Thanks for your response. Please see my inline comments.
>>
>>
>> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
>> wrote:
>>
>>> Hi Lahiru,
>>> Just my 2 cents.
>>>
>>> I am big fan of zookeeper but also against adding multiple hops in the
>>> system which can add unnecessary complexity. Here I am not able to
>>> understand the requirement of zookeeper may be I am wrong because of less
>>> knowledge of the airavata system in whole. So I would like to discuss
>>> following point.
>>>
>>> 1. How it will help us in making system more reliable. Zookeeper is not
>>> able to restart services. At max it can tell whether service is up or not
>>> which could only be the case if airavata service goes down gracefully and
>>> we have any automated way to restart it. If this is just matter of routing
>>> client requests to the available thrift servers then this can be achieved
>>> with the help of load balancer which I guess is already there in thrift
>>> wish list.
>>>
>> We have multiple thrift services and currently we start only one instance
>> of them and each thrift service is a stateless service. To keep the high
>> availability we have to start multiple instances of them in production
>> scenario. So for clients to get an available thrift service we can use
>> zookeeper znodes to represent each available service. There are some
>> libraries which is doing similar[1] and I think we can use them directly.
>>
>>> 2. As far as registering of different providers is concerned do you
>>> think for that we really need external store.
>>>
>> Yes I think so, because its light weight and reliable and we have to do
>> very minimal amount of work to achieve all these features to Airavata
>> because zookeeper handle all the complexity.
>>
>>> I have seen people using zookeeper more for state management in
>>> distributed environments.
>>>
>> +1, we might not be the most effective users of zookeeper because all of
>> our services are stateless services, but my point is to achieve
>> fault-tolerance we can use zookeeper and with minimal work.
>>
>>>  I would like to understand more how can we leverage zookeeper in
>>> airavata to make system reliable.
>>>
>>>
>> [1]https://github.com/eirslett/thrift-zookeeper
>>
>>
>>
>>> Regards,
>>> Gagan
>>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>>
>>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>>> additional comments.
>>>>
>>>> Marlon
>>>>
>>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>>> > Hi All,
>>>> >
>>>> > I did little research about Apache Zookeeper[1] and how to use it in
>>>> > airavata. Its really a nice way to achieve fault tolerance and
>>>> reliable
>>>> > communication between our thrift services and clients. Zookeeper is a
>>>> > distributed, fault tolerant system to do a reliable communication
>>>> between
>>>> > distributed applications. This is like an in-memory file system which
>>>> has
>>>> > nodes in a tree structure and each node can have small amount of data
>>>> > associated with it and these nodes are called znodes. Clients can
>>>> connect
>>>> > to a zookeeper server and add/delete and update these znodes.
>>>> >
>>>> >   In Apache Airavata we start multiple thrift services and these can
>>>> go
>>>> > down for maintenance or these can crash, if we use zookeeper to store
>>>> these
>>>> > configuration(thrift service configurations) we can achieve a very
>>>> reliable
>>>> > system. Basically thrift clients can dynamically discover available
>>>> service
>>>> > by using ephemeral znodes(Here we do not have to change the generated
>>>> > thrift client code but we have to change the locations we are invoking
>>>> > them). ephemeral znodes will be removed when the thrift service goes
>>>> down
>>>> > and zookeeper guarantee the atomicity between these operations. With
>>>> this
>>>> > approach we can have a node hierarchy for multiple of airavata,
>>>> > orchestrator,appcatalog and gfac thrift services.
>>>> >
>>>> > For specifically for gfac we can have different types of services for
>>>> each
>>>> > provider implementation. This can be achieved by using the
>>>> hierarchical
>>>> > support in zookeeper and providing some logic in gfac-thrift service
>>>> to
>>>> > register it to a defined path. Using the same logic orchestrator can
>>>> > discover the provider specific gfac thrift service and route the
>>>> message to
>>>> > the correct thrift service.
>>>> >
>>>> > With this approach I think we simply have write some client code in
>>>> thrift
>>>> > services and clients and zookeeper server installation can be done as
>>>> a
>>>> > separate process and it will be easier to keep the Zookeeper server
>>>> > separate from Airavata because installation of Zookeeper server little
>>>> > complex in production scenario. I think we have to make sure
>>>> everything
>>>> > works fine when there is no Zookeeper running, ex:
>>>> enable.zookeeper=false
>>>> > should works fine and users doesn't have to download and start
>>>> zookeeper.
>>>> >
>>>> >
>>>> >
>>>> > [1]http://zookeeper.apache.org/
>>>> >
>>>> > Thanks
>>>> > Lahiru
>>>>
>>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Gagan Juneja <ga...@gmail.com>.

Thanks Lahiru for pointing to nice library, added to my dictionary :).

I would like to know how are we planning to start multiple servers.
1. Spawning new servers based on load? Some times we call it as auto
scalable.
2. To make some specific number of nodes available such as we want 2
servers to be available at any time so if one goes down then I need to
spawn one new to make available servers count 2.
3. Initially start all the servers.

In scenario 1 and 2 zookeeper does make sense but I don't believe existing
architecture support this?

Regards,
Gagan
On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:

> Hi Gagan,
>
> Thanks for your response. Please see my inline comments.
>
>
> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
> wrote:
>
>> Hi Lahiru,
>> Just my 2 cents.
>>
>> I am big fan of zookeeper but also against adding multiple hops in the
>> system which can add unnecessary complexity. Here I am not able to
>> understand the requirement of zookeeper may be I am wrong because of less
>> knowledge of the airavata system in whole. So I would like to discuss
>> following point.
>>
>> 1. How it will help us in making system more reliable. Zookeeper is not
>> able to restart services. At max it can tell whether service is up or not
>> which could only be the case if airavata service goes down gracefully and
>> we have any automated way to restart it. If this is just matter of routing
>> client requests to the available thrift servers then this can be achieved
>> with the help of load balancer which I guess is already there in thrift
>> wish list.
>>
> We have multiple thrift services and currently we start only one instance
> of them and each thrift service is a stateless service. To keep the high
> availability we have to start multiple instances of them in production
> scenario. So for clients to get an available thrift service we can use
> zookeeper znodes to represent each available service. There are some
> libraries which is doing similar[1] and I think we can use them directly.
>
>> 2. As far as registering of different providers is concerned do you think
>> for that we really need external store.
>>
> Yes I think so, because its light weight and reliable and we have to do
> very minimal amount of work to achieve all these features to Airavata
> because zookeeper handle all the complexity.
>
>> I have seen people using zookeeper more for state management in
>> distributed environments.
>>
> +1, we might not be the most effective users of zookeeper because all of
> our services are stateless services, but my point is to achieve
> fault-tolerance we can use zookeeper and with minimal work.
>
>>  I would like to understand more how can we leverage zookeeper in
>> airavata to make system reliable.
>>
>>
> [1]https://github.com/eirslett/thrift-zookeeper
>
>
>
>> Regards,
>> Gagan
>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>
>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>> additional comments.
>>>
>>> Marlon
>>>
>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>> > Hi All,
>>> >
>>> > I did little research about Apache Zookeeper[1] and how to use it in
>>> > airavata. Its really a nice way to achieve fault tolerance and reliable
>>> > communication between our thrift services and clients. Zookeeper is a
>>> > distributed, fault tolerant system to do a reliable communication
>>> between
>>> > distributed applications. This is like an in-memory file system which
>>> has
>>> > nodes in a tree structure and each node can have small amount of data
>>> > associated with it and these nodes are called znodes. Clients can
>>> connect
>>> > to a zookeeper server and add/delete and update these znodes.
>>> >
>>> >   In Apache Airavata we start multiple thrift services and these can go
>>> > down for maintenance or these can crash, if we use zookeeper to store
>>> these
>>> > configuration(thrift service configurations) we can achieve a very
>>> reliable
>>> > system. Basically thrift clients can dynamically discover available
>>> service
>>> > by using ephemeral znodes(Here we do not have to change the generated
>>> > thrift client code but we have to change the locations we are invoking
>>> > them). ephemeral znodes will be removed when the thrift service goes
>>> down
>>> > and zookeeper guarantee the atomicity between these operations. With
>>> this
>>> > approach we can have a node hierarchy for multiple of airavata,
>>> > orchestrator,appcatalog and gfac thrift services.
>>> >
>>> > For specifically for gfac we can have different types of services for
>>> each
>>> > provider implementation. This can be achieved by using the hierarchical
>>> > support in zookeeper and providing some logic in gfac-thrift service to
>>> > register it to a defined path. Using the same logic orchestrator can
>>> > discover the provider specific gfac thrift service and route the
>>> message to
>>> > the correct thrift service.
>>> >
>>> > With this approach I think we simply have write some client code in
>>> thrift
>>> > services and clients and zookeeper server installation can be done as a
>>> > separate process and it will be easier to keep the Zookeeper server
>>> > separate from Airavata because installation of Zookeeper server little
>>> > complex in production scenario. I think we have to make sure everything
>>> > works fine when there is no Zookeeper running, ex:
>>> enable.zookeeper=false
>>> > should works fine and users doesn't have to download and start
>>> zookeeper.
>>> >
>>> >
>>> >
>>> > [1]http://zookeeper.apache.org/
>>> >
>>> > Thanks
>>> > Lahiru
>>>
>>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Gagan Juneja <ga...@gmail.com>.

Thanks Lahiru for pointing to nice library, added to my dictionary :).

I would like to know how are we planning to start multiple servers.
1. Spawning new servers based on load? Some times we call it as auto
scalable.
2. To make some specific number of nodes available such as we want 2
servers to be available at any time so if one goes down then I need to
spawn one new to make available servers count 2.
3. Initially start all the servers.

In scenario 1 and 2 zookeeper does make sense but I don't believe existing
architecture support this?

Regards,
Gagan
On 12-Jun-2014 1:19 am, "Lahiru Gunathilake" <gl...@gmail.com> wrote:

> Hi Gagan,
>
> Thanks for your response. Please see my inline comments.
>
>
> On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
> wrote:
>
>> Hi Lahiru,
>> Just my 2 cents.
>>
>> I am big fan of zookeeper but also against adding multiple hops in the
>> system which can add unnecessary complexity. Here I am not able to
>> understand the requirement of zookeeper may be I am wrong because of less
>> knowledge of the airavata system in whole. So I would like to discuss
>> following point.
>>
>> 1. How it will help us in making system more reliable. Zookeeper is not
>> able to restart services. At max it can tell whether service is up or not
>> which could only be the case if airavata service goes down gracefully and
>> we have any automated way to restart it. If this is just matter of routing
>> client requests to the available thrift servers then this can be achieved
>> with the help of load balancer which I guess is already there in thrift
>> wish list.
>>
> We have multiple thrift services and currently we start only one instance
> of them and each thrift service is a stateless service. To keep the high
> availability we have to start multiple instances of them in production
> scenario. So for clients to get an available thrift service we can use
> zookeeper znodes to represent each available service. There are some
> libraries which is doing similar[1] and I think we can use them directly.
>
>> 2. As far as registering of different providers is concerned do you think
>> for that we really need external store.
>>
> Yes I think so, because its light weight and reliable and we have to do
> very minimal amount of work to achieve all these features to Airavata
> because zookeeper handle all the complexity.
>
>> I have seen people using zookeeper more for state management in
>> distributed environments.
>>
> +1, we might not be the most effective users of zookeeper because all of
> our services are stateless services, but my point is to achieve
> fault-tolerance we can use zookeeper and with minimal work.
>
>>  I would like to understand more how can we leverage zookeeper in
>> airavata to make system reliable.
>>
>>
> [1]https://github.com/eirslett/thrift-zookeeper
>
>
>
>> Regards,
>> Gagan
>> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>>
>>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>>> additional comments.
>>>
>>> Marlon
>>>
>>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>>> > Hi All,
>>> >
>>> > I did little research about Apache Zookeeper[1] and how to use it in
>>> > airavata. Its really a nice way to achieve fault tolerance and reliable
>>> > communication between our thrift services and clients. Zookeeper is a
>>> > distributed, fault tolerant system to do a reliable communication
>>> between
>>> > distributed applications. This is like an in-memory file system which
>>> has
>>> > nodes in a tree structure and each node can have small amount of data
>>> > associated with it and these nodes are called znodes. Clients can
>>> connect
>>> > to a zookeeper server and add/delete and update these znodes.
>>> >
>>> >   In Apache Airavata we start multiple thrift services and these can go
>>> > down for maintenance or these can crash, if we use zookeeper to store
>>> these
>>> > configuration(thrift service configurations) we can achieve a very
>>> reliable
>>> > system. Basically thrift clients can dynamically discover available
>>> service
>>> > by using ephemeral znodes(Here we do not have to change the generated
>>> > thrift client code but we have to change the locations we are invoking
>>> > them). ephemeral znodes will be removed when the thrift service goes
>>> down
>>> > and zookeeper guarantee the atomicity between these operations. With
>>> this
>>> > approach we can have a node hierarchy for multiple of airavata,
>>> > orchestrator,appcatalog and gfac thrift services.
>>> >
>>> > For specifically for gfac we can have different types of services for
>>> each
>>> > provider implementation. This can be achieved by using the hierarchical
>>> > support in zookeeper and providing some logic in gfac-thrift service to
>>> > register it to a defined path. Using the same logic orchestrator can
>>> > discover the provider specific gfac thrift service and route the
>>> message to
>>> > the correct thrift service.
>>> >
>>> > With this approach I think we simply have write some client code in
>>> thrift
>>> > services and clients and zookeeper server installation can be done as a
>>> > separate process and it will be easier to keep the Zookeeper server
>>> > separate from Airavata because installation of Zookeeper server little
>>> > complex in production scenario. I think we have to make sure everything
>>> > works fine when there is no Zookeeper running, ex:
>>> enable.zookeeper=false
>>> > should works fine and users doesn't have to download and start
>>> zookeeper.
>>> >
>>> >
>>> >
>>> > [1]http://zookeeper.apache.org/
>>> >
>>> > Thanks
>>> > Lahiru
>>>
>>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Gagan,

Thanks for your response. Please see my inline comments.


On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
wrote:

> Hi Lahiru,
> Just my 2 cents.
>
> I am big fan of zookeeper but also against adding multiple hops in the
> system which can add unnecessary complexity. Here I am not able to
> understand the requirement of zookeeper may be I am wrong because of less
> knowledge of the airavata system in whole. So I would like to discuss
> following point.
>
> 1. How it will help us in making system more reliable. Zookeeper is not
> able to restart services. At max it can tell whether service is up or not
> which could only be the case if airavata service goes down gracefully and
> we have any automated way to restart it. If this is just matter of routing
> client requests to the available thrift servers then this can be achieved
> with the help of load balancer which I guess is already there in thrift
> wish list.
>
We have multiple thrift services and currently we start only one instance
of them and each thrift service is a stateless service. To keep the high
availability we have to start multiple instances of them in production
scenario. So for clients to get an available thrift service we can use
zookeeper znodes to represent each available service. There are some
libraries which is doing similar[1] and I think we can use them directly.

> 2. As far as registering of different providers is concerned do you think
> for that we really need external store.
>
Yes I think so, because its light weight and reliable and we have to do
very minimal amount of work to achieve all these features to Airavata
because zookeeper handle all the complexity.

> I have seen people using zookeeper more for state management in
> distributed environments.
>
+1, we might not be the most effective users of zookeeper because all of
our services are stateless services, but my point is to achieve
fault-tolerance we can use zookeeper and with minimal work.

>  I would like to understand more how can we leverage zookeeper in
> airavata to make system reliable.
>
>
[1]https://github.com/eirslett/thrift-zookeeper



> Regards,
> Gagan
> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>
>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>> additional comments.
>>
>> Marlon
>>
>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> > Hi All,
>> >
>> > I did little research about Apache Zookeeper[1] and how to use it in
>> > airavata. Its really a nice way to achieve fault tolerance and reliable
>> > communication between our thrift services and clients. Zookeeper is a
>> > distributed, fault tolerant system to do a reliable communication
>> between
>> > distributed applications. This is like an in-memory file system which
>> has
>> > nodes in a tree structure and each node can have small amount of data
>> > associated with it and these nodes are called znodes. Clients can
>> connect
>> > to a zookeeper server and add/delete and update these znodes.
>> >
>> >   In Apache Airavata we start multiple thrift services and these can go
>> > down for maintenance or these can crash, if we use zookeeper to store
>> these
>> > configuration(thrift service configurations) we can achieve a very
>> reliable
>> > system. Basically thrift clients can dynamically discover available
>> service
>> > by using ephemeral znodes(Here we do not have to change the generated
>> > thrift client code but we have to change the locations we are invoking
>> > them). ephemeral znodes will be removed when the thrift service goes
>> down
>> > and zookeeper guarantee the atomicity between these operations. With
>> this
>> > approach we can have a node hierarchy for multiple of airavata,
>> > orchestrator,appcatalog and gfac thrift services.
>> >
>> > For specifically for gfac we can have different types of services for
>> each
>> > provider implementation. This can be achieved by using the hierarchical
>> > support in zookeeper and providing some logic in gfac-thrift service to
>> > register it to a defined path. Using the same logic orchestrator can
>> > discover the provider specific gfac thrift service and route the
>> message to
>> > the correct thrift service.
>> >
>> > With this approach I think we simply have write some client code in
>> thrift
>> > services and clients and zookeeper server installation can be done as a
>> > separate process and it will be easier to keep the Zookeeper server
>> > separate from Airavata because installation of Zookeeper server little
>> > complex in production scenario. I think we have to make sure everything
>> > works fine when there is no Zookeeper running, ex:
>> enable.zookeeper=false
>> > should works fine and users doesn't have to download and start
>> zookeeper.
>> >
>> >
>> >
>> > [1]http://zookeeper.apache.org/
>> >
>> > Thanks
>> > Lahiru
>>
>>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Lahiru Gunathilake <gl...@gmail.com>.

Hi Gagan,

Thanks for your response. Please see my inline comments.


On Wed, Jun 11, 2014 at 3:37 PM, Gagan Juneja <ga...@gmail.com>
wrote:

> Hi Lahiru,
> Just my 2 cents.
>
> I am big fan of zookeeper but also against adding multiple hops in the
> system which can add unnecessary complexity. Here I am not able to
> understand the requirement of zookeeper may be I am wrong because of less
> knowledge of the airavata system in whole. So I would like to discuss
> following point.
>
> 1. How it will help us in making system more reliable. Zookeeper is not
> able to restart services. At max it can tell whether service is up or not
> which could only be the case if airavata service goes down gracefully and
> we have any automated way to restart it. If this is just matter of routing
> client requests to the available thrift servers then this can be achieved
> with the help of load balancer which I guess is already there in thrift
> wish list.
>
We have multiple thrift services and currently we start only one instance
of them and each thrift service is a stateless service. To keep the high
availability we have to start multiple instances of them in production
scenario. So for clients to get an available thrift service we can use
zookeeper znodes to represent each available service. There are some
libraries which is doing similar[1] and I think we can use them directly.

> 2. As far as registering of different providers is concerned do you think
> for that we really need external store.
>
Yes I think so, because its light weight and reliable and we have to do
very minimal amount of work to achieve all these features to Airavata
because zookeeper handle all the complexity.

> I have seen people using zookeeper more for state management in
> distributed environments.
>
+1, we might not be the most effective users of zookeeper because all of
our services are stateless services, but my point is to achieve
fault-tolerance we can use zookeeper and with minimal work.

>  I would like to understand more how can we leverage zookeeper in
> airavata to make system reliable.
>
>
[1]https://github.com/eirslett/thrift-zookeeper



> Regards,
> Gagan
> On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:
>
>> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
>> additional comments.
>>
>> Marlon
>>
>> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
>> > Hi All,
>> >
>> > I did little research about Apache Zookeeper[1] and how to use it in
>> > airavata. Its really a nice way to achieve fault tolerance and reliable
>> > communication between our thrift services and clients. Zookeeper is a
>> > distributed, fault tolerant system to do a reliable communication
>> between
>> > distributed applications. This is like an in-memory file system which
>> has
>> > nodes in a tree structure and each node can have small amount of data
>> > associated with it and these nodes are called znodes. Clients can
>> connect
>> > to a zookeeper server and add/delete and update these znodes.
>> >
>> >   In Apache Airavata we start multiple thrift services and these can go
>> > down for maintenance or these can crash, if we use zookeeper to store
>> these
>> > configuration(thrift service configurations) we can achieve a very
>> reliable
>> > system. Basically thrift clients can dynamically discover available
>> service
>> > by using ephemeral znodes(Here we do not have to change the generated
>> > thrift client code but we have to change the locations we are invoking
>> > them). ephemeral znodes will be removed when the thrift service goes
>> down
>> > and zookeeper guarantee the atomicity between these operations. With
>> this
>> > approach we can have a node hierarchy for multiple of airavata,
>> > orchestrator,appcatalog and gfac thrift services.
>> >
>> > For specifically for gfac we can have different types of services for
>> each
>> > provider implementation. This can be achieved by using the hierarchical
>> > support in zookeeper and providing some logic in gfac-thrift service to
>> > register it to a defined path. Using the same logic orchestrator can
>> > discover the provider specific gfac thrift service and route the
>> message to
>> > the correct thrift service.
>> >
>> > With this approach I think we simply have write some client code in
>> thrift
>> > services and clients and zookeeper server installation can be done as a
>> > separate process and it will be easier to keep the Zookeeper server
>> > separate from Airavata because installation of Zookeeper server little
>> > complex in production scenario. I think we have to make sure everything
>> > works fine when there is no Zookeeper running, ex:
>> enable.zookeeper=false
>> > should works fine and users doesn't have to download and start
>> zookeeper.
>> >
>> >
>> >
>> > [1]http://zookeeper.apache.org/
>> >
>> > Thanks
>> > Lahiru
>>
>>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Zookeeper in Airavata to achieve reliability

Posted by Gagan Juneja <ga...@gmail.com>.

Hi Lahiru,
Just my 2 cents.

I am big fan of zookeeper but also against adding multiple hops in the
system which can add unnecessary complexity. Here I am not able to
understand the requirement of zookeeper may be I am wrong because of less
knowledge of the airavata system in whole. So I would like to discuss
following point.

1. How it will help us in making system more reliable. Zookeeper is not
able to restart services. At max it can tell whether service is up or not
which could only be the case if airavata service goes down gracefully and
we have any automated way to restart it. If this is just matter of routing
client requests to the available thrift servers then this can be achieved
with the help of load balancer which I guess is already there in thrift
wish list.
2. As far as registering of different providers is concerned do you think
for that we really need external store.

I have seen people using zookeeper more for state management in distributed
environments.

I would like to understand more how can we leverage zookeeper in airavata
to make system reliable.

Regards,
Gagan
On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:

> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> additional comments.
>
> Marlon
>
> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > Hi All,
> >
> > I did little research about Apache Zookeeper[1] and how to use it in
> > airavata. Its really a nice way to achieve fault tolerance and reliable
> > communication between our thrift services and clients. Zookeeper is a
> > distributed, fault tolerant system to do a reliable communication between
> > distributed applications. This is like an in-memory file system which has
> > nodes in a tree structure and each node can have small amount of data
> > associated with it and these nodes are called znodes. Clients can connect
> > to a zookeeper server and add/delete and update these znodes.
> >
> >   In Apache Airavata we start multiple thrift services and these can go
> > down for maintenance or these can crash, if we use zookeeper to store
> these
> > configuration(thrift service configurations) we can achieve a very
> reliable
> > system. Basically thrift clients can dynamically discover available
> service
> > by using ephemeral znodes(Here we do not have to change the generated
> > thrift client code but we have to change the locations we are invoking
> > them). ephemeral znodes will be removed when the thrift service goes down
> > and zookeeper guarantee the atomicity between these operations. With this
> > approach we can have a node hierarchy for multiple of airavata,
> > orchestrator,appcatalog and gfac thrift services.
> >
> > For specifically for gfac we can have different types of services for
> each
> > provider implementation. This can be achieved by using the hierarchical
> > support in zookeeper and providing some logic in gfac-thrift service to
> > register it to a defined path. Using the same logic orchestrator can
> > discover the provider specific gfac thrift service and route the message
> to
> > the correct thrift service.
> >
> > With this approach I think we simply have write some client code in
> thrift
> > services and clients and zookeeper server installation can be done as a
> > separate process and it will be easier to keep the Zookeeper server
> > separate from Airavata because installation of Zookeeper server little
> > complex in production scenario. I think we have to make sure everything
> > works fine when there is no Zookeeper running, ex: enable.zookeeper=false
> > should works fine and users doesn't have to download and start zookeeper.
> >
> >
> >
> > [1]http://zookeeper.apache.org/
> >
> > Thanks
> > Lahiru
>
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Gagan Juneja <ga...@gmail.com>.

Hi Lahiru,
Just my 2 cents.

I am big fan of zookeeper but also against adding multiple hops in the
system which can add unnecessary complexity. Here I am not able to
understand the requirement of zookeeper may be I am wrong because of less
knowledge of the airavata system in whole. So I would like to discuss
following point.

1. How it will help us in making system more reliable. Zookeeper is not
able to restart services. At max it can tell whether service is up or not
which could only be the case if airavata service goes down gracefully and
we have any automated way to restart it. If this is just matter of routing
client requests to the available thrift servers then this can be achieved
with the help of load balancer which I guess is already there in thrift
wish list.
2. As far as registering of different providers is concerned do you think
for that we really need external store.

I have seen people using zookeeper more for state management in distributed
environments.

I would like to understand more how can we leverage zookeeper in airavata
to make system reliable.

Regards,
Gagan
On 12-Jun-2014 12:33 am, "Marlon Pierce" <ma...@iu.edu> wrote:

> Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
> additional comments.
>
> Marlon
>
> On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> > Hi All,
> >
> > I did little research about Apache Zookeeper[1] and how to use it in
> > airavata. Its really a nice way to achieve fault tolerance and reliable
> > communication between our thrift services and clients. Zookeeper is a
> > distributed, fault tolerant system to do a reliable communication between
> > distributed applications. This is like an in-memory file system which has
> > nodes in a tree structure and each node can have small amount of data
> > associated with it and these nodes are called znodes. Clients can connect
> > to a zookeeper server and add/delete and update these znodes.
> >
> >   In Apache Airavata we start multiple thrift services and these can go
> > down for maintenance or these can crash, if we use zookeeper to store
> these
> > configuration(thrift service configurations) we can achieve a very
> reliable
> > system. Basically thrift clients can dynamically discover available
> service
> > by using ephemeral znodes(Here we do not have to change the generated
> > thrift client code but we have to change the locations we are invoking
> > them). ephemeral znodes will be removed when the thrift service goes down
> > and zookeeper guarantee the atomicity between these operations. With this
> > approach we can have a node hierarchy for multiple of airavata,
> > orchestrator,appcatalog and gfac thrift services.
> >
> > For specifically for gfac we can have different types of services for
> each
> > provider implementation. This can be achieved by using the hierarchical
> > support in zookeeper and providing some logic in gfac-thrift service to
> > register it to a defined path. Using the same logic orchestrator can
> > discover the provider specific gfac thrift service and route the message
> to
> > the correct thrift service.
> >
> > With this approach I think we simply have write some client code in
> thrift
> > services and clients and zookeeper server installation can be done as a
> > separate process and it will be easier to keep the Zookeeper server
> > separate from Airavata because installation of Zookeeper server little
> > complex in production scenario. I think we have to make sure everything
> > works fine when there is no Zookeeper running, ex: enable.zookeeper=false
> > should works fine and users doesn't have to download and start zookeeper.
> >
> >
> >
> > [1]http://zookeeper.apache.org/
> >
> > Thanks
> > Lahiru
>
>

Re: Zookeeper in Airavata to achieve reliability

Posted by Marlon Pierce <ma...@iu.edu>.

Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
additional comments.

Marlon

On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> Hi All,
>
> I did little research about Apache Zookeeper[1] and how to use it in
> airavata. Its really a nice way to achieve fault tolerance and reliable
> communication between our thrift services and clients. Zookeeper is a
> distributed, fault tolerant system to do a reliable communication between
> distributed applications. This is like an in-memory file system which has
> nodes in a tree structure and each node can have small amount of data
> associated with it and these nodes are called znodes. Clients can connect
> to a zookeeper server and add/delete and update these znodes.
>
>   In Apache Airavata we start multiple thrift services and these can go
> down for maintenance or these can crash, if we use zookeeper to store these
> configuration(thrift service configurations) we can achieve a very reliable
> system. Basically thrift clients can dynamically discover available service
> by using ephemeral znodes(Here we do not have to change the generated
> thrift client code but we have to change the locations we are invoking
> them). ephemeral znodes will be removed when the thrift service goes down
> and zookeeper guarantee the atomicity between these operations. With this
> approach we can have a node hierarchy for multiple of airavata,
> orchestrator,appcatalog and gfac thrift services.
>
> For specifically for gfac we can have different types of services for each
> provider implementation. This can be achieved by using the hierarchical
> support in zookeeper and providing some logic in gfac-thrift service to
> register it to a defined path. Using the same logic orchestrator can
> discover the provider specific gfac thrift service and route the message to
> the correct thrift service.
>
> With this approach I think we simply have write some client code in thrift
> services and clients and zookeeper server installation can be done as a
> separate process and it will be easier to keep the Zookeeper server
> separate from Airavata because installation of Zookeeper server little
> complex in production scenario. I think we have to make sure everything
> works fine when there is no Zookeeper running, ex: enable.zookeeper=false
> should works fine and users doesn't have to download and start zookeeper.
>
>
>
> [1]http://zookeeper.apache.org/
>
> Thanks
> Lahiru

Re: Zookeeper in Airavata to achieve reliability

Posted by Marlon Pierce <ma...@iu.edu>.

Thanks for the summary, Lahiru. I'm cc'ing the Architecture list for
additional comments.

Marlon

On 6/11/14 2:27 PM, Lahiru Gunathilake wrote:
> Hi All,
>
> I did little research about Apache Zookeeper[1] and how to use it in
> airavata. Its really a nice way to achieve fault tolerance and reliable
> communication between our thrift services and clients. Zookeeper is a
> distributed, fault tolerant system to do a reliable communication between
> distributed applications. This is like an in-memory file system which has
> nodes in a tree structure and each node can have small amount of data
> associated with it and these nodes are called znodes. Clients can connect
> to a zookeeper server and add/delete and update these znodes.
>
>   In Apache Airavata we start multiple thrift services and these can go
> down for maintenance or these can crash, if we use zookeeper to store these
> configuration(thrift service configurations) we can achieve a very reliable
> system. Basically thrift clients can dynamically discover available service
> by using ephemeral znodes(Here we do not have to change the generated
> thrift client code but we have to change the locations we are invoking
> them). ephemeral znodes will be removed when the thrift service goes down
> and zookeeper guarantee the atomicity between these operations. With this
> approach we can have a node hierarchy for multiple of airavata,
> orchestrator,appcatalog and gfac thrift services.
>
> For specifically for gfac we can have different types of services for each
> provider implementation. This can be achieved by using the hierarchical
> support in zookeeper and providing some logic in gfac-thrift service to
> register it to a defined path. Using the same logic orchestrator can
> discover the provider specific gfac thrift service and route the message to
> the correct thrift service.
>
> With this approach I think we simply have write some client code in thrift
> services and clients and zookeeper server installation can be done as a
> separate process and it will be easier to keep the Zookeeper server
> separate from Airavata because installation of Zookeeper server little
> complex in production scenario. I think we have to make sure everything
> works fine when there is no Zookeeper running, ex: enable.zookeeper=false
> should works fine and users doesn't have to download and start zookeeper.
>
>
>
> [1]http://zookeeper.apache.org/
>
> Thanks
> Lahiru