Posted to dev@airavata.apache.org by Raminderjeet Singh <ra...@gmail.com> on 2015/06/01 19:25:39 UTC

Re: Too Many Leaf Modules.

Hi Shameera,

I think we are trying to deal with two problems: code manageability/usage
and distribution.

In GFAC, I noticed dependencies between modules that should not exist:
GFAC's GSISSH module depends on the SSH module, and both modules depend on
the GSISSH library. The reason for this dependency is duplication of utility
methods in both modules (see GFACSSHUtils and GFACGSISSHUtils for the
duplicate code), and we need to fix it. I think there are similar examples
in other modules. In my view, if 60% of the code is duplicated, we need to
merge the modules into one or come up with an alternative. Put another way,
we should only create a module when it is needed. Local, SSH, and GSISSH
should not be separate modules, as they have a lot in common. To give you
some background on the GFAC modules: they were created so that we could have
different flavors of GFAC. This was mainly done to fix the problem of
conflicting jar dependencies (different security dependencies) among GFAC
modules in a single JVM, which kept the modules from working together. One
past example: the Unicore, GRAM, and GSISSH modules did not work together,
so we had to spend time inspecting the distribution to find which runtime
dependency was causing the problem and fix it. With the current design we
can create individual versions of GFAC. We still need to enhance the GFAC
service so that each flavor of GFAC is registered with a type and jobs are
routed accordingly. So we need some of this flavoring support in GFAC.
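
A minimal sketch of the consolidation suggested above, in Java. All names
here (RemoteExecUtils, SshSubmitter, GsiSshSubmitter, remoteTarget) are
hypothetical stand-ins, not existing Airavata classes, and the helper's logic
is only a placeholder for whatever GFACSSHUtils and GFACGSISSHUtils actually
duplicate:

```java
import java.util.Objects;

// Hypothetical shared helper: one home for logic that would otherwise be
// copied into both the SSH and the GSISSH module.
final class RemoteExecUtils {
    private RemoteExecUtils() {}

    // Builds an scp-style "user@host:path" target string.
    static String remoteTarget(String user, String host, String path) {
        Objects.requireNonNull(user, "user");
        Objects.requireNonNull(host, "host");
        Objects.requireNonNull(path, "path");
        return user + "@" + host + ":" + path;
    }
}

// Stand-in for the SSH module: delegates instead of keeping its own copy.
final class SshSubmitter {
    String stagingTarget(String user, String host, String workDir) {
        return RemoteExecUtils.remoteTarget(user, host, workDir);
    }
}

// Stand-in for the GSISSH module: same helper, no dependency on SshSubmitter.
final class GsiSshSubmitter {
    String stagingTarget(String user, String host, String workDir) {
        return RemoteExecUtils.remoteTarget(user, host, workDir);
    }
}

// Small demo entry point (assumed host and path values).
final class SharedUtilsDemo {
    public static void main(String[] args) {
        System.out.println(new SshSubmitter()
                .stagingTarget("alice", "login.example.org", "/scratch/job1"));
        System.out.println(new GsiSshSubmitter()
                .stagingTarget("alice", "login.example.org", "/scratch/job1"));
    }
}
```

With this shape neither module depends on the other; both depend only on
the shared helper, so a third module would reuse the helper rather than copy
it.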

I agree with you that we have too many core modules, and that core modules
such as GFAC core contain implementations (e.g. BetterGFACImpl). We should
move the implementations to the Gfac service. Let's talk about the pros and
cons of having a single core, and then we can decide how to proceed.
Currently Airavata does not provide a single view of all its functionality.
The API server was designed to do that, but it is also overloaded with a lot
of implementation details. If I am following your advice correctly, having a
common Airavata core will definitely help developers think about Airavata as
a system and design new components from a system perspective, so +1 for
something like this. The only drawback may be taking away some flexibility
from a component developer, which is good from the Airavata system point of
view anyway. Before we jump to a conclusion, we need to evaluate how it will
work with our thrift services design.

Thanks
Raminder



On Sat, May 30, 2015 at 11:55 PM, Shameera Rathnayaka <
shameerainfo@gmail.com> wrote:

> Hi Suresh,
>
> You are thinking from a deployment perspective, while I am thinking about
> the dependency issue. With my suggestion, each component distribution will
> grow by less than 1MB, because only the interfaces are in the core, and at
> runtime those interfaces will not be loaded. Considering the trouble we
> have at development time and the code maintenance issues, I think we can
> bear with that 1MB.
>
> Thanks,
> Shameera.
>
> On Sat, May 30, 2015 at 10:45 PM, Suresh Marru <sm...@apache.org> wrote:
>
>> Shameera,
>>
>> Every component has its own thrift service interface (registry and
>> messaging are exceptions). Every component will need a dependency on the
>> airavata data models (which include util classes) and probably on registry
>> and messaging. If component A needs to invoke component B via an RPC call,
>> then it just needs to include component B's thrift client. If the
>> communication is through work queues, then there is no dependency between
>> them. Can you describe what you want to propose in this context?
>>
>> Suresh
>>
>> On May 30, 2015, at 8:36 PM, Shameera Rathnayaka <sh...@gmail.com>
>> wrote:
>>
>> Hi Suresh,
>>
>>>
>>> Spark is not the right comparison for this discussion. I have been a
>>> Spark incubation mentor and have been following the code organization since
>>> its early days. All of the Spark components you mention rely on the core.
>>>
>>
>> Yes, that is what I am highlighting: the components don't each have their
>> own core module holding the interfaces. The Spark SQL module has a core
>> submodule, but all interfaces reside in the main core module.
>>
>>
>>> Let me step back and ask: what is the problem you are trying to solve?
>>> We sure need to clean up modules, and it is time to take another look at
>>> the component organization. But what do you really want to achieve by
>>> combining all the cores into monolithic components? It took an effort to
>>> cleanly separate functionality so that the pieces can evolve and be
>>> improved independently.
>>>
>>
>>
>> I am not suggesting going back to a monolithic core that has all
>> implementations and interfaces bundled together. What I am saying is that
>> we should keep the core interfaces together, which will give us a clear
>> module dependency graph. It is like having a dependency graph with one root
>> instead of multiple roots. As a developer, it is cumbersome and hard to
>> deal with module dependency issues. Having too fine-grained core modules
>> eventually introduces a wrong dependency graph, and it will prevent us from
>> following proper design patterns in our code base. If we avoid design
>> patterns, it will take more time to find bugs and to maintain the code.
>> This is what I am trying to resolve. I have first-hand experience with the
>> current airavata code. If you look at the current module dependency graphs,
>> you will understand why I am making such noise to resolve this.
>>
>>
>>> So why do we make our repository bulky with modules if they don't
>>> provide any considerable advantage?
>>>
>>> If we really want a separate distribution bundle for each component
>>> (apiServer, Orchestrator and Gfac), let's use different bin.xml files to
>>> do it instead of using different modules. But the reality is that we only
>>> use the all-in-one distribution.
>>>
>>>
>>> Once the monitoring is fixed to use messaging, we really need to
>>> decouple the component deployments. Yes, there is a considerable advantage.
>>> Each component has different quality-of-service requirements. A production
>>> platform has to load balance and scale horizontally, and that's different
>>> for each component. The all-in-one bundle has 300+ jars, but the API
>>> server and orchestrator, when independent, will have around 50 or so jars.
>>> When I want to deploy the API server and orchestrator, there is a
>>> significant difference between small lightweight components and one
>>> monolithic core.
>>>
>>> Another problem is component evolution. Let's say there is a production
>>> deployment running version 1.1.4. Let's say single job execution is
>>> stable enough and there is a 6-month focused effort on workflow. Say the
>>> master moves to 1.2.8 with all the changes going to workflow and only a
>>> few to single application execution. We can upgrade more comfortably if
>>> they are cleanly separated modules. But if it is one core with so many
>>> changes to it (even though technically they are to different classes, the
>>> perception will remain), the upgrades will fall behind.
>>>
>>> Bottom line: I am +1 for cleaning up the modules. For the past few years
>>> we have been moving towards microservice architectures, and your
>>> suggestions will reverse this back to a monolithic architecture. I am -1
>>> for this change in direction.
>>>
>>
>>
>> Looking at your miniature component suggestion above, it has 30+
>> modules. Do you think we really need this much categorization? In my
>> industry experience, the number of modules in a project always increases
>> with time. Hence, if we start with 30+ we will come to 40+ and then 50+
>> and so on. Why make this complicated?
>>
>> Thanks,
>> Shameera.
>>
>>
>>
>>>
>>> Suresh
>>>
>>>
>>> Thanks,
>>> Shameera.
>>>
>>>
>>>>
>>>> Suresh
>>>>
>>>> On May 29, 2015, at 11:29 AM, Suresh Marru <sm...@apache.org> wrote:
>>>>
>>>> + 1.
>>>>
>>>> I was planning to bring up this issue also. It probably will not
>>>> address what you are raising, but here is the tree output from some
>>>> airavata labs code I was toying with locally. I have not yet compared it
>>>> with what you proposed; I will do so later today.
>>>>
>>>> ├── airavata-api
>>>> │   ├── airavata-api-interface-descriptions
>>>> │   ├── airavata-api-java-stubs
>>>> │   ├── airavata-api-server
>>>> │   ├── airavata-data-models
>>>> │   └── api-security-manager
>>>> ├── clients
>>>> │   ├── airavata-client-cpp-sdk
>>>> │   ├── airavata-client-java-sdk
>>>> │   ├── airavata-client-php-sdk
>>>> │   ├── airavata-client-python-sdk
>>>> │   ├── airavata-sample-examples
>>>> │   └── airavata-xbaya-gui
>>>> ├── components
>>>> │   ├── commons
>>>> │   ├── component-interface-descriptions
>>>> │   ├── component-services
>>>> │   │   ├── credential-store-service
>>>> │   │   ├── orchestrator-service
>>>> │   │   ├── task-executor-service
>>>> │   │   └── workflow-interpreter-service
>>>> │   ├── component-clients
>>>> │   │   ├── credential-store-client
>>>> │   │   ├── orchestrator-client
>>>> │   │   ├── task-executor-client
>>>> │   │   ├── workflow-interpreter-client
>>>> │   │   └── messaging
>>>> │   ├── task-adaptors
>>>> │   │   ├── compute
>>>> │   │   └── data-movement
>>>> │   ├── registry
>>>> │   │   ├── app-catalog
>>>> │   │   ├── experiment-catalog
>>>> │   │   └── resource-catalog
>>>> │   └── workflow-interpreter
>>>> ├── distribution
>>>> └── integration-tests
>>>>
>>>>
>>>>
>>>> On May 29, 2015, at 10:15 AM, Shameera Rathnayaka <sh...@apache.org>
>>>> wrote:
>>>>
>>>> Hi Devs,
>>>>
>>>> We use different modules to package different types of functionality,
>>>> which helps us maintain loosely coupled code. The project now has 49
>>>> leaf modules (one short of a half century :) ). If we allow the project
>>>> to grow this way, having too fine-grained modules will be a huge headache
>>>> in the future. IMO we should clean this up ASAP before it becomes a real
>>>> mess. Actually, we are halfway there: I experienced cyclic dependency
>>>> issues when I was writing the workflow implementation and email
>>>> monitoring. Please see the modules in the current repo below.
>>>>
>>>> <module-name> ( <num of child modules> )
>>>>
>>>> modules  ( 43 )
>>>>      app-catalog ( 2 )
>>>>      commons ( 1 )
>>>>      configurations ( 2 )
>>>>      credential-store ( 3 )
>>>>      distribution ( 8 )
>>>>      gfac ( 10 )
>>>>      integration test ( 1 )
>>>>      messaging ( 2 )
>>>>      orchestrator ( 3 )
>>>>      registry ( 3 )
>>>>      security ( 1 )
>>>>      server ( 1 )
>>>>      test-suit ( 1 )
>>>>      workflow ( 1 )
>>>>      workflow-modal ( 3 )
>>>>      xbaya ( 1 )
>>>> airavata-api ( 5 )
>>>> tools ( 1 )
>>>>
>>>> Most of the current modules have interfaces and implementations
>>>> together, but this violates our main goal, which is to reduce inter-module
>>>> dependencies. The following is what I am suggesting. WDYS?
>>>>
>>>> core { has all core interfaces and basic classes of gfac-core ,
>>>> orchestrator-core , message-core , monitor core, registry core,
>>>> workflow-core}
>>>> service - all thrift services and service handlers
>>>> orchestrator - orchestrator specific classes
>>>> gfac
>>>>      SSH
>>>>      BES
>>>>      Local
>>>> message - AMQP implementation
>>>> distribution
>>>>      XBaya
>>>>      server - { use different mode input to start server as
>>>> orchestrator , Gfac or/and api-server }
>>>> commons
>>>> registry
>>>> app-catalog
>>>> security
>>>> Workflow
>>>> XBaya-gui
>>>> Integration-test
>>>>
>>>> Thanks,
>>>> Shameera.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Shameera Rathnayaka.
>>>
>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>
>>>
>>>

Re: Too Many Leaf Modules.

Posted by Suresh Marru <sm...@apache.org>.
Let me try to put it in the terminology you are using:

GFac's component-level goal is to submit a job. In the RPC approach (which is now defunct), the component-level interface looks like this: https://github.com/apache/airavata/blob/master/modules/gfac/gfac-thrift-descriptions/gfac.cpi.service.thrift#L47

This is currently implemented as one such component implementation. The component implementation is programmed internally against an internal interface, GFac, which has implementations like BetterGfacImpl.

There is an Orchestrator component-level interface method to validate an experiment: https://github.com/apache/airavata/blob/master/modules/orchestrator/orchestrator-thrift-descriptions/orchestrator.cpi.service.thrift#L67

You can implement this in any number of ways, ultimately returning a boolean. One such implementation is defined by the AbstractOrchestrator interfaces and implemented by the SimpleOrchestratorImpl classes, and so forth.

So in your proposed maven modules, I do not see you pulling together component interfaces. You are pulling together component implementations (defined as Java interfaces). Do the Orchestrator component-level implementation details and the GFac component-level implementation details fit in one place?

What if you want a new component that can submit a job with a totally new implementation (which will have its own GFac interface and one or more corresponding GfacImpls)? Aren't these component-level details that Airavata as a whole should not be concerned about?
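
The distinction above can be sketched in Java as follows. This is only an illustration under assumptions: JobSubmissionComponent stands in for a Thrift-defined component interface, and InternalExecutor, SimpleExecutor, ExecutorBackedSubmission and QueueBackedSubmission are hypothetical names, not Airavata's GFac or BetterGfacImpl. The method signature is simplified and not taken from gfac.cpi.service.thrift.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Component-level contract: the only thing other components depend on.
// (Hypothetical Java stand-in for a Thrift service definition.)
interface JobSubmissionComponent {
    boolean submitJob(String experimentId, String taskId);
}

// Internal interface of ONE implementation; invisible outside that component.
interface InternalExecutor {
    boolean execute(String experimentId, String taskId);
}

final class SimpleExecutor implements InternalExecutor {
    @Override
    public boolean execute(String experimentId, String taskId) {
        // Placeholder logic: accept any job with non-empty identifiers.
        return !experimentId.isEmpty() && !taskId.isEmpty();
    }
}

// First implementation: programs against its own internal interface.
final class ExecutorBackedSubmission implements JobSubmissionComponent {
    private final InternalExecutor executor;

    ExecutorBackedSubmission(InternalExecutor executor) {
        this.executor = executor;
    }

    @Override
    public boolean submitJob(String experimentId, String taskId) {
        return executor.execute(experimentId, taskId);
    }
}

// Second implementation: completely different internals, same contract.
final class QueueBackedSubmission implements JobSubmissionComponent {
    private final Deque<String> pending = new ArrayDeque<>();

    @Override
    public boolean submitJob(String experimentId, String taskId) {
        return pending.offer(experimentId + "/" + taskId);
    }

    int pendingCount() {
        return pending.size();
    }
}

// Small demo entry point: callers only ever see JobSubmissionComponent.
final class ComponentContractDemo {
    public static void main(String[] args) {
        JobSubmissionComponent first = new ExecutorBackedSubmission(new SimpleExecutor());
        JobSubmissionComponent second = new QueueBackedSubmission();
        System.out.println(first.submitJob("exp-1", "task-1"));
        System.out.println(second.submitJob("exp-1", "task-1"));
    }
}
```

In this sketch InternalExecutor never leaves its component, so a shared module that collected such internal interfaces would be collecting implementation details rather than component contracts.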

Suresh

> On Jun 2, 2015, at 10:28 AM, Shameera Rathnayaka <sh...@gmail.com> wrote:
> 
> Hi Suresh, 
> 
> As you have first-hand experience of how the airavata architecture has evolved, let's do as you suggest. But this is not the correct way, IMO. "Airavata Components will need to be loosely coupled and should be developed at a different pace upgradable and replaceable independently": we can achieve these goals without any issue with the Maven modules that I am suggesting. Maybe I was not descriptive enough to make you understand why this maven module refactoring is so important for us. It would be great to know what project structure you suggest after the refactoring.
> 
> Thanks, 
> Shameera. 
> 
> On Tue, Jun 2, 2015 at 9:49 AM, Suresh Marru <smarru@apache.org> wrote:
> Hi Shameera,
> 
> We are getting close to being on the same page, but not quite yet. What I see missing is a reference to a succinct Airavata architecture vision document. We have many papers on high-level goals, but what we need is a concise one-pager on architecture goals. I will work on it, but will need some time. In the meantime, I suggest we proceed without merging the core yet (even though it is only interfaces). It is not the additional 1MB I am worried about; I am worried about going against the architectural principles laid out (which I will extract from papers onto the website).
> 
> Very briefly, Airavata is a high-level framework close to the business functionality, and it assembles together multiple use cases. To make this challenge a conceivable effort, we layer over rich lower-level tools and frameworks. The initial struggle was to come up with a unified API. By embracing thrift, we addressed this not by unifying all use cases into one abstraction, but with multiple blocks of abstractions. This requires assembling multiple components at the implementation level. Hypothetically you can assemble multiple cores, but that counters the principle that components can be reusable across recipes. There is no good description of Airavata technical recipes, but a similar community effort in science gateways is at [1] [2]; there are about 33 recipes, which need to be consolidated.
> 
> I appreciate your enthusiasm, but we need to slow down on drastic changes. We sure need to get one capability stable, which is single job execution, but we need to do so without jeopardizing the larger goals of the project. I will very soon start discussions on these larger goals. These will imply architectural principles, which will need to be slowly built into 2 or 3 major versions. Understandably, some of these will not make sense for 1.0. For instance, Airavata Components will need to be loosely coupled and should be developed at a different pace upgradable and replaceable independently. This is a requirement for making Airavata platform-ready. Just because we cannot do that yet, we do not want to go in the other direction. Similarly, there should be a minimal shared understanding of Airavata (data models, registry catalogs and messaging contexts), but otherwise each component in the system should require minimal understanding of the remaining constituents of the airavata system. All of this goes against merging all interfaces into a common core.
> 
> For the 0.16 release, let's please proceed with the inconvenience of having multi-module parents. Before the next version we will need sufficient discussion to alter this.
> 
> Suresh
> 
> [1] - http://dx.doi.org/10.1109/CLUSTER.2013.6702702
> [2] - https://www.xsede.org/web/gateways/gateways-cookbook
> 
>> On Jun 1, 2015, at 11:13 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
>> 
>> Hi All, 
>> 
>> While reading through this thread again, I realized that we might be talking about two different things here. So I thought I would explain the difference between Maven modules and modularity (separation of concerns) in a component-based architecture. What I am suggesting is to restructure the maven modules, not to change the existing modular concept. If anyone says that creating one maven module called "core" by putting all the basic interfaces together will break the modular concept of the component-based architecture, that is not true. These are two different concepts. While component-based architecture is a software engineering architectural design, Maven is used for project management. A component-based architecture may or may not have one "core" maven module in the project that holds all the basic interfaces. If this core maven module is too complex, then it is worth breaking it into a few modules. But for Airavata this does not apply, as the suggested core maven module is not that complex.
>> Thanks, 
>> Shameera.
>> 
>> On Mon, Jun 1, 2015 at 9:09 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
>> Hi Raminder, 
>> 
>> Different implementations can depend on different versions of the same jar. As we are using native Java class loading, different versions of a class cannot be loaded and live in one runtime, unless we come up with an OSGi-like modular system or customize the class loader behavior, which is tricky. If we are getting this issue with basic Airavata components, then we should fix it and use one version throughout all the components. If different pluggable implementations use different versions, then both can't work together; only one version will be class loaded. This is a known restriction. Anyway, I am not suggesting one bulky Maven module; we should have some level of categorization. If an interface has different implementations and those are distinct enough, then having different modules makes sense.
>> 
>> Definitely we must remove implementation classes from the current core modules. With the multiplexed thrift support that Suresh is working on, it makes sense to me to have all services in one maven module called "service". Anyway, let's think about other alternatives too.
>> 
>> Thanks, 
>> Shameera.
>> 


Re: Too Many Leaf Modules.

Posted by Shameera Rathnayaka <sh...@gmail.com>.
Hi Suresh,

As you have first-hand experience of how the Airavata architecture evolved, let's
do as you suggest. But this is not the correct way IMO. "Airavata Components
will need to be loosely coupled and should be developed at a different pace
upgradable and replaceable independently": we can achieve these goals
without any issue with the Maven modules that I am suggesting. Maybe I was
not descriptive enough to explain why this Maven module
refactoring is so important for us. It would be great to know how you
suggest structuring the project after the refactoring.

Thanks,
Shameera.

On Tue, Jun 2, 2015 at 9:49 AM, Suresh Marru <sm...@apache.org> wrote:

> Hi Shameera,
>
> We are getting close to being on the same page, but not quite yet. What I see
> missing is a reference to a succinct Airavata architecture vision document.
> We have many papers on high-level goals, but what we need is a concise
> one-pager on architecture goals. I will work on it, but will need some time. In
> the meantime I suggest we proceed without yet merging the core (even
> though it is only interfaces). It is not the additional 1MB I am worried
> about; I am worried about going against the architectural principles laid
> out (which I will extract from papers onto the website).
>
> Very briefly, Airavata is a high-level framework close to the business
> functionality that assembles together multiple use cases. To make this
> challenge a conceivable effort, we layer over rich lower-level tools and
> frameworks. The initial struggle was to come up with a unified API. But by
> embracing Thrift we addressed this not by unifying all use cases into one
> abstraction, but with multiple blocks of abstractions. This will require
> assembling together multiple components at the implementation.
> Hypothetically you can assemble multiple cores, but that counters the
> principle that components can be reusable across recipes. There is no good
> description of Airavata technical recipes, but a similar community effort in
> science gateways is at [1] [2]; there are about 33 recipes which need to
> be consolidated.
>
> I appreciate your enthusiasm, but we need to slow down on drastic changes. We
> sure need to get one capability stable, which is single job execution, but
> we need to do so without jeopardizing the larger goals of the project. I will
> very soon start discussions on these larger goals. From these we will infer
> architectural principles, which will need to be slowly built into 2 or 3
> major versions. Understandably some of these will not make sense for 1.0.
> For instance Airavata Components will need to be loosely coupled and should
> be developed at a different pace upgradable and replaceable independently.
> This is a requirement for making Airavata platform-ready. Just because we
> cannot do that yet, we do not want to go in the other direction. Similarly,
> there should be a minimal shared understanding of Airavata (data models,
> registry catalogs and messaging contexts), but otherwise each component in
> the system will require minimal understanding of the remaining constituents of
> the Airavata system. All of this goes against merging all interfaces into a
> common core.
>
> For the 0.16 release, let's please proceed with the inconvenience of having
> multi-module parents. Before the next version we will need sufficient
> discussion to alter this.
>
> Suresh
>
> [1] - http://dx.doi.org/10.1109/CLUSTER.2013.6702702
> [2] - https://www.xsede.org/web/gateways/gateways-cookbook
>
> On Jun 1, 2015, at 11:13 PM, Shameera Rathnayaka <sh...@gmail.com>
> wrote:
>
> Hi All,
>
> While reading through this thread again, I realized that we might be
> talking about two different things here. So I thought I would explain the
> difference between Maven modules and modularity (separation of concerns) in a
> component-based architecture. What I am suggesting is to restructure the Maven
> modules, not to change the existing modular concept. If anyone says that making
> one Maven module called "core" by putting all basic interfaces together will
> break the modular concept in a component-based architecture, that is not true.
> These are two different concepts. While component-based architecture is a
> software engineering architectural design, Maven is used for project management.
> A component-based architecture may or may not have one Maven module "core" in
> the project which has all basic interfaces. If this core Maven module were so
> complex, then it would be worth breaking it into a few modules. But for Airavata
> this doesn't apply, as the suggested core Maven module is not that complex.
> Thanks,
> Shameera.
>
> On Mon, Jun 1, 2015 at 9:09 PM, Shameera Rathnayaka <
> shameerainfo@gmail.com> wrote:
>
>> Hi Raminder,
>>
>> Different implementations can depend on different versions of the same
>> jar. As we are using native Java class loading, it doesn't support loading
>> different versions of a class side by side in one runtime, unless we come
>> up with an OSGi-like modular system or customize class loader behavior, which
>> is tricky. If we are getting this issue with basic Airavata components then
>> we should fix it and use one version across all the components. If
>> different pluggable implementations use different versions then both can't
>> work together, as only one version will be class loaded; this is a known
>> restriction. Anyway, I am not suggesting one bulky Maven module; we
>> should have some level of categorization. If any interface has different
>> implementations and those are distinct enough, then having different
>> modules makes sense.
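
As a rough illustration of the restriction described above, the sketch below scans declared dependencies and reports artifacts that are requested in more than one version, i.e. the cases where only one version would end up on a flat classpath. The module and artifact names are hypothetical, and real projects would normally read this information from the build tool rather than from a hand-built map.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class VersionConflictSketch {
    /**
     * declared maps module name -> (artifact -> version).
     * Returns only the artifacts that are requested in two or more versions.
     */
    static Map<String, Set<String>> conflicts(Map<String, Map<String, String>> declared) {
        Map<String, Set<String>> versionsByArtifact = new TreeMap<>();
        for (Map<String, String> deps : declared.values()) {
            for (Map.Entry<String, String> dep : deps.entrySet()) {
                versionsByArtifact
                        .computeIfAbsent(dep.getKey(), k -> new TreeSet<>())
                        .add(dep.getValue());
            }
        }
        // Drop artifacts on which every module agrees.
        versionsByArtifact.values().removeIf(versions -> versions.size() < 2);
        return versionsByArtifact;
    }

    public static void main(String[] args) {
        // Hypothetical modules and versions, for illustration only.
        Map<String, Map<String, String>> declared = new LinkedHashMap<>();
        declared.put("gfac-ssh", Collections.singletonMap("security-lib", "1.0"));
        declared.put("gfac-bes", Collections.singletonMap("security-lib", "2.0"));
        declared.put("gfac-local", Collections.singletonMap("commons-lib", "1.0"));
        System.out.println(conflicts(declared));
    }
}
```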
>>
>> We definitely must remove implementation classes from the current core
>> modules. With the multiplexed Thrift support which Suresh is working on, for me
>> it makes sense to have all services in one Maven module called "service".
>> Anyway, let's think about other alternatives too.
>>
>> Thanks,
>> Shameera.
>>
>> On Mon, Jun 1, 2015 at 1:25 PM, Raminderjeet Singh <
>> raminderjsingh@gmail.com> wrote:
>>
>>> Hi Shameera,
>>>
>>> I think we are trying to deal with 2 problems: code manageability/usage
>>> and distribution.
>>>
>>> In GFAC, I noticed that there are dependencies between modules which should
>>> not exist, like GFAC's GSISSH module depending on the SSH module and both
>>> modules depending on the GSISSH library. The reason for such dependencies is
>>> duplication of utility methods (see GFACSSHUtils and GFACGSISSHUtils for
>>> duplicate code) in both of these modules, and we need to fix it. I think there are
>>> similar examples in different modules. In my view, if 60% of the code is
>>> duplicated, we need to merge the module as one or come. Put another way,
>>> we should only create a module when it's needed. Local, SSH and GSISSH should
>>> not be separate modules as they have a lot in common. Just to give you some
>>> background on the GFAC modules, they were created for the purpose of having
>>> different flavors of GFAC. This was mainly done to fix the problem of jar
>>> dependencies (different security dependencies) of GFAC modules in a JVM. It
>>> was affecting how modules work together. An example in the past was that the
>>> Unicore, GRAM and GSISSH modules did not work together, so we had to spend
>>> time inspecting the distribution to find which runtime dependency was causing
>>> the problem and fix it. With the current design we can create individual
>>> versions of GFAC. We still need to enhance the GFAC service to have a flavor of
>>> GFAC registered with a type and route the jobs. So we need some of this
>>> flavoring support in GFAC.
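
The "flavor registered with type and route the jobs" idea above could look roughly like the sketch below. Everything here is an assumption made for illustration (the class name, the use of plain strings for types and endpoints); it is not the GFAC service API.

```java
import java.util.HashMap;
import java.util.Map;

final class FlavorRoutingSketch {
    // Maps a job type (for example "SSH" or "BES") to the endpoint of the GFAC flavor handling it.
    private final Map<String, String> endpointByType = new HashMap<>();

    /** Called when a GFAC flavor starts up and announces the job type it supports. */
    void register(String type, String endpoint) {
        endpointByType.put(type, endpoint);
    }

    /** Returns the endpoint of the flavor registered for the given job type. */
    String route(String jobType) {
        String endpoint = endpointByType.get(jobType);
        if (endpoint == null) {
            throw new IllegalArgumentException("no GFAC flavor registered for " + jobType);
        }
        return endpoint;
    }

    public static void main(String[] args) {
        FlavorRoutingSketch router = new FlavorRoutingSketch();
        router.register("SSH", "gfac-ssh.example:8950");   // hypothetical endpoint
        router.register("BES", "gfac-bes.example:8950");   // hypothetical endpoint
        System.out.println(router.route("SSH"));
    }
}
```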
>>>
>>> I agree with you that we have too many core modules, and in core
>>> modules like GFAC core we have implementations (e.g. BetterGFACImpl). We
>>> should move implementations to the GFAC service. Let's talk about the pros and
>>> cons of having a single core and then we can decide how to proceed. Currently
>>> Airavata does not provide a single view of all of its functionality. The API
>>> server was designed to do that, but it's also overloaded with a lot of
>>> implementation details. If I am following your advice right, having a
>>> common Airavata core will definitely help developers think about Airavata
>>> as a system and design new components with a system perspective, so +1 for
>>> something like this. The only drawback can be taking away some flexibility
>>> from a component developer, which is anyhow good from the Airavata system point
>>> of view. Before we jump to a conclusion, we just need to evaluate how it will
>>> work with our Thrift services design.
>>>
>>> Thanks
>>> Raminder
>>>
>>>
>>>
>>> On Sat, May 30, 2015 at 11:55 PM, Shameera Rathnayaka <
>>> shameerainfo@gmail.com> wrote:
>>>
>>>> Hi Suresh,
>>>>
>>>> You are thinking from a deployment perspective while I am thinking about
>>>> the dependency issue. With my suggestion, each component distribution will
>>>> grow by less than 1MB, because only the interfaces are in the
>>>> core. And at runtime those interfaces will not be loaded. Considering the
>>>> trouble we are having at development time and the code maintenance issues, I
>>>> think we can bear with that 1MB.
>>>>
>>>> Thanks,
>>>> Shameera.
>>>>
>>>> On Sat, May 30, 2015 at 10:45 PM, Suresh Marru <sm...@apache.org>
>>>> wrote:
>>>>
>>>>> Shameera,
>>>>>
>>>>> Every component has its own Thrift service interface (registry and
>>>>> messaging are exceptions). Every component will need to have a dependency
>>>>> on the Airavata data models (which include util classes) and probably registry
>>>>> and messaging. If a component A needs to invoke component B via an RPC call,
>>>>> then it just needs to include component B's Thrift client. If the
>>>>> communication is through work queues then there is no dependency between
>>>>> them. Can you describe what you want to propose in this context?
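
The work-queue case above can be sketched as follows: producer and consumer share only a queue and a message type, and neither references the other. This is a minimal in-process illustration using `java.util.concurrent`; the `JobMessage` type and method names are hypothetical, and a real deployment would use a message broker rather than an in-memory queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class WorkQueueSketch {
    /** Shared message type; stands in for a data-model class both sides already depend on. */
    static final class JobMessage {
        final String experimentId;

        JobMessage(String experimentId) {
            this.experimentId = experimentId;
        }
    }

    /** Producer side: knows only the queue and the message type. */
    static void submit(BlockingQueue<JobMessage> queue, String experimentId) {
        queue.add(new JobMessage(experimentId));
    }

    /** Consumer side: takes the next message if one is waiting, otherwise returns null. */
    static String consume(BlockingQueue<JobMessage> queue) {
        JobMessage message = queue.poll();
        return message == null ? null : "executed " + message.experimentId;
    }

    public static void main(String[] args) {
        BlockingQueue<JobMessage> queue = new LinkedBlockingQueue<>();
        submit(queue, "exp-1");
        System.out.println(consume(queue));
    }
}
```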
>>>>>
>>>>> Suresh
>>>>>
>>>>> On May 30, 2015, at 8:36 PM, Shameera Rathnayaka <
>>>>> shameerainfo@gmail.com> wrote:
>>>>>
>>>>> Hi Suresh,
>>>>>
>>>>>>
>>>>>> Spark is not the right comparison for this discussion. I have been a
>>>>>> spark incubation mentor and have been following the code organization since
>>>>>> its early days. All of the spark components you mentions rely on the core.
>>>>>>
>>>>>
>>>>> Yes, that is what I am highlighting: the components don't each have their
>>>>> own core module holding the interfaces. The Spark SQL module has a core
>>>>> submodule, but all interfaces reside in the main core module.
>>>>>
>>>>>
>>>>>> Let me step back and ask, what is the problem you are trying to
>>>>>> solve? We sure need to cleanup modules and it is time to re-look at the
>>>>>> component organization. But what do you want to really achieve by combining
>>>>>> all the core into monolithic components. It took an effort to cleanly
>>>>>> separate functionality so they can evolve and can be improved
>>>>>> independently.
>>>>>>
>>>>>
>>>>>
>>>>> I am not suggesting going back to a monolithic core which has all
>>>>> implementations and interfaces bundled together. What I am saying is to
>>>>> keep the core interfaces all together; this will give us a clear module
>>>>> dependency graph. This is like having a dependency graph with one root
>>>>> instead of multiple roots. As a developer it is cumbersome and hard to deal
>>>>> with module dependency issues. Having core modules that are too fine-grained
>>>>> eventually introduces a wrong dependency graph, and it will prevent us from
>>>>> following proper design patterns in our code base. If we avoid design patterns,
>>>>> it will require more time to find bugs and maintain the code. This is what I am
>>>>> trying to resolve. I have first-hand experience with the current Airavata code.
>>>>> If you look at the current module dependency graphs, you will understand why I
>>>>> am making such noise to resolve this.
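
To make the dependency-graph concern concrete, the sketch below runs a depth-first search over a module graph and reports whether it contains a cycle. The edges used in `main` are hypothetical and only illustrate how several small core modules can end up depending on each other; they are not taken from the actual Airavata build.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ModuleCycleSketch {
    /** deps maps a module to the modules it depends on. */
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();    // fully explored modules
        Set<String> onPath = new HashSet<>();  // modules on the current DFS path
        for (String module : deps.keySet()) {
            if (visit(module, deps, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String module, Map<String, List<String>> deps,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(module)) {
            return true;   // reached a module already on the current path: cycle
        }
        if (done.contains(module)) {
            return false;  // already explored without finding a cycle
        }
        onPath.add(module);
        for (String dep : deps.getOrDefault(module, Collections.emptyList())) {
            if (visit(dep, deps, done, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        done.add(module);
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical edges between separate core modules.
        Map<String, List<String>> multiRoot = new HashMap<>();
        multiRoot.put("workflow-core", Arrays.asList("gfac-core"));
        multiRoot.put("gfac-core", Arrays.asList("orchestrator-core"));
        multiRoot.put("orchestrator-core", Arrays.asList("workflow-core"));
        System.out.println("cycle: " + hasCycle(multiRoot));

        // Same components depending on a single shared interface module instead.
        Map<String, List<String>> singleRoot = new HashMap<>();
        singleRoot.put("workflow", Arrays.asList("core"));
        singleRoot.put("gfac", Arrays.asList("core"));
        singleRoot.put("orchestrator", Arrays.asList("core"));
        System.out.println("cycle: " + hasCycle(singleRoot));
    }
}
```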
>>>>>
>>>>>
>>>>>> So why do we make our repository bulky with modules unless it
>>>>>> provides some considerable advantage?
>>>>>>
>>>>>> If we really want a separate distribution bundle for each component
>>>>>> ( apiServer , Orchestrator and Gfac ), let's use different bin.xml files
>>>>>> to do it instead of using different modules. But the reality is we only
>>>>>> use the all-in-one distribution.
>>>>>>
>>>>>>
>>>>>> Once the monitoring is fixed to use messaging, we really need to
>>>>>> decouple the component deployments. Yes, there is a considerable advantage.
>>>>>> Each component has different quality of service requirements. A production
>>>>>> platform has to load balance and scale horizontally, and that's different
>>>>>> for different components. The all-in-one bundle has 300+ jars, but the API
>>>>>> server and orchestrator when independent will have around 50 or so jars.
>>>>>> When I want to deploy the API server and orchestrator there is a significant
>>>>>> difference between small lightweight components and one monolithic core.
>>>>>>
>>>>>> Another problem is component evolution. Let's say there is a
>>>>>> production deployment running version 1.1.4. Let's say the single job
>>>>>> execution is stable enough and there is a 6-month focused effort on
>>>>>> workflow. Say the master moves to 1.2.8 with all the changes to workflow
>>>>>> and only a few to single application execution. We can upgrade more
>>>>>> comfortably if they are cleanly separated modules. But if it is one core with
>>>>>> so many changes to it (even though technically they are to different classes,
>>>>>> the perception will remain), the upgrades will fall behind.
>>>>>>
>>>>>> Bottom line, I am +1 for cleaning up the modules. Over the past few years we
>>>>>> have been moving towards microservice architectures and your suggestions
>>>>>> will reverse this back to a monolithic architecture. I am -1 for this change
>>>>>> in direction.
>>>>>>
>>>>>
>>>>>
>>>>> Looking at your miniature component suggestion above, it has 30+
>>>>> modules. Do you think we really need this many categories? In my
>>>>> industrial experience, I have seen the number of modules in a project always
>>>>> increase with time. Hence if we start with 30+ we will come to 40+ and then
>>>>> 50+ and so on and so forth. Why do we make this complicated?
>>>>>
>>>>> ​Thanks,
>>>>> Shameera.​
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Suresh
>>>>>>
>>>>>>
>>>>>> ​​Thanks,
>>>>>> ​Shameera.
>>>>>> ​
>>>>>>
>>>>>>>
>>>>>>> Suresh
>>>>>>>
>>>>>>> On May 29, 2015, at 11:29 AM, Suresh Marru <sm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>> + 1.
>>>>>>>
>>>>>>> I was planning to bring up this issue also. Probably it will not
>>>>>>> address what you are raising, but here is a tree output from airavata labs
>>>>>>> code I was toying with locally. I did not yet compare it with what you
>>>>>>> proposed, I will do so later today.
>>>>>>>
>>>>>>> ├── airavata-api
>>>>>>> │   ├── airavata-api-interface-descriptions
>>>>>>> │   ├── airavata-api-java-stubs
>>>>>>> │   ├── airavata-api-server
>>>>>>> │   ├── airavata-data-models
>>>>>>> │   ├── api-security-manager
>>>>>>> ├── clients
>>>>>>> │   ├── airavata-client-cpp-sdk
>>>>>>> │   ├── airavata-client-java-sdk
>>>>>>> │   ├── airavata-client-php-sdk
>>>>>>> │   ├── airavata-client-python-sdk
>>>>>>> │   ├── airavata-sample-examples
>>>>>>> │   └── airavata-xbaya-gui
>>>>>>> ├── components
>>>>>>> │   ├── commons
>>>>>>> │   ├── component-interface-descriptions
>>>>>>> │   ├── component-services
>>>>>>> │   │   ├── credential-store-service
>>>>>>> │   │   ├── orchestrator-service
>>>>>>> │   │   ├── task-executor-service
>>>>>>> │   │   └── workflow-interpreter-service
>>>>>>> │   ├── component-clients
>>>>>>> │   │   ├── credential-store-client
>>>>>>> │   │   ├── orchestrator-client
>>>>>>> │   │   ├── task-executor-client
>>>>>>> │   │   ├── workflow-interpreter-client
>>>>>>> │   │   └── messaging
>>>>>>> │   ├── task-adaptors
>>>>>>> │   │   ├── compute
>>>>>>> │   │   └── data-movement
>>>>>>> │   ├── registry
>>>>>>> │   │   ├── app-catalog
>>>>>>> │   │   ├── experiment-catalog
>>>>>>> │   │   └── resource-catalog
>>>>>>> │   └── workflow-interpreter
>>>>>>> ├── distribution
>>>>>>> ├── integration-tests
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On May 29, 2015, at 10:15 AM, Shameera Rathnayaka <
>>>>>>> shameera@apache.org> wrote:
>>>>>>>
>>>>>>> Hi Devs,
>>>>>>>
>>>>>>> We use different modules to package different types of
>>>>>>> functionality, which helps us maintain loosely coupled code. The project
>>>>>>> now has 49 leaf modules (one short of a half century :) ). If we let the
>>>>>>> project grow this way, modules that are too fine-grained will be a huge
>>>>>>> headache in future. IMO we should clean this up ASAP before it becomes a real
>>>>>>> mess. Actually we are halfway there already: I ran into cyclic dependency
>>>>>>> issues when I was writing the workflow implementation and email monitoring.
>>>>>>> Please see the modules in the current repo below.
>>>>>>>
>>>>>>> <module-name> ( <num of child modules> )
>>>>>>>
>>>>>>> modules  ( 43 )
>>>>>>>      app-catalog ( 2 )
>>>>>>>      commons ( 1 )
>>>>>>>      configurations ( 2 )
>>>>>>>      credential-store ( 3 )
>>>>>>>      distribution ( 8 )
>>>>>>>      gfac ( 10 )
>>>>>>>      integration test ( 1 )
>>>>>>>      messaging ( 2 )
>>>>>>>      orchestrator ( 3 )
>>>>>>>      registry ( 3 )
>>>>>>>      security ( 1 )
>>>>>>>      server ( 1 )
>>>>>>>      test-suit ( 1 )
>>>>>>>      workflow ( 1 )
>>>>>>>      workflow-modal ( 3 )
>>>>>>>      xbaya ( 1 )
>>>>>>> airavata-api ( 5 )
>>>>>>> tools ( 1 )
>>>>>>>
>>>>>>> Most of the current modules keep interfaces and implementations
>>>>>>> together, but this violates our main goal, which is to reduce inter-module
>>>>>>> dependencies. The following is what I am suggesting, WDYS?
>>>>>>>
>>>>>>> core { has all core interfaces and basic classes of gfac-core ,
>>>>>>> orchestrator-core , message-core , monitor core, registry core,
>>>>>>> workflow-core}
>>>>>>> service - all thrift services and service handlers
>>>>>>> orchestrator - orchestrator specific classes
>>>>>>> gfac
>>>>>>>      SSH
>>>>>>>      BES
>>>>>>>      Local
>>>>>>> message - AMQP implementation
>>>>>>> distribution
>>>>>>>      XBaya
>>>>>>>      server - { use different mode input to start server as
>>>>>>> orchestrator , Gfac or/and api-server }
>>>>>>> commons
>>>>>>> registry
>>>>>>> app-catalog
>>>>>>> security
>>>>>>> Workflow
>>>>>>> XBaya-gui
>>>>>>> Integration-test
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shameera.
>>>>>>>


-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/

Re: Too Many Leaf Modules.

Posted by Suresh Marru <sm...@apache.org>.
Hi Shameera,

We are getting close to being on the same page, but not quite yet. What I see missing is a reference to a succinct Airavata architecture vision document. We have many papers on high-level goals, but what we need is a concise one-pager on architecture goals. I will work on it, but will need some time. In the meantime I suggest we proceed without yet merging the core (even though it is only interfaces). It is not the additional 1MB I am worried about; I am worried about going against the architectural principles laid out (which I will extract from papers onto the website).

Very briefly, Airavata is a high-level framework close to the business functionality that assembles together multiple use cases. To make this challenge a conceivable effort, we layer over rich lower-level tools and frameworks. The initial struggle was to come up with a unified API. But by embracing Thrift we addressed this not by unifying all use cases into one abstraction, but with multiple blocks of abstractions. This will require assembling together multiple components at the implementation. Hypothetically you can assemble multiple cores, but that counters the principle that components can be reusable across recipes. There is no good description of Airavata technical recipes, but a similar community effort in science gateways is at [1] [2]; there are about 33 recipes which need to be consolidated. 

I appreciate your enthusiasm, but we need to slow down on drastic changes. We sure need to get one capability stable, which is single job execution, but we need to do so without jeopardizing the larger goals of the project. I will very soon start discussions on these larger goals. From these we will infer architectural principles, which will need to be slowly built into 2 or 3 major versions. Understandably some of these will not make sense for 1.0. For instance Airavata Components will need to be loosely coupled and should be developed at a different pace upgradable and replaceable independently. This is a requirement for making Airavata platform-ready. Just because we cannot do that yet, we do not want to go in the other direction. Similarly, there should be a minimal shared understanding of Airavata (data models, registry catalogs and messaging contexts), but otherwise each component in the system will require minimal understanding of the remaining constituents of the Airavata system. All of this goes against merging all interfaces into a common core. 

For the 0.16 release, let's please proceed with the inconvenience of having multi-module parents. Before the next version we will need sufficient discussion to alter this. 

Suresh

[1] - http://dx.doi.org/10.1109/CLUSTER.2013.6702702
[2] - https://www.xsede.org/web/gateways/gateways-cookbook

> On Jun 1, 2015, at 11:13 PM, Shameera Rathnayaka <sh...@gmail.com> wrote:
> 
> Hi All, 
> 
> While reading through this thread again, I realized that we might be talking about two different things here. So I thought I would explain the difference between Maven modules and modularity (separation of concerns) in a component-based architecture. What I am suggesting is to restructure the Maven modules, not to change the existing modular concept. If anyone says that making one Maven module called "core" by putting all basic interfaces together will break the modular concept in a component-based architecture, that is not true. These are two different concepts. While component-based architecture is a software engineering architectural design, Maven is used for project management. A component-based architecture may or may not have one Maven module "core" in the project which has all basic interfaces. If this core Maven module were so complex, then it would be worth breaking it into a few modules. But for Airavata this doesn't apply, as the suggested core Maven module is not that complex.  
> Thanks, 
> Shameera.
> 
> On Mon, Jun 1, 2015 at 9:09 PM, Shameera Rathnayaka <shameerainfo@gmail.com <ma...@gmail.com>> wrote:
> Hi Raminder, 
> 
> Different implementations can depend on different versions of the same jar. As we are using native Java class loading, it doesn't support loading different versions of a class side by side in one runtime, unless we come up with an OSGi-like modular system or customize class loader behavior, which is tricky. If we are getting this issue with basic Airavata components then we should fix it and use one version across all the components. If different pluggable implementations use different versions then both can't work together, as only one version will be class loaded; this is a known restriction. Anyway, I am not suggesting one bulky Maven module; we should have some level of categorization. If any interface has different implementations and those are distinct enough, then having different modules makes sense. 
> 
> We definitely must remove implementation classes from the current core modules. With the multiplexed Thrift support which Suresh is working on, for me it makes sense to have all services in one Maven module called "service". Anyway, let's think about other alternatives too. 
> 
> Thanks, 
> Shameera.
> 
> On Mon, Jun 1, 2015 at 1:25 PM, Raminderjeet Singh <raminderjsingh@gmail.com <ma...@gmail.com>> wrote:
> Hi Shameera,
> 
> I think we are trying to deal with 2 problems: code manageability/usage and distribution.  
> 
> In GFAC, I noticed that there are dependencies between modules which should not exist, like GFAC's GSISSH module depending on the SSH module and both modules depending on the GSISSH library. The reason for such dependencies is duplication of utility methods (see GFACSSHUtils and GFACGSISSHUtils for duplicate code) in both of these modules, and we need to fix it. I think there are similar examples in different modules. In my view, if 60% of the code is duplicated, we need to merge the module as one or come. Put another way, we should only create a module when it's needed. Local, SSH and GSISSH should not be separate modules as they have a lot in common. Just to give you some background on the GFAC modules, they were created for the purpose of having different flavors of GFAC. This was mainly done to fix the problem of jar dependencies (different security dependencies) of GFAC modules in a JVM. It was affecting how modules work together. An example in the past was that the Unicore, GRAM and GSISSH modules did not work together, so we had to spend time inspecting the distribution to find which runtime dependency was causing the problem and fix it. With the current design we can create individual versions of GFAC. We still need to enhance the GFAC service to have a flavor of GFAC registered with a type and route the jobs. So we need some of this flavoring support in GFAC.
> 
> I agree with you that we have too many core modules, and in core modules like GFAC core we have implementations (e.g. BetterGFACImpl). We should move implementations to the GFAC service. Let's talk about the pros and cons of having a single core and then we can decide how to proceed. Currently Airavata does not provide a single view of all of its functionality. The API server was designed to do that, but it's also overloaded with a lot of implementation details. If I am following your advice right, having a common Airavata core will definitely help developers think about Airavata as a system and design new components with a system perspective, so +1 for something like this. The only drawback can be taking away some flexibility from a component developer, which is anyhow good from the Airavata system point of view. Before we jump to a conclusion, we just need to evaluate how it will work with our Thrift services design.
> 
> Thanks
> Raminder
> 
>  
> 
> On Sat, May 30, 2015 at 11:55 PM, Shameera Rathnayaka <shameerainfo@gmail.com <ma...@gmail.com>> wrote:
> Hi Suresh, 
> 
> You are thinking from a deployment perspective while I am thinking about the dependency issue. With my suggestion, each component distribution will grow by less than 1MB, because only the interfaces are in the core. And at runtime those interfaces will not be loaded. Considering the trouble we are having at development time and the code maintenance issues, I think we can bear with that 1MB. 
> 
> Thanks, 
> Shameera.
> 
> On Sat, May 30, 2015 at 10:45 PM, Suresh Marru <smarru@apache.org <ma...@apache.org>> wrote:
> Shameera,
> 
> Every component has its own Thrift service interface (registry and messaging are exceptions). Every component will need to have a dependency on the Airavata data models (which include util classes) and probably registry and messaging. If a component A needs to invoke component B via an RPC call, then it just needs to include component B's Thrift client. If the communication is through work queues then there is no dependency between them. Can you describe what you want to propose in this context? 
> 
> Suresh
> 
>> On May 30, 2015, at 8:36 PM, Shameera Rathnayaka <shameerainfo@gmail.com <ma...@gmail.com>> wrote:
>> 
>> Hi Suresh, 
>> 
>> Spark is not the right comparison for this discussion. I have been a spark incubation mentor and have been following the code organization since its early days. All of the spark components you mentions rely on the core. 
>> 
>> ​Yes, that is what I highlight , each components doesn't have their own core modules which has the interfaces. Spark SQL module has core submodule but all interfaces reside in main core module.
>> 
>> 
>> Let me step back and ask, what is the problem you are trying to solve? We sure need to cleanup modules and it is time to re-look at the component organization. But what do you want to really achieve by combining all the core into monolithic components. It took an effort to cleanly separate functionality so they can evolve and can be improved independently. 
>> 
>> 
>> I am not suggesting going back to a monolithic core which has all implementations and interfaces bundled together. What I am saying is to keep the core interfaces all together; this will give us a clear module dependency graph. This is like having a dependency graph with one root instead of multiple roots. As a developer it is cumbersome and hard to deal with module dependency issues. Having core modules that are too fine-grained eventually introduces a wrong dependency graph, and it will prevent us from following proper design patterns in our code base. If we avoid design patterns, it will require more time to find bugs and maintain the code. This is what I am trying to resolve. I have first-hand experience with the current Airavata code. If you look at the current module dependency graphs, you will understand why I am making such noise to resolve this.
>> 
>> 
>>> So why do we make our repository bulky with modules unless it provides some considerable advantage?
>>> If we really want a separate distribution bundle for each component ( apiServer , Orchestrator and Gfac ), let's use different bin.xml files to do it instead of using different modules. But the reality is we only use the all-in-one distribution.
>> 
>> Once the monitoring is fixed to use messaging, we really need to decouple the component deployments. Yes there is a considerable advantage. Each components has different quality of service requirements. A production platform has to load balance and scale horizontally. And thats different for different component. The all in one bundle has 300+ jars, but API server and orchestrator when independent will have around 50 or so jars. When I want to deploy api server and orchestrator there is significant different in small light weight components vs one monolithic core. 
>> 
>> Another problem is component evolution. Let's say there is a production deployment running version 1.1.4. Let's say single job execution is stable enough and there is a 6-month focused effort on workflow. Say master moves to 1.2.8 with all the changes to workflow and only a few to single application execution. We can upgrade more comfortably if they are cleanly separated modules. But if it is one core with so many changes to it (even though technically they are to different classes, the perception will remain), the upgrades will fall behind.
>> 
>> Bottom line I am + 1 for cleaning up the modules. Past few years we have been moving towards micro service architectures and your suggestions will reverse this back to monolithic architecture. I am -1 for this change in direction. 
>> 
>> 
>> By looking at your miniature component suggestion above, it has 30+ modules. Do you think we really need this much categorization? In my industry experience, I have seen the number of modules in a project always increase with time. Hence, if we start with 30+ we will come to 40+ and then 50+ and so on. Why do we make this complicated?
>> 
>> Thanks,
>> Shameera.
>> 
>>  
>> 
>> Suresh
>> 
>>> 
>>> Thanks,
>>> Shameera.
>>> 
>>> 
>>> Suresh
>>> 
>>>> On May 29, 2015, at 11:29 AM, Suresh Marru <smarru@apache.org> wrote:
>>>> 
>>>> + 1. 
>>>> 
>>>> I was planning to bring up this issue also. Probably it will not address what you are raising, but here is a tree output from airavata labs code I was toying with locally. I did not yet compare it with what you proposed, I will do so later today.
>>>> 
>>>> ├── airavata-api
>>>> │   ├── airavata-api-interface-descriptions
>>>> │   ├── airavata-api-java-stubs
>>>> │   ├── airavata-api-server
>>>> │   ├── airavata-data-models
>>>> │   ├── api-security-manager
>>>> ├── clients
>>>> │   ├── airavata-client-cpp-sdk
>>>> │   ├── airavata-client-java-sdk
>>>> │   ├── airavata-client-php-sdk
>>>> │   ├── airavata-client-python-sdk
>>>> │   ├── airavata-sample-examples
>>>> │   └── airavata-xbaya-gui
>>>> ├── components
>>>> │   ├── commons
>>>> │   ├── component-interface-descriptions
>>>> │   ├── component-services
>>>> │   │   ├── credential-store-service
>>>> │   │   ├── orchestrator-service
>>>> │   │   ├── task-executor-service
>>>> │   │   └── workflow-interpreter-service
>>>> │   ├── component-clients
>>>> │   │   ├── credential-store-client
>>>> │   │   ├── orchestrator-client
>>>> │   │   ├── task-executor-client
>>>> │   │   ├── workflow-interpreter-client
>>>> │   │   └── messaging
>>>> │   ├── task-adaptors
>>>> │   │   ├── compute
>>>> │   │   └── data-movement
>>>> │   ├── registry
>>>> │   │   ├── app-catalog
>>>> │   │   ├── experiment-catalog
>>>> │   │   └── resource-catalog
>>>> │   └── workflow-interpreter
>>>> ├── distribution
>>>> ├── integration-tests
>>>> 
>>>> 
>>>> 
>>>>> On May 29, 2015, at 10:15 AM, Shameera Rathnayaka <shameera@apache.org> wrote:
>>>>> 
>>>>> Hi Devs, 
>>>>> 
>>>>> We are using different modules to package different types of functionality, which helps us maintain loosely coupled code. The project now has 49 leaf modules ( one short of a half century :) ). If we allow the project to grow this way, having too fine-grained modules will be a huge headache in the future. IMO we should clean this up ASAP before it becomes a real mess. Actually we are halfway there already: I experienced cyclic dependency issues when I was writing the workflow implementation and email monitoring. Please see the modules in the current repo below.
>>>>> 
>>>>> <module-name> ( <num of child modules> )
>>>>> 
>>>>> modules  ( 43 )
>>>>>      app-catalog ( 2 )
>>>>>      commons ( 1 )
>>>>>      configurations ( 2 )
>>>>>      credential-store ( 3 )
>>>>>      distribution ( 8 )
>>>>>      gfac ( 10 )
>>>>>      integration test ( 1 )
>>>>>      messaging ( 2 )
>>>>>      orchestrator ( 3 )
>>>>>      registry ( 3 )
>>>>>      security ( 1 )
>>>>>      server ( 1 )
>>>>>      test-suit ( 1 )
>>>>>      workflow ( 1 )
>>>>>      workflow-modal ( 3 )
>>>>>      xbaya ( 1 ) 
>>>>> airavata-api ( 5 )
>>>>> tools ( 1 ) 
>>>>> 
>>>>> Most of the current modules have interfaces and implementations together, but this violates our main goal, which is to reduce inter-module dependencies. Following is what I am suggesting, WDYS?
>>>>> 
>>>>> core { has all core interfaces and basic classes of gfac-core , orchestrator-core , message-core , monitor core, registry core, workflow-core}
>>>>> service - all thrift services and service handlers 
>>>>> orchestrator - orchestrator specific classes
>>>>> gfac 
>>>>>      SSH  
>>>>>      BES
>>>>>      Local
>>>>> message - amqp implementation
>>>>> distribution 
>>>>>      XBaya
>>>>>      server - { use different mode input to start server as orchestrator , Gfac or/and api-server }
>>>>> commons
>>>>> registry
>>>>> app-catalog
>>>>> security
>>>>> Workflow
>>>>> XBaya-gui
>>>>> Integration-test 
>>>>> 
>>>>> Thanks, 
>>>>> Shameera.
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Best Regards,
>>> Shameera Rathnayaka.
>>> 
>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>> Blog : http://shameerarathnayaka.blogspot.com/


Re: Too Many Leaf Modules.

Posted by Shameera Rathnayaka <sh...@gmail.com>.
Hi All,

While reading through this thread again, I realized that we might be
talking about two different things here, so I want to explain the
difference between Maven modules and modularity (separation of concerns)
in a component-based architecture. What I am suggesting is to restructure
the Maven modules, not to change the existing modular design. If anyone
says that creating one Maven module called "core" by putting all the basic
interfaces together will break modularity in a component-based
architecture, that is not true. These are two different concepts:
component-based architecture is a software architectural design approach,
while Maven is a project management and build tool. A component-based
architecture may or may not have a single "core" Maven module that holds
all the basic interfaces. If such a core module became too complex, it
would be worth splitting it into a few modules, but that does not apply to
Airavata, as the suggested core Maven module is not that complex.
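
To make the distinction concrete, here is a minimal Java sketch. It is an
illustration only, not Airavata code: the names JobSubmitter,
LocalJobSubmitter, SshJobSubmitter and CoreModuleSketch are hypothetical,
and everything sits in one file purely for brevity. The comments mark which
Maven module each piece would live in under the proposed layout. The point
is that the routing code compiles against the interface alone, so where the
interface is packaged does not change the component boundaries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// --- would live in the single "core" Maven module: interfaces only ---
interface JobSubmitter {                       // hypothetical interface name
    String submit(String jobId);
}

// --- would live in separate implementation modules (e.g. gfac Local / SSH) ---
final class LocalJobSubmitter implements JobSubmitter {
    @Override
    public String submit(String jobId) {
        return "local:" + jobId;
    }
}

final class SshJobSubmitter implements JobSubmitter {
    @Override
    public String submit(String jobId) {
        return "ssh:" + jobId;
    }
}

// --- would live in a component module that depends on "core" only ---
final class CoreModuleSketch {
    // Implementations are registered by type; this class never names them.
    private final Map<String, JobSubmitter> submitters = new LinkedHashMap<>();

    void register(String type, JobSubmitter submitter) {
        submitters.put(type, submitter);
    }

    String route(String type, String jobId) {
        JobSubmitter submitter = submitters.get(type);
        if (submitter == null) {
            throw new IllegalArgumentException("No submitter registered for type: " + type);
        }
        return submitter.submit(jobId);
    }

    public static void main(String[] args) {
        CoreModuleSketch router = new CoreModuleSketch();
        router.register("local", new LocalJobSubmitter());
        router.register("ssh", new SshJobSubmitter());
        System.out.println(router.route("ssh", "job-42")); // prints "ssh:job-42"
    }
}
```

Splitting or merging the Maven modules that hold these types changes the
build layout, but the dependency direction (implementations and components
both point at the interface) stays the same.
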
Thanks,
Shameera.

On Mon, Jun 1, 2015 at 9:09 PM, Shameera Rathnayaka <sh...@gmail.com>
wrote:

> Hi Raminder,
>
> Different implementations can depend on different versions of the same
> jar. As we are using native Java class loading, different versions of a
> class cannot be loaded and live in one runtime, unless we come up with an
> OSGi-like modular system or customize the class loader behavior, which is
> tricky. If we are getting this issue with the basic Airavata components,
> then we should fix it and use one version across all the components. If
> different pluggable implementations use different versions, then both can't
> work together, as only one version will be class loaded; this is a known
> restriction. Anyway, I am not suggesting one bulky Maven module; we
> should have some level of categorization. If an interface has different
> implementations and those are distinct enough, then having different
> modules makes sense.
>
> Definitely we must remove implementation classes from the current core
> modules. With the multiplexed Thrift support which Suresh is working on, it
> makes sense to me to have all services in one Maven module called "service".
> Anyway, let's think about other alternatives too.
>
> Thanks,
> Shameera.
>
> On Mon, Jun 1, 2015 at 1:25 PM, Raminderjeet Singh <
> raminderjsingh@gmail.com> wrote:
>
>> Hi Shameera,
>>
>> I think we are trying to deal with 2 problems: code manageability/usage
>> and distribution.
>>
>> In GFAC, I noticed that there is a dependencies on modules, which should
>> not exist like GFAC's GSISSH module dependent on SSH module and both the
>> modules depending on GSISSH library. Reason of such depend is duplication
>> of utility methods (see GFACSSHUtils and GFACGSISSHUtils for duplicate
>> code) in both of these modules and we need to fix it. I think there are
>> similar examples in different modules. In my opinion, if 60% of the code is
>> duplicated, we need to merge the modules into one. In other words, we
>> should only create a module when it's needed. Local, SSH and GSISSH should
>> not be separate modules as they have a lot in common. Just to give you some
>> background on GFAC modules, they were created for the purpose of having
>> different flavors of GFAC. This was mainly done to fix the problem of jar
>> dependencies (different security dependencies) of GFAC modules in a JVM. It
>> was affecting the modules' ability to work together. An example in the past was,
>> Unicore, GRAM and GSISSH modules did not work together so we had to spend
>> time to inspect distribution to find which runtime dependency is causing
>> problem and fix it. With the current design we can create individual
>> version of GFAC. We still need to enhance GFAC service to have a flavor of
>> GFAC registered with type and route the jobs. So we need some of this
>> flavoring support in GFAC.
>>
>> I agree with you that we have too many core modules and in core
>> modules like GFAC core, we have implementations (e.g. BetterGFACImpl). We
>> should move implementations to Gfac service. Lets talk about pros and cons
>> about having a single core and then we can decide how to proceed. Currently
>> airavata does not provide a single view to all of its functionality. API
>> server was designed to do that but its also overloaded with lot of
>> implementation details. If I am following your advice right, having a
>> common airavata core will definitely help developer to think about airavata
>> as a system and design new components with a system perspective, so +1 for
>> something like this. The only drawback is taking away some flexibility from
>> a component developer, which is anyhow good from the Airavata system point of
>> view. Before we jump to a conclusion, we need to evaluate how it will
>> work with our thrift services design.
>>
>> Thanks
>> Raminder
>>
>>
>>
>> On Sat, May 30, 2015 at 11:55 PM, Shameera Rathnayaka <
>> shameerainfo@gmail.com> wrote:
>>
>>> Hi Suresh,
>>>
>>> You are thinking from a deployment perspective while I am thinking about
>>> the dependency issue. With my suggestion, each component distribution will
>>> increase by less than 1MB, because only the interfaces are in the
>>> core. And at runtime those interfaces will not be loaded. Considering the
>>> trouble we are having at development time and the code maintenance issues,
>>> I think we can bear with that 1MB.
>>>
>>> Thanks,
>>> Shameera.
>>>
>>> On Sat, May 30, 2015 at 10:45 PM, Suresh Marru <sm...@apache.org>
>>> wrote:
>>>
>>>> Shameera,
>>>>
>>>> Every component has its own thrift service interface (registry and
>>>> messaging are exceptions). Every component will need to have a dependency
>>>> on the airavata data models (which include util classes) and probably registry
>>>> and messaging. If a component A needs to invoke component B via an RPC call,
>>>> then it just needs to include component B's thrift client. If the
>>>> communication is through work queues then there is no dependency between
>>>> them. Can you describe what you want to propose in this context?
>>>>
>>>> Suresh
>>>>
>>>> On May 30, 2015, at 8:36 PM, Shameera Rathnayaka <
>>>> shameerainfo@gmail.com> wrote:
>>>>
>>>> Hi Suresh,
>>>>
>>>>>
>>>>> Spark is not the right comparison for this discussion. I have been a
>>>>> spark incubation mentor and have been following the code organization since
>>>>> its early days. All of the spark components you mentions rely on the core.
>>>>>
>>>>
>>>> Yes, that is what I am highlighting: the components don't each have their
>>>> own core module holding the interfaces. The Spark SQL module has a core
>>>> submodule, but all interfaces reside in the main core module.
>>>>
>>>>
>>>>> Let me step back and ask, what is the problem you are trying to solve?
>>>>> We sure need to cleanup modules and it is time to re-look at the component
>>>>> organization. But what do you want to really achieve by combining all the
>>>>> core into monolithic components. It took an effort to cleanly separate
>>>>> functionality so they can evolve and can be improved independently.
>>>>>
>>>>
>>>>
>>>> I am not suggesting to go back to the monolithic core which has all
>>>> implementations and interfaces bundled together. What I am saying is to
>>>> have the core interfaces all together; this will give us a clear module
>>>> dependency graph. It is like having a dependency graph with one root
>>>> instead of multiple roots. As a developer it is cumbersome and hard to deal
>>>> with module dependency issues. Having too fine-grained core modules
>>>> eventually introduces a wrong dependency graph, and it will prevent us from
>>>> following proper design patterns in our code base. If we avoid design
>>>> patterns, it will require more time to find bugs and maintain the code.
>>>> This is what I am trying to resolve. I have first-hand experience with the
>>>> current Airavata code. If you look at the current module dependency graphs,
>>>> you will understand why I am making such noise to resolve this.
>>>>
>>>>
>>>>> So why do we make our repository bulky with modules if it doesn't
>>>>> provide any considerable advantage?
>>>>>
>>>>> If we really want a separate distribution bundle for each component
>>>>> ( apiServer , Orchestrator and Gfac ), let's use different bin.xml
>>>>> files to do it instead of using different modules. But the reality is
>>>>> we only use the all-in-one distribution.
>>>>>
>>>>>
>>>>> Once the monitoring is fixed to use messaging, we really need to
>>>>> decouple the component deployments. Yes there is a considerable advantage.
>>>>> Each components has different quality of service requirements. A production
>>>>> platform has to load balance and scale horizontally. And thats different
>>>>> for different component. The all in one bundle has 300+ jars, but API
>>>>> server and orchestrator when independent will have around 50 or so jars.
>>>>> When I want to deploy api server and orchestrator there is a significant
>>>>> difference between small lightweight components and one monolithic core.
>>>>>
>>>>> Another problem is component evolution. Lets say there is a production
>>>>> deployment running 1.1.4 version. Lets say the single job execution is
>>>>> stable enough and there is a 6 month focused effort on workflow. Say the
>>>>> master moves to 1.2.8 with all the changes to workflow and only few to
>>>>> single application execution. We can more comfortably upgrade if they are
>>>>> cleanly separated modules. But if it is one core and so many changes to it
>>>>> (even though technically they are to different classes, the perception will
>>>>> remain), the upgrades will get behind.
>>>>>
>>>>> Bottom line I am + 1 for cleaning up the modules. Past few years we
>>>>> have been moving towards micro service architectures and your suggestions
>>>>> will reverse this back to monolithic architecture. I am -1 for this change
>>>>> in direction.
>>>>>
>>>>
>>>>
>>>> By looking at your miniature component suggestion above, it has 30+
>>>> modules. Do you think we really need this much categorization? In my
>>>> industry experience, I have seen the number of modules in a project always
>>>> increase with time. Hence, if we start with 30+ we will come to 40+ and
>>>> then 50+ and so on. Why do we make this complicated?
>>>>
>>>> Thanks,
>>>> Shameera.
>>>>
>>>>
>>>>
>>>>>
>>>>> Suresh
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Shameera.
>>>>>
>>>>>
>>>>>>
>>>>>> Suresh
>>>>>>
>>>>>> On May 29, 2015, at 11:29 AM, Suresh Marru <sm...@apache.org> wrote:
>>>>>>
>>>>>> + 1.
>>>>>>
>>>>>> I was planning to bring up this issue also. Probably it will not
>>>>>> address what you are raising, but here is a tree output from airavata labs
>>>>>> code I was toying with locally. I did not yet compare it with what you
>>>>>> proposed, I will do so later today.
>>>>>>
>>>>>> ├── airavata-api
>>>>>> │   ├── airavata-api-interface-descriptions
>>>>>> │   ├── airavata-api-java-stubs
>>>>>> │   ├── airavata-api-server
>>>>>> │   ├── airavata-data-models
>>>>>> │   ├── api-security-manager
>>>>>> ├── clients
>>>>>> │   ├── airavata-client-cpp-sdk
>>>>>> │   ├── airavata-client-java-sdk
>>>>>> │   ├── airavata-client-php-sdk
>>>>>> │   ├── airavata-client-python-sdk
>>>>>> │   ├── airavata-sample-examples
>>>>>> │   └── airavata-xbaya-gui
>>>>>> ├── components
>>>>>> │   ├── commons
>>>>>> │   ├── component-interface-descriptions
>>>>>> │   ├── component-services
>>>>>> │   │   ├── credential-store-service
>>>>>> │   │   ├── orchestrator-service
>>>>>> │   │   ├── task-executor-service
>>>>>> │   │   └── workflow-interpreter-service
>>>>>> │   ├── component-clients
>>>>>> │   │   ├── credential-store-client
>>>>>> │   │   ├── orchestrator-client
>>>>>> │   │   ├── task-executor-client
>>>>>> │   │   ├── workflow-interpreter-client
>>>>>> │   │   └── messaging
>>>>>> │   ├── task-adaptors
>>>>>> │   │   ├── compute
>>>>>> │   │   └── data-movement
>>>>>> │   ├── registry
>>>>>> │   │   ├── app-catalog
>>>>>> │   │   ├── experiment-catalog
>>>>>> │   │   └── resource-catalog
>>>>>> │   └── workflow-interpreter
>>>>>> ├── distribution
>>>>>> ├── integration-tests
>>>>>>
>>>>>>
>>>>>>
>>>>>> On May 29, 2015, at 10:15 AM, Shameera Rathnayaka <
>>>>>> shameera@apache.org> wrote:
>>>>>>
>>>>>> Hi Devs,
>>>>>>
>>>>>> We are using different modules to package different types of
>>>>>> functionality, which helps us maintain loosely coupled code. The project
>>>>>> now has 49 leaf modules ( one short of a half century :) ). If we allow the
>>>>>> project to grow this way, having too fine-grained modules will be a huge
>>>>>> headache in the future. IMO we should clean this up ASAP before it becomes
>>>>>> a real mess. Actually we are halfway there already: I experienced cyclic
>>>>>> dependency issues when I was writing the workflow implementation and email
>>>>>> monitoring. Please see the modules in the current repo below.
>>>>>>
>>>>>> <module-name> ( <num of child modules> )
>>>>>>
>>>>>> modules  ( 43 )
>>>>>>      app-catalog ( 2 )
>>>>>>      commons ( 1 )
>>>>>>      configurations ( 2 )
>>>>>>      credential-store ( 3 )
>>>>>>      distribution ( 8 )
>>>>>>      gfac ( 10 )
>>>>>>      integration test ( 1 )
>>>>>>      messaging ( 2 )
>>>>>>      orchestrator ( 3 )
>>>>>>      registry ( 3 )
>>>>>>      security ( 1 )
>>>>>>      server ( 1 )
>>>>>>      test-suit ( 1 )
>>>>>>      workflow ( 1 )
>>>>>>      workflow-modal ( 3 )
>>>>>>      xbaya ( 1 )
>>>>>> airavata-api ( 5 )
>>>>>> tools ( 1 )
>>>>>>
>>>>>> Most of the current modules have interfaces and implementations
>>>>>> together, but this violates our main goal, which is to reduce inter-module
>>>>>> dependencies. Following is what I am suggesting, WDYS?
>>>>>>
>>>>>> core { has all core interfaces and basic classes of gfac-core ,
>>>>>> orchestrator-core , message-core , monitor core, registry core,
>>>>>> workflow-core}
>>>>>> service - all thrift services and service handlers
>>>>>> orchestrator - orchestrator specific classes
>>>>>> gfac
>>>>>>      SSH
>>>>>>      BES
>>>>>>      Local
>>>>>> message - amqp implementation
>>>>>> distribution
>>>>>>      XBaya
>>>>>>      server - { use different mode input to start server as
>>>>>> orchestrator , Gfac or/and api-server }
>>>>>> commons
>>>>>> registry
>>>>>> app-catalog
>>>>>> security
>>>>>> Workflow
>>>>>> XBaya-gui
>>>>>> Integration-test
>>>>>>
>>>>>> Thanks,
>>>>>> Shameera.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Shameera Rathnayaka.
>>>>>
>>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>>
>>>>>
>>>>>



-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/

Re: Too Many Leaf Modules.

Posted by Shameera Rathnayaka <sh...@gmail.com>.
Hi Raminder,

Different implementations can depend on different versions of the same jar.
As we are using native Java class loading, different versions of a class
cannot be loaded and live in one runtime, unless we come up with an
OSGi-like modular system or customize the class loader behavior, which is
tricky. If we are getting this issue with the basic Airavata components,
then we should fix it and use one version across all the components. If
different pluggable implementations use different versions, then both can't
work together, as only one version will be class loaded; this is a known
restriction. Anyway, I am not suggesting one bulky Maven module; we should
have some level of categorization. If an interface has different
implementations and those are distinct enough, then having different
modules makes sense.
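
The class loading restriction can be seen in a small experiment. The sketch
below is an illustration, not Airavata code. It assumes Java 9 or later (for
InputStream.readAllBytes) and that the compiled .class files are readable as
resources on the classpath; the names ClassLoaderIsolationSketch, Plugin and
IsolatingLoader are hypothetical. It compares asking one loader twice for
the same class name with asking two custom loaders that each define their
own copy under that name, which is the kind of isolation an OSGi-like system
provides by giving each bundle its own class loader.

```java
import java.io.IOException;
import java.io.InputStream;

final class ClassLoaderIsolationSketch {

    /** Stand-in for a class that would ship inside some library jar. */
    public static class Plugin { }

    /** Defines its own copy of one named class instead of delegating to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // usual parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        // Reads the class file bytes through the parent loader's resource lookup.
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** One loader: the same name always resolves to the same Class object. */
    static boolean sameClassFromOneLoader() {
        try {
            ClassLoader app = ClassLoaderIsolationSketch.class.getClassLoader();
            String name = Plugin.class.getName();
            return app.loadClass(name) == app.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Two isolating loaders: same name, but two distinct Class objects. */
    static boolean distinctClassesFromTwoLoaders() {
        try {
            ClassLoader app = ClassLoaderIsolationSketch.class.getClassLoader();
            String name = Plugin.class.getName();
            Class<?> first = new IsolatingLoader(app, name).loadClass(name);
            Class<?> second = new IsolatingLoader(app, name).loadClass(name);
            return first != second && first.getName().equals(second.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("one loader, same class: " + sameClassFromOneLoader());
        System.out.println("two loaders, distinct classes: " + distinctClassesFromTwoLoaders());
    }
}
```

With a single flat classpath there is only the first case, so two versions
of the same jar cannot both be visible; the second case is what a custom
class loader setup would have to manage for every pluggable implementation.
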

Definitely we must remove implementation classes from the current core
modules. With the multiplexed Thrift support which Suresh is working on, it
makes sense to me to have all services in one Maven module called "service".
Anyway, let's think about other alternatives too.

Thanks,
Shameera.

On Mon, Jun 1, 2015 at 1:25 PM, Raminderjeet Singh <raminderjsingh@gmail.com
> wrote:

> Hi Shameera,
>
> I think we are trying to deal with 2 problems: code manageability/usage
> and distribution.
>
> In GFAC, I noticed that there is a dependencies on modules, which should
> not exist like GFAC's GSISSH module dependent on SSH module and both the
> modules depending on GSISSH library. Reason of such depend is duplication
> of utility methods (see GFACSSHUtils and GFACGSISSHUtils for duplicate
> code) in both of these modules and we need to fix it. I think there are
> similar examples in different modules. In my opinion, if 60% of the code is
> duplicated, we need to merge the modules into one. In other words, we
> should only create a module when it's needed. Local, SSH and GSISSH should
> not be separate modules as they have a lot in common. Just to give you some
> background on GFAC modules, they were created for the purpose of having
> different flavors of GFAC. This was mainly done to fix the problem of jar
> dependencies (different security dependencies) of GFAC modules in a JVM. It
> was affecting the modules' ability to work together. An example in the past was,
> Unicore, GRAM and GSISSH modules did not work together so we had to spend
> time to inspect distribution to find which runtime dependency is causing
> problem and fix it. With the current design we can create individual
> version of GFAC. We still need to enhance GFAC service to have a flavor of
> GFAC registered with type and route the jobs. So we need some of this
> flavoring support in GFAC.
>
> I agree with you that we have too many core modules and in core
> modules like GFAC core, we have implementations (e.g. BetterGFACImpl). We
> should move implementations to Gfac service. Lets talk about pros and cons
> about having a single core and then we can decide how to proceed. Currently
> airavata does not provide a single view to all of its functionality. API
> server was designed to do that but its also overloaded with lot of
> implementation details. If I am following your advice right, having a
> common airavata core will definitely help developer to think about airavata
> as a system and design new components with a system perspective, so +1 for
> something like this. The only drawback is taking away some flexibility from
> a component developer, which is anyhow good from the Airavata system point of
> view. Before we jump to a conclusion, we need to evaluate how it will
> work with our thrift services design.
>
> Thanks
> Raminder
>
>
>
> On Sat, May 30, 2015 at 11:55 PM, Shameera Rathnayaka <
> shameerainfo@gmail.com> wrote:
>
>> Hi Suresh,
>>
>> You are thinking from a deployment perspective while I am thinking about
>> the dependency issue. With my suggestion, each component distribution will
>> increase by less than 1MB, because only the interfaces are in the
>> core. And at runtime those interfaces will not be loaded. Considering the
>> trouble we are having at development time and the code maintenance issues,
>> I think we can bear with that 1MB.
>>
>> Thanks,
>> Shameera.
>>
>> On Sat, May 30, 2015 at 10:45 PM, Suresh Marru <sm...@apache.org> wrote:
>>
>>> Shameera,
>>>
>>> Every component has its own thrift service interface (registry and
>>> messaging are exceptions). Every component will need to have a dependency
>>> on the airavata data models (which include util classes) and probably registry
>>> and messaging. If a component A needs to invoke component B via an RPC call,
>>> then it just needs to include component B's thrift client. If the
>>> communication is through work queues then there is no dependency between
>>> them. Can you describe what you want to propose in this context?
>>>
>>> Suresh
>>>
>>> On May 30, 2015, at 8:36 PM, Shameera Rathnayaka <sh...@gmail.com>
>>> wrote:
>>>
>>> Hi Suresh,
>>>
>>>>
>>>> Spark is not the right comparison for this discussion. I have been a
>>>> spark incubation mentor and have been following the code organization since
>>>> its early days. All of the spark components you mentions rely on the core.
>>>>
>>>
>>> Yes, that is what I am highlighting: the components don't each have their
>>> own core module holding the interfaces. The Spark SQL module has a core
>>> submodule, but all interfaces reside in the main core module.
>>>
>>>
>>>> Let me step back and ask, what is the problem you are trying to solve?
>>>> We sure need to cleanup modules and it is time to re-look at the component
>>>> organization. But what do you want to really achieve by combining all the
>>>> core into monolithic components. It took an effort to cleanly separate
>>>> functionality so they can evolve and can be improved independently.
>>>>
>>>
>>>
>>> I am not suggesting to go back to the monolithic core which has all
>>> implementations and interfaces bundled together. What I am saying is to
>>> have the core interfaces all together; this will give us a clear module
>>> dependency graph. It is like having a dependency graph with one root
>>> instead of multiple roots. As a developer it is cumbersome and hard to deal
>>> with module dependency issues. Having too fine-grained core modules
>>> eventually introduces a wrong dependency graph, and it will prevent us from
>>> following proper design patterns in our code base. If we avoid design
>>> patterns, it will require more time to find bugs and maintain the code.
>>> This is what I am trying to resolve. I have first-hand experience with the
>>> current Airavata code. If you look at the current module dependency graphs,
>>> you will understand why I am making such noise to resolve this.
>>>
>>>
>>>> So why do we make our repository bulky with modules if it doesn't
>>>> provide any considerable advantage?
>>>>
>>>> If we really want a separate distribution bundle for each component
>>>> ( apiServer , Orchestrator and Gfac ), let's use different bin.xml
>>>> files to do it instead of using different modules. But the reality is
>>>> we only use the all-in-one distribution.
>>>>
>>>>
>>>> Once the monitoring is fixed to use messaging, we really need to
>>>> decouple the component deployments. Yes there is a considerable advantage.
>>>> Each components has different quality of service requirements. A production
>>>> platform has to load balance and scale horizontally. And thats different
>>>> for different component. The all in one bundle has 300+ jars, but API
>>>> server and orchestrator when independent will have around 50 or so jars.
>>>> When I want to deploy api server and orchestrator there is a significant
>>>> difference between small lightweight components and one monolithic core.
>>>>
>>>> Another problem is component evolution. Lets say there is a production
>>>> deployment running 1.1.4 version. Lets say the single job execution is
>>>> stable enough and there is a 6 month focused effort on workflow. Say the
>>>> master moves to 1.2.8 with all the changes to workflow and only few to
>>>> single application execution. We can more comfortably upgrade if they are
>>>> cleanly separated modules. But if it is one core and so many changes to it
>>>> (even though technically they are to different classes, the perception will
>>>> remain), the upgrades will get behind.
>>>>
>>>> Bottom line I am + 1 for cleaning up the modules. Past few years we
>>>> have been moving towards micro service architectures and your suggestions
>>>> will reverse this back to monolithic architecture. I am -1 for this change
>>>> in direction.
>>>>
>>>
>>>
>>> By looking at your miniature component suggestion above, it has 30+
>>> modules. Do you think we really need this much categorization? In my
>>> industry experience, I have seen the number of modules in a project always
>>> increase with time. Hence, if we start with 30+ we will come to 40+ and
>>> then 50+ and so on. Why do we make this complicated?
>>>
>>> Thanks,
>>> Shameera.
>>>
>>>
>>>
>>>>
>>>> Suresh
>>>>
>>>>
>>>> Thanks,
>>>> Shameera.
>>>>
>>>>
>>>>>
>>>>> Suresh
>>>>>
>>>>> On May 29, 2015, at 11:29 AM, Suresh Marru <sm...@apache.org> wrote:
>>>>>
>>>>> + 1.
>>>>>
>>>>> I was planning to bring up this issue also. Probably it will not
>>>>> address what you are raising, but here is a tree output from airavata labs
>>>>> code I was toying with locally. I did not yet compare it with what you
>>>>> proposed, I will do so later today.
>>>>>
>>>>> ├── airavata-api
>>>>> │   ├── airavata-api-interface-descriptions
>>>>> │   ├── airavata-api-java-stubs
>>>>> │   ├── airavata-api-server
>>>>> │   ├── airavata-data-models
>>>>> │   ├── api-security-manager
>>>>> ├── clients
>>>>> │   ├── airavata-client-cpp-sdk
>>>>> │   ├── airavata-client-java-sdk
>>>>> │   ├── airavata-client-php-sdk
>>>>> │   ├── airavata-client-python-sdk
>>>>> │   ├── airavata-sample-examples
>>>>> │   └── airavata-xbaya-gui
>>>>> ├── components
>>>>> │   ├── commons
>>>>> │   ├── component-interface-descriptions
>>>>> │   ├── component-services
>>>>> │   │   ├── credential-store-service
>>>>> │   │   ├── orchestrator-service
>>>>> │   │   ├── task-executor-service
>>>>> │   │   └── workflow-interpreter-service
>>>>> │   ├── component-clients
>>>>> │   │   ├── credential-store-client
>>>>> │   │   ├── orchestrator-client
>>>>> │   │   ├── task-executor-client
>>>>> │   │   ├── workflow-interpreter-client
>>>>> │   │   └── messaging
>>>>> │   ├── task-adaptors
>>>>> │   │   ├── compute
>>>>> │   │   └── data-movement
>>>>> │   ├── registry
>>>>> │   │   ├── app-catalog
>>>>> │   │   ├── experiment-catalog
>>>>> │   │   └── resource-catalog
>>>>> │   └── workflow-interpreter
>>>>> ├── distribution
>>>>> └── integration-tests
>>>>>
>>>>>
>>>>>
>>>>> On May 29, 2015, at 10:15 AM, Shameera Rathnayaka <sh...@apache.org>
>>>>> wrote:
>>>>>
>>>>> Hi Devs,
>>>>>
>>>>> We use separate modules to package different kinds of functionality,
>>>>> which helps us keep the code loosely coupled. The project now has 49 leaf
>>>>> modules (one short of a half century :) ). If we allow the project to keep
>>>>> growing this way, having such fine-grained modules will be a huge headache
>>>>> in the future. IMO we should clean this up ASAP before it becomes a real
>>>>> mess. Actually, we are halfway there already: I ran into cyclic dependency
>>>>> issues when I was writing the workflow implementation and email
>>>>> monitoring. Please see the modules in the current repo below.
>>>>>
>>>>> <module-name> ( <num of child modules> )
>>>>>
>>>>> modules  ( 43 )
>>>>>      app-catalog ( 2 )
>>>>>      commons ( 1 )
>>>>>      configurations ( 2 )
>>>>>      credential-store ( 3 )
>>>>>      distribution ( 8 )
>>>>>      gfac ( 10 )
>>>>>      integration test ( 1 )
>>>>>      messaging ( 2 )
>>>>>      orchestrator ( 3 )
>>>>>      registry ( 3 )
>>>>>      security ( 1 )
>>>>>      server ( 1 )
>>>>>      test-suit ( 1 )
>>>>>      workflow ( 1 )
>>>>>      workflow-modal ( 3 )
>>>>>      xbaya ( 1 )
>>>>> airavata-api ( 5 )
>>>>> tools ( 1 )
>>>>>
>>>>> Most of the current modules bundle interfaces and implementations
>>>>> together, but this violates our main goal, which is to reduce inter-module
>>>>> dependencies. Here is what I am suggesting, WDYS?
>>>>>
>>>>> core - { has all core interfaces and basic classes of gfac-core,
>>>>> orchestrator-core, message-core, monitor core, registry core,
>>>>> workflow-core }
>>>>> service - all thrift services and service handlers
>>>>> orchestrator - orchestrator specific classes
>>>>> gfac
>>>>>      SSH
>>>>>      BES
>>>>>      Local
>>>>> message - AMQP implementation
>>>>> distribution
>>>>>      XBaya
>>>>>      server - { use a mode input to start the server as
>>>>> orchestrator, Gfac, and/or api-server }
>>>>> commons
>>>>> registry
>>>>> app-catalog
>>>>> security
>>>>> Workflow
>>>>> XBaya-gui
>>>>> Integration-test
>>>>>
>>>>> Thanks,
>>>>> Shameera.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Shameera Rathnayaka.
>>>>
>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>
>>>>
>>>>
>
>
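A side note on the cyclic dependency issues mentioned in the original post: they can be caught mechanically. Below is a minimal sketch in Python, assuming the inter-module dependencies have already been exported into a plain dictionary (the export step from the build files is not shown). The function name find_cycle, the module names, and the edges in both example graphs are hypothetical illustrations and are not taken from the actual Airavata build.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of module names, or None.

    `deps` maps a module name to the modules it depends on.
    Uses a depth-first search with three node states.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {module: UNVISITED for module in deps}
    path: List[str] = []  # modules on the current DFS path

    def visit(module: str) -> Optional[List[str]]:
        state[module] = IN_PROGRESS
        path.append(module)
        for dep in deps.get(module, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path, so we closed a loop
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[module] = DONE
        return None

    for module in list(deps):
        if state[module] == UNVISITED:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


# Hypothetical graph where implementations depend on each other directly.
tangled = {
    "workflow-core": ["gfac-core"],
    "gfac-core": ["monitor"],
    "monitor": ["workflow-core"],
}

# Hypothetical graph after moving shared interfaces into a single "core".
layered = {
    "core": [],
    "gfac": ["core"],
    "orchestrator": ["core"],
    "workflow": ["core"],
}

print(find_cycle(tangled))  # ['workflow-core', 'gfac-core', 'monitor', 'workflow-core']
print(find_cycle(layered))  # None
```

The second graph illustrates the point of the "core" proposal: when every module depends only on shared interfaces, there is no path back to the starting module, so no cycle can form. Whether one core or several smaller interface modules is the better trade-off is the open question in this thread; the check works the same either way.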


-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/