Posted to dev@apex.apache.org by Andy Perlitch <an...@datatorrent.com> on 2015/09/17 20:01:37 UTC

More sensible modules/artifacts in malhar

Hi everyone,

I am currently assigned to MLHR-1843
<https://malhar.atlassian.net/browse/MLHR-1843>, which essentially aims to
expose smaller, more consumable maven artifacts that would do away with the
need to manually include necessary dependencies based on the operators in
use.

As an example, say I am building an app package that needs Kafka input and
output operators, but I don't want all the other transitive dependencies
that come via malhar-contrib. Currently I would need to specify
malhar-contrib as a dependency, and add an exclusions block in my app
package pom:

    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib</artifactId>
      <version>3.0.0</version>
      <!-- so none of malhar-contrib's deps are included -->
      <exclusions>
        <exclusion>
          <groupId>*</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

Then, I would have to include the kafka library explicitly as a dependency:

    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>

Wouldn't it be nice if I could just put this in my pom?

    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-kafka</artifactId>
      <version>3.0.0</version>
    </dependency>


In order to make this possible, we will need to organize the malhar project
into more granular modules (artifacts). Specifically, the malhar-contrib
artifact would essentially just be a pom that specifies each smaller module
as a dependency:

    <!-- in malhar-contrib's pom.xml: -->

    <modules>
      <module>kafka</module>
      <module>twitter</module>
      <module>redis</module>
      <!-- other smaller modules -->
    </modules>

    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-kafka</artifactId>
      <version>3.0.0</version>
    </dependency>

    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-twitter</artifactId>
      <version>3.0.0</version>
    </dependency>

    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-redis</artifactId>
      <version>3.0.0</version>
    </dependency>
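
For illustration, here is a minimal sketch of what the pom of one of the smaller modules might look like. The file location (contrib/kafka/pom.xml), the parent artifactId "malhar", and the relativePath are assumptions made for the example and are not taken from the repo. The parent is assumed to be the top-level malhar pom rather than malhar-contrib, because a module that inherits from a pom which also lists it as a dependency would end up depending on itself.

```xml
<!-- hypothetical contrib/kafka/pom.xml; a sketch, not the actual file -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.datatorrent</groupId>
    <!-- assumed artifactId of the top-level malhar pom -->
    <artifactId>malhar</artifactId>
    <version>3.0.0</version>
    <!-- assumed layout: this module sits two directories below the root -->
    <relativePath>../../pom.xml</relativePath>
  </parent>

  <!-- groupId and version are inherited from the parent -->
  <artifactId>malhar-contrib-kafka</artifactId>
  <packaging>jar</packaging>

  <dependencies>
    <!-- a regular (non-optional) dependency, so it reaches the
         application transitively without any extra pom edits -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>
  </dependencies>
</project>
```

With a module like this, the single malhar-contrib-kafka dependency shown earlier would pull in the kafka library on its own.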

With these changes, there may be a risk of breaking backwards
compatibility; however, I think the gain in usability of malhar merits the
effort to make this work.

I am still relatively new to Maven, so I would love to get some feedback
from other devs about this!

-- 
Regards,
Andy Perlitch
Software Engineer
DataTorrent Inc
(408)829-9319

Re: More sensible modules/artifacts in malhar

Posted by Vlad Rozov <v....@datatorrent.com>.
As part of the change, I recommend separating the benchmark application from
operator certification and moving the benchmark to Apex core.

Thank you,

Vlad

On 9/29/15 00:19, Andy Perlitch wrote:
> Hi all,
>
> This is a first cut at a plan to restructure malhar in a way that is more
> portable and adherent to Maven's principles of modularity and dependency
> management.
>
> Overview of Current Malhar Architecture
> ---------------------------------------------------------------
> The current malhar repo consists of several maven modules:
>
> * *malhar-library*
>     operators which do not require additional transitive dependencies beyond
> what Apex and Hadoop require
> *  *malhar-contrib*
>     operators requiring other maven dependencies
> * *malhar-demos*
>     demo applications
> * *malhar-samples*
>     sample code showing example usage of malhar operators
> * *malhar-apps*
>     apex applications (currently only logstream)
>
>
> Proposed Changes
> ---------------------------------------------------------------
>
> 1. *Scrub malhar-library for any operators needing additional dependencies*
>    `malhar-library` is intended to consist of only operators without extra
> transitive dependencies. All operators should be checked for the necessity
> of extra dependencies.
>
> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
> library if prudent)*
>      There are various operators in both of these modules that are general
> enough to move into library or contrib.
>
> 3. *Create modules for all contrib subfolders*
>      All folders under `contrib/src/main/com/datatorrent/contrib/` should be
> converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>      Additionally, each of these smaller contrib modules will have its own
> version and dependencies.
>
> 4. *Use the Shade Plugin to allow for backwards-compatible fully-qualified
> class names*
>      This is made possible by the Shade plugin's class relocation
> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>
> feature. This might be a bit error-prone as well as confusing for outside
> developers to use, but it must be done if these changes are to be made
> prior to a major release.
>
>
>
> Let me know what you all think of this approach.
>
> Best,
> Andy
>
>
> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <ch...@datatorrent.com>
> wrote:
>
>> +1
>>
>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <ga...@datatorrent.com>
>> wrote:
>>
>>> I agree with David. Each artifact should have its own version.
>>>
>>> Thanks
>>> -Gaurav
>>>
>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>>> I actually think that each baby artifact should have its own version,
>>>> because each artifact has its own interface and its own life cycle.
>>>> Especially after we break up the giant library, applications will depend
>>>> on the baby artifacts instead of the giant library. For example, if there
>>>> is no change in malhar-contrib-kafka (I think the name should actually be
>>>> apex-malhar-kafka), we should not confuse users by bumping the version.
>>>>
>>>> David
>>>>
>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <an...@datatorrent.com>
>>>> wrote:
>>>>
>>>>> Tushar,
>>>>>
>>>>> I agree that all modules should inherit the version from the "parent pom"
>>>>> of the malhar repo. I think the benefits outweigh the cost of bumping
>>>>> versions of components that haven't actually changed. I'd love to get
>>>>> others' feedback on this as well.
>>>>>
>>>>> On another note, I plan on starting a spreadsheet/googledoc with the
>>>>> possible groupings of operators into these modules. Stay tuned...
>>>>>
>>>>> -Andy
>>>>>
>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <tushar@datatorrent.com>
>>>>> wrote:
>>>>>
>>>>>> +1 for the general idea
>>>>>>
>>>>>> Are these independent modules going to have independent versions? For
>>>>>> example, if there is no change in the kafka operator between malhar 3.0
>>>>>> and malhar 4.0, will we increment the version of malhar-contrib-kafka to
>>>>>> 4.0? I have learned from my previous project that it is easier to manage
>>>>>> versions if we keep all modules at the same version level for a release,
>>>>>> even if there is no change in a particular module.
>>>>>>
>>>>>> - Tushar.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <tim@datatorrent.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I agree Andy's solution is better, but just for the sake of argument,
>>>>>>> profiles can be inherited from a parent pom. So if the maven archetype
>>>>>>> defines a new project with a parent pom that has the correct profiles
>>>>>>> defined, then the desired profiles can be activated in the pom of the
>>>>>>> new project. It is no more complicated than adding additional
>>>>>>> dependencies to your project.
>>>>>>>
>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <sandesh@datatorrent.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked as optional,
>>>>>>>> so users already have to modify the existing POM to use it in their
>>>>>>>> project. So restructuring should be fine.
>>>>>>>>
>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <chetan@datatorrent.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Profiles are excellent when you are developing malhar-contrib.
>>>>>>>>> Profiles do not work when you are using malhar-contrib. The problem
>>>>>>>>> Andy is trying to solve is the latter. If there is an elegant solution
>>>>>>>>> using profiles which I am missing, please correct me.
>>>>>>>>>
>>>>>>>>> The way Andy suggested is the way many successful projects do it.
>>>>>>>>> Look at Netty as an example.
>>>>>>>>>
>>>>>>>>> +1 for that.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Chetan
>>>>>>>>>
>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <tim@datatorrent.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I think restructuring the project in that way would be the technically
>>>>>>>>>> correct thing to do, but if people are unwilling to accept the change
>>>>>>>>>> in project structure, you could achieve something similar by using
>>>>>>>>>> maven profiles. With profiles the project structure would remain as
>>>>>>>>>> is. Profiles could be added to the malhar pom, and each profile would
>>>>>>>>>> define the dependencies needed for a different type of operator. For
>>>>>>>>>> example, the hbase profile would define the dependencies for the hbase
>>>>>>>>>> operator. Then any project using a malhar library would just activate
>>>>>>>>>> the correct profile in its pom, and the correct dependencies would be
>>>>>>>>>> pulled in.
>>>>>>>>>>
>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
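
Item 4 of the plan quoted above relies on the Shade plugin's class relocation. The fragment below is a minimal sketch of such a configuration, assuming the classes have been moved to a new package and a compatibility jar should expose them under the old name. The new package name org.apache.apex.malhar.kafka is hypothetical; only com.datatorrent.contrib.kafka corresponds to the contrib folder layout mentioned in the thread.

```xml
<!-- goes under <build><plugins> of the module that produces the compatibility jar -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- hypothetical new package the classes were moved to -->
            <pattern>org.apache.apex.malhar.kafka</pattern>
            <!-- old package name that existing applications reference -->
            <shadedPattern>com.datatorrent.contrib.kafka</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites the classes that end up in the shaded jar, so the jar built this way would carry the old fully-qualified names while the sources use the new ones.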


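The profile-based alternative discussed in the quoted thread could look roughly like the sketch below, placed in a parent pom that application projects inherit from. The profile id and the hbase.version property are assumptions made for the example; org.apache.hbase:hbase-client is used only as a plausible stand-in for whatever the hbase operators actually require.

```xml
<!-- in a parent pom inherited by the application project (a sketch) -->
<profiles>
  <profile>
    <!-- assumed profile id -->
    <id>hbase</id>
    <dependencies>
      <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <!-- hbase.version is a hypothetical property defined elsewhere in the pom -->
        <version>${hbase.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

A build would then opt in with something like `mvn package -P hbase`. The profile only takes effect in the project that defines or inherits it, which is the limitation raised in the thread for projects that merely depend on malhar-contrib.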
Re: More sensible modules/artifacts in malhar

Posted by Andy Perlitch <an...@datatorrent.com>.
+1

> On Sep 29, 2015, at 4:46 AM, Thomas Weise <th...@datatorrent.com> wrote:
> 
> I actually think that these changes should be made as part of a major
> release (Malhar only, not engine). Along with other changes to convert to
> Apache package names, purge deprecated operators etc.
> 
> Since Apex core and Malhar will be decoupled going forward, such major
> release can be done without affecting existing users. Since the package
> names change, both major versions can also be used together in the same
> application, no forced upgrade, ability to selectively pick new operators.
> 
> Thoughts?
> 
> 
> 
> 

Re: More sensible modules/artifacts in malhar

Posted by Thomas Weise <th...@datatorrent.com>.
I actually think that these changes should be made as part of a major
release (Malhar only, not engine). Along with other changes to convert to
Apache package names, purge deprecated operators etc.

Since Apex core and Malhar will be decoupled going forward, such major
release can be done without affecting existing users. Since the package
names change, both major versions can also be used together in the same
application, no forced upgrade, ability to selectively pick new operators.

Thoughts?
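
As a rough sketch of what using both major versions side by side might look like in an application pom: the second set of coordinates is hypothetical (the artifact name follows the apex-malhar-kafka suggestion from the thread, while the groupId and version are assumptions). Maven resolves a single version per groupId:artifactId pair, so the new major release would need different coordinates as well as different package names for both to end up on the classpath.

```xml
<dependencies>
  <!-- existing 3.x artifact, classes under com.datatorrent.* -->
  <dependency>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-contrib</artifactId>
    <version>3.0.0</version>
  </dependency>

  <!-- hypothetical new-major artifact using Apache package names -->
  <dependency>
    <groupId>org.apache.apex</groupId>
    <artifactId>apex-malhar-kafka</artifactId>
    <version>4.0.0</version>
  </dependency>
</dependencies>
```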




> > > using
> > > > > > maven
> > > > > > > > > > profiles. With profiles the project structure would
> remain
> > as
> > > > is.
> > > > > > > > > Profiles
> > > > > > > > > > could be added to the malhar pom, and a profile would
> > define
> > > > the
> > > > > > > > > > dependencies needed for different types of operators. For
> > > > example
> > > > > > the
> > > > > > > > > hbase
> > > > > > > > > > profile would define the dependencies for the hbase
> > operator.
> > > > > Then
> > > > > > > any
> > > > > > > > > > project using a malhar library would just activate the
> > > correct
> > > > > > > profile
> > > > > > > > in
> > > > > > > > > > it's pom, and the correct dependencies would be pulled
> in.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > > > > > > andy@datatorrent.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I am currently assigned to MLHR-1843
> > > > > > > > > > > <https://malhar.atlassian.net/browse/MLHR-1843>, which
> > > > > > essentially
> > > > > > > > > aims
> > > > > > > > > > to
> > > > > > > > > > > expose smaller, more consumable maven artifacts that
> > would
> > > do
> > > > > > away
> > > > > > > > with
> > > > > > > > > > the
> > > > > > > > > > > need to manually include necessary dependencies based
> on
> > > the
> > > > > > > > operators
> > > > > > > > > in
> > > > > > > > > > > use.
> > > > > > > > > > >
> > > > > > > > > > > As an example, say I am building an app package that
> > needs
> > > > > Kafka
> > > > > > > > input
> > > > > > > > > > and
> > > > > > > > > > > output operators, but I don't want all the other
> > transitive
> > > > > > > > > dependencies
> > > > > > > > > > > that come via malhar-contrib. Currently I would need to
> > > > specify
> > > > > > > > > > > malhar-contrib as a dependency, and add an exclusions
> > block
> > > > in
> > > > > > my
> > > > > > > > app
> > > > > > > > > > > package pom:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>com.datatorrent</groupId>
> > > > > > > > > > > <artifactId>malhar-contrib</artifactId>
> > > > > <version>3.0.0</version>
> > > > > > > > <!--
> > > > > > > > > > so
> > > > > > > > > > > none of malhar-contrib's deps are included -->*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *  <exclusions>    <exclusion>
> <groupId>*</groupId>
> > > > > > > > > > > <artifactId>*</artifactId>    </exclusion>
> > > > > > > > </exclusions></dependency>*
> > > > > > > > > > >
> > > > > > > > > > > Then, I would have to include the kafka library
> > explicitly
> > > > as a
> > > > > > > > > > dependency:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>org.apache.kafka</groupId>
> > > > > > > > > > > <artifactId>kafka_2.10</artifactId>
> > > > > > > > > > > <version>0.8.1.1</version></dependency>*
> > > > > > > > > > >
> > > > > > > > > > > Wouldn't it be nice if I could just put this in my
> pom?:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>com.datatorrent</groupId>
> > > > > > > > > > > <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > > > > > > <version>3.0.0</version></dependency>*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > In order to make this possible, we will need to
> organize
> > > the
> > > > > > malhar
> > > > > > > > > > project
> > > > > > > > > > > into more granular modules (artifacts). Specifically,
> the
> > > > > > > > > malhar-contrib
> > > > > > > > > > > artifact would essentially just be a pom that specifies
> > > each
> > > > > > > smaller
> > > > > > > > > > module
> > > > > > > > > > > as a dependency:
> > > > > > > > > > >
> > > > > > > > > > > *<!-- in malhar-contrib's pom.xml: -->*
> > > > > > > > > > >
> > > > > > > > > > > *<modules>  <module>kafka</module>*
> > > > > > > > > > > *  <module>twitter</module>*
> > > > > > > > > > > *  <module>redis</module>*
> > > > > > > > > > >
> > > > > > > > > > > *  <!-- other smaller modules --></modules>*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>com.datatorrent</groupId>
> > > > > > > > > > > <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > > > > > > <version>3.0.0</version></dependency>*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>com.datatorrent</groupId>
> > > > > > > > > > > <artifactId>malhar-contrib-twitter</artifactId>
> > > > > > > > > > > <version>3.0.0</version></dependency>*
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > *<dependency>  <groupId>com.datatorrent</groupId>
> > > > > > > > > > > <artifactId>malhar-contrib-redis</artifactId>
> > > > > > > > > > > <version>3.0.0</version></dependency>*
> > > > > > > > > > >
> > > > > > > > > > > With these changes, there may be a risk of breaking
> > > backwards
> > > > > > > > > > > compatibility, however I think the gain in usability of
> > > > malhar
> > > > > > > merits
> > > > > > > > > the
> > > > > > > > > > > effort to make this work.
> > > > > > > > > > >
> > > > > > > > > > > I am still relatively new to maven, so I would love to
> > get
> > > > some
> > > > > > > > > feedback
> > > > > > > > > > > from other devs about this!
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Regards,
> > > > > > > > > > > Andy Perlitch
> > > > > > > > > > > Software Engineer
> > > > > > > > > > > DataTorrent Inc
> > > > > > > > > > > (408)829-9319
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Andy Perlitch
> > > > > Software Engineer
> > > > > DataTorrent Inc
> > > > > (408)829-9319
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Regards,
> Andy Perlitch
> Software Engineer
> DataTorrent Inc
> (408)829-9319
>

Re: More sensible modules/artifacts in malhar

Posted by Thomas Weise <th...@datatorrent.com>.
Yes, as long as the existing package and artifact names don't change.
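
As a rough illustration of the artifact-name half of that condition (a
sketch under assumptions, not a decided design): the first declaration
below is what an existing application pom contains today, using the
coordinates quoted earlier in this thread. The second shows what
consumers would be forced to write if `malhar-contrib` were ever
published only with `pom` packaging, since a dependency with no `<type>`
element resolves a jar. The two declarations are alternatives, and the
version numbers are for illustration only.

```xml
<!-- Existing consumer declaration. With no <type> element,
     Maven resolves the jar artifact for these coordinates. -->
<dependency>
  <groupId>com.datatorrent</groupId>
  <artifactId>malhar-contrib</artifactId>
  <version>3.0.0</version>
</dependency>

<!-- Hypothetical: needed only if malhar-contrib were published as a
     pom-packaging artifact with no jar. Forcing users to make this
     change is the kind of break the condition above rules out. -->
<dependency>
  <groupId>com.datatorrent</groupId>
  <artifactId>malhar-contrib</artifactId>
  <version>3.0.0</version>
  <type>pom</type>
</dependency>
```

If the first form has to keep working unchanged, something must still
publish a jar under the `malhar-contrib` coordinates after the split.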

On Tue, Sep 29, 2015 at 11:09 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Where is number 4 needed? The current contrib package structure could
> remain even after we move the operators into modules, right?
>
> On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <an...@datatorrent.com>
> wrote:
>
> > Hi all,
> >
> > This is a first cut at a plan to restructure malhar in a way that is more
> > portable and adherent to Maven's principles of modularity and dependency
> > management.
> >
> > Overview of Current Malhar Architecture
> > ---------------------------------------------------------------
> > The current malhar repo consists of several maven modules:
> >
> > * *malhar-library*
> >    operators which do not require additional transitive dependencies
> >    beyond what Apex and Hadoop require
> > * *malhar-contrib*
> >    operators requiring other maven dependencies
> > * *malhar-demos*
> >    demo applications
> > * *malhar-samples*
> >    sample code showing example usage of malhar operators
> > * *malhar-apps*
> >    apex applications (currently only logstream)
> >
> >
> > Proposed Changes
> > ---------------------------------------------------------------
> >
> > 1. *Scrub malhar-library for any operators needing additional
> >    dependencies*
> >    `malhar-library` is intended to consist only of operators without extra
> >    transitive dependencies. All operators should be checked for the
> >    necessity of extra dependencies.
> >
> > 2. *Move operators from malhar-demos and malhar-apps into contrib (or
> >    library if prudent)*
> >    There are various operators in both of these modules that are general
> >    enough to move into library or contrib.
> >
> > 3. *Create modules for all contrib subfolders*
> >    All folders under `contrib/src/main/com/datatorrent/contrib/` should
> >    be converted to modules of contrib and listed as such in
> >    `/contrib/pom.xml`. Additionally, each of these smaller contrib
> >    modules will have its own version and dependencies.
> >
> > 4. *Use the Shade Plugin to allow for backwards-compatible
> >    fully-qualified class names*
> >    This is made possible by the Shade Plugin's class relocation feature
> >    <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>.
> >    This might be a bit error prone as well as confusing to use for
> >    outside developers, but it must be done if these changes are to be
> >    made prior to a major release.
> >
> >
> >
> > Let me know what you all think of this approach.
> >
> > Best,
> > Andy
> >
> >
> > > On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <chetan@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <gaurav@datatorrent.com>
> > > > wrote:
> > > >
> > > > > I agree with David. Each artifact should have its own version.
> > > > >
> > > > > Thanks
> > > > > -Gaurav
> > > > >
> > > > > On Tue, Sep 22, 2015 at 11:07 AM, David Yan <da...@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > I actually think that each baby artifact should have its own
> > > > > > version, because each artifact has its own interface and its own
> > > > > > life cycle, especially after we break up the giant library,
> > > > > > applications will depend on the baby artifacts instead of the giant
> > > > > > library.  For example if there is no change in malhar-contrib-kafka
> > > > > > (I think the name should actually be apex-malhar-kafka), we should
> > > > > > not confuse users by bumping the version.
> > > > > >
> > > > > > David
> > > > > >
> > --
> > Regards,
> > Andy Perlitch
> > Software Engineer
> > DataTorrent Inc
> > (408)829-9319
> >
>

Re: More sensible modules/artifacts in malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Where is number 4 needed? The current contrib package structure could
remain even after we move the operators into modules, right?
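
To make the question concrete, here is a minimal sketch of what one
contrib submodule's pom might look like. The coordinates
`com.datatorrent:malhar-contrib-kafka:3.0.0` and the Kafka dependency are
the ones used earlier in this thread; the directory `contrib/kafka` is an
assumption for illustration. Maven does not tie an artifactId to the Java
packages inside the jar, so sources placed under
`contrib/kafka/src/main/java/com/datatorrent/contrib/kafka/` would keep
their existing fully-qualified class names with no relocation step.

```xml
<!-- contrib/kafka/pom.xml (hypothetical location) -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- A <parent> element is left out on purpose: whether modules inherit
       the repo version or carry their own is still open in this thread. -->
  <groupId>com.datatorrent</groupId>
  <artifactId>malhar-contrib-kafka</artifactId>
  <version>3.0.0</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- Not marked optional, so applications that depend on this module
         receive the Kafka client transitively. -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>
  </dependencies>
</project>
```

Under this layout the only thing that changes for users is which artifact
they declare; imports such as `com.datatorrent.contrib.kafka.*` stay as
they are.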

On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <an...@datatorrent.com>
wrote:

> Hi all,
>
> This is a first cut at a plan to restructure malhar in a way that is more
> portable and adherent to Maven's principles of modularity and dependency
> management.
>
> Overview of Current Malhar Architecture
> ---------------------------------------------------------------
> The current malhar repo consists of several maven modules:
>
> * *malhar-library*
>    operators which do not require additional transitive dependencies beyond
> what Apex and Hadoop require
> *  *malhar-contrib*
>    operators requiring other maven dependencies
> * *malhar-demos*
>    demo applications
> * *malhar-samples*
>    sample code showing example usage of malhar operators
> * *malhar-apps*
>    apex applications (currently only logstream)
>
>
> Proposed Changes
> ---------------------------------------------------------------
>
> 1. *Scrub malhar-library for any operators needing additional dependencies*
>   `malhar-library` is intended to consist of only operators without extra
> transitive dependencies. All operators should be checked for the necessity
> of extra dependencies.
>
> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
> library if prudent)*
>     There are various operators in both of these modules that are general
> enough to move into library or contrib.
>
> 3. *Create modules for all contrib subfolders*
>     All folders under `contrib/src/main/com/datatorrent/contrib/` should be
> converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>     Additionally, each of these smaller contrib modules will have its own
> version and dependencies.
>
> 4. *Use the Shade Plugin to allow for backwards-compatible fully-qualified
> class names*
>     This is made possible by the Shade Plugin's class relocation feature
> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>.
> This might be a bit error prone as well as confusing to use for
> outside developers, but it must be done if these changes are to be made
> prior to a major release.
>
>
>
> Let me know what you all think of this approach.
>
> Best,
> Andy
>
>
> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <ch...@datatorrent.com>
> wrote:
>
> > +1
> >
> > On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <ga...@datatorrent.com>
> > wrote:
> >
> > > > I agree with David. Each artifact should have its own version.
> > >
> > > Thanks
> > > -Gaurav
> > >
> > > On Tue, Sep 22, 2015 at 11:07 AM, David Yan <da...@datatorrent.com>
> > wrote:
> > >
> > > > I actually think that each baby artifact should have its own version,
> > > > because each artifact has its own interface and its own life cycle,
> > > > especially after we break up the giant library, applications will
> > > > depend on the baby artifacts instead of the giant library.  For example
> > > > if there is no change in malhar-contrib-kafka (I think the name should
> > > > actually be apex-malhar-kafka), we should not confuse users by bumping
> > > > the version.
> > > >
> > > > David
> > > >
> > > > On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <andy@datatorrent.com
> >
> > > > wrote:
> > > >
> > > > > Tushar,
> > > > >
> > > > > I agree that all modules should inherit the version from the
> "parent
> > > pom"
> > > > > of the malhar repo. I think the benefits outweigh the cost of
> bumping
> > > > > versions of components that haven't actually changed. I'd love to
> get
> > > > > others feedback on this as well.
> > > > >
> > > > > On another note, I plan on starting a spreadsheet/googledoc with
> the
> > > > > possible groupings of operators into these modules. Stay tuned...
> > > > >
> > > > > -Andy
> > > > >
> > > > > On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <
> > > tushar@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > +1 for the general idea
> > > > > >
> > > > > > Does these independent modules going to have independent
> versions?
> > > For
> > > > > > example, if there is no change in kafka operator between malhar
> 3.0
> > > and
> > > > > > malhar 4.0, will we increment version of malhar-contrib-kafka to
> > > 4.0. I
> > > > > > have learned from my previous project that, It is easier to
> manage
> > > > > versions
> > > > > > if we make all modules at same version level for a release, even
> if
> > > > there
> > > > > > is no change in a particular module.
> > > > > >
> > > > > > - Tushar.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <
> > > tim@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I agree Andy's solution is better, but just for the sake of
> > > argument
> > > > > > > profiles can be inherited from a parent pom, so if the maven
> > > > archetype
> > > > > > > defines a new project with a parent pom with the correct
> profiles
> > > > > > defined,
> > > > > > > then the desired profiles can be activated in the pom of the
> new
> > > > > project.
> > > > > > > It is no more complicated than adding additional dependencies
> to
> > > your
> > > > > > > project.
> > > > > > >
> > > > > > > On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <
> > > > > sandesh@datatorrent.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Currently all the dependencies in Malhar-Contrib are marked
> as
> > > > > > optional.
> > > > > > > So
> > > > > > > > users have to already modify the existing POM to use it in
> > their
> > > > > > project.
> > > > > > > > So restructuring should be fine.
> > > > > > > >
> > > > > > > > On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <
> > > > > > chetan@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > The profiles are excellent when you are developing
> > > > malhar-contrib.
> > > > > > > > Profiles
> > > > > > > > > do not work when you are using malhar-contrib. The problem
> > Andy
> > > > is
> > > > > > > > trying
> > > > > > > > > to solve is the later. If there is an elegant solution
> which
> > I
> > > am
> > > > > > > missing
> > > > > > > > > using profiles, please correct me.
> > > > > > > > >
> > > > > > > > > The way Andy suggested is the way many successful projects
> do
> > > it.
> > > > > > Look
> > > > > > > at
> > > > > > > > > Netty as an example.
> > > > > > > > >
> > > > > > > > > +1 for that.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Chetan
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <
> > > > > > tim@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I think restructuring the project in that way would be
> the
> > > > > > > technically
> > > > > > > > > > correct thing to do, but if people are unwilling to
> accept
> > > the
> > > > > > change
> > > > > > > > in
> > > > > > > > > > project structure you could achieve something similar by
> > > using
> > > > > > maven
> > > > > > > > > > profiles. With profiles the project structure would
> remain
> > as
> > > > is.
> > > > > > > > > Profiles
> > > > > > > > > > could be added to the malhar pom, and a profile would
> > define
> > > > the
> > > > > > > > > > dependencies needed for different types of operators. For
> > > > example
> > > > > > the
> > > > > > > > > hbase
> > > > > > > > > > profile would define the dependencies for the hbase
> > operator.
> > > > > Then
> > > > > > > any
> > > > > > > > > > project using a malhar library would just activate the
> > > correct
> > > > > > > profile
> > > > > > > > in
> > > > > > > > > > it's pom, and the correct dependencies would be pulled
> in.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > > > > > > > > >
> > > > > > > > > > On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > > > > > > andy@datatorrent.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > > I am currently assigned to MLHR-1843
> > > > > > > > > > > > <https://malhar.atlassian.net/browse/MLHR-1843>, which essentially aims to
> > > > > > > > > > > > expose smaller, more consumable maven artifacts that would do away with the
> > > > > > > > > > > > need to manually include necessary dependencies based on the operators in
> > > > > > > > > > > > use.
> > > > > > > > > > > >
> > > > > > > > > > > > As an example, say I am building an app package that needs Kafka input and
> > > > > > > > > > > > output operators, but I don't want all the other transitive dependencies
> > > > > > > > > > > > that come via malhar-contrib. Currently I would need to specify
> > > > > > > > > > > > malhar-contrib as a dependency, and add an exclusions block in my app
> > > > > > > > > > > > package pom:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>com.datatorrent</groupId>
> > > > > > > > > > > >   <artifactId>malhar-contrib</artifactId>
> > > > > > > > > > > >   <version>3.0.0</version>
> > > > > > > > > > > >   <!-- so none of malhar-contrib's deps are included -->
> > > > > > > > > > > >   <exclusions>
> > > > > > > > > > > >     <exclusion>
> > > > > > > > > > > >       <groupId>*</groupId>
> > > > > > > > > > > >       <artifactId>*</artifactId>
> > > > > > > > > > > >     </exclusion>
> > > > > > > > > > > >   </exclusions>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > > >
> > > > > > > > > > > > Then, I would have to include the kafka library explicitly as a dependency:
> > > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>org.apache.kafka</groupId>
> > > > > > > > > > > >   <artifactId>kafka_2.10</artifactId>
> > > > > > > > > > > >   <version>0.8.1.1</version>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > > >
> > > > > > > > > > > > Wouldn't it be nice if I could just put this in my pom?:
> > > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>com.datatorrent</groupId>
> > > > > > > > > > > >   <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > > > > > > >   <version>3.0.0</version>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > In order to make this possible, we will need to organize the malhar project
> > > > > > > > > > > > into more granular modules (artifacts). Specifically, the malhar-contrib
> > > > > > > > > > > > artifact would essentially just be a pom that specifies each smaller module
> > > > > > > > > > > > as a dependency:
> > > > > > > > > > > >
> > > > > > > > > > > > <!-- in malhar-contrib's pom.xml: -->
> > > > > > > > > > > >
> > > > > > > > > > > > <modules>
> > > > > > > > > > > >   <module>kafka</module>
> > > > > > > > > > > >   <module>twitter</module>
> > > > > > > > > > > >   <module>redis</module>
> > > > > > > > > > > >   <!-- other smaller modules -->
> > > > > > > > > > > > </modules>
> > > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>com.datatorrent</groupId>
> > > > > > > > > > > >   <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > > > > > > >   <version>3.0.0</version>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>com.datatorrent</groupId>
> > > > > > > > > > > >   <artifactId>malhar-contrib-twitter</artifactId>
> > > > > > > > > > > >   <version>3.0.0</version>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > > >
> > > > > > > > > > > > <dependency>
> > > > > > > > > > > >   <groupId>com.datatorrent</groupId>
> > > > > > > > > > > >   <artifactId>malhar-contrib-redis</artifactId>
> > > > > > > > > > > >   <version>3.0.0</version>
> > > > > > > > > > > > </dependency>
> > > > > > > > > > >
> > > > > > > > > > > > With these changes, there may be a risk of breaking backwards
> > > > > > > > > > > > compatibility; however, I think the gain in usability of malhar merits
> > > > > > > > > > > > the effort to make this work.
> > > > > > > > > > > >
> > > > > > > > > > > > I am still relatively new to maven, so I would love to get some feedback
> > > > > > > > > > > > from other devs about this!
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Regards,
> > > > > > > > > > > Andy Perlitch
> > > > > > > > > > > Software Engineer
> > > > > > > > > > > DataTorrent Inc
> > > > > > > > > > > (408)829-9319
> > > > > > > > > > >

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
I just looked at how Storm, Flink and Spark Streaming do it.  They release
all modules at the same time, probably for that reason.
Let's just stick to releasing all malhar modules together at one time.

David
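
For illustration only, here is a minimal sketch of what "releasing all modules together" could look like in Maven terms. It assumes a parent pom that carries the single repository version and child modules that inherit it through <parent>. The parent artifactId "malhar", the module directory names, and the child artifactId "malhar-kafka" are hypothetical; the groupId and version are borrowed from the examples earlier in this thread. The fence holds two separate files, not one document.

```xml
<!-- File 1 (hypothetical): parent pom.xml at the repository root.
     The one <version> here is the version of every module. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.datatorrent</groupId>
  <artifactId>malhar</artifactId>        <!-- assumed name -->
  <version>3.0.0</version>
  <packaging>pom</packaging>

  <modules>
    <module>kafka</module>               <!-- assumed directory names -->
    <module>hbase</module>
  </modules>
</project>

<!-- File 2 (hypothetical): kafka/pom.xml.
     No <version> or <groupId> of its own: both are inherited from the parent. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar</artifactId>
    <version>3.0.0</version>
  </parent>
  <artifactId>malhar-kafka</artifactId>  <!-- assumed name -->

  <dependencies>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>
  </dependencies>
</project>
```

With a layout like this, one git branch and one tag would cover the whole release, and the version could be bumped for all modules at once from the root, for example with the Versions Maven Plugin (mvn versions:set -DnewVersion=3.0.1).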

On Mon, Dec 28, 2015 at 10:46 AM, Timothy Farkas <ti...@datatorrent.com>
wrote:

> Hi David,
>
> How would the versions of the modules be managed in git if the release
> cycles / patches are independent? The only way I could think of would be to
> have separate branches for each version of each module, and I think that
> would become very difficult to manage.
>
> Thanks,
> Tim

Re: More sensible modules/artifacts in malhar

Posted by Timothy Farkas <ti...@datatorrent.com>.
Hi David,

How would the versions of the modules be managed in git if the release
cycles / patches are independent? The only way I could think of would be to
have separate branches for each version of each module, and I think that
would become very difficult to manage.

Thanks,
Tim

On Mon, Dec 28, 2015 at 10:37 AM, David Yan <da...@datatorrent.com> wrote:

> Releasing the repository is far more involved than releasing individual
> modules.
> If there is a critical bug in one of the modules, should we allow a patch
> version to be released for that module without waiting for the repository
> release?
>
> David
>
> On Wed, Dec 23, 2015 at 7:29 PM, Thomas Weise <th...@datatorrent.com>
> wrote:
>
> > Releases and versions are tied to the repository as a whole, not individual
> > modules within it. I don't think we should change this.
> >
> >
> > On Wed, Dec 23, 2015 at 6:57 PM, David Yan <da...@datatorrent.com>
> wrote:
> >
> > > As I understand it, each artifact will be independent and will have its own
> > > release cycle.
> > >
> > > On Wed, Dec 23, 2015 at 6:50 PM, Pramod Immaneni <pramod@datatorrent.com>
> > > wrote:
> > >
> > > > Wouldn't it also mean that in the near term we would be releasing new
> > > > versions of all the artifacts when there is a new malhar release to be
> > > > made, even though many of them may not have changed?
> > > >
> > > > On Wed, Dec 23, 2015 at 5:23 PM, David Yan <da...@datatorrent.com>
> > > wrote:
> > > >
> > > > > Let's restart the discussion of this topic.
> > > > >
> > > > > We'd like to break malhar into modules, so we can have separate artifacts
> > > > > for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
> > > > > malhar-library.
> > > > > This way users will automatically pull in only the right dependencies,
> > > > > without today's ugly business of optional and excluded dependencies.
> > > > >
> > > > > Also, I propose adding the 3rd party version in the artifact name. For
> > > > > example:
> > > > >
> > > > > malhar-kafka-0.8
> > > > > malhar-kafka-0.9
> > > > >
> > > > > so that we can simultaneously support multiple versions of kafka.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > David
> > > > >
> > > > > On Fri, Oct 2, 2015 at 4:40 PM, David Yan <da...@datatorrent.com>
> > > wrote:
> > > > >
> > > > > > The list of all malhar operators are listed as part of the apidoc
> > > here:
> > > > > > https://www.datatorrent.com/docs/apidocs/index.html
> > > > > > And developers should be able to find the operators they need
> > there.
> > > > > >
> > > > > > But, it's referenced from
> > > > > > https://www.datatorrent.com/product-documentation/ as "Platform
> > API
> > > > > > Reference" so users may have trouble finding it.
> > > > > >
> > > > > > We probably should have a separate javadoc pages for Apex Core
> and
> > > Apex
> > > > > > Malhar and add the links to this page
> > > http://apex.apache.org/docs.html
> > > > > > also.
> > > > > >
> > > > > > David
> > > > > >
> > > > > > On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <
> > > > pramod@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > >> We got to think about how people can find the operators and
> > > > > >> dependencies when bundling the applications. The complain I hear
> > > often
> > > > > >> is that folks can't find the operators they are looking for. We
> > > should
> > > > > >> be careful about how much more work this will add for the user
> to
> > > now
> > > > > >> search and find all the dependencies.
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > >> > On Oct 2, 2015, at 3:44 PM, David Yan <da...@datatorrent.com>
> > > > wrote:
> > > > > >> >
> > > > > >> > I actually don't think it makes sense any more to separate
> > > > > >> malhar-library
> > > > > >> > and malhar-contrib after the breakup, especially since we are
> > > > planning
> > > > > >> for
> > > > > >> > a major release for these changes.
> > > > > >> >
> > > > > >> > People are often confused, myself included, which operators
> > should
> > > > be
> > > > > in
> > > > > >> > malhar-library and which ones should be in contrib.
> Requiring a
> > > > > >> separate
> > > > > >> > setup for unit test should not be a criteria because the user
> of
> > > the
> > > > > >> > library couldn't care less whether the unit test requires
> extra
> > > > setup.
> > > > > >> The
> > > > > >> > factor of requiring extra dependencies isn't valid either
> > because
> > > > > >> there're
> > > > > >> > already dependencies of malhar-library now that apex does not
> > > have.
> > > > > >> >
> > > > > >> > We can retain them for backward compatibility purpose but
> going
> > > > > forward
> > > > > >> new
> > > > > >> > app packages should only use the baby artifacts, without
> > denoting
> > > > > >> whether
> > > > > >> > it's contrib or not.
> > > > > >> >
> > > > > >> > David
> > > > > >> >
> > > > > >> > On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <
> > > > andy@datatorrent.com
> > > > > >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> >> Hi all,
> > > > > >> >>
> > > > > >> >> This is a first cut at a plan to restructure malhar in a way
> > that
> > > > is
> > > > > >> more
> > > > > >> >> portable and adherent to Maven's principles of modularity and
> > > > > >> dependency
> > > > > >> >> management.
> > > > > >> >>
> > > > > >> >> Overview of Current Malhar Architecture
> > > > > >> >>
> ---------------------------------------------------------------
> > > > > >> >> The current malhar repo consists of several maven modules:
> > > > > >> >>
> > > > > >> >> * *malhar-library*
> > > > > >> >>   operators which do not require additional transitive
> > > dependencies
> > > > > >> beyond
> > > > > >> >> what Apex and Hadoop require
> > > > > >> >> *  *malhar-contrib*
> > > > > >> >>   operators requiring other maven dependencies
> > > > > >> >> * *malhar-demos*
> > > > > >> >>   demo applications
> > > > > >> >> * *malhar-samples*
> > > > > >> >>   sample code showing example usage of malhar operators
> > > > > >> >> * *malhar-apps*
> > > > > >> >>   apex applications (currently only logstream)
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> Proposed Changes
> > > > > >> >>
> ---------------------------------------------------------------
> > > > > >> >>
> > > > > >> >> 1. *Scrub malhar-library for any operators needing additional
> > > > > >> dependencies*
> > > > > >> >>  `malhar-library` is intended to consist of only operators
> > > without
> > > > > >> extra
> > > > > >> >> transitive dependencies. All operators should be checked for
> > the
> > > > > >> necessity
> > > > > >> >> of extra dependencies.
> > > > > >> >>
> > > > > >> >> 2. *Move operators from malhar-demos and malhar-apps into
> > contrib
> > > > (or
> > > > > >> >> library if prudent)*
> > > > > >> >>    There are various operators in both of these modules that
> > are
> > > > > >> general
> > > > > >> >> enough to move into library or contrib.
> > > > > >> >>
> > > > > >> >> 3. *Create modules for all contrib subfolders*
> > > > > >> >>    All folders under
> > `contrib/src/main/com/datatorrent/contrib/`
> > > > > >> should be
> > > > > >> >> converted to modules of contrib and listed as such in
> > > > > >> `/contrib/pom.xml`.
> > > > > >> >>    Additionally, each of these smaller contrib modules will
> > have
> > > > its
> > > > > >> own
> > > > > >> >> version and dependencies.
> > > > > >> >>
> > > > > >> >> 4. *Use the Shades Plugin to allow for backwards-compatible
> > > > > >> fully-qualified
> > > > > >> >> class names*
> > > > > >> >>    This is made possible by shades class relocation
> > > > > >> >> <
> > > > > >> >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
> > > > > >> >> feature. This might be a bit error prone as well as confusing
> > to
> > > > use
> > > > > >> for
> > > > > >> >> outside developers, but it must be done if these changes are
> to
> > > be
> > > > > made
> > > > > >> >> prior to a major release.
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> Let me know what you all think of this approach.
> > > > > >> >>
> > > > > >> >> Best,
> > > > > >> >> Andy
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <
> > > > > >> chetan@datatorrent.com>
> > > > > >> >> wrote:
> > > > > >> >>
> > > > > >> >>> +1
> > > > > >> >>>
> > > > > >> >>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <
> > > > > >> gaurav@datatorrent.com>
> > > > > >> >>> wrote:
> > > > > >> >>>
> > > > > > >> >>>> I agree with David. Each artifact should have its own version.
> > > > > >> >>>>
> > > > > >> >>>> Thanks
> > > > > >> >>>> -Gaurav
> > > > > >> >>>>
> > > > > >> >>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <
> > > > > david@datatorrent.com>
> > > > > >> >>>> wrote:
> > > > > >> >>>>
> > > > > >> >>>>> I actually think that each baby artifact should have its
> own
> > > > > >> version,
> > > > > >> >>>>> because each artifact has its own interface and its own
> life
> > > > > cycle,
> > > > > >> >>>>> especially after we break up the giant library,
> applications
> > > > will
> > > > > >> >>> depend
> > > > > >> >>>> on
> > > > > >> >>>>> the baby artifacts instead of the giant library.  For
> > example
> > > if
> > > > > >> >> there
> > > > > >> >>> is
> > > > > >> >>>>> no change in malhar-contrib-kafka (I think the name should
> > > > > actually
> > > > > >> >> be
> > > > > >> >>>>> apex-malhar-kafka), we should not confuse users by bumping
> > the
> > > > > >> >> version.
> > > > > >> >>>>>
> > > > > >> >>>>> David
> > > > > >> >>>>>
> > > > > >> >>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <
> > > > > >> andy@datatorrent.com
> > > > > >> >>>
> > > > > >> >>>>> wrote:
> > > > > >> >>>>>
> > > > > >> >>>>>> Tushar,
> > > > > >> >>>>>>
> > > > > >> >>>>>> I agree that all modules should inherit the version from
> > the
> > > > > >> >> "parent
> > > > > >> >>>> pom"
> > > > > >> >>>>>> of the malhar repo. I think the benefits outweigh the
> cost
> > of
> > > > > >> >> bumping
> > > > > >> >>>>>> versions of components that haven't actually changed. I'd
> > > love
> > > > to
> > > > > >> >> get
> > > > > >> >>>>>> others feedback on this as well.
> > > > > >> >>>>>>
> > > > > >> >>>>>> On another note, I plan on starting a
> spreadsheet/googledoc
> > > > with
> > > > > >> >> the
> > > > > >> >>>>>> possible groupings of operators into these modules. Stay
> > > > tuned...
> > > > > >> >>>>>>
> > > > > >> >>>>>> -Andy
> > > > > >> >>>>>>
> > > > > >> >>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <
> > > > > >> >>>> tushar@datatorrent.com>
> > > > > >> >>>>>> wrote:
> > > > > >> >>>>>>
> > > > > >> >>>>>>> +1 for the general idea
> > > > > >> >>>>>>>
> > > > > >> >>>>>>> Does these independent modules going to have independent
> > > > > >> >> versions?
> > > > > >> >>>> For
> > > > > >> >>>>>>> example, if there is no change in kafka operator between
> > > > malhar
> > > > > >> >> 3.0
> > > > > >> >>>> and
> > > > > >> >>>>>>> malhar 4.0, will we increment version of
> > > malhar-contrib-kafka
> > > > to
> > > > > >> >>>> 4.0. I
> > > > > >> >>>>>>> have learned from my previous project that, It is easier
> > to
> > > > > >> >> manage
> > > > > >> >>>>>> versions
> > > > > >> >>>>>>> if we make all modules at same version level for a
> > release,
> > > > even
> > > > > >> >> if
> > > > > >> >>>>> there
> > > > > >> >>>>>>> is no change in a particular module.
> > > > > >> >>>>>>>
> > > > > >> >>>>>>> - Tushar.
> > > > > >> >>>>>>>
> > > > > >> >>>>>>>
> > > > > >> >>>>>>>
> > > > > >> >>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <
> > > > > >> >>>> tim@datatorrent.com>
> > > > > >> >>>>>>> wrote:
> > > > > >> >>>>>>>
> > > > > >> >>>>>>>> I agree Andy's solution is better, but just for the
> sake
> > of
> > > > > >> >>>> argument
> > > > > >> >>>>>>>> profiles can be inherited from a parent pom, so if the
> > > maven
> > > > > >> >>>>> archetype
> > > > > >> >>>>>>>> defines a new project with a parent pom with the
> correct
> > > > > >> >> profiles
> > > > > >> >>>>>>> defined,
> > > > > >> >>>>>>>> then the desired profiles can be activated in the pom
> of
> > > the
> > > > > >> >> new
> > > > > >> >>>>>> project.
> > > > > >> >>>>>>>> It is no more complicated than adding additional
> > > dependencies
> > > > > >> >> to
> > > > > >> >>>> your
> > > > > >> >>>>>>>> project.
> > > > > >> >>>>>>>>
> > > > > >> >>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <
> > > > > >> >>>>>> sandesh@datatorrent.com
> > > > > >> >>>>>>>>
> > > > > >> >>>>>>>> wrote:
> > > > > >> >>>>>>>>
> > > > > >> >>>>>>>>> Currently all the dependencies in Malhar-Contrib are
> > > marked
> > > > > >> >> as
> > > > > >> >>>>>>> optional.
> > > > > >> >>>>>>>> So
> > > > > >> >>>>>>>>> users have to already modify the existing POM to use
> it
> > in
> > > > > >> >>> their
> > > > > >> >>>>>>> project.
> > > > > >> >>>>>>>>> So restructuring should be fine.
> > > > > >> >>>>>>>>>
> > > > > >> >>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <
> > > > > >> >>>>>>> chetan@datatorrent.com>
> > > > > >> >>>>>>>>> wrote:
> > > > > >> >>>>>>>>>
> > > > > >> >>>>>>>>>> The profiles are excellent when you are developing
> > > > > >> >>>>> malhar-contrib.
> > > > > >> >>>>>>>>> Profiles
> > > > > >> >>>>>>>>>> do not work when you are using malhar-contrib. The
> > > problem
> > > > > >> >>> Andy
> > > > > >> >>>>> is
> > > > > >> >>>>>>>>> trying
> > > > > >> >>>>>>>>>> to solve is the later. If there is an elegant
> solution
> > > > > >> >> which
> > > > > >> >>> I
> > > > > >> >>>> am
> > > > > >> >>>>>>>> missing
> > > > > >> >>>>>>>>>> using profiles, please correct me.
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>> The way Andy suggested is the way many successful
> > > projects
> > > > > >> >> do
> > > > > >> >>>> it.
> > > > > >> >>>>>>> Look
> > > > > >> >>>>>>>> at
> > > > > >> >>>>>>>>>> Netty as an example.
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>> +1 for that.
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>> --
> > > > > >> >>>>>>>>>> Chetan
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <
> > > > > >> >>>>>>> tim@datatorrent.com>
> > > > > >> >>>>>>>>>> wrote:
> > > > > >> >>>>>>>>>>
> > > > > >> >>>>>>>>>>> I think restructuring the project in that way would
> be
> > > > > >> >> the
> > > > > >> >>>>>>>> technically
> > > > > >> >>>>>>>>>>> correct thing to do, but if people are unwilling to
> > > > > >> >> accept
> > > > > >> >>>> the
> > > > > >> >>>>>>> change
> > > > > >> >>>>>>>>> in
> > > > > >> >>>>>>>>>>> project structure you could achieve something
> similar
> > by
> > > > > >> >>>> using
> > > > > >> >>>>>>> maven
> > > > > >> >>>>>>>>>>> profiles. With profiles the project structure would
> > > > > >> >> remain
> > > > > >> >>> as
> > > > > >> >>>>> is.
> > > > > >> >>>>>>>>>> Profiles
> > > > > >> >>>>>>>>>>> could be added to the malhar pom, and a profile
> would
> > > > > >> >>> define
> > > > > >> >>>>> the
> > > > > >> >>>>>>>>>>> dependencies needed for different types of
> operators.
> > > For
> > > > > >> >>>>> example
> > > > > >> >>>>>>> the
> > > > > >> >>>>>>>>>> hbase
> > > > > >> >>>>>>>>>>> profile would define the dependencies for the hbase
> > > > > >> >>> operator.
> > > > > >> >>>>>> Then
> > > > > >> >>>>>>>> any
> > > > > >> >>>>>>>>>>> project using a malhar library would just activate
> the
> > > > > >> >>>> correct
> > > > > >> >>>>>>>> profile
> > > > > >> >>>>>>>>> in
> > > > > >> >>>>>>>>>>> it's pom, and the correct dependencies would be
> pulled
> > > > > >> >> in.
> > > > > >> >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > > > > >> >>>>>>>>>>>
> > > > > >> >>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > > > > >> >>>>>>>> andy@datatorrent.com>
> > > > > >> >>>>>>>>>>> wrote:
> > > > > >> >>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> Hi everyone,
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> I am currently assigned to MLHR-1843
> > > > > >> >>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>,
> > which
> > > > > >> >>>>>>> essentially
> > > > > >> >>>>>>>>>> aims
> > > > > >> >>>>>>>>>>> to
> > > > > >> >>>>>>>>>>>> expose smaller, more consumable maven artifacts
> that
> > > > > >> >>> would
> > > > > >> >>>> do
> > > > > >> >>>>>>> away
> > > > > >> >>>>>>>>> with
> > > > > >> >>>>>>>>>>> the
> > > > > >> >>>>>>>>>>>> need to manually include necessary dependencies
> based
> > > > > >> >> on
> > > > > >> >>>> the
> > > > > >> >>>>>>>>> operators
> > > > > >> >>>>>>>>>> in
> > > > > >> >>>>>>>>>>>> use.
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> As an example, say I am building an app package
> that
> > > > > >> >>> needs
> > > > > >> >>>>>> Kafka
> > > > > >> >>>>>>>>> input
> > > > > >> >>>>>>>>>>> and
> > > > > >> >>>>>>>>>>>> output operators, but I don't want all the other
> > > > > >> >>> transitive
> > > > > >> >>>>>>>>>> dependencies
> > > > > >> >>>>>>>>>>>> that come via malhar-contrib. Currently I would
> need
> > to
> > > > > >> >>>>> specify
> > > > > >> >>>>>>>>>>>> malhar-contrib as a dependency, and add an
> exclusions
> > > > > >> >>> block
> > > > > >> >>>>> in
> > > > > >> >>>>>>> my
> > > > > >> >>>>>>>>> app
> > > > > >> >>>>>>>>>>>> package pom:
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>malhar-contrib</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>3.0.0</version>
> > > > > > >> >>>>>>>>>>>>   <!-- so none of malhar-contrib's deps are included -->
> > > > > > >> >>>>>>>>>>>>   <exclusions>
> > > > > > >> >>>>>>>>>>>>     <exclusion>
> > > > > > >> >>>>>>>>>>>>       <groupId>*</groupId>
> > > > > > >> >>>>>>>>>>>>       <artifactId>*</artifactId>
> > > > > > >> >>>>>>>>>>>>     </exclusion>
> > > > > > >> >>>>>>>>>>>>   </exclusions>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> Then, I would have to include the kafka library
> > > > > >> >>> explicitly
> > > > > >> >>>>> as a
> > > > > >> >>>>>>>>>>> dependency:
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>org.apache.kafka</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>kafka_2.10</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>0.8.1.1</version>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> Wouldn't it be nice if I could just put this in my
> > > > > >> >> pom?:
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>3.0.0</version>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> In order to make this possible, we will need to
> > > > > >> >> organize
> > > > > >> >>>> the
> > > > > >> >>>>>>> malhar
> > > > > >> >>>>>>>>>>> project
> > > > > >> >>>>>>>>>>>> into more granular modules (artifacts).
> Specifically,
> > > > > >> >> the
> > > > > >> >>>>>>>>>> malhar-contrib
> > > > > >> >>>>>>>>>>>> artifact would essentially just be a pom that
> > specifies
> > > > > >> >>>> each
> > > > > >> >>>>>>>> smaller
> > > > > >> >>>>>>>>>>> module
> > > > > >> >>>>>>>>>>>> as a dependency:
> > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <!-- in malhar-contrib's pom.xml: -->
> > > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <modules>
> > > > > > >> >>>>>>>>>>>>   <module>kafka</module>
> > > > > > >> >>>>>>>>>>>>   <module>twitter</module>
> > > > > > >> >>>>>>>>>>>>   <module>redis</module>
> > > > > > >> >>>>>>>>>>>>   <!-- other smaller modules -->
> > > > > > >> >>>>>>>>>>>> </modules>
> > > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>3.0.0</version>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>malhar-contrib-twitter</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>3.0.0</version>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > > >> >>>>>>>>>>>>
> > > > > > >> >>>>>>>>>>>> <dependency>
> > > > > > >> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
> > > > > > >> >>>>>>>>>>>>   <artifactId>malhar-contrib-redis</artifactId>
> > > > > > >> >>>>>>>>>>>>   <version>3.0.0</version>
> > > > > > >> >>>>>>>>>>>> </dependency>
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> With these changes, there may be a risk of breaking
> > > > > >> >>>> backwards
> > > > > >> >>>>>>>>>>>> compatibility, however I think the gain in
> usability
> > of
> > > > > >> >>>>> malhar
> > > > > >> >>>>>>>> merits
> > > > > >> >>>>>>>>>> the
> > > > > >> >>>>>>>>>>>> effort to make this work.
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> I am still relatively new to maven, so I would love
> > to
> > > > > >> >>> get
> > > > > >> >>>>> some
> > > > > >> >>>>>>>>>> feedback
> > > > > >> >>>>>>>>>>>> from other devs about this!
> > > > > >> >>>>>>>>>>>>
> > > > > >> >>>>>>>>>>>> --
> > > > > >> >>>>>>>>>>>> Regards,
> > > > > >> >>>>>>>>>>>> Andy Perlitch
> > > > > >> >>>>>>>>>>>> Software Engineer
> > > > > >> >>>>>>>>>>>> DataTorrent Inc
> > > > > >> >>>>>>>>>>>> (408)829-9319
> > > > > >> >>>>>>
> > > > > >> >>>>>>
> > > > > >> >>>>>>
> > > > > >> >>>>>> --
> > > > > >> >>>>>> Regards,
> > > > > >> >>>>>> Andy Perlitch
> > > > > >> >>>>>> Software Engineer
> > > > > >> >>>>>> DataTorrent Inc
> > > > > >> >>>>>> (408)829-9319
> > > > > >> >>
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> --
> > > > > >> >> Regards,
> > > > > >> >> Andy Perlitch
> > > > > >> >> Software Engineer
> > > > > >> >> DataTorrent Inc
> > > > > >> >> (408)829-9319
> > > > > >> >>
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
Releasing the whole repository is far more involved than releasing individual
modules.
If there is a critical bug in one of the modules, should we allow a patch
version to be released for that module without waiting for the repository
release?

David
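
To make the question concrete, here is a sketch of what an app package's pom
might look like if a single module could ship a patch on its own. This is only
an illustration under assumptions: the version numbers are invented, and
malhar-kafka-0.8 is the artifact name proposed earlier in this thread, not a
published artifact.

```xml
<dependencies>
  <!-- stays on the last repository-wide release (version is made up) -->
  <dependency>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-library</artifactId>
    <version>3.3.0</version>
  </dependency>

  <!-- picks up a hypothetical module-only patch release -->
  <dependency>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-kafka-0.8</artifactId>
    <version>3.3.1</version>
  </dependency>
</dependencies>
```

The user-visible cost is that versions of artifacts from the same repository
no longer move in lockstep, so an app package has to track them one by one.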

On Wed, Dec 23, 2015 at 7:29 PM, Thomas Weise <th...@datatorrent.com>
wrote:

> Releases and versions are tied to the repository as a whole, not individual
> modules within it. I don't think we should change this.
>
>
> On Wed, Dec 23, 2015 at 6:57 PM, David Yan <da...@datatorrent.com> wrote:
>
> > As I understand, each artifact will be independent and will have its own
> > release cycle.
> >
> > On Wed, Dec 23, 2015 at 6:50 PM, Pramod Immaneni <pramod@datatorrent.com
> >
> > wrote:
> >
> > > Wouldn't it also mean that in the near term we would be releasing a new
> > > version of all the artifacts when there is a new malhar release to be
> > > made, even though many of them may not have changed?
> > >
> > > On Wed, Dec 23, 2015 at 5:23 PM, David Yan <da...@datatorrent.com>
> > wrote:
> > >
> > > > Let's restart the discussion of this topic.
> > > >
> > > > We'd like to break malhar into modules, so we can have separate
> > artifacts
> > > > for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
> > > > malhar-library.
> > > > This way users using them will only pull in the right dependencies
> > > > automatically, without the ugly business of optional and exclude
> > > > dependencies today.
> > > >
> > > > Also, I propose adding the 3rd party version in the artifact name.
> For
> > > > example:
> > > >
> > > > malhar-kafka-0.8
> > > > malhar-kafka-0.9
> > > >
> > > > so that we can simultaneously support multiple versions of kafka.
> > > >
> > > > Thoughts?
> > > >
> > > > David
> > > >
> > > > On Fri, Oct 2, 2015 at 4:40 PM, David Yan <da...@datatorrent.com>
> > wrote:
> > > >
> > > > > All malhar operators are listed as part of the apidoc here:
> > > > > https://www.datatorrent.com/docs/apidocs/index.html
> > > > > Developers should be able to find the operators they need there.
> > > > >
> > > > > But, it's referenced from
> > > > > https://www.datatorrent.com/product-documentation/ as "Platform
> API
> > > > > Reference" so users may have trouble finding it.
> > > > >
> > > > > We should probably have separate javadoc pages for Apex Core and Apex
> > > > > Malhar, and also add the links to this page:
> > > > > http://apex.apache.org/docs.html
> > > > >
> > > > > David
> > > > >
> > > > > On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <
> > > pramod@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > >> We have to think about how people can find the operators and
> > > > >> dependencies when bundling the applications. The complaint I hear
> > > > >> often is that folks can't find the operators they are looking for. We
> > > > >> should be careful about how much more work this will add for the
> > > > >> user, who now has to search for and find all the dependencies.
> > > > >>
> > > > >> Thanks
> > > > >>
> > > > >> > On Oct 2, 2015, at 3:44 PM, David Yan <da...@datatorrent.com>
> > > wrote:
> > > > >> >
> > > > >> > I actually don't think it makes sense any more to separate
> > > > >> malhar-library
> > > > >> > and malhar-contrib after the breakup, especially since we are
> > > planning
> > > > >> for
> > > > >> > a major release for these changes.
> > > > >> >
> > > > >> > People are often confused, myself included, about which operators
> > > > >> > should be in malhar-library and which ones should be in contrib.
> > > > >> > Requiring a separate setup for unit tests should not be a criterion,
> > > > >> > because the user of the library couldn't care less whether the unit
> > > > >> > test requires extra setup. The factor of requiring extra
> > > > >> > dependencies isn't valid either, because malhar-library already has
> > > > >> > dependencies that apex does not have.
> > > > >> >
> > > > >> > We can retain them for backward compatibility purpose but going
> > > > forward
> > > > >> new
> > > > >> > app packages should only use the baby artifacts, without
> denoting
> > > > >> whether
> > > > >> > it's contrib or not.
> > > > >> >
> > > > >> > David
> > > > >> >
> > > > >> > On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <
> > > andy@datatorrent.com
> > > > >
> > > > >> > wrote:
> > > > >> >
> > > > >> >> Hi all,
> > > > >> >>
> > > > >> >> This is a first cut at a plan to restructure malhar in a way
> that
> > > is
> > > > >> more
> > > > >> >> portable and adherent to Maven's principles of modularity and
> > > > >> dependency
> > > > >> >> management.
> > > > >> >>
> > > > >> >> Overview of Current Malhar Architecture
> > > > >> >> ---------------------------------------------------------------
> > > > >> >> The current malhar repo consists of several maven modules:
> > > > >> >>
> > > > >> >> * *malhar-library*
> > > > >> >>   operators which do not require additional transitive
> > dependencies
> > > > >> beyond
> > > > >> >> what Apex and Hadoop require
> > > > >> >> *  *malhar-contrib*
> > > > >> >>   operators requiring other maven dependencies
> > > > >> >> * *malhar-demos*
> > > > >> >>   demo applications
> > > > >> >> * *malhar-samples*
> > > > >> >>   sample code showing example usage of malhar operators
> > > > >> >> * *malhar-apps*
> > > > >> >>   apex applications (currently only logstream)
> > > > >> >>
> > > > >> >>
> > > > >> >> Proposed Changes
> > > > >> >> ---------------------------------------------------------------
> > > > >> >>
> > > > >> >> 1. *Scrub malhar-library for any operators needing additional
> > > > >> dependencies*
> > > > >> >>  `malhar-library` is intended to consist of only operators
> > without
> > > > >> extra
> > > > >> >> transitive dependencies. All operators should be checked for
> the
> > > > >> necessity
> > > > >> >> of extra dependencies.
> > > > >> >>
> > > > >> >> 2. *Move operators from malhar-demos and malhar-apps into
> contrib
> > > (or
> > > > >> >> library if prudent)*
> > > > >> >>    There are various operators in both of these modules that
> are
> > > > >> general
> > > > >> >> enough to move into library or contrib.
> > > > >> >>
> > > > >> >> 3. *Create modules for all contrib subfolders*
> > > > >> >>    All folders under
> `contrib/src/main/com/datatorrent/contrib/`
> > > > >> should be
> > > > >> >> converted to modules of contrib and listed as such in
> > > > >> `/contrib/pom.xml`.
> > > > >> >>    Additionally, each of these smaller contrib modules will
> have
> > > its
> > > > >> own
> > > > >> >> version and dependencies.
> > > > >> >>
> > > > >> >> 4. *Use the Maven Shade Plugin to allow for backwards-compatible
> > > > >> >> fully-qualified class names*
> > > > >> >>    This is made possible by the plugin's class relocation feature
> > > > >> >> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>.
> > > > >> >> This might be a bit error prone as well as confusing for outside
> > > > >> >> developers to use, but it must be done if these changes are to be
> > > > >> >> made prior to a major release.
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >> >> Let me know what you all think of this approach.
> > > > >> >>
> > > > >> >> Best,
> > > > >> >> Andy
> > > > >> >>
> > > > >> >>
> > > > >> >> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <
> > > > >> chetan@datatorrent.com>
> > > > >> >> wrote:
> > > > >> >>
> > > > >> >>> +1
> > > > >> >>>
> > > > >> >>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <
> > > > >> gaurav@datatorrent.com>
> > > > >> >>> wrote:
> > > > >> >>>
> > > > >> >>>> I agree with David. Each artifact should have its own version.
> > > > >> >>>>
> > > > >> >>>> Thanks
> > > > >> >>>> -Gaurav
> > > > >> >>>>
> > > > >> >>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <
> > > > david@datatorrent.com>
> > > > >> >>>> wrote:
> > > > >> >>>>
> > > > >> >>>>> I actually think that each baby artifact should have its own
> > > > >> version,
> > > > >> >>>>> because each artifact has its own interface and its own life
> > > > cycle,
> > > > >> >>>>> especially after we break up the giant library, applications
> > > will
> > > > >> >>> depend
> > > > >> >>>> on
> > > > >> >>>>> the baby artifacts instead of the giant library.  For
> example
> > if
> > > > >> >> there
> > > > >> >>> is
> > > > >> >>>>> no change in malhar-contrib-kafka (I think the name should
> > > > actually
> > > > >> >> be
> > > > >> >>>>> apex-malhar-kafka), we should not confuse users by bumping
> the
> > > > >> >> version.
> > > > >> >>>>>
> > > > >> >>>>> David
> > > > >> >>>>>
> > > > >> >>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <
> > > > >> andy@datatorrent.com
> > > > >> >>>
> > > > >> >>>>> wrote:
> > > > >> >>>>>
> > > > >> >>>>>> Tushar,
> > > > >> >>>>>>
> > > > >> >>>>>> I agree that all modules should inherit the version from
> the
> > > > >> >> "parent
> > > > >> >>>> pom"
> > > > >> >>>>>> of the malhar repo. I think the benefits outweigh the cost
> of
> > > > >> >> bumping
> > > > >> >>>>>> versions of components that haven't actually changed. I'd
> > love
> > > to
> > > > >> >> get
> > > > >> >>>>>> others feedback on this as well.
> > > > >> >>>>>>
> > > > >> >>>>>> On another note, I plan on starting a spreadsheet/googledoc
> > > with
> > > > >> >> the
> > > > >> >>>>>> possible groupings of operators into these modules. Stay
> > > tuned...
> > > > >> >>>>>>
> > > > >> >>>>>> -Andy
> > > > >> >>>>>>
> > > > >> >>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <
> > > > >> >>>> tushar@datatorrent.com>
> > > > >> >>>>>> wrote:
> > > > >> >>>>>>
> > > > >> >>>>>>> +1 for the general idea
> > > > >> >>>>>>>
> > > > >> >>>>>>> Are these independent modules going to have independent
> > > > >> >>>>>>> versions? For example, if there is no change in the kafka
> > > > >> >>>>>>> operator between malhar 3.0 and malhar 4.0, will we increment
> > > > >> >>>>>>> the version of malhar-contrib-kafka to 4.0? I have learned from
> > > > >> >>>>>>> my previous project that it is easier to manage versions if we
> > > > >> >>>>>>> keep all modules at the same version level for a release, even
> > > > >> >>>>>>> if there is no change in a particular module.
> > > > >> >>>>>>>
> > > > >> >>>>>>> - Tushar.
> > > > >> >>>>>>>
> > > > >> >>>>>>>
> > > > >> >>>>>>>
> > > > >> >>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <
> > > > >> >>>> tim@datatorrent.com>
> > > > >> >>>>>>> wrote:
> > > > >> >>>>>>>
> > > > >> >>>>>>>> I agree Andy's solution is better, but just for the sake
> of
> > > > >> >>>> argument
> > > > >> >>>>>>>> profiles can be inherited from a parent pom, so if the
> > maven
> > > > >> >>>>> archetype
> > > > >> >>>>>>>> defines a new project with a parent pom with the correct
> > > > >> >> profiles
> > > > >> >>>>>>> defined,
> > > > >> >>>>>>>> then the desired profiles can be activated in the pom of
> > the
> > > > >> >> new
> > > > >> >>>>>> project.
> > > > >> >>>>>>>> It is no more complicated than adding additional
> > dependencies
> > > > >> >> to
> > > > >> >>>> your
> > > > >> >>>>>>>> project.
> > > > >> >>>>>>>>
> > > > >> >>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <
> > > > >> >>>>>> sandesh@datatorrent.com
> > > > >> >>>>>>>>
> > > > >> >>>>>>>> wrote:
> > > > >> >>>>>>>>
> > > > >> >>>>>>>>> Currently all the dependencies in Malhar-Contrib are
> > marked
> > > > >> >> as
> > > > >> >>>>>>> optional.
> > > > >> >>>>>>>> So
> > > > >> >>>>>>>>> users have to already modify the existing POM to use it
> in
> > > > >> >>> their
> > > > >> >>>>>>> project.
> > > > >> >>>>>>>>> So restructuring should be fine.
> > > > >> >>>>>>>>>
> > > > >> >>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <
> > > > >> >>>>>>> chetan@datatorrent.com>
> > > > >> >>>>>>>>> wrote:
> > > > >> >>>>>>>>>
> > > > >> >>>>>>>>>> The profiles are excellent when you are developing
> > > > >> >>>>> malhar-contrib.
> > > > >> >>>>>>>>> Profiles
> > > > >> >>>>>>>>>> do not work when you are using malhar-contrib. The
> > problem
> > > > >> >>> Andy
> > > > >> >>>>> is
> > > > >> >>>>>>>>> trying
> > > > >> >>>>>>>>>> to solve is the latter. If there is an elegant solution
> > > > >> >> which
> > > > >> >>> I
> > > > >> >>>> am
> > > > >> >>>>>>>> missing
> > > > >> >>>>>>>>>> using profiles, please correct me.
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> The way Andy suggested is the way many successful
> > projects
> > > > >> >> do
> > > > >> >>>> it.
> > > > >> >>>>>>> Look
> > > > >> >>>>>>>> at
> > > > >> >>>>>>>>>> Netty as an example.
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> +1 for that.
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> --
> > > > >> >>>>>>>>>> Chetan
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <
> > > > >> >>>>>>> tim@datatorrent.com>
> > > > >> >>>>>>>>>> wrote:
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>>> I think restructuring the project in that way would be
> > > > >> >> the
> > > > >> >>>>>>>> technically
> > > > >> >>>>>>>>>>> correct thing to do, but if people are unwilling to
> > > > >> >> accept
> > > > >> >>>> the
> > > > >> >>>>>>> change
> > > > >> >>>>>>>>> in
> > > > >> >>>>>>>>>>> project structure you could achieve something similar
> by
> > > > >> >>>> using
> > > > >> >>>>>>> maven
> > > > >> >>>>>>>>>>> profiles. With profiles the project structure would
> > > > >> >> remain
> > > > >> >>> as
> > > > >> >>>>> is.
> > > > >> >>>>>>>>>> Profiles
> > > > >> >>>>>>>>>>> could be added to the malhar pom, and a profile would
> > > > >> >>> define
> > > > >> >>>>> the
> > > > >> >>>>>>>>>>> dependencies needed for different types of operators.
> > For
> > > > >> >>>>> example
> > > > >> >>>>>>> the
> > > > >> >>>>>>>>>> hbase
> > > > >> >>>>>>>>>>> profile would define the dependencies for the hbase
> > > > >> >>> operator.
> > > > >> >>>>>> Then
> > > > >> >>>>>>>> any
> > > > >> >>>>>>>>>>> project using a malhar library would just activate the
> > > > >> >>>> correct
> > > > >> >>>>>>>> profile
> > > > >> >>>>>>>>> in
> > > > >> >>>>>>>>>>> its pom, and the correct dependencies would be pulled
> > > > >> >> in.
> > > > >> >>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > > > >> >>>>>>>>>>>
> > > > >> >>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > > > >> >>>>>>>> andy@datatorrent.com>
> > > > >> >>>>>>>>>>> wrote:
> > > > >> >>>>>>>>>>>
> > > > >> >>>>>>>>>>>> Hi everyone,
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> I am currently assigned to MLHR-1843
> > > > >> >>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>,
> which
> > > > >> >>>>>>> essentially
> > > > >> >>>>>>>>>> aims
> > > > >> >>>>>>>>>>> to
> > > > >> >>>>>>>>>>>> expose smaller, more consumable maven artifacts that
> > > > >> >>> would
> > > > >> >>>> do
> > > > >> >>>>>>> away
> > > > >> >>>>>>>>> with
> > > > >> >>>>>>>>>>> the
> > > > >> >>>>>>>>>>>> need to manually include necessary dependencies based
> > > > >> >> on
> > > > >> >>>> the
> > > > >> >>>>>>>>> operators
> > > > >> >>>>>>>>>> in
> > > > >> >>>>>>>>>>>> use.
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> As an example, say I am building an app package that
> > > > >> >>> needs
> > > > >> >>>>>> Kafka
> > > > >> >>>>>>>>> input
> > > > >> >>>>>>>>>>> and
> > > > >> >>>>>>>>>>>> output operators, but I don't want all the other
> > > > >> >>> transitive
> > > > >> >>>>>>>>>> dependencies
> > > > >> >>>>>>>>>>>> that come via malhar-contrib. Currently I would need
> to
> > > > >> >>>>> specify
> > > > >> >>>>>>>>>>>> malhar-contrib as a dependency, and add an exclusions
> > > > >> >>> block
> > > > >> >>>>> in
> > > > >> >>>>>>> my
> > > > >> >>>>>>>>> app
> > > > >> >>>>>>>>>>>> package pom:
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>malhar-contrib</artifactId>
> > > > >> >>>>>> <version>3.0.0</version>
> > > > >> >>>>>>>>> <!--
> > > > >> >>>>>>>>>>> so
> > > > >> >>>>>>>>>>>> none of malhar-contrib's deps are included -->*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *  <exclusions>    <exclusion>
> > > > >> >> <groupId>*</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>*</artifactId>    </exclusion>
> > > > >> >>>>>>>>> </exclusions></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> Then, I would have to include the kafka library
> > > > >> >>> explicitly
> > > > >> >>>>> as a
> > > > >> >>>>>>>>>>> dependency:
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>org.apache.kafka</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>kafka_2.10</artifactId>
> > > > >> >>>>>>>>>>>> <version>0.8.1.1</version></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> Wouldn't it be nice if I could just put this in my
> > > > >> >> pom?:
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> In order to make this possible, we will need to
> > > > >> >> organize
> > > > >> >>>> the
> > > > >> >>>>>>> malhar
> > > > >> >>>>>>>>>>> project
> > > > >> >>>>>>>>>>>> into more granular modules (artifacts). Specifically,
> > > > >> >> the
> > > > >> >>>>>>>>>> malhar-contrib
> > > > >> >>>>>>>>>>>> artifact would essentially just be a pom that
> specifies
> > > > >> >>>> each
> > > > >> >>>>>>>> smaller
> > > > >> >>>>>>>>>>> module
> > > > >> >>>>>>>>>>>> as a dependency:
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<!-- in malhar-contrib's pom.xml: -->*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<modules>  <module>kafka</module>*
> > > > >> >>>>>>>>>>>> *  <module>twitter</module>*
> > > > >> >>>>>>>>>>>> *  <module>redis</module>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *  <!-- other smaller modules --></modules>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-twitter</artifactId>
> > > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-redis</artifactId>
> > > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> With these changes, there may be a risk of breaking
> > > > >> >>>> backwards
> > > > >> >>>>>>>>>>>> compatibility, however I think the gain in usability
> of
> > > > >> >>>>> malhar
> > > > >> >>>>>>>> merits
> > > > >> >>>>>>>>>> the
> > > > >> >>>>>>>>>>>> effort to make this work.
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> I am still relatively new to maven, so I would love
> to
> > > > >> >>> get
> > > > >> >>>>> some
> > > > >> >>>>>>>>>> feedback
> > > > >> >>>>>>>>>>>> from other devs about this!
> > > > >> >>>>>>>>>>>>
> > > > >> >>>>>>>>>>>> --
> > > > >> >>>>>>>>>>>> Regards,
> > > > >> >>>>>>>>>>>> Andy Perlitch
> > > > >> >>>>>>>>>>>> Software Engineer
> > > > >> >>>>>>>>>>>> DataTorrent Inc
> > > > >> >>>>>>>>>>>> (408)829-9319
> > > > >> >>>>>>
> > > > >> >>>>>>
> > > > >> >>>>>>
> > > > >> >>>>>> --
> > > > >> >>>>>> Regards,
> > > > >> >>>>>> Andy Perlitch
> > > > >> >>>>>> Software Engineer
> > > > >> >>>>>> DataTorrent Inc
> > > > >> >>>>>> (408)829-9319
> > > > >> >>
> > > > >> >>
> > > > >> >>
> > > > >> >> --
> > > > >> >> Regards,
> > > > >> >> Andy Perlitch
> > > > >> >> Software Engineer
> > > > >> >> DataTorrent Inc
> > > > >> >> (408)829-9319
> > > > >> >>
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: More sensible modules/artifacts in malhar

Posted by Thomas Weise <th...@datatorrent.com>.
Releases and versions are tied to the repository as a whole, not individual
modules within it. I don't think we should change this.
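
As a rough sketch of what repository-wide versioning looks like in a module
pom (illustration only, not taken from the malhar build; the parent
artifactId "malhar", the module name and the version number are assumptions
made up for this example), the module simply omits its own version and
inherits it from the parent:

```xml
<!-- kafka/pom.xml: hypothetical module, names and versions are assumptions -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar</artifactId>
    <!-- the single version shared by every module in the repository -->
    <version>3.3.0</version>
  </parent>

  <!-- no <version> element here: Maven inherits it from the parent -->
  <artifactId>malhar-kafka</artifactId>
</project>
```

With this layout, bumping the parent version for a release bumps every
module, whether or not its code changed.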


On Wed, Dec 23, 2015 at 6:57 PM, David Yan <da...@datatorrent.com> wrote:

> As I understand, each artifact will be independent and will have its own
> release cycle.
>
> On Wed, Dec 23, 2015 at 6:50 PM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
> > > Wouldn't it also mean that in the near term we would be releasing new
> > > versions of all the artifacts whenever there is a new malhar release,
> > > even though many of them may not have changed?
> >
> > On Wed, Dec 23, 2015 at 5:23 PM, David Yan <da...@datatorrent.com>
> wrote:
> >
> > > Let's restart the discussion of this topic.
> > >
> > > We'd like to break malhar into modules, so we can have separate
> artifacts
> > > for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
> > > malhar-library.
> > > This way users using them will only pull in the right dependencies
> > > automatically, without the ugly business of optional and exclude
> > > dependencies today.
> > >
> > > Also, I propose adding the 3rd party version in the artifact name.  For
> > > example:
> > >
> > > malhar-kafka-0.8
> > > malhar-kafka-0.9
> > >
> > > so that we can simultaneously support multiple versions of kafka.
> > >
> > > Thoughts?
> > >
> > > David
> > >
> > > On Fri, Oct 2, 2015 at 4:40 PM, David Yan <da...@datatorrent.com>
> wrote:
> > >
> > > > > All malhar operators are listed as part of the apidoc
> here:
> > > > https://www.datatorrent.com/docs/apidocs/index.html
> > > > And developers should be able to find the operators they need there.
> > > >
> > > > But, it's referenced from
> > > > https://www.datatorrent.com/product-documentation/ as "Platform API
> > > > Reference" so users may have trouble finding it.
> > > >
> > > > > We probably should have separate javadoc pages for Apex Core and
> Apex
> > > > Malhar and add the links to this page
> http://apex.apache.org/docs.html
> > > > also.
> > > >
> > > > David
> > > >
> > > > On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <
> > pramod@datatorrent.com>
> > > > wrote:
> > > >
> > > > >> We have to think about how people can find the operators and
> > > > >> dependencies when bundling the applications. The complaint I hear
> often
> > > >> is that folks can't find the operators they are looking for. We
> should
> > > >> be careful about how much more work this will add for the user to
> now
> > > >> search and find all the dependencies.
> > > >>
> > > >> Thanks
> > > >>
> > > >> > On Oct 2, 2015, at 3:44 PM, David Yan <da...@datatorrent.com>
> > wrote:
> > > >> >
> > > >> > I actually don't think it makes sense any more to separate
> > > >> malhar-library
> > > >> > and malhar-contrib after the breakup, especially since we are
> > planning
> > > >> for
> > > >> > a major release for these changes.
> > > >> >
> > > >> > People are often confused, myself included, which operators should
> > be
> > > in
> > > >> > malhar-library and which ones should be in contrib.  Requiring a
> > > >> separate
> > > > >> > setup for unit tests should not be a criterion because the user of
> the
> > > >> > library couldn't care less whether the unit test requires extra
> > setup.
> > > >> The
> > > >> > factor of requiring extra dependencies isn't valid either because
> > > >> there're
> > > >> > already dependencies of malhar-library now that apex does not
> have.
> > > >> >
> > > >> > We can retain them for backward compatibility purpose but going
> > > forward
> > > >> new
> > > >> > app packages should only use the baby artifacts, without denoting
> > > >> whether
> > > >> > it's contrib or not.
> > > >> >
> > > >> > David
> > > >> >
> > > >> > On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <
> > andy@datatorrent.com
> > > >
> > > >> > wrote:
> > > >> >
> > > >> >> Hi all,
> > > >> >>
> > > >> >> This is a first cut at a plan to restructure malhar in a way that
> > is
> > > >> more
> > > >> >> portable and adherent to Maven's principles of modularity and
> > > >> dependency
> > > >> >> management.
> > > >> >>
> > > >> >> Overview of Current Malhar Architecture
> > > >> >> ---------------------------------------------------------------
> > > >> >> The current malhar repo consists of several maven modules:
> > > >> >>
> > > >> >> * *malhar-library*
> > > >> >>   operators which do not require additional transitive
> dependencies
> > > >> beyond
> > > >> >> what Apex and Hadoop require
> > > >> >> *  *malhar-contrib*
> > > >> >>   operators requiring other maven dependencies
> > > >> >> * *malhar-demos*
> > > >> >>   demo applications
> > > >> >> * *malhar-samples*
> > > >> >>   sample code showing example usage of malhar operators
> > > >> >> * *malhar-apps*
> > > >> >>   apex applications (currently only logstream)
> > > >> >>
> > > >> >>
> > > >> >> Proposed Changes
> > > >> >> ---------------------------------------------------------------
> > > >> >>
> > > >> >> 1. *Scrub malhar-library for any operators needing additional
> > > >> dependencies*
> > > >> >>  `malhar-library` is intended to consist of only operators
> without
> > > >> extra
> > > >> >> transitive dependencies. All operators should be checked for the
> > > >> necessity
> > > >> >> of extra dependencies.
> > > >> >>
> > > >> >> 2. *Move operators from malhar-demos and malhar-apps into contrib
> > (or
> > > >> >> library if prudent)*
> > > >> >>    There are various operators in both of these modules that are
> > > >> general
> > > >> >> enough to move into library or contrib.
> > > >> >>
> > > >> >> 3. *Create modules for all contrib subfolders*
> > > >> >>    All folders under `contrib/src/main/com/datatorrent/contrib/`
> > > >> should be
> > > >> >> converted to modules of contrib and listed as such in
> > > >> `/contrib/pom.xml`.
> > > >> >>    Additionally, each of these smaller contrib modules will have
> > its
> > > >> own
> > > >> >> version and dependencies.
> > > >> >>
> > > > >> >> 4. *Use the Shade Plugin to allow for backwards-compatible
> > > >> fully-qualified
> > > >> >> class names*
> > > > >> >>    This is made possible by the Shade plugin's class relocation
> > > > >> >> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>
> > > >> >> feature. This might be a bit error prone as well as confusing to
> > use
> > > >> for
> > > >> >> outside developers, but it must be done if these changes are to
> be
> > > made
> > > >> >> prior to a major release.
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> Let me know what you all think of this approach.
> > > >> >>
> > > >> >> Best,
> > > >> >> Andy
> > > >> >>
> > > >> >>
> > > >> >> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <
> > > >> chetan@datatorrent.com>
> > > >> >> wrote:
> > > >> >>
> > > >> >>> +1
> > > >> >>>
> > > >> >>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <
> > > >> gaurav@datatorrent.com>
> > > >> >>> wrote:
> > > >> >>>
> > > > >> >>>> I agree with David. Each artifact should have its own version
> > > >> >>>>
> > > >> >>>> Thanks
> > > >> >>>> -Gaurav
> > > >> >>>>
> > > >> >>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <
> > > david@datatorrent.com>
> > > >> >>>> wrote:
> > > >> >>>>
> > > >> >>>>> I actually think that each baby artifact should have its own
> > > >> version,
> > > >> >>>>> because each artifact has its own interface and its own life
> > > cycle,
> > > >> >>>>> especially after we break up the giant library, applications
> > will
> > > >> >>> depend
> > > >> >>>> on
> > > >> >>>>> the baby artifacts instead of the giant library.  For example
> if
> > > >> >> there
> > > >> >>> is
> > > >> >>>>> no change in malhar-contrib-kafka (I think the name should
> > > actually
> > > >> >> be
> > > >> >>>>> apex-malhar-kafka), we should not confuse users by bumping the
> > > >> >> version.
> > > >> >>>>>
> > > >> >>>>> David
> > > >> >>>>>
> > > >> >>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <
> > > >> andy@datatorrent.com
> > > >> >>>
> > > >> >>>>> wrote:
> > > >> >>>>>
> > > >> >>>>>> Tushar,
> > > >> >>>>>>
> > > >> >>>>>> I agree that all modules should inherit the version from the
> > > >> >> "parent
> > > >> >>>> pom"
> > > >> >>>>>> of the malhar repo. I think the benefits outweigh the cost of
> > > >> >> bumping
> > > >> >>>>>> versions of components that haven't actually changed. I'd
> love
> > to
> > > >> >> get
> > > > >> >>>>>> others' feedback on this as well.
> > > >> >>>>>>
> > > >> >>>>>> On another note, I plan on starting a spreadsheet/googledoc
> > with
> > > >> >> the
> > > >> >>>>>> possible groupings of operators into these modules. Stay
> > tuned...
> > > >> >>>>>>
> > > >> >>>>>> -Andy
> > > >> >>>>>>
> > > >> >>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <
> > > >> >>>> tushar@datatorrent.com>
> > > >> >>>>>> wrote:
> > > >> >>>>>>
> > > >> >>>>>>> +1 for the general idea
> > > >> >>>>>>>
> > > > >> >>>>>>> Are these independent modules going to have independent
> > > >> >> versions?
> > > >> >>>> For
> > > >> >>>>>>> example, if there is no change in kafka operator between
> > malhar
> > > >> >> 3.0
> > > >> >>>> and
> > > >> >>>>>>> malhar 4.0, will we increment version of
> malhar-contrib-kafka
> > to
> > > >> >>>> 4.0. I
> > > > >> >>>>>>> have learned from my previous project that it is easier to
> > > >> >> manage
> > > >> >>>>>> versions
> > > >> >>>>>>> if we make all modules at same version level for a release,
> > even
> > > >> >> if
> > > >> >>>>> there
> > > >> >>>>>>> is no change in a particular module.
> > > >> >>>>>>>
> > > >> >>>>>>> - Tushar.
> > > >> >>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <
> > > >> >>>> tim@datatorrent.com>
> > > >> >>>>>>> wrote:
> > > >> >>>>>>>
> > > >> >>>>>>>> I agree Andy's solution is better, but just for the sake of
> > > >> >>>> argument
> > > >> >>>>>>>> profiles can be inherited from a parent pom, so if the
> maven
> > > >> >>>>> archetype
> > > >> >>>>>>>> defines a new project with a parent pom with the correct
> > > >> >> profiles
> > > >> >>>>>>> defined,
> > > >> >>>>>>>> then the desired profiles can be activated in the pom of
> the
> > > >> >> new
> > > >> >>>>>> project.
> > > >> >>>>>>>> It is no more complicated than adding additional
> dependencies
> > > >> >> to
> > > >> >>>> your
> > > >> >>>>>>>> project.
> > > >> >>>>>>>>
> > > >> >>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <
> > > >> >>>>>> sandesh@datatorrent.com
> > > >> >>>>>>>>
> > > >> >>>>>>>> wrote:
> > > >> >>>>>>>>
> > > >> >>>>>>>>> Currently all the dependencies in Malhar-Contrib are
> marked
> > > >> >> as
> > > >> >>>>>>> optional.
> > > >> >>>>>>>> So
> > > >> >>>>>>>>> users have to already modify the existing POM to use it in
> > > >> >>> their
> > > >> >>>>>>> project.
> > > >> >>>>>>>>> So restructuring should be fine.
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <
> > > >> >>>>>>> chetan@datatorrent.com>
> > > >> >>>>>>>>> wrote:
> > > >> >>>>>>>>>
> > > >> >>>>>>>>>> The profiles are excellent when you are developing
> > > >> >>>>> malhar-contrib.
> > > >> >>>>>>>>> Profiles
> > > >> >>>>>>>>>> do not work when you are using malhar-contrib. The
> problem
> > > >> >>> Andy
> > > >> >>>>> is
> > > >> >>>>>>>>> trying
> > > > >> >>>>>>>>>> to solve is the latter. If there is an elegant solution
> > > >> >> which
> > > >> >>> I
> > > >> >>>> am
> > > >> >>>>>>>> missing
> > > >> >>>>>>>>>> using profiles, please correct me.
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>> The way Andy suggested is the way many successful
> projects
> > > >> >> do
> > > >> >>>> it.
> > > >> >>>>>>> Look
> > > >> >>>>>>>> at
> > > >> >>>>>>>>>> Netty as an example.
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>> +1 for that.
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>> --
> > > >> >>>>>>>>>> Chetan
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <
> > > >> >>>>>>> tim@datatorrent.com>
> > > >> >>>>>>>>>> wrote:
> > > >> >>>>>>>>>>
> > > >> >>>>>>>>>>> I think restructuring the project in that way would be
> > > >> >> the
> > > >> >>>>>>>> technically
> > > >> >>>>>>>>>>> correct thing to do, but if people are unwilling to
> > > >> >> accept
> > > >> >>>> the
> > > >> >>>>>>> change
> > > >> >>>>>>>>> in
> > > >> >>>>>>>>>>> project structure you could achieve something similar by
> > > >> >>>> using
> > > >> >>>>>>> maven
> > > >> >>>>>>>>>>> profiles. With profiles the project structure would
> > > >> >> remain
> > > >> >>> as
> > > >> >>>>> is.
> > > >> >>>>>>>>>> Profiles
> > > >> >>>>>>>>>>> could be added to the malhar pom, and a profile would
> > > >> >>> define
> > > >> >>>>> the
> > > >> >>>>>>>>>>> dependencies needed for different types of operators.
> For
> > > >> >>>>> example
> > > >> >>>>>>> the
> > > >> >>>>>>>>>> hbase
> > > >> >>>>>>>>>>> profile would define the dependencies for the hbase
> > > >> >>> operator.
> > > >> >>>>>> Then
> > > >> >>>>>>>> any
> > > >> >>>>>>>>>>> project using a malhar library would just activate the
> > > >> >>>> correct
> > > >> >>>>>>>> profile
> > > >> >>>>>>>>> in
> > > > >> >>>>>>>>>>> its pom, and the correct dependencies would be pulled
> > > >> >> in.
> > > >> >>
> > > >>
> > >
> >
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > > >> >>>>>>>>>>>
> > > >> >>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > > >> >>>>>>>> andy@datatorrent.com>
> > > >> >>>>>>>>>>> wrote:
> > > >> >>>>>>>>>>>
> > > >> >>>>>>>>>>>> Hi everyone,
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> I am currently assigned to MLHR-1843
> > > >> >>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which
> > > >> >>>>>>> essentially
> > > >> >>>>>>>>>> aims
> > > >> >>>>>>>>>>> to
> > > >> >>>>>>>>>>>> expose smaller, more consumable maven artifacts that
> > > >> >>> would
> > > >> >>>> do
> > > >> >>>>>>> away
> > > >> >>>>>>>>> with
> > > >> >>>>>>>>>>> the
> > > >> >>>>>>>>>>>> need to manually include necessary dependencies based
> > > >> >> on
> > > >> >>>> the
> > > >> >>>>>>>>> operators
> > > >> >>>>>>>>>> in
> > > >> >>>>>>>>>>>> use.
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> As an example, say I am building an app package that
> > > >> >>> needs
> > > >> >>>>>> Kafka
> > > >> >>>>>>>>> input
> > > >> >>>>>>>>>>> and
> > > >> >>>>>>>>>>>> output operators, but I don't want all the other
> > > >> >>> transitive
> > > >> >>>>>>>>>> dependencies
> > > >> >>>>>>>>>>>> that come via malhar-contrib. Currently I would need to
> > > >> >>>>> specify
> > > >> >>>>>>>>>>>> malhar-contrib as a dependency, and add an exclusions
> > > >> >>> block
> > > >> >>>>> in
> > > >> >>>>>>> my
> > > >> >>>>>>>>> app
> > > >> >>>>>>>>>>>> package pom:
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > >> >>>>>>>>>>>> <artifactId>malhar-contrib</artifactId>
> > > >> >>>>>> <version>3.0.0</version>
> > > >> >>>>>>>>> <!--
> > > >> >>>>>>>>>>> so
> > > >> >>>>>>>>>>>> none of malhar-contrib's deps are included -->*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *  <exclusions>    <exclusion>
> > > >> >> <groupId>*</groupId>
> > > >> >>>>>>>>>>>> <artifactId>*</artifactId>    </exclusion>
> > > >> >>>>>>>>> </exclusions></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> Then, I would have to include the kafka library
> > > >> >>> explicitly
> > > >> >>>>> as a
> > > >> >>>>>>>>>>> dependency:
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>org.apache.kafka</groupId>
> > > >> >>>>>>>>>>>> <artifactId>kafka_2.10</artifactId>
> > > >> >>>>>>>>>>>> <version>0.8.1.1</version></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> Wouldn't it be nice if I could just put this in my
> > > >> >> pom?:
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> In order to make this possible, we will need to
> > > >> >> organize
> > > >> >>>> the
> > > >> >>>>>>> malhar
> > > >> >>>>>>>>>>> project
> > > >> >>>>>>>>>>>> into more granular modules (artifacts). Specifically,
> > > >> >> the
> > > >> >>>>>>>>>> malhar-contrib
> > > >> >>>>>>>>>>>> artifact would essentially just be a pom that specifies
> > > >> >>>> each
> > > >> >>>>>>>> smaller
> > > >> >>>>>>>>>>> module
> > > >> >>>>>>>>>>>> as a dependency:
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<!-- in malhar-contrib's pom.xml: -->*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<modules>  <module>kafka</module>*
> > > >> >>>>>>>>>>>> *  <module>twitter</module>*
> > > >> >>>>>>>>>>>> *  <module>redis</module>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *  <!-- other smaller modules --></modules>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-twitter</artifactId>
> > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > > >> >>>>>>>>>>>> <artifactId>malhar-contrib-redis</artifactId>
> > > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> With these changes, there may be a risk of breaking
> > > >> >>>> backwards
> > > >> >>>>>>>>>>>> compatibility, however I think the gain in usability of
> > > >> >>>>> malhar
> > > >> >>>>>>>> merits
> > > >> >>>>>>>>>> the
> > > >> >>>>>>>>>>>> effort to make this work.
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> I am still relatively new to maven, so I would love to
> > > >> >>> get
> > > >> >>>>> some
> > > >> >>>>>>>>>> feedback
> > > >> >>>>>>>>>>>> from other devs about this!
> > > >> >>>>>>>>>>>>
> > > >> >>>>>>>>>>>> --
> > > >> >>>>>>>>>>>> Regards,
> > > >> >>>>>>>>>>>> Andy Perlitch
> > > >> >>>>>>>>>>>> Software Engineer
> > > >> >>>>>>>>>>>> DataTorrent Inc
> > > >> >>>>>>>>>>>> (408)829-9319
> > > >> >>>>>>
> > > >> >>>>>>
> > > >> >>>>>>
> > > >> >>>>>> --
> > > >> >>>>>> Regards,
> > > >> >>>>>> Andy Perlitch
> > > >> >>>>>> Software Engineer
> > > >> >>>>>> DataTorrent Inc
> > > >> >>>>>> (408)829-9319
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> --
> > > >> >> Regards,
> > > >> >> Andy Perlitch
> > > >> >> Software Engineer
> > > >> >> DataTorrent Inc
> > > >> >> (408)829-9319
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
As I understand, each artifact will be independent and will have its own
release cycle.
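
To make that concrete, here is a rough sketch of a module pom with its own
version (illustration only; the parent artifactId "malhar" and both version
numbers are assumptions made up for this example, while the
malhar-kafka-0.8 name and the kafka_2.10 coordinates come from earlier in
this thread):

```xml
<!-- kafka-0.8/pom.xml: hypothetical module, versions are assumptions -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar</artifactId>
    <version>3.3.0</version>
  </parent>

  <artifactId>malhar-kafka-0.8</artifactId>
  <!-- an explicit version overrides the one inherited from the parent,
       so this artifact can be released on its own schedule -->
  <version>1.0.0</version>

  <dependencies>
    <!-- the module carries only the third-party library it needs -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>
  </dependencies>
</project>
```

An app package would then depend on malhar-kafka-0.8 alone and pick up the
kafka library transitively, with no exclusions block.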

On Wed, Dec 23, 2015 at 6:50 PM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Wouldn't it also mean that in the near term we would be releasing new
> versions of all the artifacts whenever there is a new malhar release,
> even though many of them may not have changed?
>
> On Wed, Dec 23, 2015 at 5:23 PM, David Yan <da...@datatorrent.com> wrote:
>
> > Let's restart the discussion of this topic.
> >
> > We'd like to break malhar into modules, so we can have separate artifacts
> > for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
> > malhar-library.
> > This way users using them will only pull in the right dependencies
> > automatically, without the ugly business of optional and exclude
> > dependencies today.
> >
> > Also, I propose adding the 3rd party version in the artifact name.  For
> > example:
> >
> > malhar-kafka-0.8
> > malhar-kafka-0.9
> >
> > so that we can simultaneously support multiple versions of kafka.
> >
> > Thoughts?
> >
> > David
> >
> > On Fri, Oct 2, 2015 at 4:40 PM, David Yan <da...@datatorrent.com> wrote:
> >
> > > > All malhar operators are listed as part of the apidoc here:
> > > https://www.datatorrent.com/docs/apidocs/index.html
> > > And developers should be able to find the operators they need there.
> > >
> > > But, it's referenced from
> > > https://www.datatorrent.com/product-documentation/ as "Platform API
> > > Reference" so users may have trouble finding it.
> > >
> > > > We probably should have separate javadoc pages for Apex Core and Apex
> > > Malhar and add the links to this page http://apex.apache.org/docs.html
> > > also.
> > >
> > > David
> > >
> > > On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <
> pramod@datatorrent.com>
> > > wrote:
> > >
> > > >> We have to think about how people can find the operators and
> > > >> dependencies when bundling the applications. The complaint I hear often
> > >> is that folks can't find the operators they are looking for. We should
> > >> be careful about how much more work this will add for the user to now
> > >> search and find all the dependencies.
> > >>
> > >> Thanks
> > >>
> > >> > On Oct 2, 2015, at 3:44 PM, David Yan <da...@datatorrent.com>
> wrote:
> > >> >
> > >> > I actually don't think it makes sense any more to separate
> > >> malhar-library
> > >> > and malhar-contrib after the breakup, especially since we are
> planning
> > >> for
> > >> > a major release for these changes.
> > >> >
> > >> > People are often confused, myself included, which operators should
> be
> > in
> > >> > malhar-library and which ones should be in contrib.  Requiring a
> > >> separate
> > >> > setup for unit test should not be a criteria because the user of the
> > >> > library couldn't care less whether the unit test requires extra
> setup.
> > >> The
> > >> > factor of requiring extra dependencies isn't valid either because
> > >> there're
> > >> > already dependencies of malhar-library now that apex does not have.
> > >> >
> > >> > We can retain them for backward compatibility purpose but going
> > forward
> > >> new
> > >> > app packages should only use the baby artifacts, without denoting
> > >> whether
> > >> > it's contrib or not.
> > >> >
> > >> > David
> > >> >
> > >> > On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <
> andy@datatorrent.com
> > >
> > >> > wrote:
> > >> >
> > >> >> Hi all,
> > >> >>
> > >> >> This is a first cut at a plan to restructure malhar in a way that
> is
> > >> more
> > >> >> portable and adherent to Maven's principles of modularity and
> > >> dependency
> > >> >> management.
> > >> >>
> > >> >> Overview of Current Malhar Architecture
> > >> >> ---------------------------------------------------------------
> > >> >> The current malhar repo consists of several maven modules:
> > >> >>
> > >> >> * *malhar-library*
> > >> >>   operators which do not require additional transitive dependencies
> > >> beyond
> > >> >> what Apex and Hadoop require
> > >> >> *  *malhar-contrib*
> > >> >>   operators requiring other maven dependencies
> > >> >> * *malhar-demos*
> > >> >>   demo applications
> > >> >> * *malhar-samples*
> > >> >>   sample code showing example usage of malhar operators
> > >> >> * *malhar-apps*
> > >> >>   apex applications (currently only logstream)
> > >> >>
> > >> >>
> > >> >> Proposed Changes
> > >> >> ---------------------------------------------------------------
> > >> >>
> > >> >> 1. *Scrub malhar-library for any operators needing additional
> > >> dependencies*
> > >> >>  `malhar-library` is intended to consist of only operators without
> > >> extra
> > >> >> transitive dependencies. All operators should be checked for the
> > >> necessity
> > >> >> of extra dependencies.
> > >> >>
> > >> >> 2. *Move operators from malhar-demos and malhar-apps into contrib
> (or
> > >> >> library if prudent)*
> > >> >>    There are various operators in both of these modules that are
> > >> general
> > >> >> enough to move into library or contrib.
> > >> >>
> > >> >> 3. *Create modules for all contrib subfolders*
> > >> >>    All folders under `contrib/src/main/com/datatorrent/contrib/`
> > >> should be
> > >> >> converted to modules of contrib and listed as such in
> > >> `/contrib/pom.xml`.
> > >> >>    Additionally, each of these smaller contrib modules will have
> its
> > >> own
> > >> >> version and dependencies.
> > >> >>
> > >> >> 4. *Use the Shades Plugin to allow for backwards-compatible
> > >> fully-qualified
> > >> >> class names*
> > >> >>    This is made possible by shades class relocation
> > >> >> <
> > >> >>
> > >>
> >
> https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
> > >> >> feature. This might be a bit error prone as well as confusing to
> use
> > >> for
> > >> >> outside developers, but it must be done if these changes are to be
> > made
> > >> >> prior to a major release.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Let me know what you all think of this approach.
> > >> >>
> > >> >> Best,
> > >> >> Andy
> > >> >>
> > >> >>
> > >> >> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <
> > >> chetan@datatorrent.com>
> > >> >> wrote:
> > >> >>
> > >> >>> +1
> > >> >>>
> > >> >>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <
> > >> gaurav@datatorrent.com>
> > >> >>> wrote:
> > >> >>>
> > >> >>>> I agree with David.. Each artifact should have it's own version
> > >> >>>>
> > >> >>>> Thanks
> > >> >>>> -Gaurav
> > >> >>>>
> > >> >>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <
> > david@datatorrent.com>
> > >> >>>> wrote:
> > >> >>>>
> > >> >>>>> I actually think that each baby artifact should have its own
> > >> version,
> > >> >>>>> because each artifact has its own interface and its own life
> > cycle,
> > >> >>>>> especially after we break up the giant library, applications
> will
> > >> >>> depend
> > >> >>>> on
> > >> >>>>> the baby artifacts instead of the giant library.  For example if
> > >> >> there
> > >> >>> is
> > >> >>>>> no change in malhar-contrib-kafka (I think the name should
> > actually
> > >> >> be
> > >> >>>>> apex-malhar-kafka), we should not confuse users by bumping the
> > >> >> version.
> > >> >>>>>
> > >> >>>>> David
> > >> >>>>>
> > >> >>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <
> > >> andy@datatorrent.com
> > >> >>>
> > >> >>>>> wrote:
> > >> >>>>>
> > >> >>>>>> Tushar,
> > >> >>>>>>
> > >> >>>>>> I agree that all modules should inherit the version from the
> > >> >> "parent
> > >> >>>> pom"
> > >> >>>>>> of the malhar repo. I think the benefits outweigh the cost of
> > >> >> bumping
> > >> >>>>>> versions of components that haven't actually changed. I'd love
> to
> > >> >> get
> > >> >>>>>> others feedback on this as well.
> > >> >>>>>>
> > >> >>>>>> On another note, I plan on starting a spreadsheet/googledoc
> with
> > >> >> the
> > >> >>>>>> possible groupings of operators into these modules. Stay
> tuned...
> > >> >>>>>>
> > >> >>>>>> -Andy
> > >> >>>>>>
> > >> >>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <
> > >> >>>> tushar@datatorrent.com>
> > >> >>>>>> wrote:
> > >> >>>>>>
> > >> >>>>>>> +1 for the general idea
> > >> >>>>>>>
> > >> >>>>>>> Does these independent modules going to have independent
> > >> >> versions?
> > >> >>>> For
> > >> >>>>>>> example, if there is no change in kafka operator between
> malhar
> > >> >> 3.0
> > >> >>>> and
> > >> >>>>>>> malhar 4.0, will we increment version of malhar-contrib-kafka
> to
> > >> >>>> 4.0. I
> > >> >>>>>>> have learned from my previous project that, It is easier to
> > >> >> manage
> > >> >>>>>> versions
> > >> >>>>>>> if we make all modules at same version level for a release,
> even
> > >> >> if
> > >> >>>>> there
> > >> >>>>>>> is no change in a particular module.
> > >> >>>>>>>
> > >> >>>>>>> - Tushar.
> > >> >>>>>>>
> > >> >>>>>>>
> > >> >>>>>>>
> > >> >>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <
> > >> >>>> tim@datatorrent.com>
> > >> >>>>>>> wrote:
> > >> >>>>>>>
> > >> >>>>>>>> I agree Andy's solution is better, but just for the sake of
> > >> >>>> argument
> > >> >>>>>>>> profiles can be inherited from a parent pom, so if the maven
> > >> >>>>> archetype
> > >> >>>>>>>> defines a new project with a parent pom with the correct
> > >> >> profiles
> > >> >>>>>>> defined,
> > >> >>>>>>>> then the desired profiles can be activated in the pom of the
> > >> >> new
> > >> >>>>>> project.
> > >> >>>>>>>> It is no more complicated than adding additional dependencies
> > >> >> to
> > >> >>>> your
> > >> >>>>>>>> project.
> > >> >>>>>>>>
> > >> >>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <
> > >> >>>>>> sandesh@datatorrent.com
> > >> >>>>>>>>
> > >> >>>>>>>> wrote:
> > >> >>>>>>>>
> > >> >>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked
> > >> >> as
> > >> >>>>>>> optional.
> > >> >>>>>>>> So
> > >> >>>>>>>>> users have to already modify the existing POM to use it in
> > >> >>> their
> > >> >>>>>>> project.
> > >> >>>>>>>>> So restructuring should be fine.
> > >> >>>>>>>>>
> > >> >>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <
> > >> >>>>>>> chetan@datatorrent.com>
> > >> >>>>>>>>> wrote:
> > >> >>>>>>>>>
> > >> >>>>>>>>>> The profiles are excellent when you are developing
> > >> >>>>> malhar-contrib.
> > >> >>>>>>>>> Profiles
> > >> >>>>>>>>>> do not work when you are using malhar-contrib. The problem
> > >> >>> Andy
> > >> >>>>> is
> > >> >>>>>>>>> trying
> > >> >>>>>>>>>> to solve is the later. If there is an elegant solution
> > >> >> which
> > >> >>> I
> > >> >>>> am
> > >> >>>>>>>> missing
> > >> >>>>>>>>>> using profiles, please correct me.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> The way Andy suggested is the way many successful projects
> > >> >> do
> > >> >>>> it.
> > >> >>>>>>> Look
> > >> >>>>>>>> at
> > >> >>>>>>>>>> Netty as an example.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> +1 for that.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> --
> > >> >>>>>>>>>> Chetan
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <
> > >> >>>>>>> tim@datatorrent.com>
> > >> >>>>>>>>>> wrote:
> > >> >>>>>>>>>>
> > >> >>>>>>>>>>> I think restructuring the project in that way would be
> > >> >> the
> > >> >>>>>>>> technically
> > >> >>>>>>>>>>> correct thing to do, but if people are unwilling to
> > >> >> accept
> > >> >>>> the
> > >> >>>>>>> change
> > >> >>>>>>>>> in
> > >> >>>>>>>>>>> project structure you could achieve something similar by
> > >> >>>> using
> > >> >>>>>>> maven
> > >> >>>>>>>>>>> profiles. With profiles the project structure would
> > >> >> remain
> > >> >>> as
> > >> >>>>> is.
> > >> >>>>>>>>>> Profiles
> > >> >>>>>>>>>>> could be added to the malhar pom, and a profile would
> > >> >>> define
> > >> >>>>> the
> > >> >>>>>>>>>>> dependencies needed for different types of operators. For
> > >> >>>>> example
> > >> >>>>>>> the
> > >> >>>>>>>>>> hbase
> > >> >>>>>>>>>>> profile would define the dependencies for the hbase
> > >> >>> operator.
> > >> >>>>>> Then
> > >> >>>>>>>> any
> > >> >>>>>>>>>>> project using a malhar library would just activate the
> > >> >>>> correct
> > >> >>>>>>>> profile
> > >> >>>>>>>>> in
> > >> >>>>>>>>>>> it's pom, and the correct dependencies would be pulled
> > >> >> in.
> > >> >>
> > >>
> >
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <
> > >> >>>>>>>> andy@datatorrent.com>
> > >> >>>>>>>>>>> wrote:
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>>> Hi everyone,
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> I am currently assigned to MLHR-1843
> > >> >>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which
> > >> >>>>>>> essentially
> > >> >>>>>>>>>> aims
> > >> >>>>>>>>>>> to
> > >> >>>>>>>>>>>> expose smaller, more consumable maven artifacts that
> > >> >>> would
> > >> >>>> do
> > >> >>>>>>> away
> > >> >>>>>>>>> with
> > >> >>>>>>>>>>> the
> > >> >>>>>>>>>>>> need to manually include necessary dependencies based
> > >> >> on
> > >> >>>> the
> > >> >>>>>>>>> operators
> > >> >>>>>>>>>> in
> > >> >>>>>>>>>>>> use.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> As an example, say I am building an app package that
> > >> >>> needs
> > >> >>>>>> Kafka
> > >> >>>>>>>>> input
> > >> >>>>>>>>>>> and
> > >> >>>>>>>>>>>> output operators, but I don't want all the other
> > >> >>> transitive
> > >> >>>>>>>>>> dependencies
> > >> >>>>>>>>>>>> that come via malhar-contrib. Currently I would need to
> > >> >>>>> specify
> > >> >>>>>>>>>>>> malhar-contrib as a dependency, and add an exclusions
> > >> >>> block
> > >> >>>>> in
> > >> >>>>>>> my
> > >> >>>>>>>>> app
> > >> >>>>>>>>>>>> package pom:
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > >> >>>>>>>>>>>> <artifactId>malhar-contrib</artifactId>
> > >> >>>>>> <version>3.0.0</version>
> > >> >>>>>>>>> <!--
> > >> >>>>>>>>>>> so
> > >> >>>>>>>>>>>> none of malhar-contrib's deps are included -->*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *  <exclusions>    <exclusion>
> > >> >> <groupId>*</groupId>
> > >> >>>>>>>>>>>> <artifactId>*</artifactId>    </exclusion>
> > >> >>>>>>>>> </exclusions></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Then, I would have to include the kafka library
> > >> >>> explicitly
> > >> >>>>> as a
> > >> >>>>>>>>>>> dependency:
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>org.apache.kafka</groupId>
> > >> >>>>>>>>>>>> <artifactId>kafka_2.10</artifactId>
> > >> >>>>>>>>>>>> <version>0.8.1.1</version></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Wouldn't it be nice if I could just put this in my
> > >> >> pom?:
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> In order to make this possible, we will need to
> > >> >> organize
> > >> >>>> the
> > >> >>>>>>> malhar
> > >> >>>>>>>>>>> project
> > >> >>>>>>>>>>>> into more granular modules (artifacts). Specifically,
> > >> >> the
> > >> >>>>>>>>>> malhar-contrib
> > >> >>>>>>>>>>>> artifact would essentially just be a pom that specifies
> > >> >>>> each
> > >> >>>>>>>> smaller
> > >> >>>>>>>>>>> module
> > >> >>>>>>>>>>>> as a dependency:
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<!-- in malhar-contrib's pom.xml: -->*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<modules>  <module>kafka</module>*
> > >> >>>>>>>>>>>> *  <module>twitter</module>*
> > >> >>>>>>>>>>>> *  <module>redis</module>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *  <!-- other smaller modules --></modules>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > >> >>>>>>>>>>>> <artifactId>malhar-contrib-kafka</artifactId>
> > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > >> >>>>>>>>>>>> <artifactId>malhar-contrib-twitter</artifactId>
> > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> *<dependency>  <groupId>com.datatorrent</groupId>
> > >> >>>>>>>>>>>> <artifactId>malhar-contrib-redis</artifactId>
> > >> >>>>>>>>>>>> <version>3.0.0</version></dependency>*
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> With these changes, there may be a risk of breaking
> > >> >>>> backwards
> > >> >>>>>>>>>>>> compatibility, however I think the gain in usability of
> > >> >>>>> malhar
> > >> >>>>>>>> merits
> > >> >>>>>>>>>> the
> > >> >>>>>>>>>>>> effort to make this work.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> I am still relatively new to maven, so I would love to
> > >> >>> get
> > >> >>>>> some
> > >> >>>>>>>>>> feedback
> > >> >>>>>>>>>>>> from other devs about this!
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> --
> > >> >>>>>>>>>>>> Regards,
> > >> >>>>>>>>>>>> Andy Perlitch
> > >> >>>>>>>>>>>> Software Engineer
> > >> >>>>>>>>>>>> DataTorrent Inc
> > >> >>>>>>>>>>>> (408)829-9319
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>>
> > >> >>>>>> --
> > >> >>>>>> Regards,
> > >> >>>>>> Andy Perlitch
> > >> >>>>>> Software Engineer
> > >> >>>>>> DataTorrent Inc
> > >> >>>>>> (408)829-9319
> > >> >>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> Regards,
> > >> >> Andy Perlitch
> > >> >> Software Engineer
> > >> >> DataTorrent Inc
> > >> >> (408)829-9319
> > >> >>
> > >>
> > >
> > >
> >
>

Re: More sensible modules/artifacts in malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Wouldn't it also mean that, in the near term, we would be releasing new
versions of all the artifacts whenever there is a new malhar release, even
though many of them may not have changed?

On Wed, Dec 23, 2015 at 5:23 PM, David Yan <da...@datatorrent.com> wrote:

> Let's restart the discussion of this topic.
>
> We'd like to break malhar into modules, so we can have separate artifacts
> for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
> malhar-library.
> This way users using them will only pull in the right dependencies
> automatically, without the ugly business of optional and exclude
> dependencies today.
>
> Also, I propose adding the 3rd party version in the artifact name.  For
> example:
>
> malhar-kafka-0.8
> malhar-kafka-0.9
>
> so that we can simultaneously support multiple versions of kafka.
>
> Thoughts?
>
> David

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
Let's restart the discussion of this topic.

We'd like to break malhar into modules, so that we have separate artifacts
for kafka, cassandra, hbase, etc., instead of just malhar-contrib and
malhar-library.
This way users will automatically pull in only the dependencies they need,
without today's ugly business of optional dependencies and exclusions.

Also, I propose adding the third-party library version to the artifact name.  For
example:

malhar-kafka-0.8
malhar-kafka-0.9

so that we can simultaneously support multiple versions of kafka.
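
For illustration, an app package pom under this naming scheme might declare
its Kafka dependency roughly as sketched below. This is only a sketch: the
artifactId follows the naming proposed above, while the groupId and version
are borrowed from the existing malhar-contrib coordinates as placeholders.
No such artifact is published yet.

```xml
<!-- Hypothetical coordinates: malhar-kafka-0.8 is the proposed name, not a
     published artifact; groupId and version are placeholders. -->
<dependency>
  <groupId>com.datatorrent</groupId>
  <artifactId>malhar-kafka-0.8</artifactId>
  <version>3.0.0</version>
</dependency>
```

An app built against Kafka 0.9 would swap in malhar-kafka-0.9 instead.
Presumably only one of the two would be declared in a given app package,
since each would bring in its own Kafka client version.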

Thoughts?

David

On Fri, Oct 2, 2015 at 4:40 PM, David Yan <da...@datatorrent.com> wrote:

> All malhar operators are listed in the apidoc here:
> https://www.datatorrent.com/docs/apidocs/index.html
> Developers should be able to find the operators they need there.
>
> However, it is linked from
> https://www.datatorrent.com/product-documentation/ as "Platform API
> Reference", so users may have trouble finding it.
>
> We should probably have separate javadoc pages for Apex Core and Apex
> Malhar, and also add links to them on http://apex.apache.org/docs.html.
>
> David
>

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
All malhar operators are listed in the apidoc here:
https://www.datatorrent.com/docs/apidocs/index.html
Developers should be able to find the operators they need there.

However, it is linked from https://www.datatorrent.com/product-documentation/
as "Platform API Reference", so users may have trouble finding it.

We should probably have separate javadoc pages for Apex Core and Apex
Malhar, and also add links to them on http://apex.apache.org/docs.html.

David

On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> We have to think about how people can find the operators and
> dependencies when bundling their applications. The complaint I hear often
> is that folks can't find the operators they are looking for. We should
> be careful about how much extra work this will add for the user, who now
> has to search for and find all the dependencies.
>
> Thanks
>

Re: More sensible modules/artifacts in malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
We have to think about how people can find the operators and
dependencies when bundling their applications. The complaint I hear often
is that folks can't find the operators they are looking for. We should
be careful about how much extra work this will add for the user, who now
has to search for and find all the dependencies.

Thanks

> On Oct 2, 2015, at 3:44 PM, David Yan <da...@datatorrent.com> wrote:
>
> I actually don't think it makes sense any more to separate malhar-library
> and malhar-contrib after the breakup, especially since we are planning for
> a major release for these changes.
>
> People are often confused, myself included, about which operators should be
> in malhar-library and which should be in contrib.  Requiring a separate
> setup for unit tests should not be a criterion, because the user of the
> library couldn't care less whether the unit tests require extra setup.  The
> criterion of requiring extra dependencies isn't valid either, because
> malhar-library already has dependencies that apex does not have.
>
> We can retain them for backward compatibility, but going forward new
> app packages should only use the baby artifacts, without denoting whether
> they are contrib or not.
>
> David

Re: More sensible modules/artifacts in malhar

Posted by Chetan Narsude <ch...@datatorrent.com>.
+1.

Also, I do not think that a major version change is a prerequisite. Start
babyfication and give each module its own version. Contrib and Library can
include the baby modules and stay backward compatible (use shade, resource
plugin, whatever it takes) until the next major version (which comes
naturally and is not forced upon us).

The expectation is that the baby modules will be preferred over the poorly
assembled monoliths (library, contrib), so we can even drop the monoliths or
stitch them together differently with the major version.
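
As a rough sketch of the "use shade" option, the monolith's pom could pull in
a baby module and relocate its classes back to the old package, so existing
fully-qualified class names keep resolving. The new package name
(com.datatorrent.kafka) and the 1.0.0 version are assumptions made up for
illustration; only the maven-shade-plugin relocation syntax itself is taken
from the plugin's documentation.

```xml
<!-- Hypothetical fragment of malhar-contrib's pom.xml after the breakup. -->
<dependencies>
  <dependency>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-contrib-kafka</artifactId>
    <!-- placeholder: the baby module carries its own version -->
    <version>1.0.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- assumed new package used inside the baby module -->
                <pattern>com.datatorrent.kafka</pattern>
                <!-- old package that existing app packages import -->
                <shadedPattern>com.datatorrent.contrib.kafka</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

If the baby modules keep the existing package names, no relocation is needed
and the dependency entry alone would be enough.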

--
Chetan






On Fri, Oct 2, 2015 at 3:44 PM, David Yan <da...@datatorrent.com> wrote:

> I actually don't think it makes sense any more to separate malhar-library
> and malhar-contrib after the breakup, especially since we are planning for
> a major release for these changes.
>
> People are often confused, myself included, which operators should be in
> malhar-library and which ones should be in contrib.  Requiring a separate
> setup for unit test should not be a criteria because the user of the
> library couldn't care less whether the unit test requires extra setup.  The
> factor of requiring extra dependencies isn't valid either because there're
> already dependencies of malhar-library now that apex does not have.
>
> We can retain them for backward compatibility purpose but going forward new
> app packages should only use the baby artifacts, without denoting whether
> it's contrib or not.
>
> David

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
I actually don't think it makes sense any more to separate malhar-library
and malhar-contrib after the breakup, especially since we are planning a
major release for these changes.

People are often confused, myself included, about which operators should be
in malhar-library and which ones should be in contrib.  Requiring a separate
setup for unit tests should not be a criterion, because the user of the
library couldn't care less whether the unit tests require extra setup.  The
factor of requiring extra dependencies isn't valid either, because
malhar-library already has dependencies that apex does not have.

We can retain them for backward compatibility purposes, but going forward new
app packages should only use the baby artifacts, without denoting whether
they're contrib or not.
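
To illustrate, a new app package's pom might then carry only something like
the fragment below. The artifact name follows the apex-malhar-kafka naming
suggested earlier in this thread, and the version is a placeholder; neither
exists as a published artifact at this point.

```xml
<!-- Sketch of an app package pom.xml fragment. -->
<dependency>
  <groupId>com.datatorrent</groupId>
  <!-- no "library" or "contrib" in the name -->
  <artifactId>apex-malhar-kafka</artifactId>
  <!-- placeholder version, independent of the malhar release version -->
  <version>1.0.0</version>
</dependency>
```

The kafka client library would then arrive as a regular transitive
dependency of that artifact, with no exclusions block needed.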

David

On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <an...@datatorrent.com>
wrote:

> Hi all,
>
> This is a first cut at a plan to restructure malhar in a way that is more
> portable and adherent to Maven's principles of modularity and dependency
> management.
>
> Overview of Current Malhar Architecture
> ---------------------------------------------------------------
> The current malhar repo consists of several maven modules:
>
> * *malhar-library*
>    operators which do not require additional transitive dependencies beyond
> what Apex and Hadoop require
> *  *malhar-contrib*
>    operators requiring other maven dependencies
> * *malhar-demos*
>    demo applications
> * *malhar-samples*
>    sample code showing example usage of malhar operators
> * *malhar-apps*
>    apex applications (currently only logstream)
>
>
> Proposed Changes
> ---------------------------------------------------------------
>
> 1. *Scrub malhar-library for any operators needing additional dependencies*
>   `malhar-library` is intended to consist of only operators without extra
> transitive dependencies. All operators should be checked for the necessity
> of extra dependencies.
>
> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
> library if prudent)*
>     There are various operators in both of these modules that are general
> enough to move into library or contrib.
>
> 3. *Create modules for all contrib subfolders*
>     All folders under `contrib/src/main/com/datatorrent/contrib/` should be
> converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>     Additionally, each of these smaller contrib modules will have its own
> version and dependencies.
>
> 4. *Use the Shades Plugin to allow for backwards-compatible fully-qualified
> class names*
>     This is made possible by shades class relocation
> <
> https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
> >
> feature. This might be a bit error prone as well as confusing to use for
> outside developers, but it must be done if these changes are to be made
> prior to a major release.
>
>
>
> Let me know what you all think of this approach.
>
> Best,
> Andy
>
>

Re: More sensible modules/artifacts in malhar

Posted by Andy Perlitch <an...@datatorrent.com>.
Hi all,

This is a first cut at a plan to restructure malhar in a way that is more
portable and adherent to Maven's principles of modularity and dependency
management.

Overview of Current Malhar Architecture
---------------------------------------------------------------
The current malhar repo consists of several maven modules:

* *malhar-library*
   operators which do not require additional transitive dependencies beyond
what Apex and Hadoop require
* *malhar-contrib*
   operators requiring other maven dependencies
* *malhar-demos*
   demo applications
* *malhar-samples*
   sample code showing example usage of malhar operators
* *malhar-apps*
   apex applications (currently only logstream)
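
For reference, the layout above would correspond to a root aggregator pom roughly like the sketch below. The module directory names and the root artifactId are assumptions made for illustration; only the module artifact names and the 3.0.0 version come from this thread.

```xml
<!-- Hypothetical root pom.xml; directory names and artifactId are assumed -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.datatorrent</groupId>
  <artifactId>malhar</artifactId>
  <version>3.0.0</version>
  <!-- "pom" packaging: this project only aggregates the modules below -->
  <packaging>pom</packaging>

  <modules>
    <module>library</module>  <!-- malhar-library -->
    <module>contrib</module>  <!-- malhar-contrib -->
    <module>demos</module>    <!-- malhar-demos -->
    <module>samples</module>  <!-- malhar-samples -->
    <module>apps</module>     <!-- malhar-apps -->
  </modules>
</project>
```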


Proposed Changes
---------------------------------------------------------------

1. *Scrub malhar-library for any operators needing additional dependencies*
    `malhar-library` is intended to consist only of operators without extra
transitive dependencies. Every operator in it should be checked to confirm
that it needs none.

2. *Move operators from malhar-demos and malhar-apps into contrib (or
library if prudent)*
    There are various operators in both of these modules that are general
enough to move into library or contrib.

3. *Create modules for all contrib subfolders*
    All folders under `contrib/src/main/java/com/datatorrent/contrib/` should
be converted to modules of contrib and listed as such in `contrib/pom.xml`.
    Additionally, each of these smaller contrib modules will have its own
version and dependencies.

4. *Use the Maven Shade Plugin to allow for backwards-compatible
fully-qualified class names*
    This is made possible by the Shade plugin's class relocation feature
<https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>.
This might be error-prone as well as confusing for outside developers, but
it must be done if these changes are to be made prior to a major release.
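
As a rough illustration of item 4, a relocation could look something like the sketch below. This is only a sketch under assumptions: both package names are hypothetical placeholders (no operator has been moved yet), the plugin version is just one that was available at the time of writing, and which artifact would carry this configuration is still undecided.

```xml
<!-- Hypothetical fragment for the pom.xml of a backwards-compatibility artifact -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.1</version> <!-- assumed version -->
      <executions>
        <execution>
          <!-- run the shade goal when the jar is packaged -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- package where the classes live in the source (hypothetical) -->
                <pattern>com.datatorrent.contrib.example</pattern>
                <!-- package they are written to in the shaded jar, i.e. the
                     old name that existing applications import (hypothetical) -->
                <shadedPattern>com.datatorrent.demos.example</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Note that relocation renames classes in the shaded jar rather than copying them, so the resulting jar would expose only the old names. Unless an `artifactSet` restricts it, the shade goal also bundles the project's dependencies into that jar.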



Let me know what you all think of this approach.

Best,
Andy


On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <ch...@datatorrent.com>
wrote:

> +1



-- 
Regards,
Andy Perlitch
Software Engineer
DataTorrent Inc
(408)829-9319

Re: More sensible modules/artifacts in malhar

Posted by Chetan Narsude <ch...@datatorrent.com>.
+1

On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <ga...@datatorrent.com>
wrote:

> I agree with David. Each artifact should have its own version
>
> Thanks
> -Gaurav

Re: More sensible modules/artifacts in malhar

Posted by Gaurav Gupta <ga...@datatorrent.com>.
I agree with David. Each artifact should have its own version.

Thanks
-Gaurav

On Tue, Sep 22, 2015 at 11:07 AM, David Yan <da...@datatorrent.com> wrote:

> I actually think that each baby artifact should have its own version,
> because each artifact has its own interface and its own life cycle.
> Especially after we break up the giant library, applications will depend on
> the baby artifacts instead of the giant library. For example, if there is
> no change in malhar-contrib-kafka (I think the name should actually be
> apex-malhar-kafka), we should not confuse users by bumping the version.
>
> David

Re: More sensible modules/artifacts in malhar

Posted by David Yan <da...@datatorrent.com>.
I actually think that each baby artifact should have its own version,
because each artifact has its own interface and its own life cycle.
Especially after we break up the giant library, applications will depend on
the baby artifacts instead of the giant library. For example, if there is
no change in malhar-contrib-kafka (I think the name should actually be
apex-malhar-kafka), we should not confuse users by bumping the version.
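
As a rough sketch of what independent versioning could look like in Maven, a
child module may declare its own <version> instead of inheriting the
parent's. The file location and the 3.1.0 parent version below are
assumptions for illustration only; the artifact names are the ones used
earlier in this thread.

```xml
<!-- kafka/pom.xml (hypothetical location inside the malhar repo) -->
<project>
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-contrib</artifactId>
    <!-- Assumed: the parent has moved on to a newer release. -->
    <version>3.1.0</version>
  </parent>

  <artifactId>malhar-contrib-kafka</artifactId>
  <!-- Declared explicitly, so it overrides the inherited version and
       stays at 3.0.0 until this module actually changes. -->
  <version>3.0.0</version>
</project>
```

The trade-off is that every module's version then has to be tracked and
released separately.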

David

On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <an...@datatorrent.com> wrote:

> Tushar,
>
> I agree that all modules should inherit the version from the "parent pom"
> of the malhar repo. I think the benefits outweigh the cost of bumping
> versions of components that haven't actually changed. I'd love to get
> others' feedback on this as well.
>
> On another note, I plan on starting a spreadsheet/googledoc with the
> possible groupings of operators into these modules. Stay tuned...
>
> -Andy

Re: More sensible modules/artifacts in malhar

Posted by Andy Perlitch <an...@datatorrent.com>.
Tushar,

I agree that all modules should inherit the version from the "parent pom"
of the malhar repo. I think the benefits outweigh the cost of bumping
versions of components that haven't actually changed. I'd love to get
others' feedback on this as well.
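
To make the inherited-version idea concrete, here is a minimal sketch of a
child module pom. The file location is an assumption; the coordinates are
the ones from the original proposal. The child omits <version>, so Maven
takes it from the <parent> block.

```xml
<!-- kafka/pom.xml (hypothetical location inside the malhar repo) -->
<project>
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-contrib</artifactId>
    <version>3.0.0</version>
  </parent>

  <!-- No <version> element here: it is inherited from the parent,
       so this module is released as 3.0.0 as well. -->
  <artifactId>malhar-contrib-kafka</artifactId>

  <dependencies>
    <!-- The only third-party library this module needs. -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.10</artifactId>
      <version>0.8.1.1</version>
    </dependency>
  </dependencies>
</project>
```

With this layout a release only has to change the version in one place, at
the cost of re-releasing modules that did not change.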

On another note, I plan on starting a spreadsheet/googledoc with the
possible groupings of operators into these modules. Stay tuned...

-Andy

On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <tu...@datatorrent.com>
wrote:

> +1 for the general idea
>
> Are these independent modules going to have independent versions? For
> example, if there is no change in the kafka operator between malhar 3.0 and
> malhar 4.0, will we increment the version of malhar-contrib-kafka to 4.0? I
> have learned from my previous project that it is easier to manage versions
> if we keep all modules at the same version level for a release, even if there
> is no change in a particular module.
>
> - Tushar.



-- 
Regards,
Andy Perlitch
Software Engineer
DataTorrent Inc
(408)829-9319

Re: More sensible modules/artifacts in malhar

Posted by Tushar Gosavi <tu...@datatorrent.com>.
+1 for the general idea

Are these independent modules going to have independent versions? For
example, if there is no change in the kafka operator between malhar 3.0 and
malhar 4.0, will we increment the version of malhar-contrib-kafka to 4.0? I
have learned from my previous project that it is easier to manage versions
if we keep all modules at the same version level for a release, even if there
is no change in a particular module.

- Tushar.



On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <ti...@datatorrent.com>
wrote:

> I agree Andy's solution is better, but just for the sake of argument,
> profiles can be inherited from a parent pom. So if the maven archetype
> defines a new project with a parent pom that has the correct profiles
> defined, the desired profiles can be activated in the pom of the new
> project. It is no more complicated than adding additional dependencies to
> your project.

Re: More sensible modules/artifacts in malhar

Posted by Timothy Farkas <ti...@datatorrent.com>.
I agree Andy's solution is better, but just for the sake of argument:
profiles can be inherited from a parent pom. If the Maven archetype
generates a new project whose parent pom defines the correct profiles,
then the desired profiles can be activated in the pom of the new project.
That is no more complicated than adding additional dependencies to your
project.
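
As a rough sketch of what that could look like (the parent coordinates
`com.example:my-app-parent` and the profile id `kafka` are made up for
illustration, not taken from malhar), the generated project's pom would only
need to point at the parent that carries the profiles:

```xml
<!-- Hypothetical pom.xml of a project generated from the archetype. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Assumed parent; this parent pom is where profiles such as "kafka" would be defined. -->
  <parent>
    <groupId>com.example</groupId>
    <artifactId>my-app-parent</artifactId>
    <version>1.0.0</version>
  </parent>

  <artifactId>my-app</artifactId>
  <packaging>jar</packaging>
</project>
```

With that in place, something like `mvn package -P kafka` should select the
inherited profile. Command-line activation is only one assumption about how
the activation would be done.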

On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <sa...@datatorrent.com>
wrote:

> Currently all the dependencies in Malhar-Contrib are marked as optional. So
> users have to already modify the existing POM to use it in their project.
> So restructuring should be fine.

Re: More sensible modules/artifacts in malhar

Posted by Sandesh Hegde <sa...@datatorrent.com>.
Currently all the dependencies in malhar-contrib are marked as optional, so
users already have to modify their own POM to use it in their project.
Restructuring should therefore be fine.
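
For illustration, this is roughly what an optional dependency looks like in a
library pom, assuming malhar-contrib declares Kafka with the coordinates from
Andy's mail (this is a sketch, not copied from the actual pom):

```xml
<!-- Sketch of one entry inside malhar-contrib's <dependencies> section. -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>0.8.1.1</version>
  <!-- optional=true: Maven does not pass this dependency on transitively
       to projects that depend on malhar-contrib -->
  <optional>true</optional>
</dependency>
```

Because optional dependencies are not pulled in transitively, an app that uses
the Kafka operators has to declare the Kafka dependency again in its own pom.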

On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <ch...@datatorrent.com>
wrote:

> The profiles are excellent when you are developing malhar-contrib. Profiles
> do not work when you are using malhar-contrib. The problem Andy is trying
> to solve is the latter. If there is an elegant solution which I am missing
> using profiles, please correct me.
>
> The way Andy suggested is the way many successful projects do it. Look at
> Netty as an example.
>
> +1 for that.
>
>
> --
> Chetan

Re: More sensible modules/artifacts in malhar

Posted by Chetan Narsude <ch...@datatorrent.com>.
Profiles are excellent when you are developing malhar-contrib, but they do
not work when you are consuming it as a dependency. The problem Andy is
trying to solve is the latter. If I am missing an elegant solution using
profiles, please correct me.

The way Andy suggested is the way many successful projects do it. Look at
Netty as an example.

+1 for that.
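
As a sketch of the Netty layout being referred to: Netty publishes separate
artifacts such as `netty-handler` and `netty-codec-http` under the `io.netty`
groupId, alongside the aggregate `netty-all`. A consumer picks only the piece
it needs. `${netty.version}` below is a placeholder property, not a
recommendation of a specific release:

```xml
<!-- Depend on a single Netty module instead of the netty-all aggregate. -->
<dependency>
  <groupId>io.netty</groupId>
  <artifactId>netty-codec-http</artifactId>
  <version>${netty.version}</version>
</dependency>
```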


--
Chetan



On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <ti...@datatorrent.com>
wrote:

> I think restructuring the project in that way would be the technically
> correct thing to do, but if people are unwilling to accept the change in
> project structure you could achieve something similar by using maven
> profiles. With profiles the project structure would remain as is. Profiles
> could be added to the malhar pom, and a profile would define the
> dependencies needed for different types of operators. For example the hbase
> profile would define the dependencies for the hbase operator. Then any
> project using a malhar library would just activate the correct profile in
> its pom, and the correct dependencies would be pulled in.
>
> http://maven.apache.org/guides/introduction/introduction-to-profiles.html

Re: More sensible modules/artifacts in malhar

Posted by Timothy Farkas <ti...@datatorrent.com>.
I think restructuring the project in that way would be the technically
correct thing to do, but if people are unwilling to accept the change in
project structure, you could achieve something similar by using Maven
profiles. With profiles the project structure would remain as is. Profiles
could be added to the malhar pom, and each profile would define the
dependencies needed for one type of operator. For example, the hbase
profile would define the dependencies for the hbase operator. Then any
project using a malhar library would just activate the correct profile in
its pom, and the correct dependencies would be pulled in.

http://maven.apache.org/guides/introduction/introduction-to-profiles.html
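
A minimal sketch of such a profile, assuming an `hbase` profile id, the
`org.apache.hbase:hbase-client` artifact, and an `hbase.version` property (all
three are illustrative choices, not what malhar defines today):

```xml
<!-- Hypothetical fragment for the pom that would carry the profiles. -->
<profiles>
  <profile>
    <id>hbase</id>
    <dependencies>
      <!-- Only added to the build when the "hbase" profile is active. -->
      <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <!-- hbase.version is an assumed property defined elsewhere in the pom -->
        <version>${hbase.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

A build that sees this profile could then enable it with something like
`mvn package -P hbase`.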

On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <an...@datatorrent.com>
wrote:

> Hi everyone,
>
> I am currently assigned to MLHR-1843
> <https://malhar.atlassian.net/browse/MLHR-1843>, which essentially aims to
> expose smaller, more consumable maven artifacts that would do away with the
> need to manually include necessary dependencies based on the operators in
> use.
>
> As an example, say I am building an app package that needs Kafka input and
> output operators, but I don't want all the other transitive dependencies
> that come via malhar-contrib. Currently I would need to specify
> malhar-contrib as a dependency, and add an exclusions block  in my app
> package pom:
>
> <dependency>
>   <groupId>com.datatorrent</groupId>
>   <artifactId>malhar-contrib</artifactId>
>   <version>3.0.0</version>
>   <!-- so none of malhar-contrib's deps are included -->
>   <exclusions>
>     <exclusion>
>       <groupId>*</groupId>
>       <artifactId>*</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
>
> Then, I would have to include the kafka library explicitly as a dependency:
>
> <dependency>
>   <groupId>org.apache.kafka</groupId>
>   <artifactId>kafka_2.10</artifactId>
>   <version>0.8.1.1</version>
> </dependency>
>
> Wouldn't it be nice if I could just put this in my pom?:
>
> <dependency>
>   <groupId>com.datatorrent</groupId>
>   <artifactId>malhar-contrib-kafka</artifactId>
>   <version>3.0.0</version>
> </dependency>
>
> In order to make this possible, we will need to organize the malhar project
> into more granular modules (artifacts). Specifically, the malhar-contrib
> artifact would essentially just be a pom that specifies each smaller module
> as a dependency:
>
> <!-- in malhar-contrib's pom.xml: -->
> <modules>
>   <module>kafka</module>
>   <module>twitter</module>
>   <module>redis</module>
>   <!-- other smaller modules -->
> </modules>
>
> <dependency>
>   <groupId>com.datatorrent</groupId>
>   <artifactId>malhar-contrib-kafka</artifactId>
>   <version>3.0.0</version>
> </dependency>
>
> <dependency>
>   <groupId>com.datatorrent</groupId>
>   <artifactId>malhar-contrib-twitter</artifactId>
>   <version>3.0.0</version>
> </dependency>
>
> <dependency>
>   <groupId>com.datatorrent</groupId>
>   <artifactId>malhar-contrib-redis</artifactId>
>   <version>3.0.0</version>
> </dependency>
>
> With these changes, there may be a risk of breaking backwards
> compatibility; however, I think the gain in usability of malhar merits the
> effort to make this work.
>
> I am still relatively new to maven, so I would love to get some feedback
> from other devs about this!
>
> --
> Regards,
> Andy Perlitch
> Software Engineer
> DataTorrent Inc
> (408)829-9319
>