You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apex.apache.org by Chandni Singh <ch...@datatorrent.com> on 2015/12/15 09:02:42 UTC

Why are parser and formatter operator hidden in Malhar/contrib/schema

Hi,

I just came across couple of formatter and parser operators which are under
Malhar/contrib/schema.

I have couple of questions:
1. What does schema denote here?
2. Why formatter/parser which are functions are placed under schema package?
2. Why are all these operator under Malhar/contrib and not Malhar/lib

Thanks,
Chandni

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi,

    Yes, I suppose that is true too. But having individual modules for each
result in too many modules in Malhar. Is that OK?
 I am open to any way of organizing which stays consistent with rest of the
Malhar repository. From what I see there are 3 options here:

1. Create fine-grained modules
           - In this particular example, we can have a transformations
folder in Malhar and inside it three separate maven modules for inidividual
parsers. The abstract classes and interfaces can still go in malhar-library
as they do not have any dependencies
          - Pros: Only relevant dependencies will be part of the project
          - Cons: Total number of modules will increase

2. Create modules by clubbing based on functionality
        - Here it means all parsers go together in one module
        - Pros: Easier to find operators as located in one package
        - Cons: Unnecessary dependencies may be pulled in project.

3. Keep all operators not having any external library dependencies in
malhar-library and all others can be created as modules
       - This would mean XML parser, which has only JDK dependencies will
be in malhar-lib along with abstract classes. Json (depends on Jackson,
jakson is part of malhar-library at this point but I don't know if there is
a plan to remove it) and csv (depends on supercsv) can be created as
individual modules.
      - Pros: Minimizes the number of modules we will create in Malhar.
Also makes it possible to load only required dependencies in project.
      - Cons: Makes it hard to find which parsers are available vs which
are not. And I think this organization is little awkward and non-intuitive.

  I am not sure which is the best way to organize. I just think it should
be consistent across. Thoughts?

Thanks,
Isha

On Thu, Dec 24, 2015 at 12:25 PM, Thomas Weise <th...@datatorrent.com>
wrote:

> Most users are not interested using all parsers and their dependencies in
> their project. They have a specific format, for which they will need a
> parser. When I have xml files, I don't need the dependencies for csv.
>
> Thomas
>
> On Thu, Dec 24, 2015 at 12:22 PM, Isha Arkatkar <is...@datatorrent.com>
> wrote:
>
> > Hi,
> >
> >    Agree with Shubham's point here as well. The parser split in contrib
> and
> > library is probably harder to find and manage going forward. However, I
> > think increasing dependencies in library also has its downside.
> >
> > As mentioned in separate mail thread on splitting malhar into modules,
> can
> > we create a single module for parsers outside library and contrib? I can
> > close the current pull request and open a new one if that makes more
> sense.
> >
> > Thanks,
> > Isha
> >
> >
> > On Wed, Dec 23, 2015 at 11:00 PM, Shubham Pathak <
> shubham@datatorrent.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Can we take a decision regarding whether all parsers need to be in one
> > > place ? ( either malhar lib / contrib )
> > >
> > > In this PR https://github.com/apache/incubator-apex-malhar/pull/137 we
> > are
> > > moving XML and JSON parser from contrib to lib.
> > > I suggest we move CSV parsers as well .
> > >
> > > I understand Thomas's point of dependencies, but shouldn't we also
> think
> > > about users that use malhar that would wish to have all parsers under
> > > common bucket ?
> > >
> > >
> > > Thanks,
> > > Shubham
> > >
> > >
> > > On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise <th...@datatorrent.com>
> > > wrote:
> > >
> > > > This has changed over time. Pieces that cannot be covered by unit
> tests
> > > > should still be kept out of library.
> > > >
> > > > In addition, we cannot keep on expanding the number of dependencies
> in
> > a
> > > > single monolithic module. Hence, we are going start to create smaller
> > > > modules with their dependencies and tests setup correctly. The first
> of
> > > > those will be Kafka, for which a PR is currently open.
> > > >
> > > > For lib, we can add more operators as long as they don't introduce
> new
> > > > dependencies. When building an app and using one operator from lib,
> > > > everything else comes with it. That's a problem for the application
> > > > assembly we are trying to address.
> > > >
> > > > Thomas
> > > >
> > > >
> > > > On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde <
> > sandesh@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Even my understanding was along the lines, "if an unit test can't
> be
> > > run
> > > > on
> > > > > a developer machine without installing dependencies (
> RabbitMq/redis
> > > ec),
> > > > > it should go into contrib".
> > > > >
> > > > > Documenting the exact process as to what goes where helps.
> > > > >
> > > > > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <
> > > > > chinmay@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Isha,
> > > > > >
> > > > > > I think what Shubham meant to say is any operator which has a
> > > > dependency
> > > > > on
> > > > > > external entity (not library) should go in contrib. Others should
> > go
> > > in
> > > > > > library.
> > > > > >
> > > > > > For eg. DB related operators needs an instance/server of respect
> > > > database
> > > > > > to be running. Without that running server, the operator cannot
> > > satisfy
> > > > > the
> > > > > > functionality expected. Hence it goes into contrib.
> > > > > >
> > > > > > I think none of the parsers will have dependency on external
> > entity,
> > > > > hence
> > > > > > they should go in library and not contrib.
> > > > > >
> > > > > > - Chinmay.
> > > > > > On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com>
> > wrote:
> > > > > >
> > > > > > > Hi Shubham,
> > > > > > >
> > > > > > >       I think it is somewhat subjective what goes in Contrib Vs
> > > > > Library.
> > > > > > > Here is what I understood about general guideline:
> > > > > > >        An operator would go in  malhar-library if it does not
> > have
> > > > any
> > > > > > > other library dependency than what is already available. If
> > > operator
> > > > > > needs
> > > > > > > to include a dependency, it could be added to Contrib but not
> > > > library.
> > > > > > >
> > > > > > >      If we add an external library dependency in
> Malhar-library,
> > > the
> > > > > size
> > > > > > > of the lib jar keeps on growing. So, if we refer malhar-lib in
> a
> > > > > project,
> > > > > > > the size of the total package would increase, even if we may
> not
> > > > > directly
> > > > > > > use all external libs.
> > > > > > >
> > > > > > >     For this reason, in pull request #137
> > > > > > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I
> > > move
> > > > > only
> > > > > > > Json and XML parsers to malhar-library. Csv one had library
> > > reference
> > > > > to
> > > > > > > supercsv, so it is still in contrib.
> > > > > > >
> > > > > > >   Please correct if I missed something. :)
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Isha
> > > > > > >
> > > > > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> > > > > > shubham@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > What do we mean by "additional dependency" in case of
> deciding
> > > > > contrib
> > > > > > vs
> > > > > > > > lib ?
> > > > > > > > As i understand, "additional dependency"  is when an operator
> > is
> > > > > > > > interacting with external technologies . For e.g kafka, MQ ,
> > > HBase
> > > > > etc.
> > > > > > > > By this definition even CSV parsers need to be moved to
> > > malhar-lib.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Shubham
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <
> > > > isha@datatorrent.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Chandni,
> > > > > > > > >
> > > > > > > > >    I have moved that as well to lib, as the parsers
> depended
> > on
> > > > > that.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Isha
> > > > > > > > >
> > > > > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > > > > > > chandni@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > There is a converter package under
> com.datatorrent.contrib
> > > > which
> > > > > > has
> > > > > > > a
> > > > > > > > > > Converter API. This belongs in library as well.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Chandni
> > > > > > > > > >
> > > > > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > > > > > > chandni@datatorrent.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Isha,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for moving this. When you move these files,
> please
> > > > place
> > > > > > > then
> > > > > > > > > > under
> > > > > > > > > > > a package which reflects its functionality. I don't see
> > the
> > > > > need
> > > > > > > for
> > > > > > > > > > > package called schema.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Chandni
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > > > > > > isha@datatorrent.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >> Hi,
> > > > > > > > > > >>
> > > > > > > > > > >>   For csv parser there is an additional dependency.
> So,
> > > I'll
> > > > > > move
> > > > > > > > only
> > > > > > > > > > >> json
> > > > > > > > > > >> and xml to new location.
> > > > > > > > > > >>
> > > > > > > > > > >> Thanks,
> > > > > > > > > > >> Isha
> > > > > > > > > > >>
> > > > > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > > > > > > thomas@datatorrent.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > As long as the operators don't introduce additional
> > > > > > dependencies
> > > > > > > > > they
> > > > > > > > > > >> > should be in lib.
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > > > > > > >> shubham@datatorrent.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Hi Chandni,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I had written those operators.
> > > > > > > > > > >> > > Here is the jira for that
> > > > > > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > > > > > >> > > You would find the entire discussion there.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Why are all these operator under Malhar/contrib
> and
> > > not
> > > > > > > > Malhar/lib
> > > > > > > > > > >> > > When i was writing the code i saw
> AbstractCsvParser
> > in
> > > > > > > contriib
> > > > > > > > > and
> > > > > > > > > > >> hence
> > > > > > > > > > >> > > added there.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Recently i got to know which operators must go in
> > > > contrib
> > > > > > and
> > > > > > > > what
> > > > > > > > > > >> must
> > > > > > > > > > >> > go
> > > > > > > > > > >> > > in lib.
> > > > > > > > > > >> > > By that definition, these operators must belong to
> > > lib.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Thanks,
> > > > > > > > > > >> > > Shubham
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > > > > > > >> chandni@datatorrent.com>
> > > > > > > > > > >> > > wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > > Hi,
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > I just came across couple of formatter and
> parser
> > > > > > operators
> > > > > > > > > which
> > > > > > > > > > >> are
> > > > > > > > > > >> > > under
> > > > > > > > > > >> > > > Malhar/contrib/schema.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > I have couple of questions:
> > > > > > > > > > >> > > > 1. What does schema denote here?
> > > > > > > > > > >> > > > 2. Why formatter/parser which are functions are
> > > placed
> > > > > > under
> > > > > > > > > > schema
> > > > > > > > > > >> > > > package?
> > > > > > > > > > >> > > > 2. Why are all these operator under
> Malhar/contrib
> > > and
> > > > > not
> > > > > > > > > > >> Malhar/lib
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > Thanks,
> > > > > > > > > > >> > > > Chandni
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > >
> > > > > > > > > > >> >
> > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Thomas Weise <th...@datatorrent.com>.
Most users are not interested using all parsers and their dependencies in
their project. They have a specific format, for which they will need a
parser. When I have xml files, I don't need the dependencies for csv.

Thomas

On Thu, Dec 24, 2015 at 12:22 PM, Isha Arkatkar <is...@datatorrent.com>
wrote:

> Hi,
>
>    Agree with Shubham's point here as well. The parser split in contrib and
> library is probably harder to find and manage going forward. However, I
> think increasing dependencies in library also has its downside.
>
> As mentioned in separate mail thread on splitting malhar into modules, can
> we create a single module for parsers outside library and contrib? I can
> close the current pull request and open a new one if that makes more sense.
>
> Thanks,
> Isha
>
>
> On Wed, Dec 23, 2015 at 11:00 PM, Shubham Pathak <sh...@datatorrent.com>
> wrote:
>
> > Hi,
> >
> > Can we take a decision regarding whether all parsers need to be in one
> > place ? ( either malhar lib / contrib )
> >
> > In this PR https://github.com/apache/incubator-apex-malhar/pull/137 we
> are
> > moving XML and JSON parser from contrib to lib.
> > I suggest we move CSV parsers as well .
> >
> > I understand Thomas's point of dependencies, but shouldn't we also think
> > about users that use malhar that would wish to have all parsers under
> > common bucket ?
> >
> >
> > Thanks,
> > Shubham
> >
> >
> > On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise <th...@datatorrent.com>
> > wrote:
> >
> > > This has changed over time. Pieces that cannot be covered by unit tests
> > > should still be kept out of library.
> > >
> > > In addition, we cannot keep on expanding the number of dependencies in
> a
> > > single monolithic module. Hence, we are going start to create smaller
> > > modules with their dependencies and tests setup correctly. The first of
> > > those will be Kafka, for which a PR is currently open.
> > >
> > > For lib, we can add more operators as long as they don't introduce new
> > > dependencies. When building an app and using one operator from lib,
> > > everything else comes with it. That's a problem for the application
> > > assembly we are trying to address.
> > >
> > > Thomas
> > >
> > >
> > > On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde <
> sandesh@datatorrent.com>
> > > wrote:
> > >
> > > > Even my understanding was along the lines, "if an unit test can't be
> > run
> > > on
> > > > a developer machine without installing dependencies ( RabbitMq/redis
> > ec),
> > > > it should go into contrib".
> > > >
> > > > Documenting the exact process as to what goes where helps.
> > > >
> > > > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <
> > > > chinmay@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Hi Isha,
> > > > >
> > > > > I think what Shubham meant to say is any operator which has a
> > > dependency
> > > > on
> > > > > external entity (not library) should go in contrib. Others should
> go
> > in
> > > > > library.
> > > > >
> > > > > For eg. DB related operators needs an instance/server of respect
> > > database
> > > > > to be running. Without that running server, the operator cannot
> > satisfy
> > > > the
> > > > > functionality expected. Hence it goes into contrib.
> > > > >
> > > > > I think none of the parsers will have dependency on external
> entity,
> > > > hence
> > > > > they should go in library and not contrib.
> > > > >
> > > > > - Chinmay.
> > > > > On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com>
> wrote:
> > > > >
> > > > > > Hi Shubham,
> > > > > >
> > > > > >       I think it is somewhat subjective what goes in Contrib Vs
> > > > Library.
> > > > > > Here is what I understood about general guideline:
> > > > > >        An operator would go in  malhar-library if it does not
> have
> > > any
> > > > > > other library dependency than what is already available. If
> > operator
> > > > > needs
> > > > > > to include a dependency, it could be added to Contrib but not
> > > library.
> > > > > >
> > > > > >      If we add an external library dependency in Malhar-library,
> > the
> > > > size
> > > > > > of the lib jar keeps on growing. So, if we refer malhar-lib in a
> > > > project,
> > > > > > the size of the total package would increase, even if we may not
> > > > directly
> > > > > > use all external libs.
> > > > > >
> > > > > >     For this reason, in pull request #137
> > > > > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I
> > move
> > > > only
> > > > > > Json and XML parsers to malhar-library. Csv one had library
> > reference
> > > > to
> > > > > > supercsv, so it is still in contrib.
> > > > > >
> > > > > >   Please correct if I missed something. :)
> > > > > >
> > > > > > Thanks!
> > > > > > Isha
> > > > > >
> > > > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> > > > > shubham@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > What do we mean by "additional dependency" in case of deciding
> > > > contrib
> > > > > vs
> > > > > > > lib ?
> > > > > > > As i understand, "additional dependency"  is when an operator
> is
> > > > > > > interacting with external technologies . For e.g kafka, MQ ,
> > HBase
> > > > etc.
> > > > > > > By this definition even CSV parsers need to be moved to
> > malhar-lib.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Shubham
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <
> > > isha@datatorrent.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Chandni,
> > > > > > > >
> > > > > > > >    I have moved that as well to lib, as the parsers depended
> on
> > > > that.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Isha
> > > > > > > >
> > > > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > > > > > chandni@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > There is a converter package under com.datatorrent.contrib
> > > which
> > > > > has
> > > > > > a
> > > > > > > > > Converter API. This belongs in library as well.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Chandni
> > > > > > > > >
> > > > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > > > > > chandni@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Isha,
> > > > > > > > > >
> > > > > > > > > > Thanks for moving this. When you move these files, please
> > > place
> > > > > > then
> > > > > > > > > under
> > > > > > > > > > a package which reflects its functionality. I don't see
> the
> > > > need
> > > > > > for
> > > > > > > > > > package called schema.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Chandni
> > > > > > > > > >
> > > > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > > > > > isha@datatorrent.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> Hi,
> > > > > > > > > >>
> > > > > > > > > >>   For csv parser there is an additional dependency. So,
> > I'll
> > > > > move
> > > > > > > only
> > > > > > > > > >> json
> > > > > > > > > >> and xml to new location.
> > > > > > > > > >>
> > > > > > > > > >> Thanks,
> > > > > > > > > >> Isha
> > > > > > > > > >>
> > > > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > > > > > thomas@datatorrent.com>
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > As long as the operators don't introduce additional
> > > > > dependencies
> > > > > > > > they
> > > > > > > > > >> > should be in lib.
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > > > > > >> shubham@datatorrent.com>
> > > > > > > > > >> > wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > Hi Chandni,
> > > > > > > > > >> > >
> > > > > > > > > >> > > I had written those operators.
> > > > > > > > > >> > > Here is the jira for that
> > > > > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > > > > >> > > You would find the entire discussion there.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Why are all these operator under Malhar/contrib and
> > not
> > > > > > > Malhar/lib
> > > > > > > > > >> > > When i was writing the code i saw AbstractCsvParser
> in
> > > > > > contriib
> > > > > > > > and
> > > > > > > > > >> hence
> > > > > > > > > >> > > added there.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Recently i got to know which operators must go in
> > > contrib
> > > > > and
> > > > > > > what
> > > > > > > > > >> must
> > > > > > > > > >> > go
> > > > > > > > > >> > > in lib.
> > > > > > > > > >> > > By that definition, these operators must belong to
> > lib.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Thanks,
> > > > > > > > > >> > > Shubham
> > > > > > > > > >> > >
> > > > > > > > > >> > >
> > > > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > > > > > >> chandni@datatorrent.com>
> > > > > > > > > >> > > wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > > > Hi,
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > I just came across couple of formatter and parser
> > > > > operators
> > > > > > > > which
> > > > > > > > > >> are
> > > > > > > > > >> > > under
> > > > > > > > > >> > > > Malhar/contrib/schema.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > I have couple of questions:
> > > > > > > > > >> > > > 1. What does schema denote here?
> > > > > > > > > >> > > > 2. Why formatter/parser which are functions are
> > placed
> > > > > under
> > > > > > > > > schema
> > > > > > > > > >> > > > package?
> > > > > > > > > >> > > > 2. Why are all these operator under Malhar/contrib
> > and
> > > > not
> > > > > > > > > >> Malhar/lib
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > Thanks,
> > > > > > > > > >> > > > Chandni
> > > > > > > > > >> > > >
> > > > > > > > > >> > >
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi,

   Agree with Shubham's point here as well. The parser split in contrib and
library is probably harder to find and manage going forward. However, I
think increasing dependencies in library also has its downside.

As mentioned in separate mail thread on splitting malhar into modules, can
we create a single module for parsers outside library and contrib? I can
close the current pull request and open a new one if that makes more sense.

Thanks,
Isha


On Wed, Dec 23, 2015 at 11:00 PM, Shubham Pathak <sh...@datatorrent.com>
wrote:

> Hi,
>
> Can we take a decision regarding whether all parsers need to be in one
> place ? ( either malhar lib / contrib )
>
> In this PR https://github.com/apache/incubator-apex-malhar/pull/137 we are
> moving XML and JSON parser from contrib to lib.
> I suggest we move CSV parsers as well .
>
> I understand Thomas's point of dependencies, but shouldn't we also think
> about users that use malhar that would wish to have all parsers under
> common bucket ?
>
>
> Thanks,
> Shubham
>
>
> On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise <th...@datatorrent.com>
> wrote:
>
> > This has changed over time. Pieces that cannot be covered by unit tests
> > should still be kept out of library.
> >
> > In addition, we cannot keep on expanding the number of dependencies in a
> > single monolithic module. Hence, we are going start to create smaller
> > modules with their dependencies and tests setup correctly. The first of
> > those will be Kafka, for which a PR is currently open.
> >
> > For lib, we can add more operators as long as they don't introduce new
> > dependencies. When building an app and using one operator from lib,
> > everything else comes with it. That's a problem for the application
> > assembly we are trying to address.
> >
> > Thomas
> >
> >
> > On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde <sa...@datatorrent.com>
> > wrote:
> >
> > > Even my understanding was along the lines, "if an unit test can't be
> run
> > on
> > > a developer machine without installing dependencies ( RabbitMq/redis
> ec),
> > > it should go into contrib".
> > >
> > > Documenting the exact process as to what goes where helps.
> > >
> > > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <
> > > chinmay@datatorrent.com>
> > > wrote:
> > >
> > > > Hi Isha,
> > > >
> > > > I think what Shubham meant to say is any operator which has a
> > dependency
> > > on
> > > > external entity (not library) should go in contrib. Others should go
> in
> > > > library.
> > > >
> > > > For eg. DB related operators needs an instance/server of respect
> > database
> > > > to be running. Without that running server, the operator cannot
> satisfy
> > > the
> > > > functionality expected. Hence it goes into contrib.
> > > >
> > > > I think none of the parsers will have dependency on external entity,
> > > hence
> > > > they should go in library and not contrib.
> > > >
> > > > - Chinmay.
> > > > On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com> wrote:
> > > >
> > > > > Hi Shubham,
> > > > >
> > > > >       I think it is somewhat subjective what goes in Contrib Vs
> > > Library.
> > > > > Here is what I understood about general guideline:
> > > > >        An operator would go in  malhar-library if it does not have
> > any
> > > > > other library dependency than what is already available. If
> operator
> > > > needs
> > > > > to include a dependency, it could be added to Contrib but not
> > library.
> > > > >
> > > > >      If we add an external library dependency in Malhar-library,
> the
> > > size
> > > > > of the lib jar keeps on growing. So, if we refer malhar-lib in a
> > > project,
> > > > > the size of the total package would increase, even if we may not
> > > directly
> > > > > use all external libs.
> > > > >
> > > > >     For this reason, in pull request #137
> > > > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I
> move
> > > only
> > > > > Json and XML parsers to malhar-library. Csv one had library
> reference
> > > to
> > > > > supercsv, so it is still in contrib.
> > > > >
> > > > >   Please correct if I missed something. :)
> > > > >
> > > > > Thanks!
> > > > > Isha
> > > > >
> > > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> > > > shubham@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > What do we mean by "additional dependency" in case of deciding
> > > contrib
> > > > vs
> > > > > > lib ?
> > > > > > As i understand, "additional dependency"  is when an operator is
> > > > > > interacting with external technologies . For e.g kafka, MQ ,
> HBase
> > > etc.
> > > > > > By this definition even CSV parsers need to be moved to
> malhar-lib.
> > > > > >
> > > > > > Thanks,
> > > > > > Shubham
> > > > > >
> > > > > >
> > > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <
> > isha@datatorrent.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Chandni,
> > > > > > >
> > > > > > >    I have moved that as well to lib, as the parsers depended on
> > > that.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Isha
> > > > > > >
> > > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > > > > chandni@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > There is a converter package under com.datatorrent.contrib
> > which
> > > > has
> > > > > a
> > > > > > > > Converter API. This belongs in library as well.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chandni
> > > > > > > >
> > > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > > > > chandni@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Isha,
> > > > > > > > >
> > > > > > > > > Thanks for moving this. When you move these files, please
> > place
> > > > > then
> > > > > > > > under
> > > > > > > > > a package which reflects its functionality. I don't see the
> > > need
> > > > > for
> > > > > > > > > package called schema.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Chandni
> > > > > > > > >
> > > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > > > > isha@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Hi,
> > > > > > > > >>
> > > > > > > > >>   For csv parser there is an additional dependency. So,
> I'll
> > > > move
> > > > > > only
> > > > > > > > >> json
> > > > > > > > >> and xml to new location.
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >> Isha
> > > > > > > > >>
> > > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > > > > thomas@datatorrent.com>
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >> > As long as the operators don't introduce additional
> > > > dependencies
> > > > > > > they
> > > > > > > > >> > should be in lib.
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > > > > >> shubham@datatorrent.com>
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> > > Hi Chandni,
> > > > > > > > >> > >
> > > > > > > > >> > > I had written those operators.
> > > > > > > > >> > > Here is the jira for that
> > > > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > > > >> > > You would find the entire discussion there.
> > > > > > > > >> > >
> > > > > > > > >> > > Why are all these operator under Malhar/contrib and
> not
> > > > > > Malhar/lib
> > > > > > > > >> > > When i was writing the code i saw AbstractCsvParser in
> > > > > contriib
> > > > > > > and
> > > > > > > > >> hence
> > > > > > > > >> > > added there.
> > > > > > > > >> > >
> > > > > > > > >> > > Recently i got to know which operators must go in
> > contrib
> > > > and
> > > > > > what
> > > > > > > > >> must
> > > > > > > > >> > go
> > > > > > > > >> > > in lib.
> > > > > > > > >> > > By that definition, these operators must belong to
> lib.
> > > > > > > > >> > >
> > > > > > > > >> > > Thanks,
> > > > > > > > >> > > Shubham
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > > > > >> chandni@datatorrent.com>
> > > > > > > > >> > > wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > > Hi,
> > > > > > > > >> > > >
> > > > > > > > >> > > > I just came across couple of formatter and parser
> > > > operators
> > > > > > > which
> > > > > > > > >> are
> > > > > > > > >> > > under
> > > > > > > > >> > > > Malhar/contrib/schema.
> > > > > > > > >> > > >
> > > > > > > > >> > > > I have couple of questions:
> > > > > > > > >> > > > 1. What does schema denote here?
> > > > > > > > >> > > > 2. Why formatter/parser which are functions are
> placed
> > > > under
> > > > > > > > schema
> > > > > > > > >> > > > package?
> > > > > > > > >> > > > 2. Why are all these operator under Malhar/contrib
> and
> > > not
> > > > > > > > >> Malhar/lib
> > > > > > > > >> > > >
> > > > > > > > >> > > > Thanks,
> > > > > > > > >> > > > Chandni
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Shubham Pathak <sh...@datatorrent.com>.
Hi,

Can we take a decision regarding whether all parsers need to be in one
place ? ( either malhar lib / contrib )

In this PR https://github.com/apache/incubator-apex-malhar/pull/137 we are
moving XML and JSON parser from contrib to lib.
I suggest we move CSV parsers as well .

I understand Thomas's point of dependencies, but shouldn't we also think
about users that use malhar that would wish to have all parsers under
common bucket ?


Thanks,
Shubham


On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise <th...@datatorrent.com>
wrote:

> This has changed over time. Pieces that cannot be covered by unit tests
> should still be kept out of library.
>
> In addition, we cannot keep on expanding the number of dependencies in a
> single monolithic module. Hence, we are going start to create smaller
> modules with their dependencies and tests setup correctly. The first of
> those will be Kafka, for which a PR is currently open.
>
> For lib, we can add more operators as long as they don't introduce new
> dependencies. When building an app and using one operator from lib,
> everything else comes with it. That's a problem for the application
> assembly we are trying to address.
>
> Thomas
>
>
> On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde <sa...@datatorrent.com>
> wrote:
>
> > Even my understanding was along the lines, "if an unit test can't be run
> on
> > a developer machine without installing dependencies ( RabbitMq/redis ec),
> > it should go into contrib".
> >
> > Documenting the exact process as to what goes where helps.
> >
> > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <
> > chinmay@datatorrent.com>
> > wrote:
> >
> > > Hi Isha,
> > >
> > > I think what Shubham meant to say is any operator which has a
> dependency
> > on
> > > external entity (not library) should go in contrib. Others should go in
> > > library.
> > >
> > > For eg. DB related operators needs an instance/server of respect
> database
> > > to be running. Without that running server, the operator cannot satisfy
> > the
> > > functionality expected. Hence it goes into contrib.
> > >
> > > I think none of the parsers will have dependency on external entity,
> > hence
> > > they should go in library and not contrib.
> > >
> > > - Chinmay.
> > > On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com> wrote:
> > >
> > > > Hi Shubham,
> > > >
> > > >       I think it is somewhat subjective what goes in Contrib Vs
> > Library.
> > > > Here is what I understood about general guideline:
> > > >        An operator would go in  malhar-library if it does not have
> any
> > > > other library dependency than what is already available. If operator
> > > needs
> > > > to include a dependency, it could be added to Contrib but not
> library.
> > > >
> > > >      If we add an external library dependency in Malhar-library, the
> > size
> > > > of the lib jar keeps on growing. So, if we refer malhar-lib in a
> > project,
> > > > the size of the total package would increase, even if we may not
> > directly
> > > > use all external libs.
> > > >
> > > >     For this reason, in pull request #137
> > > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I move
> > only
> > > > Json and XML parsers to malhar-library. Csv one had library reference
> > to
> > > > supercsv, so it is still in contrib.
> > > >
> > > >   Please correct if I missed something. :)
> > > >
> > > > Thanks!
> > > > Isha
> > > >
> > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> > > shubham@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > What do we mean by "additional dependency" in case of deciding
> > contrib
> > > vs
> > > > > lib ?
> > > > > As i understand, "additional dependency"  is when an operator is
> > > > > interacting with external technologies . For e.g kafka, MQ , HBase
> > etc.
> > > > > By this definition even CSV parsers need to be moved to malhar-lib.
> > > > >
> > > > > Thanks,
> > > > > Shubham
> > > > >
> > > > >
> > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <
> isha@datatorrent.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Chandni,
> > > > > >
> > > > > >    I have moved that as well to lib, as the parsers depended on
> > that.
> > > > > >
> > > > > > Thanks,
> > > > > > Isha
> > > > > >
> > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > > > chandni@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > There is a converter package under com.datatorrent.contrib
> which
> > > has
> > > > a
> > > > > > > Converter API. This belongs in library as well.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Chandni
> > > > > > >
> > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > > > chandni@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Isha,
> > > > > > > >
> > > > > > > > Thanks for moving this. When you move these files, please
> place
> > > > then
> > > > > > > under
> > > > > > > > a package which reflects its functionality. I don't see the
> > need
> > > > for
> > > > > > > > package called schema.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chandni
> > > > > > > >
> > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > > > isha@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi,
> > > > > > > >>
> > > > > > > >>   For csv parser there is an additional dependency. So, I'll
> > > move
> > > > > only
> > > > > > > >> json
> > > > > > > >> and xml to new location.
> > > > > > > >>
> > > > > > > >> Thanks,
> > > > > > > >> Isha
> > > > > > > >>
> > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > > > thomas@datatorrent.com>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > As long as the operators don't introduce additional
> > > dependencies
> > > > > > they
> > > > > > > >> > should be in lib.
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > > > >> shubham@datatorrent.com>
> > > > > > > >> > wrote:
> > > > > > > >> >
> > > > > > > >> > > Hi Chandni,
> > > > > > > >> > >
> > > > > > > >> > > I had written those operators.
> > > > > > > >> > > Here is the jira for that
> > > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > > >> > > You would find the entire discussion there.
> > > > > > > >> > >
> > > > > > > >> > > Why are all these operator under Malhar/contrib and not
> > > > > Malhar/lib
> > > > > > > >> > > When i was writing the code i saw AbstractCsvParser in
> > > > contriib
> > > > > > and
> > > > > > > >> hence
> > > > > > > >> > > added there.
> > > > > > > >> > >
> > > > > > > >> > > Recently i got to know which operators must go in
> contrib
> > > and
> > > > > what
> > > > > > > >> must
> > > > > > > >> > go
> > > > > > > >> > > in lib.
> > > > > > > >> > > By that definition, these operators must belong to lib.
> > > > > > > >> > >
> > > > > > > >> > > Thanks,
> > > > > > > >> > > Shubham
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > > > >> chandni@datatorrent.com>
> > > > > > > >> > > wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hi,
> > > > > > > >> > > >
> > > > > > > >> > > > I just came across couple of formatter and parser
> > > operators
> > > > > > which
> > > > > > > >> are
> > > > > > > >> > > under
> > > > > > > >> > > > Malhar/contrib/schema.
> > > > > > > >> > > >
> > > > > > > >> > > > I have couple of questions:
> > > > > > > >> > > > 1. What does schema denote here?
> > > > > > > >> > > > 2. Why formatter/parser which are functions are placed
> > > under
> > > > > > > schema
> > > > > > > >> > > > package?
> > > > > > > >> > > > 2. Why are all these operator under Malhar/contrib and
> > not
> > > > > > > >> Malhar/lib
> > > > > > > >> > > >
> > > > > > > >> > > > Thanks,
> > > > > > > >> > > > Chandni
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Thomas Weise <th...@datatorrent.com>.
This has changed over time. Pieces that cannot be covered by unit tests
should still be kept out of library.

In addition, we cannot keep on expanding the number of dependencies in a
single monolithic module. Hence, we are going start to create smaller
modules with their dependencies and tests setup correctly. The first of
those will be Kafka, for which a PR is currently open.

For lib, we can add more operators as long as they don't introduce new
dependencies. When building an app and using one operator from lib,
everything else comes with it. That's a problem for the application
assembly we are trying to address.

Thomas


On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde <sa...@datatorrent.com>
wrote:

> Even my understanding was along the lines, "if an unit test can't be run on
> a developer machine without installing dependencies ( RabbitMq/redis ec),
> it should go into contrib".
>
> Documenting the exact process as to what goes where helps.
>
> On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <
> chinmay@datatorrent.com>
> wrote:
>
> > Hi Isha,
> >
> > I think what Shubham meant to say is any operator which has a dependency
> on
> > external entity (not library) should go in contrib. Others should go in
> > library.
> >
> > For eg. DB related operators needs an instance/server of respect database
> > to be running. Without that running server, the operator cannot satisfy
> the
> > functionality expected. Hence it goes into contrib.
> >
> > I think none of the parsers will have dependency on external entity,
> hence
> > they should go in library and not contrib.
> >
> > - Chinmay.
> > On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com> wrote:
> >
> > > Hi Shubham,
> > >
> > >       I think it is somewhat subjective what goes in Contrib Vs
> Library.
> > > Here is what I understood about general guideline:
> > >        An operator would go in  malhar-library if it does not have any
> > > other library dependency than what is already available. If operator
> > needs
> > > to include a dependency, it could be added to Contrib but not library.
> > >
> > >      If we add an external library dependency in Malhar-library, the
> size
> > > of the lib jar keeps on growing. So, if we refer malhar-lib in a
> project,
> > > the size of the total package would increase, even if we may not
> directly
> > > use all external libs.
> > >
> > >     For this reason, in pull request #137
> > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I move
> only
> > > Json and XML parsers to malhar-library. Csv one had library reference
> to
> > > supercsv, so it is still in contrib.
> > >
> > >   Please correct if I missed something. :)
> > >
> > > Thanks!
> > > Isha
> > >
> > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> > shubham@datatorrent.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > What do we mean by "additional dependency" in case of deciding
> contrib
> > vs
> > > > lib ?
> > > > As i understand, "additional dependency"  is when an operator is
> > > > interacting with external technologies . For e.g kafka, MQ , HBase
> etc.
> > > > By this definition even CSV parsers need to be moved to malhar-lib.
> > > >
> > > > Thanks,
> > > > Shubham
> > > >
> > > >
> > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <isha@datatorrent.com
> >
> > > > wrote:
> > > >
> > > > > Hi Chandni,
> > > > >
> > > > >    I have moved that as well to lib, as the parsers depended on
> that.
> > > > >
> > > > > Thanks,
> > > > > Isha
> > > > >
> > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > > chandni@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > There is a converter package under com.datatorrent.contrib which
> > has
> > > a
> > > > > > Converter API. This belongs in library as well.
> > > > > >
> > > > > > Thanks,
> > > > > > Chandni
> > > > > >
> > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > > chandni@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Isha,
> > > > > > >
> > > > > > > Thanks for moving this. When you move these files, please place
> > > then
> > > > > > under
> > > > > > > a package which reflects its functionality. I don't see the
> need
> > > for
> > > > > > > package called schema.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Chandni
> > > > > > >
> > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > > isha@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi,
> > > > > > >>
> > > > > > >>   For csv parser there is an additional dependency. So, I'll
> > move
> > > > only
> > > > > > >> json
> > > > > > >> and xml to new location.
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Isha
> > > > > > >>
> > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > > thomas@datatorrent.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > As long as the operators don't introduce additional
> > dependencies
> > > > > they
> > > > > > >> > should be in lib.
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > > >> shubham@datatorrent.com>
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Hi Chandni,
> > > > > > >> > >
> > > > > > >> > > I had written those operators.
> > > > > > >> > > Here is the jira for that
> > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > > >> > > You would find the entire discussion there.
> > > > > > >> > >
> > > > > > >> > > Why are all these operator under Malhar/contrib and not
> > > > Malhar/lib
> > > > > > >> > > When i was writing the code i saw AbstractCsvParser in
> > > contriib
> > > > > and
> > > > > > >> hence
> > > > > > >> > > added there.
> > > > > > >> > >
> > > > > > >> > > Recently i got to know which operators must go in contrib
> > and
> > > > what
> > > > > > >> must
> > > > > > >> > go
> > > > > > >> > > in lib.
> > > > > > >> > > By that definition, these operators must belong to lib.
> > > > > > >> > >
> > > > > > >> > > Thanks,
> > > > > > >> > > Shubham
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > > >> chandni@datatorrent.com>
> > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi,
> > > > > > >> > > >
> > > > > > >> > > > I just came across couple of formatter and parser
> > operators
> > > > > which
> > > > > > >> are
> > > > > > >> > > under
> > > > > > >> > > > Malhar/contrib/schema.
> > > > > > >> > > >
> > > > > > >> > > > I have couple of questions:
> > > > > > >> > > > 1. What does schema denote here?
> > > > > > >> > > > 2. Why formatter/parser which are functions are placed
> > under
> > > > > > schema
> > > > > > >> > > > package?
> > > > > > >> > > > 2. Why are all these operator under Malhar/contrib and
> not
> > > > > > >> Malhar/lib
> > > > > > >> > > >
> > > > > > >> > > > Thanks,
> > > > > > >> > > > Chandni
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Sandesh Hegde <sa...@datatorrent.com>.
Even my understanding was along the lines, "if an unit test can't be run on
a developer machine without installing dependencies ( RabbitMq/redis ec),
it should go into contrib".

Documenting the exact process as to what goes where helps.

On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar <ch...@datatorrent.com>
wrote:

> Hi Isha,
>
> I think what Shubham meant to say is any operator which has a dependency on
> external entity (not library) should go in contrib. Others should go in
> library.
>
> For eg. DB related operators needs an instance/server of respect database
> to be running. Without that running server, the operator cannot satisfy the
> functionality expected. Hence it goes into contrib.
>
> I think none of the parsers will have dependency on external entity, hence
> they should go in library and not contrib.
>
> - Chinmay.
> On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com> wrote:
>
> > Hi Shubham,
> >
> >       I think it is somewhat subjective what goes in Contrib Vs Library.
> > Here is what I understood about general guideline:
> >        An operator would go in  malhar-library if it does not have any
> > other library dependency than what is already available. If operator
> needs
> > to include a dependency, it could be added to Contrib but not library.
> >
> >      If we add an external library dependency in Malhar-library, the size
> > of the lib jar keeps on growing. So, if we refer malhar-lib in a project,
> > the size of the total package would increase, even if we may not directly
> > use all external libs.
> >
> >     For this reason, in pull request #137
> > <https://github.com/apache/incubator-apex-malhar/pull/137> , I move only
> > Json and XML parsers to malhar-library. Csv one had library reference to
> > supercsv, so it is still in contrib.
> >
> >   Please correct if I missed something. :)
> >
> > Thanks!
> > Isha
> >
> > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <
> shubham@datatorrent.com>
> > wrote:
> >
> > > Hi,
> > >
> > > What do we mean by "additional dependency" in case of deciding contrib
> vs
> > > lib ?
> > > As i understand, "additional dependency"  is when an operator is
> > > interacting with external technologies . For e.g kafka, MQ , HBase etc.
> > > By this definition even CSV parsers need to be moved to malhar-lib.
> > >
> > > Thanks,
> > > Shubham
> > >
> > >
> > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <is...@datatorrent.com>
> > > wrote:
> > >
> > > > Hi Chandni,
> > > >
> > > >    I have moved that as well to lib, as the parsers depended on that.
> > > >
> > > > Thanks,
> > > > Isha
> > > >
> > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> > chandni@datatorrent.com>
> > > > wrote:
> > > >
> > > > > There is a converter package under com.datatorrent.contrib which
> has
> > a
> > > > > Converter API. This belongs in library as well.
> > > > >
> > > > > Thanks,
> > > > > Chandni
> > > > >
> > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > > chandni@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > Isha,
> > > > > >
> > > > > > Thanks for moving this. When you move these files, please place
> > then
> > > > > under
> > > > > > a package which reflects its functionality. I don't see the need
> > for
> > > > > > package called schema.
> > > > > >
> > > > > > Thanks,
> > > > > > Chandni
> > > > > >
> > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > > isha@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >>   For csv parser there is an additional dependency. So, I'll
> move
> > > only
> > > > > >> json
> > > > > >> and xml to new location.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Isha
> > > > > >>
> > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > > thomas@datatorrent.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > As long as the operators don't introduce additional
> dependencies
> > > > they
> > > > > >> > should be in lib.
> > > > > >> >
> > > > > >> >
> > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > > >> shubham@datatorrent.com>
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > Hi Chandni,
> > > > > >> > >
> > > > > >> > > I had written those operators.
> > > > > >> > > Here is the jira for that
> > > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > > >> > > You would find the entire discussion there.
> > > > > >> > >
> > > > > >> > > Why are all these operator under Malhar/contrib and not
> > > Malhar/lib
> > > > > >> > > When i was writing the code i saw AbstractCsvParser in
> > contriib
> > > > and
> > > > > >> hence
> > > > > >> > > added there.
> > > > > >> > >
> > > > > >> > > Recently i got to know which operators must go in contrib
> and
> > > what
> > > > > >> must
> > > > > >> > go
> > > > > >> > > in lib.
> > > > > >> > > By that definition, these operators must belong to lib.
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > > Shubham
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > > >> chandni@datatorrent.com>
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > > > Hi,
> > > > > >> > > >
> > > > > >> > > > I just came across couple of formatter and parser
> operators
> > > > which
> > > > > >> are
> > > > > >> > > under
> > > > > >> > > > Malhar/contrib/schema.
> > > > > >> > > >
> > > > > >> > > > I have couple of questions:
> > > > > >> > > > 1. What does schema denote here?
> > > > > >> > > > 2. Why formatter/parser which are functions are placed
> under
> > > > > schema
> > > > > >> > > > package?
> > > > > >> > > > 2. Why are all these operator under Malhar/contrib and not
> > > > > >> Malhar/lib
> > > > > >> > > >
> > > > > >> > > > Thanks,
> > > > > >> > > > Chandni
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
Hi Isha,

I think what Shubham meant to say is any operator which has a dependency on
external entity (not library) should go in contrib. Others should go in
library.

For eg. DB related operators needs an instance/server of respect database
to be running. Without that running server, the operator cannot satisfy the
functionality expected. Hence it goes into contrib.

I think none of the parsers will have dependency on external entity, hence
they should go in library and not contrib.

- Chinmay.
On 18 Dec 2015 00:42, "Isha Arkatkar" <is...@datatorrent.com> wrote:

> Hi Shubham,
>
>       I think it is somewhat subjective what goes in Contrib Vs Library.
> Here is what I understood about general guideline:
>        An operator would go in  malhar-library if it does not have any
> other library dependency than what is already available. If operator needs
> to include a dependency, it could be added to Contrib but not library.
>
>      If we add an external library dependency in Malhar-library, the size
> of the lib jar keeps on growing. So, if we refer malhar-lib in a project,
> the size of the total package would increase, even if we may not directly
> use all external libs.
>
>     For this reason, in pull request #137
> <https://github.com/apache/incubator-apex-malhar/pull/137> , I move only
> Json and XML parsers to malhar-library. Csv one had library reference to
> supercsv, so it is still in contrib.
>
>   Please correct if I missed something. :)
>
> Thanks!
> Isha
>
> On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <sh...@datatorrent.com>
> wrote:
>
> > Hi,
> >
> > What do we mean by "additional dependency" in case of deciding contrib vs
> > lib ?
> > As i understand, "additional dependency"  is when an operator is
> > interacting with external technologies . For e.g kafka, MQ , HBase etc.
> > By this definition even CSV parsers need to be moved to malhar-lib.
> >
> > Thanks,
> > Shubham
> >
> >
> > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <is...@datatorrent.com>
> > wrote:
> >
> > > Hi Chandni,
> > >
> > >    I have moved that as well to lib, as the parsers depended on that.
> > >
> > > Thanks,
> > > Isha
> > >
> > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <
> chandni@datatorrent.com>
> > > wrote:
> > >
> > > > There is a converter package under com.datatorrent.contrib which has
> a
> > > > Converter API. This belongs in library as well.
> > > >
> > > > Thanks,
> > > > Chandni
> > > >
> > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> > chandni@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Isha,
> > > > >
> > > > > Thanks for moving this. When you move these files, please place
> then
> > > > under
> > > > > a package which reflects its functionality. I don't see the need
> for
> > > > > package called schema.
> > > > >
> > > > > Thanks,
> > > > > Chandni
> > > > >
> > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> > isha@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > >> Hi,
> > > > >>
> > > > >>   For csv parser there is an additional dependency. So, I'll move
> > only
> > > > >> json
> > > > >> and xml to new location.
> > > > >>
> > > > >> Thanks,
> > > > >> Isha
> > > > >>
> > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > > thomas@datatorrent.com>
> > > > >> wrote:
> > > > >>
> > > > >> > As long as the operators don't introduce additional dependencies
> > > they
> > > > >> > should be in lib.
> > > > >> >
> > > > >> >
> > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > > >> shubham@datatorrent.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > Hi Chandni,
> > > > >> > >
> > > > >> > > I had written those operators.
> > > > >> > > Here is the jira for that
> > > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > > >> > > You would find the entire discussion there.
> > > > >> > >
> > > > >> > > Why are all these operator under Malhar/contrib and not
> > Malhar/lib
> > > > >> > > When i was writing the code i saw AbstractCsvParser in
> contriib
> > > and
> > > > >> hence
> > > > >> > > added there.
> > > > >> > >
> > > > >> > > Recently i got to know which operators must go in contrib and
> > what
> > > > >> must
> > > > >> > go
> > > > >> > > in lib.
> > > > >> > > By that definition, these operators must belong to lib.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > Shubham
> > > > >> > >
> > > > >> > >
> > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > > >> chandni@datatorrent.com>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Hi,
> > > > >> > > >
> > > > >> > > > I just came across couple of formatter and parser operators
> > > which
> > > > >> are
> > > > >> > > under
> > > > >> > > > Malhar/contrib/schema.
> > > > >> > > >
> > > > >> > > > I have couple of questions:
> > > > >> > > > 1. What does schema denote here?
> > > > >> > > > 2. Why formatter/parser which are functions are placed under
> > > > schema
> > > > >> > > > package?
> > > > >> > > > 2. Why are all these operator under Malhar/contrib and not
> > > > >> Malhar/lib
> > > > >> > > >
> > > > >> > > > Thanks,
> > > > >> > > > Chandni
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi Shubham,

      I think it is somewhat subjective what goes in Contrib Vs Library.
Here is what I understood about general guideline:
       An operator would go in  malhar-library if it does not have any
other library dependency than what is already available. If operator needs
to include a dependency, it could be added to Contrib but not library.

     If we add an external library dependency in Malhar-library, the size
of the lib jar keeps on growing. So, if we refer malhar-lib in a project,
the size of the total package would increase, even if we may not directly
use all external libs.

    For this reason, in pull request #137
<https://github.com/apache/incubator-apex-malhar/pull/137> , I move only
Json and XML parsers to malhar-library. Csv one had library reference to
supercsv, so it is still in contrib.

  Please correct if I missed something. :)

Thanks!
Isha

On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak <sh...@datatorrent.com>
wrote:

> Hi,
>
> What do we mean by "additional dependency" in case of deciding contrib vs
> lib ?
> As i understand, "additional dependency"  is when an operator is
> interacting with external technologies . For e.g kafka, MQ , HBase etc.
> By this definition even CSV parsers need to be moved to malhar-lib.
>
> Thanks,
> Shubham
>
>
> On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <is...@datatorrent.com>
> wrote:
>
> > Hi Chandni,
> >
> >    I have moved that as well to lib, as the parsers depended on that.
> >
> > Thanks,
> > Isha
> >
> > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <ch...@datatorrent.com>
> > wrote:
> >
> > > There is a converter package under com.datatorrent.contrib which has a
> > > Converter API. This belongs in library as well.
> > >
> > > Thanks,
> > > Chandni
> > >
> > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <
> chandni@datatorrent.com>
> > > wrote:
> > >
> > > > Isha,
> > > >
> > > > Thanks for moving this. When you move these files, please place then
> > > under
> > > > a package which reflects its functionality. I don't see the need for
> > > > package called schema.
> > > >
> > > > Thanks,
> > > > Chandni
> > > >
> > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <
> isha@datatorrent.com>
> > > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >>   For csv parser there is an additional dependency. So, I'll move
> only
> > > >> json
> > > >> and xml to new location.
> > > >>
> > > >> Thanks,
> > > >> Isha
> > > >>
> > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> > thomas@datatorrent.com>
> > > >> wrote:
> > > >>
> > > >> > As long as the operators don't introduce additional dependencies
> > they
> > > >> > should be in lib.
> > > >> >
> > > >> >
> > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > > >> shubham@datatorrent.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Hi Chandni,
> > > >> > >
> > > >> > > I had written those operators.
> > > >> > > Here is the jira for that
> > > >> https://malhar.atlassian.net/browse/MLHR-1838
> > > >> > > You would find the entire discussion there.
> > > >> > >
> > > >> > > Why are all these operator under Malhar/contrib and not
> Malhar/lib
> > > >> > > When i was writing the code i saw AbstractCsvParser in contriib
> > and
> > > >> hence
> > > >> > > added there.
> > > >> > >
> > > >> > > Recently i got to know which operators must go in contrib and
> what
> > > >> must
> > > >> > go
> > > >> > > in lib.
> > > >> > > By that definition, these operators must belong to lib.
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Shubham
> > > >> > >
> > > >> > >
> > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > > >> chandni@datatorrent.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi,
> > > >> > > >
> > > >> > > > I just came across couple of formatter and parser operators
> > which
> > > >> are
> > > >> > > under
> > > >> > > > Malhar/contrib/schema.
> > > >> > > >
> > > >> > > > I have couple of questions:
> > > >> > > > 1. What does schema denote here?
> > > >> > > > 2. Why formatter/parser which are functions are placed under
> > > schema
> > > >> > > > package?
> > > >> > > > 2. Why are all these operator under Malhar/contrib and not
> > > >> Malhar/lib
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Chandni
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Shubham Pathak <sh...@datatorrent.com>.
Hi,

What do we mean by "additional dependency" in case of deciding contrib vs
lib ?
As i understand, "additional dependency"  is when an operator is
interacting with external technologies . For e.g kafka, MQ , HBase etc.
By this definition even CSV parsers need to be moved to malhar-lib.

Thanks,
Shubham


On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar <is...@datatorrent.com> wrote:

> Hi Chandni,
>
>    I have moved that as well to lib, as the parsers depended on that.
>
> Thanks,
> Isha
>
> On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <ch...@datatorrent.com>
> wrote:
>
> > There is a converter package under com.datatorrent.contrib which has a
> > Converter API. This belongs in library as well.
> >
> > Thanks,
> > Chandni
> >
> > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <ch...@datatorrent.com>
> > wrote:
> >
> > > Isha,
> > >
> > > Thanks for moving this. When you move these files, please place then
> > under
> > > a package which reflects its functionality. I don't see the need for
> > > package called schema.
> > >
> > > Thanks,
> > > Chandni
> > >
> > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <is...@datatorrent.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >>   For csv parser there is an additional dependency. So, I'll move only
> > >> json
> > >> and xml to new location.
> > >>
> > >> Thanks,
> > >> Isha
> > >>
> > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <
> thomas@datatorrent.com>
> > >> wrote:
> > >>
> > >> > As long as the operators don't introduce additional dependencies
> they
> > >> > should be in lib.
> > >> >
> > >> >
> > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> > >> shubham@datatorrent.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Chandni,
> > >> > >
> > >> > > I had written those operators.
> > >> > > Here is the jira for that
> > >> https://malhar.atlassian.net/browse/MLHR-1838
> > >> > > You would find the entire discussion there.
> > >> > >
> > >> > > Why are all these operator under Malhar/contrib and not Malhar/lib
> > >> > > When i was writing the code i saw AbstractCsvParser in contriib
> and
> > >> hence
> > >> > > added there.
> > >> > >
> > >> > > Recently i got to know which operators must go in contrib and what
> > >> must
> > >> > go
> > >> > > in lib.
> > >> > > By that definition, these operators must belong to lib.
> > >> > >
> > >> > > Thanks,
> > >> > > Shubham
> > >> > >
> > >> > >
> > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> > >> chandni@datatorrent.com>
> > >> > > wrote:
> > >> > >
> > >> > > > Hi,
> > >> > > >
> > >> > > > I just came across couple of formatter and parser operators
> which
> > >> are
> > >> > > under
> > >> > > > Malhar/contrib/schema.
> > >> > > >
> > >> > > > I have couple of questions:
> > >> > > > 1. What does schema denote here?
> > >> > > > 2. Why formatter/parser which are functions are placed under
> > schema
> > >> > > > package?
> > >> > > > 2. Why are all these operator under Malhar/contrib and not
> > >> Malhar/lib
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Chandni
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi Chandni,

   I have moved that as well to lib, as the parsers depended on that.

Thanks,
Isha

On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh <ch...@datatorrent.com>
wrote:

> There is a converter package under com.datatorrent.contrib which has a
> Converter API. This belongs in library as well.
>
> Thanks,
> Chandni
>
> On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <ch...@datatorrent.com>
> wrote:
>
> > Isha,
> >
> > Thanks for moving this. When you move these files, please place then
> under
> > a package which reflects its functionality. I don't see the need for
> > package called schema.
> >
> > Thanks,
> > Chandni
> >
> > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <is...@datatorrent.com>
> > wrote:
> >
> >> Hi,
> >>
> >>   For csv parser there is an additional dependency. So, I'll move only
> >> json
> >> and xml to new location.
> >>
> >> Thanks,
> >> Isha
> >>
> >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <th...@datatorrent.com>
> >> wrote:
> >>
> >> > As long as the operators don't introduce additional dependencies they
> >> > should be in lib.
> >> >
> >> >
> >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
> >> shubham@datatorrent.com>
> >> > wrote:
> >> >
> >> > > Hi Chandni,
> >> > >
> >> > > I had written those operators.
> >> > > Here is the jira for that
> >> https://malhar.atlassian.net/browse/MLHR-1838
> >> > > You would find the entire discussion there.
> >> > >
> >> > > Why are all these operator under Malhar/contrib and not Malhar/lib
> >> > > When i was writing the code i saw AbstractCsvParser in contriib and
> >> hence
> >> > > added there.
> >> > >
> >> > > Recently i got to know which operators must go in contrib and what
> >> must
> >> > go
> >> > > in lib.
> >> > > By that definition, these operators must belong to lib.
> >> > >
> >> > > Thanks,
> >> > > Shubham
> >> > >
> >> > >
> >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> >> chandni@datatorrent.com>
> >> > > wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > I just came across couple of formatter and parser operators which
> >> are
> >> > > under
> >> > > > Malhar/contrib/schema.
> >> > > >
> >> > > > I have couple of questions:
> >> > > > 1. What does schema denote here?
> >> > > > 2. Why formatter/parser which are functions are placed under
> schema
> >> > > > package?
> >> > > > 2. Why are all these operator under Malhar/contrib and not
> >> Malhar/lib
> >> > > >
> >> > > > Thanks,
> >> > > > Chandni
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Chandni Singh <ch...@datatorrent.com>.
There is a converter package under com.datatorrent.contrib which has a
Converter API. This belongs in library as well.

Thanks,
Chandni

On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh <ch...@datatorrent.com>
wrote:

> Isha,
>
> Thanks for moving this. When you move these files, please place then under
> a package which reflects its functionality. I don't see the need for
> package called schema.
>
> Thanks,
> Chandni
>
> On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <is...@datatorrent.com>
> wrote:
>
>> Hi,
>>
>>   For csv parser there is an additional dependency. So, I'll move only
>> json
>> and xml to new location.
>>
>> Thanks,
>> Isha
>>
>> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <th...@datatorrent.com>
>> wrote:
>>
>> > As long as the operators don't introduce additional dependencies they
>> > should be in lib.
>> >
>> >
>> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <
>> shubham@datatorrent.com>
>> > wrote:
>> >
>> > > Hi Chandni,
>> > >
>> > > I had written those operators.
>> > > Here is the jira for that
>> https://malhar.atlassian.net/browse/MLHR-1838
>> > > You would find the entire discussion there.
>> > >
>> > > Why are all these operator under Malhar/contrib and not Malhar/lib
>> > > When i was writing the code i saw AbstractCsvParser in contriib and
>> hence
>> > > added there.
>> > >
>> > > Recently i got to know which operators must go in contrib and what
>> must
>> > go
>> > > in lib.
>> > > By that definition, these operators must belong to lib.
>> > >
>> > > Thanks,
>> > > Shubham
>> > >
>> > >
>> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
>> chandni@datatorrent.com>
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I just came across couple of formatter and parser operators which
>> are
>> > > under
>> > > > Malhar/contrib/schema.
>> > > >
>> > > > I have couple of questions:
>> > > > 1. What does schema denote here?
>> > > > 2. Why formatter/parser which are functions are placed under schema
>> > > > package?
>> > > > 2. Why are all these operator under Malhar/contrib and not
>> Malhar/lib
>> > > >
>> > > > Thanks,
>> > > > Chandni
>> > > >
>> > >
>> >
>>
>
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Chandni Singh <ch...@datatorrent.com>.
Isha,

Thanks for moving this. When you move these files, please place then under
a package which reflects its functionality. I don't see the need for
package called schema.

Thanks,
Chandni

On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar <is...@datatorrent.com>
wrote:

> Hi,
>
>   For csv parser there is an additional dependency. So, I'll move only json
> and xml to new location.
>
> Thanks,
> Isha
>
> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <th...@datatorrent.com>
> wrote:
>
> > As long as the operators don't introduce additional dependencies they
> > should be in lib.
> >
> >
> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <shubham@datatorrent.com
> >
> > wrote:
> >
> > > Hi Chandni,
> > >
> > > I had written those operators.
> > > Here is the jira for that
> https://malhar.atlassian.net/browse/MLHR-1838
> > > You would find the entire discussion there.
> > >
> > > Why are all these operator under Malhar/contrib and not Malhar/lib
> > > When i was writing the code i saw AbstractCsvParser in contriib and
> hence
> > > added there.
> > >
> > > Recently i got to know which operators must go in contrib and what must
> > go
> > > in lib.
> > > By that definition, these operators must belong to lib.
> > >
> > > Thanks,
> > > Shubham
> > >
> > >
> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <
> chandni@datatorrent.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I just came across couple of formatter and parser operators which are
> > > under
> > > > Malhar/contrib/schema.
> > > >
> > > > I have couple of questions:
> > > > 1. What does schema denote here?
> > > > 2. Why formatter/parser which are functions are placed under schema
> > > > package?
> > > > 2. Why are all these operator under Malhar/contrib and not Malhar/lib
> > > >
> > > > Thanks,
> > > > Chandni
> > > >
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi,

  For csv parser there is an additional dependency. So, I'll move only json
and xml to new location.

Thanks,
Isha

On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise <th...@datatorrent.com>
wrote:

> As long as the operators don't introduce additional dependencies they
> should be in lib.
>
>
> On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <sh...@datatorrent.com>
> wrote:
>
> > Hi Chandni,
> >
> > I had written those operators.
> > Here is the jira for that  https://malhar.atlassian.net/browse/MLHR-1838
> > You would find the entire discussion there.
> >
> > Why are all these operator under Malhar/contrib and not Malhar/lib
> > When i was writing the code i saw AbstractCsvParser in contriib and hence
> > added there.
> >
> > Recently i got to know which operators must go in contrib and what must
> go
> > in lib.
> > By that definition, these operators must belong to lib.
> >
> > Thanks,
> > Shubham
> >
> >
> > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <ch...@datatorrent.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I just came across couple of formatter and parser operators which are
> > under
> > > Malhar/contrib/schema.
> > >
> > > I have couple of questions:
> > > 1. What does schema denote here?
> > > 2. Why formatter/parser which are functions are placed under schema
> > > package?
> > > 2. Why are all these operator under Malhar/contrib and not Malhar/lib
> > >
> > > Thanks,
> > > Chandni
> > >
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Thomas Weise <th...@datatorrent.com>.
As long as the operators don't introduce additional dependencies they
should be in lib.


On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <sh...@datatorrent.com>
wrote:

> Hi Chandni,
>
> I had written those operators.
> Here is the jira for that  https://malhar.atlassian.net/browse/MLHR-1838
> You would find the entire discussion there.
>
> Why are all these operator under Malhar/contrib and not Malhar/lib
> When i was writing the code i saw AbstractCsvParser in contriib and hence
> added there.
>
> Recently i got to know which operators must go in contrib and what must go
> in lib.
> By that definition, these operators must belong to lib.
>
> Thanks,
> Shubham
>
>
> On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <ch...@datatorrent.com>
> wrote:
>
> > Hi,
> >
> > I just came across couple of formatter and parser operators which are
> under
> > Malhar/contrib/schema.
> >
> > I have couple of questions:
> > 1. What does schema denote here?
> > 2. Why formatter/parser which are functions are placed under schema
> > package?
> > 2. Why are all these operator under Malhar/contrib and not Malhar/lib
> >
> > Thanks,
> > Chandni
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Isha Arkatkar <is...@datatorrent.com>.
Hi,

   I will move these operators to Malhar-lib with pull request for XML
parser change to validate based on xsd.

Thanks,
Isha

On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak <sh...@datatorrent.com>
wrote:

> Hi Chandni,
>
> I had written those operators.
> Here is the jira for that  https://malhar.atlassian.net/browse/MLHR-1838
> You would find the entire discussion there.
>
> Why are all these operator under Malhar/contrib and not Malhar/lib
> When i was writing the code i saw AbstractCsvParser in contriib and hence
> added there.
>
> Recently i got to know which operators must go in contrib and what must go
> in lib.
> By that definition, these operators must belong to lib.
>
> Thanks,
> Shubham
>
>
> On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <ch...@datatorrent.com>
> wrote:
>
> > Hi,
> >
> > I just came across couple of formatter and parser operators which are
> under
> > Malhar/contrib/schema.
> >
> > I have couple of questions:
> > 1. What does schema denote here?
> > 2. Why formatter/parser which are functions are placed under schema
> > package?
> > 2. Why are all these operator under Malhar/contrib and not Malhar/lib
> >
> > Thanks,
> > Chandni
> >
>

Re: Why are parser and formatter operator hidden in Malhar/contrib/schema

Posted by Shubham Pathak <sh...@datatorrent.com>.
Hi Chandni,

I had written those operators.
Here is the jira for that  https://malhar.atlassian.net/browse/MLHR-1838
You would find the entire discussion there.

Why are all these operator under Malhar/contrib and not Malhar/lib
When i was writing the code i saw AbstractCsvParser in contriib and hence
added there.

Recently i got to know which operators must go in contrib and what must go
in lib.
By that definition, these operators must belong to lib.

Thanks,
Shubham


On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh <ch...@datatorrent.com>
wrote:

> Hi,
>
> I just came across couple of formatter and parser operators which are under
> Malhar/contrib/schema.
>
> I have couple of questions:
> 1. What does schema denote here?
> 2. Why formatter/parser which are functions are placed under schema
> package?
> 2. Why are all these operator under Malhar/contrib and not Malhar/lib
>
> Thanks,
> Chandni
>