Posted to dev@predictionio.apache.org by Kenneth Chan <ke...@apache.org> on 2017/02/17 06:05:14 UTC

Re: consolidate repos into 1

Sorry if I'm too late to this discussion.

Actually, I have also thought about whether it makes sense to merge the
official templates into one repo too.

The reasons are:
- Think of the templates as a library or bundle of PIO. Every time we
release a new version of PIO, we also need to make sure these templates
work or are upgraded together, just like Spark and Spark MLlib.
- Easier to maintain as well?


As for the concern about the large size of the text classification
template: it is big because at one point a bunch of binary jars were
checked in. I think we should recreate that repo, clean it up, and include
instructions on how to get those jars.


On Thu, Nov 3, 2016 at 3:34 PM Pat Ferrel <pa...@occamsmachete.com> wrote:

> I wouldn’t favor merging for Tom’s point and others:
> So far from the template I maintain, there have been 2 PIO releases and
> soon to be 7 template releases. The point being that active templates will
> have their own revision schedule. You have only to look at the history of
> the templates to see that they are released independent of PIO releases.
> ASF tools make it hard, not the project needs.
> These were all separate repos in PIO days because they made sense as
> separate and because GitHub makes it easy. Now with ASF hosted git there is
> more pain but still the same project needs. Let’s not confuse pain with
> need. Let’s remove the pain points. We already have self-service repo
> creation from pushing on the pain points, a big step forward from the days
> when it took an infra-ticket to get a repo.
> If `git pull template-url` is the basis of getting a template, merging
> repos will break this and make contributed templates different than
> external ones to the confusion of users.
> As Tom noted, it will also bloat the project when we’d like to see it more
> modular. For instance an Admin server microservice may also end up in a
> separate repo so it can be released at different intervals.
> The standard IMO is not Apache, which is a venerable institution (trying
> to remove friction points), it is outside-apache OSS which most assuredly
> is more modular. Pip, npm, gems, apt-get, ...
>
> Growth leads to bloat or efforts to decouple and refactor. I’d actually
> like to see PIO split up along microservices refactoring lines but all in
> time. A move to bundle together seems the wrong direction.
>
> Another problem is the difficulty of binary releases in ASF as we all
> witnessed (especially hard for incubating projects). Think about the fact
> that currently templates do not need to be released in any sense. Wow, that
> is very cool, speaking from the ASF red-tape avoidance part of me.
>
>
> On Nov 3, 2016, at 2:41 PM, Tom Chan <yu...@gmail.com> wrote:
>
> This is mostly a good idea but then one of the templates is 3 times the
> size of incubator-predictionio:
>
> $ du -d 1  -h
> 53M ./incubator-predictionio
> 1.3M ./incubator-predictionio-sdk-java
> 288K ./incubator-predictionio-sdk-php
> 536K ./incubator-predictionio-sdk-python
> 264K ./incubator-predictionio-sdk-ruby
> 236K ./incubator-predictionio-template-attribute-based-classifier
> 220K ./incubator-predictionio-template-ecom-recommender
> 264K ./incubator-predictionio-template-java-ecom-recommender
> 184K ./incubator-predictionio-template-recommender
> 196K ./incubator-predictionio-template-similar-product
> 440K ./incubator-predictionio-template-skeleton
> 160M ./incubator-predictionio-template-text-classifier
>
> This 160M will be downloaded by all users regardless of whether they use it
> or not, if we choose to consolidate them all into one repo.
>
> Tom
>
> On Thu, Nov 3, 2016 at 2:16 PM, Simon Chan <si...@salesforce.com> wrote:
>
> > Hi guys,
> >
> > I'm actually thinking we should consolidate all core templates / SDKs
> > repos that are donated to Apache (i.e.
> > https://github.com/search?q=org%3Aapache+PredictionIO) into one main repo
> > (https://github.com/apache/incubator-predictionio)
> >
> > The benefit may be that:
> > 1. We can track Apache PredictionIO project activity in a unified place;
> > 2. Making these templates part of the main repo encourages contributors to
> > make sure they are all compatible with the latest version of PredictionIO
> > core;
> > 3. I don't see other projects (e.g. Mahout and its libraries) hosting core
> > and components separately.
> >
> > Thoughts?
> >
> > Simon
> >
>
>