Posted to dev@manifoldcf.apache.org by Karl Wright <da...@gmail.com> on 2011/09/19 21:06:37 UTC

1.0 release, and graduation

Folks,
I'd like to begin discussion about the next release, currently labeled
0.4, and also our potential for graduation from the incubator.  What
I'd like is a sense of:

(a) what we are still missing as far as incubator graduation is concerned, and
(b) what a 1.0 release might look like to everyone

Please try to be as concrete as possible.  My own personal goal is to
see this happen by the end of the year, more or less.  To that end
I've already begun triaging JIRA tickets for the 0.4 release that I
think would be appropriate for a 1.0 release.

It's entirely possible that some things that people feel strongly
about may not be doable in that time frame, but so be it.  This may
also be true of our status as a project.

Thanks,
Karl

Re: 1.0 release, and graduation

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Sep 27, 2011 at 3:05 PM, Karl Wright <da...@gmail.com> wrote:
> The nature of this project is unique in that only a few committers are
> interested in any given connector.

This is very similar to the concern we used to have with Tika, that
most people are only really interested in a narrow subset (i.e. a
specific file format) of functionality. This in fact did turn out to
be true for most contributors and committers, but the fact that they
still need to cooperate on the core of shared code was enough to form
a working community.

> An approach I've been trying to use is to have at least one committer
> per connector. [...] So, as I've discussed before, the criteria for becoming
> a ManifoldCF committer have to be more nuanced and must take domain
> knowledge into account, if we are to have anything like a committer base
> that covers all the code.

Agreed. There's no need for everyone to know every bit of the entire
codebase. In fact there's probably no reasonably sized Apache project
where any single committer is intimately familiar with all bits of the
project.

So from my perspective anyone who knows ManifoldCF well enough to
implement a new working connector or to make substantial patches to an
existing one is well above the entry barrier for committership.

Let's get enough of such people actively involved, and we're ready to graduate!

> I guess what I'm arguing for is a somewhat different set of graduation
> criteria that is more suited to ManifoldCF's unique situation.

As mentioned above, I don't see this situation as too unique within Apache.

Growing beyond just a single active committer can be pretty
time-consuming and frustrating, but the benefits are worth it. I'm
sorry that the Incubator collectively and I personally haven't so far
been too helpful in making this process easier. Hopefully we'll be
able to help make the ride better from now on.

BR,

Jukka Zitting

Re: 1.0 release, and graduation

Posted by Karl Wright <da...@gmail.com>.
Hi Jukka,

>>>>>>
Graduating
into a Lucene subproject is probably out of the question given the
structural changes in Lucene, so for now my recommendation would be to
remain in the Incubator until the community balance gets better.
...
To put things in perspective, since the beginning of this year Karl
has made over 96% of all ManifoldCF commits. This makes the bus factor
[1] of the project pretty high, and suggests that a more diverse
development community is needed.
<<<<<<

The nature of this project is unique in that only a few committers are
interested in any given connector.  Committer interest in proprietary
connectors is even more limited; people who are using these connectors
do not in general become open-source contributors or committers.  I
therefore don't think we're going to be able to avoid having a high
"bus factor" for this project.  Thus, if a low bus factor is the main
criterion, it seems pretty unlikely to me that ManifoldCF will ever
graduate from the incubator.

An approach I've been trying to use is to have at least one committer
per connector.  This requires some flexibility on the matter of
considering what makes a good committer, since for ManifoldCF domain
knowledge is perhaps the most important consideration.  I am hoping
for some consideration by our mentors in this regard, since it is
clear to me that we should not expect a contributor that knows about
CMIS to also be an expert in Active Directory (for example).  So, as
I've discussed before, the criteria for becoming a ManifoldCF
committer have to be more nuanced and must take domain knowledge into
account, if we are to have anything like a committer base that covers
all the code.

I wouldn't mind not graduating, except for two issues.  The first issue
is that being in the incubator limits adoption of the software.  It
convinces enterprises that the software is not ready for prime time,
which I believe is a false impression, but there it is.  It also
causes Manning (the book publisher) to withhold the production
release of the ManifoldCF in Action book.  So it becomes a bit
of a chicken-and-egg problem.  The second issue is that releases and
decision-making processes are overly cumbersome when the incubator is
involved.  This is unfortunate since the goal of being in the
incubator, as I understand it, is simply to ensure that the project
has sufficient familiarity with Apache standards and procedures to not
mess up too badly.

I guess what I'm arguing for is a somewhat different set of graduation
criteria that is more suited to ManifoldCF's unique situation.

Karl

On Wed, Sep 21, 2011 at 5:41 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Mon, Sep 19, 2011 at 9:06 PM, Karl Wright <da...@gmail.com> wrote:
>> (a) what we are still missing as far as incubator graduation is concerned
>
> There's still quite a bit to be done for community diversity. The
> drive to get new committers in is definitely a step in the right
> direction, but we'll need to follow up on that to keep at least
> some of the new people as active members of the community. This is an
> area where mentors should be able to help (I'll try to increase my
> involvement here).
>
> To put things in perspective, since the beginning of this year Karl
> has made over 96% of all ManifoldCF commits. This makes the bus factor
> [1] of the project pretty high, and suggests that a more diverse
> development community is needed. The solution is not to have Karl
> commit less, but to get other people to more actively join the fun.
>
> The situation here is roughly similar to what we experienced during
> the incubation of Apache Tika. In the last year before graduation
> (2008) I was responsible for about 87% of all commits, which raised
> similar concerns about diversity [2]. The solution then was to
> graduate into a Lucene subproject instead of a full TLP, so that the
> larger project could still provide oversight and continuity in case
> things went wrong.
>
> Since then Lucene has shed most subprojects to avoid being too
> large to manage, and by the time Tika in 2010 became a TLP by itself
> my share of all commits had shrunk to a still high but much more
> reasonable 62%. Today I'm still the most active committer, but my
> share of all the activity is down to 44%.
>
> I'd like to see ManifoldCF follow a similar trajectory. Graduating
> into a Lucene subproject is probably out of the question given the
> structural changes in Lucene, so for now my recommendation would be to
> remain in the Incubator until the community balance gets better.
>
> Some of the key things I did in Tika to help reduce my central role
> there were to lower the barriers of entry by working on things like
> the Getting Started page [3] and adding tools like the runnable
> tika-app jar and the simple GUI interface that make it trivially easy
> for someone to get started using Tika.
>
> The Build and Deploy guide in ManifoldCF [4] and the start.jar
> mechanism are good steps in this direction, but I think we could
> streamline quite a few of those steps. As Tommaso and others already
> mentioned, things like a simpler build process and a nicer UI can be
> quite useful. These are things that don't usually mean much to people
> already familiar with the system, but for potential new users and
> contributors with a short attention span they matter a lot. Thus I
> think these are areas that we should try to focus on in the near future.
>
> [1] http://en.wikipedia.org/wiki/Bus_factor
> [2] http://markmail.org/message/bvqs2zv762fmlyv5
> [3] http://tika.apache.org/0.9/gettingstarted.html
> [4] http://incubator.apache.org/connectors/how-to-build-and-deploy.html
>
> BR,
>
> Jukka Zitting
>

Re: 1.0 release, and graduation

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Mon, Sep 19, 2011 at 9:06 PM, Karl Wright <da...@gmail.com> wrote:
> (a) what we are still missing as far as incubator graduation is concerned

There's still quite a bit to be done for community diversity. The
drive to get new committers in is definitely a step in the right
direction, but we'll need to follow up on that to keep at least
some of the new people as active members of the community. This is an
area where mentors should be able to help (I'll try to increase my
involvement here).

To put things in perspective, since the beginning of this year Karl
has made over 96% of all ManifoldCF commits. This makes the bus factor
[1] of the project pretty high, and suggests that a more diverse
development community is needed. The solution is not to have Karl
commit less, but to get other people to more actively join the fun.

The situation here is roughly similar to what we experienced during
the incubation of Apache Tika. In the last year before graduation
(2008) I was responsible for about 87% of all commits, which raised
similar concerns about diversity [2]. The solution then was to
graduate into a Lucene subproject instead of a full TLP, so that the
larger project could still provide oversight and continuity in case
things went wrong.

Since then Lucene has shed most subprojects to avoid being too
large to manage, and by the time Tika in 2010 became a TLP by itself
my share of all commits had shrunk to a still high but much more
reasonable 62%. Today I'm still the most active committer, but my
share of all the activity is down to 44%.

I'd like to see ManifoldCF follow a similar trajectory. Graduating
into a Lucene subproject is probably out of the question given the
structural changes in Lucene, so for now my recommendation would be to
remain in the Incubator until the community balance gets better.

Some of the key things I did in Tika to help reduce my central role
there were to lower the barriers of entry by working on things like
the Getting Started page [3] and adding tools like the runnable
tika-app jar and the simple GUI interface that make it trivially easy
for someone to get started using Tika.

The Build and Deploy guide in ManifoldCF [4] and the start.jar
mechanism are good steps in this direction, but I think we could
streamline quite a few of those steps. As Tommaso and others already
mentioned, things like a simpler build process and a nicer UI can be
quite useful. These are things that don't usually mean much to people
already familiar with the system, but for potential new users and
contributors with a short attention span they matter a lot. Thus I
think these are areas that we should try to focus on in the near future.

[1] http://en.wikipedia.org/wiki/Bus_factor
[2] http://markmail.org/message/bvqs2zv762fmlyv5
[3] http://tika.apache.org/0.9/gettingstarted.html
[4] http://incubator.apache.org/connectors/how-to-build-and-deploy.html

BR,

Jukka Zitting

AW: 1.0 release, and graduation

Posted by "Wunderlich, Tobias" <to...@igd-r.fraunhofer.de>.
> (1) If MySQL support is something important then I will triage the appropriate ticket accordingly.  I'm curious where this requirement comes from, though.

For my part, I'd really like MySQL support. At the moment I have just modified the JDBC connector for my purposes, but an out-of-the-box connector for MySQL databases would be much appreciated.

Tobias

Re: 1.0 release, and graduation

Posted by Alex Ott <al...@gmail.com>.
Hello

We can use the attached script to simplify setting up the missing
Maven dependencies that are fetched using the 'ant
download-dependencies' command. After running this script I was able
to build everything using Maven.

One more comment, on tests: maybe it would be better to move
long-running tests, such as the filesystem tests, into the
integration-test build phase
(https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html)?
The Maven Failsafe plugin
(https://maven.apache.org/plugins/maven-failsafe-plugin/usage.html)
provides useful functionality to implement this. I think this could
make developers' lives easier.

On Wed, Sep 21, 2011 at 9:52 AM, Alex Ott <al...@gmail.com> wrote:
> Hello
>
> On Wed, Sep 21, 2011 at 9:43 AM, Tommaso Teofili
> <to...@gmail.com> wrote:
>>> (2) I'm not sure it is reasonable to go with one build system over
>>> another.  Lucene/Solr has been having this battle for years.  I
>>> thought we might just learn from that experience and try to be more
>>> flexible.  There are excellent technical reasons to have each - for
>>> instance, building Debian packages works much better with Ant, and
>>> Eclipse works much better with Maven.
>>>
>>
>> I don't like "wars" as well about which one *is* better, maybe it's just a
>> matter of use case and personal preferences.
>> However I like what you say about trying to be more flexible so I am +1 to
>> your position.
>> I have some experience with Maven so let me know if you need my help in
>> tweaking that part of the build.
>
> I also have some experience with Maven, and I'm interested in MCF's
> development, so I can try to fix some problems. The first thing that
> should be achieved is building with Maven out of the box, without
> manual work (or at least with a minimum of it).
>
> --
> With best wishes,                    Alex Ott
> http://alexott.net/
> Twitter: alexott_en (English), alexott (Russian)
> Skype: alex.ott
>



-- 
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott

Re: 1.0 release, and graduation

Posted by Alex Ott <al...@gmail.com>.
Hello

On Wed, Sep 21, 2011 at 9:43 AM, Tommaso Teofili
<to...@gmail.com> wrote:
>> (2) I'm not sure it is reasonable to go with one build system over
>> another.  Lucene/Solr has been having this battle for years.  I
>> thought we might just learn from that experience and try to be more
>> flexible.  There are excellent technical reasons to have each - for
>> instance, building Debian packages works much better with Ant, and
>> Eclipse works much better with Maven.
>>
>
> I don't like "wars" as well about which one *is* better, maybe it's just a
> matter of use case and personal preferences.
> However I like what you say about trying to be more flexible so I am +1 to
> your position.
> I have some experience with Maven so let me know if you need my help in
> tweaking that part of the build.

I also have some experience with Maven, and I'm interested in MCF's
development, so I can try to fix some problems. The first thing that
should be achieved is building with Maven out of the box, without
manual work (or at least with a minimum of it).

-- 
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)
Skype: alex.ott

Re: 1.0 release, and graduation

Posted by Tommaso Teofili <to...@gmail.com>.
Hello Karl

2011/9/21 Karl Wright <da...@gmail.com>

> Hi Tommaso,
>
> Thanks for your feedback!
>
> Are you in a position to update the status page?  I don't think I can.
>

I should be able to do it, will try and see if I can get that updated
finally :)


>  If you disagree please point me in the right direction..
>
>
> As far as the technical aspects:
> (1) If MySQL support is something important then I will triage the
> appropriate ticket accordingly.  I'm curious where this requirement
> comes from, though.
>

This requirement just comes from my limited experience. Since I started
mentoring here I have tried to see if/where ManifoldCF would fit a
particular business case (almost always managing communication between
some source and Solr), and almost always I was asked if it could be used
with MySQL instead of PostgreSQL. Personally I always prefer the latter
because it's just faster (at least where I've seen meaningful comparisons),
but MySQL often gets chosen as a DBMS, maybe just because it's more
popular.
So this comes from people asking me about the possibility of using
ManifoldCF with MySQL instead of PostgreSQL.
However, this is not a 'blocker'; I just think that it would give the
project much more architectural flexibility (and, thus, adoption).


> (2) I'm not sure it is reasonable to go with one build system over
> another.  Lucene/Solr has been having this battle for years.  I
> thought we might just learn from that experience and try to be more
> flexible.  There are excellent technical reasons to have each - for
> instance, building Debian packages works much better with Ant, and
> Eclipse works much better with Maven.
>

I don't like "wars" about which one *is* better either; maybe it's just a
matter of use case and personal preference.
However I like what you say about trying to be more flexible so I am +1 to
your position.
I have some experience with Maven so let me know if you need my help in
tweaking that part of the build.


> (3) I would love to have the UI look cooler.  Stylistic work on the UI
> is definitely not the right job for me though.  Now, Piergiorgio
> mentioned going to a spring-based UI but I'm not sure that will fix
> anything stylistically, and it might well require redefinition of the
> connector interfaces, which would be a bad thing at this point, so I
> don't see much benefit to this architectural change proposal.  Is this
> what you were thinking of, or were you more thinking of look and feel?
>

My concern at the moment with this point is more about the look and
feel than about the appropriate (MVC) framework, since I am not sure I'd
want to introduce another framework (I think the l&f can be made better
than the current one just using servlets and some JS).

However, if changing the framework would speed up the process of enhancing
the UI style I'm OK with that too; on the other hand, if that would affect
the connector interfaces it may be too much work to do right now.
I'd love to hear other opinions on this and other topics as I'm sure MySQL
and UI aren't the only open points the community would like to get sorted.

Have a nice day you all.
Cheers,
Tommaso


>
> Karl
>
>
> On Tue, Sep 20, 2011 at 5:51 PM, Tommaso Teofili
> <to...@gmail.com> wrote:
> > Hello Karl, all
> >
> >
> > 2011/9/19 Karl Wright <da...@gmail.com>
> >
> >> Folks,
> >> I'd like to begin discussion about the next release, currently labeled
> >> 0.4, and also our potential for graduation from the incubator.  What
> >> I'd like is a sense of:
> >>
> >> (a) what we are still missing as far as incubator graduation is
> concerned,
> >
> >
> > at the moment I think we're in the right direction towards the graduation
> > (see the clutch report [1] which however doesn't count the new
> > committers/mentors as our page on Incubator website has to be updated).
> >
> >
> >> and
> >> (b) what a 1.0 release might look like to everyone
> >>
> >
> > my point here relates also to the graduation: I think we're building a
> nice
> > community and the ManifoldCF code is being bettered every day but it
> seems
> > to me there are some (few) parts which need some refactoring as they
> can't
> > work as they are now (i.e.: support to MySQL DBs), also in my opinion we
> > should choose one of Maven or Ant and drop the other building system as I
> > fear this could confuse new users/devs.
> > A minor thing in my opinion is that a restyle of the UI would make
> > ManifoldCF some more "nice" to use, however this can be pretty personal
> and
> > less important than functional requirements.
> >
> >
> >
> >>
> >> Please try to be as concrete as possible.  My own personal goal is to
> >> see this happen by the end of the year, more or less
> >
> >
> > +1
> >
> >
> >>  To that end
> >> I've already begun triaging JIRA tickets for the 0.4 release that I
> >> think would be appropriate for a 1.0 release.
> >>
> >> It's entirely possible that some things that people feel strongly
> >> about may not be doable in that time frame, but so be it.  This may
> >> also be true of our status as a project.
> >>
> >>
> > What do others think?
> > All the best.
> > Tommaso
> >
> > [1] : http://incubator.apache.org/clutch.html
> > [2] : http://incubator.apache.org/projects/manifoldcf.html
> >
> >
> >> Thanks,
> >> Karl
> >>
> >
>

Re: 1.0 release, and graduation

Posted by Karl Wright <da...@gmail.com>.
Hi Tommaso,

Thanks for your feedback!

Are you in a position to update the status page?  I don't think I can.
If you disagree please point me in the right direction.


As far as the technical aspects:
(1) If MySQL support is something important then I will triage the
appropriate ticket accordingly.  I'm curious where this requirement
comes from, though.
(2) I'm not sure it is reasonable to go with one build system over
another.  Lucene/Solr has been having this battle for years.  I
thought we might just learn from that experience and try to be more
flexible.  There are excellent technical reasons to have each - for
instance, building Debian packages works much better with Ant, and
Eclipse works much better with Maven.
(3) I would love to have the UI look cooler.  Stylistic work on the UI
is definitely not the right job for me though.  Now, Piergiorgio
mentioned going to a Spring-based UI but I'm not sure that will fix
anything stylistically, and it might well require redefinition of the
connector interfaces, which would be a bad thing at this point, so I
don't see much benefit to this architectural change proposal.  Is this
what you were thinking of, or were you more thinking of look and feel?

Karl


On Tue, Sep 20, 2011 at 5:51 PM, Tommaso Teofili
<to...@gmail.com> wrote:
> Hello Karl, all
>
>
> 2011/9/19 Karl Wright <da...@gmail.com>
>
>> Folks,
>> I'd like to begin discussion about the next release, currently labeled
>> 0.4, and also our potential for graduation from the incubator.  What
>> I'd like is a sense of:
>>
>> (a) what we are still missing as far as incubator graduation is concerned,
>
>
> at the moment I think we're in the right direction towards the graduation
> (see the clutch report [1] which however doesn't count the new
> committers/mentors as our page on Incubator website has to be updated).
>
>
>> and
>> (b) what a 1.0 release might look like to everyone
>>
>
> my point here relates also to the graduation: I think we're building a nice
> community and the ManifoldCF code is being bettered every day but it seems
> to me there are some (few) parts which need some refactoring as they can't
> work as they are now (i.e.: support to MySQL DBs), also in my opinion we
> should choose one of Maven or Ant and drop the other building system as I
> fear this could confuse new users/devs.
> A minor thing in my opinion is that a restyle of the UI would make
> ManifoldCF some more "nice" to use, however this can be pretty personal and
> less important than functional requirements.
>
>
>
>>
>> Please try to be as concrete as possible.  My own personal goal is to
>> see this happen by the end of the year, more or less
>
>
> +1
>
>
>>  To that end
>> I've already begun triaging JIRA tickets for the 0.4 release that I
>> think would be appropriate for a 1.0 release.
>>
>> It's entirely possible that some things that people feel strongly
>> about may not be doable in that time frame, but so be it.  This may
>> also be true of our status as a project.
>>
>>
> What do others think?
> All the best.
> Tommaso
>
> [1] : http://incubator.apache.org/clutch.html
> [2] : http://incubator.apache.org/projects/manifoldcf.html
>
>
>> Thanks,
>> Karl
>>
>

Re: 1.0 release, and graduation

Posted by Tommaso Teofili <to...@gmail.com>.
Hello Karl, all


2011/9/19 Karl Wright <da...@gmail.com>

> Folks,
> I'd like to begin discussion about the next release, currently labeled
> 0.4, and also our potential for graduation from the incubator.  What
> I'd like is a sense of:
>
> (a) what we are still missing as far as incubator graduation is concerned,


At the moment I think we're heading in the right direction towards
graduation (see the clutch report [1], which however doesn't count the new
committers/mentors, as our page on the Incubator website [2] has to be
updated).


> and
> (b) what a 1.0 release might look like to everyone
>

My point here also relates to graduation: I think we're building a nice
community and the ManifoldCF code is getting better every day, but it seems
to me there are a few parts which need some refactoring as they can't
work as they are now (i.e. support for MySQL DBs). Also, in my opinion we
should choose one of Maven or Ant and drop the other build system, as I
fear having both could confuse new users/devs.
A minor thing, in my opinion, is that a restyle of the UI would make
ManifoldCF somewhat nicer to use; however, this can be pretty personal and
is less important than functional requirements.



>
> Please try to be as concrete as possible.  My own personal goal is to
> see this happen by the end of the year, more or less


+1


>  To that end
> I've already begun triaging JIRA tickets for the 0.4 release that I
> think would be appropriate for a 1.0 release.
>
> It's entirely possible that some things that people feel strongly
> about may not be doable in that time frame, but so be it.  This may
> also be true of our status as a project.
>
>
What do others think?
All the best.
Tommaso

[1] : http://incubator.apache.org/clutch.html
[2] : http://incubator.apache.org/projects/manifoldcf.html


> Thanks,
> Karl
>