Posted to dev@diversity.apache.org by Matt Sicker <bo...@gmail.com> on 2020/11/03 14:28:28 UTC

Fwd: Thoughts on study about the impact of the Apache ecosystem on community diversity of open source projects

Also relevant here. Main thread is in community.a.o

---------- Forwarded message ---------
From: sharanf <sh...@apache.org>
Date: Mon, Nov 2, 2020 at 14:19
Subject: Re: Thoughts on study about the impact of the Apache ecosystem on
community diversity of open source projects
To: Isabella Ferreira <is...@polymtl.ca>
CC: <de...@community.apache.org>, <jm...@apache.org>, Griselda Cuevas <
gris@apache.org>


Hi Isabella

Thanks very much for reaching out to me.  It's great to hear that you
are interested in the Apache Software Foundation (ASF) and doing some
research on the potential lifecycle of incubating projects. I think
that the topics that you are looking to gather information on cover a
few areas within the ASF. While Community Development is a general
umbrella covering all Apache communities, we do have specific areas that
focus on D&I and on the Apache Incubator itself.

Our Apache Incubator community oversees the whole Apache incubation
process, while the D&I community has been instrumental in performing the
latest survey of diversity within the Apache communities and may be able
to give you a better indication of what diversity information we have
and can share.

In the meantime I will try to respond inline to your various points below.

A key tool we use to gather statistics and metrics for all Apache
projects is another Apache project called Apache Kibble. It collects
contribution and statistics information on incubating projects too, so
it may be worth taking a look at.


On 2020-11-02 16:10, Isabella Ferreira wrote:
> Dear Sharan Foga,
>
> My name is Isabella Ferreira and I was in your presentation at
> CHAOSScon Europe 2020. Based on your presentation, I saw that you are
> involved in several initiatives to encourage diversity within the ASF.
>
> We, a group of Canadian and Dutch software engineering researchers,
> are interested in understanding why some projects joining Apache
> incubator grow and succeed, and others fail. Based on this study, our
> eventual goal is to formulate recommendations for projects considering
> to join Apache in terms of expectations and best practices. We aim to
> share our findings with the Apache community as well as software
> practitioners and researchers.
>
> So far we have manually classified the incubator proposals of 292
> projects to understand their motivation. We have found that these are
> the top-5 reasons for joining the Apache incubator:
>
> 1. Community building
> 2. Community diversity
> 3. Follow an established development process (such as the "Apache Way")
> 4. Increase user base
> 5. Expected collaboration with other projects
>
>
> As the next step, we would like to evaluate to what extent joining the
> Apache ecosystem has enabled projects to achieve their goals. In
> particular, we are interested in questions like:
>
>  * Did the number of organizations contributing to Apache projects
>    increase compared to before joining the Apache incubator?
>

We have been using Apache Kibble to generate statistics for all our
projects. We don't currently track organisational affiliation
properly, but there have been discussions about ways to improve and
include it.

For projects coming into the Apache Incubator, I believe some
organisational affiliation is captured initially to ensure diversity of
project affiliation and the lack of dependency on one specific company.
As you mention, sometimes a project enters incubation to grow its
community, as it needs to diversify to survive.

>  * Did the geographical spread of contributions to Apache projects
>    increase compared to before joining the Apache incubator?
>

I don't think Apache Kibble captures the geographical location of
contributions, but it does capture the time and date of each
contribution, if that is any help.

>  * Did the gender diversity of contributions to Apache projects
>    increase compared to before joining the Apache incubator?
>

We do have the contributor id but Apache Kibble doesn't specifically
capture or report on this information. Perhaps our D&I community may be
able to help you here with some relevant details from the last Apache
Diversity survey.

> While the GitHub and Subversion repositories of Apache projects
> provide information about the kind of contributions made (size,
> complexity, etc.), the information needed to address the above
> questions is not as readily available.
>
> Hence, as the current VP of the Apache Community Development, we would
> like to have your thoughts on what would be the best way to obtain
> access to the above diversity data, without breaching any
> confidentiality concerns:
>
>  * Is there a means to get access to Apache patch submitters’
>    contributor agreements, for research purposes? If so, what is the
>    process for this (e.g., NDAs to sign)?
>

The ASF site publicly lists the people and companies that
have signed an Individual or Corporate Contributor Licence Agreement
(ICLA or CCLA). If you are asking for access to the actual signed
documents, then no - this is not possible.

>  * Alternatively, is there a way for us to provide R or Python
>    analysis scripts that someone with data access could run on our
>    behalf, as such only exposing aggregate data to us?
>  * Another alternative would be to perform a series of interviews
>    and/or a survey amongst Apache contributors, although the success
>    would heavily rely on a large participation rate.
>

If your focus is on the Incubator, then by reaching out to them you may
be able to gather enough survey participants. What sort of participation
levels do you need to reach?

>
> What are your thoughts on these points? Of course, we would be
> interested in organizing a virtual call to clarify our research
> objectives and/or questions.

I think that you have asked some interesting questions, but I am not
sure that we have all the information available. Some of the
information you have asked for, we cannot give you. Perhaps it would be
good to continue this discussion on our mailing list to explore a bit
more what public data we have that could help with your research.

I have copied our VP Apache Incubator, Justin McLean, and our VP Apache
Diversity & Inclusion, Gris Cuevas, who may also be able to respond with
their comments or any additional details that could help you.

Thanks
Sharan


> Kind regards,
>
> Isabella Ferreira, Polytechnique Montréal, Canada
> Bram Adams, Queen’s University, Canada
> Alexander Serebrenik, Eindhoven University of Technology, The Netherlands
> Nan Yang, Eindhoven University of Technology, The Netherlands
>