Posted to dev@httpd.apache.org by "Ralf S. Engelschall" <rs...@engelschall.com> on 1998/08/03 12:53:52 UTC

[STATISTIC] Apache Development

Last week two friends asked me how much I've contributed to Apache. I answered
"A lot over the last months". They responded: "Yeah, we know, but how much
total contribution?". Oops, I had no answer and said "Don't know any numbers".
Since we are real hackers, today I spent an hour writing a little Perl
script (appended below) which gathers some statistical information out of the
CHANGES file (give credit to Marc, who always complains if we don't follow its
syntax) and reports it in a nice table format. The result is appended below.

I've also thought about calculating the number of source lines which
were changed between each release. This is possible by parsing the "cvs log
-r<rel> -r<next_rel>" output, but for today it's enough, I think. The
statistic is interesting enough...

PS: Should we post it on a regular basis, say once per month?

Greetings,
                                       Ralf S. Engelschall
                                       rse@engelschall.com
                                       www.engelschall.com

 ________________________________________________________________________

 APACHE DEVELOPMENT STATISTIC
 Calculated from CHANGES file as of 08-Aug-1998
 ________________________________________________________________________

 RELEASE STATISTIC

 How much effort was required for each Apache release and
 which developers have provided the major contributions.

 Release Changes Major Contributor         Second Major Contributor
 ------- ------- ------------------------- ------------------------------
 1.3.2        29 Ralf S. Engelschall (17)  Roy T. Fielding (2)      
 1.3.1        74 Ralf S. Engelschall (22)  Dean Gaudet (9)          
 1.3.0        20 Ralf S. Engelschall (8)   Martin Kraemer (2)       
 1.3b7        84 Ralf S. Engelschall (34)  Ben Laurie (7)           
 1.3b6       121 Ralf S. Engelschall (35)  Dean Gaudet (33)         
 1.3b5         3 Dean Gaudet (2)           Ben Laurie (1)           
 1.3b4       103 Dean Gaudet (34)          Marc Slemko (10)         
 1.3b3        55 Dean Gaudet (10)          Martin Kraemer (6)       
 1.3b2        99 Dean Gaudet (42)          Ken Coar (12)            
 1.3a1        50 Dean Gaudet (12)          Ken Coar (6)             
 1.2.6        22 Dean Gaudet (7)           Roy T. Fielding (2)      
 1.2.5        17 Marc Slemko (8)           Dean Gaudet (4)          
 1.2.4         2 Marc Slemko (2)           Roy T. Fielding (1)      
 1.2.3         4 Roy T. Fielding (1)       Lars Eilebrecht (1)      
 1.2.2        18 Dean Gaudet (8)           Roy T. Fielding (3)      
 1.2.1        27 Dean Gaudet (10)          Marc Slemko (8)          
 1.2b11       23 Roy T. Fielding (9)       Dean Gaudet (5)          
 1.2b10        5 Dean Gaudet (3)           Marc Slemko (2)          
 1.2b9        32 Roy T. Fielding (8)       Dean Gaudet (8)          
 1.2b8        47 Dean Gaudet (15)          Roy T. Fielding (10)     
 1.2b7        38 Dean Gaudet (8)           Marc Slemko (7)          
 1.2b6         2 Marc Slemko (1)           -unknown-                
 1.2b5        36 Jim Jagielski (5)         Alexei Kosut (4)         
 1.2b4         8 Jim Jagielski (1)         Marc Slemko (1)          
 1.2b3        21 Randy Terbush (3)         Alexei Kosut (2)         
 1.2b2        18 Ben Laurie (4)            Randy Terbush (3)        
 1.2b1         1 -unknown-                 -unknown-                
 1.1.1         5 Mark J. Cox (1)           Alexei Kosut (1)         
 1.1.0         7 Ben Laurie (4)            Alexei Kosut (3)         
 1.1b4         9 Robert S. Thau (2)        -unknown-                
 1.1b3        14 Ben Laurie (3)            Paul Sutton (1)          
 1.1b2         1 -unknown-                 -unknown-                
 1.1b1         1 -unknown-                 -unknown-                
 1.0.3        13 Rob Hartill (3)           David Robinson (2)       
 1.0.2         7 David Robinson (3)        Cliff Skolnick (1)       
 1.0.1         5 David Robinson (2)        Ben Laurie (2)           
 1.0.0         1 -unknown-                 -unknown-                
 0.8.16       12 David Robinson (8)        Robert S. Thau (1)       
 0.8.15       22 David Robinson (9)        Ben Laurie (5)           
 0.8.14        6 Andrew Wilson (2)         Ben Laurie (2)           
 0.8.13       11 Randy Terbush (2)         Aram W. Mirzadeh (1)     
 0.8.12       12 Robert S. Thau (5)        David Robinson (4)       
 0.8.11       12 Robert S. Thau (5)        Roy T. Fielding (3)      
 0.8.10        2 David Robinson (2)        -unknown-                
 0.8.9        20 Jim Jagielski (3)         David Robinson (3)       
 0.8.8         2 -unknown-                 -unknown-                
 0.8.7         3 David Robinson (2)        -unknown-                
 0.8.6         5 Mark J. Cox (1)           Roy T. Fielding (1)      
 0.8.5        10 Alexei Kosut (2)          Roy T. Fielding (1)      
 0.8.4         6 David Robinson (2)        Andrew Wilson (1)        
 0.8.3         8 David Robinson (2)        Rob Hartill (1)          
 0.8.2        11 Robert S. Thau (3)        Mark J. Cox (3)          
 0.8.1         3 Randy Terbush (1)         -unknown-                
 0.8.0         9 Robert S. Thau (5)        Cliff Skolnick (2)       
 0.6.1         5 Paul Sutton (1)           Rob Hartill (1)          
 0.6.0        11 Robert S. Thau (9)        Paul Sutton (2)          
 0.5.3         2 Cliff Skolnick (2)        -unknown-                
 0.5.2         4 Robert S. Thau (2)        Paul Sutton (1)          
 0.5.1         9 Robert S. Thau (5)        Rob Hartill (1)          
 0.4           1 Robert S. Thau (1)        Rob Hartill (1)          
 0.3           1 David Robinson (1)        Robert S. Thau (1)       
 0.2           1 David Robinson (1)        Robert S. Thau (1)       
 ________________________________________________________________________

 DEVELOPER STATISTIC                   (Sorted by amount of contribution)

 How much total effort was provided by each Apache
 developer since the start of the Apache project.

 Developer            Contributions  Developer            Contributions
 -------------------- -------------  -------------------- -------------
 Dean Gaudet                    220  Paul Sutton                     23
 Ralf S. Engelschall            129  Rob Hartill                     23
 Marc Slemko                     70  Lars Eilebrecht                 19
 Roy T. Fielding                 69  Chuck Murcko                    18
 Ben Laurie                      62  Brian Behlendorf                13
 Ken Coar                        60  Doug MacEachern                 13
 Robert S. Thau                  43  Ben Hyde                        12
 David Robinson                  41  Cliff Skolnick                  10
 Martin Kraemer                  39  Mark J. Cox                      8
 Jim Jagielski                   36  Andrew Wilson                    6
 Randy Terbush                   34  Sameer Parekh                    5
 Alexei Kosut                    27  Aram W. Mirzadeh                 4
 ________________________________________________________________________


And here comes the little script for your own pleasure:

#!/sw/bin/perl
##
##  CHANGES.stat -- calculate some statistic from Apache CHANGES file
##  Copyright (c) 1998 Ralf S. Engelschall, All Rights Reserved. 
##

#   the Apache Group developers
%N = (
   'brian'    => [ 'Brian.*Behlendorf', 'Brian Behlendorf' ],
   'ken'      => [ 'Ken.*Coar', 'Ken Coar' ],
   'mjc'      => [ 'Mark.*Cox', 'Mark J. Cox' ],
   'lars'     => [ 'Lars.*Eile', 'Lars Eilebrecht' ],
   'rse'      => [ 'Ralf.*Engel', 'Ralf S. Engelschall' ],
   'fielding' => [ 'Roy.*Fielding', 'Roy T. Fielding' ],
   'dgaudet'  => [ 'Dean.*Gaudet', 'Dean Gaudet' ],
   'robh'     => [ 'Rob.*Hart', 'Rob Hartill' ],
   'bhyde'    => [ 'Ben.*Hyde', 'Ben Hyde' ],
   'jim'      => [ 'Jim.*Jag', 'Jim Jagielski' ],
   'akosut'   => [ 'Alexei.*Kosut', 'Alexei Kosut' ],
   'martin'   => [ 'Martin.*Kraemer', 'Martin Kraemer' ],
   'ben'      => [ 'Ben.*Laurie', 'Ben Laurie' ],
   'dougm'    => [ 'Doug.*Eachern', 'Doug MacEachern' ],
   'aram'     => [ 'Aram.*Mirz', 'Aram W. Mirzadeh' ],
   'sameer'   => [ 'Sameer.*Pa', 'Sameer Parekh' ],
   'marc'     => [ 'Marc.*Slem', 'Marc Slemko' ],
   'cliff'    => [ 'Cliff.*Sk', 'Cliff Skolnick' ],
   'paul'     => [ 'Paul.*Sutton', 'Paul Sutton' ],
   'randy'    => [ 'Randy.Te', 'Randy Terbush' ],
   'dirk'     => [ 'Dirk-Willem.*', 'Dirk-Willem van Gulik' ],
   'chuck'    => [ 'Chuck.*Mu', 'Chuck Murcko' ],
   'david'    => [ 'David.*Robi', 'David Robinson' ],
   'rst'      => [ 'Robert.*Thau', 'Robert S. Thau' ],
   'awilson'  => [ 'Andrew.*Wil', 'Andrew Wilson' ],
);

#  initialize variables holding the information
%D = ();
@V = ();
%C = ();
%VC = ();
$v = '';

#  parse CHANGES file
open(FP, "<CHANGES") || die "Unable to open CHANGES: $!\n";
while (<FP>) {
    if (m|^Changes\s+with.+?(\d[\d.ab]+).*|) {
        $v = $1;
        $C{$v} = 0;
        push(@V, $v);
        $VC{$v} = {};
        next;
    }
    if (m|^\s*\*\)\s*|) {
        $C{$v}++;
    }
    foreach $n (keys(%N)) {
        $e = $N{$n};
        $p = $e->[0];
        if (m|$p|) {
            $D{$n}++;
            $VC{$v}->{$n}++;
        }
    }
}
close(FP);

#   create report
$date = `date '+%d-%b-%Y'`;   # %d = day of month (%m would print the month number)
$date =~ s|\n$||;
print " ________________________________________________________________________\n";
print "\n";
print " APACHE DEVELOPMENT STATISTIC\n";
print " Calculated from CHANGES file as of $date\n";
print " ________________________________________________________________________\n";
print "\n";
print " RELEASE STATISTIC\n";
print "\n";
print " How much effort was required for each Apache release and\n";
print " which developers have provided the major contributions.\n";
print "\n";
print " Release Changes Major Contributor         Second Major Contributor\n";
print " ------- ------- ------------------------- ------------------------------\n";
foreach $v (@V) {
    $e = $VC{$v};
    @mc = sort({ $e->{$b} <=> $e->{$a} } keys(%{$e}));
    $mc = $N{$mc[0]}->[1]." (".$e->{$mc[0]}.")";
    $mc = '-unknown-' if ($N{$mc[0]}->[1] eq ''); 
    $oc = $N{$mc[1]}->[1]." (".$e->{$mc[1]}.")";
    $oc = '-unknown-' if ($N{$mc[1]}->[1] eq ''); 
    printf(" %-7s %7d %-25s %-25s\n", $v, $C{$v}, $mc, $oc);
}
print " ________________________________________________________________________\n";
print "\n";
print " DEVELOPER STATISTIC                   (Sorted by amount of contribution)\n";
print "\n";
print " How much total effort was provided by each Apache\n";
print " developer since the start of the Apache project.\n";
print "\n";
print " Developer            Contributions  Developer            Contributions\n";
print " -------------------- -------------  -------------------- -------------\n";
@O = ();
foreach $d (sort({ $D{$b} <=> $D{$a} } keys(%D))) {
    $e = $N{$d};
    $n = $e->[1];
    push(@O, sprintf(" %-20s %13d", $n, $D{$d}));
}
$n = int($#O / 2) + 1;
for ($i = 0; $i < $n; $i++) {
    print $O[$i] . ' ' . $O[$n+$i] . "\n";
}
print " ________________________________________________________________________\n";
print "\n";


[STATISTIC] Apache Development

Posted by Ben Hyde <bh...@pobox.com>.
Ralf S. Engelschall writes:
 > ... contributed to Apache. ... "Don't know any numbers". ...

Fun.

It is important to remain appropriately light hearted about
such numbers.  I enjoy collecting numbers like these from
projects I work on.  But many times I've watched the arrival
of "numbers" like these create misunderstanding, hurt feelings
and defensiveness.

I'd particularly advise against collecting the number of edits by X
that caused bugs later!  "On time" is also a terrible number.  On the
other hand # of edits to the bug database, help mailing list,
documentation, etc. are good since most people don't like that work.
I often monitor a mess of such numbers and make a point of going and
saying thank you when a person does the first in a new category.

Richard Gabriel's book has a nice story in it about the time
the CEO decided to lay off people based on # of hours they
worked.

 - ben

Re: [STATISTIC] Apache Development

Posted by Mark J Cox <ma...@awe.com>.
> Last week two friends asked me how much I've contributed to Apache. 

Neat report.  The problem with using the CHANGES file is that changes made
whilst major versions were being worked on were not included (such as the
stuff between 1.1.1 and 1.2 or the stuff being done for 2.0).

Even more fun would be to look through the CVS logs, take account of the
"submitted by:" and work out how many lines were changed by each
developer :-)

Mark
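
The per-developer line count suggested above could be sketched roughly as
follows. This is only an illustration under stated assumptions, not a tested
tool: it assumes the usual "cvs log" record layout (a "date: ...; author: ...;
lines: +A -D" line per revision, revisions separated by a row of dashes), it
treats added plus deleted lines as "lines changed", and the sample log, the
names in it and the count_lines() helper are all invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

#   Sum the "lines: +A -D" counts of `cvs log` revisions per developer.
#   A "Submitted by:" line in the log message overrides the committer.
#   (count_lines is a hypothetical helper, not part of CHANGES.stat.)
sub count_lines {
    my ($text) = @_;
    my %lines;
    #   revisions are separated by a row of '-', files by a row of '='
    foreach my $rev (split(/^(?:-{20,}|={20,})\n/m, $text)) {
        #   initial revisions carry no "lines:" field and are skipped
        next unless $rev =~ m|^date:.*?author:\s*([^;\s]+);.*?lines:\s*\+(\d+)\s+-(\d+)|m;
        my ($who, $added, $deleted) = ($1, $2, $3);
        #   credit the submitter (name without e-mail address) if one is given
        $who = $1 if $rev =~ m|^\s*Submitted by:\s*(.+?)\s*(?:<.*)?$|mi;
        $lines{$who} += $added + $deleted;
    }
    return %lines;
}

#   invented sample in the assumed `cvs log` layout
my $log = <<'EOT';
RCS file: /hypothetical/cvs/apache/src/CHANGES,v
head: 1.3
----------------------------
revision 1.3
date: 1998/08/01 10:00:00;  author: alice;  state: Exp;  lines: +10 -2
Fix a typo.
----------------------------
revision 1.2
date: 1998/07/30 09:00:00;  author: alice;  state: Exp;  lines: +40 -5
Add a new directive.
Submitted by: Bob Example <bob@example.com>
----------------------------
revision 1.1
date: 1998/07/01 08:00:00;  author: carol;  state: Exp;
Initial revision
=============================================================================
EOT

#   report, sorted by number of changed lines
my %lines = count_lines($log);
foreach my $who (sort { $lines{$b} <=> $lines{$a} } keys %lines) {
    printf(" %-20s %13d\n", $who, $lines{$who});
}
```

In real use the text would come from running "cvs log" over the source tree
instead of the inline sample. A log message that itself contains a row of
dashes would confuse the split, so the result is an estimate at best.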




Re: [STATISTIC] Apache Development

Posted by Brian Behlendorf <br...@hyperreal.org>.
I don't think any of this is productive.  We might as well count authorship
of messages to new-httpd, or personal finances spent on Apache development,
or bug database responses, or how many installations of Apache each of us
is personally liable for.  At the end of the day, it does nothing but
demotivate.

	Brian

 

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
"Common sense is the collection of prejudices  |     brian@apache.org
acquired by the age of eighteen." - Einstein   |  brian@hyperreal.org

Re: [STATISTIC] Apache Development

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.
On Mon, 3 Aug 1998, Ralf S. Engelschall wrote:

> PS: Should we post it on a regular basis, say once per month?

Definitely not. This is especially due to the meaninglessness of this
particular statistic. Not that there aren't others that are more
meaningless. Ben's revelation that I've only checked out the Apache
sources five times in the last month was particularly useless :)

And the CHANGES file, as we all know, is a particularly bad indication of
when anything happened. It's missing large chunks of data, especially
between major releases (1.1, 1.2 and 1.3, for example). According to the
CHANGES file, for example, we never actually made a Win32 port. It just
sort of appeared. And it would have only counted once, at any rate,
although the number of lines of code attached to such an entry would
probably have given Ambarish, Ben and me top billing in the 1.3a1 category
if it were done by lines of code changed.

(And I *know* I've made more than 27 contributions, by the way. But most
of them came in sections of the CHANGES file that aren't there any more,
or never were. Minor things like mod_actions, handlers, keep-alive
support, <Location>, name-based virtual hosts, HTTP/1.1 compliance,
<Files>, chunks of the Win32 port, etc...)

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *