Posted to general@gump.apache.org by "Daniel F. Savarese" <df...@savarese.org> on 2004/03/08 09:26:59 UTC

Re: [RT] Moving gump forward

In message <40...@apache.org>, Stefano Mazzocchi writes:
>Now, I think it's possible (even if computationally expensive) to 
>understand exactly what commit broke the build and to nag the exact 
>person and the community and copy all the offended people.
...
>for those not familiar with exponential growth, it's enough to say that 
>if the build took a minute it would take gump 2,5 billion years to find 
>out what broke the build.

Why don't you start with the easy stuff?  If a build of project A
fails, we know all builds depending on project A will fail.  But
if project A is dependent on project B, and project B failed, then
we keep on walking up the tree to find the first build that failed.
Okay, so now that leaves the issue of a build failing because of
an API or some other change in a dependent project that built
successfully.  For projects with a small number of dependencies,
use the brute force approach you described.  For projects with a
large number of dependencies, develop some heuristics.  You can
reason that if project B makes a change that breaks project A and
if C also depends on project B, then there's a chance project C
might also have broken.  Since Gump builds a ton of code bases,
a decent number of culprits (i.e., project B) could be identified
by analyzing the dependencies shared by projects that failed to
build.  You can then apply the brute force approach to only those
shared dependencies if they number below some manageable figure.
Another heuristic approach, one that would work for Java projects at
least, would be to analyze the build failure messages.  Usually
they'll reference a class or class member/method that is in
a dependent code base.  Develop an evolving library of patterns
to extract the offending methods/classes/etc. and then discover
what jar they came from.
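
As a rough illustration of the message-analysis idea, here is a minimal
Python sketch. Everything in it is an assumption made for illustration: the
two regular expressions are only starter guesses at common javac wording
(compiler output differs between tools and versions), and `jar_classes` is a
hypothetical, precomputed map from jar name to the fully qualified class
names it contains. None of this is existing Gump code.

```python
import re
from typing import Dict, List, Set

# Hypothetical starter patterns; the library is meant to grow over time.
PATTERNS = [
    re.compile(r"package ([\w.]+) does not exist"),
    re.compile(r"symbol\s*:\s*class (\w+)"),
]


def extract_symbols(log: str) -> List[str]:
    """Collect package or class names mentioned in a build failure log."""
    found: List[str] = []
    for pattern in PATTERNS:
        for match in pattern.finditer(log):
            name = match.group(1)
            if name not in found:
                found.append(name)
    return found


def jars_providing(symbol: str, jar_classes: Dict[str, Set[str]]) -> List[str]:
    """Return the jars holding a class whose full name, package, or simple
    name equals `symbol`."""
    hits = []
    for jar, classes in jar_classes.items():
        for cls in classes:
            package, _, simple = cls.rpartition(".")
            if symbol in (cls, package, simple):
                hits.append(jar)
                break
    return sorted(hits)
```

Each jar returned by `jars_providing` points at a project that is a
candidate for nagging. A simple class name can match more than one jar, so
the result is a list of suspects rather than a verdict.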

Anyway, those are some inelegant, inexact, and simplistic--but perhaps
useful in some instances--suggestions about how to start adding the
functionality you describe that would buy time to figure out how to do
it right.  That is, assuming it's a hard problem that requires a good
while to figure out.  My reasoning is that it's okay to nag the
wrong projects every once in a while as long as you nag the right projects
most of the time.  And on first glance, based on your comments, it seems
easier to implement that than to figure out the right project to nag all
the time.

daniel





Re: [RT] Moving gump forward

Posted by Stefano Mazzocchi <st...@apache.org>.
Daniel F. Savarese wrote:

> In message <40...@apache.org>, Stefano Mazzocchi writes:
> 
>>Now, I think it's possible (even if computationally expensive) to 
>>understand exactly what commit broke the build and to nag the exact 
>>person and the community and copy all the offended people.
> 
> ...
> 
>>for those not familiar with exponential growth, it's enough to say that 
>>if the build took a minute it would take gump 2,5 billion years to find 
>>out what broke the build.

> Why don't you start with the easy stuff?  If a build of project A
> fails, we know all builds depending on project A will fail.  But
> if project A is dependent on project B, and project B failed, then
> we keep on walking up the tree to find the first build that failed.
> Okay, so now that leaves the issue of a build failing because of
> an API or some other change in a dependent project that built
> successfully.  For projects with a small number of dependencies,
> use the brute force approach you described.  For projects with a
> large number of dependencies, develop some heuristics.

Yes, I was thinking about this today.
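
The "easy stuff" quoted above, walking up the tree to the first build that
failed, could look roughly like the sketch below. It assumes a hypothetical
`depends_on` map from each project to its direct dependencies and a `failed`
set taken from the current run; neither name comes from Gump itself.

```python
from typing import Dict, List, Set


def first_failures(project: str,
                   depends_on: Dict[str, List[str]],
                   failed: Set[str]) -> Set[str]:
    """Return the failed projects furthest upstream of `project`.

    A failed project with no failed dependency of its own is treated as
    a first failure, i.e. the one worth nagging.
    """
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = [project]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        failed_deps = [d for d in depends_on.get(current, []) if d in failed]
        if failed_deps:
            # The blame moves upstream: keep walking.
            stack.extend(failed_deps)
        elif current in failed:
            roots.add(current)
    return roots
```

If `first_failures` returns only the project itself, all of its dependencies
built successfully, and that is the harder case the heuristics are for.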

> You can
> reason that if project B makes a change that breaks project A and
> if C also depends on project B, then there's a chance project C
> might also have broken.  

uh, that's a brilliant suggestion right there!

> Since Gump builds a ton of code bases,
> a decent number of culprits (i.e., project B) could be identified
> by analyzing the dependencies shared by projects that failed to
> build.  

This is awesome! I love it!

> You can then apply the brute force approach to only those
> shared dependencies if they number below some manageable figure.

Sweet!
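
A minimal sketch of the shared-dependency heuristic quoted above, under the
assumption that a run yields a hypothetical `depends_on` map plus `failed`
and `built_ok` sets (illustrative names, not Gump's API). It counts how many
failed projects rely on each dependency that itself built successfully and
ranks the dependencies shared by at least two failures.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def shared_dependency_suspects(depends_on: Dict[str, List[str]],
                               failed: Set[str],
                               built_ok: Set[str]) -> List[Tuple[str, int]]:
    """Rank successfully built dependencies by how many failed projects
    share them."""
    counts: Counter = Counter()
    for project in failed:
        for dep in set(depends_on.get(project, [])):
            if dep in built_ok:
                counts[dep] += 1
    # A dependency used by a single failed project is not "shared".
    suspects = [(dep, n) for dep, n in counts.items() if n >= 2]
    # Most widely shared first; ties broken by name for stable output.
    suspects.sort(key=lambda item: (-item[1], item[0]))
    return suspects
```

The caller would compare `len(suspects)` against whatever figure counts as
manageable and run the brute force approach only on those candidates.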

> Another heuristic approach that would work for Java projects at
> least, would be to analyze the build failure messages.  Usually
> they'll reference a class or class member/method that is in
> a dependent code base.  Develop an evolving library of patterns
> to extract the offending methods/classes/etc. and then discover
> what jar they came from.

yes, I was going to suggest this approach next, but I'm reluctant 
because I deeply appreciate the fact that gump is language agnostic and 
it should stay that way, IMO.

> Anyway, those are some inelegant, inexact, and simplistic--but perhaps
> useful in some instances--suggestions about how to start adding the
> functionality you describe that would buy time to figure out how to do
> it right.  

yep

> That is, assuming it's a hard problem that requires a good
> while to figure out.  My reasoning is that it's okay to nag the
> wrong projects every once in a while as long as you nag the right projects
> most of the time.  And on first glance, based on your comments, it seems
> easier to implement that than to figure out the right project to nag all
> the time.

yours are precious suggestions. many thanks for taking the time to share 
them.

I'll try to think more about heuristics on broken dependencies and 
report back.

-- 
Stefano.