Posted to general@gump.apache.org by "Adam R. B. Jack" <aj...@trysybase.com> on 2004/05/14 16:52:15 UTC

[RT] Multi-threading ( was Re: [RT] Gump[y] internals re-design)

Not necessarily a follow-on to this, but it came out of the same thought
stream...

It occurs to me that Gump is paced by the speed of remote HTTP servers (for
metadata download), remote CVS servers (for updates), and such.
Unfortunately Gump operates sequentially, so each delay is cumulative (and
it also doesn't make much use of multiple CPUs). I suspect that by
multi-threading we could significantly reduce the elapsed time of a run.

* Multi-threading modules:

CVS|SVN updates are (potentially) more IO bound than CPU bound, which we
know from past experience with the SF.net CVS servers (before they were
revamped). Busy servers will likely respond far more slowly than Brutus (or
any other Gump server) can process. Module updates (and synchronize, which
cleans the work area and is equally or possibly more slow; I need to check)
really don't need to be sequenced. We could easily have a pool of threads
forking CVS|SVN updates to work through the module list.
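
A minimal sketch of such a pool, using Python's standard thread pool. The
names update_modules and update_fn are hypothetical, not existing Gump code;
update_fn stands in for whatever forks the CVS|SVN client for one module.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def update_modules(modules, update_fn, max_workers=4):
    """Run update_fn(module) for every module on a pool of worker threads.

    Returns a dict mapping each module to True (updated) or to the
    exception its update raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; the pool works through the list.
        futures = {pool.submit(update_fn, m): m for m in modules}
        for future in as_completed(futures):
            module = futures[future]
            try:
                future.result()
                results[module] = True
            except Exception as exc:
                # One slow or broken server should not stop the others.
                results[module] = exc
    return results
```

Since the workers would mostly sit waiting on remote servers, threads ought
to be enough here despite Python's global interpreter lock.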

Using the algorithm (below) we could have a build thread block waiting for
a module update to occur (and/or perform the update itself). If, however,
we updated in module sequence (derived from the project sequence) I doubt
we'd ever actually see this occurring outside of theory.
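
One way this could look, as a sketch only: updates are submitted in the
order the build sequence first needs them, and the (single) build thread
blocks on a module's update only if it has not finished yet. The names
build_in_sequence, module_of, update_fn and build_fn are all assumptions
for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_in_sequence(projects, module_of, update_fn, build_fn, max_workers=4):
    """Start all module updates in build order, then build sequentially,
    waiting only when a project's module has not been updated yet."""
    # Distinct modules, in the order the build sequence first needs them.
    modules = list(dict.fromkeys(module_of[p] for p in projects))
    built = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The pool starts tasks in submission order, so early modules go first.
        pending = {m: pool.submit(update_fn, m) for m in modules}
        for project in projects:
            # Blocks until the update is done; re-raises an update error.
            pending[module_of[project]].result()
            build_fn(project)
            built.append(project)
    return built
```

If the updates run ahead of the builds, as I'd expect, result() returns
immediately and the build thread never really waits.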

* Multi-threading project builds:

We could create a queue/pool of 'projects ready to be built': the unbuilt
projects that directly depend on built projects and have no unbuilt
dependencies of their own. We'd seed this queue/pool with the first
project from the BuildSequence (or all those @ depth 0), and let build
completion keep filling it.

To be honest, I am not sure if multi-threading project builds will have much
impact (I doubt the queue/pool will be that full at any one time) but it
might, and it could pace itself to the OS's abilities.
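
The 'ready to be built' queue/pool described above can be sketched roughly
as follows. This is an illustration, not Gump code: build_ready_projects,
dependencies and build_fn are hypothetical names, and dependencies is
assumed to have an entry for every project (an empty set for depth 0).

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def build_ready_projects(dependencies, build_fn, max_workers=4):
    """Build projects in parallel, each only after all of its dependencies.

    dependencies maps every project to the projects it depends on.
    Returns the projects in the order their builds completed.
    """
    remaining = {p: set(deps) for p, deps in dependencies.items()}
    dependents = {p: [] for p in remaining}
    for project, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(project)

    # Seed with the projects that have no dependencies (depth 0).
    ready = [p for p, deps in remaining.items() if not deps]
    completed = []
    running = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while ready or running:
            for project in ready:
                running[pool.submit(build_fn, project)] = project
            ready = []
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                project = running.pop(future)
                future.result()  # re-raise a build failure
                completed.append(project)
                # Build completion keeps filling the ready list.
                for dependent in dependents[project]:
                    remaining[dependent].discard(project)
                    if not remaining[dependent]:
                        ready.append(dependent)
    return completed
```

This simply raises on the first build failure; a real run would want to
mark the dependents as blocked by the failure and carry on with the rest.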

* Multi-threading other:

Heck, once the project build (<ant|<maven) has completed, there is really no
*need* for statistics, publishing jars, syndication, documentation, and
notification to occur before subsequent projects get built. I'm not sure we
gain much by moving this to another thread, 'cos we might just compete for
cycles/resources, but at least we'd be allowing the scheduler (which is in a
better position than us) to make the determination.
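
A sketch of handing that post-build work to a background thread, so the
build loop can move straight on to the next project. PostBuildWorker and
its methods are hypothetical names; the steps would be whatever does the
stats, documentation, syndication and notification today.

```python
import queue
import threading


class PostBuildWorker:
    """Runs post-build steps on one background thread, in submission order."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, step, project):
        """Queue step(project) and return immediately."""
        self._tasks.put((step, project))

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: no more work
                break
            step, project = item
            try:
                step(project)
            except Exception as exc:
                # A failed report should not take the worker down.
                print(f"post-build step failed for {project}: {exc}")

    def close(self):
        """Let the queued steps finish, then stop the thread."""
        self._tasks.put(None)
        self._thread.join()
```

A single worker keeps the steps for each project in order; whether it
actually saves elapsed time is exactly the open question above.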

                  -----------------------------------------------------

In general there are some aspects of multi-threaded code that we'd need to
verify (e.g. can we propagate state to dependee modules/projects that are
potentially being worked upon) but I think the above [and below] logic
actually reduces this to far less than one would initially imagine.

Thoughts?

regards,

Adam
----- Original Message ----- 
From: "Adam R. B. Jack" <aj...@trysybase.com>
To: <ge...@gump.apache.org>
Sent: Thursday, May 13, 2004 8:28 PM
Subject: [RT] Gump[y] internals re-design


> I think I've finally got Sam's itch for a more real-time/interactive Gump.
> [FWIW: I think Brutus' wonderful multiple-a-day cycles are being wasted
> without more interaction with users, and this led me here.]
>
> Basically, I think I'd like to see Gump's internals refactored so that we
> have logic like:
>
>     # In order...
>     for project in run.getBuildSequence():
>
>         module=project.getModule()
>
>         # Update on demand
>         if not module.isUpdated():
>             module.update()
>             module.updateStats()
>             module.document()
>             module.syndicate()
>             module.notify()
>
>         # Build project
>         project.build()
>         project.publishArtefacts()
>         project.updateStats()
>         project.document()
>         project.syndicate()
>         project.notify()
>
>         # Keep track of progress...
>         documentBuildList()
>
>     # The wrap up...
>     documentWorkspace()
>
> This allows (1) CVS|SVN updates to occur just in time (2) the smallest
> possible window between update/build failure/notification (3) watch as
> Gump goes [monitoring].
>
> [I'd like to split project/module/workspace into wrapping context objects
> (separate from the models) that hold state, etc. I doubt anybody cares
> about that, but I want a loaded workspace to be re-usable, it 'costs' too
> much to load.]
>
> Now, there are far better designers than I on this list, and I have to
> wonder if I am missing some useful patterns. Ought I consider plug-ins
> (e.g. ant|maven|xyz), ought I consider observer patterns, etc. etc. How do
> I cope with optional/switchable/dynamic code paths through here? [i.e. no
> update/just build, no publish/just build, etc.] Any thoughts here?
>
> BTW: I'd like to be able to combine this with some sort of web-based
> interface that can use a loaded workspace in memory (it is slow to load
> 600 projects) and allow users to interactively say 'validate this',
> 'preview this for me', 'do an adhoc build, pls', etc. This is a knock-on
> goal, but seems important to consider.
>
> I'm posting this so I don't start coding it. It isn't critical this
> moment, but it'd be a nice to have. I'd really appreciate
> thoughts/insights/advice.
>
> regards,
>
> Adam
> --
> Experience the Unwired Enterprise:
> http://www.sybase.com/unwiredenterprise
> Try Sybase: http://www.try.sybase.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@gump.apache.org
> For additional commands, e-mail: general-help@gump.apache.org
>

