Posted to general@gump.apache.org by "Adam R. B. Jack" <aj...@trysybase.com> on 2004/04/01 20:02:39 UTC

Gump Actions (was re: brutus config -> lsd config)

Sam wrote:

> It appears to me that you are looking at Gump as a monolithic tool.

I don't think I do, other than I believe that 90% of the community is more
interested in Gump the service than Gump the code -- so I focus on that. I
think (for us/Gumpmeisters and for the eventual growing millions ... ;-) we
should ensure we can run things in whatever parts we want, to optimize our
time. I don't think we differ here.

We do have one monolith script (integrate.py for the 'server' = cron run)
but we also have individual scripts (update/build/debug) and the GUI (I
almost got it working again), all of which I think ought to be able to do
all or any small parts they wish.

For the record, I really like that we document to xdocs -- external files -- 
'cos if forrest fails to publish I can go in and tweak and re-run. I like
that we can do updates as we see fit, builds as we see fit, tight and
separate -- I just don't do them much myself, I don't have the network
resources.

The only problem I have is the in-memory structures that represent the run
information [options/list], state, annotations, etc. I've thought about
serializing that to XML, but not had the time/energy. I also fear it is
overkill.
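
For what it's worth, a minimal sketch of what serializing the run information
to XML could look like. This is not Gump code: the element names, the
run_to_xml/run_from_xml helpers, and the shape of the options, state, and
annotations are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def run_to_xml(options, state, annotations):
    """Serialize hypothetical run information to an XML string.

    options: dict of option name -> value (values are stored as strings)
    state: short string describing the run state
    annotations: list of free-text notes
    """
    root = ET.Element("run", {"state": state})
    opts = ET.SubElement(root, "options")
    for name in sorted(options):
        ET.SubElement(opts, "option",
                      {"name": name, "value": str(options[name])})
    notes = ET.SubElement(root, "annotations")
    for text in annotations:
        ET.SubElement(notes, "annotation").text = text
    return ET.tostring(root, encoding="unicode")


def run_from_xml(xml_text):
    """Read back what run_to_xml wrote; option values come back as strings."""
    root = ET.fromstring(xml_text)
    options = {o.get("name"): o.get("value") for o in root.find("options")}
    annotations = [a.text or "" for a in root.find("annotations")]
    return options, root.get("state"), annotations
```

A script that only publishes could then call run_from_xml on a saved file
instead of rebuilding the in-memory structures, at the cost of keeping the
two formats in step.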

>  I  see it as a series of actions: Generate, Update, Make, Publish...

I agree. Have you looked in engine.py? I've tried to break things into
individual named tasks with there being only the minimal mandatory
dependencies (i.e. code would crash without). There is no dependency upon
'update' for 'build', nor on either of these two for documentation. Again,
the in-memory tree is the main common aspect between these tasks.

As such, I try to use the 'GumpRun' as the common interfacing information
between aspects.
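
To illustrate the idea (not the actual engine.py code), here is a minimal
sketch of independent named tasks that share only a run object. The Run
class, the task functions, and perform() are hypothetical names invented for
this example; the real GumpRun carries far more.

```python
class Run:
    """Hypothetical stand-in for the shared in-memory run information."""

    def __init__(self, modules):
        self.modules = list(modules)
        self.log = []


def update(run):
    # Placeholder: a real task would fetch the source for each module.
    for module in run.modules:
        run.log.append("update " + module)


def build(run):
    # Placeholder: does not require that update() ran first.
    for module in run.modules:
        run.log.append("build " + module)


def document(run):
    # Placeholder: reports on whatever the run object currently holds.
    run.log.append("document %d modules" % len(run.modules))


# Tasks are looked up by name, so any script can ask for any subset.
TASKS = {"update": update, "build": build, "document": document}


def perform(run, task_names):
    """Run only the named tasks, in the order given; return the run log."""
    for name in task_names:
        TASKS[name](run)
    return run.log
```

With this shape, perform(Run(["ant"]), ["build"]) builds without updating,
and a cron-style script would simply pass all three names.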

BTW: Generate no longer exists, we "generate into memory". That said, I
could see value in "generate to a merge file (we have that) and re-read" --
the only reason I didn't do that is that I don't see much time saving from
loading/merging on demand. I've thought about it.

> My suggestion is that we should decompose Gump.  There is no reason that
> everything needs to funnel through a single entry point.

I think it is decomposed, with 'engine' being my attempt at extracting logic
from model classes, so we don't have it duplicated. I'm no expert designer,
I'd welcome feedback/improvements.

> Focusing initially on the cron job, it needs to run a script.  That
> script can be written entirely in Python (including reading the
> configuration, setting environment variables, etc).  It can do more than
> what one typically does from the command line (e.g., copying of
> directories, nagging, etc).

That is gumpy.py.

>
> Key points:
>
>   1) when it does things typically done by "testing", it calls into the
> exact same code.
>

I've tried in the unit tests. [Have you tinkered with them yet?] I find that
when I crash (in public) it is in any code I've not covered in tests (the
joys of a large runtime interpreted language w/ huge metadata inputs). As
such I'm striving to get as much coverage in the tests as possible.
Engine.py is one weak spot, I just don't want to get too fancy trying to
make that unit testable (against some mythical in-memory
workspace/filesystem/etc.).

BTW: See the test workspace in unit tests? That allows access to the file
system, and such. I often run gumpy.py against that test workspace to get
better test coverage.

BTW: I want a 'check in' script that does:

    1) Run all unit tests
    2) Run the test workspace
    3) Do a pychecker run on all code

I tried writing it, but struggled with the pychecker part and gave up. If I
had the discipline to do that I know it'd help reduce my error rate...
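
A rough sketch of how such a check-in script might be structured, assuming
each of the three steps can be expressed as a command line. The run_checks,
all_passed, and unit_test_check names are made up for this example, and the
exact commands for the test workspace run and the pychecker run are left to
the caller since they are not spelled out here.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (label, argv) check in order.

    Returns a list of (label, passed) pairs. A command that cannot be
    started at all (e.g. the tool is not installed) counts as a failure.
    """
    results = []
    for label, argv in checks:
        try:
            code = subprocess.call(argv)
        except OSError:
            code = 1  # tool missing or not executable
        results.append((label, code == 0))
    return results


def all_passed(results):
    """True only if every check passed."""
    return all(passed for _, passed in results)


def unit_test_check():
    # One possible way to express step 1, assuming the tests are
    # discoverable by the standard unittest runner.
    return ("unit tests", [sys.executable, "-m", "unittest", "discover"])
```

Every check runs even if an earlier one fails, so a single pass shows all
the problems rather than just the first.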

>   2) official, completed outputs are served from a different URL than
> incomplete or testing outputs.

Yeah, I hear you. Right now I've been doing --text for testing when I don't
want to dork the site, or I just dork it and be damned. You'll find there is
some complexity in what gets copied to log (due to forrest) but feel free to
dig in. Again (from TODO list) a historical RDBMS is a nice thought for
saving some official result history.

>   3) official runs start by cleaning up the work area.  This was done by
> rsync in "classic" gump.

We moved away from rsync in part because of (perceived) bugs [which in
hindsight maybe weren't] and portability (and we didn't need the 'r'). The
folks on M$ were struggling with it (see archives). Basically Antoine gave
us a Python port of it (see utils/sync.py). We use this to do as you say,
and it works well. Since we own the code, we could even extend it to report
on 'what files changed for this build' (some folks would like that).
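
As an illustration of that last idea only (this is not the code in
utils/sync.py), here is a minimal sketch of a directory sync that also
returns what it changed. The sync() function below and its return value are
assumptions for the example, built on the standard filecmp, os, and shutil
modules.

```python
import filecmp
import os
import shutil


def sync(source, target):
    """Mirror source into target; return sorted relative paths that changed.

    A path is reported if it was copied (new or different content) or
    removed from target. A removed directory is reported once, by name.
    """
    changed = []
    os.makedirs(target, exist_ok=True)
    source_names = set(os.listdir(source))

    # Remove anything in target that no longer exists in source.
    for name in sorted(set(os.listdir(target)) - source_names):
        path = os.path.join(target, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        changed.append(name)

    for name in sorted(source_names):
        src = os.path.join(source, name)
        dst = os.path.join(target, name)
        if os.path.isdir(src):
            # A file in the way of a directory is replaced.
            if os.path.exists(dst) and not os.path.isdir(dst):
                os.remove(dst)
            changed.extend(os.path.join(name, sub) for sub in sync(src, dst))
        else:
            # A directory in the way of a file is replaced.
            if os.path.isdir(dst):
                shutil.rmtree(dst)
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                shutil.copy2(src, dst)
                changed.append(name)
    return sorted(changed)
```

An empty list from sync() would mean nothing changed since the last run,
which is the sort of signal a 'what changed for this build' report needs.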

Right now when we update we sync, which you might wish to break apart.

regards,

Adam

