
Posted to dev@continuum.apache.org by Wendy Smoak <ws...@gmail.com> on 2009/01/20 02:15:56 UTC

In distributed build, how does Continuum decide whether there have been scm changes?

I'm trying to understand how, if a project may build on any agent,
Continuum can determine whether there have been scm changes since the
last time it was built.  Here's what I think should happen (scheduled
builds, assuming Always Build and Build Fresh are NOT checked):

Add project
1:00pm Build on agent 1, checkout at r500
2:00pm Build on agent 2, checkout at r500 <--- no changes, project
should not build
2:15pm developer makes scm changes
3:00pm Build on agent 2 update to r501, build because there were changes
4:00pm Build on agent 1 update to r501 <---- no changes, project
should not build

But I'm not sure how Continuum determines whether there have been
changes (even without Distributed Build).

This is a bit of a pain to set up and test for, so I'm hoping someone
will reassure me that it will all work fine. :)

-- 
Wendy

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Christian Edward Gruber <cg...@israfil.net>.

Most SCM systems will allow you to ask for a log list or diff between  
two different repository states, or a log list or diff between two  
timestamps, either of which will tell you if there have been changes  
(if you scope it to a sub-directory as appropriate).

Christian.

On 26-Jan-09, at 07:43 , Marica Tan wrote:

> So here's my problem. As you can see from the scenario above, when it
> tried to build from agent 2 at 2:00, it checked out the project at r101
> with no scm changes.
> How can it get the changes from r100 to r101? This also happens at 3:00.
> It only gets the scm changes from r101 to r102. It can only have the scm
> changes from r100 to r102 if it build on agent 1 again just like at 3:30
> (merging of scm results will happen).

Christian E. Gruber - President / Senior Consultant
Isráfíl Consulting Services Corporation
email:  cgruber@israfil.net
mobile: +1 (289) 221-9839
web:    http://www.israfil.net/
"...keenness of understanding is due to keenness of vision."

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Marica Tan <ct...@exist.com>.

In my recent commit, I added a check that compares the last commit date
of the previous build with the last commit date of the current build to
decide whether the project should be built. This works if the agent already
has a working copy and the provider supports changelog, since the agent
will call the update command, which returns an UpdateScmResult
containing changes that have dates.

If it's the first time that an agent will build the project, it will execute
a checkout command which will return a CheckoutScmResult but no commit dates
on that result. I think I can still get the last commit date by using the
ChangeLog command but not all providers support that.
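
A minimal sketch of the date comparison described above, not the code from
the commit. CommitDateCheck, latestCommit and shouldBuild are hypothetical
names, and the Maven SCM result types are not modeled: the sketch assumes the
master already has the previous build's last commit date and the list of
commit dates reported by the update.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper; Continuum's real check is not shown in this thread.
final class CommitDateCheck {

    /** Latest commit date among the changes an update reported, if any. */
    static Optional<Instant> latestCommit(List<Instant> changeDates) {
        return changeDates.stream().max(Comparator.naturalOrder());
    }

    /**
     * Build when no previous build is recorded, or when the update reports
     * a commit newer than the last commit seen by the previous build.
     */
    static boolean shouldBuild(Optional<Instant> previousLastCommit,
                               List<Instant> changeDates) {
        if (!previousLastCommit.isPresent()) {
            return true; // first build of the project
        }
        return latestCommit(changeDates)
                .map(latest -> latest.isAfter(previousLastCommit.get()))
                .orElse(false); // no dated changes were reported
    }

    public static void main(String[] args) {
        Instant older = Instant.parse("2009-02-16T01:00:00Z");
        Instant newer = Instant.parse("2009-02-16T02:15:00Z");
        System.out.println(shouldBuild(Optional.empty(), List.of()));
        System.out.println(shouldBuild(Optional.of(older), List.of(newer)));
        System.out.println(shouldBuild(Optional.of(newer), List.of(older)));
    }
}
```

Note that an empty list of dates is treated as "nothing newer", which is
exactly the gap described above for a first checkout on an agent: the
checkout result carries no commit dates, so this rule alone cannot tell
whether a build is needed.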


Thanks
--
Marica

On Mon, Feb 16, 2009 at 11:13 AM, Marica Tan <ct...@exist.com> wrote:

>
>
> On Tue, Jan 27, 2009 at 8:46 AM, Brett Porter <br...@apache.org> wrote:
>
>>
>> On 26/01/2009, at 5:43 AM, Marica Tan wrote:
>>
>>> So here's my problem. As you can see from the scenario above, when it
>>> tried to build from agent 2 at 2:00, it checked out the project at r101
>>> with no scm changes.
>>> How can it get the changes from r100 to r101? This also happens at 3:00.
>>> It only gets the scm changes from r101 to r102. It can only have the scm
>>> changes from r100 to r102 if it build on agent 1 again just like at 3:30
>>> (merging of scm results will happen).
>>>
>>> Do you know how to solve this? Any suggestions?
>>>
>>
>> One of the reasons I initially thought it'd be good to attach the agent to
>> use to a build definition was that it meant that it wouldn't matter - the
>> checkout on the agent could be used to determine (because we actually want
>> to separate the update checks per build definition).
>>
>>
>>>
>>>
>>> Here's also a solution on what Wendy wants: (though i'm not that familiar
>>> with version controls, just know how to use one, so I don't know if this
>>> will work)
>>>
>>> 1.) agent updates checkout
>>> 2.) return result with revision number?
>>> 3.) master checks if up to date and perform other checks to determine
>>> whether agent should build the project or not. scm result may need an
>>> additional field like "Revision Number"
>>> 4.) master then decides whether to let the agent build or not based on #3
>>>
>>>
>>> Is it possible to get the revision number when we do an update in
>>> continuum?
>>>
>>
>> Yep, but something to bear in mind is that not all systems support an
>> atomic revision number (eg, CVS). Perhaps the date could be used in its
>> stead here - as long as it's used consistently (with no gaps in time) it
>> should be fine (though date is bad to use in subversion, since it doesn't
>> always work).
>>
>> I was actually thinking we should create a separate record from the
>> checkout about what the last build was anyway (as this would support the
>> multiple build definitions as well). This might be a good opportunity.
>>
>>
> How do you get the last commit date in Maven SCM?
>

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Marica Tan <ct...@exist.com>.

On Tue, Jan 27, 2009 at 8:46 AM, Brett Porter <br...@apache.org> wrote:

>
> On 26/01/2009, at 5:43 AM, Marica Tan wrote:
>
>> So here's my problem. As you can see from the scenario above, when it
>> tried to build from agent 2 at 2:00, it checked out the project at r101
>> with no scm changes.
>> How can it get the changes from r100 to r101? This also happens at 3:00.
>> It only gets the scm changes from r101 to r102. It can only have the scm
>> changes from r100 to r102 if it build on agent 1 again just like at 3:30
>> (merging of scm results will happen).
>>
>> Do you know how to solve this? Any suggestions?
>>
>
> One of the reasons I initially thought it'd be good to attach the agent to
> use to a build definition was that it meant that it wouldn't matter - the
> checkout on the agent could be used to determine (because we actually want
> to separate the update checks per build definition).
>
>
>>
>>
>> Here's also a solution on what Wendy wants: (though i'm not that familiar
>> with version controls, just know how to use one, so I don't know if this
>> will work)
>>
>> 1.) agent updates checkout
>> 2.) return result with revision number?
>> 3.) master checks if up to date and perform other checks to determine
>> whether agent should build the project or not. scm result may need an
>> additional field like "Revision Number"
>> 4.) master then decides whether to let the agent build or not based on #3
>>
>>
>> Is it possible to get the revision number when we do an update in
>> continuum?
>>
>
> Yep, but something to bear in mind is that not all systems support an
> atomic revision number (eg, CVS). Perhaps the date could be used in its
> stead here - as long as it's used consistently (with no gaps in time) it
> should be fine (though date is bad to use in subversion, since it doesn't
> always work).
>
> I was actually thinking we should create a separate record from the
> checkout about what the last build was anyway (as this would support the
> multiple build definitions as well). This might be a good opportunity.
>
>
How do you get the last commit date in Maven SCM?

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Wendy Smoak <ws...@gmail.com>.

On Mon, Jan 26, 2009 at 8:19 PM, Edwin Punzalan <el...@gmail.com> wrote:

> First suggestion:  maybe the master should remember which agent last built
> the project because then that agent has the latest sources.  The master can
> then ask that same agent to check if there are scm updates.  Of course, its
> up to the master now which agent should actually build the project if there
> are changes.

As I recall the discussion back in December, a build environment was
going to include a single build agent, so the question of whether
there are scm changes wouldn't be an issue.  One project group would
always build on a particular agent.  It would simply be moving the
checkout from the server where it is now, out to an agent.

(I was happy with that at the time, but now that I've seen the next
available agent selection effectively give us concurrent builds, I'm
reluctant to give it up. :)  Thus, the idea of having groups or pools
of agents.)

> Second suggestion: Have one agent that will do all the scm checks?  This
> agent may not be doing only this task but you get the idea.

That's similar to the current setup where the master is doing the scm
checks.  It still means that a particular server is going to have to
have the disk space to handle *all* of the projects added to the
master.

> Introducing new agent features (like doing specific scm calls to check if
> there are updates) is not quite what a non-distributed Continuum does.  So
> I'm just trying to stay within what Continuum already does.

We might be able to improve the way Continuum currently handles scm
checks... I think it has some limitations with using the same checkout
for multiple build definitions.

-- 
Wendy

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Edwin Punzalan <el...@gmail.com>.

I have an idea...  actually, make it two.

First suggestion:  maybe the master should remember which agent last built
the project because then that agent has the latest sources.  The master can
then ask that same agent to check if there are scm updates.  Of course, it's
up to the master now which agent should actually build the project if there
are changes.
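
A rough sketch of what "the master remembers which agent last built the
project" could look like. LastBuildAgentRegistry, its methods, and the
project and agent names are all hypothetical; nothing here comes from the
Continuum code base.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical master-side registry; names are illustrative only.
final class LastBuildAgentRegistry {
    private final Map<String, String> lastAgentByProject = new HashMap<>();

    /** Called by the master after an agent finishes building a project. */
    void recordBuild(String projectId, String agentName) {
        lastAgentByProject.put(projectId, agentName);
    }

    /** Agent holding the most recently updated working copy, if any. */
    Optional<String> agentForScmCheck(String projectId) {
        return Optional.ofNullable(lastAgentByProject.get(projectId));
    }

    public static void main(String[] args) {
        LastBuildAgentRegistry registry = new LastBuildAgentRegistry();
        registry.recordBuild("my-project", "agent-1");
        registry.recordBuild("my-project", "agent-2");
        System.out.println(registry.agentForScmCheck("my-project").orElse("none"));
        System.out.println(registry.agentForScmCheck("other-project").orElse("none"));
    }
}
```

The master would ask the returned agent for scm updates and remain free to
send the actual build to any agent.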

Second suggestion: Have one agent that will do all the scm checks?  This
agent may not be doing only this task but you get the idea.

Introducing new agent features (like doing specific scm calls to check if
there are updates) is not quite what a non-distributed Continuum does.  So
I'm just trying to stay within what Continuum already does.

Just my $0.02.  ^_^

On Mon, Jan 26, 2009 at 6:12 PM, Christian Edward Gruber <
cgruber@israfil.net> wrote:

> I agree with a unique number that Continuum can link to a revision or a
> date, depending on the SCM capabilities.  I'm thinking that the Master
> should control time, so if a timestamp is required, it should be obtained
> from the master server, so that all agents are synced up with a single
> source of time.
>
> Christian
>
> On 26-Jan-09, at 19:46 , Brett Porter wrote:
>
>> Yep, but something to bear in mind is that not all systems support an
>> atomic revision number (eg, CVS). Perhaps the date could be used in its
>> stead here - as long as it's used consistently (with no gaps in time) it
>> should be fine (though date is bad to use in subversion, since it doesn't
>> always work).
>>
>> I was actually thinking we should create a separate record from the
>> checkout about what the last build was anyway (as this would support the
>> multiple build definitions as well). This might be a good opportunity.
>>
>

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Christian Edward Gruber <cg...@israfil.net>.

I agree with a unique number that Continuum can link to a revision or  
a date, depending on the SCM capabilities.  I'm thinking that the  
Master should control time, so if a timestamp is required, it should  
be obtained from the master server, so that all agents are synced up  
with a single source of time.
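
A small sketch of the "master controls time" idea, under the assumption that
the master stamps every build and hands agents a time window to query.
MasterTimeSource and BuildWindow are hypothetical names; java.time.Clock is
only used so the time source can be swapped out.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical sketch: the master, not the agent, supplies every timestamp.
final class MasterTimeSource {
    private final Clock clock;
    private Instant lastBuildTime;

    MasterTimeSource(Clock clock, Instant initialBuildTime) {
        this.clock = clock;
        this.lastBuildTime = initialBuildTime;
    }

    /** Stamps a new build and returns the window an agent should query. */
    BuildWindow nextWindow() {
        Instant now = clock.instant();
        BuildWindow window = new BuildWindow(lastBuildTime, now);
        lastBuildTime = now;
        return window;
    }

    public static void main(String[] args) {
        Clock fixed = Clock.fixed(Instant.parse("2009-01-26T19:00:00Z"), ZoneOffset.UTC);
        MasterTimeSource master =
                new MasterTimeSource(fixed, Instant.parse("2009-01-26T18:00:00Z"));
        BuildWindow window = master.nextWindow();
        System.out.println(window.from + " .. " + window.to);
    }
}

/** Time range handed to an agent; both ends come from the master's clock. */
final class BuildWindow {
    final Instant from; // previous build, as stamped by the master
    final Instant to;   // this build, as stamped by the master

    BuildWindow(Instant from, Instant to) {
        this.from = from;
        this.to = to;
    }
}
```

Because agents never read their own clocks, clock drift between agents
cannot open a gap or an overlap between consecutive windows.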

Christian

On 26-Jan-09, at 19:46 , Brett Porter wrote:

> Yep, but something to bear in mind is that not all systems support an  
> atomic revision number (eg, CVS). Perhaps the date could be used in  
> its stead here - as long as it's used consistently (with no gaps in  
> time) it should be fine (though date is bad to use in subversion,  
> since it doesn't always work).
>
> I was actually thinking we should create a separate record from the  
> checkout about what the last build was anyway (as this would support  
> the multiple build definitions as well). This might be a good  
> opportunity.

Christian E. Gruber - President / Senior Consultant
Isráfíl Consulting Services Corporation
email:  cgruber@israfil.net
mobile: +1 (289) 221-9839
web:    http://www.israfil.net/
"...keenness of understanding is due to keenness of vision."

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Brett Porter <br...@apache.org>.

On 26/01/2009, at 5:43 AM, Marica Tan wrote:

> So here's my problem. As you can see from the scenario above, when it
> tried to build from agent 2 at 2:00, it checked out the project at r101
> with no scm changes.
> How can it get the changes from r100 to r101? This also happens at 3:00.
> It only gets the scm changes from r101 to r102. It can only have the scm
> changes from r100 to r102 if it build on agent 1 again just like at 3:30
> (merging of scm results will happen).
>
> Do you know how to solve this? Any suggestions?

One of the reasons I initially thought it'd be good to attach the
agent to use to a build definition was that it meant that it wouldn't
matter - the checkout on the agent could be used to determine whether
there were changes (because we actually want to separate the update
checks per build definition).

>
>
>
> Here's also a solution on what Wendy wants: (though i'm not that familiar
> with version controls, just know how to use one, so I don't know if this
> will work)
>
> 1.) agent updates checkout
> 2.) return result with revision number?
> 3.) master checks if up to date and perform other checks to determine
> whether agent should build the project or not. scm result may need an
> additional field like "Revision Number"
> 4.) master then decides whether to let the agent build or not based on #3
>
>
> Is it possible to get the revision number when we do an update in
> continuum?

Yep, but something to bear in mind is that not all systems support an  
atomic revision number (eg, CVS). Perhaps the date could be used in  
its stead here - as long as it's used consistently (with no gaps in  
time) it should be fine (though date is bad to use in subversion,  
since it doesn't always work).

I was actually thinking we should create a separate record from the  
checkout about what the last build was anyway (as this would support  
the multiple build definitions as well). This might be a good  
opportunity.

- Brett

>
>
>
> Thanks
> --
> Marica

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Marica Tan <ct...@exist.com>.

On Mon, Jan 26, 2009 at 12:31 PM, Wendy Smoak <ws...@gmail.com> wrote:

> On Thu, Jan 22, 2009 at 7:07 PM, Marica Tan <ct...@exist.com> wrote:
> > IMO the master needs to keep track of the revisions so
> > that when agent 1 tries to build project @ 4:00pm it will only update the
> > working copy but it won't build the project.
>
> I'd like to avoid having the master check out the source code.  That
> would require the master to have enough disk space for *all* the
> projects, when all the master really needs to do is serve the webapp
> and fire off build on schedules.
>
> > Our initial plan is to have a dumb build agent so all it knows is how to
> > build. When it's a scheduled build, it will always build regardless if
> > there is or there isn't any change at all. We can add the check for
> > whether it should build or not in the next pass.
>
> Are you saying this is how it works in 1.3.1, that a scheduled build
> always builds regardless of whether there are scm changes?
>

Yes, in a distributed build.

>
> > It first updates the working copy and then set the project's scm result
> > (with scm changes). Project has a one to one relationship with ScmResult.
> > Everytime you update the working copy, it merges the new scm changes with
> > the old scm changes unless it says build fresh.
> >
> > Currently, no scm changes is returned to the master in a distributed
> > build and I'll be working on that next.
>
> How will this work?  Any given agent isn't guaranteed to even have the
> code checked out.
>
>

Add project
1:00 Build on agent 1, checkout r100, scm result has no scm changes -->
should build because it's a first build
1:30 commit changes
2:00 Build on agent 2, checkout r101, scm result has no scm changes -->
should build because there are changes
2:30 commit changes
3:00 Build on agent 2, update r102, scm result will have scm changes from
r101 to r102 --> should build because there are changes
3:30 Build on agent 1, update r102, scm result will have scm changes from
r100 to r102 --> should not build since we already built r102 on agent 2

So here's my problem. As you can see from the scenario above, when it tried
to build on agent 2 at 2:00, it checked out the project at r101 with no
scm changes.
How can it get the changes from r100 to r101? The same thing happens at
3:00: it only gets the scm changes from r101 to r102. It can only get the
scm changes from r100 to r102 if it builds on agent 1 again, as at 3:30
(where merging of scm results will happen).

Do you know how to solve this? Any suggestions?

Here's also a possible solution for what Wendy wants (though I'm not that
familiar with version control systems, I just know how to use one, so I
don't know if this will work):

1.) agent updates checkout
2.) agent returns result with revision number?
3.) master checks whether it is up to date and performs other checks to
determine whether the agent should build the project or not. The scm result
may need an additional field like "Revision Number"
4.) master then decides whether to let the agent build or not based on #3
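
The four steps above can be sketched as follows. This is an illustration
only: RevisionTracker and its methods are hypothetical, and it assumes the
scm reports a single increasing revision number (true for Subversion, not
for CVS). The main method replays the 1:00 to 3:30 scenario from this
message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical master-side bookkeeping; not part of Continuum.
final class RevisionTracker {
    private final Map<String, Long> lastBuiltRevision = new HashMap<>();

    /** Steps 3 and 4: compare the agent's reported revision with the last built one. */
    boolean shouldBuild(String projectId, long reportedRevision) {
        Long last = lastBuiltRevision.get(projectId);
        return last == null || reportedRevision > last;
    }

    /** Remember the highest revision that has been built on any agent. */
    void recordBuild(String projectId, long revision) {
        lastBuiltRevision.merge(projectId, revision, Long::max);
    }

    public static void main(String[] args) {
        RevisionTracker master = new RevisionTracker();
        // Revisions reported after checkout/update at 1:00, 2:00, 3:00, 3:30.
        long[] reported = {100, 101, 102, 102};
        for (long revision : reported) {
            boolean build = master.shouldBuild("project", revision);
            if (build) {
                master.recordBuild("project", revision);
            }
            System.out.println("r" + revision + " -> build=" + build);
        }
        // Prints build=true for r100, r101 and the first r102,
        // then build=false for the second r102 (the 3:30 run on agent 1).
    }
}
```

The decision no longer depends on which agent did the update, only on the
revision it reports, so a first checkout on a new agent is handled the same
way as an update.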

Is it possible to get the revision number when we do an update in continuum?

Thanks
--
Marica

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Wendy Smoak <ws...@gmail.com>.

On Thu, Jan 22, 2009 at 7:07 PM, Marica Tan <ct...@exist.com> wrote:
> IMO the master needs to keep track of the revisions so
> that when agent 1 tries to build project @ 4:00pm it will only update the
> working copy but it won't build the project.

I'd like to avoid having the master check out the source code.  That
would require the master to have enough disk space for *all* the
projects, when all the master really needs to do is serve the webapp
and fire off build on schedules.

> Our initial plan is to have a dumb build agent so all it knows is how to
> build. When it's a scheduled build, it will always build regardless if there
> is or there isn't any change at all. We can add the check for whether it
> should build or not in the next pass.

Are you saying this is how it works in 1.3.1, that a scheduled build
always builds regardless of whether there are scm changes?

> It first updates the working copy and then set the project's scm result
> (with scm changes). Project has a one to one relationship with ScmResult.
> Everytime you update the working copy, it merges the new scm changes with
> the old scm changes unless it says build fresh.
>
> Currently, no scm changes is returned to the master in a distributed build
> and I'll be working on that next.

How will this work?  Any given agent isn't guaranteed to even have the
code checked out.

-- 
Wendy

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Wendy Smoak <ws...@gmail.com>.

On Thu, Jan 22, 2009 at 7:19 PM, Christian Edward Gruber
<cg...@israfil.net> wrote:
> The problem with Wendy's logic is that it works really well for SVN or P4
> with atomic commits, but not so much for CVS.

There's a 'quiet period' field which AIUI is meant to keep from
building a CVS project when it's in the middle of a commit.
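
As an illustration of the idea (not of how Continuum implements the field),
a quiet-period check could look like the sketch below. QuietPeriodCheck and
isQuiet are hypothetical names, and the semantics are an assumption: a build
is allowed only once the most recent commit is at least one quiet period
old.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; the real quiet-period logic is not shown in this thread.
final class QuietPeriodCheck {

    /** True when the last commit is at least quietPeriod old at time now. */
    static boolean isQuiet(Instant lastCommit, Instant now, Duration quietPeriod) {
        return Duration.between(lastCommit, now).compareTo(quietPeriod) >= 0;
    }

    public static void main(String[] args) {
        Instant lastCommit = Instant.parse("2009-01-22T12:00:00Z");
        Duration quiet = Duration.ofSeconds(60);
        // 30 seconds after the commit: possibly still mid-commit, so wait.
        System.out.println(isQuiet(lastCommit, lastCommit.plusSeconds(30), quiet));
        // 2 minutes after the commit: safe to build.
        System.out.println(isQuiet(lastCommit, lastCommit.plusSeconds(120), quiet));
    }
}
```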

-- 
Wendy

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Christian Edward Gruber <cg...@israfil.net>.

The problem with Wendy's logic is that it works really well for SVN or
P4 with atomic commits, but not so much for CVS.  On the other hand,
the build agent checking its local workspace isn't great, because
another agent may have built.  An option would be to take a timestamp
as of the scheduled build time, and determine whether there have been
changes between the previous build's timestamp and the current
timestamp.  On most atomic commit systems this translates
immediately into a revision number, but for something like CVS, each
client can determine whether changes occurred between TimeA and TimeB.
Or maybe use the revision and fall back on timestamps for non-atomic
commit systems.
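
A sketch of the "revision first, timestamps as a fallback" option, with the
caveat stated below about not knowing the Continuum SCM infrastructure.
ChangeDetector, ScmProbe and FixedProbe are hypothetical names invented for
this example; FixedProbe is an in-memory stand-in for a real scm query.

```java
import java.time.Instant;
import java.util.List;
import java.util.OptionalLong;

// All names here are hypothetical; none come from Continuum or Maven SCM.
final class ChangeDetector {

    /** Compare revisions when both are known, otherwise query the time window. */
    static boolean hasChanges(ScmProbe scm, OptionalLong lastBuiltRevision,
                              Instant lastBuildTime, Instant scheduledTime) {
        OptionalLong head = scm.headRevision();
        if (head.isPresent() && lastBuiltRevision.isPresent()) {
            return head.getAsLong() > lastBuiltRevision.getAsLong();
        }
        return scm.changedBetween(lastBuildTime, scheduledTime);
    }

    public static void main(String[] args) {
        Instant twoPm = Instant.parse("2009-01-20T14:00:00Z");
        Instant threePm = Instant.parse("2009-01-20T15:00:00Z");
        Instant commit = Instant.parse("2009-01-20T14:15:00Z");
        ScmProbe svnLike = new FixedProbe(OptionalLong.of(501), List.of(commit));
        ScmProbe cvsLike = new FixedProbe(OptionalLong.empty(), List.of(commit));
        System.out.println(hasChanges(svnLike, OptionalLong.of(500), twoPm, threePm));
        System.out.println(hasChanges(cvsLike, OptionalLong.empty(), twoPm, threePm));
    }
}

interface ScmProbe {
    /** Repository-wide revision, or empty for systems without atomic commits. */
    OptionalLong headRevision();

    /** True if any commit falls after from and at or before to. */
    boolean changedBetween(Instant from, Instant to);
}

/** In-memory stand-in for a real scm query, used only for the example. */
final class FixedProbe implements ScmProbe {
    private final OptionalLong head;
    private final List<Instant> commitTimes;

    FixedProbe(OptionalLong head, List<Instant> commitTimes) {
        this.head = head;
        this.commitTimes = commitTimes;
    }

    public OptionalLong headRevision() {
        return head;
    }

    public boolean changedBetween(Instant from, Instant to) {
        return commitTimes.stream().anyMatch(t -> t.isAfter(from) && !t.isAfter(to));
    }
}
```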

A lot of what I said is stated without really solid understanding of  
the continuum SCM infrastructure, so I'm not sure if the features are  
there, but they should be in the underlying SCM system being used.

Christian.

On 22-Jan-09, at 21:07 , Marica Tan wrote:

> On Tue, Jan 20, 2009 at 9:15 AM, Wendy Smoak <ws...@gmail.com> wrote:
>
>> I'm trying to understand how, if a project may build on any agent,
>> Continuum can determine whether there have been scm changes since the
>> last time it was built.  Here's what I think should happen (scheduled
>> builds, assuming Always Build and Build Fresh are NOT checked):
>>
>> Add project
>> 1:00pm Build on agent 1, checkout at r500
>> 2:00pm Build on agent 2, checkout at r500 <--- no changes, project
>> should not build
>> 2:15pm developer makes scm changes
>> 3:00pm Build on agent 2 update to r501, build because there were changes
>> 4:00pm Build on agent 1 update to r501 <---- no changes, project
>> should not build
>>
>
> This is possible, but IMO the master needs to keep track of the revisions
> so that when agent 1 tries to build project @ 4:00pm it will only update
> the working copy but it won't build the project.
>
> Our initial plan is to have a dumb build agent so all it knows is how to
> build. When it's a scheduled build, it will always build regardless if
> there is or there isn't any change at all. We can add the check for
> whether it should build or not in the next pass.
>
>
>> But I'm not sure how Continuum makes its determination of whether
>> there have been changes, (even without Distributed Build.)
>>
>
> It first updates the working copy and then set the project's scm result
> (with scm changes). Project has a one to one relationship with ScmResult.
> Everytime you update the working copy, it merges the new scm changes with
> the old scm changes unless it says build fresh.
>
> Currently, no scm changes is returned to the master in a distributed
> build and I'll be working on that next.
>
>
>> This is a bit of a pain to set up and test for, so I'm hoping someone
>> will reassure me that it will all work fine. :)
>>
>> --
>> Wendy
>>

Christian E. Gruber - President / Senior Consultant
Isráfíl Consulting Services Corporation
email:  cgruber@israfil.net
mobile: +1 (289) 221-9839
web:    http://www.israfil.net/
"...keenness of understanding is due to keenness of vision."

Re: In distributed build, how does Continuum decide whether there have been scm changes?

Posted by Marica Tan <ct...@exist.com>.

On Tue, Jan 20, 2009 at 9:15 AM, Wendy Smoak <ws...@gmail.com> wrote:

> I'm trying to understand how, if a project may build on any agent,
> Continuum can determine whether there have been scm changes since the
> last time it was built.  Here's what I think should happen (scheduled
> builds, assuming Always Build and Build Fresh are NOT checked):
>
> Add project
> 1:00pm Build on agent 1, checkout at r500
> 2:00pm Build on agent 2, checkout at r500 <--- no changes, project
> should not build
> 2:15pm developer makes scm changes
> 3:00pm Build on agent 2 update to r501, build because there were changes
> 4:00pm Build on agent 1 update to r501 <---- no changes, project
> should not build
>

This is possible, but IMO the master needs to keep track of the revisions so
that when agent 1 tries to build project @ 4:00pm it will only update the
working copy but it won't build the project.

Our initial plan is to have a dumb build agent, so all it knows is how to
build. When it's a scheduled build, it will always build regardless of
whether there are any changes. We can add the check for whether it
should build or not in the next pass.

> But I'm not sure how Continuum makes its determination of whether
> there have been changes, (even without Distributed Build.)
>

It first updates the working copy and then sets the project's scm result
(with scm changes). Project has a one-to-one relationship with ScmResult.
Every time you update the working copy, it merges the new scm changes with
the old scm changes unless Build Fresh is checked.
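
To illustrate the merging behaviour, here is a simplified sketch. It is not
the Continuum ScmResult class: ProjectScmChanges is a hypothetical stand-in
that stores changes as plain strings, and it assumes that a fresh build
discards the earlier changes instead of merging with them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the single scm result kept per project.
final class ProjectScmChanges {
    private final List<String> changes = new ArrayList<>();

    /** Merge the changes from an update into the project's stored changes. */
    void applyUpdate(List<String> newChanges, boolean buildFresh) {
        if (buildFresh) {
            changes.clear(); // assumption: a fresh build drops earlier changes
        }
        changes.addAll(newChanges);
    }

    /** Read-only snapshot of the merged changes. */
    List<String> changes() {
        return List.copyOf(changes);
    }

    public static void main(String[] args) {
        ProjectScmChanges project = new ProjectScmChanges();
        project.applyUpdate(List.of("r101: fix typo"), false);
        project.applyUpdate(List.of("r102: add test"), false);
        System.out.println(project.changes()); // both changes, merged
        project.applyUpdate(List.of("r103: refactor"), true);
        System.out.println(project.changes()); // only the r103 change
    }
}
```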

Currently, no scm changes are returned to the master in a distributed build,
and I'll be working on that next.

> This is a bit of a pain to set up and test for, so I'm hoping someone
> will reassure me that it will all work fine. :)
>
> --
> Wendy
>