You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by David Glasser <gl...@davidglasser.net> on 2007/10/18 20:30:07 UTC

Nasty double-replace copy-on-update bug

Repro script attached (maybe I'll make a Python version after lunch).

Note that this bug is only triggered if, in the final checkout, the
server decides to tell the client about $FN2 *before* telling about
$FN1, which depends on exactly what order the WC crawler and the FS
get_dir return their entries; just changing the filenames is enough to
avoid the bug.

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
Who needs caffeine, when you can sit in the same cube with glasser for
2 hours?  :-)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/20/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> On 10/20/07, David Glasser <gl...@davidglasser.net> wrote:
> > On 10/20/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > On 10/19/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > > > > "Ben Collins-Sussman" <su...@red-bean.com> writes:
> > > > >
> > > > > > If I were to remove this forced log execution, I'd have to come up
> > > > > > with some clever way of allowing the textdelta to apply against the
> > > > > > not-quite-real-yet file.   Hm.
> > > > >
> > > > > I suppose that ideally you would apply the textdelta to a temporary
> > > > > file in .svn/tmp and then write log entries to move it into place, I
> > > > > don't know how hard that would be.
> > > > >
> > > > > On a side issue, does this copyfrom stuff have a performance issue?
> > > > > During a checkout none of the copyfrom sources are going to exist in
> > > > > the working copy and yet rather than simply retrieve the final version
> > > > > the code first retrieves an old version and then retrieves a delta.
> > > > > In an environment where all development is done on branches it's
> > > > > possible that all files will have copyfrom data.
> > > > >
> > > >
> > > > epg brought this up, since we first discovered this bug while doing a
> > > > large checkout.  :-)
> > > >
> > > > I think what you say above is a good argument for 'svn checkout' to
> > > > explicitly *not* request that the server send copyfrom args, and
> > > > instead let the server behave the old way (just always send fulltexts
> > > > to the client).   'svn update', on the other hand, can continue to
> > > > request copyfrom args.
> > > >
> > >
> > > And, while glasser and I are still mulling about how to solve the 2
> > > larger bugs here, I've committed this change in r27297.
> > > svn_client_checkout() now asks the server to only send fulltexts
> > > (ignoring copy history), while svn_client_update() asks the server to
> > > send copyfrom-args on add_file() if it can.
> >
> > OK, but we do recognize that this is merely a heuristic, right?
> > Anything that can go wrong during a checkout can also go wrong during
> > an update...
> >
>
> Sure, this doesn't solve the bugs;  it's just a speedup for checkouts.
>  The original point of the copy-on-update feature is to preserve
> peoples local mods better during updates;  during a checkout, there's
> no point in going through all this extra work to do that.  There are
> no local mods.

Ben and I sat down and fixed the particular issue that was written up
in Issue 2977.  We have further plans for improving the use of loggy
in this code, described in Issue 2986; I plan to work on this
tomorrow.

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/20/07, David Glasser <gl...@davidglasser.net> wrote:
> On 10/20/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > On 10/19/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > > > "Ben Collins-Sussman" <su...@red-bean.com> writes:
> > > >
> > > > > If I were to remove this forced log execution, I'd have to come up
> > > > > with some clever way of allowing the textdelta to apply against the
> > > > > not-quite-real-yet file.   Hm.
> > > >
> > > > I suppose that ideally you would apply the textdelta to a temporary
> > > > file in .svn/tmp and then write log entries to move it into place, I
> > > > don't know how hard that would be.
> > > >
> > > > On a side issue, does this copyfrom stuff have a performance issue?
> > > > During a checkout none of the copyfrom sources are going to exist in
> > > > the working copy and yet rather than simply retrieve the final version
> > > > the code first retrieves an old version and then retrieves a delta.
> > > > In an environment where all development is done on branches it's
> > > > possible that all files will have copyfrom data.
> > > >
> > >
> > > epg brought this up, since we first discovered this bug while doing a
> > > large checkout.  :-)
> > >
> > > I think what you say above is a good argument for 'svn checkout' to
> > > explicitly *not* request that the server send copyfrom args, and
> > > instead let the server behave the old way (just always send fulltexts
> > > to the client).   'svn update', on the other hand, can continue to
> > > request copyfrom args.
> > >
> >
> > And, while glasser and I are still mulling about how to solve the 2
> > larger bugs here, I've committed this change in r27297.
> > svn_client_checkout() now asks the server to only send fulltexts
> > (ignoring copy history), while svn_client_update() asks the server to
> > send copyfrom-args on add_file() if it can.
>
> OK, but we do recognize that this is merely a heuristic, right?
> Anything that can go wrong during a checkout can also go wrong during
> an update...
>

Sure, this doesn't solve the bugs;  it's just a speedup for checkouts.
 The original point of the copy-on-update feature is to preserve
peoples local mods better during updates;  during a checkout, there's
no point in going through all this extra work to do that.  There are
no local mods.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/20/07, Philip Martin <ph...@codematters.co.uk> wrote:
> "Ben Collins-Sussman" <su...@red-bean.com> writes:
>
> > And, while glasser and I are still mulling about how to solve the 2
> > larger bugs here, I've committed this change in r27297.
> > svn_client_checkout() now asks the server to only send fulltexts
> > (ignoring copy history), while svn_client_update() asks the server to
> > send copyfrom-args on add_file() if it can.
>
> There is another performance issue with flushing the log files early,
> each time log files are run the entries file gets rewritten and so for
> large directories the O(N^2) file IO can dominate.
>
> Another thing that occurs to me is the locking behaviour of
> locate_copyfrom.  I believe that if locate_copyfrom is going to
> identify files outside the access baton set then it needs to take
> write locks, but it should probably not treat a failure to get such a
> ock as an error but should fall back on getting the file over RA
> instead.  Consider a large working copy where the user runs update on
> two separate subtrees, a reasonable thing to do if only changes in
> those subtrees are of interest.  Well suppose a file has been
> moved/copied from one subtree to the other, do we want one update to
> fail part way through because it cannot get a write lock on the other
> locked subtree?  I think a user would expect to be able to run
> concurrent updates on separate subtrees in a single working copy.

Philip, can you take a look at my patch in the thread "[PATCH] Fix
issue 2986"?  It fixes the loggy problems, though not the
locate_copyfrom things.

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/20/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> On 10/19/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > > "Ben Collins-Sussman" <su...@red-bean.com> writes:
> > >
> > > > If I were to remove this forced log execution, I'd have to come up
> > > > with some clever way of allowing the textdelta to apply against the
> > > > not-quite-real-yet file.   Hm.
> > >
> > > I suppose that ideally you would apply the textdelta to a temporary
> > > file in .svn/tmp and then write log entries to move it into place, I
> > > don't know how hard that would be.
> > >
> > > On a side issue, does this copyfrom stuff have a performance issue?
> > > During a checkout none of the copyfrom sources are going to exist in
> > > the working copy and yet rather than simply retrieve the final version
> > > the code first retrieves an old version and then retrieves a delta.
> > > In an environment where all development is done on branches it's
> > > possible that all files will have copyfrom data.
> > >
> >
> > epg brought this up, since we first discovered this bug while doing a
> > large checkout.  :-)
> >
> > I think what you say above is a good argument for 'svn checkout' to
> > explicitly *not* request that the server send copyfrom args, and
> > instead let the server behave the old way (just always send fulltexts
> > to the client).   'svn update', on the other hand, can continue to
> > request copyfrom args.
> >
>
> And, while glasser and I are still mulling about how to solve the 2
> larger bugs here, I've committed this change in r27297.
> svn_client_checkout() now asks the server to only send fulltexts
> (ignoring copy history), while svn_client_update() asks the server to
> send copyfrom-args on add_file() if it can.

OK, but we do recognize that this is merely a heuristic, right?
Anything that can go wrong during a checkout can also go wrong during
an update...

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"Ben Collins-Sussman" <su...@red-bean.com> writes:

> And, while glasser and I are still mulling about how to solve the 2
> larger bugs here, I've committed this change in r27297.
> svn_client_checkout() now asks the server to only send fulltexts
> (ignoring copy history), while svn_client_update() asks the server to
> send copyfrom-args on add_file() if it can.

There is another performance issue with flushing the log files early,
each time log files are run the entries file gets rewritten and so for
large directories the O(N^2) file IO can dominate.

Another thing that occurs to me is the locking behaviour of
locate_copyfrom.  I believe that if locate_copyfrom is going to
identify files outside the access baton set then it needs to take
write locks, but it should probably not treat a failure to get such a
ock as an error but should fall back on getting the file over RA
instead.  Consider a large working copy where the user runs update on
two separate subtrees, a reasonable thing to do if only changes in
those subtrees are of interest.  Well suppose a file has been
moved/copied from one subtree to the other, do we want one update to
fail part way through because it cannot get a write lock on the other
locked subtree?  I think a user would expect to be able to run
concurrent updates on separate subtrees in a single working copy.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/19/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > "Ben Collins-Sussman" <su...@red-bean.com> writes:
> >
> > > If I were to remove this forced log execution, I'd have to come up
> > > with some clever way of allowing the textdelta to apply against the
> > > not-quite-real-yet file.   Hm.
> >
> > I suppose that ideally you would apply the textdelta to a temporary
> > file in .svn/tmp and then write log entries to move it into place, I
> > don't know how hard that would be.
> >
> > On a side issue, does this copyfrom stuff have a performance issue?
> > During a checkout none of the copyfrom sources are going to exist in
> > the working copy and yet rather than simply retrieve the final version
> > the code first retrieves an old version and then retrieves a delta.
> > In an environment where all development is done on branches it's
> > possible that all files will have copyfrom data.
> >
>
> epg brought this up, since we first discovered this bug while doing a
> large checkout.  :-)
>
> I think what you say above is a good argument for 'svn checkout' to
> explicitly *not* request that the server send copyfrom args, and
> instead let the server behave the old way (just always send fulltexts
> to the client).   'svn update', on the other hand, can continue to
> request copyfrom args.
>

And, while glasser and I are still mulling about how to solve the 2
larger bugs here, I've committed this change in r27297.
svn_client_checkout() now asks the server to only send fulltexts
(ignoring copy history), while svn_client_update() asks the server to
send copyfrom-args on add_file() if it can.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> "Ben Collins-Sussman" <su...@red-bean.com> writes:
>
> > If I were to remove this forced log execution, I'd have to come up
> > with some clever way of allowing the textdelta to apply against the
> > not-quite-real-yet file.   Hm.
>
> I suppose that ideally you would apply the textdelta to a temporary
> file in .svn/tmp and then write log entries to move it into place, I
> don't know how hard that would be.
>
> On a side issue, does this copyfrom stuff have a performance issue?
> During a checkout none of the copyfrom sources are going to exist in
> the working copy and yet rather than simply retrieve the final version
> the code first retrieves an old version and then retrieves a delta.
> In an environment where all development is done on branches it's
> possible that all files will have copyfrom data.
>

epg brought this up, since we first discovered this bug while doing a
large checkout.  :-)

I think what you say above is a good argument for 'svn checkout' to
explicitly *not* request that the server send copyfrom args, and
instead let the server behave the old way (just always send fulltexts
to the client).   'svn update', on the other hand, can continue to
request copyfrom args.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"Ben Collins-Sussman" <su...@red-bean.com> writes:

> If I were to remove this forced log execution, I'd have to come up
> with some clever way of allowing the textdelta to apply against the
> not-quite-real-yet file.   Hm.

I suppose that ideally you would apply the textdelta to a temporary
file in .svn/tmp and then write log entries to move it into place, I
don't know how hard that would be.

On a side issue, does this copyfrom stuff have a performance issue?
During a checkout none of the copyfrom sources are going to exist in
the working copy and yet rather than simply retrieve the final version
the code first retrieves an old version and then retrieves a delta.
In an environment where all development is done on branches it's
possible that all files will have copyfrom data.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:

> I think you could make the test pass by avoiding copying the
> committed-rev, so that the temporary became alpha@-1, and by making
> locate_copyfrom recognise that -1 is not valid committed-rev.  Then
> when attempting to construct aardvark@3 the code would retrieve
> alpha@1 from the repository.  Although the test might pass it would
> *not* fix the problem that the working copy is temporarily in an
> invalid state.

Thanks for the clear explanation.

So yes, the original bug has revealed a bigger bug:  that it's
inherently unsafe to execute the parent's log in the middle of the
add_file() operation.   I'm currently executing the parent's log
manually after calling close_file() so that the file "actually
exists", so that the server's apply_textdelta() successfully applies.

If I were to remove this forced log execution, I'd have to come up
with some clever way of allowing the textdelta to apply against the
not-quite-real-yet file.   Hm.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"Ben Collins-Sussman" <su...@red-bean.com> writes:

> Huh?  The whole bug here is that alpha isn't fetched via RA, it's
> being copied from an existing wc file.
>
> Let me walk over update-test #40 again;  I think I don't understand
> where you've gdb'd.

There are *two* sets of copied-from date.  The checkout attempts to
construct file alpha@3, which is copied from gamma@1 and subsequently
modified, and aardvark@3, which is copied from alpha@1 and
subsequently modified.  However while constructing alpha@3 it
temporarily constructs something it calls alpha@1, by copying from
gamma@1 including the committed-rev, which it will later convert into
alpha@3.  Note that the thing called alpha@1 in the wc isn't the same
as alpha@1 in the repository, it's a copy of gamma@1.  When
constructing aardvark@3 the code finds the temporary alpha@1 and uses
it, which is where it all falls down.

I quit out of gdb just after running the log file that created
alpha@1.

I think you could make the test pass by avoiding copying the
committed-rev, so that the temporary became alpha@-1, and by making
locate_copyfrom recognise that -1 is not valid committed-rev.  Then
when attempting to construct aardvark@3 the code would retrieve
alpha@1 from the repository.  Although the test might pass it would
*not* fix the problem that the working copy is temporarily in an
invalid state.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:

> If I quit out of gdb after running the first log, and then run cleanup
> I get:
>
>   $ svn st -uv wc
>          *                                wc/aardvark
>                   3        1 pm           wc/alpha
>   !      *        3       ?   ?           wc
>   Status against revision:      3
>   $ cat wc/alpha
>   This is the file 'g123'.
>   $ cat wc/.svn/text-base/alpha.svn-base
>   This is the file 'g123'.
>
> Notice that alpha is showen as up-to-date despite having
> committed-rev=1 and the wrong working text and text-base.  That's the
> sort of thing the log files are supposed to prevent.  I can't run
> update, because of the bug, but even if I could how would the client
> know to update alpha?

Argh!  I'm trying to see if I understand the situation here:  because
we're writing & executing a log "early", we're writing an out-of-date
committed-rev value to .svn/entries?   Whereas if we had allowed the
log to be written and executed at normal close_directory() time, the
log would have included the correct information?

I think I'm confused... at what point is the correct data ever written
to the loggy?



> > The props in memory are then applied to the file-baton.  The
> > difference, though, is that the first blocks just loads up typical
> > user-created properties, while svn_ra_get_file() returns *all* props:
> > normal user props, but also "weird" live properties like the
> > svn:wc:entry:* props that get cached in .svn/entries.   (This is
> > perfectly fine;  close_file() knows how to filter these different
> > types of props.)
> >
> > So in the case of an svn_ra_get_file() fetch, the
> > svn:wc:entry:last-committed-rev property is definitely being fetched
> > and pushed into the file-baton.  But when we're copying an existing
> > file's properties over, we're not dup'ing the last-committed-rev prop
> > at all.
>
> In this case alpha is created by doing svn_ra_get_file(gamma@1) and
> copying committed-rev=1.  The log file that creates alpha sets
> committed-rev=1.  I don't see how that can be anything other than a
> bug.

Huh?  The whole bug here is that alpha isn't fetched via RA, it's
being copied from an existing wc file.

Let me walk over update-test #40 again;  I think I don't understand
where you've gdb'd.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"Ben Collins-Sussman" <su...@red-bean.com> writes:

> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
>
>> How the logs and cache should interact is a difficult question to
>> answer.  We have the current behaviour which is the result of
>> evolution rather than any grand unified design.
>
> Tell me about it.  :-)
>
>>
>> Why does add_file_with_history flush and run the log, the one with the
>> wrong committed-rev?  I guess it's to enable a text-delta to be
>> applied, but it does seem to be sort of in the middle of things.  Is
>> the working copy consistent if the process gets interrupted after the
>> log has been run?  If there is no log file cleanup will simply remove
>> the locks and assume the working copy is OK.
>
> We deliberately flush the parent-dir's log, otherwise the newly
> "installed" file doesn't exist.  As you realized, it needs to exist so
> that the server can (possibly) send a subsequent apply_textdelta()
> against it.  I added this behavior after chatting with Erik Huelsmann,
> who explained that files which are created/installed by
> add_file()/close_file() pairs don't truly come into existence until
> the parent directory is explicitly closed via close_directory();  our
> best solution was to run the parent-dir's existing log "early" within
> this routine, rather than waiting till later.  It should be safe,
> though.  Notice that the log is reset properly, so that a new loggy
> for the directory (and it's incoming children) can be created and
> used.  If the checkout is interrupted right after the parent's log is
> run 'early', I see no reason why the wc would be inconsistent.  The
> directory should still be marked 'incomplete', thus allowing for a
> restartable update.

If I quit out of gdb after running the first log, and then run cleanup
I get:

  $ svn st -uv wc
         *                                wc/aardvark
                  3        1 pm           wc/alpha
  !      *        3       ?   ?           wc
  Status against revision:      3
  $ cat wc/alpha
  This is the file 'g123'.
  $ cat wc/.svn/text-base/alpha.svn-base 
  This is the file 'g123'.

Notice that alpha is showen as up-to-date despite having
committed-rev=1 and the wrong working text and text-base.  That's the
sort of thing the log files are supposed to prevent.  I can't run
update, because of the bug, but even if I could how would the client
know to update alpha?

>> >> Perhaps properties like committed-rev should be filtered out of then
>> >> above loop?  Perhaps only regular props should be transferred?
>> >
>> > Maybe, but that seems like sidestepping the problem, which is that the
>> > left hand of the library knows something (that the server has set
>> > committed-rev) that the right hand doesn't...
>>
>> I think that transferring committed-rev from one file to another is a
>> bug, that's not sidestepping anything.
>
> I don't think this is actually happening.
>
> Notice the two blocks above the loop where properties are applied to
> the file-baton.  If locate_copyfrom() finds a candidate, then it
> copies the candidate's text-base into place and loads the candidate's
> base-props into memory.   In the alternative block (where no candidate
> can be found), we call svn_ra_get_file() to fetch text to disk and
> props into memory.
>
> The props in memory are then applied to the file-baton.  The
> difference, though, is that the first blocks just loads up typical
> user-created properties, while svn_ra_get_file() returns *all* props:
> normal user props, but also "weird" live properties like the
> svn:wc:entry:* props that get cached in .svn/entries.   (This is
> perfectly fine;  close_file() knows how to filter these different
> types of props.)
>
> So in the case of an svn_ra_get_file() fetch, the
> svn:wc:entry:last-committed-rev property is definitely being fetched
> and pushed into the file-baton.  But when we're copying an existing
> file's properties over, we're not dup'ing the last-committed-rev prop
> at all.

In this case alpha is created by doing svn_ra_get_file(gamma@1) and
copying committed-rev=1.  The log file that creates alpha sets
committed-rev=1.  I don't see how that can be anything other than a
bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:

> How the logs and cache should interact is a difficult question to
> answer.  We have the current behaviour which is the result of
> evolution rather than any grand unified design.

Tell me about it.  :-)

>
> Why does add_file_with_history flush and run the log, the one with the
> wrong committed-rev?  I guess it's to enable a text-delta to be
> applied, but it does seem to be sort of in the middle of things.  Is
> the working copy consistent if the process gets interrupted after the
> log has been run?  If there is no log file cleanup will simply remove
> the locks and assume the working copy is OK.

We deliberately flush the parent-dir's log, otherwise the newly
"installed" file doesn't exist.  As you realized, it needs to exist so
that the server can (possibly) send a subsequent apply_textdelta()
against it.  I added this behavior after chatting with Erik Huelsmann,
who explained that files which are created/installed by
add_file()/close_file() pairs don't truly come into existence until
the parent directory is explicitly closed via close_directory();  our
best solution was to run the parent-dir's existing log "early" within
this routine, rather than waiting till later.  It should be safe,
though.  Notice that the log is reset properly, so that a new loggy
for the directory (and it's incoming children) can be created and
used.  If the checkout is interrupted right after the parent's log is
run 'early', I see no reason why the wc would be inconsistent.  The
directory should still be marked 'incomplete', thus allowing for a
restartable update.

> >> Perhaps properties like committed-rev should be filtered out of then
> >> above loop?  Perhaps only regular props should be transferred?
> >
> > Maybe, but that seems like sidestepping the problem, which is that the
> > left hand of the library knows something (that the server has set
> > committed-rev) that the right hand doesn't...
>
> I think that transferring committed-rev from one file to another is a
> bug, that's not sidestepping anything.

I don't think this is actually happening.

Notice the two blocks above the loop where properties are applied to
the file-baton.  If locate_copyfrom() finds a candidate, then it
copies the candidate's text-base into place and loads the candidate's
base-props into memory.   In the alternative block (where no candidate
can be found), we call svn_ra_get_file() to fetch text to disk and
props into memory.

The props in memory are then applied to the file-baton.  The
difference, though, is that the first blocks just loads up typical
user-created properties, while svn_ra_get_file() returns *all* props:
normal user props, but also "weird" live properties like the
svn:wc:entry:* props that get cached in .svn/entries.   (This is
perfectly fine;  close_file() knows how to filter these different
types of props.)

So in the case of an svn_ra_get_file() fetch, the
svn:wc:entry:last-committed-rev property is definitely being fetched
and pushed into the file-baton.  But when we're copying an existing
file's properties over, we're not dup'ing the last-committed-rev prop
at all.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 10/19/07, David Glasser <gl...@davidglasser.net> wrote:
> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > "David Glasser" <gl...@davidglasser.net> writes:
> >
> > > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> > >> "Ben Collins-Sussman" <su...@red-bean.com> writes:
> > >>
> > >> > The question is:  why the wc-searching routine
> > >> > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > >> > version of the file, when it's not?  Answer:  it seems to be getting
> > >> > stale svn_wc_entry_t information.   It's reading a couple of entries
> > >> > files from disk, rather than from a more recent adm_access set still
> > >> > held in memory, and getting an incorrect (stale) last-changed-rev
> > >> > value.    From poking around in gdb, it seems that the entry_t's which
> > >> > are write-cached in memory are correct -- they have the proper
> > >> > last-changed-rev values, but simply aren't flushed to disk yet.
> > >> >
> > >> > I attempted to solve this by passing in the edit_baton->access_set to
> > >> > the searching routine, but it's not fixing the problem.  I've attached
> > >> > my patch-in-progress below.  Can we get some more eyes on this
> > >> > problem?
> > >>
> > >> I think there are two bugs here.  I believe it is wrong for
> > >> locate_copyfrom to use a read-only snapshot of the working copy, your
> > >> patch to pass an access baton looks like the right thing to do,
> > >> although I think that the _try calls should take write locks.  However
> > >> that's not what is causing the current failure.
> > >
> > > Why is that?  It sounds like a read-only operation...
> >
> > Without a write lock some other Subversion process could change or
> > delete the file in the interval between the decision to use a file and
> > the moment it gets used.
>
> Huh.  When is it ever safe to not take the write lock, then?

I must admit I was wondering the same thing...

-garrett

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"David Glasser" <gl...@davidglasser.net> writes:

> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
>> "David Glasser" <gl...@davidglasser.net> writes:
>>
>> > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
>> >>
>> >> Without a write lock some other Subversion process could change or
>> >> delete the file in the interval between the decision to use a file and
>> >> the moment it gets used.
>> >
>> > Huh.  When is it ever safe to not take the write lock, then?
>>
>> When doing a read-only operation such as status.
>
> OK, so what matters is the overall user-level (svn_wc-level?)
> operation, not the particular purpose that the adm_access is being
> grabbed for?  (Because here, the baton is being grabbed just for
> reading; a different, locked baton is used to write...)

Yes.  The locks are not read-write locks, I suppose they could be
described as "write-exclusive" locks as a lock excludes other writers.
In the past write operations that I worked on grabbed all needed locks
before starting, attempting to extend an access baton set was tricky.
I don't really know how the _try interface works and I don't know
whether it properly closes batons in the presence of "holes" (see the
comments above lock.c:do_close).  I've mostly given up hacking on the
wc code because it is simply too horrible.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> "David Glasser" <gl...@davidglasser.net> writes:
>
> > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> >>
> >> Without a write lock some other Subversion process could change or
> >> delete the file in the interval between the decision to use a file and
> >> the moment it gets used.
> >
> > Huh.  When is it ever safe to not take the write lock, then?
>
> When doing a read-only operation such as status.

OK, so what matters is the overall user-level (svn_wc-level?)
operation, not the particular purpose that the adm_access is being
grabbed for?  (Because here, the baton is being grabbed just for
reading; a different, locked baton is used to write...)

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"David Glasser" <gl...@davidglasser.net> writes:

> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
>>
>> Without a write lock some other Subversion process could change or
>> delete the file in the interval between the decision to use a file and
>> the moment it gets used.
>
> Huh.  When is it ever safe to not take the write lock, then?

When doing a read-only operation such as status.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> "David Glasser" <gl...@davidglasser.net> writes:
>
> > On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> >> "Ben Collins-Sussman" <su...@red-bean.com> writes:
> >>
> >> > The question is:  why the wc-searching routine
> >> > (update_editor.c:locate_copyfrom()) think that it's found the correct
> >> > version of the file, when it's not?  Answer:  it seems to be getting
> >> > stale svn_wc_entry_t information.   It's reading a couple of entries
> >> > files from disk, rather than from a more recent adm_access set still
> >> > held in memory, and getting an incorrect (stale) last-changed-rev
> >> > value.    From poking around in gdb, it seems that the entry_t's which
> >> > are write-cached in memory are correct -- they have the proper
> >> > last-changed-rev values, but simply aren't flushed to disk yet.
> >> >
> >> > I attempted to solve this by passing in the edit_baton->access_set to
> >> > the searching routine, but it's not fixing the problem.  I've attached
> >> > my patch-in-progress below.  Can we get some more eyes on this
> >> > problem?
> >>
> >> I think there are two bugs here.  I believe it is wrong for
> >> locate_copyfrom to use a read-only snapshot of the working copy, your
> >> patch to pass an access baton looks like the right thing to do,
> >> although I think that the _try calls should take write locks.  However
> >> that's not what is causing the current failure.
> >
> > Why is that?  It sounds like a read-only operation...
>
> Without a write lock some other Subversion process could change or
> delete the file in the interval between the decision to use a file and
> the moment it gets used.

Huh.  When is it ever safe to not take the write lock, then?

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"David Glasser" <gl...@davidglasser.net> writes:

> On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
>> "Ben Collins-Sussman" <su...@red-bean.com> writes:
>>
>> > The question is:  why the wc-searching routine
>> > (update_editor.c:locate_copyfrom()) think that it's found the correct
>> > version of the file, when it's not?  Answer:  it seems to be getting
>> > stale svn_wc_entry_t information.   It's reading a couple of entries
>> > files from disk, rather than from a more recent adm_access set still
>> > held in memory, and getting an incorrect (stale) last-changed-rev
>> > value.    From poking around in gdb, it seems that the entry_t's which
>> > are write-cached in memory are correct -- they have the proper
>> > last-changed-rev values, but simply aren't flushed to disk yet.
>> >
>> > I attempted to solve this by passing in the edit_baton->access_set to
>> > the searching routine, but it's not fixing the problem.  I've attached
>> > my patch-in-progress below.  Can we get some more eyes on this
>> > problem?
>>
>> I think there are two bugs here.  I believe it is wrong for
>> locate_copyfrom to use a read-only snapshot of the working copy, your
>> patch to pass an access baton looks like the right thing to do,
>> although I think that the _try calls should take write locks.  However
>> that's not what is causing the current failure.
>
> Why is that?  It sounds like a read-only operation...

Without a write lock some other Subversion process could change or
delete the file in the interval between the decision to use a file and
the moment it gets used.

>> The reason that locate_copyfrom thinks it has the right file is that
>> when alpha first gets added it has committed-rev=1 and that meets the
>> conditions required by locate_copyfrom.  This happens because alpha
>> itself has copyfrom info and update_editor.c:add_file_with_history
>> calls eb->fetch_func to get gamma@1 from the repository and then there
>> is a bit of code that does:
>>
>>   /* Loop over whatever base-props we have in memory, faking change_file_prop()
>>      calls against the file baton. */
>>   for (hi = apr_hash_first(pool, base_props); hi; hi = apr_hash_next(hi))
>>     {
>>       const void *key;
>>       void *val;
>>       const char *propname;
>>       svn_string_t *propval;
>>       apr_hash_this(hi, &key, NULL, &val);
>>       propname = key;
>>       propval = val;
>>       SVN_ERR(change_file_prop(tfb, propname, propval, pool));
>>     }
>>
>> which transfers the gamma@1 props to the new alpha, including
>> committed-rev=1.  Later on alpha will get corrected to have
>> committed-rev=3 but before that something else happens to look for
>> alpha@1.
>
> Well, it's not really "before that".  Based on the editor drive that's
> happening, the server has already told the client that commited-rev=3;
> this information is in a loggy that has not yet been executed.  Should
> the cached entries in the baton be updated when the loggy is written
> or when it is executed?

How the logs and cache should interact is a difficult question to
answer.  We have the current behaviour which is the result of
evolution rather than any grand unified design.

Why does add_file_with_history flush and run the log, the one with the
wrong committed-rev?  I guess it's to enable a text-delta to be
applied, but it does seem to be sort of in the middle of things.  Is
the working copy consistent if the process gets interrupted after the
log has been run?  If there is no log file cleanup will simply remove
the locks and assume the working copy is OK.

> See Issue 2977 for a more detailed explanation of what events occur.

That might have saved me some time.  I hate the issue tracker
diverting information from the dev list.

>> Perhaps properties like committed-rev should be filtered out of then
>> above loop?  Perhaps only regular props should be transferred?
>
> Maybe, but that seems like sidestepping the problem, which is that the
> left hand of the library knows something (that the server has set
> committed-rev) that the right hand doesn't...

I think that transferring committed-rev from one file to another is a
bug, that's not sidestepping anything.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/19/07, Philip Martin <ph...@codematters.co.uk> wrote:
> "Ben Collins-Sussman" <su...@red-bean.com> writes:
>
> > The question is:  why the wc-searching routine
> > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > version of the file, when it's not?  Answer:  it seems to be getting
> > stale svn_wc_entry_t information.   It's reading a couple of entries
> > files from disk, rather than from a more recent adm_access set still
> > held in memory, and getting an incorrect (stale) last-changed-rev
> > value.    From poking around in gdb, it seems that the entry_t's which
> > are write-cached in memory are correct -- they have the proper
> > last-changed-rev values, but simply aren't flushed to disk yet.
> >
> > I attempted to solve this by passing in the edit_baton->access_set to
> > the searching routine, but it's not fixing the problem.  I've attached
> > my patch-in-progress below.  Can we get some more eyes on this
> > problem?
>
> I think there are two bugs here.  I believe it is wrong for
> locate_copyfrom to use a read-only snapshot of the working copy, your
> patch to pass an access baton looks like the right thing to do,
> although I think that the _try calls should take write locks.  However
> that's not what is causing the current failure.

Why is that?  It sounds like a read-only operation...

> The reason that locate_copyfrom thinks it has the right file is that
> when alpha first gets added it has committed-rev=1 and that meets the
> conditions required by locate_copyfrom.  This happens because alpha
> itself has copyfrom info and update_editor.c:add_file_with_history
> calls eb->fetch_func to get gamma@1 from the repository and then there
> is a bit of code that does:
>
>   /* Loop over whatever base-props we have in memory, faking change_file_prop()
>      calls against the file baton. */
>   for (hi = apr_hash_first(pool, base_props); hi; hi = apr_hash_next(hi))
>     {
>       const void *key;
>       void *val;
>       const char *propname;
>       svn_string_t *propval;
>       apr_hash_this(hi, &key, NULL, &val);
>       propname = key;
>       propval = val;
>       SVN_ERR(change_file_prop(tfb, propname, propval, pool));
>     }
>
> which transfers the gamma@1 props to the new alpha, including
> committed-rev=1.  Later on alpha will get corrected to have
> committed-rev=3 but before that something else happens to look for
> alpha@1.

Well, it's not really "before that".  Based on the editor drive that's
happening, the server has already told the client that commited-rev=3;
this information is in a loggy that has not yet been executed.  Should
the cached entries in the baton be updated when the loggy is written
or when it is executed?

See Issue 2977 for a more detailed explanation of what events occur.

> Perhaps properties like committed-rev should be filtered out of then
> above loop?  Perhaps only regular props should be transferred?

Maybe, but that seems like sidestepping the problem, which is that the
left hand of the library knows something (that the server has set
committed-rev) that the right hand doesn't...

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Philip Martin <ph...@codematters.co.uk>.
"Ben Collins-Sussman" <su...@red-bean.com> writes:

> The question is:  why the wc-searching routine
> (update_editor.c:locate_copyfrom()) think that it's found the correct
> version of the file, when it's not?  Answer:  it seems to be getting
> stale svn_wc_entry_t information.   It's reading a couple of entries
> files from disk, rather than from a more recent adm_access set still
> held in memory, and getting an incorrect (stale) last-changed-rev
> value.    From poking around in gdb, it seems that the entry_t's which
> are write-cached in memory are correct -- they have the proper
> last-changed-rev values, but simply aren't flushed to disk yet.
>
> I attempted to solve this by passing in the edit_baton->access_set to
> the searching routine, but it's not fixing the problem.  I've attached
> my patch-in-progress below.  Can we get some more eyes on this
> problem?

I think there are two bugs here.  I believe it is wrong for
locate_copyfrom to use a read-only snapshot of the working copy, your
patch to pass an access baton looks like the right thing to do,
although I think that the _try calls should take write locks.  However
that's not what is causing the current failure.

The reason that locate_copyfrom thinks it has the right file is that
when alpha first gets added it has committed-rev=1 and that meets the
conditions required by locate_copyfrom.  This happens because alpha
itself has copyfrom info and update_editor.c:add_file_with_history
calls eb->fetch_func to get gamma@1 from the repository and then there
is a bit of code that does:

  /* Loop over whatever base-props we have in memory, faking change_file_prop()
     calls against the file baton. */
  for (hi = apr_hash_first(pool, base_props); hi; hi = apr_hash_next(hi))
    {
      const void *key;
      void *val;
      const char *propname;
      svn_string_t *propval;
      apr_hash_this(hi, &key, NULL, &val);
      propname = key;
      propval = val;
      SVN_ERR(change_file_prop(tfb, propname, propval, pool));
    }

which transfers the gamma@1 props to the new alpha, including
committed-rev=1.  Later on alpha will get corrected to have
committed-rev=3 but before that something else happens to look for
alpha@1.

Perhaps properties like committed-rev should be filtered out of then
above loop?  Perhaps only regular props should be transferred?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
This is Issue #2977.  It is a release blocker.

--dave

On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> See update_test 40 as of r27286.
>
> --dave
>
> On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > ... and guess what!  This bug can even strike *without* causing an
> > error; it can give you a wc which is completely self-consistent but is
> > different from the repository.  Specifically, removing the two perl
> > lines from the previous script does that.  Here's another version
> > which actually ensures that the files end up with the right contents.
> >
> > --dave
> >
> > On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > > Grr, arg!  I've been testing with the wrong version of my script (and
> > > the one I attached earlier was wrong too).  You're right; the patch
> > > doesn't work.
> > >
> > > One thing to do might be to instrument calls to svn_wc_adm_open3.
> > >
> > > (The only difference between previous and current test is a filename.
> > > Damn bugs that depend on order of iteration...)
> > >
> > > Real test attached.
> > >
> > > --dave
> > >
> > > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > > I get the same checksum error even with my patch.
> > > >
> > > >
> > > > On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > > > > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > > > > Let me elaborate on this bug:  it's something epg, glasser and I
> > > > > > discovered fooling around within Google today.   It only exists using
> > > > > > the latest 1.5 server and client.
> > > > > >
> > > > > > The basic symptom is:  you start a checkout, and at some point the 1.5
> > > > > > server says "add_file(foo, copyfrom=bar@10".  The client searches
> > > > > > around for bar@10 in the working copy, and finds it.  Or rather it
> > > > > > *thinks* it finds it, but it's actually the wrong version of the file.
> > > > > >  So, rather than just doing a side svn_ra_get_file() fetch (as it
> > > > > > should), it installs the wrong version of the file with a quick wc-wc
> > > > > > copy.  The server then tries to apply a txdelta to the file, and the
> > > > > > checksum fails... boom, failed checkout.
> > > > > >
> > > > > > The question is:  why the wc-searching routine
> > > > > > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > > > > > version of the file, when it's not?  Answer:  it seems to be getting
> > > > > > stale svn_wc_entry_t information.   It's reading a couple of entries
> > > > > > files from disk, rather than from a more recent adm_access set still
> > > > > > held in memory, and getting an incorrect (stale) last-changed-rev
> > > > > > value.    From poking around in gdb, it seems that the entry_t's which
> > > > > > are write-cached in memory are correct -- they have the proper
> > > > > > last-changed-rev values, but simply aren't flushed to disk yet.
> > > > > >
> > > > > > I attempted to solve this by passing in the edit_baton->access_set to
> > > > > > the searching routine, but it's not fixing the problem.  I've attached
> > > > > > my patch-in-progress below.  Can we get some more eyes on this
> > > > > > problem?
> > > > >
> > > > > Define "not fixing the problem".  It makes my script pass for me...
> > > > >
> > > > > --dave
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Index: subversion/libsvn_wc/update_editor.c
> > > > > > ===================================================================
> > > > > > --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> > > > > > +++ subversion/libsvn_wc/update_editor.c        (working copy)
> > > > > > @@ -2829,6 +2829,8 @@
> > > > > >     copy for an pre-existing versioned file which is exactly equal to
> > > > > >     COPYFROM_PATH@COPYFROM_REV.
> > > > > >
> > > > > > +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> > > > > > +
> > > > > >     If the file is found, return the absolute path to it in
> > > > > >     *RETURN_PATH, as well as a (read-only) access_t for its parent in
> > > > > >     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> > > > > > @@ -2838,6 +2840,7 @@
> > > > > >                  svn_revnum_t copyfrom_rev,
> > > > > >                  const char *dest_dir,
> > > > > >                  const svn_wc_entry_t *dest_entry,
> > > > > > +                svn_wc_adm_access_t *associated,
> > > > > >                  const char **return_path,
> > > > > >                  svn_wc_adm_access_t **return_access,
> > > > > >                  apr_pool_t *pool)
> > > > > > @@ -2897,10 +2900,10 @@
> > > > > >    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
> > > > > >    if (kind != svn_node_dir)
> > > > > >      return SVN_NO_ERROR;
> > > > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> > > > > > -                         FALSE, /* open read-only, please */
> > > > > > -                         0,     /* open only this directory */
> > > > > > -                         NULL, NULL, subpool);
> > > > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> > > > > > +                              FALSE, /* open read-only, please */
> > > > > > +                              0,     /* open only this directory */
> > > > > > +                              NULL, NULL, subpool);
> > > > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > > > >      {
> > > > > >        /* The common ancestor directory isn't version-controlled. */
> > > > > > @@ -2941,10 +2944,10 @@
> > > > > >      return SVN_NO_ERROR;
> > > > > >
> > > > > >    /* Next: is the file's parent-dir under version control?   */
> > > > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> > > > > > -                         FALSE, /* open read-only, please */
> > > > > > -                         0,     /* open only the parent dir */
> > > > > > -                         NULL, NULL, pool);
> > > > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated,
> > > > > cwd_parent->data,
> > > > > > +                              FALSE, /* open read-only, please */
> > > > > > +                              0,     /* open only the parent dir */
> > > > > > +                               NULL, NULL, pool);
> > > > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > > > >      {
> > > > > >        svn_error_clear(err);
> > > > > > @@ -3052,7 +3055,7 @@
> > > > > >    /* Attempt to locate the copyfrom_path in the working copy first. */
> > > > > >    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE,
> > > > > pool));
> > > > > >    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> > > > > > -                        pb->path, path_entry,
> > > > > > +                        pb->path, path_entry, eb->adm_access,
> > > > > >                          &src_path, &src_access, pool);
> > > > > >    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
> > > > > >      svn_error_clear(err);
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> > > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> > > > For additional commands, e-mail: dev-help@subversion.tigris.org
> > > >
> > > >
> > >
> > >
> > > --
> > > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> > >
> > >
> >
> >
> > --
> > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> >
> >
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>


-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
See update_test 40 as of r27286.

--dave

On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> ... and guess what!  This bug can even strike *without* causing an
> error; it can give you a wc which is completely self-consistent but is
> different from the repository.  Specifically, removing the two perl
> lines from the previous script does that.  Here's another version
> which actually ensures that the files end up with the right contents.
>
> --dave
>
> On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > Grr, arg!  I've been testing with the wrong version of my script (and
> > the one I attached earlier was wrong too).  You're right; the patch
> > doesn't work.
> >
> > One thing to do might be to instrument calls to svn_wc_adm_open3.
> >
> > (The only difference between previous and current test is a filename.
> > Damn bugs that depend on order of iteration...)
> >
> > Real test attached.
> >
> > --dave
> >
> > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > I get the same checksum error even with my patch.
> > >
> > >
> > > On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > > > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > > > Let me elaborate on this bug:  it's something epg, glasser and I
> > > > > discovered fooling around within Google today.   It only exists using
> > > > > the latest 1.5 server and client.
> > > > >
> > > > > The basic symptom is:  you start a checkout, and at some point the 1.5
> > > > > server says "add_file(foo, copyfrom=bar@10".  The client searches
> > > > > around for bar@10 in the working copy, and finds it.  Or rather it
> > > > > *thinks* it finds it, but it's actually the wrong version of the file.
> > > > >  So, rather than just doing a side svn_ra_get_file() fetch (as it
> > > > > should), it installs the wrong version of the file with a quick wc-wc
> > > > > copy.  The server then tries to apply a txdelta to the file, and the
> > > > > checksum fails... boom, failed checkout.
> > > > >
> > > > > The question is:  why the wc-searching routine
> > > > > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > > > > version of the file, when it's not?  Answer:  it seems to be getting
> > > > > stale svn_wc_entry_t information.   It's reading a couple of entries
> > > > > files from disk, rather than from a more recent adm_access set still
> > > > > held in memory, and getting an incorrect (stale) last-changed-rev
> > > > > value.    From poking around in gdb, it seems that the entry_t's which
> > > > > are write-cached in memory are correct -- they have the proper
> > > > > last-changed-rev values, but simply aren't flushed to disk yet.
> > > > >
> > > > > I attempted to solve this by passing in the edit_baton->access_set to
> > > > > the searching routine, but it's not fixing the problem.  I've attached
> > > > > my patch-in-progress below.  Can we get some more eyes on this
> > > > > problem?
> > > >
> > > > Define "not fixing the problem".  It makes my script pass for me...
> > > >
> > > > --dave
> > > >
> > > > >
> > > > >
> > > > >
> > > > > Index: subversion/libsvn_wc/update_editor.c
> > > > > ===================================================================
> > > > > --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> > > > > +++ subversion/libsvn_wc/update_editor.c        (working copy)
> > > > > @@ -2829,6 +2829,8 @@
> > > > >     copy for an pre-existing versioned file which is exactly equal to
> > > > >     COPYFROM_PATH@COPYFROM_REV.
> > > > >
> > > > > +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> > > > > +
> > > > >     If the file is found, return the absolute path to it in
> > > > >     *RETURN_PATH, as well as a (read-only) access_t for its parent in
> > > > >     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> > > > > @@ -2838,6 +2840,7 @@
> > > > >                  svn_revnum_t copyfrom_rev,
> > > > >                  const char *dest_dir,
> > > > >                  const svn_wc_entry_t *dest_entry,
> > > > > +                svn_wc_adm_access_t *associated,
> > > > >                  const char **return_path,
> > > > >                  svn_wc_adm_access_t **return_access,
> > > > >                  apr_pool_t *pool)
> > > > > @@ -2897,10 +2900,10 @@
> > > > >    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
> > > > >    if (kind != svn_node_dir)
> > > > >      return SVN_NO_ERROR;
> > > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> > > > > -                         FALSE, /* open read-only, please */
> > > > > -                         0,     /* open only this directory */
> > > > > -                         NULL, NULL, subpool);
> > > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> > > > > +                              FALSE, /* open read-only, please */
> > > > > +                              0,     /* open only this directory */
> > > > > +                              NULL, NULL, subpool);
> > > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > > >      {
> > > > >        /* The common ancestor directory isn't version-controlled. */
> > > > > @@ -2941,10 +2944,10 @@
> > > > >      return SVN_NO_ERROR;
> > > > >
> > > > >    /* Next: is the file's parent-dir under version control?   */
> > > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> > > > > -                         FALSE, /* open read-only, please */
> > > > > -                         0,     /* open only the parent dir */
> > > > > -                         NULL, NULL, pool);
> > > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated,
> > > > cwd_parent->data,
> > > > > +                              FALSE, /* open read-only, please */
> > > > > +                              0,     /* open only the parent dir */
> > > > > +                               NULL, NULL, pool);
> > > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > > >      {
> > > > >        svn_error_clear(err);
> > > > > @@ -3052,7 +3055,7 @@
> > > > >    /* Attempt to locate the copyfrom_path in the working copy first. */
> > > > >    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE,
> > > > pool));
> > > > >    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> > > > > -                        pb->path, path_entry,
> > > > > +                        pb->path, path_entry, eb->adm_access,
> > > > >                          &src_path, &src_access, pool);
> > > > >    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
> > > > >      svn_error_clear(err);
> > > > >
> > > >
> > > >
> > > > --
> > > > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> > > For additional commands, e-mail: dev-help@subversion.tigris.org
> > >
> > >
> >
> >
> > --
> > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> >
> >
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>
>


-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
... and guess what!  This bug can even strike *without* causing an
error; it can give you a wc which is completely self-consistent but is
different from the repository.  Specifically, removing the two perl
lines from the previous script does that.  Here's another version
which actually ensures that the files end up with the right contents.

--dave

On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> Grr, arg!  I've been testing with the wrong version of my script (and
> the one I attached earlier was wrong too).  You're right; the patch
> doesn't work.
>
> One thing to do might be to instrument calls to svn_wc_adm_open3.
>
> (The only difference between previous and current test is a filename.
> Damn bugs that depend on order of iteration...)
>
> Real test attached.
>
> --dave
>
> On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > I get the same checksum error even with my patch.
> >
> >
> > On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > > Let me elaborate on this bug:  it's something epg, glasser and I
> > > > discovered fooling around within Google today.   It only exists using
> > > > the latest 1.5 server and client.
> > > >
> > > > The basic symptom is:  you start a checkout, and at some point the 1.5
> > > > server says "add_file(foo, copyfrom=bar@10".  The client searches
> > > > around for bar@10 in the working copy, and finds it.  Or rather it
> > > > *thinks* it finds it, but it's actually the wrong version of the file.
> > > >  So, rather than just doing a side svn_ra_get_file() fetch (as it
> > > > should), it installs the wrong version of the file with a quick wc-wc
> > > > copy.  The server then tries to apply a txdelta to the file, and the
> > > > checksum fails... boom, failed checkout.
> > > >
> > > > The question is:  why the wc-searching routine
> > > > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > > > version of the file, when it's not?  Answer:  it seems to be getting
> > > > stale svn_wc_entry_t information.   It's reading a couple of entries
> > > > files from disk, rather than from a more recent adm_access set still
> > > > held in memory, and getting an incorrect (stale) last-changed-rev
> > > > value.    From poking around in gdb, it seems that the entry_t's which
> > > > are write-cached in memory are correct -- they have the proper
> > > > last-changed-rev values, but simply aren't flushed to disk yet.
> > > >
> > > > I attempted to solve this by passing in the edit_baton->access_set to
> > > > the searching routine, but it's not fixing the problem.  I've attached
> > > > my patch-in-progress below.  Can we get some more eyes on this
> > > > problem?
> > >
> > > Define "not fixing the problem".  It makes my script pass for me...
> > >
> > > --dave
> > >
> > > >
> > > >
> > > >
> > > > Index: subversion/libsvn_wc/update_editor.c
> > > > ===================================================================
> > > > --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> > > > +++ subversion/libsvn_wc/update_editor.c        (working copy)
> > > > @@ -2829,6 +2829,8 @@
> > > >     copy for an pre-existing versioned file which is exactly equal to
> > > >     COPYFROM_PATH@COPYFROM_REV.
> > > >
> > > > +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> > > > +
> > > >     If the file is found, return the absolute path to it in
> > > >     *RETURN_PATH, as well as a (read-only) access_t for its parent in
> > > >     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> > > > @@ -2838,6 +2840,7 @@
> > > >                  svn_revnum_t copyfrom_rev,
> > > >                  const char *dest_dir,
> > > >                  const svn_wc_entry_t *dest_entry,
> > > > +                svn_wc_adm_access_t *associated,
> > > >                  const char **return_path,
> > > >                  svn_wc_adm_access_t **return_access,
> > > >                  apr_pool_t *pool)
> > > > @@ -2897,10 +2900,10 @@
> > > >    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
> > > >    if (kind != svn_node_dir)
> > > >      return SVN_NO_ERROR;
> > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> > > > -                         FALSE, /* open read-only, please */
> > > > -                         0,     /* open only this directory */
> > > > -                         NULL, NULL, subpool);
> > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> > > > +                              FALSE, /* open read-only, please */
> > > > +                              0,     /* open only this directory */
> > > > +                              NULL, NULL, subpool);
> > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > >      {
> > > >        /* The common ancestor directory isn't version-controlled. */
> > > > @@ -2941,10 +2944,10 @@
> > > >      return SVN_NO_ERROR;
> > > >
> > > >    /* Next: is the file's parent-dir under version control?   */
> > > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> > > > -                         FALSE, /* open read-only, please */
> > > > -                         0,     /* open only the parent dir */
> > > > -                         NULL, NULL, pool);
> > > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated,
> > > cwd_parent->data,
> > > > +                              FALSE, /* open read-only, please */
> > > > +                              0,     /* open only the parent dir */
> > > > +                               NULL, NULL, pool);
> > > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > > >      {
> > > >        svn_error_clear(err);
> > > > @@ -3052,7 +3055,7 @@
> > > >    /* Attempt to locate the copyfrom_path in the working copy first. */
> > > >    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE,
> > > pool));
> > > >    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> > > > -                        pb->path, path_entry,
> > > > +                        pb->path, path_entry, eb->adm_access,
> > > >                          &src_path, &src_access, pool);
> > > >    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
> > > >      svn_error_clear(err);
> > > >
> > >
> > >
> > > --
> > > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> > For additional commands, e-mail: dev-help@subversion.tigris.org
> >
> >
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>
>


-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
Grr, arg!  I've been testing with the wrong version of my script (and
the one I attached earlier was wrong too).  You're right; the patch
doesn't work.

One thing to do might be to instrument calls to svn_wc_adm_open3.

(The only difference between previous and current test is a filename.
Damn bugs that depend on order of iteration...)

Real test attached.

--dave

On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> I get the same checksum error even with my patch.
>
>
> On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> > On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > > Let me elaborate on this bug:  it's something epg, glasser and I
> > > discovered fooling around within Google today.   It only exists using
> > > the latest 1.5 server and client.
> > >
> > > The basic symptom is:  you start a checkout, and at some point the 1.5
> > > server says "add_file(foo, copyfrom=bar@10".  The client searches
> > > around for bar@10 in the working copy, and finds it.  Or rather it
> > > *thinks* it finds it, but it's actually the wrong version of the file.
> > >  So, rather than just doing a side svn_ra_get_file() fetch (as it
> > > should), it installs the wrong version of the file with a quick wc-wc
> > > copy.  The server then tries to apply a txdelta to the file, and the
> > > checksum fails... boom, failed checkout.
> > >
> > > The question is:  why the wc-searching routine
> > > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > > version of the file, when it's not?  Answer:  it seems to be getting
> > > stale svn_wc_entry_t information.   It's reading a couple of entries
> > > files from disk, rather than from a more recent adm_access set still
> > > held in memory, and getting an incorrect (stale) last-changed-rev
> > > value.    From poking around in gdb, it seems that the entry_t's which
> > > are write-cached in memory are correct -- they have the proper
> > > last-changed-rev values, but simply aren't flushed to disk yet.
> > >
> > > I attempted to solve this by passing in the edit_baton->access_set to
> > > the searching routine, but it's not fixing the problem.  I've attached
> > > my patch-in-progress below.  Can we get some more eyes on this
> > > problem?
> >
> > Define "not fixing the problem".  It makes my script pass for me...
> >
> > --dave
> >
> > >
> > >
> > >
> > > Index: subversion/libsvn_wc/update_editor.c
> > > ===================================================================
> > > --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> > > +++ subversion/libsvn_wc/update_editor.c        (working copy)
> > > @@ -2829,6 +2829,8 @@
> > >     copy for an pre-existing versioned file which is exactly equal to
> > >     COPYFROM_PATH@COPYFROM_REV.
> > >
> > > +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> > > +
> > >     If the file is found, return the absolute path to it in
> > >     *RETURN_PATH, as well as a (read-only) access_t for its parent in
> > >     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> > > @@ -2838,6 +2840,7 @@
> > >                  svn_revnum_t copyfrom_rev,
> > >                  const char *dest_dir,
> > >                  const svn_wc_entry_t *dest_entry,
> > > +                svn_wc_adm_access_t *associated,
> > >                  const char **return_path,
> > >                  svn_wc_adm_access_t **return_access,
> > >                  apr_pool_t *pool)
> > > @@ -2897,10 +2900,10 @@
> > >    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
> > >    if (kind != svn_node_dir)
> > >      return SVN_NO_ERROR;
> > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> > > -                         FALSE, /* open read-only, please */
> > > -                         0,     /* open only this directory */
> > > -                         NULL, NULL, subpool);
> > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> > > +                              FALSE, /* open read-only, please */
> > > +                              0,     /* open only this directory */
> > > +                              NULL, NULL, subpool);
> > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > >      {
> > >        /* The common ancestor directory isn't version-controlled. */
> > > @@ -2941,10 +2944,10 @@
> > >      return SVN_NO_ERROR;
> > >
> > >    /* Next: is the file's parent-dir under version control?   */
> > > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> > > -                         FALSE, /* open read-only, please */
> > > -                         0,     /* open only the parent dir */
> > > -                         NULL, NULL, pool);
> > > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated,
> > cwd_parent->data,
> > > +                              FALSE, /* open read-only, please */
> > > +                              0,     /* open only the parent dir */
> > > +                               NULL, NULL, pool);
> > >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> > >      {
> > >        svn_error_clear(err);
> > > @@ -3052,7 +3055,7 @@
> > >    /* Attempt to locate the copyfrom_path in the working copy first. */
> > >    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE,
> > pool));
> > >    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> > > -                        pb->path, path_entry,
> > > +                        pb->path, path_entry, eb->adm_access,
> > >                          &src_path, &src_access, pool);
> > >    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
> > >      svn_error_clear(err);
> > >
> >
> >
> > --
> > David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
>
>


-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
I get the same checksum error even with my patch.


On 10/18/07, David Glasser <gl...@davidglasser.net> wrote:
> On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> > Let me elaborate on this bug:  it's something epg, glasser and I
> > discovered fooling around within Google today.   It only exists using
> > the latest 1.5 server and client.
> >
> > The basic symptom is:  you start a checkout, and at some point the 1.5
> > server says "add_file(foo, copyfrom=bar@10".  The client searches
> > around for bar@10 in the working copy, and finds it.  Or rather it
> > *thinks* it finds it, but it's actually the wrong version of the file.
> >  So, rather than just doing a side svn_ra_get_file() fetch (as it
> > should), it installs the wrong version of the file with a quick wc-wc
> > copy.  The server then tries to apply a txdelta to the file, and the
> > checksum fails... boom, failed checkout.
> >
> > The question is:  why the wc-searching routine
> > (update_editor.c:locate_copyfrom()) think that it's found the correct
> > version of the file, when it's not?  Answer:  it seems to be getting
> > stale svn_wc_entry_t information.   It's reading a couple of entries
> > files from disk, rather than from a more recent adm_access set still
> > held in memory, and getting an incorrect (stale) last-changed-rev
> > value.    From poking around in gdb, it seems that the entry_t's which
> > are write-cached in memory are correct -- they have the proper
> > last-changed-rev values, but simply aren't flushed to disk yet.
> >
> > I attempted to solve this by passing in the edit_baton->access_set to
> > the searching routine, but it's not fixing the problem.  I've attached
> > my patch-in-progress below.  Can we get some more eyes on this
> > problem?
>
> Define "not fixing the problem".  It makes my script pass for me...
>
> --dave
>
> >
> >
> >
> > Index: subversion/libsvn_wc/update_editor.c
> > ===================================================================
> > --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> > +++ subversion/libsvn_wc/update_editor.c        (working copy)
> > @@ -2829,6 +2829,8 @@
> >     copy for an pre-existing versioned file which is exactly equal to
> >     COPYFROM_PATH@COPYFROM_REV.
> >
> > +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> > +
> >     If the file is found, return the absolute path to it in
> >     *RETURN_PATH, as well as a (read-only) access_t for its parent in
> >     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> > @@ -2838,6 +2840,7 @@
> >                  svn_revnum_t copyfrom_rev,
> >                  const char *dest_dir,
> >                  const svn_wc_entry_t *dest_entry,
> > +                svn_wc_adm_access_t *associated,
> >                  const char **return_path,
> >                  svn_wc_adm_access_t **return_access,
> >                  apr_pool_t *pool)
> > @@ -2897,10 +2900,10 @@
> >    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
> >    if (kind != svn_node_dir)
> >      return SVN_NO_ERROR;
> > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> > -                         FALSE, /* open read-only, please */
> > -                         0,     /* open only this directory */
> > -                         NULL, NULL, subpool);
> > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> > +                              FALSE, /* open read-only, please */
> > +                              0,     /* open only this directory */
> > +                              NULL, NULL, subpool);
> >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> >      {
> >        /* The common ancestor directory isn't version-controlled. */
> > @@ -2941,10 +2944,10 @@
> >      return SVN_NO_ERROR;
> >
> >    /* Next: is the file's parent-dir under version control?   */
> > -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> > -                         FALSE, /* open read-only, please */
> > -                         0,     /* open only the parent dir */
> > -                         NULL, NULL, pool);
> > +  err = svn_wc_adm_probe_try3(&ancestor_access, associated,
> cwd_parent->data,
> > +                              FALSE, /* open read-only, please */
> > +                              0,     /* open only the parent dir */
> > +                               NULL, NULL, pool);
> >    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
> >      {
> >        svn_error_clear(err);
> > @@ -3052,7 +3055,7 @@
> >    /* Attempt to locate the copyfrom_path in the working copy first. */
> >    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE,
> pool));
> >    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> > -                        pb->path, path_entry,
> > +                        pb->path, path_entry, eb->adm_access,
> >                          &src_path, &src_access, pool);
> >    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
> >      svn_error_clear(err);
> >
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by David Glasser <gl...@davidglasser.net>.
On 10/18/07, Ben Collins-Sussman <su...@red-bean.com> wrote:
> Let me elaborate on this bug:  it's something epg, glasser and I
> discovered fooling around within Google today.   It only exists using
> the latest 1.5 server and client.
>
> The basic symptom is:  you start a checkout, and at some point the 1.5
> server says "add_file(foo, copyfrom=bar@10".  The client searches
> around for bar@10 in the working copy, and finds it.  Or rather it
> *thinks* it finds it, but it's actually the wrong version of the file.
>  So, rather than just doing a side svn_ra_get_file() fetch (as it
> should), it installs the wrong version of the file with a quick wc-wc
> copy.  The server then tries to apply a txdelta to the file, and the
> checksum fails... boom, failed checkout.
>
> The question is:  why the wc-searching routine
> (update_editor.c:locate_copyfrom()) think that it's found the correct
> version of the file, when it's not?  Answer:  it seems to be getting
> stale svn_wc_entry_t information.   It's reading a couple of entries
> files from disk, rather than from a more recent adm_access set still
> held in memory, and getting an incorrect (stale) last-changed-rev
> value.    From poking around in gdb, it seems that the entry_t's which
> are write-cached in memory are correct -- they have the proper
> last-changed-rev values, but simply aren't flushed to disk yet.
>
> I attempted to solve this by passing in the edit_baton->access_set to
> the searching routine, but it's not fixing the problem.  I've attached
> my patch-in-progress below.  Can we get some more eyes on this
> problem?

Define "not fixing the problem".  It makes my script pass for me...

--dave

>
>
>
> Index: subversion/libsvn_wc/update_editor.c
> ===================================================================
> --- subversion/libsvn_wc/update_editor.c        (revision 27277)
> +++ subversion/libsvn_wc/update_editor.c        (working copy)
> @@ -2829,6 +2829,8 @@
>     copy for an pre-existing versioned file which is exactly equal to
>     COPYFROM_PATH@COPYFROM_REV.
>
> +   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
> +
>     If the file is found, return the absolute path to it in
>     *RETURN_PATH, as well as a (read-only) access_t for its parent in
>     *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
> @@ -2838,6 +2840,7 @@
>                  svn_revnum_t copyfrom_rev,
>                  const char *dest_dir,
>                  const svn_wc_entry_t *dest_entry,
> +                svn_wc_adm_access_t *associated,
>                  const char **return_path,
>                  svn_wc_adm_access_t **return_access,
>                  apr_pool_t *pool)
> @@ -2897,10 +2900,10 @@
>    SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
>    if (kind != svn_node_dir)
>      return SVN_NO_ERROR;
> -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
> -                         FALSE, /* open read-only, please */
> -                         0,     /* open only this directory */
> -                         NULL, NULL, subpool);
> +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
> +                              FALSE, /* open read-only, please */
> +                              0,     /* open only this directory */
> +                              NULL, NULL, subpool);
>    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
>      {
>        /* The common ancestor directory isn't version-controlled. */
> @@ -2941,10 +2944,10 @@
>      return SVN_NO_ERROR;
>
>    /* Next: is the file's parent-dir under version control?   */
> -  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
> -                         FALSE, /* open read-only, please */
> -                         0,     /* open only the parent dir */
> -                         NULL, NULL, pool);
> +  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd_parent->data,
> +                              FALSE, /* open read-only, please */
> +                              0,     /* open only the parent dir */
> +                               NULL, NULL, pool);
>    if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
>      {
>        svn_error_clear(err);
> @@ -3052,7 +3055,7 @@
>    /* Attempt to locate the copyfrom_path in the working copy first. */
>    SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE, pool));
>    err = locate_copyfrom(copyfrom_path, copyfrom_rev,
> -                        pb->path, path_entry,
> +                        pb->path, path_entry, eb->adm_access,
>                          &src_path, &src_access, pool);
>    if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
>      svn_error_clear(err);
>


-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Nasty double-replace copy-on-update bug

Posted by Ben Collins-Sussman <su...@red-bean.com>.
Let me elaborate on this bug:  it's something epg, glasser and I
discovered fooling around within Google today.   It only exists using
the latest 1.5 server and client.

The basic symptom is:  you start a checkout, and at some point the 1.5
server says "add_file(foo, copyfrom=bar@10".  The client searches
around for bar@10 in the working copy, and finds it.  Or rather it
*thinks* it finds it, but it's actually the wrong version of the file.
 So, rather than just doing a side svn_ra_get_file() fetch (as it
should), it installs the wrong version of the file with a quick wc-wc
copy.  The server then tries to apply a txdelta to the file, and the
checksum fails... boom, failed checkout.

The question is:  why the wc-searching routine
(update_editor.c:locate_copyfrom()) think that it's found the correct
version of the file, when it's not?  Answer:  it seems to be getting
stale svn_wc_entry_t information.   It's reading a couple of entries
files from disk, rather than from a more recent adm_access set still
held in memory, and getting an incorrect (stale) last-changed-rev
value.    From poking around in gdb, it seems that the entry_t's which
are write-cached in memory are correct -- they have the proper
last-changed-rev values, but simply aren't flushed to disk yet.

I attempted to solve this by passing in the edit_baton->access_set to
the searching routine, but it's not fixing the problem.  I've attached
my patch-in-progress below.  Can we get some more eyes on this
problem?



Index: subversion/libsvn_wc/update_editor.c
===================================================================
--- subversion/libsvn_wc/update_editor.c        (revision 27277)
+++ subversion/libsvn_wc/update_editor.c        (working copy)
@@ -2829,6 +2829,8 @@
    copy for an pre-existing versioned file which is exactly equal to
    COPYFROM_PATH@COPYFROM_REV.

+   Use existing ASSOCIATED access-sets to examine svn_wc_entry_t's.
+
    If the file is found, return the absolute path to it in
    *RETURN_PATH, as well as a (read-only) access_t for its parent in
    *RETURN_ACCESS.  If the file isn't found, set *RETURN_PATH to NULL.
@@ -2838,6 +2840,7 @@
                 svn_revnum_t copyfrom_rev,
                 const char *dest_dir,
                 const svn_wc_entry_t *dest_entry,
+                svn_wc_adm_access_t *associated,
                 const char **return_path,
                 svn_wc_adm_access_t **return_access,
                 apr_pool_t *pool)
@@ -2897,10 +2900,10 @@
   SVN_ERR(svn_io_check_path(cwd->data, &kind, subpool));
   if (kind != svn_node_dir)
     return SVN_NO_ERROR;
-  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd->data,
-                         FALSE, /* open read-only, please */
-                         0,     /* open only this directory */
-                         NULL, NULL, subpool);
+  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd->data,
+                              FALSE, /* open read-only, please */
+                              0,     /* open only this directory */
+                              NULL, NULL, subpool);
   if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
     {
       /* The common ancestor directory isn't version-controlled. */
@@ -2941,10 +2944,10 @@
     return SVN_NO_ERROR;

   /* Next: is the file's parent-dir under version control?   */
-  err = svn_wc_adm_open3(&ancestor_access, NULL, cwd_parent->data,
-                         FALSE, /* open read-only, please */
-                         0,     /* open only the parent dir */
-                         NULL, NULL, pool);
+  err = svn_wc_adm_probe_try3(&ancestor_access, associated, cwd_parent->data,
+                              FALSE, /* open read-only, please */
+                              0,     /* open only the parent dir */
+                               NULL, NULL, pool);
   if (err && err->apr_err == SVN_ERR_WC_NOT_DIRECTORY)
     {
       svn_error_clear(err);
@@ -3052,7 +3055,7 @@
   /* Attempt to locate the copyfrom_path in the working copy first. */
   SVN_ERR(svn_wc_entry(&path_entry, pb->path, eb->adm_access, FALSE, pool));
   err = locate_copyfrom(copyfrom_path, copyfrom_rev,
-                        pb->path, path_entry,
+                        pb->path, path_entry, eb->adm_access,
                         &src_path, &src_access, pool);
   if (err && err->apr_err == SVN_ERR_WC_COPYFROM_PATH_NOT_FOUND)
     svn_error_clear(err);

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org