You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2011/05/10 00:30:11 UTC

[PATCH] WC DB verification 1

I'm planning to commit the attached wc-db-verification-1.patch, subject
to any advice on how to best fit it in to the code base or other
concerns, in order to get a DB "self-check" function started.

I think we need something like this.  Earlier today I found that "svn
status" showed I had a clean, single-rev WC, while "svnversion" said it
was mixed-rev and switched.  Investigation showed there were orphaned
base node rows in the DB which weren't seen by "svn st" but were seen by
"svnversion".  I'm not interested in how that particular state came to
be, as I've run hundreds of buggy trunk builds on this WC over many
months.  What I do want is to be able to run a set of checks on a DB
that detects basic rule violations like that.

Decisions about when and how to run it can come later.  Of course if we
plan to run it frequently and automatically that would mean we'd want to
make sure it only did fast checks and is efficiently coded whereas if it
remains a manual intervention for devs then that's no concern.  Thoughts
about this are welcome too.

- Julian


RE: [PATCH] WC DB verification 1

Posted by Julian Foad <ju...@wandisco.com>.
On Tue, 2011-05-10 at 02:12 +0200, Bert Huijben wrote:
> 
> > -----Original Message-----
> > From: Mark Phippard [mailto:markphip@gmail.com]
> > Sent: dinsdag 10 mei 2011 1:16
> > To: Julian Foad
> > Cc: dev@subversion.apache.org
> > Subject: Re: [PATCH] WC DB verification 1
> > 
> > Could we tie it into cleanup?  Then other tools like TortoiseSVN,
> > Subclipse, AnkhSVN could invoke it too.
> > 
> > 
> > On Mon, May 9, 2011 at 6:30 PM, Julian Foad <ju...@wandisco.com>
> > wrote:
> > > I'm planning to commit the attached wc-db-verification-1.patch, subject
> > > to any advice on how to best fit it in to the code base or other
> > > concerns, in order to get a DB "self-check" function started.
> > >
> > > I think we need something like this.  Earlier today I found that "svn
> > > status" showed I had a clean, single-rev WC, while "svnversion" said it
> > > was mixed-rev and switched.  Investigation showed there were orphaned
> > > base node rows in the DB which weren't seen by "svn st" but were seen by
> > > "svnversion".  I'm not interested in how that particular state came to
> > > be, as I've run hundreds of buggy trunk builds on this WC over many
> > > months.  What I do want is to be able to run a set of checks on a DB
> > > that detects basic rule violations like that.
> > >
> > > Decisions about when and how to run it can come later.  Of course if we
> > > plan to run it frequently and automatically that would mean we'd want to
> > > make sure it only did fast checks and is efficiently coded whereas if it
> > > remains a manual intervention for devs then that's no concern.  Thoughts
> > > about this are welcome too.
> 
> I would suggest adding a new function for this purpose; and maybe a separate
> program to call it for our beta cycle.

I'm happy to make a separate program to call it if that's what we
prefer.

> This code should never be necessary once we call the db stable.

The idea is similar to validating postconditions whenever we modify the
DB.  It is "assertions" but at a higher level than ones we would put
in-line in the code.  It will help to make the DB stable.  It should be
called only during testing and debugging, but should remain available
during testing and debugging forever.

> And making
> the normal cleanup very slow makes it less useful for its normal task.

Sure.

> I'm not sure about what you are testing in this initial patch though: The
> only functions that can insert rows in NODES always use
> svn_relpath_dirname() for creating parent_relpath, so if you find databases
> where that set incorrectly you can be 100% certain those databases that are
> hand edited.

It's verifying conditions that "must be true".  It should catch any case
where a DB has been hand-edited or where we have accidentally added some
kind of row-insertion code that is broken.

> The parent exists checks are more useful, but should also look at op-depth,
> etc. to really find problems. (The op-depth root must always exist and every
> intermediate too)

Yes, this is exactly the kind of useful extensions that I want us to
make.

- Julian



Re: [PATCH] WC DB verification 1

Posted by Johan Corveleyn <jc...@gmail.com>.
On Tue, May 10, 2011 at 1:28 PM, Bert Huijben <be...@qqmail.nl> wrote:
...
> But there are *a lot* of users who think of svn cleanup as a tool to update
> their recorded_size and recorded_mod time after they unleashed some tool on
> their working copy.

Or after they moved their working copy from A to B without preserving
timestamps (recently found out that a lot of our users do this, e.g.
when they get a new pc, they copy their old working copy to the new pc
(which is faster than creating a new checkout, and preserves their
local mods too)).

-- 
Johan

RE: [PATCH] WC DB verification 1

Posted by Bert Huijben <be...@qqmail.nl>.
> -----Original Message-----
> From: Bert Huijben [mailto:bert@qqmail.nl]
> Sent: dinsdag 10 mei 2011 13:58
> To: 'Ivan Zhakov'
> Cc: 'Greg Stein'; dev@subversion.apache.org
> Subject: RE: [PATCH] WC DB verification 1
> 
> > -----Original Message-----
> > From: Ivan Zhakov [mailto:ivan@visualsvn.com]
> > Sent: dinsdag 10 mei 2011 13:38
> > To: Bert Huijben
> > Cc: Greg Stein; dev@subversion.apache.org
> > Subject: Re: [PATCH] WC DB verification 1
> >
> > On Tue, May 10, 2011 at 15:28, Bert Huijben <be...@qqmail.nl> wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Greg Stein [mailto:gstein@gmail.com]
> > >> Sent: dinsdag 10 mei 2011 11:18
> > >> To: dev@subversion.apache.org
> > >> Subject: Re: [PATCH] WC DB verification 1
> 
> 
> > Or create an option for cleanup command. Something like "svn cleanup --
> > verify"
> 
> If we are going to rev svn_client_cleanup I would suggest doing two more
> things:
> * Include the upgrade step in the deprecated wrapper. (To allow applications
> linked to just an old api version to upgrade to wc-ng using their existing
> cleanup command)
> * Allow accessing only that timestamp repair function.

And maybe some cleanup pristine store option...

The only case where we did try to do some cleanup before is only used from svn_wc_crop() (and working copy destruction)

	Bert


RE: [PATCH] WC DB verification 1

Posted by Bert Huijben <be...@qqmail.nl>.
> -----Original Message-----
> From: Ivan Zhakov [mailto:ivan@visualsvn.com]
> Sent: dinsdag 10 mei 2011 13:38
> To: Bert Huijben
> Cc: Greg Stein; dev@subversion.apache.org
> Subject: Re: [PATCH] WC DB verification 1
> 
> On Tue, May 10, 2011 at 15:28, Bert Huijben <be...@qqmail.nl> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Greg Stein [mailto:gstein@gmail.com]
> >> Sent: dinsdag 10 mei 2011 11:18
> >> To: dev@subversion.apache.org
> >> Subject: Re: [PATCH] WC DB verification 1


> Or create an option for cleanup command. Something like "svn cleanup --
> verify"

If we are going to rev svn_client_cleanup I would suggest doing two more things:
* Include the upgrade step in the deprecated wrapper. (To allow applications linked to just an old api version to upgrade to wc-ng using their existing cleanup command)
* Allow accessing only that timestamp repair function.

	Bert


Re: [PATCH] WC DB verification 1

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Tue, May 10, 2011 at 15:28, Bert Huijben <be...@qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: Greg Stein [mailto:gstein@gmail.com]
>> Sent: dinsdag 10 mei 2011 11:18
>> To: dev@subversion.apache.org
>> Subject: Re: [PATCH] WC DB verification 1
>>
>> >...
>> > I would suggest adding a new function for this purpose; and maybe a
>> separate
>> > program to call it for our beta cycle.
>>
>> Please... not another program. There was also a recent call for 'svn
>> upgrade' to be separate. It really sucks to have an installation smear
>> a dozen programs into your system ("which one do I use? for what? hmm.
>> for this? ... um..."). We already have a notion of subcommand. That is
>> normal, expected, and understood. We can continue to use that without
>> fear of user misunderstanding. In fact, it is probably best to tell
>> users "just use 'svn'. you don't need anything else". (administrators
>> are another thing, but we don't have to complicate their lives, too!
>> ... we have too many executables)
>
> I'm still not sure if this tool should survive outside the subversion
> developer world.
>
> And I didn't hear you suggest that we should include all the
> svnraisetreeconflict and entries-dump and like programs in 'svn'.
>
> I would suggest creating the API inside a library so other tools may use it,
> and *for now* enable its use via a separate program.
>
Or create an option for cleanup command. Something like "svn cleanup --verify"

-- 
Ivan Zhakov

RE: [PATCH] WC DB verification 1

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Greg Stein [mailto:gstein@gmail.com]
> Sent: dinsdag 10 mei 2011 11:18
> To: dev@subversion.apache.org
> Subject: Re: [PATCH] WC DB verification 1
> 
> >...
> > I would suggest adding a new function for this purpose; and maybe a
> separate
> > program to call it for our beta cycle.
> 
> Please... not another program. There was also a recent call for 'svn
> upgrade' to be separate. It really sucks to have an installation smear
> a dozen programs into your system ("which one do I use? for what? hmm.
> for this? ... um..."). We already have a notion of subcommand. That is
> normal, expected, and understood. We can continue to use that without
> fear of user misunderstanding. In fact, it is probably best to tell
> users "just use 'svn'. you don't need anything else". (administrators
> are another thing, but we don't have to complicate their lives, too!
> ... we have too many executables)

I'm still not sure if this tool should survive outside the subversion
developer world. 

And I didn't hear you suggest that we should include all the
svnraisetreeconflict and entries-dump and like programs in 'svn'.

I would suggest creating the API inside a library so other tools may use it,
and *for now* enable its use via a separate program.

If it becomes useful in detecting problems (and maybe even fixing) we might
make it better accessible.

(But until the primary check is more than do we set parent_relpath correctly
I don't see its use. But that might change in the next commit)

> > This code should never be necessary once we call the db stable. And
> making
> > the normal cleanup very slow makes it less useful for its normal task.
> 
> Are you kidding? There is no such thing as "normal" when you're
> talking about 'svn cleanup'. That command should take as long as it
> needs to ensure that everything is good. Let me repeat: AS LONG AS IT
> NEEDS. If people need to run 'svn cleanup' more than once a year, then
> we've failed. But if they only need to run it once a year, then it can
> goddamned well take as much as it needs to ensure that the user has
> not lost any of their changes.
> 
> I find it absurd to place time constraints on 'svn cleanup'. The
> better approach is to ensure it is never needed.

I agree that there should be no 'normal use' for svn cleanup. (And
personally I only run svn cleanup in testing cases)

But there are *a lot* of users who think of svn cleanup as a tool to update
their recorded_size and recorded_mod time after they unleashed some tool on
their working copy.
I even heard users suggesting to run svn cleanup before *every* command
invocation.

Yes, I tell all these users not to do this... 
But there are more than a few users who currently use svn cleanup this way.

	Bert


Re: [PATCH] WC DB verification 1

Posted by Greg Stein <gs...@gmail.com>.
>...
> I would suggest adding a new function for this purpose; and maybe a separate
> program to call it for our beta cycle.

Please... not another program. There was also a recent call for 'svn
upgrade' to be separate. It really sucks to have an installation smear
a dozen programs into your system ("which one do I use? for what? hmm.
for this? ... um..."). We already have a notion of subcommand. That is
normal, expected, and understood. We can continue to use that without
fear of user misunderstanding. In fact, it is probably best to tell
users "just use 'svn'. you don't need anything else". (administrators
are another thing, but we don't have to complicate their lives, too!
... we have too many executables)

> This code should never be necessary once we call the db stable. And making
> the normal cleanup very slow makes it less useful for its normal task.

Are you kidding? There is no such thing as "normal" when you're
talking about 'svn cleanup'. That command should take as long as it
needs to ensure that everything is good. Let me repeat: AS LONG AS IT
NEEDS. If people need to run 'svn cleanup' more than once a year, then
we've failed. But if they only need to run it once a year, then it can
goddamned well take as much as it needs to ensure that the user has
not lost any of their changes.

I find it absurd to place time constraints on 'svn cleanup'. The
better approach is to ensure it is never needed.

-g

RE: [PATCH] WC DB verification 1

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Mark Phippard [mailto:markphip@gmail.com]
> Sent: dinsdag 10 mei 2011 1:16
> To: Julian Foad
> Cc: dev@subversion.apache.org
> Subject: Re: [PATCH] WC DB verification 1
> 
> Could we tie it into cleanup?  Then other tools like TortoiseSVN,
> Subclipse, AnkhSVN could invoke it too.
> 
> 
> On Mon, May 9, 2011 at 6:30 PM, Julian Foad <ju...@wandisco.com>
> wrote:
> > I'm planning to commit the attached wc-db-verification-1.patch, subject
> > to any advice on how to best fit it in to the code base or other
> > concerns, in order to get a DB "self-check" function started.
> >
> > I think we need something like this.  Earlier today I found that "svn
> > status" showed I had a clean, single-rev WC, while "svnversion" said it
> > was mixed-rev and switched.  Investigation showed there were orphaned
> > base node rows in the DB which weren't seen by "svn st" but were seen by
> > "svnversion".  I'm not interested in how that particular state came to
> > be, as I've run hundreds of buggy trunk builds on this WC over many
> > months.  What I do want is to be able to run a set of checks on a DB
> > that detects basic rule violations like that.
> >
> > Decisions about when and how to run it can come later.  Of course if we
> > plan to run it frequently and automatically that would mean we'd want to
> > make sure it only did fast checks and is efficiently coded whereas if it
> > remains a manual intervention for devs then that's no concern.  Thoughts
> > about this are welcome too.

I would suggest adding a new function for this purpose; and maybe a separate
program to call it for our beta cycle. 
This code should never be necessary once we call the db stable. And making
the normal cleanup very slow makes it less useful for its normal task.


I'm not sure about what you are testing in this initial patch though: The
only functions that can insert rows in NODES always use
svn_relpath_dirname() for creating parent_relpath, so if you find databases
where that set incorrectly you can be 100% certain those databases that are
hand edited.

The parent exists checks are more useful, but should also look at op-depth,
etc. to really find problems. (The op-depth root must always exist and every
intermediate too)

	Bert


Re: [PATCH] WC DB verification 1

Posted by Mark Phippard <ma...@gmail.com>.
Could we tie it into cleanup?  Then other tools like TortoiseSVN,
Subclipse, AnkhSVN could invoke it too.


On Mon, May 9, 2011 at 6:30 PM, Julian Foad <ju...@wandisco.com> wrote:
> I'm planning to commit the attached wc-db-verification-1.patch, subject
> to any advice on how to best fit it in to the code base or other
> concerns, in order to get a DB "self-check" function started.
>
> I think we need something like this.  Earlier today I found that "svn
> status" showed I had a clean, single-rev WC, while "svnversion" said it
> was mixed-rev and switched.  Investigation showed there were orphaned
> base node rows in the DB which weren't seen by "svn st" but were seen by
> "svnversion".  I'm not interested in how that particular state came to
> be, as I've run hundreds of buggy trunk builds on this WC over many
> months.  What I do want is to be able to run a set of checks on a DB
> that detects basic rule violations like that.
>
> Decisions about when and how to run it can come later.  Of course if we
> plan to run it frequently and automatically that would mean we'd want to
> make sure it only did fast checks and is efficiently coded whereas if it
> remains a manual intervention for devs then that's no concern.  Thoughts
> about this are welcome too.
>
> - Julian
>
>



-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: [PATCH] WC DB verification 1

Posted by Julian Foad <ju...@wandisco.com>.
On Tue, 2011-05-10 at 05:29 -0400, Greg Stein wrote:
> On Mon, May 9, 2011 at 18:30, Julian Foad <ju...@wandisco.com> wrote:
> > I'm planning to commit the attached wc-db-verification-1.patch, subject
> > to any advice on how to best fit it in to the code base or other
> > concerns, in order to get a DB "self-check" function started.
> >
> > I think we need something like this.  Earlier today I found that "svn
> > status" showed I had a clean, single-rev WC, while "svnversion" said it
> > was mixed-rev and switched.  Investigation showed there were orphaned
> > base node rows in the DB which weren't seen by "svn st" but were seen by
> > "svnversion".  I'm not interested in how that particular state came to
> > be, as I've run hundreds of buggy trunk builds on this WC over many
> > months.  What I do want is to be able to run a set of checks on a DB
> > that detects basic rule violations like that.
> >
> > Decisions about when and how to run it can come later.  Of course if we
> > plan to run it frequently and automatically that would mean we'd want to
> > make sure it only did fast checks and is efficiently coded whereas if it
> > remains a manual intervention for devs then that's no concern.  Thoughts
> > about this are welcome too.

> Regarding the patch itself:

Thanks for this good feedback Greg.

> * I dislike creating yet another wc_db_foo file. Just put the damned
> thing into wc_db_util.c. These files are not so big that we have to
> break them up. wc_db.c, sure. But let's not keep adding files that are
> just dozens of lines. It makes it hard to answer the question, "where
> is that function?"

I agree there's merit in not splitting things up too much.  I'll note
however that wc_db_util is documented as being lower-level helpers for
use by wc_db, whereas this is more like a test driver for wc_db.  Not
sure that's a really good place but I can put it there for now.

> * There is a rough sketch of wc-checks.sql. Is there something that
> can be done where, for developers, more intense checks are performed?
> And even better, can some of these checks be expressed as triggers, in
> order to trap problems immediately?

That looks interesting and good.  I'll certainly try to expand the use
of triggers for on-the-fly validation.  We can of course have the option
of disabling it when we're confident in correctness and want more speed.

> * db_verify() should at least follow precedent with a
> VERIFY_USABLE_WCROOT() call. (why'd you omit that?)

Oversight.

> * the (new) typical pattern is to pass a WCROOT rather than the
> individual SDB and WC_ID, so _verify_nodes should have its args
> changed

Will look/do.

> * wtf is relpath_depth duplicated into the verify file? serious badness.

Just hacking.  Will fix.

> * there is no such comment as "TODO:" ... we use "###"

When I use a comment it means exactly what I want it to mean. :-P

Thanks and I'll hack on this on the train I'm about to catch.

- Julian



Re: [PATCH] WC DB verification 1

Posted by Greg Stein <gs...@gmail.com>.
On Mon, May 9, 2011 at 18:30, Julian Foad <ju...@wandisco.com> wrote:
> I'm planning to commit the attached wc-db-verification-1.patch, subject
> to any advice on how to best fit it in to the code base or other
> concerns, in order to get a DB "self-check" function started.
>
> I think we need something like this.  Earlier today I found that "svn
> status" showed I had a clean, single-rev WC, while "svnversion" said it
> was mixed-rev and switched.  Investigation showed there were orphaned
> base node rows in the DB which weren't seen by "svn st" but were seen by
> "svnversion".  I'm not interested in how that particular state came to
> be, as I've run hundreds of buggy trunk builds on this WC over many
> months.  What I do want is to be able to run a set of checks on a DB
> that detects basic rule violations like that.
>
> Decisions about when and how to run it can come later.  Of course if we
> plan to run it frequently and automatically that would mean we'd want to
> make sure it only did fast checks and is efficiently coded whereas if it
> remains a manual intervention for devs then that's no concern.  Thoughts
> about this are welcome too.

Regarding the patch itself:

* I dislike creating yet another wc_db_foo file. Just put the damned
thing into wc_db_util.c. These files are not so big that we have to
break them up. wc_db.c, sure. But let's not keep adding files that are
just dozens of lines. It makes it hard to answer the question, "where
is that function?"

* There is a rough sketch of wc-checks.sql. Is there something that
can be done where, for developers, more intense checks are performed?
And even better, can some of these checks be expressed as triggers, in
order to trap problems immediately?

* db_verify() should at least follow precedent with a
VERIFY_USABLE_WCROOT() call. (why'd you omit that?)

* the (new) typical pattern is to pass a WCROOT rather than the
individual SDB and WC_ID, so _verify_nodes should have its args
changed

* wtf is relpath_depth duplicated into the verify file? serious badness.

* there is no such comment as "TODO:" ... we use "###"

-g