You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Johan Corveleyn <jc...@gmail.com> on 2019/10/09 08:39:01 UTC

SVN-4065 - server should enforce LF normalization for svn:eol-style=native (was: PMCs: any Hackathon requests? (deadline 11 October))

On Tue, Oct 8, 2019 at 6:13 PM Julian Foad <ju...@apache.org> wrote:
> Daniel Shahaf wrote:
> > SVN-4065 - server should enforce LF normalization for svn:eol-
> > style=native files. (Not just a patch is needed, but also a FAQ entry
> > that explains how to handle existing repositories that have this
> > problem, once the patch is released. If it'll be like when
> > svn_repos__validate_prop() was added, then dump/load will still work,
> > but svnsync not.)
>
> That issue appears to need review.  It's not clear to me whether the
> server should really enforce LF normalization.  I rather think not.
> There's a discussion thread linked from it, with no final conclusion.

My last comment in the issue [1] summarizes the discussion(s) as follows:

[[[
Some recent discussion on dev@:
    http://svn.haxx.se/dev/archive-2012-12/0179.shtml

The current thinking from that thread is that this should (or can only) be
enforced through a pre-commit hook. It was suggested that this would ideally be
some standard, supported (and efficient) pre-commit hook. In a follow-up thread,
Ivan suggested to create a new program "svnhooks" for standardizing such hooks
(and go together with a configuration file to enable/disable certain behaviors):
    http://svn.haxx.se/dev/archive-2012-12/0217.shtml
]]]

I think that was the conclusion from those threads. I.e. it would be
best if we developed a standard "svnhooks" program that can be invoked
from the pre-commit hook (and not try to implement this directly in
the repos layer). At least, after those svnhooks suggestions no-one
objected, so I assumed silent consensus about that way forward :-) ...

Not sure if this is a good bite-sized task for interested hackers though ...
Though it's quite well-contained (develop a separate (small) program
with a configuration file, depending on existing server-side API's),
and we have a clear use case to start with.

[1] https://issues.apache.org/jira/browse/SVN-4065?focusedCommentId=14931167&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14931167

-- 
Johan

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Branko Čibej <br...@apache.org>.

On 09.10.2019 19:14, Daniel Shahaf wrote:
> The FS layer exposes
> a filesystem that's a 0-indexed array of trees [implemented by the DAG layer].
> The repos layer does… what, exactly, on top of that?

I've been asking that from day one. :) All that API duplication, which
is incomplete to boot, is confusing.

Well, the repos layer is responsible for authorization checks, too.

-- Brane

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Nathan Hartman <ha...@gmail.com>.

On Wed, Oct 9, 2019 at 7:12 PM Johan Corveleyn <jc...@gmail.com> wrote:

> But I think this kind of content validation is different from
> "application-level content validation", like "follow coding style X",
> or "commit message should contain an issue key", or "I want to enforce
> that .c files have svn:eol-style=native". Such application-level
> content rules have no effect whatsoever on the workings of SVN itself.
> SVN clients don't care about those, only the applications (IDE, users,
> ...) do.
>
> In contrast, having a "non-LF-normalized with svn:eol-style=native" in
> the repository breaks "svn diff" (all lines shown as different) and
> "svn status" (if timestamps mismatch, contents are checksummed, and
> the file shows up as modified while it isn't -- causing confusion and
> even worse, spurious tree conflicts).
>
> So in my eyes this is more of an obligatory validation, to preserve
> the sanity of your "svn ecosystem".

I don't (yet) have an opinion whether this should be a hook or implemented
internally. But when you explain it this way, I suppose I might gravitate
towards obligatory validation. I'm undecided so far because that seems to
cause some headaches in terms of legacy data and a path forward. I need to
learn more about it.

On the separate question of a standardized C-coded binary svnhooks program,
I like this idea. I see it as a "busybox"[1] of svn hooks. I like this
because many (perhaps most?) installations could use it rather than coding
up their own custom scripts, reducing setup difficulty on admins and
hopefully reducing  errors and misunderstandings. I like standardization.
But I'm probably repeating everything that was already said. :-)

It is important to let admins continue to write their own scripts when they
have custom needs but I'm sure that's the intention anyway.

[1] Busybox is a single executable that implements several standard UNIX
utilities:
https://en.wikipedia.org/wiki/BusyBox
 I see a "svnhooks" program as being an analogue of that because it could
consolidate all the existing commonly used hook scripts into one program.

Nathan

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Johan Corveleyn <jc...@gmail.com>.

On Wed, Oct 9, 2019 at 7:14 PM Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>
> Branko Čibej wrote on Wed, Oct 09, 2019 at 15:45:32 +0200:
> > On 09.10.2019 15:00, Johan Corveleyn wrote:
> > > During that entire discussion thread the only objections raised were
> > > about "enforcing it in the repos layer / server directly". No-one
> > > objected to the proposal(s) to solve the issue via pre-commit hooks.
> >
> > Validating contents (such as line endings based on svn:eol-style) is a
> > perfect fit for hooks. Not normalising of course. :)
>
> The repos layer validates svn:* revprops in svn_repos__validate_prop().
>
> The FS layer doesn't do that validation.
>
> The result of that is that old repositories with invalid svn:* properties can
> be dump/load'd but can't be svnsync'.d
>
> ---
>
> As to separation of concerns, that's a valid point, but we should be consistent
> about what concerns belong where.  The validation of svn:* props and of
> contents of files with svn:eol-style set belongs in the same layer, doesn't it?
>
> Therefore, I think there are two options:
>
> - These validation belongs in the repos layer.  The simpler/lower layer that
>   doesn't do such validation is the FS layer.  svn:eol-style validation belongs
>   in the repos layer.  (And if svnsync needs to bypass it, we'll design a way
>   for it to do so.)
>
> - These validations belong $elsewhere.  svn:eol-style validation will be added
>   $elsewhere and svn_repos__validate_prop() will be moved $elsewhere.
>
> My interpretation: I don't have an opinion yet, except that if we move the
> validation out of the repos layer, I won't be quite as sure any more what the
> difference between the FS layer and the repos layer _is_.  The FS layer exposes
> a filesystem that's a 0-indexed array of trees [implemented by the DAG layer].
> The repos layer does… what, exactly, on top of that?

Thinking about it again some more, I guess I agree that this really
belongs in the same place as svn_repos__validate_prop(), and really
should be in the repos layer (or in any case, more widely /
automatically / standardly available than just a hook script, where it
depends too much on the individual sysadmin).

In fact, back in 2012 I wasn't happy with the consensus gravitating
towards "solve this in a pre-commit hook" (except as a stop-gap
measure). But the "standard pre-commit utility" seemed to be the best
attainable idea that would receive support from most devs, at the
time.

Earlier in the current thread Brane said:
"Validating contents (such as line endings based on svn:eol-style) is
a perfect fit for hooks."

But I think this kind of content validation is different from
"application-level content validation", like "follow coding style X",
or "commit message should contain an issue key", or "I want to enforce
that .c files have svn:eol-style=native". Such application-level
content rules have no effect whatsoever on the workings of SVN itself.
SVN clients don't care about those, only the applications (IDE, users,
...) do.

In contrast, having a "non-LF-normalized with svn:eol-style=native" in
the repository breaks "svn diff" (all lines shown as different) and
"svn status" (if timestamps mismatch, contents are checksummed, and
the file shows up as modified while it isn't -- causing confusion and
even worse, spurious tree conflicts).

So in my eyes this is more of an obligatory validation, to preserve
the sanity of your "svn ecosystem". A bit like the sha1 collision
protection (yes, the repository can store sha1 collisions perfectly
fine, but they wreak havoc amongst your clients / working copies if
you let them enter your repository). The "havoc" caused by an SVN-4065
file is orders of magnitude less severe than what a sha1 collision
causes (entire working copy hosed), but still, it's not nice.

(Julian: sorry for my frustrated reaction earlier, and thanks for
staying constructive, as always ... you were right, the discussion was
clearly not over yet :-))

-- 
Johan

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Branko Čibej wrote on Wed, Oct 09, 2019 at 15:45:32 +0200:
> On 09.10.2019 15:00, Johan Corveleyn wrote:
> > During that entire discussion thread the only objections raised were
> > about "enforcing it in the repos layer / server directly". No-one
> > objected to the proposal(s) to solve the issue via pre-commit hooks.
> 
> Validating contents (such as line endings based on svn:eol-style) is a
> perfect fit for hooks. Not normalising of course. :)

The repos layer validates svn:* revprops in svn_repos__validate_prop().

The FS layer doesn't do that validation.

The result of that is that old repositories with invalid svn:* properties can
be dump/load'd but can't be svnsync'.d

---

As to separation of concerns, that's a valid point, but we should be consistent
about what concerns belong where.  The validation of svn:* props and of
contents of files with svn:eol-style set belongs in the same layer, doesn't it?

Therefore, I think there are two options:

- These validation belongs in the repos layer.  The simpler/lower layer that
  doesn't do such validation is the FS layer.  svn:eol-style validation belongs
  in the repos layer.  (And if svnsync needs to bypass it, we'll design a way
  for it to do so.)

- These validations belong $elsewhere.  svn:eol-style validation will be added
  $elsewhere and svn_repos__validate_prop() will be moved $elsewhere.

My interpretation: I don't have an opinion yet, except that if we move the
validation out of the repos layer, I won't be quite as sure any more what the
difference between the FS layer and the repos layer _is_.  The FS layer exposes
a filesystem that's a 0-indexed array of trees [implemented by the DAG layer].
The repos layer does… what, exactly, on top of that?

Cheers,

Daniel

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Julian Foad <ju...@apache.org>.

Branko Čibej wrote:
>> The repos layer to a large extent transparent to properties and their
>> values, though not so much: it has some validation and even
>> "normalization" of "svn:" property names and values.  I feel this is
>> generally Bad; there is some room for repos-layer knowledge of
>> properties but we should have separated the concerns better.
> 
> Does the repos layer actually normalize svn: properties? I know
> 'svnadmin load' can, but I don't believe the repos API does that?

AFAICT the RA parts of the libsvn_repos API never modify node props, 
only revision props.

The normalization of node props done by 'svnadmin load' is implemented 
in libsvn_repos through svn_repos_get_fs_build_parser6():

http://svn.apache.org/viewvc/subversion/tags/1.12.0/subversion/include/svn_repos.h?view=markup#l3863

That is yet another case where it's implemented at the wrong level. 
Here, "svnrdump load" can't share it.  The dumpstream loader should be 
refactored so svnadmin and svnrdump share code: 
https://subversion.apache.org/issue/4780 "Factor out the dumpstream 
loader editor driver".

- Julian

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Branko Čibej <br...@apache.org>.

On 09.10.2019 11:50, Julian Foad wrote:
> Julian Foad wrote:
>> If we want to change this to a repository-level rule, then that
>> implies we are changing the repository semantics in a
>> backward-incompatible way and we would need to address that (using a
>> format number bump, and an upgrade/migration path).  We could discuss
>> that path but I don't think anyone's currently pushing for that and I
>> don't see good reason to do so, so let's leave that aside.
>
> I want to add something stronger.  It would be a mistake to push
> validation of file content against property values into the repos
> layer as a new rule about the allowed semantics of repository contents.
>
> It's important to have different levels in the architecture with
> strongly defined characteristics, generally with lower layers having
> simpler semantics.  Currently the repos layer does not interfere with
> file content at all.  That is Good.

Couldn't agree more. There's a rather large can of really ugly worms
hiding in there and we do not want to look for it, let alone open it.


> The repos layer to a large extent transparent to properties and their
> values, though not so much: it has some validation and even
> "normalization" of "svn:" property names and values.  I feel this is
> generally Bad; there is some room for repos-layer knowledge of
> properties but we should have separated the concerns better.

Does the repos layer actually normalize svn: properties? I know
'svnadmin load' can, but I don't believe the repos API does that?

-- Brane

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Julian Foad <ju...@apache.org>.

Julian Foad wrote:
> If we want to change this to a repository-level rule, then that implies 
> we are changing the repository semantics in a backward-incompatible way 
> and we would need to address that (using a format number bump, and an 
> upgrade/migration path).  We could discuss that path but I don't think 
> anyone's currently pushing for that and I don't see good reason to do 
> so, so let's leave that aside.

I want to add something stronger.  It would be a mistake to push 
validation of file content against property values into the repos layer 
as a new rule about the allowed semantics of repository contents.

It's important to have different levels in the architecture with 
strongly defined characteristics, generally with lower layers having 
simpler semantics.  Currently the repos layer does not interfere with 
file content at all.  That is Good.

The repos layer to a large extent transparent to properties and their 
values, though not so much: it has some validation and even 
"normalization" of "svn:" property names and values.  I feel this is 
generally Bad; there is some room for repos-layer knowledge of 
properties but we should have separated the concerns better.

- Julian

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Branko Čibej <br...@apache.org>.

On 09.10.2019 15:00, Johan Corveleyn wrote:
> During that entire discussion thread the only objections raised were
> about "enforcing it in the repos layer / server directly". No-one
> objected to the proposal(s) to solve the issue via pre-commit hooks.

Validating contents (such as line endings based on svn:eol-style) is a
perfect fit for hooks. Not normalising of course. :)

-- Brane

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.

Julian Foad wrote on Wed, 09 Oct 2019 15:13 +00:00:
> * [SVN-4065] Redefine as:
>    "Write a pre-commit hook to reject non-conforming eol-style=native."
>    Put it in 'tools/hook-scripts/'.
...
> The redefined SVN-4065 could be a suitable hackathon task.

Isn't writing `svnlook changed -t $TXN | cut -c 5- | while read -r line;
do if svnlook propget $REPOS svn:eol-style $line | grep -qx native &&
svnlook cat -t $TXN $REPOS $line | xxd -p -c 1 | grep -qxi 0d; then
exit 1; fi; done` is too small a task for a hackathon?

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Julian Foad <ju...@apache.org>.

I suggest there are three issues mixed up here:

* [SVN-4065] Redefine as:
   "Write a pre-commit hook to reject non-conforming eol-style=native."
   Put it in 'tools/hook-scripts/'.

* [new issue]
   Review/organize the set of recommended/suggested hooks.
   See 'tools/hook-scripts/'.  See what categories there are, e.g.
   enforcing standard svn client behaviours (such as eol-style=native).
   What can we do to make them easier to choose, configure, deploy?
   Should some be withdrawn?

* [new issue]
   Reconsider the idea of a configurable multi-hooks program.
   I think that is not as obvious as it might look in that
   old email thread and needs wider discussion.

The redefined SVN-4065 could be a suitable hackathon task.

- Julian

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Julian Foad <ju...@apache.org>.

Johan Corveleyn wrote:
> Julian Foad wrote:
>> Johan Corveleyn wrote:
>>> I think that was the conclusion from those threads. I.e. it would be
>>> best if we developed a standard "svnhooks" program that can be invoked
>>> from the pre-commit hook (and not try to implement this directly in
>>> the repos layer). At least, after those svnhooks suggestions no-one
>>> objected, so I assumed silent consensus about that way forward :-) ...
>>
>> Silence meant "errm... what?"
> 
> I disagree. On this list silence means "no objections". If someone
> wanted to say "errm... what?", they should have mailed it.

Sorry, I was too harsh there.

Nevertheless, there are hundreds of issues and mail threads that are 
incomplete but don't end with someone saying so.

My reading of the issue ended with the impression that there should be a 
standard supplied "hooks program", but with no definition of what its 
scope/purpose/role should be, so unclear what we should add to it, in 
what contexts the svn server would execute it (like, svnadmin load?), 
why it should be an external program, which server operators would 
reasonably want to replace this program with a different one in what 
kinds of situation, etc.

> In that very thread, 7 years ago, it seemed quite clear that consensus
> was gravitating towards "don't solve it in the repos layer, but in a
> pre-commit hook":
> * Daniel first posted a simple script for pre-commit hook validation:
> https://svn.haxx.se/dev/archive-2012-12/0180.shtml
[...]
> * Ivan suggested to create a separate program for it "svnhooks", with
> this concrete "rule" as a first use case for a more general tool:
> https://svn.haxx.se/dev/archive-2012-12/0202.shtml
> * Ivan specified his idea a bit further here:
> https://svn.haxx.se/dev/archive-2012-12/0217.shtml
> 
> During that entire discussion thread the only objections raised were
> about "enforcing it in the repos layer / server directly". No-one
> objected to the proposal(s) to solve the issue via pre-commit hooks.
> 
> Sound pretty consensus-y to me.

Thanks for summarizing the thread.  I had only skimmed through bits of 
it.  In particular I hadn't seen where Ivan said the program would 
implement hooks for "standard recommended polices", and I hadn't picked 
up on it looking so consensus-y.

>> That means we have to design it in such a way that only "client commits"
>> are affected, and "server tools" such as sump/load are not, because we
>> can't have them suddenly start failing to load existing repository data.
> 
> Yes, and hooks can do that. You're arriving at the same conclusion as
> the thread 7 years ago.
> 
>> Are "svnrdump" and "svnsync" client-layer or repos-layer tools, for
>> these purposes?  We don't yet have a formal way to distinguish and use
>> "repos-layer" tools through our client-server (RA) interface.  So at the
>> moment I suppose we might say that ("de facto") all interactions through
>> the RA interface are deemed to be client-layer interactions.  We could
>> consider an enhancement to enable "repos-layer" interactions to be
>> performed over RA with suitable authorization.  We probably ought to
>> file that as a future enhancement issue.
>>
>> Currently we have "svn commit" and other client-layer operations, vs.
>> "svnadmin load" as the main repos-layer (server side) interaction.
>>
>> "svnadmin load" has these options:
>>     --use-pre-commit-hook
>>     --use-post-commit-hook
>>     --normalize-props
>>     --bypass-prop-validation
>>
>> To me, this looks like a crude attempt to allow it to support both
>> client-layer and repos-layer modes, but with no overall mode switch so
>> the user has to think about the implications of each individual sub-switch.
>>
>> We never spelled out the role of commit hooks.  Therefore presumably
>> some commit hooks are used for client-side purposes (enforcing rules
>> like LF normalization and log message formats, etc.) and some for
>> server-side purposes (logging, backups, mirroring to svnsync, etc.).
>> The option to enable or disable all commit hooks seems misguided:
>> instead it would seem more appropriate to categorize hooks into client
>> and server roles, and have "svnadmin load" apply only those in the
>> server role.
>>
>> Is two roles of hooks something it makes sense to introduce?  Or could
>> we clarify the current situation, document it?
>>
>> It looks to me like "enforcement of the standard svn client's rules" is
>> an option we would ideally like to provide to server operators.  How
>> would this be defined more precisely?  How should we design it to cope
>> with different client versions, whose rules have changed a few times?
> 
> You're pulling this way out of scope.

I'm unpicking a load of unspoken assumptions.

> Issue SVN-4065 is very concrete
> and specific, and a concrete proposal was made on how best to solve
> it.
> 
> You're using it to open a much broader architectural discussion. Which
> is fine of course, and I think quite interesting. But that shouldn't
> block progress for SVN-4065 in the way which was proposed 7 years ago,
> and which still seems the best way (via a utility program that can be
> invoked from hooks, where our hooks still have the same semantics /
> confusions / shortcomings / ... as today).

OK, so a valid option is to treat this as "just another pre-commit 
hook".  If the suggestion was just to write and supply another hook 
script, it would be reasonable to ignore all that: we don't need to 
think about the system semantics, as it's just like any other pre-commit 
hook.

The talk about building the required hook functionality into a new 
compiled program led to me wondering more widely about the design.  If 
we're going to combine multiple "standard" hooks into one configurable 
program, that sort of thing does need to be discussed.

It seems to me that combining multiple hooks into one configurable 
program is orthogonal to the actual issue.  The only reason it came up, 
AFAICT, is because of possibly premature concern about efficiency.


> That being said, maybe that svnhooks utility program (Ivan's proposal)
> could be used to introduce a way to separate those different roles
> that hooks serve. Ivan hinted a bit in that direction in his post
> there. From https://svn.haxx.se/dev/archive-2012-12/0217.shtml:
> 
>> svnhooks configuration file has separate section for each policy. For example:
>> [[[
>> [check-eol-normalization]
>> enable = yes
>>
>> [rev-propchange]
>> enable = yes
>> users = svnsync
>>
>> [edit-log-message]
>> enable = yes
>> users = admin, @author
>> ]]]
> 
> I agree this should be thought through, with a good design (a "users"
> field might not cut it -- perhaps something else). It's clear that the
> proposal needs further design work.

Thanks.

I didn't mean to prevent a simple task from being done.

- Julian

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Johan Corveleyn <jc...@gmail.com>.

On Wed, Oct 9, 2019 at 11:41 AM Julian Foad <ju...@apache.org> wrote:
>
> Johan Corveleyn wrote:
> > I think that was the conclusion from those threads. I.e. it would be
> > best if we developed a standard "svnhooks" program that can be invoked
> > from the pre-commit hook (and not try to implement this directly in
> > the repos layer). At least, after those svnhooks suggestions no-one
> > objected, so I assumed silent consensus about that way forward :-) ...
>
> Silence meant "errm... what?"

I disagree. On this list silence means "no objections". If someone
wanted to say "errm... what?", they should have mailed it.

In that very thread, 7 years ago, it seemed quite clear that consensus
was gravitating towards "don't solve it in the repos layer, but in a
pre-commit hook":
* Daniel first posted a simple script for pre-commit hook validation:
https://svn.haxx.se/dev/archive-2012-12/0180.shtml
* Brane said the only sane place to do it was in pre-commit hook:
https://svn.haxx.se/dev/archive-2012-12/0182.shtml
* Brane also suggested to write it in C:
https://svn.haxx.se/dev/archive-2012-12/0191.shtml
* Ivan suggested to create a separate program for it "svnhooks", with
this concrete "rule" as a first use case for a more general tool:
https://svn.haxx.se/dev/archive-2012-12/0202.shtml
* Ivan specified his idea a bit further here:
https://svn.haxx.se/dev/archive-2012-12/0217.shtml

During that entire discussion thread the only objections raised were
about "enforcing it in the repos layer / server directly". No-one
objected to the proposal(s) to solve the issue via pre-commit hooks.

Sound pretty consensus-y to me.

...
> That means we have to design it in such a way that only "client commits"
> are affected, and "server tools" such as sump/load are not, because we
> can't have them suddenly start failing to load existing repository data.

Yes, and hooks can do that. You're arriving at the same conclusion as
the thread 7 years ago.

> Are "svnrdump" and "svnsync" client-layer or repos-layer tools, for
> these purposes?  We don't yet have a formal way to distinguish and use
> "repos-layer" tools through our client-server (RA) interface.  So at the
> moment I suppose we might say that ("de facto") all interactions through
> the RA interface are deemed to be client-layer interactions.  We could
> consider an enhancement to enable "repos-layer" interactions to be
> performed over RA with suitable authorization.  We probably ought to
> file that as a future enhancement issue.
>
> Currently we have "svn commit" and other client-layer operations, vs.
> "svnadmin load" as the main repos-layer (server side) interaction.
>
> "svnadmin load" has these options:
>    --use-pre-commit-hook
>    --use-post-commit-hook
>    --normalize-props
>    --bypass-prop-validation
>
> To me, this looks like a crude attempt to allow it to support both
> client-layer and repos-layer modes, but with no overall mode switch so
> the user has to think about the implications of each individual sub-switch.
>
> We never spelled out the role of commit hooks.  Therefore presumably
> some commit hooks are used for client-side purposes (enforcing rules
> like LF normalization and log message formats, etc.) and some for
> server-side purposes (logging, backups, mirroring to svnsync, etc.).
> The option to enable or disable all commit hooks seems misguided:
> instead it would seem more appropriate to categorize hooks into client
> and server roles, and have "svnadmin load" apply only those in the
> server role.
>
> Is two roles of hooks something it makes sense to introduce?  Or could
> we clarify the current situation, document it?
>
> It looks to me like "enforcement of the standard svn client's rules" is
> an option we would ideally like to provide to server operators.  How
> would this be defined more precisely?  How should we design it to cope
> with different client versions, whose rules have changed a few times?

You're pulling this way out of scope. Issue SVN-4065 is very concrete
and specific, and a concrete proposal was made on how best to solve
it.

You're using it to open a much broader architectural discussion. Which
is fine of course, and I think quite interesting. But that shouldn't
block progress for SVN-4065 in the way which was proposed 7 years ago,
and which still seems the best way (via a utility program that can be
invoked from hooks, where our hooks still have the same semantics /
confusions / shortcomings / ... as today).

That being said, maybe that svnhooks utility program (Ivan's proposal)
could be used to introduce a way to separate those different roles
that hooks serve. Ivan hinted a bit in that direction in his post
there. From https://svn.haxx.se/dev/archive-2012-12/0217.shtml:

> svnhooks configuration file has separate section for each policy. For example:
> [[[
> [check-eol-normalization]
> enable = yes
>
> [rev-propchange]
> enable = yes
> users = svnsync
>
> [edit-log-message]
> enable = yes
> users = admin, @author
> ]]]

I agree this should be thought through, with a good design (a "users"
field might not cut it -- perhaps something else). It's clear that the
proposal needs further design work.

-- 
Johan

Re: SVN-4065 - server should enforce LF normalization for svn:eol-style=native

Posted by Julian Foad <ju...@apache.org>.

Johan Corveleyn wrote:
> I think that was the conclusion from those threads. I.e. it would be
> best if we developed a standard "svnhooks" program that can be invoked
> from the pre-commit hook (and not try to implement this directly in
> the repos layer). At least, after those svnhooks suggestions no-one
> objected, so I assumed silent consensus about that way forward :-) ...

Silence meant "errm... what?"

> Not sure if this is a good bite-sized task for interested hackers though ...
> Though it's quite well-contained (develop a separate (small) program
> with a configuration file, depending on existing server-side API's),
> and we have a clear use case to start with.

Glad we're reviewing this old issue.

What do you envision the purpose/scope/role of this program should be? 
It's easy to suggest a facetious answer: "a collecting place for random 
bug fixes".  Of course not.

Seriously, the question that needs addressing for this issue is at what 
level the LF normalization is to be enforced in a Subversion system 
(deployment) in general.  So far it has been client-side, with the 
implication that different clients are expected but not forcibly 
required to co-operate.

The first implication of that is that clients should handle gracefully 
the case where repository data is not in fact normalized how they 
expected it.  That's one actionable conclusion.

If we want to change this to a repository-level rule, then that implies 
we are changing the repository semantics in a backward-incompatible way 
and we would need to address that (using a format number bump, and an 
upgrade/migration path).  We could discuss that path but I don't think 
anyone's currently pushing for that and I don't see good reason to do 
so, so let's leave that aside.

It seems we want to keep the design where this normalization is a 
client-side rule, but now we want to provide additional server-side 
enforcement of the client-side rule.  We must still in odd cases allow 
server tools (like dump/load) to continue working with repository 
content that doesn't obey that rule.

That means we have to design it in such a way that only "client commits" 
are affected, and "server tools" such as sump/load are not, because we 
can't have them suddenly start failing to load existing repository data.

Are "svnrdump" and "svnsync" client-layer or repos-layer tools, for 
these purposes?  We don't yet have a formal way to distinguish and use 
"repos-layer" tools through our client-server (RA) interface.  So at the 
moment I suppose we might say that ("de facto") all interactions through 
the RA interface are deemed to be client-layer interactions.  We could 
consider an enhancement to enable "repos-layer" interactions to be 
performed over RA with suitable authorization.  We probably ought to 
file that as a future enhancement issue.

Currently we have "svn commit" and other client-layer operations, vs. 
"svnadmin load" as the main repos-layer (server side) interaction.

"svnadmin load" has these options:
   --use-pre-commit-hook
   --use-post-commit-hook
   --normalize-props
   --bypass-prop-validation

To me, this looks like a crude attempt to allow it to support both 
client-layer and repos-layer modes, but with no overall mode switch so 
the user has to think about the implications of each individual sub-switch.

We never spelled out the role of commit hooks.  Therefore presumably 
some commit hooks are used for client-side purposes (enforcing rules 
like LF normalization and log message formats, etc.) and some for 
server-side purposes (logging, backups, mirroring to svnsync, etc.). 
The option to enable or disable all commit hooks seems misguided: 
instead it would seem more appropriate to categorize hooks into client 
and server roles, and have "svnadmin load" apply only those in the 
server role.

Is two roles of hooks something it makes sense to introduce?  Or could 
we clarify the current situation, document it?

It looks to me like "enforcement of the standard svn client's rules" is 
an option we would ideally like to provide to server operators.  How 
would this be defined more precisely?  How should we design it to cope 
with different client versions, whose rules have changed a few times?

- Julian