You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2013/03/26 14:36:15 UTC

Filenames with trailing newlines wreak havoc

When a file with a trailing newline (ASCII 0x0a) has made its way into
the repository, there are various areas of the system that exhibit failure
modes. Failure modes I observed in one particular case are discussed below.

I intend to file separate issues for each failure mode to keep track of
them, rather than filing one giant issue for them all. Though perhaps
a meta-issue that depends on the other issues would be useful.

In the case observed there was a file, added in a revision rX, which had 
a trailing newline in its name. The file was added along with its parent
directory (which was also new in rX) and alongside several other files
in the same directory which did not have trailing newlines in their names.

The server was running Subversion 1.7.8, but might have been running
an older release at the time when rX was committed.
All commands listed below were run with Subversion 1.7.8.

Immediately visible problems were:

svnadmin verify resulted in an error for the revision: 'svnadmin: E160013: File not found: ...')

svnsync failed to sync the revision ('svnsync: E160013: File not found: ...')

However, the directory at rX could be listed (with svn ls) and checked
out successfully.

During checkout, RA layers were behaving inconsistently.

Checkout via ra_svn and ra_local resulted in the expected filename
with \n (0x0a) as the last byte both in wc.db (local_relpath NODES
column) and on disk.

However, ra_neon (didn't try serf) checked the file out with a trailing
space (0x20) instead of a trailing newline. The filename was stored with
a trailing 0x20 in wc.db as well as on disk.

Examining the FSFS revision file [*] of rX showed that the filename was
stored with a newline as part of its parent directory's representation.

  [*] For those who didn't know, the FSFS revision file format is documented
      here: https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure

Let's call the parent directory 'A', and the file in question 'alpha\n'.
A file named 'alpha\n' added in r1 might appear like this in its parent
directory's representation (i.e. the rep of 'A'):
============
PLAIN
K 6
alpha

V 17
file 2-1.0.r1/122
============

However, the cpath: field within the file's own representation did not
contain the trailing newline.

Also, the file was listed in the changed-paths data section without
the trailing newline. Note that the FSFS format docs say "The changed-path
data is represented as a series of changed-path items, each consisting of
two lines." -- i.e. a trailing newline is considered to be part of revision
file data, rather than the filename.

Furthermore, several children of the newly-added-in rX 'A' directory
were missing from the changed-paths list in the revision file, for an
unkown reason.
More precisely, children listed after 'alpha\n' in the rep of A were
missing from the changed paths list. This was not detected by 'svnadmin
verify' (which, in my opinion, is a bug) but was detected by 'svnsync'
when a later revision rY referred to one of the missing children in
copyfrom information. It turned out that the slave server was missing
any children of A not listed in the changed-paths list in the rX
revision file on the master server. What kind of consistency checks
could prevent this from happening remains to be determined. Perhaps
'svnadmin verify' could cross-check the changed paths list with node-rev
IDs which appear in the revision's directory listings?

The revision rX could be fixed by removing the trailing newline from the
filename in the parent directory's representation, and adjusting the
rep's length and checksum information accordingly. About two dozen other
revisions had to be adjusted as well because they listed the file
'alpha\n' as part of the 'A' directory rep (which is of course written
out in its entirety whenever a revision changes some child of 'A').

Also, the changed-paths list of rX had to be extended to add the missing
newly added children of A. Other revisions had to be checked for similar
omissions in the changed-paths list.

How the filename entered the repository is not known with absolute
certainty. The current theory, based on discussion with the revision's
author, is that the file was checked in using Eclipse with SVNKit 1.6
and was not refused by the Subversion server at the time.

Attempting the reproduce this problem with a stock Subversion client
in a naive way fails because libsvn_client refuses to add such files
to version control in the first place:

subversion/svn/add-cmd.c:86: (apr_err=160005)
subversion/svn/util.c:621: (apr_err=160005)
subversion/libsvn_client/add.c:1031: (apr_err=160005)
subversion/libsvn_client/add.c:937: (apr_err=160005)
subversion/libsvn_client/add.c:320: (apr_err=160005)
subversion/libsvn_wc/adm_ops.c:1004: (apr_err=160005)
subversion/libsvn_wc/adm_ops.c:679: (apr_err=160005)
subversion/libsvn_subr/path.c:1231: (apr_err=160005)
svn: E160005: Invalid control character '0x0a' in path '/tmp/wc/alpha\012'

This error does not seem to be triggered by our test suite, which
is bad because it would have allowed the SVNKit developers to prevent
this problem from happening (they use our test suite to test their
Java-based implementation of Subversion). There is a tab_test() in
commit_tests.py, but that test only tries a tab, not a newline character.

We should add some C tests as well to verify API behaviour at the
client layer and at the repos layer.

Given the ripple effects of this problem in FSFS revision files I think
we should ensure that the Subversion server blocks such filenames from
entering the repository (any repository, FSFS and BDB). It seems FSFS format
changes would be required to support filenames with trailing newlines
properly, an effort which isn't worth the gain in my opinion.

And given the ripple effects seen in areas such as repository verification,
svnsync, and ra_neon, I don't think we can afford to call this a supported
use case until all components of the system have been fixed to handle
filenames with trailing newlines properly.

Comments and opinions are very welcome. I intend to make sure that this
problem cannot happen with future Subversion releases, and welcome any
input and help I can get.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ben Reser wrote on Tue, Mar 26, 2013 at 17:12:33 -0700:
> On Tue, Mar 26, 2013 at 4:39 PM, Ben Reser <be...@reser.org> wrote:
> > However, this pre-commit script (which requires the Python bindings)
> > should catch all the variations.  It also checks for other control
> > characters, but you can easily change that by changing the
> > control_chars set at the top of the file.
> 
> Slight tweaks to fix some minor nits with the previous pre-commit script.

Thanks Ben!  I went ahead and committed it as r1461389 (and might tweak
it a bit now).

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 4:39 PM, Ben Reser <be...@reser.org> wrote:
> However, this pre-commit script (which requires the Python bindings)
> should catch all the variations.  It also checks for other control
> characters, but you can easily change that by changing the
> control_chars set at the top of the file.

Slight tweaks to fix some minor nits with the previous pre-commit script.

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 12:47 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> We've noted on IRC that this might fail if /foo and /foo\n get modified
> in the same revision.

Actually at least with FSFS the commit will fail in that case because
FSFS will believe you tried to add the same file twice.

However, this pre-commit script (which requires the Python bindings)
should catch all the variations.  It also checks for other control
characters, but you can easily change that by changing the
control_chars set at the top of the file.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ben Reser wrote on Tue, Mar 26, 2013 at 12:03:43 -0700:
> On Tue, Mar 26, 2013 at 9:29 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > Fair enough.  Infra would be interested in a pre-commit hook script that
> > checks for control characters in filenames and aborts the transaction.
> 
> Use validate-files.py in the trunk tools/hook-scripts with a conf file like so:
> 

We've noted on IRC that this might fail if /foo and /foo\n get modified
in the same revision.

> [repositories]
> * = newlines
> 
> # Running any rule against a file with a newline in it will fail
> because we break the svnlook output by newlines.
> # So the filename will appear as non-existent.  Hook will fail with an
> error about the file not existing in the transaction and the commit
> will fail.
> # It's not pretty but it stops the problem.
> [rule:newlines]
> pattern = *
> command = 'exit 0'

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 9:29 AM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> Fair enough.  Infra would be interested in a pre-commit hook script that
> checks for control characters in filenames and aborts the transaction.

Use validate-files.py in the trunk tools/hook-scripts with a conf file like so:

[repositories]
* = newlines

# Running any rule against a file with a newline in it will fail
because we break the svnlook output by newlines.
# So the filename will appear as non-existent.  Hook will fail with an
error about the file not existing in the transaction and the commit
will fail.
# It's not pretty but it stops the problem.
[rule:newlines]
pattern = *
command = 'exit 0'

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Tue, Mar 26, 2013 at 17:27:00 +0100:
> On Tue, Mar 26, 2013 at 06:10:35PM +0200, Daniel Shahaf wrote:
> > I'd call it a DoS if you can commit such a file and can't later 'svn rm
> > URL' it.
> 
> You cannot 'svnsync' the repository anymore, even if you rm the URL.
> So you can DoS a master/slave setup and force the slave to be
> out-of-date with no way to catch up until the revision is obliterated
> or repaired.

Fair enough.  Infra would be interested in a pre-commit hook script that
checks for control characters in filenames and aborts the transaction.

:-)

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Mar 26, 2013 at 06:10:35PM +0200, Daniel Shahaf wrote:
> I'd call it a DoS if you can commit such a file and can't later 'svn rm
> URL' it.

You cannot 'svnsync' the repository anymore, even if you rm the URL.
So you can DoS a master/slave setup and force the slave to be
out-of-date with no way to catch up until the revision is obliterated
or repaired.

> Or we could forbid newlines in pathnames.  The only problem with that is
> that the 1.0 API promised they would work... but if no one uses that,
> I'd be fine with calling it an erratum and disallowing it henceforth.

I think we should disallow it. There are too many unforeseen side effects.
I'm not even sure if I've already seen all failure modes that can result
from this.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Wed, Mar 27, 2013 at 16:03:32 +0100:
> On Wed, Mar 27, 2013 at 10:52:04AM -0400, C. Michael Pilato wrote:
> > > I think we'll have to block these paths at the FS layer.
> > 
> > If FSFS is fundamentally unable to support these types of paths, then sure,
> > let's go ahead and protect against the failure at that level.  But please
> > don't overreach here -- block only the paths that FSFS simply cannot deal
> > with.  There have been other tools built atop the FS layer in the past
> > (wikis, etc.) and could be in the future -- this is, after all, why we have
> > distinct FS and repos APIs -- and we shouldn't be artificially limiting what
> > folks can do with the that API.
> 
> That's fine. The fix I've committed and proposed for backport applies
> the large hammer and blocks any control characters. If there's a case
> to be made for relaxing this check I'm happy to do that.
> 
> But I think that if we do so, we also need to find a way to make the repos
> layer perform this check as well. Subversion clients have always refused
> paths which contain control characters, and it makes sense to have
> consistent checks at the client and server level, as far as Subversion
> is concerned.
> 
> Perhaps my idea of using repos_commit_txn() was bad because paths have
> already been handed off to the FS layer at that point. It should be
> possible to add the check elsewhere in the repos layer, before the FS
> gets involved.
> 

In the repos commit editor?

> > As an aside, though, we keep talking about paths with trailing newlines.  Is
> > it only *trailing* newlines that cause the problem here?  I would think that
> > newlines appearing anywhere in the path could be problematic in at least
> > some of the same places we've discussed on this thread thus far.
> 
> It's just that I've observed the trailing newlines problem in the wild.
> And it's indeed a particularly nasty case since 'svnadmin verify' doesn't
> complain about it as much as I believe it would when faced with a newline
> somewhere in the middle of a filename (the cpath: and the changed-paths
> list would also be corrupt in that case, but aren't with a trailing \n).

That's surprising.  The changed-paths list is sensitive to whitespace, a
trailing \n should cause 3 \n's rather than 2 and I would expect the
parser to choke on that.

In the cpath case it's the next-to-last header so copyfrom:, copyroot:,
is-fresh-txn-root:, and minfo-cnt: would be missing.  I guess their
absence doesn't affect verification.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Wed, Mar 27, 2013 at 16:39:47 +0100:
> What about:
> 
>  - We make the FS layer check for newlines (because we know FSFS can't cope)
> 

I think the newlines check should be in FSFS, not in libsvn_fs.

>  - We make the repos layer check for control characters (because the

+1

>    Subversion clients does, too, and we had problems with third-party
>    stuff like git-svn committing stuff which svn clients couldn't deal
>    with in the past)
> 
> That's some additional work, but I'm willing to do it.

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sun, Apr 14, 2013 at 11:04 AM, Stefan Sperling <st...@elego.de> wrote:

> On Sun, Apr 14, 2013 at 01:14:58AM +0200, Johan Corveleyn wrote:
> > The fact that the checks at both layers are different means that you
> > can still 'svnadmin load' a repos with filenames with (non-LF) control
> > characters, but you can't svnsync it anymore.
> >
> > Not sure if that's important. Just making an observation ...
>
> Well, this is exactly the sort of thing why Ben, myself, and others have
> been saying all along that we shouldn't give the FS layer (or BDB/FSFS
> layers) any special significance in making decisions about which characters
> Subversion considers legal in filenames. If all layers were consistent,
> there wouldn't be any such problem.
>
> Alas, the current state of things is closest to what could be called
> consensus.
>

As of r1467768, full UTF-8 file name support is a TODO for fsfs-format7.
It's easy to implement, we just need to piggy-back this to a format bump.

-- Stefan^2.

-- 
*Join one of our free daily demo sessions on* *Scaling Subversion for the
Enterprise <http://www.wandisco.com/training/webinars>*
*

*

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Sun, Apr 14, 2013 at 01:14:58AM +0200, Johan Corveleyn wrote:
> The fact that the checks at both layers are different means that you
> can still 'svnadmin load' a repos with filenames with (non-LF) control
> characters, but you can't svnsync it anymore.
> 
> Not sure if that's important. Just making an observation ...

Well, this is exactly the sort of thing why Ben, myself, and others have
been saying all along that we shouldn't give the FS layer (or BDB/FSFS
layers) any special significance in making decisions about which characters
Subversion considers legal in filenames. If all layers were consistent,
there wouldn't be any such problem.

Alas, the current state of things is closest to what could be called
consensus.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Johan Corveleyn wrote on Sun, Apr 14, 2013 at 01:14:58 +0200:
> On Tue, Apr 9, 2013 at 3:08 PM, Stefan Sperling <st...@elego.de> wrote:
> > On Wed, Apr 03, 2013 at 01:04:43PM +0200, Stefan Sperling wrote:
> >> To refresh our collective memory, the current trunk code does:
> >>  - prevent newlines from entering filenames at the fsfs layer
> >>  - prevent control characters from entering filenames at the repos layer
> >
> > I have nominated backports of this behaviour to 1.7.x and 1.6.x.
> > Thanks everyone for the very productive discussion about this issue!
> 
> Daniel said this on the users list [1], in relation to
> (syncing/dumping-loading) non-LF svn:log props:
> 
> "The LF check happens in the repos layer, but 'svnadmin load' uses the FS
> API directly so it doesn't trip that check.  svnsync can't do this ..."
> 
> Which made me wonder about the implications of this new filename
> checking code in the repos layer and fsfs layer on 'svnsync' on the
> one hand, and on 'svnadmin load' on the other hand.
> 

FSFS \n pathnames never worked and don't work, this is just a "Fail fast
due to a known implementation bug" case.  The repos commit editor patch
is a different story....

Daniel

> The fact that the checks at both layers are different means that you
> can still 'svnadmin load' a repos with filenames with (non-LF) control
> characters, but you can't svnsync it anymore.
> 
> Not sure if that's important. Just making an observation ...
> 
> [1] http://svn.haxx.se/users/archive-2013-04/0106.shtml
> 
> --
> Johan

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 04/03/2013 10:25 AM, Stefan Sperling wrote:
> Well, the argument goes like this:
> 
> Subversion cannot handle \n in filenames at many levels.
> 
> It was found that the FSFS layer cannot handle it without repository
> corruption, and it was pointed out that e.g. the unidiff format and the
> Subversion dump file format have problems with \n in filenames, too.
> 
> So preventing filenames with \n in Subversion in general seems to be
> a good thing to do. Since only FSFS currently rejects \n, BDB users
> are left unprotected from harmful effects of \n outside the filesystem
> layer unless we make the BDB layer reject such paths as well (which you
> don't agree with) or make the repos layer reject such paths (which it
> does as of r1461760 and which we should backport IMO).
> 
> Next, others have pointed out that similar issues can be caused by
> other control characters (such as \r) on some platforms, and also
> that libsvn_client has always been rejecting paths which contain any
> control characters. Consistency is a good thing, and making Subversion
> servers forbid control characters in filenames outright addresses our
> concerns about \n as well.
> 
> Does that make sense?

Yup.  I'll upgrade my response for this portion to +1.  Thanks for making
such a clear and well-articulated argument!

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Apr 03, 2013 at 09:51:38AM -0400, C. Michael Pilato wrote:
> On 04/03/2013 07:04 AM, Stefan Sperling wrote:
> >  - prevent control characters from entering filenames at the repos layer
> 
> +0  (It's a bit overly conservative, probably, but at this layer of the API
> I'm less grumpy about that fact.)

Well, the argument goes like this:

Subversion cannot handle \n in filenames at many levels.

It was found that the FSFS layer cannot handle it without repository
corruption, and it was pointed out that e.g. the unidiff format and the
Subversion dump file format have problems with \n in filenames, too.

So preventing filenames with \n in Subversion in general seems to be
a good thing to do. Since only FSFS currently rejects \n, BDB users
are left unprotected from harmful effects of \n outside the filesystem
layer unless we make the BDB layer reject such paths as well (which you
don't agree with) or make the repos layer reject such paths (which it
does as of r1461760 and which we should backport IMO).

Next, others have pointed out that similar issues can be caused by
other control characters (such as \r) on some platforms, and also
that libsvn_client has always been rejecting paths which contain any
control characters. Consistency is a good thing, and making Subversion
servers forbid control characters in filenames outright addresses our
concerns about \n as well.

Does that make sense?

FWIW, I'm unhappy with the fact that we keep pretending the FS API was
somehow special while today, for all practical purposes, it is not.
I agree with Ben that the FS API should be considered an integral part
of Subversion and should mirror constraints we build into the system
in a consistent way because that makes Subversion easier to maintain
in the long run and might prevent some classes of security and data
corruption issues (such as the one that started this thread).
But I can live with the current solution and respect the goals the FS API
designers envisioned, and which the FSFS design failed to deliver, as
long as the repos layer ensures that none of our users will run into
nasty problems with control characters in filenames in the future.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <da...@elego.de>.
Stefan Sperling wrote on Wed, Apr 03, 2013 at 13:04:43 +0200:
> To refresh our collective memory, the current trunk code does:
>  - prevent newlines from entering filenames at the fsfs layer

+1

>  - prevent control characters from entering filenames at the repos layer

+0

Re: Filenames with trailing newlines wreak havoc

Posted by Johan Corveleyn <jc...@gmail.com>.
On Tue, Apr 9, 2013 at 3:08 PM, Stefan Sperling <st...@elego.de> wrote:
> On Wed, Apr 03, 2013 at 01:04:43PM +0200, Stefan Sperling wrote:
>> To refresh our collective memory, the current trunk code does:
>>  - prevent newlines from entering filenames at the fsfs layer
>>  - prevent control characters from entering filenames at the repos layer
>
> I have nominated backports of this behaviour to 1.7.x and 1.6.x.
> Thanks everyone for the very productive discussion about this issue!

Daniel said this on the users list [1], in relation to
(syncing/dumping-loading) non-LF svn:log props:

"The LF check happens in the repos layer, but 'svnadmin load' uses the FS
API directly so it doesn't trip that check.  svnsync can't do this ..."

Which made me wonder about the implications of this new filename
checking code in the repos layer and fsfs layer on 'svnsync' on the
one hand, and on 'svnadmin load' on the other hand.

The fact that the checks at both layers are different means that you
can still 'svnadmin load' a repos with filenames with (non-LF) control
characters, but you can't svnsync it anymore.

Not sure if that's important. Just making an observation ...

[1] http://svn.haxx.se/users/archive-2013-04/0106.shtml

--
Johan

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Apr 03, 2013 at 01:04:43PM +0200, Stefan Sperling wrote:
> To refresh our collective memory, the current trunk code does:
>  - prevent newlines from entering filenames at the fsfs layer
>  - prevent control characters from entering filenames at the repos layer

I have nominated backports of this behaviour to 1.7.x and 1.6.x.
Thanks everyone for the very productive discussion about this issue!

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 04/03/2013 07:04 AM, Stefan Sperling wrote:
> To refresh our collective memory, the current trunk code does:
>  - prevent newlines from entering filenames at the fsfs layer

+1

>  - prevent control characters from entering filenames at the repos layer

+0  (It's a bit overly conservative, probably, but at this layer of the API
I'm less grumpy about that fact.)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Wed, Apr 3, 2013 at 4:04 AM, Stefan Sperling <st...@elego.de> wrote:
> To refresh our collective memory, the current trunk code does:
>  - prevent newlines from entering filenames at the fsfs layer
>  - prevent control characters from entering filenames at the repos layer

+1, I'd prefer that the fs layer was consistent.  But I see no reason
not to move forward just because we haven't moved far enough forward
in my opinion.

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 07:19:07PM +0100, Stefan Sperling wrote:
> So I've implemented this proposal in r1461701 and r1461760, and now
> we have code for both approaches in our repository history.
> I'll let everyone else discuss which approach is better and then
> do whatever needs to be done to get a fix shipped.

So it seems this discussion has petered out, so no consensus was found.
Can we please come to an agreement? Which approach should be used?

If I don't hear anything I'll prepare a backport of the current trunk
code for 1.7.x and 1.6.x (since this is a data corruption bug).

But I don't want to spend extra time on nominating things that will
end up being reverted, so I'll ask again: 
Is the current trunk code good enough for everyone?

To refresh our collective memory, the current trunk code does:
 - prevent newlines from entering filenames at the fsfs layer
 - prevent control characters from entering filenames at the repos layer

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 01:09:00PM -0400, C. Michael Pilato wrote:
> I don't spot any run-time checks for the disallowed '.' and '..' components.

They are in svn_fs__path_valid(), libsvn_fs/fs-loader.c.

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 10:01:01AM -0700, Ben Reser wrote:
> > Something like the proposal above, to reject LF only at the FS (or FSFS)
> >  layer, and all control characters in libsvn_repos, sounds good to me.
> > We should write unit tests to ensure that the FS layer works
> > properly with all other control characters.
> 
> I really don't like this proposal.  I think a major reason why we're
> in this situation is that we've been inconsistent.

FWIW, I agree that there will be few, if any, practical implications
from adding stricter verification to the FS layer. But I don't really
care which layer does the validation, as long as it is being done
somewhere on the server-side.

So I've implemented this proposal in r1461701 and r1461760, and now
we have code for both approaches in our repository history.
I'll let everyone else discuss which approach is better and then
do whatever needs to be done to get a fix shipped.

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Wed, Mar 27, 2013 at 9:12 AM, Julian Foad <ju...@btopenworld.com> wrote:
> So there is a compromise between the theoretical
> independence which allows FS to have a different definition of valid
> filenames, and the practical issue that if we actually do make it
> different from the repos layer's definition then it's more work to
> maintain.

I think we need to recognize that using libsvn_fs alone is entirely
theoretical.

* How many people are really going to create a dependency of the
entire Subversion setup in order to just use libsvn_fs?  We don't ship
it separately.  It's designed around our needs.
* If you are inclined to use libsvn_fs then you're inclined to use the
same limitations that our client library has because you're probably
wanting to interoperate with normal SVN clients.  Great examples of
this are svk and git-svn.  Both of which talk our wire protocols (svk
actually using our repos and ra layers) without using our client or wc
library or formats at all.

If people really were running around putting control characters in
their repository file names then I don't think it would have taken
this long for this issue to come up.  FSFS has been the primary
filesystem for how long now?  If there were real active use cases like
the argument that the FS is more liberal than our other layers then I
firmly believe we'd have run into this before.

> Something like the proposal above, to reject LF only at the FS (or FSFS)
>  layer, and all control characters in libsvn_repos, sounds good to me.
> We should write unit tests to ensure that the FS layer works
> properly with all other control characters.

I really don't like this proposal.  I think a major reason why we're
in this situation is that we've been inconsistent.  Why does fsfs not
handle newlines in filenames?  We could have just as easily used some
other delimiter, or had an encoding to allow entirely arbitrary node
names.  The answer in my opinion is that we thought control characters
were not allowed because the client library rejected them.  If we were
building a new file system right now without this having come up we
might even make the same mistake.  There is a huge benefit of
consistency.  I wish the code reuse argument was realistic, but it's
not.  So I think the bigger benefit here is to be consistent.

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/27/2013 12:12 PM, Julian Foad wrote:
> My thoughts:
> 
> The key issue is we need to define 'valid path' for the FS layer, and
> follow through with documenting it, implementing run-time checks, and
> testing it.

The definition of a 'valid path' for the FS layer has existed since long
before 1.0:  UTF8-encoded, no NUL characters, '/' as component separator, no
'.' or '..' components, and empty components treated as if they weren't
there at all.  And yes, it's documented in the same place it has been
for(effectively)ever -- svn_fs.h.

I don't spot any run-time checks for the disallowed '.' and '..' components.
 And I'm pretty sure we don't explicitly test our path support at the
requisite level (libsvn_fs C tests).  Those things should be remedied.

> Something like the proposal above, to reject LF only at the FS (or FSFS) 
> layer, and all control characters in libsvn_repos, sounds good to me. We
> should write unit tests to ensure that the FS layer works properly with
> all other control characters.

Agreed.

This is, however, not a 1.8.0 blocker, IMO  By its very design in this
areas, FSFS in Subversion 1.1 already introduced a
regression-turned-incompatibility versus Subversion 1.0.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Julian Foad <ju...@btopenworld.com>.
Stefan Sperling wrote:

> C. Michael Pilato wrote:
>>  On 03/27/2013 11:03 AM, Stefan Sperling wrote:
>>  > That's fine. The fix I've committed and proposed for backport applies
>>  > the large hammer and blocks any control characters. If there's a case
>>  > to be made for relaxing this check I'm happy to do that.
>> 
>>  Sorry, but I'm -1 on that change.  How did this expand from "trailing
>>  newlines" to "any control character"?
> 
> An IRC discussion with Ben:
> http://colabti.org/irclogger/irclogger_log/svn-dev?date=2013-03-26#l479
> Who said:
>   "I'm not inclined to spend a bunch of time thinking through all the
>    implications of the control characters that we already don't allow
>    in the client lib."
> 
> I agree there. I also didn't want to test all control characters
> to see if FSFS can handle them. It seems safer to reject them outright.
> 
>>  Look, the FS allows things that the repos layer does not.  For example, it
>>  doesn't care about property encoding or formats, where the repository layer
>>  does.  *That's by design.*
> 
> Sure, but then we need an implementation that conforms to that design.
> We clearly don't have one, else it wouldn't choke on newlines.
> Fixing the implementation is going to be hard and we aren't even
> sure if any user of the API really cares.
> 
>>  It's perfectly okay for us to tell folks that if
>>  they modify their FS with something other than Subversion and end up with
>>  data that Subversion doesn't like, that's their own problem.
> 
> But (at least) with newlines, they can end up with data that the filesystem
> implementation itself doesn't like. So the API becomes entirely useless
> to them if they want to store filenames which contain newlines.
> 
> What about:
> 
> - We make the FS layer check for newlines (because we know FSFS can't cope)
> 
> - We make the repos layer check for control characters (because the
>    Subversion clients does, too, and we had problems with third-party
>    stuff like git-svn committing stuff which svn clients couldn't deal
>    with in the past)
> 
> That's some additional work, but I'm willing to do it.

My thoughts:

The key issue is we need to define 'valid path' for the FS layer, and follow through with documenting it, implementing run-time checks, and testing it.  If the domain of valid FS paths is greater than what Subversion's higher layers use, then 
we need unit testing at the FS layer to ensure we're 
supporting what we claim.  If however, we decide to use the same 
definition as in the layer above, then we don't need to do so much
 unit testing of this layer, because testing through the layer above 
should mostly suffice.  So there is a compromise between the theoretical
 independence which allows FS to have a different definition of valid 
filenames, and the practical issue that if we actually do make it 
different from the repos layer's definition then it's more work to 
maintain.

If we now decide that LF isn't supported, I will 
ask, what about CR?  That seems the next-most-likely to cause problems. 
 I can almost guarantee that CR will break the repos layer (I'm thinking
 about passing a list of paths to a hook script, on Windows), but it 
might be acceptable at the FS layer.  How are we going to decide whether LF is the only control character that we need to forbid?  Partly by 
analysis / thought / knowledge, and partly by testing, I hope.  The alternative is we decide that the benefit to the Subversion community of allowing other control characters in the FS layer is too small compared with the risks and effort, and so define that it shall no longer support any control characters.

As for "trailing newline", that's one specific problem that occurred in practice.  A test for that 
case is a *regression test*.  But it's plain that a newline anywhere in a
 filename is going to cause problems in FSFS, so the new rule will (at least) forbid 
newlines anywhere in a path.  Along with implementing a check for the new behaviour, we should be adding a new *unit 
test* to verify the new behaviour, whatever that behaviour shall be.  Since that unit test will clearly 
cover the "trailing newline" case, we don't really need the regression 
test as well.  We would do well to take more care in distinguishing unit
 tests from regression tests.

Something like the proposal above, to reject LF only at the FS (or FSFS)
 layer, and all control characters in libsvn_repos, sounds good to me.  
We should write unit tests to ensure that the FS layer works 
properly with all other control characters.

- Julian

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 11:20:54AM -0400, C. Michael Pilato wrote:
> On 03/27/2013 11:03 AM, Stefan Sperling wrote:
> > That's fine. The fix I've committed and proposed for backport applies
> > the large hammer and blocks any control characters. If there's a case
> > to be made for relaxing this check I'm happy to do that.
> 
> Sorry, but I'm -1 on that change.  How did this expand from "trailing
> newlines" to "any control character"?

An IRC discussion with Ben:
http://colabti.org/irclogger/irclogger_log/svn-dev?date=2013-03-26#l479
Who said:
  "I'm not inclined to spend a bunch of time thinking through all the
   implications of the control characters that we already don't allow
   in the client lib."

I agree there. I also didn't want to test all control characters
to see if FSFS can handle them. It seems safer to reject them outright.

> Look, the FS allows things that the repos layer does not.  For example, it
> doesn't care about property encoding or formats, where the repository layer
> does.  *That's by design.*

Sure, but then we need an implementation that conforms to that design.
We clearly don't have one, else it wouldn't choke on newlines.
Fixing the implementation is going to be hard and we aren't even
sure if any user of the API really cares.

> It's perfectly okay for us to tell folks that if
> they modify their FS with something other than Subversion and end up with
> data that Subversion doesn't like, that's their own problem.

But (at least) with newlines, they can end up with data that the filesystem
implementation itself doesn't like. So the API becomes entirely useless
to them if they want to store filenames which contain newlines.

What about:

 - We make the FS layer check for newlines (because we know FSFS can't cope)

 - We make the repos layer check for control characters (because the
   Subversion clients does, too, and we had problems with third-party
   stuff like git-svn committing stuff which svn clients couldn't deal
   with in the past)

That's some additional work, but I'm willing to do it.

Re: Filenames with trailing newlines wreak havoc

Posted by Branko Čibej <br...@wandisco.com>.
On 27.03.2013 16:24, Ivan Zhakov wrote:
> On Wed, Mar 27, 2013 at 7:20 PM, C. Michael Pilato <cm...@collab.net> wrote:
>> On 03/27/2013 11:03 AM, Stefan Sperling wrote:
>>> On Wed, Mar 27, 2013 at 10:52:04AM -0400, C. Michael Pilato wrote:
>>>>> I think we'll have to block these paths at the FS layer.
>>>> If FSFS is fundamentally unable to support these types of paths, then sure,
>>>> let's go ahead and protect against the failure at that level.  But please
>>>> don't overreach here -- block only the paths that FSFS simply cannot deal
>>>> with.  There have been other tools built atop the FS layer in the past
>>>> (wikis, etc.) and could be in the future -- this is, after all, why we have
>>>> distinct FS and repos APIs -- and we shouldn't be artificially limiting what
>>>> folks can do with the that API.
>>> That's fine. The fix I've committed and proposed for backport applies
>>> the large hammer and blocks any control characters. If there's a case
>>> to be made for relaxing this check I'm happy to do that.
>> Sorry, but I'm -1 on that change.  How did this expand from "trailing
>> newlines" to "any control character"?
>>
> I'm with Mike here.

<aol/>


-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com


Re: Filenames with trailing newlines wreak havoc

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Wed, Mar 27, 2013 at 7:20 PM, C. Michael Pilato <cm...@collab.net> wrote:
> On 03/27/2013 11:03 AM, Stefan Sperling wrote:
>> On Wed, Mar 27, 2013 at 10:52:04AM -0400, C. Michael Pilato wrote:
>>>> I think we'll have to block these paths at the FS layer.
>>>
>>> If FSFS is fundamentally unable to support these types of paths, then sure,
>>> let's go ahead and protect against the failure at that level.  But please
>>> don't overreach here -- block only the paths that FSFS simply cannot deal
>>> with.  There have been other tools built atop the FS layer in the past
>>> (wikis, etc.) and could be in the future -- this is, after all, why we have
>>> distinct FS and repos APIs -- and we shouldn't be artificially limiting what
>>> folks can do with the that API.
>>
>> That's fine. The fix I've committed and proposed for backport applies
>> the large hammer and blocks any control characters. If there's a case
>> to be made for relaxing this check I'm happy to do that.
>
> Sorry, but I'm -1 on that change.  How did this expand from "trailing
> newlines" to "any control character"?
>
I'm with Mike here. Another problem that you used svn_path_check() API
to validate incoming name. But svn_path_check() is to validate disk
file names, not FS filenames.

-- 
Ivan Zhakov

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/27/2013 11:03 AM, Stefan Sperling wrote:
> On Wed, Mar 27, 2013 at 10:52:04AM -0400, C. Michael Pilato wrote:
>>> I think we'll have to block these paths at the FS layer.
>>
>> If FSFS is fundamentally unable to support these types of paths, then sure,
>> let's go ahead and protect against the failure at that level.  But please
>> don't overreach here -- block only the paths that FSFS simply cannot deal
>> with.  There have been other tools built atop the FS layer in the past
>> (wikis, etc.) and could be in the future -- this is, after all, why we have
>> distinct FS and repos APIs -- and we shouldn't be artificially limiting what
>> folks can do with the that API.
> 
> That's fine. The fix I've committed and proposed for backport applies
> the large hammer and blocks any control characters. If there's a case
> to be made for relaxing this check I'm happy to do that.

Sorry, but I'm -1 on that change.  How did this expand from "trailing
newlines" to "any control character"?

Look, the FS allows things that the repos layer does not.  For example, it
doesn't care about property encoding or formats, where the repository layer
does.  *That's by design.*  It's perfectly okay for us to tell folks that if
they modify their FS with something other than Subversion and end up with
data that Subversion doesn't like, that's their own problem.  But we already
catch flak as a community from time to time for the limitations in our
system forced upon us because of implementation decisions we've made (no
control characters because XML is unhappy; no colons in certain properties
because WebDAV can't hack it; etc.)

I get that we want Subversion (as a version control system, end-to-end) to
work.  But the FS layer should not inherit superficial limitations simply
because enforcing them in the right place is too hard.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 10:52:04AM -0400, C. Michael Pilato wrote:
> > I think we'll have to block these paths at the FS layer.
> 
> If FSFS is fundamentally unable to support these types of paths, then sure,
> let's go ahead and protect against the failure at that level.  But please
> don't overreach here -- block only the paths that FSFS simply cannot deal
> with.  There have been other tools built atop the FS layer in the past
> (wikis, etc.) and could be in the future -- this is, after all, why we have
> distinct FS and repos APIs -- and we shouldn't be artificially limiting what
> folks can do with the that API.

That's fine. The fix I've committed and proposed for backport applies
the large hammer and blocks any control characters. If there's a case
to be made for relaxing this check I'm happy to do that.

But I think that if we do so, we also need to find a way to make the repos
layer perform this check as well. Subversion clients have always refused
paths which contain control characters, and it makes sense to have
consistent checks at the client and server level, as far as Subversion
is concerned.

Perhaps my idea of using repos_commit_txn() was bad because paths have
already been handed off to the FS layer at that point. It should be
possible to add the check elsewhere in the repos layer, before the FS
gets involved.

> As an aside, though, we keep talking about paths with trailing newlines.  Is
> it only *trailing* newlines that cause the problem here?  I would think that
> newlines appearing anywhere in the path could be problematic in at least
> some of the same places we've discussed on this thread thus far.

It's just that I've observed the trailing newlines problem in the wild.
And it's indeed a particularly nasty case since 'svnadmin verify' doesn't
complain about it as much as I believe it would when faced with a newline
somewhere in the middle of a filename (the cpath: and the changed-paths
list would also be corrupt in that case, but aren't with a trailing \n).

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/27/2013 08:35 AM, Stefan Sperling wrote:
> On Tue, Mar 26, 2013 at 06:42:27PM +0100, Stefan Sperling wrote:
>> On Tue, Mar 26, 2013 at 12:40:56PM -0400, C. Michael Pilato wrote:
>>> Certainly, libsvn_repos should be disallowing newline-bearing paths.  We
>>> weren't able to support these paths when our .svn/entries file was
>>> plaintext, we've never been able to support them in our dump format (which
>>> is a libsvn_repos construct), and so on.
>>>
>>> I'm not so sure about libsvn_fs.  BDB should be able to handle them just
>>> fine, but it sounds like FSFS can't.  Do we just make a post-facto
>>> declaration that, going forward, we won't try to support these paths in the
>>> FS layer?
>>
>> I don't really see how supporting such paths in the FS layer makes
>> sense if none of the other layers can deal with them.
>>
>> Are there any other tools built on top of our FS layer that rely on
>> being able to use newlines in filenames?
>>
>> But still, I was planning on making the repos layer enforce restrictions
>> on filenames anyway to avoid having each filesystem deal with it separately.
> 
> I think we'll have to block these paths at the FS layer.

If FSFS is fundamentally unable to support these types of paths, then sure,
let's go ahead and protect against the failure at that level.  But please
don't overreach here -- block only the paths that FSFS simply cannot deal
with.  There have been other tools built atop the FS layer in the past
(wikis, etc.) and could be in the future -- this is, after all, why we have
distinct FS and repos APIs -- and we shouldn't be artificially limiting what
folks can do with the that API.

As an aside, though, we keep talking about paths with trailing newlines.  Is
it only *trailing* newlines that cause the problem here?  I would think that
newlines appearing anywhere in the path could be problematic in at least
some of the same places we've discussed on this thread thus far.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Wed, Mar 27, 2013 at 01:35:30PM +0100, Stefan Sperling wrote:
> Let's face it, the FS API promise people are upholding in this discussion
> has been broken for years.

FWIW, here's the patch I was trying to use:

Index: subversion/libsvn_repos/fs-wrap.c
===================================================================
--- subversion/libsvn_repos/fs-wrap.c	(revision 1461542)
+++ subversion/libsvn_repos/fs-wrap.c	(working copy)
@@ -42,6 +42,34 @@
 
 /*** Commit wrappers ***/
 
+/* Verify names of changed paths within the transaction TXN.
+ *
+ * This was added for issue 4340, "repos layer should reject filenames with
+ * trailing \n", but in addition to newlines it rejects any characters
+ * Subversion clients would reject when adding new files to version control. */
+static svn_error_t *
+verify_changed_path_names(svn_fs_txn_t *txn, apr_pool_t *pool)
+{
+  apr_pool_t *iterpool;
+  svn_fs_root_t *txn_root;
+  apr_hash_t *changed_paths;
+  apr_hash_index_t *hi;
+
+  SVN_ERR(svn_fs_txn_root(&txn_root, txn, pool));
+  SVN_ERR(svn_fs_paths_changed2(&changed_paths, txn_root, pool));
+
+  iterpool = svn_pool_create(pool);
+  for (hi = apr_hash_first(pool, changed_paths); hi; hi = apr_hash_next(hi))
+    {
+        const char *path = svn__apr_hash_index_key(hi);
+
+        SVN_ERR(svn_path_check_valid(path, iterpool));
+    }
+  svn_pool_destroy(iterpool);
+
+  return SVN_NO_ERROR;
+}
+
 svn_error_t *
 svn_repos_fs_commit_txn(const char **conflict_p,
                         svn_repos_t *repos,
@@ -57,6 +85,9 @@ svn_repos_fs_commit_txn(const char **conflict_p,
 
   *new_rev = SVN_INVALID_REVNUM;
 
+  /* Verify names of changed paths. */
+  SVN_ERR(verify_changed_path_names(txn, pool));
+
   /* Run pre-commit hooks. */
   SVN_ERR(svn_fs_txn_name(&txn_name, txn, pool));
   SVN_ERR(svn_repos__hooks_pre_commit(repos, txn_name, pool));

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Mar 26, 2013 at 06:42:27PM +0100, Stefan Sperling wrote:
> On Tue, Mar 26, 2013 at 12:40:56PM -0400, C. Michael Pilato wrote:
> > Certainly, libsvn_repos should be disallowing newline-bearing paths.  We
> > weren't able to support these paths when our .svn/entries file was
> > plaintext, we've never been able to support them in our dump format (which
> > is a libsvn_repos construct), and so on.
> > 
> > I'm not so sure about libsvn_fs.  BDB should be able to handle them just
> > fine, but it sounds like FSFS can't.  Do we just make a post-facto
> > declaration that, going forward, we won't try to support these paths in the
> > FS layer?
> 
> I don't really see how supporting such paths in the FS layer makes
> sense if none of the other layers can deal with them.
> 
> Are there any other tools built on top of our FS layer that rely on
> being able to use newlines in filenames?
> 
> But still, I was planning on making the repos layer enforce restrictions
> on filenames anyway to avoid having each filesystem deal with it separately.

I think we'll have to block these paths at the FS layer.
It looks like the repos layer cannot perform this kind of verification.

When I use the FS API to add paths with trailing \n to a transaction,
the FS layer accepts them, and silently write out a corrupt changed
path list (in case of FSFS):

================
$ cat test-filename-trailing-newline/db/transactions/1-1.txn/changes
0-1._0.t1-1 add-dir false false /bar

1 /foo
================

When I try to make the repos layer iterate over changed paths to
perform verification, svn_fs_paths_changed2() returns an error:

(gdb) p *err->child->child->child->child       
$8 = {apr_err = 160004, 
  message = 0xef7143880f0 "Invalid changes line in rev-file", child = 0x0, 
  pool = 0xef714388028, 
  file = 0xef71a7d6538 "subversion/libsvn_fs_fs/fs_fs.c", line = 6024}

I'd say we cannot allow the FS API to break the FS like that.

Let's face it, the FS API promise people are upholding in this discussion
has been broken for years.

Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Mar 26, 2013 at 12:40:56PM -0400, C. Michael Pilato wrote:
> On 03/26/2013 12:10 PM, Daniel Shahaf wrote:
> > Or we could forbid newlines in pathnames.  The only problem with that is
> > that the 1.0 API promised they would work... but if no one uses that,
> > I'd be fine with calling it an erratum and disallowing it henceforth.
> 
> I'm very much in the "let libsvn_fs do everything it can, and let
> libsvn_repos enforce Subversion-specific requirements" camp.
>
> Certainly, libsvn_repos should be disallowing newline-bearing paths.  We
> weren't able to support these paths when our .svn/entries file was
> plaintext, we've never been able to support them in our dump format (which
> is a libsvn_repos construct), and so on.
> 
> I'm not so sure about libsvn_fs.  BDB should be able to handle them just
> fine, but it sounds like FSFS can't.  Do we just make a post-facto
> declaration that, going forward, we won't try to support these paths in the
> FS layer?

I don't really see how supporting such paths in the FS layer makes
sense if none of the other layers can deal with them.

Are there any other tools built on top of our FS layer that rely on
being able to use newlines in filenames?

But still, I was planning on making the repos layer enforce restrictions
on filenames anyway to avoid having each filesystem deal with it separately.

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 1:13 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> On Tue, Mar 26, 2013 at 11:48 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>> We can but that would break an API promise.  (A promise that we might be
>> fine with breaking.)
> I don't see API breakage here:
> 1. We already prohibited some FS paths (with . or ..)
> 2. Only new paths will be affected: users still can access already
> created files/folders in repository.

Agreed I don't see any API promise we're breaking.  What we reject is
an implementation detail.

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/26/2013 04:16 PM, Daniel Shahaf wrote:
> Ivan Zhakov wrote on Wed, Mar 27, 2013 at 00:13:12 +0400:
>> On Tue, Mar 26, 2013 at 11:48 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>>> Ivan Zhakov wrote on Tue, Mar 26, 2013 at 21:57:42 +0400:
>>>> Can we just extend [FS validation] to prohibit all characters with ascii code < 31 ?
>>>
>>> We can but that would break an API promise.  (A promise that we might be
>>> fine with breaking.)
>> I don't see API breakage here:
>> 1. We already prohibited some FS paths (with . or ..)
>> 2. Only new paths will be affected: users still can access already
>> created files/folders in repository.
> 
> svn_fs.h:1202
> (I thought I'd given that pointer upthread)

Yup.  That's precisely the comment I was thinking of, too.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Mar 26, 2013 at 04:29:19PM -0400, C. Michael Pilato wrote:
> On 03/26/2013 04:16 PM, Daniel Shahaf wrote:
> > svn_fs.h:1202
> > (I thought I'd given that pointer upthread)
> 
> Yup.  That's precisely the comment I was thinking of, too.

That comment was written by Jim Blandy many years ago (r838152).

It is older than FSFS. So technically, this comment has been
violated ever since FSFS has been made default, because the
design of FSFS clearly didn't take this comment into account.
If it had, we wouldn't be having this discussion.

I think we can safely ignore whatever that comment is saying,
or rather, is failing to say, about control characters.

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/26/2013 04:16 PM, Daniel Shahaf wrote:
> Ivan Zhakov wrote on Wed, Mar 27, 2013 at 00:13:12 +0400:
>> On Tue, Mar 26, 2013 at 11:48 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>>> Ivan Zhakov wrote on Tue, Mar 26, 2013 at 21:57:42 +0400:
>>>> Can we just extend [FS validation] to prohibit all characters with ascii code < 31 ?
>>>
>>> We can but that would break an API promise.  (A promise that we might be
>>> fine with breaking.)
>> I don't see API breakage here:
>> 1. We already prohibited some FS paths (with . or ..)
>> 2. Only new paths will be affected: users still can access already
>> created files/folders in repository.
> 
> svn_fs.h:1202
> (I thought I'd given that pointer upthread)

Yup.  That's precisely the comment I was thinking of, too.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ivan Zhakov wrote on Wed, Mar 27, 2013 at 00:13:12 +0400:
> On Tue, Mar 26, 2013 at 11:48 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> > Ivan Zhakov wrote on Tue, Mar 26, 2013 at 21:57:42 +0400:
> >> Can we just extend [FS validation] to prohibit all characters with ascii code < 31 ?
> >
> > We can but that would break an API promise.  (A promise that we might be
> > fine with breaking.)
> I don't see API breakage here:
> 1. We already prohibited some FS paths (with . or ..)
> 2. Only new paths will be affected: users still can access already
> created files/folders in repository.

svn_fs.h:1202
(I thought I'd given that pointer upthread)

Re: Filenames with trailing newlines wreak havoc

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Tue, Mar 26, 2013 at 11:48 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> Ivan Zhakov wrote on Tue, Mar 26, 2013 at 21:57:42 +0400:
>> Can we just extend [FS validation] to prohibit all characters with ascii code < 31 ?
>
> We can but that would break an API promise.  (A promise that we might be
> fine with breaking.)
I don't see API breakage here:
1. We already prohibited some FS paths (with . or ..)
2. Only new paths will be affected: users still can access already
created files/folders in repository.


-- 
Ivan Zhakov

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ivan Zhakov wrote on Tue, Mar 26, 2013 at 21:57:42 +0400:
> Can we just extend [FS validation] to prohibit all characters with ascii code < 31 ?

We can but that would break an API promise.  (A promise that we might be
fine with breaking.)

Re: Filenames with trailing newlines wreak havoc

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Tue, Mar 26, 2013 at 11:07 PM, Ben Reser <be...@reser.org> wrote:
> On Tue, Mar 26, 2013 at 10:57 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>> We already have svn_fs__path_valid() and validate all incoming FS
>> paths in fs-loader.c. Current svn_fs__path_valid() implementation is:
>> [[[
>> svn_error_t *
>> svn_fs__path_valid(const char *path, apr_pool_t *pool)
>> {
>>   /* UTF-8 encoded string without NULs. */
>>   if (! svn_utf__cstring_is_valid(path))
>>     {
>>       return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
>>                                _("Path '%s' is not in UTF-8"), path);
>>     }
>>
>>   /* No "." or ".." elements. */
>>   if (svn_path_is_backpath_present(path)
>>       || svn_path_is_dotpath_present(path))
>>     {
>>       return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
>>                                _("Path '%s' contains '.' or '..' element"),
>>                                path);
>>     }
>>
>>   /* That's good enough. */
>>   return SVN_NO_ERROR;
>> }
>> ]]]
>>
>> Can we just extend it to prohibit all characters with ascii code < 31 ?
>
> Iterate the characters and use svn_ctype_iscntrl()
Yes, btw we already have such code in svn_path_check().


-- 
Ivan Zhakov

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 10:57 AM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> We already have svn_fs__path_valid() and validate all incoming FS
> paths in fs-loader.c. Current svn_fs__path_valid() implementation is:
> [[[
> svn_error_t *
> svn_fs__path_valid(const char *path, apr_pool_t *pool)
> {
>   /* UTF-8 encoded string without NULs. */
>   if (! svn_utf__cstring_is_valid(path))
>     {
>       return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
>                                _("Path '%s' is not in UTF-8"), path);
>     }
>
>   /* No "." or ".." elements. */
>   if (svn_path_is_backpath_present(path)
>       || svn_path_is_dotpath_present(path))
>     {
>       return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
>                                _("Path '%s' contains '.' or '..' element"),
>                                path);
>     }
>
>   /* That's good enough. */
>   return SVN_NO_ERROR;
> }
> ]]]
>
> Can we just extend it to prohibit all characters with ascii code < 31 ?

Iterate the characters and use svn_ctype_iscntrl()

Re: Filenames with trailing newlines wreak havoc

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On Tue, Mar 26, 2013 at 8:40 PM, C. Michael Pilato <cm...@collab.net> wrote:
> On 03/26/2013 12:10 PM, Daniel Shahaf wrote:
>> Or we could forbid newlines in pathnames.  The only problem with that is
>> that the 1.0 API promised they would work... but if no one uses that,
>> I'd be fine with calling it an erratum and disallowing it henceforth.
>
> I'm very much in the "let libsvn_fs do everything it can, and let
> libsvn_repos enforce Subversion-specific requirements" camp.
>
> Certainly, libsvn_repos should be disallowing newline-bearing paths.  We
> weren't able to support these paths when our .svn/entries file was
> plaintext, we've never been able to support them in our dump format (which
> is a libsvn_repos construct), and so on.
>
> I'm not so sure about libsvn_fs.  BDB should be able to handle them just
> fine, but it sounds like FSFS can't.  Do we just make a post-facto
> declaration that, going forward, we won't try to support these paths in the
> FS layer?
>
We already have svn_fs__path_valid() and validate all incoming FS
paths in fs-loader.c. Current svn_fs__path_valid() implementation is:
[[[
svn_error_t *
svn_fs__path_valid(const char *path, apr_pool_t *pool)
{
  /* UTF-8 encoded string without NULs. */
  if (! svn_utf__cstring_is_valid(path))
    {
      return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
                               _("Path '%s' is not in UTF-8"), path);
    }

  /* No "." or ".." elements. */
  if (svn_path_is_backpath_present(path)
      || svn_path_is_dotpath_present(path))
    {
      return svn_error_createf(SVN_ERR_FS_PATH_SYNTAX, NULL,
                               _("Path '%s' contains '.' or '..' element"),
                               path);
    }

  /* That's good enough. */
  return SVN_NO_ERROR;
}
]]]

Can we just extend it to prohibit all characters with ascii code < 31 ?


-- 
Ivan Zhakov

Re: Filenames with trailing newlines wreak havoc

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/26/2013 12:10 PM, Daniel Shahaf wrote:
> Or we could forbid newlines in pathnames.  The only problem with that is
> that the 1.0 API promised they would work... but if no one uses that,
> I'd be fine with calling it an erratum and disallowing it henceforth.

I'm very much in the "let libsvn_fs do everything it can, and let
libsvn_repos enforce Subversion-specific requirements" camp.

Certainly, libsvn_repos should be disallowing newline-bearing paths.  We
weren't able to support these paths when our .svn/entries file was
plaintext, we've never been able to support them in our dump format (which
is a libsvn_repos construct), and so on.

I'm not so sure about libsvn_fs.  BDB should be able to handle them just
fine, but it sounds like FSFS can't.  Do we just make a post-facto
declaration that, going forward, we won't try to support these paths in the
FS layer?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Ben Reser wrote on Tue, Mar 26, 2013 at 09:06:56 -0700:
> On Tue, Mar 26, 2013 at 6:36 AM, Stefan Sperling <st...@elego.de> wrote:
> > We should add some C tests as well to verify API behaviour at the
> > client layer and at the repos layer.
> >
> > Given the ripple effects of this problem in FSFS revision files I think
> > we should ensure that the Subversion server blocks such filenames from
> > entering the repository (any repository, FSFS and BDB). It seems FSFS format
> > changes would be required to support filenames with trailing newlines
> > properly, an effort which isn't worth the gain in my opinion.
> 
> +1, this is not an allowed use and is obviously a hole in our server
> implementations.  I'd actually say this is a potential DoS since
> committing such a file creates all sorts of havoc for clients and
> admins after the fact.
> 

I'd call it a DoS if you can commit such a file and can't later 'svn rm
URL' it.

> > And given the ripple effects seen in areas such as repository verification,
> > svnsync, and ra_neon, I don't think we can afford to call this a supported
> > use case until all components of the system have been fixed to handle
> > filenames with trailing newlines properly.
> 
> It probably breaks other things like the dump format and diff.  Tons
> of things assume newline has special meaning.

Fortunately we could encode newlines the same way in FSFS revision
files, diff, and dumpfile headers.  (Just need a new service routine
(dare I say yet another svn_foo_canonicalize() function? :-)))

Or we could forbid newlines in pathnames.  The only problem with that is
that the 1.0 API promised they would work... but if no one uses that,
I'd be fine with calling it an erratum and disallowing it henceforth.

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
On Tue, Mar 26, 2013 at 6:36 AM, Stefan Sperling <st...@elego.de> wrote:
> We should add some C tests as well to verify API behaviour at the
> client layer and at the repos layer.
>
> Given the ripple effects of this problem in FSFS revision files I think
> we should ensure that the Subversion server blocks such filenames from
> entering the repository (any repository, FSFS and BDB). It seems FSFS format
> changes would be required to support filenames with trailing newlines
> properly, an effort which isn't worth the gain in my opinion.

+1, this is not an allowed use and is obviously a hole in our server
implementations.  I'd actually say this is a potential DoS since
committing such a file creates all sorts of havoc for clients and
admins after the fact.

I'd suggest that we make svnadmin verify check for this condition and
report it.  Then devise a fix procedure to remove or rename the file
in the repository.  This allows admins to resolve their repositories
from this problem in case there are other cases of this out in the
wild that we don't know about.

> And given the ripple effects seen in areas such as repository verification,
> svnsync, and ra_neon, I don't think we can afford to call this a supported
> use case until all components of the system have been fixed to handle
> filenames with trailing newlines properly.

It probably breaks other things like the dump format and diff.  Tons
of things assume newline has special meaning.

Re: Filenames with trailing newlines wreak havoc

Posted by Daniel Shahaf <da...@apache.org>.
On Tue, Mar 26, 2013 at 02:36:15PM +0100, Stefan Sperling wrote:
> When a file with a trailing newline (ASCII 0x0a) has made its way into
> the repository, there are various areas of the system that exhibit failure
> modes. Failure modes I observed in one particular case are discussed below.

The one thing you didn't explicitly spell out in your mail: the API (svn_fs.h)
considers such filenames valid.  That is, the bug is that the various parts of
the system failed to DTRT with those filenames, rather than that they failed to
reject them.  (Though, as you say, rejecting them may be the practical answer.)

Re: Filenames with trailing newlines wreak havoc

Posted by Mark J Cox <ma...@awe.com>.
Please use CVE-2013-1968

Mark

On Sun, Apr 14, 2013 at 6:03 PM, Ben Reser <be...@reser.org> wrote:
> Apache security folks please provide us a CVE for this.
>
> On Tue, Mar 26, 2013 at 6:36 AM, Stefan Sperling <st...@elego.de> wrote:
>> When a file with a trailing newline (ASCII 0x0a) has made its way into
>> the repository, there are various areas of the system that exhibit failure
>> modes. Failure modes I observed in one particular case are discussed below.
>>
>> I intend to file separate issues for each failure mode to keep track of
>> them, rather than filing one giant issue for them all. Though perhaps
>> a meta-issue that depends on the other issues would be useful.
>>
>> In the case observed there was a file, added in a revision rX, which had
>> a trailing newline in its name. The file was added along with its parent
>> directory (which was also new in rX) and alongside several other files
>> in the same directory which did not have trailing newlines in their names.
>>
>> The server was running Subversion 1.7.8, but might have been running
>> an older release at the time when rX was committed.
>> All commands listed below were run with Subversion 1.7.8.
>>
>> Immediately visible problems were:
>>
>> svnadmin verify resulted in an error for the revision: 'svnadmin: E160013: File not found: ...')
>>
>> svnsync failed to sync the revision ('svnsync: E160013: File not found: ...')
>>
>> However, the directory at rX could be listed (with svn ls) and checked
>> out successfully.
>>
>> During checkout, RA layers were behaving inconsistently.
>>
>> Checkout via ra_svn and ra_local resulted in the expected filename
>> with \n (0x0a) as the last byte both in wc.db (local_relpath NODES
>> column) and on disk.
>>
>> However, ra_neon (didn't try serf) checked the file out with a trailing
>> space (0x20) instead of a trailing newline. The filename was stored with
>> a trailing 0x20 in wc.db as well as on disk.
>>
>> Examining the FSFS revision file [*] of rX showed that the filename was
>> stored with a newline as part of its parent directory's representation.
>>
>>   [*] For those who didn't know, the FSFS revision file format is documented
>>       here: https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
>>
>> Let's call the parent directory 'A', and the file in question 'alpha\n'.
>> A file named 'alpha\n' added in r1 might appear like this in its parent
>> directory's representation (i.e. the rep of 'A'):
>> ============
>> PLAIN
>> K 6
>> alpha
>>
>> V 17
>> file 2-1.0.r1/122
>> ============
>>
>> However, the cpath: field within the file's own representation did not
>> contain the trailing newline.
>>
>> Also, the file was listed in the changed-paths data section without
>> the trailing newline. Note that the FSFS format docs say "The changed-path
>> data is represented as a series of changed-path items, each consisting of
>> two lines." -- i.e. a trailing newline is considered to be part of revision
>> file data, rather than the filename.
>>
>> Furthermore, several children of the newly-added-in rX 'A' directory
>> were missing from the changed-paths list in the revision file, for an
>> unkown reason.
>> More precisely, children listed after 'alpha\n' in the rep of A were
>> missing from the changed paths list. This was not detected by 'svnadmin
>> verify' (which, in my opinion, is a bug) but was detected by 'svnsync'
>> when a later revision rY referred to one of the missing children in
>> copyfrom information. It turned out that the slave server was missing
>> any children of A not listed in the changed-paths list in the rX
>> revision file on the master server. What kind of consistency checks
>> could prevent this from happening remains to be determined. Perhaps
>> 'svnadmin verify' could cross-check the changed paths list with node-rev
>> IDs which appear in the revision's directory listings?
>>
>> The revision rX could be fixed by removing the trailing newline from the
>> filename in the parent directory's representation, and adjusting the
>> rep's length and checksum information accordingly. About two dozen other
>> revisions had to be adjusted as well because they listed the file
>> 'alpha\n' as part of the 'A' directory rep (which is of course written
>> out in its entirety whenever a revision changes some child of 'A').
>>
>> Also, the changed-paths list of rX had to be extended to add the missing
>> newly added children of A. Other revisions had to be checked for similar
>> omissions in the changed-paths list.
>>
>> How the filename entered the repository is not known with absolute
>> certainty. The current theory, based on discussion with the revision's
>> author, is that the file was checked in using Eclipse with SVNKit 1.6
>> and was not refused by the Subversion server at the time.
>>
>> Attempting the reproduce this problem with a stock Subversion client
>> in a naive way fails because libsvn_client refuses to add such files
>> to version control in the first place:
>>
>> subversion/svn/add-cmd.c:86: (apr_err=160005)
>> subversion/svn/util.c:621: (apr_err=160005)
>> subversion/libsvn_client/add.c:1031: (apr_err=160005)
>> subversion/libsvn_client/add.c:937: (apr_err=160005)
>> subversion/libsvn_client/add.c:320: (apr_err=160005)
>> subversion/libsvn_wc/adm_ops.c:1004: (apr_err=160005)
>> subversion/libsvn_wc/adm_ops.c:679: (apr_err=160005)
>> subversion/libsvn_subr/path.c:1231: (apr_err=160005)
>> svn: E160005: Invalid control character '0x0a' in path '/tmp/wc/alpha\012'
>>
>> This error does not seem to be triggered by our test suite, which
>> is bad because it would have allowed the SVNKit developers to prevent
>> this problem from happening (they use our test suite to test their
>> Java-based implementation of Subversion). There is a tab_test() in
>> commit_tests.py, but that test only tries a tab, not a newline character.
>>
>> We should add some C tests as well to verify API behaviour at the
>> client layer and at the repos layer.
>>
>> Given the ripple effects of this problem in FSFS revision files I think
>> we should ensure that the Subversion server blocks such filenames from
>> entering the repository (any repository, FSFS and BDB). It seems FSFS format
>> changes would be required to support filenames with trailing newlines
>> properly, an effort which isn't worth the gain in my opinion.
>>
>> And given the ripple effects seen in areas such as repository verification,
>> svnsync, and ra_neon, I don't think we can afford to call this a supported
>> use case until all components of the system have been fixed to handle
>> filenames with trailing newlines properly.
>>
>> Comments and opinions are very welcome. I intend to make sure that this
>> problem cannot happen with future Subversion releases, and welcome any
>> input and help I can get.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: security-unsubscribe@apache.org
> For additional commands, e-mail: security-help@apache.org
>

Re: Filenames with trailing newlines wreak havoc

Posted by Ben Reser <be...@reser.org>.
Apache security folks please provide us a CVE for this.

On Tue, Mar 26, 2013 at 6:36 AM, Stefan Sperling <st...@elego.de> wrote:
> When a file with a trailing newline (ASCII 0x0a) has made its way into
> the repository, there are various areas of the system that exhibit failure
> modes. Failure modes I observed in one particular case are discussed below.
>
> I intend to file separate issues for each failure mode to keep track of
> them, rather than filing one giant issue for them all. Though perhaps
> a meta-issue that depends on the other issues would be useful.
>
> In the case observed there was a file, added in a revision rX, which had
> a trailing newline in its name. The file was added along with its parent
> directory (which was also new in rX) and alongside several other files
> in the same directory which did not have trailing newlines in their names.
>
> The server was running Subversion 1.7.8, but might have been running
> an older release at the time when rX was committed.
> All commands listed below were run with Subversion 1.7.8.
>
> Immediately visible problems were:
>
> svnadmin verify resulted in an error for the revision: 'svnadmin: E160013: File not found: ...')
>
> svnsync failed to sync the revision ('svnsync: E160013: File not found: ...')
>
> However, the directory at rX could be listed (with svn ls) and checked
> out successfully.
>
> During checkout, RA layers were behaving inconsistently.
>
> Checkout via ra_svn and ra_local resulted in the expected filename
> with \n (0x0a) as the last byte both in wc.db (local_relpath NODES
> column) and on disk.
>
> However, ra_neon (didn't try serf) checked the file out with a trailing
> space (0x20) instead of a trailing newline. The filename was stored with
> a trailing 0x20 in wc.db as well as on disk.
>
> Examining the FSFS revision file [*] of rX showed that the filename was
> stored with a newline as part of its parent directory's representation.
>
>   [*] For those who didn't know, the FSFS revision file format is documented
>       here: https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
>
> Let's call the parent directory 'A', and the file in question 'alpha\n'.
> A file named 'alpha\n' added in r1 might appear like this in its parent
> directory's representation (i.e. the rep of 'A'):
> ============
> PLAIN
> K 6
> alpha
>
> V 17
> file 2-1.0.r1/122
> ============
>
> However, the cpath: field within the file's own representation did not
> contain the trailing newline.
>
> Also, the file was listed in the changed-paths data section without
> the trailing newline. Note that the FSFS format docs say "The changed-path
> data is represented as a series of changed-path items, each consisting of
> two lines." -- i.e. a trailing newline is considered to be part of revision
> file data, rather than the filename.
>
> Furthermore, several children of the newly-added-in rX 'A' directory
> were missing from the changed-paths list in the revision file, for an
> unkown reason.
> More precisely, children listed after 'alpha\n' in the rep of A were
> missing from the changed paths list. This was not detected by 'svnadmin
> verify' (which, in my opinion, is a bug) but was detected by 'svnsync'
> when a later revision rY referred to one of the missing children in
> copyfrom information. It turned out that the slave server was missing
> any children of A not listed in the changed-paths list in the rX
> revision file on the master server. What kind of consistency checks
> could prevent this from happening remains to be determined. Perhaps
> 'svnadmin verify' could cross-check the changed paths list with node-rev
> IDs which appear in the revision's directory listings?
>
> The revision rX could be fixed by removing the trailing newline from the
> filename in the parent directory's representation, and adjusting the
> rep's length and checksum information accordingly. About two dozen other
> revisions had to be adjusted as well because they listed the file
> 'alpha\n' as part of the 'A' directory rep (which is of course written
> out in its entirety whenever a revision changes some child of 'A').
>
> Also, the changed-paths list of rX had to be extended to add the missing
> newly added children of A. Other revisions had to be checked for similar
> omissions in the changed-paths list.
>
> How the filename entered the repository is not known with absolute
> certainty. The current theory, based on discussion with the revision's
> author, is that the file was checked in using Eclipse with SVNKit 1.6
> and was not refused by the Subversion server at the time.
>
> Attempting the reproduce this problem with a stock Subversion client
> in a naive way fails because libsvn_client refuses to add such files
> to version control in the first place:
>
> subversion/svn/add-cmd.c:86: (apr_err=160005)
> subversion/svn/util.c:621: (apr_err=160005)
> subversion/libsvn_client/add.c:1031: (apr_err=160005)
> subversion/libsvn_client/add.c:937: (apr_err=160005)
> subversion/libsvn_client/add.c:320: (apr_err=160005)
> subversion/libsvn_wc/adm_ops.c:1004: (apr_err=160005)
> subversion/libsvn_wc/adm_ops.c:679: (apr_err=160005)
> subversion/libsvn_subr/path.c:1231: (apr_err=160005)
> svn: E160005: Invalid control character '0x0a' in path '/tmp/wc/alpha\012'
>
> This error does not seem to be triggered by our test suite, which
> is bad because it would have allowed the SVNKit developers to prevent
> this problem from happening (they use our test suite to test their
> Java-based implementation of Subversion). There is a tab_test() in
> commit_tests.py, but that test only tries a tab, not a newline character.
>
> We should add some C tests as well to verify API behaviour at the
> client layer and at the repos layer.
>
> Given the ripple effects of this problem in FSFS revision files I think
> we should ensure that the Subversion server blocks such filenames from
> entering the repository (any repository, FSFS and BDB). It seems FSFS format
> changes would be required to support filenames with trailing newlines
> properly, an effort which isn't worth the gain in my opinion.
>
> And given the ripple effects seen in areas such as repository verification,
> svnsync, and ra_neon, I don't think we can afford to call this a supported
> use case until all components of the system have been fixed to handle
> filenames with trailing newlines properly.
>
> Comments and opinions are very welcome. I intend to make sure that this
> problem cannot happen with future Subversion releases, and welcome any
> input and help I can get.