Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2006/04/11 14:04:08 UTC

Re: [Show] Re: svn commit: r19299 - branches/nonxml-entries/subversion/libsvn_wc

On Tue, 2006-04-11 at 12:50 +0200, Julian Reschke wrote:
> if you allow control characters now (what for?), you make it harder to
> go back to an XML-based format in the future.

Right now, we have to encode such values at the application layer in
order to store them in XML.  Whether we encode such values at the
application layer for nonxml has no bearing on that.
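
To illustrate why values containing control characters need
application-level encoding before they can go into XML, here is a minimal
sketch in Python. The helper name xml_accepts is hypothetical and is not
part of Subversion; it only asks a stock XML 1.0 parser whether a string
is acceptable as element content.

```python
import xml.etree.ElementTree as ET


def xml_accepts(text):
    """Return True if `text` parses as element content in XML 1.0.

    Hypothetical helper for illustration only. Markup characters such as
    '<' or '&' would also be rejected here, since no escaping is applied.
    """
    try:
        ET.fromstring("<v>" + text + "</v>")
        return True
    except ET.ParseError:
        return False


print(xml_accepts("plain text"))  # ordinary characters
print(xml_accepts("tab\there"))   # tab is one of the few allowed controls
print(xml_accepts("\x01"))        # raw control character
print(xml_accepts("&#1;"))        # same character as a numeric reference
```

The last two calls are rejected: XML 1.0 excludes most characters below
U+0020 even when written as numeric character references, so such values
have to be encoded by the application before being stored.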

As to why we allow them in the fields (encoded or not), it is to allow
us to faithfully reproduce any binary values we might want to store in
the entries file.

> In general, when moving away from an XML format I think the following
> issues need to be considered:
> 
> - character encoding
> - allowed character set
> - escaping
> - extensibility (!) / versioning
> 
> ...because if you move away from XML these are the benefits you're losing.

No offense, but you sound like you've drunk the kool-aid.  Some of those
factors of XML are actually detriments to us, as our design experience
has shown over time.  We know our preferred character encoding for
internal data (UTF-8), and don't need a format which allows it to vary.
The allowed character set of XML has a few holes relative to 8-bit
binary, which has forced us into application-level encoding of binary
values.  As for escaping and extensibility/versioning, those issues can
be covered trivially, without the huge overhead imposed by XML.
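
As a rough illustration of how little machinery escaping needs in a
line-oriented format, here is a minimal sketch in Python. The scheme
(backslash plus two hex digits) and the names escape_field and
unescape_field are assumptions made for this example; they are not taken
from the nonxml-entries branch.

```python
def escape_field(raw: bytes) -> bytes:
    """Escape control bytes and backslashes so a field fits on one line.

    Hypothetical scheme: backslash becomes two backslashes; bytes below
    0x20 and 0x7F become backslash, 'x', and two hex digits. Bytes >= 0x80
    pass through untouched, so UTF-8 text stays readable.
    """
    out = bytearray()
    for b in raw:
        if b == 0x5C:  # backslash
            out += b"\\\\"
        elif b < 0x20 or b == 0x7F:
            out += b"\\x%02x" % b
        else:
            out.append(b)
    return bytes(out)


def unescape_field(esc: bytes) -> bytes:
    """Invert escape_field; raise ValueError on a malformed escape."""
    out = bytearray()
    i = 0
    while i < len(esc):
        if esc[i] != 0x5C:
            out.append(esc[i])
            i += 1
        elif esc[i + 1:i + 2] == b"\\":
            out.append(0x5C)
            i += 2
        elif esc[i + 1:i + 2] == b"x" and len(esc) >= i + 4:
            out.append(int(esc[i + 2:i + 4], 16))
            i += 4
        else:
            raise ValueError("malformed escape at offset %d" % i)
    return bytes(out)
```

Any 8-bit value round-trips through this pair, and because newlines are
escaped, a reader can still split the file on line boundaries.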

