Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2005/04/16 15:17:25 UTC
svn.collab.net httpd configuration: default charset
If you visit a URL like
http://svn.collab.net/repos/svn/tags/1.2.0-rc1/CHANGES
you will see some UTF-8 text misrendered as ISO-8859-1, because the
response has a header of
Content-Type: text/plain; charset=ISO-8859-1
Since CHANGES, like most text files in Subversion, has no
svn:mime-type, the mod_dav_svn default of "text/plain" is used. httpd
automatically adds the value of AddDefaultCharset to this mime type,
which on svn.collab.net is evidently set to the default value of
ISO-8859-1.
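
The effect described above can be reproduced in a few lines. This is a minimal sketch: the sample string is illustrative only and is not taken from CHANGES. It shows what a client displays when UTF-8 bytes arrive labelled as ISO-8859-1.

```python
# Illustrative text only; the actual CHANGES content is not reproduced here.
original = "café"

# What the server sends: the UTF-8 bytes of the file.
wire_bytes = original.encode("utf-8")

# What a client shows if it trusts "charset=ISO-8859-1".
misrendered = wire_bytes.decode("iso-8859-1")

# What a client shows if the header says "charset=UTF-8".
correct = wire_bytes.decode("utf-8")

print(misrendered)
print(correct)
```

Each non-ASCII character turns into two Latin-1 characters, because UTF-8 encodes it as two bytes and ISO-8859-1 maps every byte to one character.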
I suggest that AddDefaultCharset on svn.collab.net be set to UTF-8.
That could presumably be scoped to the Subversion project if desired.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: svn.collab.net httpd configuration: default charset
Posted by "Øyvind A. Holm" <su...@sunbase.org>.
On 2005-04-16 21:43:09 Greg Hudson wrote:
> On Sat, 2005-04-16 at 18:23, Marcus Rueckert wrote:
> > i would remove the default charset altogether. otherwise scripts which
> > are in latin1 might look weird.
>
> We have scripts in latin1?
These files contain non-UTF-8 sequences in trunk:
trunk$ grep -r . . | nosvn | find_inv_utf8 | cut -f 1 -d : | sort | uniq
./INSTALL
./doc/translations/french/appendices.texi
./doc/translations/french/client.texi
./doc/translations/french/getting_started.texi
./doc/translations/french/repos_admin.texi
./doc/translations/french/svn-handbook-french.texi
./doc/translations/russian/misc-docs/quick_walkthrough.xml
./notes/fs-improvements.txt
./packages/windows-innosetup/Readme.txt
./packages/windows-innosetup/svn.iss
./packages/windows-innosetup/tools/svnpath/svnpath.rc
./www/httpd-win32.patch.txt
trunk$
All of them are text files, except the windows-innosetup stuff, which
probably has to be that way.
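
For readers without the `nosvn` and `find_inv_utf8` helpers used in the pipeline above (their implementation is not shown in this thread), a similar check can be sketched in Python. This is an assumption-laden approximation, not a reimplementation: it validates whole files rather than individual lines, and it will also flag genuinely binary files. The function names are hypothetical.

```python
import os
from typing import List


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def files_with_invalid_utf8(root: str) -> List[str]:
    """List files under root whose content is not valid UTF-8."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion administrative directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                if not is_valid_utf8(fh.read()):
                    bad.append(path)
    return sorted(bad)
```

Calling `files_with_invalid_utf8(".")` from the top of a working copy would produce a list comparable to the one above.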
AddDefaultCharset UTF-8 — yesthankyou.
-- sunny256
---------------------------------------------------------------------
detecting charset wishlist enhancement
Posted by Karl Chen <qu...@NOSPAM.quarl.org>.
I imagine that the text encoding of a file in Subversion might be
useful to applications other than just mod_dav_svn. In any case,
once the method for specifying charset in properties is ironed
out, I suggest 'svn add' detect the charset of a text/* document
when possible.
For example, UTF-16 can be inferred from the presence of an
initial zero-width non-breaking space (U+FEFF, the byte-order mark);
charset can be specified by a "-*- encoding: foo -*-" line on the
first or second line (the Emacs format, which at least Python also
supports).
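
A minimal sketch of the two heuristics mentioned above, assuming a hypothetical helper named `guess_charset`; nothing like it exists in 'svn add'. The UTF-8 and UTF-32 byte-order marks are an addition beyond what the message proposes, and the cookie pattern is a simplified approximation of the Emacs convention.

```python
import codecs
import re
from typing import Optional

# Byte-order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# Emacs-style cookie such as "-*- coding: latin-1 -*-".
# "encoding: foo" matches as well, because it contains "coding:".
_COOKIE = re.compile(rb"-\*-.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+).*?-\*-")


def guess_charset(data: bytes) -> Optional[str]:
    """Guess a charset from a BOM or a cookie on the first two lines."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for line in data.splitlines()[:2]:
        match = _COOKIE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None  # No evidence either way.
```

Returning `None` when nothing is found leaves the decision to the caller, which matches the "when possible" wording of the suggestion.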
--
Karl 2005-04-17 01:05
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2005-04-16 at 18:23, Marcus Rueckert wrote:
> i would remove the default charset altogether. otherwise scripts which are
> in latin1 might look weird.
We have scripts in latin1?
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Marcus Rueckert <da...@web.de>.
On 2005-04-16 11:17:25 -0400, Greg Hudson wrote:
> I suggest that AddDefaultCharset on svn.collab.net be set to UTF-8.
> That could presumably be scoped to the Subversion project if desired.
i would remove the default charset altogether. otherwise scripts which are
in latin1 might look weird.
just my 2 cents
darix
--
irssi - the client of the smart and beautiful people
http://www.irssi.de/
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Eric Gillespie <ep...@pretzelnet.org>.
Greg Hudson <gh...@MIT.EDU> writes:
> Perhaps. But we can't change the svn:mime-type for historical
> and tagged versions of text documents in Subversion.
Ah, that's a good reason for using AddDefaultCharset, though it
still seems useful to set svn:mime-type as well.
> It doesn't? Why would we want to use any other encoding for
> text files in the Subversion repository?
I said "feel"; the implication is that I don't have a solid
reason to offer.
--
Eric Gillespie <*> epg@pretzelnet.org
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2005-04-16 at 14:47, Eric Gillespie wrote:
> Greg Hudson <gh...@MIT.EDU> writes:
>
> > Why?
>
> On the assumption that more than just web browsers might be
> interested in the encoding of a document.
Perhaps. But we can't change the svn:mime-type for historical and
tagged versions of text documents in Subversion.
> Less importantly, it
> just doesn't "feel right" to slap UTF-8 on all documents.
It doesn't? Why would we want to use any other encoding for text files
in the Subversion repository?
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Eric Gillespie <ep...@pretzelnet.org>.
Greg Hudson <gh...@MIT.EDU> writes:
> Why?
On the assumption that more than just web browsers might be
interested in the encoding of a document. Less importantly, it
just doesn't "feel right" to slap UTF-8 on all documents.
--
Eric Gillespie <*> epg@pretzelnet.org
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Greg Hudson <gh...@MIT.EDU>.
On Sun, 2005-04-17 at 03:07, Justin Erenkrantz wrote:
> FWIW, httpd 2.0.53 removed AddDefaultCharset from the default httpd.conf
> configuration because of all the brokenness default charsets introduce.
As far as I can tell, all of the arguments presented in the issue you
cite make sense when considering a piece of software imposing a default
charset on an unsuspecting server administrator, but do not make sense
when considering a server administrator specifying a default charset.
> So, I believe keeping that directive present is really a bad idea. See:
So I don't see how this follows.
> So, +1 to Eric's idea as it is the best one. (Historical versions of CHANGES
> not showing up in UTF-8 aren't that big of a deal, IMHO.)
Nonetheless, I don't see what the problem is in saying "our text
documents are almost all UTF-8 documents".
(I saw Oyvind's list of files containing non-UTF-8 sequences, but as far
as I can tell they're a combination of rare exceptions and mistakes.)
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Sat, Apr 16, 2005 at 01:56:10PM -0400, Greg Hudson wrote:
> On Sat, 2005-04-16 at 13:50, Eric Gillespie wrote:
> > I'd prefer
> > AddDefaultCharset off and set svn:mime-type on CHANGES to
> > 'text/plain; charset=UTF-8'.
>
> Why?
FWIW, httpd 2.0.53 removed AddDefaultCharset from the default httpd.conf
configuration because of all the brokenness default charsets introduce.
So, I believe keeping that directive present is really a bad idea. See:
http://issues.apache.org/bugzilla/show_bug.cgi?id=23421
for more information as to why we removed this.
So, +1 to Eric's idea as it is the best one. (Historical versions of CHANGES
not showing up in UTF-8 aren't that big of a deal, IMHO.) -- justin
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2005-04-16 at 13:50, Eric Gillespie wrote:
> I'd prefer
> AddDefaultCharset off and set svn:mime-type on CHANGES to
> 'text/plain; charset=UTF-8'.
Why?
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Eric Gillespie <ep...@pretzelnet.org>.
Ben Collins-Sussman <su...@collab.net> writes:
> As a followup to this question -- what was the latest thinking
> on charsets in general? mod_dav_svn already looks for
> svn:mime-type when sending a file, and sets the Content-type:
> header appropriately. Didn't somebody once propose creating an
> svn:charset property that mod_dav_svn could also notice?
MIME types have parameters; see RFC 2045. I'd prefer
AddDefaultCharset off and set svn:mime-type on CHANGES to
'text/plain; charset=UTF-8'. mod_dav_svn already supports this.
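
To illustrate how a parameterized MIME type like the one above breaks down, here is a small sketch using Python's standard `email` package. It only shows the parsing on the consumer side; the helper name `split_mime_type` is hypothetical, and this is not how mod_dav_svn itself handles the property.

```python
from email.message import Message
from typing import Optional, Tuple


def split_mime_type(value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors normalize their result to lower case.
    return msg.get_content_type(), msg.get_content_charset()


print(split_mime_type("text/plain; charset=UTF-8"))
print(split_mime_type("text/plain"))
```

A value without a charset parameter yields `None` for the charset, which is the case where a server-wide default such as AddDefaultCharset would otherwise apply.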
--
Eric Gillespie <*> epg@pretzelnet.org
---------------------------------------------------------------------
Re: svn.collab.net httpd configuration: default charset
Posted by Ben Collins-Sussman <su...@collab.net>.
On Apr 16, 2005, at 10:17 AM, Greg Hudson wrote:
> If you visit a URL like
>
> http://svn.collab.net/repos/svn/tags/1.2.0-rc1/CHANGES
>
> you will see some UTF-8 text misrendered as ISO-8859-1, because the
> response has a header of
>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Since CHANGES, like most text files in Subversion, has no
> svn:mime-type, the mod_dav_svn default of "text/plain" is used. httpd
> automatically adds the value of AddDefaultCharset to this mime type,
> which on svn.collab.net is evidently set to the default value of
> ISO-8859-1.
>
> I suggest that AddDefaultCharset on svn.collab.net be set to UTF-8.
> That could presumably be scoped to the Subversion project if desired.
>
As a followup to this question -- what was the latest thinking on
charsets in general? mod_dav_svn already looks for svn:mime-type when
sending a file, and sets the Content-type: header appropriately.
Didn't somebody once propose creating an svn:charset property that
mod_dav_svn could also notice?
---------------------------------------------------------------------