Posted to users@subversion.apache.org by Paul Maier <sv...@web.de> on 2010/10/31 18:35:22 UTC

Feature request: Support of Windows Codepage 850

Hi there!

All output that comes from svn's internationalization displays correctly
under both codepages, 850 and 1252, e.g. error messages or "svn help ci". Good.
(The font in use is Lucida Console.)

This shows that it *is* possible for svn to produce correct output under both
codepages, 850 and 1252.

But output that comes from file data ("svn cat file") only prints correctly with
Codepage 1252. It seems that this output goes through a different piece of code
that is not able to adjust to the Codepage in use.

So at the moment the behaviour of svn with Codepage 850 is inconsistent:
sometimes it is supported ("svn help ci"), sometimes not ("svn cat").

Currently I need Cygwin's cat command to convert the output of "svn cat" to
Codepage 850:
  svn cat file | cat

Feature request:
"svn cat file" should also work with Codepage 850.

Who agrees that this feature request is worth a priority P4 issue?

(I have to use Codepage 850, because Cygwin needs it.)


Reproduction script:
--------------------

ver
chcp

rem Codepage 850:
rem works (correct German umlaut characters are displayed)
svn revert a
rem does not work (umlaut characters are replaced by wrong characters)
svn cat c

rem Codepage 1252:
chcp 1252
rem works (correct German umlaut characters are displayed)
svn revert b
rem works (correct German umlaut characters are displayed)
svn cat c


Full console output follows:
----------------------------

C:>ver
Microsoft Windows XP [Version 5.1.2600]

C:>chcp
Aktive Codepage: 850.

C:>svn revert a
Rückgängig gemacht: »a«

C:>svn cat c
õ÷³ -Í_ ¯ º  -- abcd1234

C:>chcp 1252
Aktive Codepage: 1252.

C:>svn revert b
Rückgängig gemacht: »b«

C:>svn cat c
äöü ÄÖÜ ß §  -- abcd1234
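
The garbled line above is what one would expect if the file is stored in CP1252 and its bytes reach a CP850 console unconverted. A minimal Python sketch of that reading follows; the CP1252 file encoding is an assumption (it matches the output shown), and `render` is a hypothetical helper, not part of svn:

```python
# Sample text from the report, encoded as it would be stored in a
# CP1252 file (the file's encoding is an assumption).
stored = "äöü ÄÖÜ ß §".encode("cp1252")


def render(raw: bytes, console_codepage: str) -> str:
    """Interpret raw bytes the way a console set to that codepage would."""
    return raw.decode(console_codepage)


as_cp1252 = render(stored, "cp1252")  # console after "chcp 1252"
as_cp850 = render(stored, "cp850")    # console after "chcp 850"

# Same bytes, different characters: only the interpretation changed.
print(stored.hex(" "))
print(as_cp1252 == as_cp850)
```

Read as CP850, the three lowercase umlauts come out as "õ÷³" and "§" as "º", as in the console output. Ä, Ü and ß land on CP850 box-drawing characters, which the console font or the mail transcript apparently approximated as "-", "_" and "¯".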


Paul.

 

> -----Original Message-----
> From: Paul Maier [mailto:svn-user@web.de]
> Sent: Sunday, 31 October 2010 03:29
> To: 'users@subversion.apache.org'
> Subject: Problem with code page on Windows XP SP3
> 
> Hello David, Alexey,
> 
> with your help, I was able to solve the problem. 
> Thanks!
> 
> For anyone interested in my workaround:
> 
> Yes, "chcp 1252" plus Lucida Console breaks the Cygwin output.
> I first thought of wrapping all Cygwin commands in batch 
> scripts, like grep.bat containing "chcp 850 & grep %* & chcp 1252". 
> That works as long as the (batch) commands aren't chained in 
> a pipe; otherwise the first batch script to finish switches the 
> code page back too early.
> 
> Therefore, codepage 1252 is not an option for me, so I am staying 
> with Codepage 850.
> As a workaround I found that
>   svn cat | cat
> converts the output so that it displays correctly. That's ok for me.
> 
> I even considered starting the DOS console with the command
>   cmd /K "cmd | cat"
> which runs ALL output automatically through cat. That works
> quite well, except that the TAB key for file name completion doesn't
> work any more. But this solution failed because of something
> that I don't fully understand yet: some commands seem to be
> able to adjust to codepage 850. Look:
> 
> "svn cat" only works properly with Codepage 1252.
> But "svn help ci" works fine with both Codepage 850 and 1252.
> So svn seems to be able to adjust to the Codepage sometimes.
> 
> And this "svn help ci" breaks my "cmd | cat" trick, as 
> "svn help ci" seems to be already adjusted to Codepage 850 
> somehow, gets double-adjusted by "| cat", and therefore 
> comes out broken. Sad thing. The "cmd | cat" would, once 
> installed, have been an invisible solution. Oh well.
> 
> Paul.
>  
> 

Windows Codepage: Problem SOLVED

Posted by Paul Maier <sv...@web.de>.
Hi David, Bert, Alexey and everybody,

now it WORKS!! Thank you for your advice. 8-))

To solve my encoding problem, I had to:
- upgrade to cygwin 1.7
- set LC_ALL=de_DE.CP1252  (for Cygwin)
- chcp 1252                (for Windows)
- use a True-Type font (Lucida Console)

=> Now Windows, the DOS console, Cygwin and svn can talk to each
   other using all German umlaut characters and such.
   GREAT!
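
Both settings in the list above name the same encoding, 1252. A small Python sketch of checking that a locale value and a console codepage agree; `codeset_of` and `matches_codepage` are hypothetical helper names, and the codepage number is assumed to be supplied by the caller (for example, read off the output of `chcp`):

```python
def codeset_of(locale_value: str) -> str:
    """Return the codeset part of a value like 'de_DE.CP1252' ('' if absent)."""
    _, _, rest = locale_value.partition(".")
    # Drop an optional '@modifier' suffix such as '@euro'.
    return rest.partition("@")[0]


def matches_codepage(locale_value: str, codepage: int) -> bool:
    """True if the locale's codeset names the given Windows codepage."""
    return codeset_of(locale_value).upper() == f"CP{codepage}"


print(matches_codepage("de_DE.CP1252", 1252))
print(matches_codepage("de_DE.CP1252", 850))
```

The second call illustrates the mismatch from the earlier setup, where the console was on codepage 850.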

Paul.
 

> -----Original Message-----
> From: David Huang [mailto:khym@azeotrope.org]
> Sent: Monday, 1 November 2010 04:23
> To: users@subversion.apache.org
> Subject: Re: Feature request: Support of Windows Codepage 850
> 
> 
> On Oct 31, 2010, at 1:35 PM, Paul Maier wrote:
> 
> > Hi there!
> > 
> > All output that comes from svn's internationalization displays correctly
> > under both codepages, 850 and 1252, e.g. error messages or "svn help ci". Good.
> > (The font in use is Lucida Console.)
> > 
> > This shows that it *is* possible for svn to produce correct output under both
> > codepages, 850 and 1252.
> > 
> > But output that comes from file data ("svn cat file") only prints correctly with
> > Codepage 1252. It seems that this output goes through a different piece of code
> > that is not able to adjust to the Codepage in use.
> 
> svn cat just outputs the file that's in the repository... if 
> your file happens to be a CP1252-encoded text file, but 
> your console expects CP850, I don't think you should expect 
> it to display properly. And how would svn even know that the 
> file is a CP1252-encoded text file? While it does keep track 
> of the MIME type, it doesn't keep track of charset 
> encoding... perhaps I have a Chinese BIG5-encoded file in my 
> repo; if SVN assumed it was CP1252 and tried to do a 
> 1252->850 conversion on it, it'd corrupt the file.
> 
> I think svn cat just needs to output the exact bytes of the 
> file, rather than trying to do any conversion. If you want to 
> do charset conversion, you could probably just pipe the 
> output through iconv (available in Cygwin). E.g., svn cat 
> filename | iconv -f cp1252 -t cp850
> 
> > (I have to use Codepage 850, because Cygwin needs it.)
> 
> I haven't tried it myself, but my understanding is that 
> Cygwin 1.7 honors the locale environment variables. Perhaps 
> try "set LC_CTYPE=de_DE.CP1252" or perhaps one of the other 
> locale variables (LANG? LC_ALL?)

Re: Feature request: Support of Windows Codepage 850

Posted by David Huang <kh...@azeotrope.org>.
On Oct 31, 2010, at 1:35 PM, Paul Maier wrote:

> Hi there!
> 
> All output that comes from svn's internationalization displays correctly
> under both codepages, 850 and 1252, e.g. error messages or "svn help ci". Good.
> (The font in use is Lucida Console.)
> 
> This shows that it *is* possible for svn to produce correct output under both
> codepages, 850 and 1252.
> 
> But output that comes from file data ("svn cat file") only prints correctly with
> Codepage 1252. It seems that this output goes through a different piece of code
> that is not able to adjust to the Codepage in use.

svn cat just outputs the file that's in the repository... if your file happens to be a CP1252-encoded text file, but your console expects CP850, I don't think you should expect it to display properly. And how would svn even know that the file is a CP1252-encoded text file? While it does keep track of the MIME type, it doesn't keep track of charset encoding... perhaps I have a Chinese BIG5-encoded file in my repo; if SVN assumed it was CP1252 and tried to do a 1252->850 conversion on it, it'd corrupt the file.
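
The BIG5 concern can be illustrated with a short Python sketch. The sample text and the helper `assume_cp1252_to_cp850` are made up for illustration; the helper stands in for a hypothetical blind conversion and is not anything svn does:

```python
# A short BIG5-encoded sample (the text itself is arbitrary).
original_text = "中文"
stored = original_text.encode("big5")


def assume_cp1252_to_cp850(raw: bytes) -> bytes:
    """Blindly treat the bytes as CP1252 text and re-encode them as CP850.

    errors="replace" keeps the sketch from raising on bytes or characters
    that have no mapping; they are substituted instead.
    """
    text = raw.decode("cp1252", errors="replace")
    return text.encode("cp850", errors="replace")


converted = assume_cp1252_to_cp850(stored)

# The byte sequence is no longer the one that was stored in the file.
print(stored.hex(" "), "->", converted.hex(" "))
print(converted == stored)
```

Since the converted bytes differ from the stored ones, anything that later reads them as BIG5 no longer gets the original text back.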

I think svn cat just needs to output the exact bytes of the file, rather than trying to do any conversion. If you want to do charset conversion, you could probably just pipe the output through iconv (available in Cygwin). E.g., svn cat filename | iconv -f cp1252 -t cp850
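
For comparison, the same 1252-to-850 conversion can be sketched in Python. This assumes the file really is CP1252 text; `convert_encoding` is a hypothetical helper name, and the sample bytes stand in for the output of "svn cat filename":

```python
def convert_encoding(raw: bytes, source: str = "cp1252",
                     target: str = "cp850") -> bytes:
    """Decode bytes from the source charset and re-encode them in the target.

    Raises UnicodeEncodeError if a character has no equivalent in the
    target charset.
    """
    return raw.decode(source).encode(target)


# Stand-in for the bytes that "svn cat" would write for a CP1252 file.
file_bytes = "äöü ÄÖÜ ß §".encode("cp1252")

console_bytes = convert_encoding(file_bytes)

# What a CP850 console would show for the converted bytes.
shown = console_bytes.decode("cp850")
print(shown == "äöü ÄÖÜ ß §")
```

All characters in this sample exist in both codepages, so the round trip is lossless; a character that exists only in CP1252 would make the conversion fail in this sketch.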

> (I have to use Codepage 850, because Cygwin needs it.)

I haven't tried it myself, but my understanding is that Cygwin 1.7 honors the locale environment variables. Perhaps try "set LC_CTYPE=de_DE.CP1252" or perhaps one of the other locale variables (LANG? LC_ALL?)

RE: Feature request: Support of Windows Codepage 850

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: svn-user@web.de [mailto:svn-user@web.de]
> Sent: Sunday, 31 October 2010 19:35
> To: users@subversion.apache.org
> Subject: Feature request: Support of Windows Codepage 850
> 
> Hi there!
> 
> All output that comes from svn's internationalization displays correctly
> under both codepages, 850 and 1252, e.g. error messages or "svn help ci". Good.
> (The font in use is Lucida Console.)
> 
> This shows that it *is* possible for svn to produce correct output under both
> codepages, 850 and 1252.
> 
> But output that comes from file data ("svn cat file") only prints correctly with
> Codepage 1252. It seems that this output goes through a different piece of code
> that is not able to adjust to the Codepage in use.
> 
> So at the moment the behaviour of svn with Codepage 850 is inconsistent:
> sometimes it is supported ("svn help ci"), sometimes not ("svn cat").
> 
> Currently I need Cygwin's cat command to convert the output of "svn cat" to
> Codepage 850:
>   svn cat file | cat
> 
> Feature request:
> "svn cat file" should also work with Codepage 850.
> 
> Who agrees that this feature request is worth a priority P4 issue?
> 
> (I have to use Codepage 850, because Cygwin needs it.)

'svn cat file' doesn't perform any translation at all. It just writes what
is in 'file' to the standard output device.
(Subversion doesn't know which character encoding is used in 'file')

	Bert