You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Roman Donchenko <DX...@yandex.ru> on 2009/08/26 22:40:46 UTC

[PATCH] Remove po file charset stripping on Windows

Greetings,

As far as I know, the Subversion build system currently strips the  
Content-Type header from the po files on systems that don't support  
bind_textdomain_codeset(). This is performed to prevent libintl from  
converting the translated messages into the locale charset. On Unix, this  
is done via a sed invocation in the Makefile. On Windows, it's done with  
build/strip-po-charset.py.

However! The libintl implementation that's used on Windows is a slightly  
hacked-up version of GNU libintl, which DOES have bind_textdomain_codeset.  
Thus, I have no idea why is the stripping done in this case. It's kludgy  
and generates msgfmt warnings, so I whipped up the attached patch that  
makes Subversion use bind_textdomain_codeset instead. I didn't see any  
breakage, but I wouldn't mind some testing from someone actually using one  
of the locales that have translations.

[[[
On Windows, don't strip the charset from PO files - use  
bind_textdomain_codeset instead.

* build/generator/build_locale.ezt: Remove the strip-po-charset.py  
invocation.

* build/generator/gen_win.py:
   (POFile.__init__): don't store the .spo file name.

* build/strip-po-charset.py: Remove.

* subversion/libsvn_subr/nls.c:
   (svn_nls_init): Move the bind_textdomain_codeset invocation out of the  
#ifdef WIN32 block.

* subversion/svn_private_config.hw: Indicate that bind_textdomain_codeset  
is supported.
]]]

Awaiting your increments,
Roman.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2387739

Re: [PATCH] Remove po file charset stripping on Windows

Posted by Branko Cibej <br...@xbc.nu>.
Roman Donchenko wrote:
> Branko Cibej <br...@xbc.nu> писал в своём письме Sun, 13 Sep 2009 14:13:57  
> +0400:
>
>   
>> Roman Donchenko wrote:
>>     
>>> Meanwhile, I noticed that the Subversion libintl is specifically  
>>> modified
>>> not to link to libiconv, which means it's not going to recode the  
>>> strings
>>> whether bind_textdomain_codeset is called or not. Huh. I still suppose  
>>> an
>>> explicit call won't hurt.
>>>
>>>       
>> I frankly don't understand the point of this discussion. Removing
>> codeset definition from the .po files is just an extra safety net that
>> prevents the aforementioned bug from raising its head if someone links
>> Subversion binaries with a different variant of libintl that does depend
>> on libiconv. The only downside I can think of is that it causes msgfmt
>> to complain during the build, but those warnings have no adverse side
>> effects.
>>     
>
> The point is that we can get rid of the warnings and simplify the build  
> system by removing the stripping,
> while still providing the safety net by calling bind_textdomain_codeset().
>   

I see. Assuming your change has no effect on how messages are displayed
on Windows, then I suppose it's a case of your bikeshed being lilac with
wisteria stripes, and mine the other way around. I just can't test that
assumption right now.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2394438

Re: [PATCH] Remove po file charset stripping on Windows

Posted by Roman Donchenko <DX...@yandex.ru>.
Branko Cibej <br...@xbc.nu> писал в своём письме Sun, 13 Sep 2009 14:13:57  
+0400:

> Roman Donchenko wrote:
>> Meanwhile, I noticed that the Subversion libintl is specifically  
>> modified
>> not to link to libiconv, which means it's not going to recode the  
>> strings
>> whether bind_textdomain_codeset is called or not. Huh. I still suppose  
>> an
>> explicit call won't hurt.
>>
>
>
> I frankly don't understand the point of this discussion. Removing
> codeset definition from the .po files is just an extra safety net that
> prevents the aforementioned bug from raising its head if someone links
> Subversion binaries with a different variant of libintl that does depend
> on libiconv. The only downside I can think of is that it causes msgfmt
> to complain during the build, but those warnings have no adverse side
> effects.

The point is that we can get rid of the warnings and simplify the build  
system by removing the stripping,
while still providing the safety net by calling bind_textdomain_codeset().

Roman.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2394298

Re: [PATCH] Remove po file charset stripping on Windows

Posted by Branko Cibej <br...@xbc.nu>.
Roman Donchenko wrote:
> Meanwhile, I noticed that the Subversion libintl is specifically modified  
> not to link to libiconv, which means it's not going to recode the strings  
> whether bind_textdomain_codeset is called or not. Huh. I still suppose an  
> explicit call won't hurt.
>   

The modification to libintl for Windows SVN is intentional. Because the
internal character encoding of APR on Windows is always UTF-8, as is the
source encoding of the .po files,
any kind of encoding conversion within libintl would be invalid and a
bug. Hence the removal of the libiconv dependency -- which, by the way,
is optional on Unix, but requires manual steps on Windows because of the
limitations of the build system.

I frankly don't understand the point of this discussion. Removing
codeset definition from the .po files is just an extra safety net that
prevents the aforementioned bug from raising its head if someone links
Subversion binaries with a different variant of libintl that does depend
on libiconv. The only downside I can think of is that it causes msgfmt
to complain during the build, but those warnings have no adverse side
effects.

-- Brane

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2394211

Re: [PATCH] Remove po file charset stripping on Windows

Posted by Roman Donchenko <DX...@yandex.ru>.
Inspired by Producing OSS, I'm taking this back to the list.

Gavin 'Beau' Baumanis <ga...@thespidernet.com> писал в своём письме Sat,  
12 Sep 2009 04:35:44 +0400:

> Hi Roman,
>
> Do you have commit rights for that area of the repository?
> If you have, then you've ben afforded those privileges because you're  
> trusted to do things correctly.
>
> (I understand your desire to have some extra eyes look over it).
> But not withstanding that... since no one else has yet managed to get to  
> it and as long as
> you've built it against trunk etc...all the normal things... then Surely  
> it is appropriate to commit it?
>
> if you're still a little hesitant to commit, then at the very worst, you  
> could create a ticket and attach the code to it and then reference your  
> ticket number back here in this thread.
>
> But again - if you have commit rights... then (especially now that it  
> has been two weeks) simply commit it.
> If it causes an unforeseen issue down the track  - it can always be  
> reverted.
>
>
> Gavin.

Gavin,

That depends on the meaning of "commit rights". I can commit all over the  
repository, but the build system is outside my delegated zone of authority.

HACKING doesn't seem to imply that no approval allows a partial committer  
to commit outside of his sandbox, and seeing as the build system is an  
important part of the project, I'd rather not assume that.


Meanwhile, I noticed that the Subversion libintl is specifically modified  
not to link to libiconv, which means it's not going to recode the strings  
whether bind_textdomain_codeset is called or not. Huh. I still suppose an  
explicit call won't hurt.

Roman.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2394136

Re: [PATCH] Remove po file charset stripping on Windows

Posted by Roman Donchenko <DX...@yandex.ru>.
Roman Donchenko <DX...@yandex.ru> писал в своём письме Thu, 27 Aug 2009  
02:40:46 +0400:

> [[[
> On Windows, don't strip the charset from PO files - use
> bind_textdomain_codeset instead.
>
> * build/generator/build_locale.ezt: Remove the strip-po-charset.py
> invocation.
>
> * build/generator/gen_win.py:
>    (POFile.__init__): don't store the .spo file name.
>
> * build/strip-po-charset.py: Remove.
>
> * subversion/libsvn_subr/nls.c:
>    (svn_nls_init): Move the bind_textdomain_codeset invocation out of the
> #ifdef WIN32 block.
>
> * subversion/svn_private_config.hw: Indicate that bind_textdomain_codeset
> is supported.
> ]]]

Ping. This patch submission has received no comments (which is  
understandable since it touches stuff written in 2004 in an attempt to get  
rid of a few warnings, but the desire to DTRT makes me push it ;=]).

Meanwhile, I tested that the change, in fact, doesn't break the non-ASCII  
characters in strings, by (temporarily) wrecking the German translation.  
8=]

Cheers,
Roman.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2391333