Posted to users@subversion.apache.org by "Carpentier, Fabien" <fc...@cesi.fr> on 2009/04/03 10:25:57 UTC

RE: [Error conversion UTF-8]

Hi,

Thanks for your answer.

I looked into changing the LANG environment variable.

In /etc/login.conf, I added this:

French:French Users Accounts:\
	:charset=ISO-8859-15:\
	:lang=fr_FR.ISO8859-15:\
	:tc=default:

But it doesn't work; I still have the same problem.
I am now out of ideas.

Thanks for helping me.

Fabien CARPENTIER



-----Original Message-----
From: Ryan Schmidt [mailto:subversion-2009a@ryandesign.com]
Sent: Wednesday, March 11, 2009 19:46
To: Carpentier, Fabien
Cc: users@subversion.tigris.org
Subject: Re: [Error conversion UTF-8]


On Mar 11, 2009, at 06:34, Carpentier, Fabien wrote:

> In Redmine, when I try to view the differences between two
> revisions, I get this error message:
> svn: Safe data 'Index: Re' was followed by non-ASCII byte 195:
> unable to convert to/from UTF-8
>
> The problem is that the file name is Reédmine. (é = byte 195)
>
> Svn does not seem to support accents.
>
> I use OpenBSD 4.1, Subversion 1.5.5 and the latest revision of Redmine.
>
> I tried to solve this problem by changing the file utf.c and the
> function check_non_ascii, ...
> But it doesn't work.

Subversion stores filenames as UTF-8 internally. In order to be able  
to present filenames to you properly, it needs to know what encoding  
your terminal uses, so you need to set the LANG environment variable  
to the correct value for your terminal.

I'm not sure how this applies to Redmine (never used it) but  
presumably Redmine needs to tell Subversion what encoding to use.
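
One way a front end could "tell Subversion what encoding to use" is to set the locale variables in the environment of the svn process it launches. The following is only a sketch in Python of that idea; it is not Redmine's code, the helper names svn_environment and run_svn_diff are made up for illustration, and it assumes the chosen locale (here en_US.UTF-8) is actually installed on the machine.

```python
import os
import subprocess


def svn_environment(base_env, lang="en_US.UTF-8"):
    """Return a copy of base_env with LANG set for a child svn process.

    Hypothetical helper: the default locale name is an assumption and
    must exist on the system for it to have any effect.
    """
    env = dict(base_env)
    # LC_ALL and LC_CTYPE take precedence over LANG, so drop them here
    # to make sure the LANG value below is the one that applies.
    env.pop("LC_ALL", None)
    env.pop("LC_CTYPE", None)
    env["LANG"] = lang
    return env


def run_svn_diff(url, old_rev, new_rev):
    """Run `svn diff` between two revisions with an explicit locale."""
    cmd = ["svn", "diff", "-r", f"{old_rev}:{new_rev}", url]
    return subprocess.run(
        cmd, env=svn_environment(os.environ), capture_output=True, check=False
    )
```

The point of building the environment explicitly is that a web application often does not go through a login shell, so settings made in a shell profile or login class may never reach the svn process.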

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1530997

To unsubscribe from this discussion, e-mail: [users-unsubscribe@subversion.tigris.org].


Re: [Error conversion UTF-8]

Posted by Shrinivasan <ts...@gmail.com>.
Hi.



> I don't know about /etc/login.conf. On my Mac, I just added this line
> to my ~/.bash_profile:
>
> export LANG=en_US.UTF-8
>

I have the same problem on my FreeBSD box.
I set "export LANG=en_US.UTF-8" and rebooted,
but I still have the same problem.

Please help me solve this issue.

thanks.

-- 
Regards,
T.Shrinivasan


My experiences with Linux are here
http://goinggnu.wordpress.com

For Free and Open Source Jobs
http://fossjobs.wordpress.com

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1563899


Re: [Error conversion UTF-8]

Posted by Paul Koning <Pa...@dell.com>.
>>>>> "Shrinivasan" == Shrinivasan T <Shrinivasan> writes:

 Shrinivasan> Hi paul,
 >> The solution to this is to normalize all Unicode strings.  If you
 >> ever need to compare strings, you have to normalize first; if you
 >> don't then it will not work.  I'm pretty sure Subversion didn't do
 >> this; I don't know if it does now.

 Shrinivasan> How can I normalize the Unicode strings in filenames
 Shrinivasan> on FreeBSD?

I don't know.

Part of the issue is that the software has to do it, not the user.
The reason is that the terminal I/O machinery may pick one encoding,
while the file system may pick a different one.  Or one client OS may
do it one way while another OS does it a different way.  In either
case, the svn client and server have to do the normalizing.

      paul

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1582060



Re: [Error conversion UTF-8]

Posted by Shrinivasan <ts...@gmail.com>.
Hi Paul,


> The solution to this is to normalize all Unicode strings.  If you ever
> need to compare strings, you have to normalize first; if you don't
> then it will not work.  I'm pretty sure Subversion didn't do this; I
> don't know if it does now.

How can I normalize the Unicode strings in filenames on FreeBSD?

Thanks for the information.

-- 
Regards,
T.Shrinivasan


My experiences with Linux are here
http://goinggnu.wordpress.com

For Free and Open Source Jobs
http://fossjobs.wordpress.com

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1581859



Re: [Error conversion UTF-8]

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Apr 7, 2009, at 11:21, Paul Koning wrote:

> Another possible problem, which I've run into on the Mac, is that some
> characters can be encoded in more than one way in UTF-8.  And the Mac
> filesystem converts them all to a single preferred encoding.  So if
> you read back a filename, it may not match what you originally
> supplied.
>
> The solution to this is to normalize all Unicode strings.  If you ever
> need to compare strings, you have to normalize first; if you don't
> then it will not work.  I'm pretty sure Subversion didn't do this; I
> don't know if it does now.

Yes, that bug remains:

http://subversion.tigris.org/issues/show_bug.cgi?id=2464

A workaround for this problem is available in the +unicode_path  
variant if you install Subversion using MacPorts.

But it is not what is occurring for the OP in this thread, whose  
error message was:

> svn: Safe data 'Index: Re' was followed by non-ASCII byte 195:
> unable to convert to/from UTF-8

You will not see that error message if the string is valid UTF-8.
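
For reference, the byte value quoted in the message can be reproduced with a few lines of Python. This is only an illustration of where an ASCII-only scan of "Index: Reédmine" would stop; first_non_ascii is a made-up helper, not Subversion's actual check, and the sketch does not model which conversion Subversion was attempting.

```python
name = "Reédmine"
utf8_bytes = name.encode("utf-8")

# In UTF-8, é (U+00E9) is the two-byte sequence 0xC3 0xA9,
# i.e. byte 195 followed by byte 169.
e_acute_bytes = list(utf8_bytes[2:4])


def first_non_ascii(data):
    """Return (ascii_prefix, offending_byte), or (data, None) if all ASCII."""
    for i, b in enumerate(data):
        if b > 127:
            return data[:i], b
    return data, None


prefix, bad_byte = first_non_ascii(b"Index: " + utf8_bytes)
print(e_acute_bytes, prefix, bad_byte)
```

So byte 195 is the lead byte of é when the name is encoded as UTF-8, and the ASCII prefix before it is 'Index: Re', the same prefix the error message reports.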

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1589462


RE: [Error conversion UTF-8]

Posted by Paul Koning <Pa...@dell.com>.
>>>>> "Bert" == Bert Huijben <rh...@sharpsvn.net> writes:

 Bert> On the Mac and on Windows the filesystem always uses Unicode to
 Bert> represent filenames (Mac as UTF-8, Windows as UCS-2/UTF-16), so
 Bert> the LANG setting only applies to the client I/O there and never
 Bert> to the paths. On Unix systems, paths don't have a specific
 Bert> encoding (paths consist of bytes), so the LANG setting applies
 Bert> to path names too.

 Bert> In this case a file on disk has a path that can't be
 Bert> interpreted under the current LANG setting. (E.g. UTF-8 uses
 Bert> lead and follow bytes for multibyte characters; if the first
 Bert> byte of a multibyte character is a follow byte, its encoding is
 Bert> invalid.) This is probably caused by saving files with names in
 Bert> one encoding (e.g. ISO-8859-1) and then reading them back with
 Bert> another encoding (probably UTF-8).

Another possible problem, which I've run into on the Mac, is that some
characters can be encoded in more than one way in UTF-8 (for example,
an accented letter as one precomposed code point, or as a base letter
followed by a combining accent).  And the Mac filesystem converts them
all to a single preferred encoding.  So if you read back a filename,
it may not match what you originally supplied.

The solution to this is to normalize all Unicode strings.  If you ever
need to compare strings, you have to normalize first; if you don't
then it will not work.  I'm pretty sure Subversion didn't do this; I
don't know if it does now.  
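
To make the "more than one way" point concrete, here is a small Python sketch using the standard unicodedata module. The helper name same_name is made up for illustration, and the choice of NFC as the comparison form is an assumption; any single form works as long as both sides use the same one.

```python
import unicodedata

composed = "Re\u00e9dmine"     # é as one code point (U+00E9)
decomposed = "Ree\u0301dmine"  # e followed by combining acute accent (U+0301)

# The two strings render the same but have different UTF-8 bytes.
print(composed.encode("utf-8"))    # b'Re\xc3\xa9dmine'
print(decomposed.encode("utf-8"))  # b'Ree\xcc\x81dmine'


def same_name(a, b, form="NFC"):
    """Compare two names after normalizing both to the same Unicode form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(composed == decomposed)           # False: raw comparison fails
print(same_name(composed, decomposed))  # True: normalized comparison matches
```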

      paul

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1579507


RE: [Error conversion UTF-8]

Posted by Bert Huijben <rh...@sharpsvn.net>.
> -----Original Message-----
> From: Ryan Schmidt [mailto:subversion-2009a@ryandesign.com]
> Sent: Monday, April 6, 2009 11:07
> To: Carpentier, Fabien
> Cc: users@subversion.tigris.org
> Subject: Re: [Error conversion UTF-8]
> 
> On Apr 3, 2009, at 05:25, Carpentier, Fabien wrote:
> 
> > Ryan Schmidt wrote:
> >
> >> On Mar 11, 2009, at 06:34, Carpentier, Fabien wrote:
> >>
> >>> In Redmine, when I try to view the differences between two
> >>> revisions, I get this error message:
> >>> svn: Safe data 'Index: Re' was followed by non-ASCII byte 195:
> >>> unable to convert to/from UTF-8
> >>>
> >>> The problem is that the file name is Reédmine. (é = byte 195)
> >>>
> >>> Svn does not seem to support accents.
> >>>
> >>> I use OpenBSD 4.1, Subversion 1.5.5 and the latest revision of
> >>> Redmine.
> >>>
> >>> I tried to solve this problem by changing the file utf.c and the
> >>> function check_non_ascii, ...
> >>> But it doesn't work.
> >>
> >> Subversion stores filenames as UTF-8 internally. In order to be able
> >> to present filenames to you properly, it needs to know what encoding
> >> your terminal uses, so you need to set the LANG environment variable
> >> to the correct value for your terminal.
> >>
> >> I'm not sure how this applies to Redmine (never used it) but
> >> presumably Redmine needs to tell Subversion what encoding to use.
> >
> > I looked into changing the LANG environment variable.
> >
> > In /etc/login.conf, I added this:
> >
> > French:French Users Accounts:\
> > 	:charset=ISO-8859-15:\
> > 	:lang=fr_FR.ISO8859-15:\
> > 	:tc=default:
> >
> > But it doesn't work; I still have the same problem.
> > I am now out of ideas.
> 
> I don't know about /etc/login.conf. On my Mac, I just added this line
> to my ~/.bash_profile:
> 
> export LANG=en_US.UTF-8

On the Mac and on Windows the filesystem always uses Unicode to represent filenames (Mac as UTF-8, Windows as UCS-2/UTF-16), so the LANG setting only applies to the client I/O there and never to the paths.
On Unix systems, paths don't have a specific encoding (paths consist of bytes), so the LANG setting applies to path names too.

In this case a file on disk has a path that can't be interpreted under the current LANG setting. (E.g. UTF-8 uses lead and follow bytes for multibyte characters; if the first byte of a multibyte character is a follow byte, its encoding is invalid.) This is probably caused by saving files with names in one encoding (e.g. ISO-8859-1) and then reading them back with another encoding (probably UTF-8).
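
The mismatch described above can be illustrated in Python. This is only a sketch of the general effect using Python's own codecs, not of what Subversion does internally; decodes_as is a made-up helper name.

```python
name = "Reédmine"
latin1_bytes = name.encode("iso-8859-1")  # b'Re\xe9dmine', é is one byte
utf8_bytes = name.encode("utf-8")         # b'Re\xc3\xa9dmine', é is two bytes


def decodes_as(data, encoding):
    """Return True if the byte string is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A name saved as ISO-8859-1 is not valid UTF-8: 0xE9 is a lead byte
# that must be followed by follow bytes, but 'd' comes next.
print(decodes_as(latin1_bytes, "utf-8"))

# A follow byte on its own, with no lead byte before it, is also invalid.
print(decodes_as(b"\xa9", "utf-8"))

# The opposite mismatch does not fail; it silently yields wrong characters.
print(utf8_bytes.decode("iso-8859-1"))
```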

	Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1578172



Re: [Error conversion UTF-8]

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Apr 3, 2009, at 05:25, Carpentier, Fabien wrote:

> Ryan Schmidt wrote:
>
>> On Mar 11, 2009, at 06:34, Carpentier, Fabien wrote:
>>
>>> In Redmine, when I try to view the differences between two
>>> revisions, I get this error message:
>>> svn: Safe data 'Index: Re' was followed by non-ASCII byte 195:
>>> unable to convert to/from UTF-8
>>>
>>> The problem is that the file name is Reédmine. (é = byte 195)
>>>
>>> Svn does not seem to support accents.
>>>
>>> I use OpenBSD 4.1, Subversion 1.5.5 and the latest revision of
>>> Redmine.
>>>
>>> I tried to solve this problem by changing the file utf.c and the
>>> function check_non_ascii, ...
>>> But it doesn't work.
>>
>> Subversion stores filenames as UTF-8 internally. In order to be able
>> to present filenames to you properly, it needs to know what encoding
>> your terminal uses, so you need to set the LANG environment variable
>> to the correct value for your terminal.
>>
>> I'm not sure how this applies to Redmine (never used it) but
>> presumably Redmine needs to tell Subversion what encoding to use.
>
> I looked into changing the LANG environment variable.
>
> In /etc/login.conf, I added this:
>
> French:French Users Accounts:\
> 	:charset=ISO-8859-15:\
> 	:lang=fr_FR.ISO8859-15:\
> 	:tc=default:
>
> But it doesn't work; I still have the same problem.
> I am now out of ideas.

I don't know about /etc/login.conf. On my Mac, I just added this line  
to my ~/.bash_profile:

export LANG=en_US.UTF-8
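
If setting LANG appears to have no effect, it can help to check which variable actually decides the character encoding. The sketch below follows the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG); the function names are made up for illustration, and the fallback to "C" when nothing is set is an assumption about the typical default.

```python
def effective_ctype_locale(env):
    """Return (variable, value) for the setting that selects the encoding."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return var, value
    return None, "C"  # assumed default when nothing is set


def codeset_of(locale_name):
    """Extract the codeset part of a name such as fr_FR.ISO8859-15."""
    base = locale_name.split("@", 1)[0]  # drop a modifier such as @euro
    if "." in base:
        return base.split(".", 1)[1]
    return None  # e.g. "C" or "fr_FR" carry no explicit codeset


# Example: LANG is set as in the line above, but LC_ALL overrides it.
print(effective_ctype_locale({"LANG": "en_US.UTF-8", "LC_ALL": "C"}))
print(codeset_of("fr_FR.ISO8859-15"))
```

Passing os.environ from the process that launches svn shows what that process really sees, which may differ from an interactive shell.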

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=1560038
