Posted to users@subversion.apache.org by David Kramer <da...@thekramers.net> on 2005/09/06 19:30:22 UTC

Locale problem: Can't convert string from native encoding to 'UTF-8'

Background:
- I'm running svnserve 1.1.0 on my old server using berkeley back end for the
repository. (SuSE 9.0)

- I just built a new server, now running svnserve 1.2.1, using fsfs for the
repository (SUSE 9.3, with subversion server and libraries updated from
ftp.suse.com/pub/projects)

- I dumped the repositories from the old server and loaded them on the new
server.

- I twiddled my router so now all svn:// requests went to the new server.

After doing that, I was able to check out and in with the new repository (this
is as of a day or two ago).  Now, I get the following error:

david@deepthink:/devel/agilerules/web/private/htdocs/worknotes> svn commit -m
"web private: removed worknotes/htmlunit"
Deleting       worknotes/htmlunit
svn: Commit failed (details follow):
svn: Can't convert string from native encoding to 'UTF-8':
@?\217?d?\14?\184?\174

- All files and filenames in version control are plain text or OpenOffice
documents.  All plain text files are normal ASCII US characters.

Now, I did some RTFM and some STFW, and noticed that (1) localization was
added as part of the 1.2 series changes, and (2) the way you set the
localization of the server is by exporting LANG to the correct setting in the
script that starts up svnserve, and I've now done that, setting/exporting both
LANG and LC_ALL to en_US.UTF-8 in /etc/init.d/svnserve.  This did not seem to
help.
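
One way to double-check that the exported locale actually reached the running daemon is to read its environment back. The following is only a minimal sketch, assuming a Linux system where /proc/PID/environ is readable; the helper name locale_vars_from_environ is made up for illustration and is not part of Subversion.

```shell
#!/bin/sh
# Sketch only: print the locale-related variables found in a
# NUL-separated environment file such as /proc/PID/environ (Linux).
# locale_vars_from_environ is a hypothetical helper name.
locale_vars_from_environ() {
    # Turn the NUL separators into newlines, then keep LANG and LC_* only.
    tr '\0' '\n' < "$1" | grep -E '^(LANG|LC_[A-Z_]*)='
}
```

It would be called as `locale_vars_from_environ /proc/PID/environ`, with PID replaced by the process id of svnserve. If nothing is printed, the daemon was probably started without the exported variables.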

We tried upgrading the client to the latest, and that didn't seem to help either.

After changing the locale on svnserve, do I need to do the load operations again?

Do I need to make new working copies?

One bit of funkiness is that I had libapr0-2.0.53-9, to match
apache2-2.0.53-9.2.  The subversion-server-1.2.1-1.1. said to install
libapr0-2.0.54-7.1.i586.rpm, but when I did so, apache would segfault on all
client requests, so I reverted to libapr0-2.0.53-9.  Both apps seemed to work
after that, but I'm not sure if that's involved or not.  Maybe whether using
that version of svnserve with that version of libapr is a separate
question.


Thanks.  We can't check in anything until this is resolved, so please get back
to me with any tips as soon as possible.  Thanks.




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Locale problem: Can't convert string from native encoding to 'UTF-8'

Posted by Joshua Varner <jl...@gmail.com>.
On 9/7/05, David Kramer <da...@thekramers.net> wrote:
> On Tue, September 6, 2005 3:30 pm, David Kramer wrote:
> > Background:
> > - I'm running svnserve 1.1.0 on my old server using berkeley back end for the
> > repository. (SuSE 9.0)
> >
> > - I just built a new server, now running svnserve 1.2.1, using fsfs for the
> > repository (SUSE 9.3, with subversion server and libraries updated from
> > ftp.suse.com/pub/projects)
> >
> > - I dumped the repositories from the old server and loaded them on the new
> > server.
> >
> > - I twiddled my router so now all svn:// requests went to the new server.
> >
> > After doing that, I was able to check out and in with the new repository (this
> > is as of a day or two ago).  Now, I get the following error:
> >
> > david@deepthink:/devel/agilerules/web/private/htdocs/worknotes> svn commit -m
> > "web private: removed worknotes/htmlunit"
> > Deleting       worknotes/htmlunit
> > svn: Commit failed (details follow):
> > svn: Can't convert string from native encoding to 'UTF-8':
> > @?\217?d?\14?\184?\174
> >
> > - All files and filenames in version control are plain text or OpenOffice
> > documents.  All plain text files are normal ASCII US characters.
> >

The only things that should be converted to UTF-8 are filenames and
some properties. Since you already rolled your own server, you might
try building a client with --enable-maintainer-mode to get the line number
of the error.

> >
> > We tried upgrading the client to the latest, and that didn't seem to help
> > either.
> >
> > After changing the locale on svnserve, do I need to do the load operations
> > again?
> >
> > Do I need to make new working copies?
> >
> > One bit of funkiness is that I had libapr0-2.0.53-9, to match
> > apache2-2.0.53-9.2.  The subversion-server-1.2.1-1.1. said to install
> > libapr0-2.0.54-7.1.i586.rpm, but when I did so, apache would segfault on all
> > client requests, so I reverted to libapr0-2.0.53-9.  Both apps seemed to work
> > after that, but I'm not sure if that's involved or not.  Maybe whether using
> > that version of svnserve with that version of libapr is a separate
> > question.
> >
> >
> > Thanks.  We can't check in anything until this is resolved, so please get back
> > to me with any tips as soon as possible.  Thanks.
> 
> We've tried a few more things, with no luck.
> 
> - I tried checking out a fresh working copy, but I got the same "Can't convert
> string from native encoding to 'UTF-8'"
> 
> - I tried wiping out the repository and reloading it, now that svnserve is
> running UTF-8, but reloading failed, with
> "uninew:/data/subversion/agilerules # svnadmin load web <
> /tmp/agilerules.web.dump
> <<< Started new transaction, based on original revision 1
>      * adding path : devel ... done.
>      * adding path : live ... done.
> svnadmin: Valid UTF-8 data
> (hex:)
> followed by invalid UTF-8 sequence
> (hex: a0 d1 07 08)
> "
> 
> Is there a way to convert the dump to UTF-8 (it's all US ASCII, AFAIK)?
> 
> Is there a way to load the dump as native, and get the svnserve client to work
> with it?
> 
> Thanks again.  We're at a dead stop over this.
> 

We've had some problems with weird characters showing up in text files
in conversions (non-subversion related). You could try the script below.
It'll be safe on any plain text data.

You can also try setting LANG=C on the server. This is a fix for some
UTF-8 conversion problems with perl, so maybe it would help.

Just some suggestions, haven't tried them for this problem, but it's
better than nothing.

Josh

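#!/bin/sh

#
# Perform some additional needed .doc to .txt translations.
# Reads the file named by $1 and writes the cleaned text to stdout.
#

# Treat the input as raw bytes, whatever locale the caller uses.
LC_ALL=C
export LC_ALL

# Build the individual bytes with printf; "echo -ne" is not portable
# across /bin/sh implementations.
o342=`printf '\342'`
o200=`printf '\200'`
o231=`printf '\231'`
o241=`printf '\241'`
o226=`printf '\226'`
o234=`printf '\234'`
o235=`printf '\235'`
o302=`printf '\302'`
o255=`printf '\255'`

# Replace UTF-8 curly quotes with ASCII quotes, drop the box character
# (E2 96 A1), turn the soft hyphen (C2 AD) into "-", then blank out any
# remaining high-bit bytes and form feeds.
sed -e "s/$o342$o200$o231/'/g"  \
    -e "s/$o342$o226$o241//g"   \
    -e "s/$o342$o200$o234/\"/g" \
    -e "s/$o342$o200$o235/\"/g" \
    -e "s/$o302$o255/-/g" \
    "$1"                        |
    tr '\200-\377\014' ' '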



Re: Locale problem: Can't convert string from native encoding to 'UTF-8'

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Sep 7, 2005, at 17:29, David Kramer wrote:

> Is there a way to convert the dump to UTF-8 (it's all US ASCII,  
> AFAIK)?

Well, ASCII is a subset of UTF-8. If you have a collection of ASCII  
characters, then you also by definition have a collection of UTF-8  
characters. There is no conversion to be done.

So the fact that you have an error signifies that you do not have  
exclusively ASCII characters.
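
A quick way to test the "exclusively ASCII" assumption is to count the bytes that have the high bit set. This is a minimal sketch, not something svnadmin provides; count_non_ascii_bytes is a hypothetical helper name. Keep in mind that a dump containing binary files (OpenOffice documents, for instance) will report a non-zero count for perfectly legitimate reasons.

```shell
#!/bin/sh
# Sketch only: count the bytes in a file that fall outside 7-bit ASCII.
# count_non_ascii_bytes is a hypothetical helper name.
count_non_ascii_bytes() {
    # Delete every byte in the range 0x00-0x7F, then count what is left.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

A result of 0 means the file really is pure ASCII, and therefore also valid UTF-8.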


> svnadmin: Valid UTF-8 data
> (hex:)
> followed by invalid UTF-8 sequence
> (hex: a0 d1 07 08)

Let's see here.... In the ISO-8859 character sets (all of them), A0
is a non-breaking space. D1 is (in ISO-8859-1, -3 and -9) a capital N
with tilde ("Ñ"); in other ISO-8859 sets it is a different character
or undefined. 07 is a bell, and 08 is a backspace. I don't know
svnadmin well enough to know what kind of data it's talking about
here. If that data is the contents of a binary file, then that
sequence of bytes is conceivable, though it certainly won't conform
to UTF-8, so I'm not sure why svnadmin would think it should. If that's
part of a text file's contents, or part of a filename or some
properties, then that's a strange sequence of characters indeed. It
almost sounds like something is corrupted somewhere. Perhaps you can
open the dump file in a good editor (as in one that doesn't have to
load the entire file into memory all at once) and search for the Ñ
and see where it is. If you have binary files in your repository,
then this may be more complicated (it will probably give you many
false positives).
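
If no suitable editor is at hand, the same search can be done from the shell by looking for the exact four bytes from the svnadmin message. This is only a sketch, assuming a grep that supports the -a, -b and -o options (GNU grep does); find_bad_sequence is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the byte offset of every occurrence of the
# sequence a0 d1 07 08 in a file. Assumes GNU-style grep options.
# find_bad_sequence is a hypothetical helper name.
find_bad_sequence() {
    # The four bytes, written as octal escapes: a0 d1 07 08.
    bad_seq=$(printf '\240\321\007\010')
    # -a: treat binary input as text, -b: print byte offsets,
    # -o: print only the match, -F: fixed string, not a regex.
    LC_ALL=C grep -a -b -o -F -e "$bad_seq" "$1" | LC_ALL=C cut -d: -f1
}
```

Running `find_bad_sequence /tmp/agilerules.web.dump` should then list the offsets to inspect. As noted above, hits inside binary file contents may be false positives.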


> svn: Can't convert string from native encoding to 'UTF-8':
> @?\217?d?\14?\184?\174

217 is D9 which is a capital U with grave accent ("Ù"). 14 is  
apparently the shift-out code, which I had never heard of until now.  
184 is B8 which is a cedilla ("¸", the hook that usually appears  
below a c). 174 is AE which is the registered trademark symbol ("®").  
Same comment and advice as above.
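
The lookups above can be reproduced with a couple of small shell helpers. This is a sketch under the same assumption made above, namely that the numbers in the error message are decimal byte values read as ISO-8859-1; dec_to_hex and latin1_char are hypothetical names, and iconv is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: interpret the decimal escapes from the svn error message.
# dec_to_hex and latin1_char are hypothetical helper names.

# Print a decimal byte value as two hex digits, e.g. 217 -> D9.
dec_to_hex() {
    printf '%02X\n' "$1"
}

# Emit the byte with that value and convert it from ISO-8859-1 to UTF-8
# so that it displays correctly on a UTF-8 terminal.
latin1_char() {
    octal=$(printf '%03o' "$1")
    printf "\\$octal" | iconv -f ISO-8859-1 -t UTF-8
}
```

For example, `for n in 217 14 184 174; do dec_to_hex "$n"; done` prints D9, 0E, B8 and AE, one per line.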





Re: Locale problem: Can't convert string from native encoding to 'UTF-8'

Posted by David Kramer <da...@thekramers.net>.
On Tue, September 6, 2005 3:30 pm, David Kramer wrote:
> Background:
> - I'm running svnserve 1.1.0 on my old server using berkeley back end for the
> repository. (SuSE 9.0)
>
> - I just built a new server, now running svnserve 1.2.1, using fsfs for the
> repository (SUSE 9.3, with subversion server and libraries updated from
> ftp.suse.com/pub/projects)
>
> - I dumped the repositories from the old server and loaded them on the new
> server.
>
> - I twiddled my router so now all svn:// requests went to the new server.
>
> After doing that, I was able to check out and in with the new repository (this
> is as of a day or two ago).  Now, I get the following error:
>
> david@deepthink:/devel/agilerules/web/private/htdocs/worknotes> svn commit -m
> "web private: removed worknotes/htmlunit"
> Deleting       worknotes/htmlunit
> svn: Commit failed (details follow):
> svn: Can't convert string from native encoding to 'UTF-8':
> @?\217?d?\14?\184?\174
>
> - All files and filenames in version control are plain text or OpenOffice
> documents.  All plain text files are normal ASCII US characters.
>
> Now, I did some RTFM and some STFW, and noticed that (1) localization was
> added as part of the 1.2 series changes, and (2) the way you set the
> localization of the server is by exporting LANG to the correct setting in the
> script that starts up svnserve, and I've now done that, setting/exporting both
> LANG and LC_ALL to en_US.UTF-8 in /etc/init.d/svnserve.  This did not seem to
> help.
>
> We tried upgrading the client to the latest, and that didn't seem to help
> either.
>
> After changing the locale on svnserve, do I need to do the load operations
> again?
>
> Do I need to make new working copies?
>
> One bit of funkiness is that I had libapr0-2.0.53-9, to match
> apache2-2.0.53-9.2.  The subversion-server-1.2.1-1.1. said to install
> libapr0-2.0.54-7.1.i586.rpm, but when I did so, apache would segfault on all
> client requests, so I reverted to libapr0-2.0.53-9.  Both apps seemed to work
> after that, but I'm not sure if that's involved or not.  Maybe whether using
> that version of svnserve with that version of libapr is a separate
> question.
>
>
> Thanks.  We can't check in anything until this is resolved, so please get back
> to me with any tips as soon as possible.  Thanks.

We've tried a few more things, with no luck.

- I tried checking out a fresh working copy, but I got the same "Can't convert
string from native encoding to 'UTF-8'"

- I tried wiping out the repository and reloading it, now that svnserve is
running UTF-8, but reloading failed, with
"uninew:/data/subversion/agilerules # svnadmin load web <
/tmp/agilerules.web.dump
<<< Started new transaction, based on original revision 1
     * adding path : devel ... done.
     * adding path : live ... done.
svnadmin: Valid UTF-8 data
(hex:)
followed by invalid UTF-8 sequence
(hex: a0 d1 07 08)
"

Is there a way to convert the dump to UTF-8 (it's all US ASCII, AFAIK)?

Is there a way to load the dump as native, and get the svnserve client to work
with it?

Thanks again.  We're at a dead stop over this.

