Posted to users@subversion.apache.org by Samuel Langlois <sl...@ilog.fr> on 2008/03/21 17:27:26 UTC

Unable to restore a dump having Chinese characters

Hello,

I've had a tough week...
Because of a hardware failure, we had to restore our nightly subversion dump.
We are using 1.3.2 on Linux with fsfs (Apache/2.2.2 (Unix) mod_ssl/2.2.2 OpenSSL/0.9.7a DAV/2 SVN/1.3.2)

After 2 hours of feverish restore, the "svnadmin load" failed with the message:
File not found: transaction '84072-1', path 'JRules/branches/v65updates/localization/ja/tutorials/shared-jp/ã­ã¼ã³æ¤è¨¼ã«ã¼ã«/ã¯ã¨ãªã¼'
The strange characters at the end are supposed to be Japanese characters that the terminal cannot display.

We finally managed to restore the dump by using svndumpfilter to exclude this path and several others. All of the paths which were causing problems had Chinese or Japanese characters.
But most paths with such characters worked correctly: only a few were crashing the restore.

I was able to extract one part of the dump to reproduce the problem:
ftp://ftp.ilog.fr/private/JRules/chinese.svndump.gz
To reproduce, do the following:
  svnadmin create chineserep
  svn mkdir file://`pwd`/chineserep/trunk
  svnadmin load chineserep < chinese.svndump

I tried re-importing it with subversion 1.4.6, but I had the same error.
I suppose the problem happens during the dump, not the restore.

Is this a known problem with multibyte encodings?
Has it been fixed in a newer version, or should I file an issue?

Thanks a lot for your help,
--
Samuel Langlois
ILOG S.A.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org


Re: Unable to restore a dump having Chinese characters

Posted by Ulrich Eckhardt <ec...@satorlaser.com>.
On Friday 21 March 2008, Ryan Schmidt wrote:
> On Mar 21, 2008, at 12:27, Samuel Langlois wrote:
[...]
> > After 2 hours of feverish restore, the "svnadmin load" failed with
> > the message:
> > File not found: transaction '84072-1', path 'JRules/branches/
> > v65updates/localization/ja/tutorials/shared-jp/ã ã¼ã³æ
> > ¤è¨¼ã«ã¼ã«/ã¯ã¨ãªã¼'
> > The strange characters at the end are supposed to be Japanese chars
> > that the term cannot display.

IMHO that just means that the terminal needs a different locale plus possibly 
fonts and Unicode support in general.

> For one thing, is your LANG environment variable set correctly? When
> I set it to a correct value on my system, the error message instead
> says:
>
> svnadmin: File not found: transaction '1-1', path 'trunk/shared-zh/
> squery-贷款审批-规则/规则/验证/贷款/检查总额.brl'
>
>
> Now my question to you is whether that file exists in the dump or not.

This statement puzzles me, and I wonder whether I'm understanding how SVN 
works correctly. In particular, I wonder whether the locale settings (LANG) 
should matter at all to the outcome. The reason why I don't think so is that 
this is a transformation from repository to some repository-independent 
representation and then to another repository. I don't see where the locale 
is involved, except perhaps for outputting error messages, but not in the 
resulting repository or dumpfile data.
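
The distinction drawn above, between the stored bytes and how a terminal renders them, can be illustrated with a minimal Python sketch. It assumes the path is stored as UTF-8 and that the terminal interprets those bytes as Latin-1; both are assumptions made for illustration, and the function name is hypothetical. The path is the one quoted from Ryan's message.

```python
# Path as quoted in the thread, written here as Unicode text.
PATH = "trunk/shared-zh/squery-贷款审批-规则/规则/验证/贷款/检查总额.brl"


def misread_as_latin1(text):
    """Return how the UTF-8 bytes of `text` look when decoded as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


stored = PATH.encode("utf-8")         # the bytes (assumed UTF-8)
garbled = misread_as_latin1(PATH)     # what a Latin-1 terminal would render
recovered = garbled.encode("latin-1")  # undo the wrong decoding

# The garbled rendering still carries exactly the same bytes.
print(recovered == stored)
```

If this model is right, changing LANG changes only `garbled`, never `stored`, which would be consistent with the load failing in the same way under every locale setting.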

Uli

-- 
ML: http://subversion.tigris.org/mailing-list-guidelines.html
FAQ: http://subversion.tigris.org/faq.html
Docs: http://svnbook.red-bean.com/

Sator Laser GmbH
Geschäftsführer: Michael Wöhrmann, Amtsgericht Hamburg HR B62 932





RE: Unable to restore a dump having Chinese characters

Posted by Samuel Langlois <sl...@ilog.fr>.
Hello,

> For one thing, is your LANG environment variable set correctly? When
> I set it to a correct value on my system, the error message instead
> says:
> 
> svnadmin: File not found: transaction '1-1', path 'trunk/shared-zh/
> squery-贷款审批-规则/规则/验证/贷款/检查总额.brl'
> 
Thanks for your answers.

As Ulrich Eckhardt pointed out, I think (and hope!) the LANG env variable is not used there, except for printing on the terminal. I tried with different settings and got the same result.
Besides, it does not fit well with the fact that most files with Chinese names are imported properly, and only a few cause problems.

I may have another explanation: if you look at the output, it seems that the creation of a folder is missing:
     * adding path : trunk/shared-zh/squery-____-__/__/__/.rulepackage ... done.
     * adding path : trunk/shared-zh/squery-____-__/__/__/acceptLoan.fct ... done.
     * adding path : trunk/shared-zh/squery-____-__/__/__ ... done.
     * adding path : trunk/shared-zh/squery-____-__/__/__/__/____.brl ...
svnadmin: File not found: transaction '1-1', path 'trunk/shared-zh/squery-____-__/__/__/__/____.brl'

See? It created "squery-____-__/__/__" (two slashes) but not "squery-____-__/__/__/__" (three slashes) which makes the creation of "squery-____-__/__/__/__/____.brl" (four slashes) fail.

Could it be that some Chinese characters "look like" slashes, making the svnadmin dump fail to create a proper archive?
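
One way to test the "looks like a slash" idea, assuming the paths are encoded as UTF-8: in UTF-8 every byte of a multibyte character has its high bit set, so none of them can equal the slash byte 0x2F. The Python sketch below checks this for the path quoted earlier in the thread; the function names are hypothetical, and the check says nothing about other encodings or about what the dump actually contains.

```python
PATH = "trunk/shared-zh/squery-贷款审批-规则/规则/验证/贷款/检查总额.brl"


def chars_containing_slash_byte(path):
    """Return non-ASCII characters whose UTF-8 encoding contains byte 0x2F."""
    return [ch for ch in path if ord(ch) > 0x7F and 0x2F in ch.encode("utf-8")]


def split_on_slash_bytes(path):
    """Split the UTF-8 bytes on b'/' and decode each component again."""
    return [part.decode("utf-8") for part in path.encode("utf-8").split(b"/")]


# No multibyte character hides a slash byte.
print(len(chars_containing_slash_byte(PATH)))        # 0

# Splitting the raw bytes gives the same components as splitting the text.
print(split_on_slash_bytes(PATH) == PATH.split("/"))  # True
```

Under that assumption, a byte-level split of this path cannot produce an extra or a missing component, which makes a missing directory record in the dump the more likely thing to look for.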

Thanks,
Samuel



Re: Unable to restore a dump having Chinese characters

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 21, 2008, at 12:27, Samuel Langlois wrote:

> Because of a hardware failure, we had to restore our nightly  
> subversion dump.
> We are using 1.3.2 on Linux with fsfs (Apache/2.2.2 (Unix) mod_ssl/ 
> 2.2.2 OpenSSL/0.9.7a DAV/2 SVN/1.3.2)
>
> After 2 hours of feverish restore, the "svnadmin load" failed with  
> the message:
> File not found: transaction '84072-1', path 'JRules/branches/ 
> v65updates/localization/ja/tutorials/shared-jp/ã ã¼ã³æ 
> ¤è¨¼ã«ã¼ã«/ã¯ã¨ãªã¼'
> The strange characters at the end are supposed to be Japanese chars  
> that the term cannot display.
>
> We finally managed to restore the dump by using svndumpfilter to  
> exclude this path and several others. All of the paths which were  
> causing problems had Chinese or Japanese characters.
> But most paths with such characters worked correctly: only a few  
> were crashing the restore.
>
> I was able to extract one part of the dump to reproduce the problem:
> ftp://ftp.ilog.fr/private/JRules/chinese.svndump.gz
> To reproduce, do the following:
>   svnadmin create chineserep
>   svn mkdir file://`pwd`/chineserep/trunk
>   svnadmin load chineserep < chinese.svndump
>
> I tried re-importing it with subversion 1.4.6, but I had the same  
> error.
> I suppose the problem happens during the dump, not the restore.
>
> Is this a known problem with multibyte encodings?
> Would it be corrected in a newer version, or do you want me to  
> register an issue?

For one thing, is your LANG environment variable set correctly? When  
I set it to a correct value on my system, the error message instead  
says:

svnadmin: File not found: transaction '1-1', path 'trunk/shared-zh/ 
squery-贷款审批-规则/规则/验证/贷款/检查总额.brl'


Now my question to you is whether that file exists in the dump or not.

