Posted to users@subversion.apache.org by Alon Bar-lev <Al...@xor-t.com> on 2004/10/23 12:11:09 UTC

Subversion roadmap - Unicode files

Hello,
 
Will Subversion handle Unicode text files as text files in the future?
Currently, Unicode text files are treated as binary files...
 
Best Regards,
Alon Bar-Lev.
 

Re: Subversion roadmap - Unicode files

Posted by Jeroen Leenarts <le...@tiscali.nl>.
kfogel@collab.net wrote:

>Jeroen Leenarts <le...@tiscali.nl> writes:
>
>>>Pardon my ignorance, but does "Unicode" refer to a specific encoding?
>>>I.e., is it the same as UTF-16, or another of the UTF-*'s?
>>
>>Have a go at this http://www.joelonsoftware.com/articles/Unicode.html
>>and see if it clears up some things about unicode and UTF-8 or UTF-16.
>
>Very entertaining, thanks.  But it pretty much confirmed the line of
>thinking implied by my question above... namely:
>
>If someone says Subversion won't handle their "Unicode" files as text,
>we must ask "What specific *encoding* of Unicode?"
>
>In this case, it seems clear it was UTF-N (where N > 8).  But when
>responding to a bug report, one always wants to make absolutely sure
>that what the reporter thinks is going on is the same as what
>Subversion thinks is going on :-).
>
>Best,
>-Karl
>
The point you're making here is relevant. Resolving a bug would be nasty 
if the encoding assumptions were not traceable. I think that once SVN 
starts fiddling with encodings and the like, it will be important that 
SVN logs the assumptions it has made about the contents of the file. So 
far I have not seen this anywhere, but then again I'm not all that well 
informed about Unicode and SVN.

Nice things can happen when you're working on Unicode files in UTF-8 
encoding that only use the ASCII character range. Such a file might look 
like plain and simple ASCII, but it really isn't. I once made a serious 
FU because of my lack of understanding of encodings and character sets.

Jeroen
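
As a rough illustration of the point about the ASCII range (a sketch in
Python, which the thread itself does not use; the sample strings are made
up): text limited to ASCII characters produces the same bytes in ASCII and
UTF-8, so the difference only surfaces once a non-ASCII character shows up
or the file is really UTF-16.

```python
# Illustrative only: compare the byte layout of the same text
# under different encodings. The sample strings are arbitrary.
text = "hello"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # two bytes per character here, no BOM

# ASCII-only text is byte-for-byte identical in ASCII and UTF-8.
same_as_ascii = ascii_bytes == utf8_bytes

# One non-ASCII character is enough to break that equivalence.
accented = "h\u00e9llo"  # "hello" with an e-acute
accented_utf8 = accented.encode("utf-8")  # the e-acute takes two bytes

try:
    accented.encode("ascii")
    encodable_as_ascii = True
except UnicodeEncodeError:
    encodable_as_ascii = False
```

A tool that silently assumes ASCII would therefore work on such a file
right up to the first accented character.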


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Subversion roadmap - Unicode files

Posted by kf...@collab.net.
Jeroen Leenarts <le...@tiscali.nl> writes:
> >Pardon my ignorance, but does "Unicode" refer to a specific encoding?
> >I.e., is it the same as UTF-16, or another of the UTF-*'s?
> 
> Have a go at this http://www.joelonsoftware.com/articles/Unicode.html
> and see if it clears up some things about unicode and UTF-8 or UTF-16.

Very entertaining, thanks.  But it pretty much confirmed the line of
thinking implied by my question above... namely:

If someone says Subversion won't handle their "Unicode" files as text,
we must ask "What specific *encoding* of Unicode?"  

In this case, it seems clear it was UTF-N (where N > 8).  But when
responding to a bug report, one always wants to make absolutely sure
that what the reporter thinks is going on is the same as what
Subversion thinks is going on :-).

Best,
-Karl
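
One way to check which encoding a so-called "Unicode" file actually uses
is to look for a byte order mark (BOM) at its start. The Python sketch
below is only an illustration and not something Subversion is known to
do; the function name guess_encoding_from_bom is hypothetical, and a file
without a BOM simply yields None, so this is a hint rather than proof.

```python
import codecs

# Longer BOMs are listed first, because the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding_from_bom(data):
    """Return an encoding name if `data` starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

Passing in the first few bytes of the file (read in binary mode) is
enough, since no BOM is longer than four bytes.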

Re: Subversion roadmap - Unicode files

Posted by Jeroen Leenarts <le...@tiscali.nl>.
>Pardon my ignorance, but does "Unicode" refer to a specific encoding?
>I.e., is it the same as UTF-16, or another of the UTF-*'s?
>
>You can set the 'svn:mime-type' property to something like
>"text/plain", by the way.  Do 'svn help propset' for details.
>
>-Karl
>

Have a go at this http://www.joelonsoftware.com/articles/Unicode.html 
and see if it clears up some things about unicode and UTF-8 or UTF-16.


Re: Subversion roadmap - Unicode files

Posted by kf...@collab.net.
"Alon Bar-lev" <Al...@xor-t.com> writes:
> Will subversion handle Unicode text files as text files in the future?
> Currently Unicode text files are treated as binary files...

Pardon my ignorance, but does "Unicode" refer to a specific encoding?
I.e., is it the same as UTF-16, or another of the UTF-*'s?

You can set the 'svn:mime-type' property to something like
"text/plain", by the way.  Do 'svn help propset' for details.

-Karl
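
For reference, the property mentioned above is set with a command of the
form 'svn propset svn:mime-type text/plain PATH'. A minimal Python sketch
that wraps this call follows; the helper names and the example file name
are made up for illustration, and it assumes the svn command-line client
is installed and that PATH is a versioned file.

```python
import subprocess


def propset_mime_type_command(path, mime_type="text/plain"):
    """Build the argument list for: svn propset svn:mime-type MIME_TYPE PATH"""
    return ["svn", "propset", "svn:mime-type", mime_type, path]


def set_mime_type(path, mime_type="text/plain"):
    """Run the command; raises CalledProcessError if svn reports failure."""
    # Assumption: `svn` is on PATH and `path` is inside a working copy.
    subprocess.run(propset_mime_type_command(path, mime_type), check=True)
```

Calling set_mime_type("notes.txt") would mark that hypothetical file as
text; the change still has to be committed like any other property edit.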

