Posted to users@subversion.apache.org by Ricardo Grünewald <rg...@sts-systemtechnik.com> on 2005/11/14 14:09:54 UTC

Subversion and UTF-16 Files



Hello,

I am a C# developer and have recently started to use Subversion.
All my source files are saved by Visual Studio 2003 in UTF-16 format.
Subversion treats these files as binary, not as text, which prevents file
comparison.
How can I get Subversion to treat my files as "text"?

Thanks for any help!



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Subversion and UTF-16 Files

Posted by Dimitri Papadopoulos-Orfanos <pa...@shfj.cea.fr>.
> How can I get Subversion to treat my files as "text"

In the meantime, try this:
http://svnbook.red-bean.com/en/1.1/svn-book.html#svn-ap-a-sect-8
http://svnbook.red-bean.com/en/1.1/ch07s02.html#svn-ch-7-sect-2.3.2

Dimitri Papadopoulos


Re: Subversion and UTF-16 Files

Posted by kf...@collab.net.
Kalin KOZHUHAROV <ka...@thinrope.net> writes:
> Ryan Schmidt wrote:
> > I don't think Subversion can currently see UTF-16 files as text files.
> > 
> > Someone filed an issue about this in January:
> > 
> > http://subversion.tigris.org/issues/show_bug.cgi?id=2194
> > 
> > The issue was marked invalid because no discussion had taken place.  So
> > a discussion was started:
> > 
> > http://svn.haxx.se/users/archive-2005-01/0233.shtml
> > 
> > The result of the discussion was that the issue should be reopened  and
> > marked as a feature request to support UTF-16 and UTF-32, but  nobody
> > appears to have done so.
>
> The issue is not reopened, nor can I do that... Who can?

Done.  I've added a link to this thread, too.

Best,
-Karl

-- 
www.collab.net  <>  CollabNet  |  Distributed Development On Demand


Re: Subversion and UTF-16 Files

Posted by Kalin KOZHUHAROV <ka...@thinrope.net>.
Ryan Schmidt wrote:
> On Nov 14, 2005, at 15:09, Ricardo Grünewald wrote:
> 
>> I am a C# developer and have recently started to use Subversion.
>> All my source files are saved by Visual Studio 2003 in UTF-16 format.
>> Subversion treats these files as binary, not as text, which prevents
>> file comparison.
>> How can I get Subversion to treat my files as "text"?
> 
> 
> 
> I don't think Subversion can currently see UTF-16 files as text files.
> 
> Someone filed an issue about this in January:
> 
> http://subversion.tigris.org/issues/show_bug.cgi?id=2194
> 
> The issue was marked invalid because no discussion had taken place.  So
> a discussion was started:
> 
> http://svn.haxx.se/users/archive-2005-01/0233.shtml
> 
> The result of the discussion was that the issue should be reopened  and
> marked as a feature request to support UTF-16 and UTF-32, but  nobody
> appears to have done so.
The issue is not reopened, nor can I do that... Who can?

Although we do not use UTF-{16,32} currently, all i18n issues are a big PITA and, in the present
state of the Net and IT as a whole, are just lame. Working with 5 languages, a few encodings each,
with a handful of tools (some proprietary) on at least two distinctly different OSes (let alone
variants) has been a problem for me for the last 10 years or so. Using UTF-8 lately seems to
work around most problems.

The simplicity and wide adoption of English and ASCII (or ISO-8859-1) encodings is the result of
laziness on the developer side (yes, I do this too). Currently, most (say 85%) OSS is
developed in English and ASCII, though the share of i18n-ed OSS and of OSS based on UTF-8
has increased considerably in the last few years.

Since UTF-8 can represent anything, is well defined, and is widely used across the Net, it is the
best interoperable internal encoding for textual data (and is somewhat space efficient). See this very
informative table:
http://www.unicode.org/faq/utf_bom.html#37

Some more insight from:
	http://en.wikipedia.org/wiki/Unicode
	http://en.wikipedia.org/wiki/Comparison_of_unicode_encodings
and the links from there.

That all being said, I feel that if Subversion (or any other software) has proper support for UTF-8,
or even better, an internal representation of textual data in UTF-8, then interoperability can be ensured
by software such as iconv. UTF-8 can represent any language, so we don't need anything else.


Fixing the bug (personal classification) that UTF-8 is handled as binary in some (most?) situations
should be a good start.
Support for tons of other encodings can then easily be achieved with external libraries (such as iconv).
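
As a rough illustration of the iconv-style conversion mentioned above, here is a minimal Python
sketch that re-encodes UTF-16 data as UTF-8. The function name utf16_to_utf8 and the little-endian
default for files without a BOM are assumptions made for this example; nothing here is part of
Subversion or iconv.

```python
import codecs


def utf16_to_utf8(data: bytes, default: str = "utf-16-le") -> bytes:
    """Re-encode UTF-16 bytes as UTF-8 (hypothetical helper, not an svn API).

    Assumption: input without a BOM is little-endian, which is what
    Windows tools usually write. A UTF-32 BOM is not handled here.
    """
    has_bom = data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE)
    if has_bom:
        # The generic "utf-16" codec reads the BOM, picks the byte order
        # from it, and drops it from the decoded text.
        text = data.decode("utf-16")
    else:
        text = data.decode(default)
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = "class Grünewald {}\r\n"
    utf16 = codecs.BOM_UTF16_LE + sample.encode("utf-16-le")
    print(utf16_to_utf8(utf16))
```

Converting files this way before committing is only a workaround: the working copy then holds
UTF-8 rather than the UTF-16 that the editor originally wrote.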

Kalin.
-- 
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|


Re: Subversion and UTF-16 Files

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Nov 14, 2005, at 15:09, Ricardo Grünewald wrote:

> I am a C# developer and have recently started to use Subversion.
> All my source files are saved by Visual Studio 2003 in UTF-16 format.
> Subversion treats these files as binary, not as text, which prevents
> file comparison.
> How can I get Subversion to treat my files as "text"?


I don't think Subversion can currently see UTF-16 files as text files.
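
To see why UTF-16 source tends to be classified as binary, the following Python sketch shows the
kind of byte-level heuristic a tool might use. It is an assumption for illustration only, not
Subversion's actual code: the function name looks_binary, the NUL-byte check, and the 15% threshold
are all hypothetical choices here.

```python
def looks_binary(block: bytes, threshold: float = 0.15) -> bool:
    """Guess whether a leading block of a file is binary.

    Assumed heuristic (illustrative only): the data is binary if it contains
    a NUL byte, or if more than `threshold` of its bytes fall outside the
    usual ASCII text range (control characters 0x07-0x0D and 0x20-0x7E).
    """
    if not block:
        return False  # an empty file is treated as text
    if b"\x00" in block:
        return True
    odd = sum(1 for b in block if not (0x07 <= b <= 0x0D or 0x20 <= b <= 0x7E))
    return odd / len(block) > threshold


if __name__ == "__main__":
    source = "using System;\r\n" * 10
    # ASCII-only text encoded as UTF-8 contains only "text-like" bytes.
    print(looks_binary(source.encode("utf-8")))
    # In UTF-16 every ASCII character is paired with a NUL byte.
    print(looks_binary(source.encode("utf-16-le")))
```

Under a heuristic like this, plain ASCII source stored as UTF-16 is flagged because every second
byte is NUL, regardless of how readable the file is in an editor.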

Someone filed an issue about this in January:

http://subversion.tigris.org/issues/show_bug.cgi?id=2194

The issue was marked invalid because no discussion had taken place.  
So a discussion was started:

http://svn.haxx.se/users/archive-2005-01/0233.shtml

The result of the discussion was that the issue should be reopened  
and marked as a feature request to support UTF-16 and UTF-32, but  
nobody appears to have done so.


