Posted to dev@subversion.apache.org by Branko Čibej <br...@apache.org> on 2019/06/26 15:39:05 UTC

Unicode composable characters on macOS [was: Subversion 2.0]

On 26.06.2019 10:40, Marc Strapetz wrote:
> On 25.06.2019 23:35, Branko Čibej wrote:
>> On 25.06.2019 19:15, Thomas Singer wrote:
>>> What I don't like:
>>> - after more than a decade the umlaut problem of composed/decomposed
>>> UTF-8 has not been solved
>>
>> It has, actually, in Apple's APFS, where the fix belongs.
>
> That sounds interesting. Just to be sure, you are referring to this
> problem:
>
> https://issues.apache.org/jira/browse/SVN-2464
>
> ? It would be great to have some more information on which OSX
> versions and which file systems the problem is resolved for.


The original problem was that Apple's HFS+ filesystem normalized paths to
Unicode Normalisation Form D. In practice that meant that if you created
a file with a name that contained a composable character, then read that
name from the filesystem, you could get different results (i.e., the
name was "the same" as far as Unicode normalisation is concerned, but
the actual representation bytes were different).
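
To make the difference concrete, here is a minimal Python sketch (not
part of Subversion; it only uses the standard unicodedata module) that
shows the two representations of the file name used in the transcripts
below. The variable names are chosen for illustration only.

```python
import unicodedata

# "čibej" written with the precomposed character U+010D (composed form).
name = "\u010dibej"

composed = unicodedata.normalize("NFC", name)
decomposed = unicodedata.normalize("NFD", name)

# The two strings are different sequences of code points and bytes ...
print(composed == decomposed)
print(len(composed), len(decomposed))
print(composed.encode("utf-8"))
print(decomposed.encode("utf-8"))

# ... but they compare equal once both are put into the same form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

In the decomposed form the "č" becomes a plain "c" followed by the
combining caron U+030C, so a byte-wise comparison of the two names fails
even though they render identically.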

The new APFS filesystem (which is the default in the last two versions
of macOS, IIRC) doesn't do that any more.

This is on local disk, which is APFS:

brane@zulu:~/src/svn/test$ svnadmin create repo
brane@zulu:~/src/svn/test$ svn co file://$(pwd)/repo wc
Checked out revision 0.
brane@zulu:~/src/svn/test$ touch wc/čibej
brane@zulu:~/src/svn/test$ svn add wc/čibej 
A         wc/čibej
brane@zulu:~/src/svn/test$ svn st wc/
A       wc/čibej 

and this is on an HFS+ disk image:

brane@zulu:/Volumes/hfs$ svnadmin create repo
brane@zulu:/Volumes/hfs$ svn co file://$(pwd)/repo wc
Checked out revision 0.
brane@zulu:/Volumes/hfs$ touch wc/čibej
brane@zulu:/Volumes/hfs$ svn add wc/čibej 
A         wc/čibej
brane@zulu:/Volumes/hfs$ svn st wc/
?       wc/čibej
!       wc/čibej 

The second instance clearly shows that the filesystem changed the file
name: svn reports the name it recorded as missing ("!") and the name now
on disk as unversioned ("?").
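
As a rough illustration of how such a "?" / "!" pair could be detected,
the following Python sketch pairs up names that are equal after
normalisation but differ as stored. This is an assumption-laden example,
not how Subversion works internally; the function name
normalisation_mismatches and the sample lists are hypothetical.

```python
import unicodedata


def normalisation_mismatches(recorded, on_disk):
    """Return (recorded, on_disk) pairs that differ only in normalisation form."""
    # Index the on-disk names by their NFC form.
    by_nfc = {}
    for name in on_disk:
        by_nfc.setdefault(unicodedata.normalize("NFC", name), []).append(name)

    pairs = []
    for name in recorded:
        for candidate in by_nfc.get(unicodedata.normalize("NFC", name), []):
            # Same name after normalisation, different code points as stored.
            if candidate != name:
                pairs.append((name, candidate))
    return pairs


# Hypothetical data: the name as recorded (composed) and as a
# normalising filesystem might return it (decomposed).
recorded = ["\u010dibej", "README"]
on_disk = ["c\u030cibej", "README"]

pairs = normalisation_mismatches(recorded, on_disk)
print(pairs)
```

In a real check the on_disk list could come from os.listdir() on the
working copy directory; the recorded names would have to come from
whatever metadata the tool keeps.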

-- Brane


Re: Unicode composable characters on macOS [was: Subversion 2.0]

Posted by Thomas Singer <th...@syntevo.com>.
Hi Branko,

Thanks for the detailed explanation. Would you mind adding the
description to the linked issue and marking it as
resolved/works-for-me/no-bug, so this information is not lost?

-- 
Best regards,
Thomas Singer
=============
syntevo GmbH

