Posted to users@subversion.apache.org by Trent Apted <ta...@it.usyd.edu.au> on 2005/03/29 06:18:55 UTC

Towards standardising mime-type support

Hi there, I've been a developer and CVS user for a long time and 
recently made the switch to Subversion
[svn, version 1.1.3 (r12730)]. Some things bug me (being unable to move 
things between repositories comes to mind), but overall I'm happy and it 
is now unlikely that I'll go back to CVS (it was a close thing though).

I'm currently in the process of reading through the docs to get the most 
out of Subversion. I came across the svn:mime-type property and my 
immediate thought was, "Why don't I just use /usr/bin/file to set the 
mime-type?" So I tried it out, but there are some issues:

When I run

$ file -bi something.c

(or .cpp, .h, .cc, etc.)

/usr/bin/file reports that the mime-type for C/C++ files is

text/x-c; charset=us-ascii

However, if I feed this to Subversion, it treats the file as binary. So, 
that's fine, I'll drop the charset stuff, and things are mostly back to 
normal, but the added information still appears to be meaningless to 
Subversion. Also, trac did not understand the text/x-c mime-type and 
refused to run the file through enscript, so I ended up deleting the 
mime-type altogether. This is bad, because I can foresee a future 
application that can take full advantage of such mime-type strings.

So perhaps, to future-proof the mime-types, the parsing should be able 
to treat any string that fits the standard in a sane way .. even if it 
just ignores some of it (like charsets).
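
As a rough illustration of that suggestion, the Python sketch below splits a
value such as the one reported by file into its media type and parameters,
then classifies on the type alone. The function names and the "anything
under text/ is text" rule are assumptions made for this example, not
Subversion's actual code.

```python
def parse_mime_type(value):
    """Split 'type/subtype; key=value' into (media_type, params)."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return media_type, params


def looks_like_text(value):
    """Assumed rule: decide on the media type only, ignoring parameters."""
    media_type, _params = parse_mime_type(value)
    return media_type.startswith("text/")
```

With this, a trailing charset parameter no longer changes the outcome:
looks_like_text("text/x-c; charset=us-ascii") and looks_like_text("text/x-c")
give the same answer.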

Further, perhaps there should be a feature with support for 
`/usr/bin/file -bi` --- "auto-auto-props" might be nice. The current 
method doesn't really add much over the cvswrappers technique which 
always seemed to be a big duplication of effort to me.
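
A hedged sketch of what such an "auto-auto-props" helper might look like in
Python. The names detect_mime_type and propset_command are hypothetical; the
only external commands assumed are `file -bi`, as used above, and
`svn propset`. The charset parameter is dropped because, as described above,
Subversion otherwise treats the file as binary.

```python
import subprocess


def detect_mime_type(path):
    """Ask file(1) for a MIME string; requires `file` on PATH."""
    result = subprocess.run(
        ["file", "-bi", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def propset_command(path, file_output):
    """Build (but do not run) the svn command that records the type."""
    # Keep only the media type; drop "; charset=..." and similar parameters.
    media_type = file_output.split(";", 1)[0].strip()
    return ["svn", "propset", "svn:mime-type", media_type, path]
```

A caller would pass detect_mime_type(path) as file_output and hand the
resulting list to subprocess.run once it is happy with the detected type.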

The trac problem is less urgent, I might bug those guys later..

Cheers,
    Trent.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Towards standardising mime-type support

Posted by Erik Huelsmann <e....@gmx.net>.
> >>>> However, if I feed this to Subversion, it treats the file as 
> >>>> binary. So, that's fine, I'll drop the charset stuff,
> >>>
> >>> Ah, yes, We should know about charset attributes.
> >>>
> >>>> and things are mostly back to normal, but the added information 
> >>>> still appears to be meaningless to Subversion.
> >>>
> >>> Define "meaningless". You've told SVN that this is a text file, and 
> >>> that's it. SVN doesn't interpret the mime type any further than that 
> >>> (yet).
> >>
> >> I guess I'm saying that text/x-c and text/x-cc would imply that a 
> >> file is source code, and hence platform independent, thus should 
> >> always use the 'native' eol-style and should never be executable. 
> >> While you should still specify the style and executability for 
> >> something that is text/plain. However, this might not suit everyone...
> >
> > Media types do not define the encoding, only the type of the contents. 
> > Therefore we a) can't extrapolate eol-style from the mime type, and b) 
> > would be totally wrong to do so because there are valid reasons _not_ 
> > to use native eol-style even in mixed-platform environments.
> 
> a) I'm not convinced that we can't determine an eol-style because we 
> only know the type of contents, and b) I can't think of any reason why I 
> would want my source code in something other than the native format for 
> whatever platform I'm editing it on.
> 
> Perhaps I'm merely used to CVS taking care of all this for me.

You are aware of the auto-props feature, which does take care of this, right?
It doesn't work with mime-types, though, but with filename patterns.

Maybe that can help?

bye,


Erik.



Re: Towards standardising mime-type support

Posted by Travis P <sv...@castle.fastmail.fm>.
On Apr 1, 2005, at 6:43 AM, Trent Apted wrote:

>> Media types do not define the encoding, only the type of the 
>> contents. Therefore we a) can't extrapolate eol-style from the mime 
>> type, and b) would be totally wrong to do so because there are valid 
>> reasons _not_ to use native eol-style even in mixed-platform 
>> environments.
>
> a) I'm not convinced that we can't determine an eol-style because we 
> only know the type of contents, and

> b) I can't think of any reason why I would want my source code in 
> something other than the native format for whatever platform I'm 
> editing it on.

Your imagination hasn't been watching the dev@ list. :-)

Here's a reason why svn:eol-style=native should not be tied to anything:
   http://svn.haxx.se/dev/archive-2005-03/1072.shtml

I give users a huge (measured 10x) performance boost on WC operations 
where the WC is on a network share (AFS) by not allowing svn:eol-style 
or svn:keywords properties on source files in the repositories I admin. 
(A pre-commit hook enforces LF-only line endings and rejects any 
svn:eol-style or svn:keywords properties.)
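
The core check of such a hook might look roughly like the Python sketch
below. This is not the actual hook; check_file is a hypothetical name, and
how the hook obtains each file's content and property names (for example via
svnlook) is left out as an assumption.

```python
FORBIDDEN_PROPS = ("svn:eol-style", "svn:keywords")


def check_file(content, props):
    """Return a list of policy violations for one source file.

    content is the file body as bytes; props is a mapping (or set) of the
    property names set on the file.
    """
    problems = []
    if b"\r" in content:
        problems.append("contains CR characters; only LF line endings are allowed")
    for name in FORBIDDEN_PROPS:
        if name in props:
            problems.append("property %s is not allowed" % name)
    return problems
```

A hook built on this would reject the commit whenever any file returns a
non-empty list. Note that the CR test only makes sense for text files.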

We generally aren't cross-platform (all Unixy: AIX, Linux, MacOS X), 
but if anyone does editing on Windows, they'll have to find an editor 
or other way to use only LF line endings.  This trade-off is a huge win 
for us because the AFS WC operations on Unix are very much "the common 
case" for us.

> Actually, looking further it appears that the heuristic for c/c++ 
> files is that if a file starts with "/*" it is C, and if with "//" it 
> is C++.

That's a pretty rough heuristic.  So, if all C and C++ start with the 
same /* copyright notice, the detection system assumes all the C++ 
files are C, and ignores the filename extension?

> Perhaps my annoyance with having to specify a native eol-style each 
> time I add a new source file

As someone else mentioned, it sounds like you haven't discovered the 
[auto-props] in your config file which can be set to automatically add 
svn:eol-style=native to files in very many cases and will likely 
eliminate your trouble.
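
For readers who have not seen it, the Python sketch below embeds a sample
config fragment and emulates how filename patterns map to properties. The
section and option names follow Subversion's client config file; the
particular patterns are only examples, and load_auto_props and props_for are
hypothetical helpers that approximate the matching rather than reproduce
Subversion's implementation (case-sensitive matching is assumed).

```python
import configparser
import fnmatch

SAMPLE_CONFIG = """
[miscellany]
enable-auto-props = yes

[auto-props]
*.c = svn:eol-style=native
*.h = svn:eol-style=native
*.cpp = svn:eol-style=native
*.sh = svn:eol-style=native;svn:executable
"""


def load_auto_props(text):
    """Read the [auto-props] section into a {pattern: spec} dict."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep patterns exactly as written
    parser.read_string(text)
    return dict(parser["auto-props"])


def props_for(filename, auto_props):
    """Collect the properties of every pattern that matches filename."""
    props = {}
    for pattern, spec in auto_props.items():
        if fnmatch.fnmatchcase(filename, pattern):
            for item in spec.split(";"):
                name, _, value = item.strip().partition("=")
                if name:
                    props[name] = value
    return props
```

With the sample above, props_for("main.c", load_auto_props(SAMPLE_CONFIG))
yields svn:eol-style=native, which is the property being set by hand on
every new source file.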

Cheers,
Travis



Re: Towards standardising mime-type support

Posted by Matthias Julius <jn...@julius-net.net>.
Branko Čibej <br...@xbc.nu> writes:

> As for using IANA's registry as the reference: there are many
> non-standard MIME types being used, some with fairly common meanings
> (e.g., application/x-gzip-compressed), and some that are more obscure
> (e.g., image/x-bmp). So, perhaps limiting ourselves to the registered
> types is a bit restrictive, but it's a safe baseline.

You could use the IANA registry and allow the user to specify
additional MIME types in the config file.  The default config file
could include the most common non-standard MIME types already.
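
A minimal Python sketch of that idea: a baseline set of registered types plus
whatever the user adds. REGISTERED_SAMPLE is a tiny hand-picked sample, not
the IANA registry, and make_validator is a hypothetical name; where the extra
types would come from in the config file is an assumption.

```python
# A few registered media types, standing in for the full IANA list.
REGISTERED_SAMPLE = frozenset(
    {"text/plain", "text/html", "image/png", "image/gif", "application/pdf"}
)


def make_validator(extra_types=()):
    """Return a predicate accepting baseline types plus user extras."""
    allowed = REGISTERED_SAMPLE | {t.strip().lower() for t in extra_types}

    def is_allowed(mime_type):
        # Compare on the media type only, ignoring any parameters.
        return mime_type.split(";", 1)[0].strip().lower() in allowed

    return is_allowed
```

A site that wants image/x-bmp would pass it in extra_types; everyone else
gets the stricter baseline by default.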

Matthias



Re: Towards standardising mime-type support

Posted by Branko Čibej <br...@xbc.nu>.
Trent Apted wrote:

> a) I'm not convinced that we can't determine an eol-style because we 
> only know the type of contents, and b) I can't think of any reason why 
> I would want my source code in something other than the native format 
> for whatever platform I'm editing it on.

I'm working on a project this very minute where the customer requires 
all files to have CRLF line endings, even though the project is 
cross-platform.

>>
>> *ROTFL*
>>
>> Oh and surely it would not be platform specific at all, aye. It would 
>> work on *every* Linux box in the world (well, except some older ones).
>
> Don't be silly. This is a plaintext file. Last I checked text files 
> were readable on my non-Linux computers -- Windows isn't that bad.

Sorry, the way I understood it you implied that /usr/bin/file was not 
platform-specific.

> Sarcasm aside, the point is the way in which the mime type is 
> determined, which I had hoped would be evident in my choice of 
> quotation. Reading the first N bytes of a file and matching them to a 
> known sequence (which could be in a file distributed with Subversion 
> or statically linked via source automatically generated from the above 
> file) is in no way platform dependent. Obviously this is not a 
> complete solution, but it gets most of the way there.

When I said platform-specific implementation I meant just that. On 
Windows, there are native APIs that will a) guess or b) look in the 
registry to find a file's MIME type. I've also heard of a system (can't 
recall the name, but it's not Mac OS) where the file type is recorded as 
a property of the file in the filesystem, and so of course we'd want to 
use that information.

> Perhaps my annoyance with having to specify a native eol-style each 
> time I add a new source file is testament to the fact that I am 
> knowledgeable about what it means for something to be platform 
> dependent, having been a developer of cross-platform software projects 
> for some time now.

Then I'm sure you know the difference between "portable" and 
"platform-specific". Subversion aims to be portable by using 
platform-specific tools. That's certainly one reason why we're using APR 
for our portability layer.

>>>
>>> I can write a patch, if you like.. PhD students tend to find time 
>>> for all kinds of non-thesis pursuits ;-)
>>
>> Any change in this direction must be generic in the sense that it 
>> allows platform-specific implementations, and it must produce portable 
>> results. I will veto any patch that produces MIME types that are not 
>> in the IANA registry.
>>
> Actually, looking further it appears that the heuristic for c/c++ 
> files is that if a file starts with "/*" it is C, and if with "//" it 
> is C++. `0xbabe` is a Java class file, etc.
> Maybe I'll just work on that patch, despite your discouragement, and 
> see if I can make you happy.

That wasn't meant to be discouraging. On the contrary, I think such a 
patch would be a good thing. I'm simply telling you how the patch must 
behave in order to be acceptable (to me). And I'm not laying down 
arbitrary rules; I'm just interpreting the rules we already have in the 
context of this particular feature.

When I say the solution must allow platform-specific implementations, of 
course I don't mean that *your* patch must contain implementations for 
all platforms. But the framework should be there. Others can fill in the 
blanks on other systems.
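
One way to picture such a framework, as a Python sketch only: detectors are
registered per platform, and a portable fallback is used when none applies or
none gives an answer. All names here (DETECTORS, register, detect) are
hypothetical, and the extension-based mimetypes module is merely a stand-in
for whatever portable fallback the real code would choose.

```python
import mimetypes
import sys

# Maps a sys.platform prefix to a detector: path -> MIME type or None.
DETECTORS = {}


def register(platform_prefix):
    """Decorator that installs a detector for one platform."""
    def decorator(func):
        DETECTORS[platform_prefix] = func
        return func
    return decorator


def portable_fallback(path):
    """Guess from the filename extension using the standard library."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed


def detect(path, platform=None):
    """Try the platform-specific detector first, then the fallback."""
    platform = sys.platform if platform is None else platform
    for prefix, detector in DETECTORS.items():
        if platform.startswith(prefix):
            result = detector(path)
            if result is not None:
                return result
    return portable_fallback(path)
```

A patch would then ship one or two detectors, and implementations for other
systems could be added later with @register without touching detect.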

As for using IANA's registry as the reference: there are many 
non-standard MIME types being used, some with fairly common meanings 
(e.g., application/x-gzip-compressed), and some that are more obscure 
(e.g., image/x-bmp). So, perhaps limiting ourselves to the registered 
types is a bit restrictive, but it's a safe baseline.

-- Brane



Re: Towards standardising mime-type support

Posted by Trent Apted <ta...@it.usyd.edu.au>.
Golly, no need to be rude.


Branko Čibej wrote:

> Trent Apted wrote:
>
>> Thanks for your reply.
>>
>> Branko Čibej wrote:
>>
>>> Trent Apted wrote:
>>>
>>>> When I run
>>>>
>>>> $ file -bi something.c
>>>>
>>>> (or .cpp, .h, .cc, etc.)
>>>>
>>>> /usr/bin/file reports that the mime-type for C/C++ files is
>>>>
>>>> text/x-c; charset=us-ascii
>>>
>>> Well, this is clearly wrong, there's no such thing as a "text/x-c" 
>>> mime type.
>>
>> Perhaps true. RFC2046 *only* defines the 'plain' subtype of the text 
>> mime type, but we all use text/html. It also says any unrecognised 
>> mime types should just be treated as text/plain, so perhaps whether 
>> or not it is valid is moot.
>
> RFC2046 isn't the canonical reference. This 
> (http://www.iana.org/assignments/media-types/) is the canonical 
> reference.

Just because something hasn't been assigned doesn't mean it's invalid. 
This is why there are types and subtypes.

>>>> However, if I feed this to Subversion, it treats the file as 
>>>> binary. So, that's fine, I'll drop the charset stuff,
>>>
>>> Ah, yes, We should know about charset attributes.
>>>
>>>> and things are mostly back to normal, but the added information 
>>>> still appears to be meaningless to Subversion.
>>>
>>> Define "meaningless". You've told SVN that this is a text file, and 
>>> that's it. SVN doesn't interpret the mime type any further than that 
>>> (yet).
>>
>> I guess I'm saying that text/x-c and text/x-cc would imply that a 
>> file is source code, and hence platform independent, thus should 
>> always use the 'native' eol-style and should never be executable. 
>> While you should still specify the style and executability for 
>> something that is text/plain. However, this might not suit everyone...
>
> Media types do not define the encoding, only the type of the contents. 
> Therefore we a) can't extrapolate eol-style from the mime type, and b) 
> would be totally wrong to do so because there are valid reasons _not_ 
> to use native eol-style even in mixed-platform environments.

a) I'm not convinced that we can't determine an eol-style because we 
only know the type of contents, and b) I can't think of any reason why I 
would want my source code in something other than the native format for 
whatever platform I'm editing it on.

Perhaps I'm merely used to CVS taking care of all this for me.

>>>> Further, perhaps there should be a feature with support for 
>>>> `/usr/bin/file -bi` --- "auto-auto-props" might be nice.
>>>
>>> We've talked about using platform-specific mechanisms to guess the 
>>> mime type. What's missing is somebody with enough time on their 
>>> hands to actually do this.
>>
>> /usr/bin/file uses a tab-separated file of the form:
>>
>> $ cat /usr/share/misc/file/magic.mime
>> # Magic data for KMimeMagic (originally for file(1) command)
>> #
>> # The format is 4-5 columns:
>> #    Column #1: byte number to begin checking from, ">" indicates continuation
>> #    Column #2: type of data to match
>> #    Column #3: contents of data to match
>> #    Column #4: MIME type of result
>> #    Column #5: MIME encoding of result (optional)
>>
>> #------------------------------------------------------------------------------ 
>>
>> This would not be platform-specific.
>
> *ROTFL*
>
> Oh and surely it would not be platform specific at all, aye. It would 
> work on *every* Linux box in the world (well, except some older ones).

Don't be silly. This is a plaintext file. Last I checked text files were 
readable on my non-Linux computers -- Windows isn't that bad.

Sarcasm aside, the point is the way in which the mime type is 
determined, which I had hoped would be evident in my choice of 
quotation. Reading the first N bytes of a file and matching them to a 
known sequence (which could be in a file distributed with Subversion or 
statically linked via source automatically generated from the above 
file) is in no way platform dependent. Obviously this is not a complete 
solution, but it gets most of the way there.
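
The approach described here can be sketched in a few lines of Python. The
table below is illustrative only: the PNG, GIF and PDF signatures are
well-known magic numbers, while the "/*" and "//" rules mirror the rough
C/C++ heuristic mentioned in this thread and use unregistered x- labels of
the kind under discussion. sniff_bytes and sniff_file are hypothetical names.

```python
# (leading bytes, MIME type) pairs, checked in order.
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF8", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"/*", "text/x-c"),    # heuristic from the thread, not a standard
    (b"//", "text/x-cc"),   # heuristic from the thread, not a standard
]
SNIFF_LENGTH = max(len(magic) for magic, _ in MAGIC_TABLE)


def sniff_bytes(head):
    """Return the MIME type of the first matching signature, else None."""
    for magic, mime_type in MAGIC_TABLE:
        if head.startswith(magic):
            return mime_type
    return None


def sniff_file(path):
    """Read only as many leading bytes as the longest signature needs."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(SNIFF_LENGTH))
```

Nothing in this depends on the operating system; the platform question is
where the table comes from and who keeps it up to date.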

Perhaps my annoyance with having to specify a native eol-style each time 
I add a new source file is testament to the fact that I am knowledgeable 
about what it means for something to be platform dependent, having been 
a developer of cross-platform software projects for some time now.

>> Actually, looking further it appears that the heuristic for c/c++ 
>> files is that if a file starts with "/*" it is C, and if with "//" it 
>> is C++. `0xbabe` is a Java class file, etc.
>>
>> I can write a patch, if you like.. PhD students tend to find time for 
>> all kinds of non-thesis pursuits ;-)
>
> Any change in this direction must be generic in the sense that it 
> allows platform-specific implementations, and it must produce portable 
> results. I will veto any patch that produces MIME types that are not 
> in the IANA registry.
>
Maybe I'll just work on that patch, despite your discouragement, and see 
if I can make you happy.

Thanks for your feedback,
    Trent.



Re: Towards standardising mime-type support

Posted by Branko Čibej <br...@xbc.nu>.
Trent Apted wrote:

> Thanks for your reply.
>
> Branko Čibej wrote:
>
>> Trent Apted wrote:
>>
>>> When I run
>>>
>>> $ file -bi something.c
>>>
>>> (or .cpp, .h, .cc, etc.)
>>>
>>> /usr/bin/file reports that the mime-type for C/C++ files is
>>>
>>> text/x-c; charset=us-ascii
>>
>> Well, this is clearly wrong, there's no such thing as a "text/x-c" 
>> mime type.
>
> Perhaps true. RFC2046 *only* defines the 'plain' subtype of the text 
> mime type, but we all use text/html. It also says any unrecognised 
> mime types should just be treated as text/plain, so perhaps whether or 
> not it is valid is moot.

RFC2046 isn't the canonical reference. This 
(http://www.iana.org/assignments/media-types/) is the canonical reference.

>>> However, if I feed this to Subversion, it treats the file as binary. 
>>> So, that's fine, I'll drop the charset stuff,
>>
>> Ah, yes, We should know about charset attributes.
>>
>>> and things are mostly back to normal, but the added information 
>>> still appears to be meaningless to Subversion.
>>
>> Define "meaningless". You've told SVN that this is a text file, and 
>> that's it. SVN doesn't interpret the mime type any further than that 
>> (yet).
>
> I guess I'm saying that text/x-c and text/x-cc would imply that a file 
> is source code, and hence platform independent, thus should always use 
> the 'native' eol-style and should never be executable. While you 
> should still specify the style and executability for something that is 
> text/plain. However, this might not suit everyone...

Media types do not define the encoding, only the type of the contents. 
Therefore we a) can't extrapolate eol-style from the mime type, and b) 
would be totally wrong to do so because there are valid reasons _not_ to 
use native eol-style even in mixed-platform environments.


>
>>> Further, perhaps there should be a feature with support for 
>>> `/usr/bin/file -bi` --- "auto-auto-props" might be nice.
>>
>> We've talked about using platform-specific mechanisms to guess the 
>> mime type. What's missing is somebody with enough time on their hands 
>> to actually do this.
>
> /usr/bin/file uses a tab-separated file of the form:
>
> $ cat /usr/share/misc/file/magic.mime
> # Magic data for KMimeMagic (originally for file(1) command)
> #
> # The format is 4-5 columns:
> #    Column #1: byte number to begin checking from, ">" indicates continuation
> #    Column #2: type of data to match
> #    Column #3: contents of data to match
> #    Column #4: MIME type of result
> #    Column #5: MIME encoding of result (optional)
>
> #------------------------------------------------------------------------------ 
>
> This would not be platform-specific.

*ROTFL*

Oh and surely it would not be platform specific at all, aye. It would 
work on *every* Linux box in the world (well, except some older ones).

> Actually, looking further it appears that the heuristic for c/c++ 
> files is that if a file starts with "/*" it is C, and if with "//" it 
> is C++. `0xbabe` is a Java class file, etc.
>
> I can write a patch, if you like.. PhD students tend to find time for 
> all kinds of non-thesis pursuits ;-)

Any change in this direction must be generic in the sense that it allows 
platform-specific implementations, and it must produce portable results. 
I will veto any patch that produces MIME types that are not in the IANA 
registry.


-- Brane



Re: Towards standardising mime-type support

Posted by David Faure <fa...@kde.org>.
On Friday 01 April 2005 09:38, Trent Apted wrote:
> This would not be platform-specific. 

The "file" program itself is platform-specific, i.e. it might not be installed (e.g. on Windows).

> Actually, looking further it  
> appears that the heuristic for c/c++ files is that if a file starts with 
> "/*" it is C, and if with "//" it is C++. `0xbabe` is a Java class file, 
> etc.

If you want to implement any detection of mimetypes, the best thing to do
would be to use the freedesktop.org mimetype standard to avoid re-inventing
the wheel.
http://www.freedesktop.org/Standards/shared-mime-info-spec
(also comes with implementations, which you would be able to reuse).

-- 
David Faure, faure@kde.org, sponsored by Trolltech to work on KDE,
Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).


Re: Towards standardising mime-type support

Posted by Trent Apted <ta...@it.usyd.edu.au>.
Thanks for your reply.

Branko Čibej wrote:

> Trent Apted wrote:
>
>> When I run
>>
>> $ file -bi something.c
>>
>> (or .cpp, .h, .cc, etc.)
>>
>> /usr/bin/file reports that the mime-type for C/C++ files is
>>
>> text/x-c; charset=us-ascii
>
> Well, this is clearly wrong, there's no such thing as a "text/x-c" 
> mime type.

Perhaps true. RFC2046 *only* defines the 'plain' subtype of the text 
mime type, but we all use text/html. It also says any unrecognised mime 
types should just be treated as text/plain, so perhaps whether or not it 
is valid is moot.

>> However, if I feed this to Subversion, it treats the file as binary. 
>> So, that's fine, I'll drop the charset stuff,
>
> Ah, yes, We should know about charset attributes.
>
>> and things are mostly back to normal, but the added information still 
>> appears to be meaningless to Subversion.
>
> Define "meaningless". You've told SVN that this is a text file, and 
> that's it. SVN doesn't interpret the mime type any further than that 
> (yet).

I guess I'm saying that text/x-c and text/x-cc would imply that a file 
is source code, and hence platform independent, thus should always use 
the 'native' eol-style and should never be executable. While you should 
still specify the style and executability for something that is 
text/plain. However, this might not suit everyone...

>> Further, perhaps there should be a feature with support for 
>> `/usr/bin/file -bi` --- "auto-auto-props" might be nice.
>
> We've talked about using platform-specific mechanisms to guess the 
> mime type. What's missing is somebody with enough time on their hands 
> to actually do this.

/usr/bin/file uses a tab-separated file of the form:

$ cat /usr/share/misc/file/magic.mime
# Magic data for KMimeMagic (originally for file(1) command)
#
# The format is 4-5 columns:
#    Column #1: byte number to begin checking from, ">" indicates continuation
#    Column #2: type of data to match
#    Column #3: contents of data to match
#    Column #4: MIME type of result
#    Column #5: MIME encoding of result (optional)

#------------------------------------------------------------------------------

This would not be platform-specific. Actually, looking further it 
appears that the heuristic for c/c++ files is that if a file starts with 
"/*" it is C, and if with "//" it is C++. `0xbabe` is a Java class file, 
etc.
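
To make the column layout above concrete, here is a small Python sketch that
parses one rule line into its four or five columns. It assumes plain decimal
offsets and tab-separated fields, keeps the match contents as raw text, and
ignores the richer syntax real magic files allow; MagicRule and parse_rule
are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MagicRule:
    continuation: bool       # True when column 1 starts with ">"
    offset: int              # column 1: byte number to begin checking from
    data_type: str           # column 2: type of data to match
    contents: str            # column 3: contents to match, kept as raw text
    mime_type: str           # column 4: MIME type of result
    encoding: Optional[str]  # column 5: MIME encoding, if present


def parse_rule(line):
    """Parse one tab-separated rule; return None for comments and blanks."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = [f.strip() for f in stripped.split("\t") if f.strip()]
    if len(fields) not in (4, 5):
        raise ValueError("expected 4-5 columns, got %d" % len(fields))
    start = fields[0]
    continuation = start.startswith(">")
    offset = int(start.lstrip(">"))
    encoding = fields[4] if len(fields) == 5 else None
    return MagicRule(continuation, offset, fields[1], fields[2], fields[3], encoding)
```

For example, the made-up line "0<TAB>string<TAB>%PDF-<TAB>application/pdf"
parses to a rule with offset 0, data type "string" and no encoding.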

I can write a patch, if you like.. PhD students tend to find time for 
all kinds of non-thesis pursuits ;-)

- Trent.



Re: Towards standardising mime-type support

Posted by Ryan Schmidt <su...@ryandesign.com>.
On 01.04.2005, at 18:12, Branko Čibej wrote:

> But since Subversion is a version control system, not a source control 
> system, this kind of property doesn't really belong in the core 
> feature set.

Apparently I've been using the wrong term; I've been telling people 
that we've switched to the Subversion source control system. After all, 
it manages our source code. Why am I not supposed to call it a source 
control system? Is it because it can also version other data? What can 
a source control system do that Subversion doesn't?




Re: Towards standardising mime-type support

Posted by Branko Čibej <br...@xbc.nu>.
Dale Worley wrote:

>> From: Branko Cibej [mailto:brane@xbc.nu]
>>
>>> text/x-c; charset=us-ascii
>>
>> Well, this is clearly wrong, there's no such thing as a
>> "text/x-c" mime type.
>
> It seems to me that that is an overly strict interpretation.  The MIME
> standard intends that one may use "private definition" subtypes starting
> with "x-" (RFC 1341, section 4, near the top of page 7).  (And that is in
> harmony with many other places where one may use a token starting with
> "x-" in place of a standardized token.)  For a source control system, I
> can see the advantage of being able to specify each programming language
> as a different subtype of text/*.
>
Certainly, if Subversion were a source control system, but not in 
svn:mime-type. MIME was simply not designed for this purpose. I know 
it's been abused for such things, but that doesn't mean we have to 
continue this practice. Encoding the implementation language in the 
MIME type would be wrong, just as encoding the language used in a 
document would be wrong.

And there's no need for such overloading; after all, you can set 
arbitrary properties in Subversion, and one of them might well be the 
programming language used in the file.

But since Subversion is a version control system, not a source control 
system, this kind of property doesn't really belong in the core feature set.

-- Brane



RE: Towards standardising mime-type support

Posted by Dale Worley <dw...@pingtel.com>.
> From: Branko Cibej [mailto:brane@xbc.nu]
>
> > text/x-c; charset=us-ascii
> 
> Well, this is clearly wrong, there's no such thing as a 
> "text/x-c" mime type.

It seems to me that that is an overly strict interpretation.  The MIME
standard intends that one may use "private definition" subtypes starting
with "x-" (RFC 1341, section 4, near the top of page 7).  (And that is in
harmony with many other places where one may use a token starting with
"x-" in place of a standardized token.)  For a source control system, I
can see the advantage of being able to specify each programming language
as a different subtype of text/*.

Dale




Re: Towards standardising mime-type support

Posted by Branko Čibej <br...@xbc.nu>.
Trent Apted wrote:

> Hi there, I've been a developer and CVS user for a long time and 
> recently made the switch to Subversion
> [svn, version 1.1.3 (r12730)]. Some things bug me (being unable to 
> move things between repositories comes to mind), but overall I'm happy 
> and it is now unlikely that I'll go back to CVS (it was a close thing 
> though).
>
> I'm currently in the process of reading through the docs to get the 
> most out of Subversion. I came across the svn:mime-type property and 
> my immediate thought was, "Why don't I just use /usr/bin/file to set 
> the mime-type?" So I tried it out, but there are some issues:
>
> When I run
>
> $ file -bi something.c
>
> (or .cpp, .h, .cc, etc.)
>
> /usr/bin/file reports that the mime-type for C/C++ files is
>
> text/x-c; charset=us-ascii

Well, this is clearly wrong, there's no such thing as a "text/x-c" mime 
type.

> However, if I feed this to Subversion, it treats the file as binary. 
> So, that's fine, I'll drop the charset stuff,

Ah, yes, We should know about charset attributes.

> and things are mostly back to normal, but the added information still 
> appears to be meaningless to Subversion.

Define "meaningless". You've told SVN that this is a text file, and 
that's it. SVN doesn't interpret the mime type any further than that (yet).

> Also, trac did not understand the text/x-c mime-type

Like I said, it's not a valid mime type.

> and refused to run the file through enscript, so I ended up deleting 
> the mime-type altogether. This is bad, because I can foresee a future 
> application that can take full advantage of such mime-type strings.

You should complain about this to the Trac people, not here. Trac isn't 
Subversion.

> So perhaps, to future-proof the mime-types, the parsing should be able 
> to treat any string that fits the standard in a sane way .. even if it 
> just ignores some of it (like charsets).

I agree about the charset attribute, see above.

> Further, perhaps there should be a feature with support for 
> `/usr/bin/file -bi` --- "auto-auto-props" might be nice.

We've talked about using platform-specific mechanisms to guess the mime 
type. What's missing is somebody with enough time on their hands to 
actually do this.

> The current method doesn't really add much over the cvswrappers 
> technique which always seemed to be a big duplication of effort to me.

So it is.

> The trac problem is less urgent, I might bug those guys later..
>
> Cheers,
>    Trent.


-- 
-- Brane

