Posted to dev@subversion.apache.org by Sean Leonard <de...@seantek.com> on 2014/10/25 20:53:24 UTC

svn:mime-type arbitrary parameters

Greetings:

What is the best way for Subversion to record parameters of a MIME type 
(Internet media type) in a repository?

I require svn:mime-type to be filled out in all of the repositories that 
I manage. This is also useful for serving the right content over media 
type-aware protocols (HTTP, but there are others).

Some time ago there was a dev discussion about storing the character set 
of text in svn:charset (simple textual content) vs. svn:mime-type (using 
the header format of RFC 2045).

http://svn.haxx.se/dev/archive-2008-06/0941.shtml
http://svn.haxx.se/dev/archive-2008-07/0138.shtml

It appears that the matter was not fully resolved. svn:charset seems to 
enjoy de-facto use. Using RFC 2045 format in svn:mime-type as an 
appendage (delimited by ";") would be "correct", but as the poster 
notes, is extremely unwieldy. It is not possible to store arbitrary 
UTF-8 data, for example, unless you use the syntax of RFC 2231, which 
looks like:

    Content-Type: application/x-stuff;
     title*0*=us-ascii'en'This%20is%20even%20more%20;
     title*1*=%2A%2A%2Afun%2A%2A%2A%20;
     title*2="isn't it!"
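
To show how such a segmented parameter is put back together, here is a minimal Python sketch that uses the standard library's email package. The names RAW_HEADER and read_title are made up for illustration; the header string in the sketch separates each segment with a semicolon, which the parser relies on to tell the segments apart.

```python
from email import message_from_string
from email.utils import collapse_rfc2231_value

# The RFC 2231 example header, one parameter segment per folded line.
RAW_HEADER = (
    "Content-Type: application/x-stuff;\n"
    " title*0*=us-ascii'en'This%20is%20even%20more%20;\n"
    " title*1*=%2A%2A%2Afun%2A%2A%2A%20;\n"
    " title*2=\"isn't it!\"\n"
    "\n"
)


def read_title(raw_header):
    """Reassemble and decode the segmented 'title' parameter."""
    msg = message_from_string(raw_header)
    # For RFC 2231 values, get_param() returns (charset, language, text).
    param = msg.get_param("title")
    # Collapse the 3-tuple into a plain Python string.
    return collapse_rfc2231_value(param)


print(read_title(RAW_HEADER))
```

Nothing here is Subversion-specific; it only illustrates how much machinery a reader of the property value would need.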


Thank you,

Sean

PS I am also the author of the IETF text/markdown media type 
registration 
<https://tools.ietf.org/html/draft-ietf-appsawg-text-markdown-03>. The 
current draft proposes a "syntax" parameter to disambiguate between 
different kinds of Markdown. There are hundreds of different 
Markdown-derivative syntaxes, and it is unlikely that most of them will 
go through the trouble of registering separate media types. This matters 
to me personally because we have been storing Markdown content with 
materially different syntaxes in our Subversion repositories.

Re: svn:mime-type arbitrary parameters

Posted by Ben Reser <be...@reser.org>.
On 10/26/14 11:35 AM, Branko Čibej wrote:
> The fact that the svn:mime-type property is usable in any way for
> serving content from the repository is more or less an accident; it's
> definitely not a design goal. What you propose would have zero benefit
> for Subversion as a version control system but less than trivial
> maintenance costs, so it's not likely to ever happen.

I agree splitting up well known parameters of the Internet Media Type into
standard "svn:" namespace properties seems not terribly useful to me.

I don't think the choice of storing a mime-type rather than just a boolean for
binary data vs not-binary is a happy accident.  Pretty sure we stored a
mime-type because we figured it would be useful for other things than just
distinguishing binary types.  But as I recall you're the one that came up with
the svn:mime-type property in the first place.  So maybe you have a better
recollection than I do about this.

Honestly, I don't see what's unwieldy about using the Internet Media Type
parameters in his case (syntax parameter for markdown).  The title example
given earlier in the thread seems like an example that should go in some other
property in the case of Subversion.

Basically if it isn't useful in the Content-Type field of the HTTP protocol, I
can't fathom why you'd shove it into svn:mime-type.

Re: svn:mime-type arbitrary parameters

Posted by Branko Čibej <br...@wandisco.com>.
On 27.10.2014 18:45, Ben Reser wrote:
> On 10/26/14 11:35 AM, Branko Čibej wrote:
>> The fact that the svn:mime-type property is usable in any way for
>> serving content from the repository is more or less an accident; it's
>> definitely not a design goal. What you propose would have zero benefit
>> for Subversion as a version control system but less than trivial
>> maintenance costs, so it's not likely to ever happen.
> I agree splitting up well known parameters of the Internet Media Type into
> standard "svn:" namespace properties seems not terribly useful to me.
>
> I don't think the choice of storing a mime-type rather than just a boolean for
> binary data vs not-binary is a happy accident.  Pretty sure we stored a
> mime-type because we figured it would be useful for other things than just
> distinguishing binary types.  But as I recall you're the one that came up with
> the svn:mime-type property in the first place.  So maybe you have a better
> recollection than I do about this.

Frankly, IIRC I had rather naïve expectations about the usefulness of
MIME-type for detecting "binaryness". Or, more precisely, the ability to
diff/merge. One of the considerations at the time was also to have the
ability to add diff/merge handlers for content other than line-based
plain text (XML merge was probably at the top of the list at the time).
Turns out we've never actually bothered doing that (yet).

-- Brane


Re: svn:mime-type arbitrary parameters

Posted by Sean Leonard <de...@seantek.com>.
On Oct 30, 2014, at 3:52 AM, Branko Čibej <br...@wandisco.com> wrote:

> On 30.10.2014 05:47, Sean Leonard wrote:
>> On Oct 26, 2014, at 11:35 AM, Branko Čibej <br...@wandisco.com> wrote:
>> 
>>> On 26.10.2014 16:06, Sean Leonard wrote:
>>>> On 10/26/2014 3:21 AM, Branko Čibej wrote:
>>>>> On 26.10.2014 05:49, Sean Leonard wrote:
>>>>>> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>>>>>>> On 25.10.2014 20:53, Sean Leonard wrote:
>>>>>>>> It appears that the matter was not fully resolved. svn:charset seems
>>>>>>>> to enjoy de-facto use.
>>>>>>> If anyone is using svn:charset, they're violating our rules. The svn:
>>>>>>> namespace is reserved for property names defined by Subversion, and
>>>>>>> we've not defined that name. So ... using that name is likely to cause
>>>>>>> problems at some point.
>>>>>> Ok. So I guess the issue of how Subversion encodes a particular
>>>>>> character set/character encoding is still "live"?
>>>>> Well, Subversion doesn't "encode" anything; but for the purpose of
>>>>> serving content straight from the repository through an HTTP server, the
>>>>> established way to define the character set is to add the tag to the
>>>>> svn:mime-type property, e.g.:
>>>>> 
>>>>>    svn propset svn:mime-type 'text/plain; charset=UTF-8' file...
>>>> Actually I have a different proposal: what if the property name is the
>>>> parameter, prefixed with "svn:mime-type:", and the property value is
>>>> the UTF-8 encoded parameter value?
>>>> 
>>>> For example:
>>> [...]
>>> 
>>> You must be confusing Subversion with some Web content management system. :)
>> Well, Subversion is a content management system…at least for source code. :) And that includes source code for websites. Thus in that sense, it would be a Web content management system. :)
>> 
>> Internally one of my projects is using it for document storage, and it has been working out pretty well. Much cheaper than document management systems that cost hundreds of thousands of dollars.
>> 
>>> The fact that the svn:mime-type property is usable in any way for
>>> serving content from the repository is more or less an accident; it's
>>> definitely not a design goal. What you propose would have zero benefit
>>> for Subversion as a version control system but less than trivial
>>> maintenance costs, so it's not likely to ever happen.
>>> 
>>>> Are there other places where this property is parsed or interpreted?
>>> It's used by mod_dav_svn to populate Content-Type; but, as I said,
>>> that's not the purpose of the property.
>> Well, I don’t know if it has “zero” benefit. The benefit is that it is easier to retrieve and manipulate the semantic values without needing to do intricate parsing of svn:mime-type.
>> 
>> However, it seems that there is running code that will pass the parameters as-is to the Content-Type field, so this is sufficient reason for me (on top of the “; “ delimiter check in validate.c) to conclude that, if you want to store media type parameters in Subversion, you should do it with RFC 2045 style semicolon delimited data after the media type in the “svn:mime-type” property. Thanks!
>> 
>>>> OK. How does Subversion restrict the value to US-ASCII?
>>> It turns out I was wrong about that; we don't restrict the value to
>>> US-ASCII. See svn_mime_type_validate in subversion/libsvn_subr/validate.c.
>> Ok, thanks. Yep, that settles it regarding the parameters—there is a comment in there.
>> 
>> On the other hand, the code contradicts what you’re saying:
>>  /* Check the mime type for illegal characters. See RFC 1521. */
>>  for (i = 0; i < len; i++)
>>    {
>>      if (&mime_type[i] != slash_pos
>>         && (! svn_ctype_isascii(mime_type[i])
>>            || svn_ctype_iscntrl(mime_type[i])
>>            || svn_ctype_isspace(mime_type[i])
>>            || (strchr(tspecials, mime_type[i]) != NULL)))
>>        return svn_error_createf
>> 
>> 
>> That suggests that the characters are limited to US-ASCII.
> 
> Meh ... you're right and I'm stupid these days.
> 
>> While this may be true for RFC 2045, e-mail can now contain UTF-8 headers (RFC 6530). It is not yet clear, however, whether UTF-8 headers apply to MIME headers, specifically Content-Type. Also, the code above is not *entirely* correct as a Content-Type header can contain linear whitespace (LWSP), meaning that \r\n(spaces)(more content) is permissible; it gets collapsed to a single space.
> 
> "May contain" is not the same as "must contain". The svn:mime-type
> property is not equivalent to the Content-Type header and it's a mistake
> to assume it is; e.g., as you note, it may not contain newlines, but
> that does not prevent us from populating Content-Type from
> svn:mime-type. That is a one-way path; we don't ever try to set
> svn:mime-type from the value of some arbitrary Content-Type header.
> 
> After reading RFCs 6530 and 6532, I conclude that Content-Type can now
> contain UTF-8 since neither document explicitly forbids that. However,
> we can't extend the domain of svn:mime-type property values because we
> must maintain backwards compatibility.
> 
> So, yes, you're restricted to using quoting kluges if you want to embed
> Unicode characters in the value. You'll also note that you can't use
> comments in svn:mime-type, because we forbid parentheses.
> 
> All that said, I'm sure we'd consider a client-side patch that would
> allow users to use UTF-8 when setting svn:mime-type in the client, or
> via the svn_client API, and would do the necessary quoting and unquoting
> transparently. Do you think you're up to having a go at producing such a
> patch?

Yes, I am willing to work on that.

I will probably not have time for the next couple of weeks, but will try to allocate time in November.

(If anyone wants to take it in the interim, go ahead. Otherwise I’ll see what I can do later this month.)

Best regards,

Sean

Re: svn:mime-type arbitrary parameters

Posted by Branko Čibej <br...@wandisco.com>.
On 30.10.2014 05:47, Sean Leonard wrote:
> On Oct 26, 2014, at 11:35 AM, Branko Čibej <br...@wandisco.com> wrote:
>
>> On 26.10.2014 16:06, Sean Leonard wrote:
>>> On 10/26/2014 3:21 AM, Branko Čibej wrote:
>>>> On 26.10.2014 05:49, Sean Leonard wrote:
>>>>> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>>>>>> On 25.10.2014 20:53, Sean Leonard wrote:
>>>>>>> It appears that the matter was not fully resolved. svn:charset seems
>>>>>>> to enjoy de-facto use.
>>>>>> If anyone is using svn:charset, they're violating our rules. The svn:
>>>>>> namespace is reserved for property names defined by Subversion, and
>>>>>> we've not defined that name. So ... using that name is likely to cause
>>>>>> problems at some point.
>>>>> Ok. So I guess the issue of how Subversion encodes a particular
>>>>> character set/character encoding is still "live"?
>>>> Well, Subversion doesn't "encode" anything; but for the purpose of
>>>> serving content straight from the repository through an HTTP server, the
>>>> established way to define the character set is to add the tag to the
>>>> svn:mime-type property, e.g.:
>>>>
>>>>     svn propset svn:mime-type 'text/plain; charset=UTF-8' file...
>>> Actually I have a different proposal: what if the property name is the
>>> parameter, prefixed with "svn:mime-type:", and the property value is
>>> the UTF-8 encoded parameter value?
>>>
>>> For example:
>> [...]
>>
>> You must be confusing Subversion with some Web content management system. :)
> Well, Subversion is a content management system…at least for source code. :) And that includes source code for websites. Thus in that sense, it would be a Web content management system. :)
>
> Internally one of my projects is using it for document storage, and it has been working out pretty well. Much cheaper than document management systems that cost hundreds of thousands of dollars.
>
>> The fact that the svn:mime-type property is usable in any way for
>> serving content from the repository is more or less an accident; it's
>> definitely not a design goal. What you propose would have zero benefit
>> for Subversion as a version control system but less than trivial
>> maintenance costs, so it's not likely to ever happen.
>>
>>> Are there other places where this property is parsed or interpreted?
>> It's used by mod_dav_svn to populate Content-Type; but, as I said,
>> that's not the purpose of the property.
> Well, I don’t know if it has “zero” benefit. The benefit is that it is easier to retrieve and manipulate the semantic values without needing to do intricate parsing of svn:mime-type.
>
> However, it seems that there is running code that will pass the parameters as-is to the Content-Type field, so this is sufficient reason for me (on top of the “; “ delimiter check in validate.c) to conclude that, if you want to store media type parameters in Subversion, you should do it with RFC 2045 style semicolon delimited data after the media type in the “svn:mime-type” property. Thanks!
>
>>> OK. How does Subversion restrict the value to US-ASCII?
>> It turns out I was wrong about that; we don't restrict the value to
>> US-ASCII. See svn_mime_type_validate in subversion/libsvn_subr/validate.c.
> Ok, thanks. Yep, that settles it regarding the parameters—there is a comment in there.
>
> On the other hand, the code contradicts what you’re saying:
>   /* Check the mime type for illegal characters. See RFC 1521. */
>   for (i = 0; i < len; i++)
>     {
>       if (&mime_type[i] != slash_pos
>          && (! svn_ctype_isascii(mime_type[i])
>             || svn_ctype_iscntrl(mime_type[i])
>             || svn_ctype_isspace(mime_type[i])
>             || (strchr(tspecials, mime_type[i]) != NULL)))
>         return svn_error_createf
>
>
> That suggests that the characters are limited to US-ASCII.

Meh ... you're right and I'm stupid these days.

> While this may be true for RFC 2045, e-mail can now contain UTF-8 headers (RFC 6530). It is not yet clear, however, whether UTF-8 headers apply to MIME headers, specifically Content-Type. Also, the code above is not *entirely* correct as a Content-Type header can contain linear whitespace (LWSP), meaning that \r\n(spaces)(more content) is permissible; it gets collapsed to a single space.

"May contain" is not the same as "must contain". The svn:mime-type
property is not equivalent to the Content-Type header and it's a mistake
to assume it is; e.g., as you note, it may not contain newlines, but
that does not prevent us from populating Content-Type from
svn:mime-type. That is a one-way path; we don't ever try to set
svn:mime-type from the value of some arbitrary Content-Type header.

After reading RFCs 6530 and 6532, I conclude that Content-Type can now
contain UTF-8 since neither document explicitly forbids that. However,
we can't extend the domain of svn:mime-type property values because we
must maintain backwards compatibility.

So, yes, you're restricted to using quoting kluges if you want to embed
Unicode characters in the value. You'll also note that you can't use
comments in svn:mime-type, because we forbid parentheses.

All that said, I'm sure we'd consider a client-side patch that would
allow users to use UTF-8 when setting svn:mime-type in the client, or
via the svn_client API, and would do the necessary quoting and unquoting
transparently. Do you think you're up to having a go at producing such a
patch?
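
To make "the necessary quoting and unquoting" concrete, here is a rough Python sketch of the round trip using RFC 2231 extended parameter syntax. It is not Subversion code; encode_param, decode_param and is_token are hypothetical helpers, and always choosing UTF-8 as the charset is an assumption.

```python
from email.utils import encode_rfc2231
from urllib.parse import unquote

# Characters that may not appear in an unquoted MIME token (RFC 2045).
TSPECIALS = '()<>@,;:\\"/[]?='


def is_token(value):
    """True if value is a non-empty printable-ASCII token without tspecials."""
    return bool(value) and all(
        32 < ord(ch) < 127 and ch not in TSPECIALS for ch in value
    )


def encode_param(name, value):
    """Return 'name=value' for plain tokens, else the RFC 2231 'name*=' form."""
    if is_token(value):
        return f"{name}={value}"
    # Percent-encode the UTF-8 bytes and prefix the charset (language left empty).
    return f"{name}*={encode_rfc2231(value, charset='utf-8')}"


def decode_param(param):
    """Inverse of encode_param: return (name, decoded value)."""
    name, _, raw = param.partition("=")
    if not name.endswith("*"):
        return name, raw
    charset, _language, encoded = raw.split("'", 2)
    return name[:-1], unquote(encoded, encoding=charset or "ascii")
```

A client could call encode_param before storing the property and decode_param when displaying it, so the stored value stays within ASCII.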

-- Brane


Re: svn:mime-type arbitrary parameters

Posted by Sean Leonard <de...@seantek.com>.
On Oct 26, 2014, at 11:35 AM, Branko Čibej <br...@wandisco.com> wrote:

> On 26.10.2014 16:06, Sean Leonard wrote:
>> On 10/26/2014 3:21 AM, Branko Čibej wrote:
>>> On 26.10.2014 05:49, Sean Leonard wrote:
>>>> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>>>>> On 25.10.2014 20:53, Sean Leonard wrote:
>>>>>> It appears that the matter was not fully resolved. svn:charset seems
>>>>>> to enjoy de-facto use.
>>>>> If anyone is using svn:charset, they're violating our rules. The svn:
>>>>> namespace is reserved for property names defined by Subversion, and
>>>>> we've not defined that name. So ... using that name is likely to cause
>>>>> problems at some point.
>>>> Ok. So I guess the issue of how Subversion encodes a particular
>>>> character set/character encoding is still "live"?
>>> Well, Subversion doesn't "encode" anything; but for the purpose of
>>> serving content straight from the repository through an HTTP server, the
>>> established way to define the character set is to add the tag to the
>>> svn:mime-type property, e.g.:
>>> 
>>>     svn propset svn:mime-type 'text/plain; charset=UTF-8' file...
>> 
>> Actually I have a different proposal: what if the property name is the
>> parameter, prefixed with "svn:mime-type:", and the property value is
>> the UTF-8 encoded parameter value?
>> 
>> For example:
> 
> [...]
> 
> You must be confusing Subversion with some Web content management system. :)

Well, Subversion is a content management system…at least for source code. :) And that includes source code for websites. Thus in that sense, it would be a Web content management system. :)

Internally one of my projects is using it for document storage, and it has been working out pretty well. Much cheaper than document management systems that cost hundreds of thousands of dollars.

> 
> The fact that the svn:mime-type property is usable in any way for
> serving content from the repository is more or less an accident; it's
> definitely not a design goal. What you propose would have zero benefit
> for Subversion as a version control system but less than trivial
> maintenance costs, so it's not likely to ever happen.
> 
>> Are there other places where this property is parsed or interpreted?
> 
> It's used by mod_dav_svn to populate Content-Type; but, as I said,
> that's not the purpose of the property.

Well, I don’t know if it has “zero” benefit. The benefit is that it is easier to retrieve and manipulate the semantic values without needing to do intricate parsing of svn:mime-type.

However, it seems that there is running code that will pass the parameters as-is to the Content-Type field, so this is sufficient reason for me (on top of the “; “ delimiter check in validate.c) to conclude that, if you want to store media type parameters in Subversion, you should do it with RFC 2045 style semicolon delimited data after the media type in the “svn:mime-type” property. Thanks!

> 
>> OK. How does Subversion restrict the value to US-ASCII?
> 
> It turns out I was wrong about that; we don't restrict the value to
> US-ASCII. See svn_mime_type_validate in subversion/libsvn_subr/validate.c.

Ok, thanks. Yep, that settles it regarding the parameters—there is a comment in there.

On the other hand, the code contradicts what you’re saying:
  /* Check the mime type for illegal characters. See RFC 1521. */
  for (i = 0; i < len; i++)
    {
      if (&mime_type[i] != slash_pos
         && (! svn_ctype_isascii(mime_type[i])
            || svn_ctype_iscntrl(mime_type[i])
            || svn_ctype_isspace(mime_type[i])
            || (strchr(tspecials, mime_type[i]) != NULL)))
        return svn_error_createf


That suggests that the characters are limited to US-ASCII.
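
For readers who want to experiment with that loop outside the C sources, here is a loose Python rendering of it. The excerpt does not show how len, slash_pos or tspecials are set up, so the sketch assumes that len stops at the first ";" or space, that slash_pos is the first "/", and that tspecials is the RFC 1521 set; first_illegal_char is a made-up name.

```python
# Assumed contents of tspecials, taken from RFC 1521.
TSPECIALS = '()<>@,;:\\"/[]?='


def first_illegal_char(mime_type):
    """Return the first character the loop would reject, or None."""
    # Assumption: only the part before the first ';' or ' ' is examined.
    type_len = len(mime_type)
    for i, ch in enumerate(mime_type):
        if ch in "; ":
            type_len = i
            break
    slash_pos = mime_type.find("/")
    for i in range(type_len):
        ch = mime_type[i]
        code = ord(ch)
        if i != slash_pos and (
            code > 127                  # not ASCII
            or code < 32 or code == 127  # control character
            or ch in " \t\n\v\f\r"       # whitespace
            or ch in TSPECIALS           # RFC 1521 tspecials
        ):
            return ch
    return None
```

Under these assumptions "text/plain; charset=UTF-8" passes, while a non-ASCII letter or a parenthesis in the type itself is rejected.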

While this may be true for RFC 2045, e-mail can now contain UTF-8 headers (RFC 6530). It is not yet clear, however, whether UTF-8 headers apply to MIME headers, specifically Content-Type. Also, the code above is not *entirely* correct as a Content-Type header can contain linear whitespace (LWSP), meaning that \r\n(spaces)(more content) is permissible; it gets collapsed to a single space.

Sean


Re: svn:mime-type arbitrary parameters

Posted by Branko Čibej <br...@wandisco.com>.
On 26.10.2014 16:06, Sean Leonard wrote:
> On 10/26/2014 3:21 AM, Branko Čibej wrote:
>> On 26.10.2014 05:49, Sean Leonard wrote:
>>> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>>>> On 25.10.2014 20:53, Sean Leonard wrote:
>>>>> It appears that the matter was not fully resolved. svn:charset seems
>>>>> to enjoy de-facto use.
>>>> If anyone is using svn:charset, they're violating our rules. The svn:
>>>> namespace is reserved for property names defined by Subversion, and
>>>> we've not defined that name. So ... using that name is likely to cause
>>>> problems at some point.
>>> Ok. So I guess the issue of how Subversion encodes a particular
>>> character set/character encoding is still "live"?
>> Well, Subversion doesn't "encode" anything; but for the purpose of
>> serving content straight from the repository through an HTTP server, the
>> established way to define the character set is to add the tag to the
>> svn:mime-type property, e.g.:
>>
>>      svn propset svn:mime-type 'text/plain; charset=UTF-8' file...
>
> Actually I have a different proposal: what if the property name is the
> parameter, prefixed with "svn:mime-type:", and the property value is
> the UTF-8 encoded parameter value?
>
> For example:

[...]

You must be confusing Subversion with some Web content management system. :)

The fact that the svn:mime-type property is usable in any way for
serving content from the repository is more or less an accident; it's
definitely not a design goal. What you propose would have zero benefit
for Subversion as a version control system but less than trivial
maintenance costs, so it's not likely to ever happen.

> Are there other places where this property is parsed or interpreted?

It's used by mod_dav_svn to populate Content-Type; but, as I said,
that's not the purpose of the property.

> OK. How does Subversion restrict the value to US-ASCII?

It turns out I was wrong about that; we don't restrict the value to
US-ASCII. See svn_mime_type_validate in subversion/libsvn_subr/validate.c.

-- Brane


Re: svn:mime-type arbitrary parameters

Posted by Sean Leonard <de...@seantek.com>.
On 10/26/2014 3:21 AM, Branko Čibej wrote:
> On 26.10.2014 05:49, Sean Leonard wrote:
>> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>>> On 25.10.2014 20:53, Sean Leonard wrote:
>>>> It appears that the matter was not fully resolved. svn:charset seems
>>>> to enjoy de-facto use.
>>> If anyone is using svn:charset, they're violating our rules. The svn:
>>> namespace is reserved for property names defined by Subversion, and
>>> we've not defined that name. So ... using that name is likely to cause
>>> problems at some point.
>> Ok. So I guess the issue of how Subversion encodes a particular
>> character set/character encoding is still "live"?
> Well, Subversion doesn't "encode" anything; but for the purpose of
> serving content straight from the repository through an HTTP server, the
> established way to define the character set is to add the tag to the
> svn:mime-type property, e.g.:
>
>      svn propset svn:mime-type 'text/plain; charset=UTF-8' file...

Actually I have a different proposal: what if the property name is the 
parameter, prefixed with "svn:mime-type:", and the property value is the 
UTF-8 encoded parameter value?

For example:

svn propset...
   svn:mime-type "text/plain"
   svn:mime-type:charset "UTF-8"

svn propset...
   svn:mime-type "text/markdown"
   svn:mime-type:syntax "Original"

svn propset...
   svn:mime-type "text/troff"
   svn:mime-type:versions "nroff 2013"
   svn:mime-type:resources "extra.txt"

svn propset...
   svn:mime-type "application/pkcs7-mime"
   svn:mime-type:smime-type "signed-receipt"

This would be easier for Subversion clients to manage and parse. It 
matches all of the semantics.
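
As a rough illustration of how a client or server could turn such split properties back into a single media type string, here is a Python sketch. The "svn:mime-type:" properties are only the proposal above and are not defined by Subversion; compose_content_type and format_value are hypothetical, and the sketch handles ASCII values only.

```python
PREFIX = "svn:mime-type:"
# Characters that force a parameter value to be a quoted-string (RFC 2045).
TSPECIALS = '()<>@,;:\\"/[]?='


def format_value(value):
    """Emit value as a bare token if possible, else as a quoted-string."""
    if value and all(32 < ord(c) < 127 and c not in TSPECIALS for c in value):
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def compose_content_type(props):
    """Join the base type and the hypothetical per-parameter properties."""
    parts = [props["svn:mime-type"]]
    for name in sorted(props):
        if name.startswith(PREFIX):
            param = name[len(PREFIX):]
            parts.append(f"{param}={format_value(props[name])}")
    return "; ".join(parts)
```

For the troff example this would produce: text/troff; resources=extra.txt; versions="nroff 2013"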

However, if there is already established code that encodes or decodes 
parameters, then encoding everything into svn:mime-type is not 
objectionable either.

> That will be exposed to the browser in the Content-Type header, and the
> Subversion client, which uses svn:mime-type for its own purposes, will
> ignore any content type parameters (that is, anything after the semicolon).

I looked through the source code and found one place where the 
svn:mime-type property is parsed: svn_mime_type_is_binary (in 
libsvn_subr/validate.c). It looks for the ; delimiter. Looks like 
svn_mime_type_is_binary is quite popular in the source.

Are there other places where this property is parsed or interpreted?

>
> The restriction here is that Subversion restricts the value of the
> svn:mime-type property to ASCII; which, incidentally, is
> required by RFC2045 (see: https://www.ietf.org/rfc/rfc2045.txt section 5.1).

OK. How does Subversion restrict the value to US-ASCII?

Thanks,

Sean


Re: svn:mime-type arbitrary parameters

Posted by Branko Čibej <br...@wandisco.com>.
On 26.10.2014 05:49, Sean Leonard wrote:
> On 10/25/2014 5:59 PM, Branko Čibej wrote:
>> On 25.10.2014 20:53, Sean Leonard wrote:
>>> It appears that the matter was not fully resolved. svn:charset seems
>>> to enjoy de-facto use.
>> If anyone is using svn:charset, they're violating our rules. The svn:
>> namespace is reserved for property names defined by Subversion, and
>> we've not defined that name. So ... using that name is likely to cause
>> problems at some point.
>
> Ok. So I guess the issue of how Subversion encodes a particular
> character set/character encoding is still "live"?

Well, Subversion doesn't "encode" anything; but for the purpose of
serving content straight from the repository through an HTTP server, the
established way to define the character set is to add the tag to the
svn:mime-type property, e.g.:

    svn propset svn:mime-type 'text/plain; charset=UTF-8' file...

That will be exposed to the browser in the Content-Type header, and the
Subversion client, which uses svn:mime-type for its own purposes, will
ignore any content type parameters (that is, anything after the semicolon).
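
A small Python sketch of that split, for illustration only: the part before the first semicolon is the media type, and everything after it is parameter text that a client can ignore. Both helper names are made up, and treating only "text/" types as non-binary is a simplifying assumption rather than a statement of Subversion's exact rule.

```python
def split_mime_type(value):
    """Split a property value into (media type, raw parameter text)."""
    base, _, params = value.partition(";")
    return base.strip(), params.strip()


def looks_binary(value):
    """Assumed rule: anything that is not text/* counts as binary."""
    base, _params = split_mime_type(value)
    return not base.lower().startswith("text/")


print(split_mime_type("text/plain; charset=UTF-8"))
print(looks_binary("text/plain; charset=UTF-8"))
```

The charset parameter never influences the binary decision here, which mirrors the behaviour described above.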

The restriction here is that Subversion restricts the value of the
svn:mime-type property to ASCII; which, incidentally, is
required by RFC2045 (see: https://www.ietf.org/rfc/rfc2045.txt section 5.1).

This may be unwieldy, as you say in your original post, but it's the
only way that works.

-- Brane


Re: svn:mime-type arbitrary parameters

Posted by Sean Leonard <de...@seantek.com>.
On 10/25/2014 5:59 PM, Branko Čibej wrote:
> On 25.10.2014 20:53, Sean Leonard wrote:
>> It appears that the matter was not fully resolved. svn:charset seems
>> to enjoy de-facto use.
> If anyone is using svn:charset, they're violating our rules. The svn:
> namespace is reserved for property names defined by Subversion, and
> we've not defined that name. So ... using that name is likely to cause
> problems at some point.

Ok. So I guess the issue of how Subversion encodes a particular 
character set/character encoding is still "live"?

-Sean

Re: svn:mime-type arbitrary parameters

Posted by Branko Čibej <br...@wandisco.com>.
On 25.10.2014 20:53, Sean Leonard wrote:
> It appears that the matter was not fully resolved. svn:charset seems
> to enjoy de-facto use. 

If anyone is using svn:charset, they're violating our rules. The svn:
namespace is reserved for property names defined by Subversion, and
we've not defined that name. So ... using that name is likely to cause
problems at some point.


-- Brane