Posted to users@subversion.apache.org by Mark Irving <Ma...@informatix.co.uk> on 2008/02/04 12:22:48 UTC

Action request: mime-type of xml-dtd should be treated as text

I have a repository with a DTD file checked in and carrying the 
property svn:mime-type value application/xml-dtd, the correct 
type for it according to RFC 3023. It is inconvenient to do some 
Subversion operations on it, such as "blame", because Subversion 
thinks it is non-textual. It would be an improvement if XML mime 
types were recognised as text, and specifically this one.

As far as I can tell, the enhancement should be to function 
svn_mime_type_is_binary in subversion/libsvn_subr/validate.c. I 
suggest the following additions to the mime types treated as 
text.

application/xml-dtd
application/xml
application/*+xml

(The FAQ and the source agree that Subversion's text mime-types 
are text/*, image/x-xbitmap and image/x-xpixmap.)
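
As a rough illustration of the proposed rule, here is a minimal sketch in
Python rather than Subversion's C. The function and list names are made up
for this example and are not the real contents of validate.c; only the
pattern lists come from the text above.

```python
from fnmatch import fnmatchcase

# Patterns treated as text today, as summarised above from the FAQ and source.
CURRENT_TEXT_PATTERNS = ["text/*", "image/x-xbitmap", "image/x-xpixmap"]

# The three additions proposed in this message.
PROPOSED_TEXT_PATTERNS = ["application/xml-dtd", "application/xml", "application/*+xml"]


def is_text_mime_type(mime_type, patterns):
    """Return True if mime_type matches any of the glob-style patterns."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    base = mime_type.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(base, pattern) for pattern in patterns)


def is_binary_mime_type(mime_type, patterns):
    """Hypothetical stand-in for svn_mime_type_is_binary: binary means 'not text'."""
    return not is_text_mime_type(mime_type, patterns)


if __name__ == "__main__":
    proposed = CURRENT_TEXT_PATTERNS + PROPOSED_TEXT_PATTERNS
    for mime in ["application/xml-dtd", "application/xhtml+xml", "application/octet-stream"]:
        print(mime, is_binary_mime_type(mime, CURRENT_TEXT_PATTERNS),
              is_binary_mime_type(mime, proposed))
```

With only the current patterns, application/xml-dtd counts as binary; with the
proposed additions it counts as text, while application/octet-stream stays
binary either way.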

Do others agree? Should this be entered as a formal SVN issue?

  - Mark Irving.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Karl Fogel <kf...@red-bean.com>.
"Mark Irving" <Ma...@informatix.co.uk> writes:
> I have a repository with a DTD file checked in and carrying the 
> property svn:mime-type value application/xml-dtd, the correct 
> type for it according to RFC 3023. It is inconvenient to do some 
> Subversion operations on it, such as "blame", because Subversion 
> thinks it is non-textual. It would be an improvement if XML mime 
> types were recognised as text, and specifically this one.
>
> As far as I can tell, the enhancement should be to function 
> svn_mime_type_is_binary in subversion/libsvn_subr/validate.c. I 
> suggest the following additions to the mime types treated as 
> text.
>
> application/xml-dtd
> application/xml
> application/*+xml
>
> (The FAQ and the source agree that Subversion's text mime-types 
> are text/*, image/x-xbitmap and image/x-xpixmap.)
>
> Do others agree? Should this be entered as a formal SVN issue?

I think this has come up before, though I don't remember the
discussion clearly; there may be some reason we're not declaring all
XML to be a line-based text format.  Can you search the dev@ list
archive, and if you don't find it, repost your mail to dev@?

Thanks,
-Karl

---------------------------------------------------------------------

Re: Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Paul Koning <Pa...@dell.com>.
>>>>> "Mark" == Mark Irving <Ma...@informatix.co.uk> writes:

 Mark> An XML DTD is often, but not always, prepared with a text
 Mark> editor or a syntax-aware text editor. Exactly the same claim
 Mark> can be made about, say, a C++ source file. If SVN presents C++
 Mark> source as text, shouldn't it do the same for
 Mark> application/xml-dtd? The argument is weaker for
 Mark> application/xml, which is more likely to be edited with a
 Mark> specialized program, but is often text.

It's always text.

Yes, it is true that XML elements can sometimes be moved around, and
IF you do that, then you end up with a large delta.  Mark's argument
is exactly correct: the same thing is true for C, or C++.  But that's
not used as an argument for having C not be text, and for the same
reason it isn't a correct argument for XML not being text.

       paul


---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Karl Fogel <kf...@red-bean.com>.
"Mark Irving" <Ma...@informatix.co.uk> writes:
> First, sorry for not noticing past discussions on this subject. 
> My excuse is that I didn't read the mailing list search tool's 
> instructions carefully enough, and did not use the quotes when 
> searching for "mime-type" thus getting "mime" but not "type". 
> Sorry, again.
>
> Second, some background. I posted my original message after one 
> of my colleagues explained very, very plainly what he thought of 
> our version control system which wouldn't let him use svn blame 
> (more precisely, the TortoiseSVN equivalent) on a DTD. I didn't 
> know why at that stage, being fairly new to Subversion (we've 
> been using it for only two months). When I worked out that the 
> reason was because SVN thought it was not text, his opinion 
> became even plainer.

By the way, does 'svn blame --force' not do the trick here?
(I realize there are better solutions available, but this workaround
might help your colleague right now.)

-Karl

---------------------------------------------------------------------

Re: Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Erik Huelsmann <eh...@gmail.com>.
On Feb 5, 2008 7:47 PM, Paul Koning <Pa...@dell.com> wrote:
> >>>>> "John" == John Peacock <jo...@havurah-software.org> writes:
>
>  John> Mark Irving wrote:
>  >> An XML DTD is often, but not always, prepared with a text editor
>  >> or a syntax-aware text editor. Exactly the same claim can be made
>  >> about, say, a C++ source file. If SVN presents C++ source as text,
>  >> shouldn't it do the same for application/xml-dtd? The argument is
>  >> weaker for application/xml, which is more likely to be edited with
>  >> a specialized program, but is often text.
>
>  John> I've already responded several times to these threads
>  John> explaining that in the generic case, all XML files are not
>  John> "text documents" from the point of view of Subversion (or
>  John> ordinary diff tools for that matter).  So far, no one seems to
>  John> believe me, perhaps because I have not been using the
>  John> appropriate language.  Let me try again. ...
>
> I find Mark's C++ analogy a lot more convincing than the argument you
> gave.
>
> Another consideration is this: binary file functionality in Subversion
> is a strict subset of text file functionality.  So, even if it is true
> that for SOME XML files it is not helpful to consider it as a text
> file, treating XML as text cannot possibly do harm, but it often will
> be a benefit.  Conversely, treating XML as binary data cannot possibly
> be helpful, but it often will be a problem.
>
> Finally, there's the "principle of least astonishment".  If a file
> looks like text (if I open it in a text editor, I see printable
> characters -- which is the case for XML) then the expected behavior is
> that tools will treat it as text.


... which is *exactly* what Subversion will do if you don't set the
svn:mime-type property. So, basically, you're complaining here that
you intervened in a perfectly working system, and now it doesn't
work anymore. I can only say "Don't do that".

bye,

Erik.

---------------------------------------------------------------------

Re: Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Paul Koning <Pa...@dell.com>.
>>>>> "John" == John Peacock <jo...@havurah-software.org> writes:

 John> Mark Irving wrote:
 >> An XML DTD is often, but not always, prepared with a text editor
 >> or a syntax-aware text editor. Exactly the same claim can be made
 >> about, say, a C++ source file. If SVN presents C++ source as text,
 >> shouldn't it do the same for application/xml-dtd? The argument is
 >> weaker for application/xml, which is more likely to be edited with
 >> a specialized program, but is often text.

 John> I've already responded several times to these threads
 John> explaining that in the generic case, all XML files are not
 John> "text documents" from the point of view of Subversion (or
 John> ordinary diff tools for that matter).  So far, no one seems to
 John> believe me, perhaps because I have not been using the
 John> appropriate language.  Let me try again. ...

I find Mark's C++ analogy a lot more convincing than the argument you
gave.

Another consideration is this: binary file functionality in Subversion
is a strict subset of text file functionality.  So, even if it is true
that for SOME XML files it is not helpful to consider it as a text
file, treating XML as text cannot possibly do harm, but it often will
be a benefit.  Conversely, treating XML as binary data cannot possibly
be helpful, but it often will be a problem.

Finally, there's the "principle of least astonishment".  If a file
looks like text (if I open it in a text editor, I see printable
characters -- which is the case for XML) then the expected behavior is
that tools will treat it as text.

     paul



---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Thomas Scheffler <th...@uni-jena.de>.
On Wednesday, 6 February 2008, John Peacock wrote:
> Thomas Scheffler wrote:
> > In the latter case this means something like: if your grandmother
> > can read it, it is fine to use text/xml. In any other case, and if you
> > are not sure, "application/xml" is always right.
>
> All XML documents are application/xml (dogs).  Some XML documents are
> textual and can use text/xml (poodles).  All poodles are dogs, but not
> all dogs are poodles.  Does that make it clearer?
>
> Just because, in your experience, XML documents are textual, the RFC's
> clearly state that text/xml is the appropriate mime-type for those
> documents.

You probably read it differently than I do. I did not come to this 
conclusion, and neither did a few websites I consulted after the start of this 
discussion. But that was not the important point anyway.

> An important thing to remember here is that setting svn:mime-type to
> 'text/xml' affects *ONLY* Subversion's handling of the file.  You can
> treat these files as whatever other MIME type you want in any other
> application.  It would only affect how _Subversion_ would process the file.

Not if you use the webdav layer. The "svn:mime-type" property of the file is 
used for the HTTP response of Apache.

> In any case, I'm tired of going round and round on this topic.

/me too

> I've 
> started to investigate how to add a new svn:text-type property which
> would trump whatever svn:mime-type would normally use.  It isn't pretty,
> since there are a lot of places that check this.  It would be a one line
> patch to add an exception for 'application/xml*' but you simply haven't
> convinced me that this is the correct solution to anyone's problem
> except your own.

I could reply that there is no counter-evidence against my theory that, for 
any XML file out there, a proper diff or blame output is more useful 
than "binary files differ" or "skipping binary file". No one has argued against 
that so far.

Here is a short list of file types in our repository, with their mime types, 
that are also affected by this issue:
DTD     application/xml-dtd
JS      application/x-javascript
XHTML   application/xhtml+xml
XSL     application/xml or application/xslt+xml

All of them would be perfect for a nice diff and blame output. Honestly I do 
not see a reason why this should not be the default case.

Greets

Thomas

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Peacock <jo...@havurah-software.org>.
Thomas Scheffler wrote:
> In the latter case this means something like: if your grandmother can 
> read it, it is fine to use text/xml. In any other case, and if you are not 
> sure, "application/xml" is always right.

All XML documents are application/xml (dogs).  Some XML documents are 
textual and can use text/xml (poodles).  All poodles are dogs, but not 
all dogs are poodles.  Does that make it clearer?

If, in your experience, XML documents are textual, then the RFCs 
clearly state that text/xml is the appropriate mime-type for those 
documents.

An important thing to remember here is that setting svn:mime-type to 
'text/xml' affects *ONLY* Subversion's handling of the file.  You can 
treat these files as whatever other MIME type you want in any other 
application.  It would only affect how _Subversion_ would process the file.

In any case, I'm tired of going round and round on this topic.  I've 
started to investigate how to add a new svn:text-type property which 
would trump whatever svn:mime-type would normally use.  It isn't pretty, 
since there are a lot of places that check this.  It would be a one line 
patch to add an exception for 'application/xml*' but you simply haven't 
convinced me that this is the correct solution to anyone's problem 
except your own.

John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Thomas Scheffler <th...@uni-jena.de>.
On Tuesday, 5 February 2008, John Peacock wrote:
> Thomas Scheffler wrote:
> > What can be a good reason to store the same file over and over again,
> > when data structure and content do not change? Is that a common case now?
>
> All files are stored on the Subversion server as binary (using a fairly
> advanced scheme to track changes between versions), so this has nothing
> to do with your complaint.

You got me wrong. I am not complaining about how the diff is stored but about 
the user interface: I have little control over when a file is handled 
as a binary file, which means I cannot get usable blame and diff output.

> > I would really like to have a diff and see right away via svnnotify that
> > just two attributes changed their order, if that's the case. If Subversion
> > makes a diff for it and stores it, and I can possibly read it, why should I
> > be stuck with "binary files differ"? What is so great about that?
>
> You are missing the point.  All files that have MIME types of
> 'application/*' are by definition binary:
>
> 	http://www.ietf.org/rfc/rfc2046.txt

"application -- some other kind of data, typically either uninterpreted binary 
data or information to be processed by an application."

The "or information to be processed by an application" part is the most important 
one. That is the common case for XML here, and it is why XHTML documents should 
be "application/xhtml+xml" (http://www.w3.org/TR/xhtml-media-types/). 

>
> There is a MIME type, text/xml:
>
> 	http://www.ietf.org/rfc/rfc2376.txt
>
> which can be used for XML documents that are to be treated as text.  It
> would be wrong to make some exception for application/xml* which just
> isn't 100% correct.

The fact is that it would be perfect in nearly 100% of the cases out there.
RFC2376:
"Every XML entity is suitable for use with the application/xml media type 
without modification"

RFC2046:
"text -- textual information.[..] Other subtypes are to be used for enriched 
text in forms where application software may enhance the appearance of the 
text, but such software must not be required in order to get the general idea 
of the content."

In the latter case this means something like: if your grandmother can 
read it, it is fine to use text/xml. In any other case, and if you are not 
sure, "application/xml" is always right.

> > So imho there is a huge loss in function here compared to CVS. And you
> > have not convinced me so far that there is not.
>
> CVS had virtually no support for binary types at all, so you are really
> throwing out a red herring here...

CVS uses the kb flag to determine whether a file is binary or not. A user can set 
this flag as he likes. This has consequences for end-of-line characters and 
for the output of blame and diff. CVS does not guess whether a file is binary 
or not. The user has to say that it is binary; otherwise it is handled as text.

So you cannot reduce this to "Diffs are readable by people if and only 
if the mime type is text/*."

Greets

Thomas

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Aldridge <jo...@informatix.co.uk>.
John Peacock wrote:
 > Just because, in your experience, XML documents are textual, the RFC's
 > clearly state that text/xml is the appropriate mime-type for those
 > documents.

That's exactly the crux of our disagreement, I think. Here's the wording 
from RFC 3023 (which has obsoleted 2376)

    If an XML document -- that is, the unprocessed, source XML document
    -- is readable by casual users, text/xml is preferable to
    application/xml.  MIME user agents (and web user agents) that do not
    have explicit support for text/xml will treat it as text/plain, for
    example, by displaying the XML MIME entity as plain text.
    Application/xml is preferable when the XML MIME entity is unreadable
    by casual users.  Similarly, text/xml-external-parsed-entity is
    preferable when an external parsed entity is readable by casual
    users, but application/xml-external-parsed-entity is preferable when
    a plain text display is inappropriate.

       NOTE: Users are in general not used to text containing tags such
       as <price>, and often find such tags quite disorienting or
       annoying.  If one is not sure, the conservative principle would
       suggest using application/* instead of text/* so as not to put
       information in front of users that they will quite likely not
       understand.

I don't think that wording has anything to do with whether merging and 
diffing is appropriate.

> See, this is the thing I don't understand.  Don't set a mime-type on 
> your XML or DTD documents, and Subversion will assume they are text and 
> let you merge/blame/etc to your heart's content.  If you do want/need to 
> set the MIME type for some reason, just use 'text/xml' and Subversion 
> will be happy to treat the files as text.  No one needs to know that the 
> MIME type from Subversion's standpoint isn't "precisely accurate".
> 
> QED

Except that there is no text/xml-dtd mime type, and text/xml isn't right 
for a DTD.

However it's true that we don't at the moment do any processing which is 
affected by the mime-type, and so omitting the declaration or setting it 
to text/plain is a bearable workround.


Unless you indicate that you'd like to debate this further, I'm going to 
let this drop now -- I don't think we are going to reach an agreement, 
we have a workround which is adequate, and if you do find time to 
implement svn:text-type (thank you for looking into the possibility!) 
we'd have an even better one.

-- 
Cheers,
John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Peacock <jo...@havurah-software.org>.
John Aldridge wrote:
> John Peacock wrote:
>> There is a MIME type, text/xml:
>>
>>     http://www.ietf.org/rfc/rfc2376.txt
>>
>> which can be used for XML documents that are to be treated as text.
> 
> But that doesn't help my colleague's original problem, which is with a DTD.
> 

See, this is the thing I don't understand.  Don't set a mime-type on 
your XML or DTD documents, and Subversion will assume they are text and 
let you merge/blame/etc to your heart's content.  If you do want/need to 
set the MIME type for some reason, just use 'text/xml' and Subversion 
will be happy to treat the files as text.  No one needs to know that the 
MIME type from Subversion's standpoint isn't "precisely accurate".

QED

John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Aldridge <jo...@informatix.co.uk>.
John Peacock wrote:
> You are missing the point.  All files that have MIME types of 
> 'application/*' are by definition binary:
> 
>     http://www.ietf.org/rfc/rfc2046.txt

I think this is a misreading of the specifications. "application/*" 
means merely that the text is not written in natural language, not that 
it isn't mergeable text.

For example,

    http://www.ietf.org/rfc/rfc4329.txt

makes it clear that

    application/javascript

is the preferred type for a .js file (and that text/javascript is 
obsolete). I think we can safely assume that, if a mime-type were ever 
standardised for C++ source code, it would be "application/something".

> There is a MIME type, text/xml:
> 
>     http://www.ietf.org/rfc/rfc2376.txt
> 
> which can be used for XML documents that are to be treated as text.

But that doesn't help my colleague's original problem, which is with a DTD.

-- 
Cheers,
John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Peacock <jo...@havurah-software.org>.
Thomas Scheffler wrote:
> What can be a good reason to store the same file over and over again, when 
> data structure and content do not change? Is that a common case now?

All files are stored on the Subversion server as binary (using a fairly 
advanced scheme to track changes between versions), so this has nothing 
to do with your complaint.

> I would really like to have a diff and see right away via svnnotify that just 
> two attributes changed their order, if that's the case. If Subversion makes a 
> diff for it and stores it, and I can possibly read it, why should I be stuck 
> with "binary files differ"? What is so great about that?

You are missing the point.  All files that have MIME types of 
'application/*' are by definition binary:

	http://www.ietf.org/rfc/rfc2046.txt

There is a MIME type, text/xml:

	http://www.ietf.org/rfc/rfc2376.txt

which can be used for XML documents that are to be treated as text.  It 
would be wrong to make some exception for application/xml* which just 
isn't 100% correct.

> So imho there is a huge loss in function here compared to CVS. And you have 
> not convinced me so far that there is not.

CVS had virtually no support for binary types at all, so you are really 
throwing out a red herring here...

John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Thomas Scheffler <th...@uni-jena.de>.
On Tuesday, 5 February 2008, John Peacock wrote:
> Mark Irving wrote:
> > An XML DTD is often, but not always, prepared with a text editor
> > or a syntax-aware text editor. Exactly the same claim can be
> > made about, say, a C++ source file. If SVN presents C++ source
> > as text, shouldn't it do the same for application/xml-dtd? The
> > argument is weaker for application/xml, which is more likely to
> > be edited with a specialized program, but is often text.
>
> I've already responded several times to these threads explaining that in
> the generic case, all XML files are not "text documents" from the point
> of view of Subversion (or ordinary diff tools for that matter).  So far,
> no one seems to believe me, perhaps because I have not been using the
> appropriate language.  Let me try again.

Do I really have to explain why a diff showing the new attribute order of an 
equivalent XML file is better than "binary files differ"?

What can be a good reason to store the same file over and over again, when 
data structure and content do not change? Is that a common case now?

I would really like to have a diff and see right away via svnnotify that just 
two attributes changed their order, if that's the case. If Subversion makes a 
diff for it and stores it, and I can possibly read it, why should I be stuck 
with "binary files differ"? What is so great about that?

I mean I can reformat JAVA files a lot without changing the resulting class 
file in any bit. But if all I get is "binary files differ", the whole version 
CONTROL is useless because there is less control than there easily could be.

I mean I can print two md5 sums of two versions of a file and have more 
information and can easily figure out that they differ. Every line with + 
or - of a diff tells me that the versions differ plus a LOT more. And that 
is the point.
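
To make the comparison concrete, here is a small Python sketch (not anything
Subversion itself runs). The two sample revisions and the helper names are
invented for illustration: a checksum only says that the versions differ,
while a line diff also says where.

```python
import difflib
import hashlib

# Two made-up revisions of the same XML fragment: only the attribute order
# on the first line has changed.
old = '<item id="1" name="widget"/>\n<item id="2" name="gadget"/>\n'
new = '<item name="widget" id="1"/>\n<item id="2" name="gadget"/>\n'


def md5_hex(text):
    """Return the md5 checksum of text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def checksums_differ(a, b):
    """All a checksum comparison can say: same or different."""
    return md5_hex(a) != md5_hex(b)


def unified_diff_lines(a, b):
    """Return a unified diff of a and b as a list of lines."""
    return list(difflib.unified_diff(
        a.splitlines(keepends=True),
        b.splitlines(keepends=True),
        fromfile="r1",
        tofile="r2",
    ))


if __name__ == "__main__":
    print("checksums differ:", checksums_differ(old, new))
    print("".join(unified_diff_lines(old, new)), end="")
```

The diff marks the first line as removed and re-added with the attributes
swapped and leaves the second line as context, which is exactly the extra
information that "binary files differ" withholds.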

So imho there is a huge loss in function here compared to CVS. And you have 
not convinced me so far that there is not.

Greets

Thomas

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Peacock <jo...@havurah-software.org>.
John Aldridge wrote:
> Granted there will be cases where the text representing the XML infoset
> is changed radically although the infoset itself remains constant. Even
> though regarding the file as text is not useful in those circumstances,
> I don't see what harm is done. The worst would be that an inappropriate
> merge happened which resulted in the file becoming invalid XML -- again,
> I don't see a difference between that and a merge which renders C++ code
> uncompilable.

Well, for one thing unless you are using a DTD-aware editor, the odds 
are you won't necessarily notice a badly merged XML document change 
nearly as quickly as you would if your compile broke.

I'm going to continue to argue that it would be wrong to make 
application/xml* be anything other than binary, if only because at some 
point, Subversion *will* have the hooks to call an appropriate diff 
editor to resolve conflicts.

I can certainly appreciate that for *you*, and indeed for many users of 
XML documents, being able to treat them as text would be acceptable 
(i.e. you acknowledge /a priori/ that this mapping is imperfect, but 
most of the time it will be fine).  But I'd much rather there be a new 
property that overrides the binary/text handling, rather than globally 
forcing everyone to accept your assumptions.  If only there were more 
hours in the day, I could even offer to produce a patch to implement it 
in time for 1.5's release...

John

---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Aldridge <jo...@informatix.co.uk>.
John Peacock wrote:
> Mark Irving wrote:
>> An XML DTD is often, but not always, prepared with a text editor or a 
>> syntax-aware text editor. Exactly the same claim can be made about, 
>> say, a C++ source file. If SVN presents C++ source as text, shouldn't 
>> it do the same for application/xml-dtd? The argument is weaker for 
>> application/xml, which is more likely to be edited with a specialized 
>> program, but is often text.
> 
> I've already responded several times to these threads explaining that in 
> the generic case, all XML files are not "text documents" from the point 
> of view of Subversion (or ordinary diff tools for that matter).  So far, 
> no one seems to believe me, perhaps because I have not been using the 
> appropriate language.  Let me try again.
> 
> XML documents are *structured* documents that just so happen to be 
> [usually] stored as textual files.  By this I mean that the actual 
> makeup of the documents themselves are ASCII (or possibly UTF-8) 
> characters (which would normally be considered "text"); I'm ignoring 
> CDATA blocks for the moment.  But the overall structure of any XML 
> document is not necessarily fixed; there are transformations of the 
> textual representation that are equivalent XML documents.

I do believe you, honest! I just don't see why it matters :-)

As (my colleague) Mark pointed out, it's also true of C++ source files
(added whitespace is often semantically neutral, for example), and
no-one's arguing that C++ source should be treated as binary. For
another example, consider files with mime-type "text/html". Subversion
does regard these as text -- thank goodness -- although there are lots
of semantically neutral reformattings which can be applied.

Granted there will be cases where the text representing the XML infoset
is changed radically although the infoset itself remains constant. Even
though regarding the file as text is not useful in those circumstances,
I don't see what harm is done. The worst would be that an inappropriate
merge happened which resulted in the file becoming invalid XML -- again,
I don't see a difference between that and a merge which renders C++ code
uncompilable.

> Some of the future development of Subversion will probably include a way 
> to map a specific external tool to a specific MIME file type.  This 
> would allow you to use an XML diff tool for comparing changes to files, 
> just as it would allow you to use an image editing tool to compare 
> changes to jpeg's for example.

That would be splendid, but in the meantime treating XML as text would
be an improvement for anyone who either uses a text editor to edit XML
files, or consistently uses a single editing tool which writes a
reasonable canonical representation.

-- 
Cheers,
John


---------------------------------------------------------------------

Re: Action request: mime-type of xml-dtd should be treated as text

Posted by John Peacock <jo...@havurah-software.org>.
Mark Irving wrote:
> An XML DTD is often, but not always, prepared with a text editor 
> or a syntax-aware text editor. Exactly the same claim can be 
> made about, say, a C++ source file. If SVN presents C++ source 
> as text, shouldn't it do the same for application/xml-dtd? The 
> argument is weaker for application/xml, which is more likely to 
> be edited with a specialized program, but is often text.

I've already responded several times to these threads explaining that in 
the generic case, all XML files are not "text documents" from the point 
of view of Subversion (or ordinary diff tools for that matter).  So far, 
no one seems to believe me, perhaps because I have not been using the 
appropriate language.  Let me try again.

XML documents are *structured* documents that just so happen to be 
[usually] stored as textual files.  By this I mean that the actual 
makeup of the documents themselves are ASCII (or possibly UTF-8) 
characters (which would normally be considered "text"); I'm ignoring 
CDATA blocks for the moment.  But the overall structure of any XML 
document is not necessarily fixed; there are transformations of the 
textual representation that are equivalent XML documents.

The classic example is attributes, which are by definition an unordered 
list that applies to a given element.  You can change the order of these 
attributes in the XML file itself, and yet the XML document is 
identical.  This is why there are syntax-aware XML editors and diff 
programs that can handle this.
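
A minimal Python sketch of the attribute point, using only the standard
library. The element and attribute names are invented for illustration,
and this does not reflect how Subversion or any particular XML diff
tool compares files:

```python
import difflib
import xml.etree.ElementTree as ET

# Two serialisations of one element; only the attribute order differs.
# The element and attribute names are hypothetical.
first = '<book id="42" lang="en">Subversion</book>\n'
second = '<book lang="en" id="42">Subversion</book>\n'


def text_diff(a, b):
    """Line-based unified diff, roughly what a textual diff reports."""
    return list(difflib.unified_diff(a.splitlines(), b.splitlines(), lineterm=""))


def same_element(a, b):
    """Compare tag, attributes (an unordered mapping) and text content."""
    x, y = ET.fromstring(a), ET.fromstring(b)
    return (x.tag, x.attrib, x.text) == (y.tag, y.attrib, y.text)


textually_different = bool(text_diff(first, second))  # the diff is non-empty
structurally_equal = same_element(first, second)      # the parser sees one element
```

A line-based diff flags the whole line as changed, while the parsed
elements compare equal.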

In fact, under the XML 1.0 specification, the elements themselves in a 
well-formed document can be considered unordered as well, see this 
discussion:

	http://www-128.ibm.com/developerworks/xml/library/x-eleord.html


Just because most of the time, an XML parser will return the elements in 
document order, doesn't mean that this is the only valid representation 
of this particular XML document.  There are several XML parsers that I 
am aware of that reorder the document elements and attributes (since 
they use hashes during the parsing process) as a matter of course.

Does this make more sense?  In many cases, if there was a way to tag a 
given file as being textual under SVN (through the use of a new 
svn:textual attribute for example), then diff and blame would do the 
right thing.  But this would be accidental in the sense that some other 
tool could rewrite the XML document to a completely equivalent document 
and diff and blame would be completely useless.

Some of the future development of Subversion will probably include a way 
to map a specific external tool to a specific MIME file type.  This 
would allow you to use an XML diff tool for comparing changes to files, 
just as it would allow you to use an image editing tool to compare 
changes to JPEGs for example.

John


Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Mark Irving <Ma...@informatix.co.uk>.
I suggested adding more mime-type values to be recognised as 
text: application/xml-dtd, application/xml and 
application/*+xml.
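
For concreteness, the rule and the proposed additions can be modelled 
roughly as follows. This is a Python sketch of the matching logic only, 
not the C code of svn_mime_type_is_binary; the function name is_binary 
and the two pattern tuples are invented for illustration, and the list 
of current text types is the one quoted from the FAQ in my original 
message.

```python
from fnmatch import fnmatchcase

# Types the FAQ and source are said to treat as text today.
CURRENT_TEXT_PATTERNS = ("text/*", "image/x-xbitmap", "image/x-xpixmap")

# The additions proposed in this thread.
PROPOSED_TEXT_PATTERNS = (
    "application/xml-dtd",
    "application/xml",
    "application/*+xml",
)


def is_binary(mime_type, patterns=CURRENT_TEXT_PATTERNS):
    """Return True unless mime_type matches one of the text patterns."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = mime_type.split(";", 1)[0].strip().lower()
    return not any(fnmatchcase(essence, p) for p in patterns)


WITH_PROPOSAL = CURRENT_TEXT_PATTERNS + PROPOSED_TEXT_PATTERNS
dtd_binary_today = is_binary("application/xml-dtd")
dtd_binary_proposed = is_binary("application/xml-dtd", WITH_PROPOSAL)
```

Under this model a DTD is binary today and text with the additions, 
while types such as image/png stay binary either way.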

First, sorry for not noticing past discussions on this subject. 
My excuse is that I didn't read the mailing list search tool's 
instructions carefully enough, and did not use the quotes when 
searching for "mime-type" thus getting "mime" but not "type". 
Sorry, again.

Second, some background. I posted my original message after one 
of my colleagues explained very, very plainly what he thought of 
our version control system which wouldn't let him use svn blame 
(more precisely, the TortoiseSVN equivalent) on a DTD. I didn't 
know why at that stage, being fairly new to Subversion (we've 
been using it for only two months). When I worked out that the 
reason was that SVN thought it was not text, his opinion 
became even plainer.

An XML DTD is often, but not always, prepared with a text editor 
or a syntax-aware text editor. Exactly the same claim can be 
made about, say, a C++ source file. If SVN presents C++ source 
as text, shouldn't it do the same for application/xml-dtd? The 
argument is weaker for application/xml, which is more likely to 
be edited with a specialized program, but is often text.

I like Paul Koning's suggestion to make the presence of the 
svn:eol-style property define a file as text. If that worked, I 
would be using it. And thank you to Karl Fogel for adding this 
thread to 
http://subversion.tigris.org/issues/show_bug.cgi?id=1002.

 - Mark Irving


Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Karl Fogel <kf...@red-bean.com>.
"John Niven" <jn...@bravurasolutions.com> writes:
> FWIW I'm not too keen on changing mime-types - "application/xml-dtd" is
> a correct mime-type for DTD files.  Perhaps there should be an
> additional property that specifies text vs. binary, instead of using
> mime-type (or eol-style) to guess?

I think that makes sense.  When that property is absent, we can use
the mime-type and whatever else to guess; the new property would
simply govern the decision iff present.
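
A rough Python sketch of that precedence, purely to pin down the
logic. The property name "svn:textual" is borrowed from a suggestion
elsewhere in this thread and is hypothetical, as are the "yes"/"no"
values and the function names; no such property exists in Subversion.

```python
# Hypothetical override property; the thread does not settle on a name.
TEXT_OVERRIDE_PROP = "svn:textual"


def guess_is_text(props):
    """Fallback guess from svn:mime-type, per the rule quoted in this thread."""
    mime = props.get("svn:mime-type")
    if mime is None:
        # Assumption: a file with no svn:mime-type is treated as text,
        # which is why deleting the property works as a workaround.
        return True
    return mime.startswith("text/") or mime in ("image/x-xbitmap", "image/x-xpixmap")


def is_text(props):
    """The override decides if (and only if) present; otherwise guess."""
    override = props.get(TEXT_OVERRIDE_PROP)
    if override is not None:
        return override == "yes"  # assumed value convention
    return guess_is_text(props)
```

With this, {"svn:mime-type": "application/xml-dtd", "svn:textual":
"yes"} counts as text while keeping its correct mime-type.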

-Karl


RE: Action request: mime-type of xml-dtd should be treated as text

Posted by John Niven <jn...@bravurasolutions.com>.
(Response follows original message)

-----Original Message-----
From: Mark Irving [mailto:Mark.Irving@informatix.co.uk] 
Sent: Tuesday, 5 February 2008 01:23
To: users@subversion.tigris.org
Subject: Action request: mime-type of xml-dtd should be treated as text

I have a repository with a DTD file checked in and carrying the 
property svn:mime-type value application/xml-dtd, the correct 
type for it according to RFC 3023. It is inconvenient to do some 
Subversion operations on it, such as "blame", because Subversion 
thinks it is non-textual. It would be an improvement if XML mime 
types were recognised as text, and specifically this one.

As far as I can tell, the enhancement should be to function 
svn_mime_type_is_binary in subversion/libsvn_subr/validate.c. I 
suggest the following additions to the mime types treated as 
text.

application/xml-dtd
application/xml
application/*+xml

(The FAQ and the source agree that Subversion's text mime-types 
are text/*, image/x-xbitmap and image/x-xpixmap.)

Do others agree? Should this be entered as a formal SVN issue?

  - Mark Irving.
-----Original Message ends-----

There was a discussion about Subversion's handling of XML files on this
list a few days ago (archived at
http://svn.haxx.se/users/archive-2008-02/0014.shtml) - the consensus
then seemed to be that this behaviour was by design, i.e. that XML files
may very well contain binary content
(http://svn.haxx.se/users/archive-2008-02/0017.shtml).  The workaround
was to delete the svn:mime-type property, or change it to "text/..."
(http://svn.haxx.se/users/archive-2008-02/0034.shtml).

FWIW I'm not too keen on changing mime-types - "application/xml-dtd" is
a correct mime-type for DTD files.  Perhaps there should be an
additional property that specifies text vs. binary, instead of using
mime-type (or eol-style) to guess?


Hope this helps
John

--
John Niven 
Senior Developer
Bravura Solutions Limited
Level 1, Jonmer House, 95 Hurstmere Road 
Takapuna, Auckland

www.bravurasolutions.com 




Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Karl Fogel <kf...@red-bean.com>.
Thomas Scheffler <th...@uni-jena.de> writes:
> I would definitely like to have "application/xml" be treated as text, too.
>
> Maybe you can configure some mime-types that should be handled like text. Or, 
> better, the "kb" sticky bit of CVS could come to Subversion, so that 
> everything with it is binary and everything without it is not.
>
> Last week I posted on this issue with the subject "Subversion and binary 
> files". No solution was found though.
>
> So you have my "+1" for this suggestion.

Note that http://subversion.tigris.org/issues/show_bug.cgi?id=1002
exists to track this.  I've added this thread to that issue.

-Karl


Re: Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Paul Koning <Pa...@dell.com>.
>>>>> "Thomas" == Thomas Scheffler <th...@uni-jena.de> writes:

 Thomas> On Monday, 4 February 2008, Mark Irving wrote:
 >> I have a repository with a DTD file checked in and carrying the
 >> property svn:mime-type value application/xml-dtd, the correct type
 >> for it according to RFC 3023. It is inconvenient to do some
 >> Subversion operations on it, such as "blame", because Subversion
 >> thinks it is non-textual. It would be an improvement if XML mime
 >> types were recognised as text, and specifically this one.
 >> ...
 >> Do others agree? Should this be entered as a formal SVN issue?

 Thomas> I would definitely like to have "application/xml" be treated
 Thomas> as text, too.

Absolutely.

 Thomas> Maybe you can configure some mime-types that should be
 Thomas> handled like text. Or, better, the "kb" sticky bit of CVS
 Thomas> could come to Subversion, so that everything with it is
 Thomas> binary and everything without it is not.

It would make sense for the text-nature of files to be selectable.
One way would be a new property.

Another obvious approach is to tie it to the svn:eol-style property.
Clearly any file with that property is a text file.  So Subversion
should treat it that way.

       paul


Re: Action request: mime-type of xml-dtd should be treated as text

Posted by Thomas Scheffler <th...@uni-jena.de>.
On Monday, 4 February 2008, Mark Irving wrote:
> I have a repository with a DTD file checked in and carrying the
> property svn:mime-type value application/xml-dtd, the correct
> type for it according to RFC 3023. It is inconvenient to do some
> Subversion operations on it, such as "blame", because Subversion
> thinks it is non-textual. It would be an improvement if XML mime
> types were recognised as text, and specifically this one.
>
> As far as I can tell, the enhancement should be to function
> svn_mime_type_is_binary in subversion/libsvn_subr/validate.c. I
> suggest the following additions to the mime types treated as
> text.
>
> application/xml-dtd
> application/xml
> application/*+xml
>
> (The FAQ and the source agree that Subversion's text mime-types
> are text/*, image/x-xbitmap and image/x-xpixmap.)
>
> Do others agree? Should this be entered as a formal SVN issue?

I would definitely like to have "application/xml" be treated as text, too.

Maybe you can configure some mime-types that should be handled like text. Or, 
better, the "kb" sticky bit of CVS could come to Subversion, so that 
everything with it is binary and everything without it is not.

Last week I posted on this issue with the subject "Subversion and binary 
files". No solution was found though.

So you have my "+1" for this suggestion.

Greets

Thomas
