Posted to users@subversion.apache.org by kf...@collab.net on 2004/07/30 17:54:10 UTC

Re: Textual binaries

Marcus Sundman <su...@iki.fi> writes:
> > Subversion has an internal list of mime-types that it thinks are
> > textual:  text/*, and a few others.  Otherwise it assumes that any other
> > mime-type is binary, and un-diffable.
> 
> That's just stupid. There is no such thing as "text vs. binary"! All files
> are binary. There isn't such a thing as "plain text". A "text file" is a
> binary file in some particular encoding.
> 
> Why on earth don't you add support for mapping each mime-type to some kind 
> of plugin that knows how to diff/merge/annotate that particular type of 
> files?

Okay, time for the Nice Police.

Marcus, "stupid" won't get you very far here.  And the rest of your
first paragraph is, believe me, not news to any Subversion developers.

The answer to your question (which should have been asked without the
"on earth", by the way) is that we were trying to release Subversion
1.0 before the next ice age.  Therefore, in many areas we chose a
simple 90% solution that would work for most cases, and avoided
getting bogged down in complex 100% solutions that would take a
long time and be harder to get right.  This is a well-known philosophy
of software development, which holds that it is better to ship than
not to ship.

The feature you're requesting is a reasonable one, it just involves a
lot of UI design work, etc.  Those things take time, and this is all
volunteer labor.  If you'd like to contribute constructively to that
conversation, please start a thread on dev@subversion.tigris.org --
but only without the tautologies and without the whining.
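
To make the requested feature concrete, here is a minimal Python sketch of a
mime-type-to-handler mapping. It is only an illustration under stated
assumptions: the names `DIFF_HANDLERS`, `find_handler`, `line_diff` and
`opaque_diff` are hypothetical, nothing here is Subversion code, and the
default handler assumes UTF-8 input.

```python
import difflib


def line_diff(old, new):
    """Line-based unified diff. Assumption: both inputs are UTF-8 text."""
    old_lines = old.decode("utf-8").splitlines(keepends=True)
    new_lines = new.decode("utf-8").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile="old", tofile="new")
    )


def opaque_diff(old, new):
    """Fallback for types with no registered handler: only report a change."""
    return "" if old == new else "Binary files differ\n"


# Hypothetical registry: exact mime-types or "family/*" wildcards map to
# a function that knows how to diff that kind of content.
DIFF_HANDLERS = {"text/*": line_diff}


def find_handler(mime_type):
    """Prefer an exact match, then the family wildcard, then the fallback."""
    exact = DIFF_HANDLERS.get(mime_type)
    if exact is not None:
        return exact
    family = mime_type.split("/", 1)[0] + "/*"
    return DIFF_HANDLERS.get(family, opaque_diff)
```

A type-specific handler would be added by registering another entry in the
mapping. The hard part mentioned above (UI design, configuration, merge and
annotate support) is exactly what this sketch leaves out.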

I assume you've seen

   http://subversion.tigris.org/issues/show_bug.cgi?id=1002
   http://subversion.tigris.org/issues/show_bug.cgi?id=1233

both of which are more than a year old and which are semi-related to
this problem.

-Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Textual binaries

Posted by Marcus Sundman <su...@iki.fi>.
> > Great! Somehow developers in general seem to have some weird problem
> > with understanding the relationship between text and encodings.
>
> There's no such confusion going on here.  Honest.  We're perfectly aware
> that there are unicode text files that can be encoded with null bytes,
> and so on.

That's very nice to hear.

> We're simply using the word "text file" to mean: "a line-based file that
> can be contextually diffed with the usual diff/patch tools".  Nothing
> more.

OK. (I'm not sure exactly which files work with the GNU diff/patch tools,
but I can find that out on my own.)

By the way, how does it know how to display the result to the user? I mean, 
one "text file" might be in UTF-8 while another one is in ISO-8859-15 and 
yet another one in cp850. Will it display all files as if they were in
whatever happens to be the user's local default encoding or will there be 
some transcoding or what?
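
The concern can be shown with a short Python sketch. It is independent of
Subversion and only demonstrates that the same character has different byte
representations in the three encodings named above, so bytes decoded with the
wrong encoding come out wrong or fail to decode. The helper names are
hypothetical.

```python
ENCODINGS = ("utf-8", "iso-8859-15", "cp850")


def encode_everywhere(text):
    """Return the byte representation of `text` in each encoding."""
    return {name: text.encode(name) for name in ENCODINGS}


def decode_as(raw, encoding):
    """Decode `raw` with `encoding`; return None if the bytes are invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # The same character yields three different byte sequences.
    for name, raw in encode_everywhere("é").items():
        print(name, raw.hex())

    # Bytes written as ISO-8859-15, then read back under each assumption.
    raw = "é".encode("iso-8859-15")
    for name in ENCODINGS:
        print(name, repr(decode_as(raw, name)))
```

A tool that simply writes the stored bytes to the terminal is implicitly
choosing the last behaviour: the result is only readable when the file's
encoding happens to match the user's locale.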


- Marcus Sundman


Re: Textual binaries

Posted by Ben Collins-Sussman <su...@collab.net>.
On Fri, 2004-07-30 at 15:08, Marcus Sundman wrote:

> Great! Somehow developers in general seem to have some weird problem with 
> understanding the relationship between text and encodings. 

There's no such confusion going on here.  Honest.  We're perfectly aware
that there are unicode text files that can be encoded with null bytes,
and so on.

We're simply using the word "text file" to mean: "a line-based file that
can be contextually diffed with the usual diff/patch tools".  Nothing
more.
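
The rule described earlier in the thread (text/* and a few others count as
textual, every other mime-type as binary) can be sketched in a few lines of
Python. This is an illustration, not Subversion's implementation: the function
name is hypothetical, the thread does not list the "few others", so the
entries in `EXTRA_TEXTUAL_TYPES` are an assumption, and so is the treatment of
files with no mime-type at all.

```python
# Assumption: the thread only says "a few others"; these entries are
# placeholders chosen for illustration.
EXTRA_TEXTUAL_TYPES = frozenset({"image/x-xbitmap", "image/x-xpixmap"})


def is_textual(mime_type):
    """Return True if the mime-type would be treated as line-diffable."""
    if mime_type is None:
        # Assumption: a file with no mime-type set is treated as text.
        return True
    # Drop parameters such as "; charset=utf-8" before comparing.
    normalized = mime_type.split(";", 1)[0].strip().lower()
    return normalized.startswith("text/") or normalized in EXTRA_TEXTUAL_TYPES
```

Note that the check looks only at the declared type, never at the file's
bytes, which is why "text file" here means no more than "treated as
line-diffable".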





Re: Textual binaries

Posted by Marcus Sundman <su...@iki.fi>.
> > > Subversion has an internal list of mime-types that it thinks are
> > > textual:  text/*, and a few others.  Otherwise it assumes that any
> > > other mime-type is binary, and un-diffable.
> >
> > That's just stupid. There is no such thing as "text vs. binary"! All
> > files are binary. There isn't such a thing as "plain text". A "text
> > file" is
> >
>
> Marcus, "stupid" won't get you very far here.

Sorry if anyone felt offended. However, the solution *is* stupid. It might 
be better in the short term but it still is very stupid.

> And the rest of your first paragraph is, believe me, not news to any
> Subversion developers. 

Great! Somehow developers in general seem to have some weird problem with 
understanding the relationship between text and encodings. I can't 
understand why this is hard to understand, but I've noticed that it's very
common, even to this day.

> > Why on earth don't you add support for mapping each mime-type to some
> > kind of plugin that knows how to diff/merge/annotate that particular
> > type of files?
>
> The answer to your question (which should have been asked without the
> "on earth", by the way) is that we were trying to release Subversion
> 1.0 before the next ice age.  Therefore, in many areas we chose a
> simple 90% solution that would work for most cases, and avoided
> getting bogged down in complex 100% solutions that would take a
> long time and be harder to get right.

OK, fair enough. Unfortunately it seems that once there is something that 
seems to work in 80% of all cases people won't develop it further unless 
the defects go past their pain threshold, which at this point has gone up
very high. If you strive for a 100% solution, on the other hand, once you 
have all bugs ironed out you will have a 100% solution.

> This is a well-known philosophy of software development, which holds that
> it is better to ship than not to ship.

Yeah, "less is more" and "worse is better".

> The feature you're requesting is a reasonable one, it just involves a
> lot of UI design work, etc.  Those things take time, and this is all
> volunteer labor.  If you'd like to contribute constructively to that
> conversation, please start a thread on dev@subversion.tigris.org --
> but only without the tautologies and without the whining.

I was about to some time ago, but after noticing that Subversion seems to be
developed with the mindset of "fix what's wrong with CVS" rather than "do
things The Right Way(tm)" I decided I'd rather spend my time on arch or 
opencm. I wish the svn developers all the best, though.

> I assume you've seen
>
>    http://subversion.tigris.org/issues/show_bug.cgi?id=1002
>    http://subversion.tigris.org/issues/show_bug.cgi?id=1233
>
> both of which are more than a year old and which are semi-related to
> this problem.

No, I hadn't, but now I have. Maybe it's just me, but I don't see how they 
are related to anything I said. To me 1002 seems to just reiterate the 
misconception of "text vs. binary" and 1233 seems to be about enhanced 
mime-type auto-detection or something.

Again, I apologize if I offended someone. That wasn't my intention.


- Marcus Sundman
