Posted to dev@subversion.apache.org by Karl Berry <ka...@freefriends.org> on 2005/01/15 21:03:35 UTC

svn and automatic line ending

Greetings,

Last April Jim Fulton from Zope began a thread on the subversion dev
list about setting svn:eol-style=native by default for anything that
svn's heuristic considers text.
http://svn.haxx.se/users/archive-2004-04/1363.shtml

I'm having the same issue -- importing a large repository (1+gb; it's
TeX Live, http://tug.org/texlive/), with new files being added not
infrequently (whenever there's a new TeX package, basically), and with
dozens of file extensions.  Attempting to specify all the extensions
would be a nightmare.  And so I would like to support Jim's idea of
making the "binary mime type property" error into a warning (or just
silencing it).

I'll repeat the problematic scenario from that thread, since it was some
months ago.  With the config settings:

   [miscellany]
   enable-auto-props = yes

   [auto-props]
   * = svn:eol-style=native

the proposal is to make

     $ svn add text1 image1.gif text2 image2.gif
     A text1
     svn: File 'image1.gif' has binary mime type property
     $

continue to add text2 and image2.gif, instead of giving up after the error.
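
Until something like that exists, a wrapper can approximate the "keep going" part by running svn add once per path. The following is only a minimal sketch, assuming Python 3 and the svn command-line client on PATH; the function name add_each and its run parameter are hypothetical, not part of Subversion. With the auto-props above, the .gif files would still be rejected; the sketch merely keeps one failure from stopping the remaining adds.

```python
import subprocess
import sys


def add_each(paths, run=subprocess.run):
    """Run `svn add` once per path and return the paths that failed."""
    failed = []
    for path in paths:
        # One invocation per file, so an error on one path cannot stop the rest.
        result = run(["svn", "add", path])
        if result.returncode != 0:
            print(f"warning: could not add {path}", file=sys.stderr)
            failed.append(path)
    return failed
```

Calling add_each(["text1", "image1.gif", "text2", "image2.gif"]) would attempt all four files and hand back the ones svn refused, which can then be dealt with separately.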

I understand that the binary/text heuristic is not perfect, and files
might be incorrectly marked with eol-style=native.  We can explicitly
fix those.  But having it guess correctly the vast majority of the time
would be a huge boon.

Automatic detection of text vs. binary files [was: svn and automatic line ending]

Posted by Julian Foad <ju...@btopenworld.com>.
Karl Berry wrote:
> BTW, are you interested in reports of files that the heuristic
> detects incorrectly?

Yes, please.  We don't expect it to be perfect, of course, but if there are any 
types of text files that Subversion frequently classifies as binary or vice 
versa, that would be good to know and we can probably improve it.

- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn and automatic line ending

Posted by Karl Berry <ka...@freefriends.org>.
Hi Julian,

Thanks very much for all the additional info.  I understand some of the
issues better now.

    you might not want to do this because you are not interested in the
    MIME types, and Subversion's binary/text guess works accurately
    enough for you, and you just want all the files that were guessed as
    text to have svn:eol-style set.

Exactly.  BTW, are you interested in reports of files that the heuristic
detects incorrectly?

    How about the ability to specify:
    + What properties to set for files detected as binary;
    + What properties to set for files detected as text.

That sounds just fine, and quite practical.  (I understand that
something that just solves setting eol-style for heuristic text files
isn't the way to go.)

    At least, if it is, then it needs to be done 
    with a hook so that arbitrary code can be invoked.

Anything that lets the job get done.  I can imagine that implementing a
totally general hook structure could be quite a bit more difficult than
the property-setting based on the heuristic result.  Might not be worth it.

    The auto-props mechanism is a very limited feature and it feels
    wrong to extend it in ad-hoc ways.

I understand, but both auto-props and the properties themselves (not
just eol-style, although that's perhaps the most important) are crucial
to getting a usable system (for multi-platform projects anyway), in
practice.

    as us supplying a script that can be customised to do exactly what
    each team wants.

If you mean a script of some sort that can be hooked into svn
add/import, then yes, that sounds fine too.  

Ultimately, I just want the users to be able to run simply "svn add".
I hope this isn't an unreasonable goal.

Best regards,
karl

Re: svn and automatic line ending

Posted by Julian Foad <ju...@btopenworld.com>.
Karl Berry wrote:
> No, I totally agree we don't want eol-style=native for binary files,
[...]

>     client obey svn:eol-style regardless of MIME type, because that
>     would be useful in the case of MIME types that Subversion doesn't
>     recognise but that are nevertheless for text files.
> 
> For mime types that subversion does not recognize, I can see the point.
> 
> For mime types that subversion knows are binary, like
> application/octet-stream, it seems it could only cause trouble to ever
> look at eol-style.  So I hope you don't do that.

Well, I agree that it seems dangerous to change the line endings in a binary 
file, but it would also be wrong for Subversion to silently ignore a setting 
that the user made.  Subversion doesn't actually have a very good idea of what 
is binary and what is text.  You say "binary, like application/octet-stream" 
but as far as I know that is the only MIME type that Subversion knows is 
binary.  The user knows better (sometimes).  Perhaps the best thing in this 
situation would be for it to give an error or a warning.


> Making eol-style settable based on mime-type, as well as file extension,
> seems like it gives users the flexibility to handle both known and
> unknown mime types in whatever way is desired.

Well, yes and no.

Note that Subversion doesn't know the MIME type of a file during "import" or 
"add".  The only thing it does is look at the beginning of the file to see if 
it looks "binary", and if so it sets svn:mime-type to 
"application/octet-stream".  Just before or after this (I hope it's well 
defined), the auto-props operate, and can set svn:mime-type according to file 
name.  This combination is powerful enough to set the MIME type correctly for 
many situations, but not as good as being able to run an external MIME type 
detecting program (e.g. the Unix "file" command).
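
To make the "looks binary" idea concrete, here is an illustrative heuristic of that general kind. It is a sketch under stated assumptions, not Subversion's actual code: the 1024-byte sample size, the 15% threshold, the set of bytes treated as text, and the name looks_binary are all assumptions chosen for the example.

```python
def looks_binary(path, sample_size=1024, threshold=0.15):
    """Guess whether a file is binary from its first bytes (illustrative only)."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    if not sample:
        return False  # treat an empty file as text
    if b"\x00" in sample:
        return True  # NUL bytes almost never appear in text files
    # Bytes counted as "text": printable ASCII plus common control characters.
    text_bytes = set(range(0x20, 0x7F)) | {0x07, 0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}
    odd = sum(1 for b in sample if b not in text_bytes)
    # Binary if the share of non-text bytes exceeds the threshold.
    return odd / len(sample) > threshold
```

A check like this will misjudge some files (for example, text in an encoding that uses many high-bit bytes), which is why a name-based auto-prop or an external detector can still be needed.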

There is little point enhancing the auto-props mechanism to allow testing 
svn:mime-type for various patterns, because if it wasn't set by an auto-prop 
depending on file name, then it is either "application/octet-stream" or 
nothing.  The only new bit of functionality needed is a way to detect this last 
bit.

After the import or add, you probably want to go through the files and give 
them better MIME types.  If you do this, then you might as well set 
svn:eol-style appropriately as well.  However, I can still see that you might 
not want to do this because you are not interested in the MIME types, and 
Subversion's binary/text guess works accurately enough for you, and you just 
want all the files that were guessed as text to have svn:eol-style set.

What new functionality would best accomplish this?

How about the ability to specify:

+ What properties to set for files detected as binary;
+ What properties to set for files detected as text.
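
As a rough illustration of what those two settings would amount to, the idea can be modelled as a table keyed by the detection result. This is purely a hypothetical sketch: Subversion has no such feature, and the names PROPS_BY_CLASS and props_for are invented for the example.

```python
# Hypothetical per-class property tables; this is not an existing Subversion feature.
PROPS_BY_CLASS = {
    "binary": {"svn:mime-type": "application/octet-stream"},
    "text": {"svn:eol-style": "native"},
}


def props_for(is_binary):
    """Return a copy of the properties to set for a file, given the binary/text guess."""
    return dict(PROPS_BY_CLASS["binary" if is_binary else "text"])
```

A team wanting an extra property on every text file would add it to the "text" entry, without any new special-case option.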

You see, I don't want another highly-specialised option such as one which just 
sets svn:eol-style when it doesn't set svn:mime-type.  Someone would just come 
along next month saying that that's all very well but they want to set 
"my-merge-mode=line-based" on all text files, as well as svn:eol-style.

On the other hand, what I suggested is only slightly more flexible.  I tend to 
conclude that it is not the job of "svn add" or "svn import" to set all the 
properties the way you want them.  At least, if it is, then it needs to be done 
with a hook so that arbitrary code can be invoked.

The auto-props mechanism is a very limited feature and it feels wrong to extend 
it in ad-hoc ways.


[...]
> We're all volunteers here, so I can't really complain that anything is
> unacceptable :).  However, it does seem to me that every project with
> cross-platform developers would like to have the behavior that binary
> files are binary, and text files have eol-style=native.  Making it
> easier for this to happen, one way or another, would remove one
> stumbling block to subversion adoption.

Yes, I pretty much agree.  I just don't like extending auto-props to accomplish 
that one particular setting, and would prefer a much more general approach such 
as us supplying a script that can be customised to do exactly what each team wants.


> P.S. Another stumbling block is not being able to specify the
> configuration once on the server, [...]

Yes.  That's being discussed again at the moment.  It's certainly wanted, but 
nobody yet seems to have taken on the non-trivial task of evaluating the needs 
and possibilities and coming up with a good design for it.  Several people have 
said things like "Why don't you just have a file on the server that is 
downloaded" or "You could do it with inheritable properties", but it needs a 
deeper proposal than that.

- Julian

Re: svn and automatic line ending

Posted by Karl Berry <ka...@freefriends.org>.
Hi Julian,

    Do you mean: give a warning and not add image1.gif; then continue to
    add text2 and try (but fail) to add image2.gif?  That might be
    reasonable.

The ideal is to add all the files requested, without further ado --
binaries as binaries, and text files with eol-style=native.

    I think you meant: add image1.gif and set its EOL style to "native"

No, I totally agree we don't want eol-style=native for binary files, and
wasn't suggesting that.  Sorry if that wasn't clear.  (That turned out
to be the stopper with my kludge attempts yesterday.  Philip, thanks for
the pointers on further hacking, but it sounds like pursuing the
mime-type approach is better anyway ...)

    client obey svn:eol-style regardless of MIME type, because that
    would be useful in the case of MIME types that Subversion doesn't
    recognise but that are nevertheless for text files.

For mime types that subversion does not recognize, I can see the point.

For mime types that subversion knows are binary, like
application/octet-stream, it seems it could only cause trouble to ever
look at eol-style.  So I hope you don't do that.

Making eol-style settable based on mime-type, as well as file extension,
seems like it gives users the flexibility to handle both known and
unknown mime types in whatever way is desired.

    It would be convenient if Subversion provided a way to do what you
    want directly, 

Yes :).  

    but I think a reasonable solution for your purpose is to run (after
    the "svn add") a simple program that sets svn:eol-style on all files
    that do not have svn:mime-type.  Would that be acceptable?

That does sound a whole lot better than trying to maintain a complete
file extension list.  Thanks very much for the idea; I think I'll do it.

We're all volunteers here, so I can't really complain that anything is
unacceptable :).  However, it does seem to me that every project with
cross-platform developers would like to have the behavior that binary
files are binary, and text files have eol-style=native.  Making it
easier for this to happen, one way or another, would remove one
stumbling block to subversion adoption.

Thanks for the replies,
karl

P.S. Another stumbling block is not being able to specify the
configuration once on the server, instead forcing each developer to set
the same properties.  Even CVS has repository configuration :).  I
recall that being discussed in the same thread last April as well, so I
won't belabor the point further ...

Re: svn and automatic line ending

Posted by Julian Foad <ju...@btopenworld.com>.
Karl Berry wrote:
>    [auto-props]
>    * = svn:eol-style=native
> 
> the proposal is to make
> 
>      $ svn add text1 image1.gif text2 image2.gif
>      A text1
>      svn: File 'image1.gif' has binary mime type property
>      $
> 
> continue to add text2 and image2.gif, instead of giving up after the error.

Do you mean: give a warning and not add image1.gif; then continue to add text2 
and try (but fail) to add image2.gif?  That might be reasonable.

I think you meant: add image1.gif and set its EOL style to "native" and give a 
warning; then continue.  That would be bad.  I strongly advise against setting 
svn:eol-style on non-text files.  Even if the current client ignores it when 
the MIME type indicates non-text, we have had a discussion about making the 
client obey svn:eol-style regardless of MIME type, because that would be useful 
in the case of MIME types that Subversion doesn't recognise but that are 
nevertheless for text files.


Subversion marks files that it thinks are binary with 
"svn:mime-type=application/octet-stream".  What you want is for all other files 
to have "svn:eol-style=native".  It would be convenient if Subversion provided 
a way to do what you want directly, but I think a reasonable solution for your 
purpose is to run (after the "svn add") a simple program that sets 
svn:eol-style on all files that do not have svn:mime-type.  Would that be 
acceptable?
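
Such a program could be as small as the following sketch. It assumes Python 3.7+ and the svn command-line client on PATH, and uses only the svn propget and svn propset subcommands; the names _svn and set_native_eol are hypothetical, and the list of newly added paths is assumed to be supplied by the caller.

```python
import subprocess


def _svn(args):
    """Run svn with the given arguments and return its standard output as text."""
    return subprocess.run(["svn", *args], capture_output=True, text=True).stdout


def set_native_eol(paths, svn=_svn):
    """Set svn:eol-style=native on every path that has no svn:mime-type."""
    changed = []
    for path in paths:
        # An empty propget result is taken to mean no svn:mime-type is set.
        if svn(["propget", "svn:mime-type", path]).strip():
            continue  # a MIME type is set (e.g. application/octet-stream): leave it alone
        svn(["propset", "svn:eol-style", "native", path])
        changed.append(path)
    return changed
```

Run after the "svn add" and before the commit, set_native_eol(["text1", "image1.gif", "text2"]) would be expected to touch only the files that the add left without a MIME type.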

- Julian
