Posted to users@subversion.apache.org by Cagatay Catal <ca...@bte.mam.gov.tr> on 2005/06/16 12:31:56 UTC

binary detection algorithm in SVN

Hello,

I am examining features of SVN and reading some notes about it.

I have read Christophe Dupré's PowerPoint presentation, "Source Code
Revision Control with Subversion".

The presentation states that the binary detection algorithm sometimes
fails, and that something must then be set manually.

Why is that property not set by default? Or did I misunderstand something
about this?

* SVN has a binary detection algorithm, but it sometimes fails (PDF
  have a text header)
  - Need to set svn:mime-type property manually to
    application/octet-stream

Thanks a lot for any comments.

Best wishes,
Cagatay

Re: binary detection algorithm in SVN

Posted by Ben Collins-Sussman <su...@collab.net>.

On Jun 16, 2005, at 7:31 AM, Cagatay Catal wrote:

> Hello,
>
> I am examining features of SVN and reading some notes about it.
>
> I have read “Christophe Dupré” ‘s power point which is called  
> “Source Code Revision Control with Subversion”.
>
> In this this presentation it is stated that binary detection  
> algorithm sometime fails. Something must be set manually.
>
>
>
> Why is not that property set default? Or did I missunderstand sth  
> about this?

When you 'svn add' or 'svn import', svn uses a heuristic that tries to
guess whether a file is binary.  It examines the first N bytes of the
file and looks for non-ASCII characters or NUL bytes.  If a certain
percentage of them look this way, then the 'svn:mime-type' property
is automatically attached to the file with a value of
"application/octet-stream".  This prevents the svn client from
attempting contextual diffs and merges in the future.
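
As an illustration only, the heuristic described above can be sketched
in Python.  This is not Subversion's actual code: the name
looks_binary, the sample size of 1024 bytes (standing in for "the
first N bytes"), and the 15% threshold (standing in for "a certain
percentage") are all assumptions chosen for the example.

```python
# Hypothetical sketch of a "first N bytes" binary-detection heuristic.
# SAMPLE_SIZE and BINARY_THRESHOLD are assumed values for illustration.
SAMPLE_SIZE = 1024
BINARY_THRESHOLD = 0.15

# Bytes treated as text: control characters from bell (0x07) through
# carriage return (0x0D), plus printable ASCII (0x20 through 0x7E).
# NUL (0x00) and every byte >= 0x7F count as suspicious.
TEXT_BYTES = frozenset(range(0x07, 0x0E)) | frozenset(range(0x20, 0x7F))


def looks_binary(data: bytes) -> bool:
    """Guess whether data is binary by inspecting only its first bytes."""
    sample = data[:SAMPLE_SIZE]
    if not sample:
        return False  # an empty file is treated as text
    suspicious = sum(1 for byte in sample if byte not in TEXT_BYTES)
    return suspicious / len(sample) > BINARY_THRESHOLD


# A PDF-like file: a long, readable header followed by binary data that
# lies beyond the sampled region.
pdf_like = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n" * 30
    + b"\x00\xff" * 500
)
```

With these assumed values, looks_binary(pdf_like) returns False, because
the sampled region contains only readable bytes.  That is the kind of
miss the slide describes for PDF files.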

But there's no way this algorithm can be perfect; it's just educated
guessing.  The slide is saying that, in the case of PDF files, the
algorithm sometimes doesn't think the file is binary.  Yet PDF files are
definitely not line-based text files that can be contextually diffed
and merged, so humans need to intervene and set that property manually.
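
For reference, setting the property by hand is done with 'svn propset'.
The sketch below wraps that command in Python; the helper names
propset_binary_command and mark_as_binary, and the file name
manual.pdf, are hypothetical.  It assumes the svn client is on the
PATH and that the file is already under version control.

```python
import subprocess
from typing import List


def propset_binary_command(path: str) -> List[str]:
    """Build the argument list for marking one file as binary."""
    return ["svn", "propset", "svn:mime-type",
            "application/octet-stream", path]


def mark_as_binary(path: str) -> None:
    """Run 'svn propset' on path; raises CalledProcessError on failure."""
    subprocess.run(propset_binary_command(path), check=True)


# Equivalent shell command for a hypothetical file:
#   svn propset svn:mime-type application/octet-stream manual.pdf
command = propset_binary_command("manual.pdf")
```

The property change is a local modification like any other, so it
still has to be committed before other users see it.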

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: binary detection algorithm in SVN

Posted by kf...@collab.net.

"Cagatay Catal" <ca...@bte.mam.gov.tr> writes:
> Why is not that property set default? Or did I missunderstand sth about
> this?

It is not set by default because Subversion will not change a file by
default, even if Subversion "thinks" that file is text when it's
really binary.

If Subversion were like CVS (which expands keywords and transforms
line endings by default on text files), then it would be different.
But Subversion is more cautious than CVS.  You have to set the
svn:keywords and svn:eol-style properties for those things to happen.

The main problem that having a binary svn:mime-type prevents is with
the 'svn diff' command.  It helps in merges too, but even there, the
pristine plaintexts are available to work with if Subversion tries to
merge something that shouldn't be merged.

-Karl
