Posted to dev@subversion.apache.org by Ben Collins-Sussman <su...@collab.net> on 2001/08/10 21:19:25 UTC

another data point: perl

How Perl detects binary/text files:

"The -T and -B switches work as follows. The first block or so of the
file is examined for odd characters such as strange control codes or
metacharacters. If too many odd characters (>10%) are found, it's a -B
file, otherwise it's a -T file. Also, any file containing null in the
first block is considered a binary file."

   - from http://www.cs.cmu.edu/Web/People/rgs/pl-exp-op.html
     (found by Greg Stein)
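
For illustration, here is a rough Python sketch of the heuristic as the
quoted passage describes it. It is not Perl's actual implementation. The
names looks_binary and is_binary_file, the 512-byte block size, and the
exact definition of an "odd" byte (control bytes other than common
whitespace, plus DEL and bytes with the high bit set) are all assumptions
made for this example.

```python
# Sketch of the heuristic in the quoted passage; not Perl's real code.
# Assumption: tab, newline, carriage return, and form feed count as text.
TEXT_WHITESPACE = b"\t\n\r\f"


def looks_binary(block: bytes, threshold: float = 0.10) -> bool:
    """Guess whether a block of bytes comes from a binary file."""
    if not block:
        return False  # assumption: an empty block is treated as text
    if b"\x00" in block:
        return True  # any null byte in the block means binary
    # Assumption: "odd" = other control bytes, DEL, or high-bit bytes.
    odd = sum(
        1
        for b in block
        if (b < 0x20 and b not in TEXT_WHITESPACE) or b >= 0x7F
    )
    # Binary only if the share of odd bytes exceeds the threshold.
    return odd / len(block) > threshold


def is_binary_file(path: str, block_size: int = 512) -> bool:
    """Apply the heuristic to the first block of a file (size assumed)."""
    with open(path, "rb") as f:
        return looks_binary(f.read(block_size))
```

The threshold is a parameter so that other cutoffs can be tried. Note
that under this definition of "odd", text that is mostly non-ASCII
(UTF-8, for instance) would be classified as binary.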


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: another data point: perl

Posted by cm...@collab.net.
Ben Collins-Sussman <su...@collab.net> writes:

> How Perl detects binary/text files:
> 
> "The -T and -B switches work as follows. The first block or so of the
> file is examined for odd characters such as strange control codes or
> metacharacters. If too many odd characters (>10%) are found, it's a -B
> file, otherwise it's a -T file. Also, any file containing null in the
> first block is considered a binary file."
> 
>    - from http://www.cs.cmu.edu/Web/People/rgs/pl-exp-op.html
>      (found by Greg Stein)

Sweet.  I was just writing the default binary detector, and the
algorithm I chose was essentially the same (I was going to use a
threshold of 15% instead of 10%).
