Posted to users@spamassassin.apache.org by Clay Davis <cd...@avionics-specialties.com> on 2007/01/05 20:34:22 UTC

CF files not formatted correctly; ASCII vs Binary?

Can someone give me an idea what is causing this and if it's causing a
problem with my SA config?

I am using wget to ftp download several of the SARE rules on a weekly
basis.  When I look at the rule on the SARE site, its formatting looks
normal (spaces, tabs, indents, etc.); however, after an ftp download to
my PC and opening it in Notepad, it's all run together and all the
formatting is gone.  Is this a result of ASCII vs binary?  Am I fouling
SA up?

Thanks gang,
Clay

RE: CF files not formatted correctly; ASCII vs Binary?

Posted by Clay Davis <cd...@avionics-specialties.com>.
Thanks, Dan.  I'll check it out.
Re,
Clay

>>> On 1/5/2007 at 4:20 PM, in message
<OE...@visioncomm.net>, "Dan Barker"
<db...@visioncomm.net> wrote:
Clay,

You have several replies about the difference between CR/LF and LF, but
nothing useful<g>.

To "LOOK" at the files, use Wordpad instead of Notepad. It handles either
line end.

SpamAssassin (actually perl) doesn't care in this instance. FTP'ing them in
ASCII won't hurt anything, but why mess with it?

You may want to look into SA-UPDATE anyhow. Much smoother auto-update.

Dan

-----Original Message-----
From: Clay Davis [mailto:cdavis@avionics-specialties.com] 
Sent: Friday, January 05, 2007 2:34 PM
To: users@spamassassin.apache.org 
Subject: CF files not formatted correctly; ASCII vs Binary?


Can someone give me an idea what is causing this and if it's causing a
problem with my SA config?

I am using wget to ftp download several of the SARE rules on a weekly
basis.  When I look at the rule on the SARE site, its formatting looks
normal (spaces, tabs, indents, etc.); however, after an ftp download to
my PC and opening it in Notepad, it's all run together and all the
formatting is gone.  Is this a result of ASCII vs binary?  Am I fouling
SA up?

Thanks gang,
Clay


RE: CF files not formatted correctly; ASCII vs Binary?

Posted by Clay Davis <cd...@avionics-specialties.com>.
Thanks, Neal.  Crimson reads it just fine.
clay

>>> On 1/5/2007 at 2:52 PM, in message
<F1...@usmsx01.us.langeveld.com>,
"Coffey, Neal" <nc...@langeveld.com> wrote:
Clay Davis wrote:
> I am using wget to ftp download several of the SARE rules on a weekly
> basis.  When I look at the rule on the SARE site, its formatting looks
> normal (spaces, tabs, indents, etc.); however, after an ftp download
> to my PC and opening it in Notepad, it's all run together and all the
> formatting is gone.  Is this a result of ASCII vs binary?  Am I
> fouling SA up?

It's a fundamental difference in how text files are stored on Windows
versus Unix systems.  In Unix/Linux, the "end of line" character is LF
(ASCII 10).  In Windows, "end of line" is marked with CR (ASCII 13)
followed by LF.

So, when you load up a Unix/Linux text file in Notepad, the LFs are
there but not the CRs, and so Notepad doesn't think the line has ended.
Text editors that are better than Notepad (most of them are; I like
Crimson Editor[1]) handle this fine.

Is it fouling up SA?  If you've got SA running under linux, probably
not, since the files are in linux-friendly form.  If you want to check,
use 'cat' to display the contents of the file on the linux machine.  If
it looks fine, then SA should be reading it properly.

[1] http://www.crimsoneditor.com/

Re: CF files not formatted correctly; ASCII vs Binary?

Posted by Chris Purves <ch...@northfolk.ca>.
John Rudd wrote:
> Clay Davis wrote:
>> Can someone give me an idea what is causing this and if it's causing a
>> problem with my SA config?
>>  
>> I am using wget to ftp download several of the SARE rules on a weekly
>> basis.  When I look at the rule on the SARE site, its formatting looks
>> normal (spaces, tabs, indents, etc.); however, after an ftp download to
>> my PC and opening it in Notepad, it's all run together and all the
>> formatting is gone.  Is this a result of ASCII vs binary?  Am I fouling
>> SA up?
>>
> 
> It's most likely a case of Windows vs Unix end-of-line format.  That 
> _would_ be fixed if you ftp'ed in text/ascii mode instead of binary mode 
> ... but you can also fix it if you have a simple unix2dos program on 
> your PC.
> 
> I don't know if SA under windows chokes on the file format differences, 
> though.  You might want to look into a text file editor that can deal 
> with both formats.  Probably vim can do it (which you'd probably need to 
> use with cygwin).
> 

You can run Vim directly in Windows.  I often use it to view files
authored on *nix, as the original poster is doing, and for regex work.

http://www.vim.org/download.php#pc

-- 
Chris


Re: CF files not formatted correctly; ASCII vs Binary?

Posted by John Rudd <jr...@ucsc.edu>.
Clay Davis wrote:
> Can someone give me an idea what is causing this and if it's causing a
> problem with my SA config?
>  
> I am using wget to ftp download several of the SARE rules on a weekly
> basis.  When I look at the rule on the SARE site, its formatting looks
> normal (spaces, tabs, indents, etc.); however, after an ftp download to
> my PC and opening it in Notepad, it's all run together and all the
> formatting is gone.  Is this a result of ASCII vs binary?  Am I fouling
> SA up?
> 

It's most likely a case of Windows vs Unix end-of-line format.  That 
_would_ be fixed if you ftp'ed in text/ascii mode instead of binary mode 
... but you can also fix it if you have a simple unix2dos program on 
your PC.
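
For anyone without a unix2dos program to hand, the conversion it performs
can be sketched in a few lines of Python.  This is only a rough
illustration, not the actual unix2dos tool; the function name unix_to_dos
and the sample bytes are made up for the example.

```python
def unix_to_dos(data: bytes) -> bytes:
    """Convert bare LF line endings to CRLF (hypothetical helper)."""
    # Normalise any existing CRLF pairs to LF first, so they are not
    # turned into CR CR LF by the second replace.
    normalised = data.replace(b"\r\n", b"\n")
    return normalised.replace(b"\n", b"\r\n")


# Sample input mixing Unix (LF) and Windows (CRLF) line endings.
sample = b"line one\nline two\r\nline three\n"
print(unix_to_dos(sample))
# prints b'line one\r\nline two\r\nline three\r\n'
```

To apply this to a real file, read and write it in binary mode ("rb" and
"wb") so Python does not translate the line endings itself.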

I don't know if SA under windows chokes on the file format differences, 
though.  You might want to look into a text file editor that can deal 
with both formats.  Probably vim can do it (which you'd probably need to 
use with cygwin).

RE: CF files not formatted correctly; ASCII vs Binary?

Posted by "Coffey, Neal" <nc...@langeveld.com>.
Clay Davis wrote:
> I am using wget to ftp download several of the SARE rules on a weekly
> basis.  When I look at the rule on the SARE site, its formatting looks
> normal (spaces, tabs, indents, etc.); however, after an ftp download
> to my PC and opening it in Notepad, it's all run together and all the
> formatting is gone.  Is this a result of ASCII vs binary?  Am I
> fouling SA up?

It's a fundamental difference in how text files are stored on Windows
versus Unix systems.  In Unix/Linux, the "end of line" character is LF
(ASCII 10).  In Windows, "end of line" is marked with CR (ASCII 13)
followed by LF.
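
One way to see which convention a downloaded file uses is to count the
two kinds of line ending in its raw bytes.  The Python below is a minimal
sketch under that assumption; the function name count_line_endings and
the sample rule lines are invented for illustration and are not taken
from any SARE file.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (Windows) and bare LF (Unix) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get the bare LFs.
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf_only}


# Made-up sample lines in each convention.
unix_text = b"score RULE_A 1.0\nscore RULE_B 2.0\n"
windows_text = b"score RULE_A 1.0\r\nscore RULE_B 2.0\r\n"

print(count_line_endings(unix_text))     # prints {'crlf': 0, 'lf': 2}
print(count_line_endings(windows_text))  # prints {'crlf': 2, 'lf': 0}
```

A file that reports only bare LFs is in Unix form, which is the case
where Notepad shows everything run together.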

So, when you load up a Unix/Linux text file in Notepad, the LFs are
there but not the CRs, and so Notepad doesn't think the line has ended.
Text editors that are better than Notepad (most of them are; I like
Crimson Editor[1]) handle this fine.

Is it fouling up SA?  If you've got SA running under linux, probably
not, since the files are in linux-friendly form.  If you want to check,
use 'cat' to display the contents of the file on the linux machine.  If
it looks fine, then SA should be reading it properly.

[1] http://www.crimsoneditor.com/

RE: CF files not formatted correctly; ASCII vs Binary?

Posted by Dan Barker <db...@visioncomm.net>.
Clay,

You have several replies about the difference between CR/LF and LF, but
nothing useful<g>.

To "LOOK" at the files, use Wordpad instead of Notepad. It handles either
line end.

SpamAssassin (actually perl) doesn't care in this instance. FTP'ing them in
ASCII won't hurt anything, but why mess with it?

You may want to look into SA-UPDATE anyhow. Much smoother auto-update.

Dan

-----Original Message-----
From: Clay Davis [mailto:cdavis@avionics-specialties.com]
Sent: Friday, January 05, 2007 2:34 PM
To: users@spamassassin.apache.org
Subject: CF files not formatted correctly; ASCII vs Binary?


Can someone give me an idea what is causing this and if it's causing a
problem with my SA config?

I am using wget to ftp download several of the SARE rules on a weekly
basis.  When I look at the rule on the SARE site, its formatting looks
normal (spaces, tabs, indents, etc.); however, after an ftp download to
my PC and opening it in Notepad, it's all run together and all the
formatting is gone.  Is this a result of ASCII vs binary?  Am I fouling
SA up?

Thanks gang,
Clay