Posted to users@spamassassin.apache.org by "babedh-dhra@biggdog.biz" <ba...@biggdog.biz> on 2010/04/29 14:39:49 UTC

[Copfilter] Copy of quarantined email - *** SPAM *** [8.9/7.0] Re: Filtering zip spam

Hi,

> Alex, does Bayes understand/check INSIDE zips, at least for file
> properties?  If not, then it is inherently limited (just in this

I'm not sure whether you're asking me rhetorically here. I really don't
know. Is it enough for Bayes to see the attachment as an encoded
string and match that against other strings, or must it first be
expanded into its real content?

> context), which is a big part of why this is such an effective
> technique.  Adding that to Bayes should be relatively straight
> forward, and should make zips less attractive to spammers.

It seems almost too obvious an addition, which makes me wonder why it
hasn't been done already.

> One simple approach is to score all "small" zips, then meta that
> with other characteristics, like ANY blocklist hit, "unusual"
> nation of origin, etc.

That's a good one. I'm not sure I'm at the point of writing rules to
match on attachment size, however.
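
A rough sketch of the "small zip plus another signal" idea in Python,
outside of SA itself. Everything named here is an assumption made for
illustration: the 50 KB cutoff, the function names, and the two boolean
inputs standing in for a blocklist hit and an "unusual" country of
origin. It is not how SA rules are written, only the logic of the meta.

```python
import email
from email import policy

# Hypothetical cutoff for what counts as a "small" zip; tune to your own ham.
SMALL_ZIP_BYTES = 50 * 1024


def small_zip_attachments(raw_message: bytes, limit: int = SMALL_ZIP_BYTES):
    """Return (filename, decoded_size) for each zip attachment under `limit` bytes."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    hits = []
    for part in msg.walk():
        name = (part.get_filename() or "").lower()
        is_zip = part.get_content_type() == "application/zip" or name.endswith(".zip")
        if not is_zip:
            continue
        # decode=True undoes base64/quoted-printable so we measure the real size.
        payload = part.get_payload(decode=True) or b""
        if len(payload) < limit:
            hits.append((name, len(payload)))
    return hits


def small_zip_meta(raw_message: bytes, blocklist_hit: bool, unusual_origin: bool) -> bool:
    """Fire only when a small zip coincides with at least one other signal."""
    return bool(small_zip_attachments(raw_message)) and (blocklist_hit or unusual_origin)
```

The small zip alone does not fire the combined check, which mirrors the
point of using a meta: the size test is cheap but only meaningful
together with other evidence.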

> That's how I first handled zips, a few years ago, and it's fairly
> effective.  Small zips in ham are VERY unusual, and typically are

Again, it's so obvious once you mention it that I'm surprised it's not
in the default rules, given that you've been doing it for a while. Is
there some side effect or drawback that would prevent it from being
rolled into a real SA release?

> To avoid FPs, I'm using the RealName-based rules I described almost
> three years ago (I have several "skip" rules daisy-chained off

I'll have to locate those. I didn't have much luck finding them after
a quick search. It's not the Google "I'm feeling lucky" discussion, right?

# Is this even still relevant?
http://old.nabble.com/Googlepages---Livefilestore-spams-td14715808.html

> Alex, as with all rules, it really depends on your ham ecology.

I agree to an extent, but there is a common reference point that we
all have, and I'd like to at least find that.

> Feel free to share more info about yours (we need the equivalent
> of the Geek Code for ham ecology!).  When you first started
> posting, I briefly assumed you were a college student, then
> gradually realized you have decent volume and diversity. :)

I appreciate that. I've been working with Linux since the beginning,
but I'm not a real Perl programmer.

> As I mentioned in a post in January, I had noticed a consistent
> value in an Image properties field which I was calculating, but
> not (at the time) exporting.

Is this it?

# Re: pill image spam learns to walk
http://marc.info/?l=spamassassin-users&m=126327771510366&w=2

Is there any progress on your work from that, which might benefit us here?

> Entire zip:
>    - number of files
>    - compression ratio (i.e. across ALL files)

Isn't this what the ClamAV and Sanesecurity signatures are for?
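
For reference, the two whole-zip properties quoted above (number of
files and overall compression ratio) can be read from the zip's central
directory without extracting anything. A minimal Python sketch, assuming
the attachment has already been decoded to raw bytes; the function name
zip_summary and the returned field names are made up for this example.

```python
import io
import zipfile


def zip_summary(data: bytes) -> dict:
    """Summarise a zip held in memory: file count and ratio across ALL files."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Skip directory entries so only real files are counted.
        infos = [info for info in zf.infolist() if not info.is_dir()]
    raw_bytes = sum(info.file_size for info in infos)          # uncompressed
    packed_bytes = sum(info.compress_size for info in infos)   # as stored
    # Ratio is packed/raw; None when the archive holds no data at all.
    ratio = packed_bytes / raw_bytes if raw_bytes else None
    return {
        "num_files": len(infos),
        "raw_bytes": raw_bytes,
        "packed_bytes": packed_bytes,
        "ratio": ratio,
    }
```

A corrupt or non-zip payload raises zipfile.BadZipFile here, so a caller
would need to catch that before treating the numbers as features.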

Thanks,
Alex