Posted to users@spamassassin.apache.org by "Jan P. Kessler" <sa...@jpkessler.info> on 2010/08/26 15:41:10 UTC

Re: Samples?

On 25.8.2010 22:47, Karsten Bräckelmann wrote:
>
> Jan, any chance you could provide the paragraphs or text parts
> corresponding to the seeks?
>
> Just to clarify: We do *not* require the full message, even though it
> makes things simpler. In fact, no headers (other than Subject) are ever
> used in the sought process.

Karsten,

I've been out of office (better: out-of-order ;)) for some days. Of
course I'll provide you with some samples. I've spoken to some of our
customers and they have agreed that I may submit the affected mails.

Let me prepare this and I'll send you a link. Are you able to analyze MS
Outlook .msg format?



> Anonymizing any personal data is perfectly fine. Moreover, the ham
> corpus for sought is not available publicly, but restricted to a few SA
> developers only.
>
> The rendered and normalized body text is used to prevent seeks from
> appearing in the automatically generated rules -- strings directly
> extracted from spam. Thus, by its nature, the FP string itself cannot
> possibly be confidential. :)

Understood and agreed ;)

Thank you for your efforts and the great work on SpamAssassin!

Cheers, Jan



Re: Samples?

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
This has already moved mostly off-list, but for the record...

On Thu, 2010-08-26 at 15:41 +0200, Jan P. Kessler wrote:
> I've been out of office (better: out-of-order ;)) for some days. Of
> course I'll provide you with some samples. I've spoken to some of our
> customers and they have agreed that I may submit the affected mails.

Great.

> Let me prepare this and I'll send you a link. Are you able to analyze MS
> Outlook .msg format?

Nope. Neither me nor SA. ;) Moreover, to avoid false results, the body
should be as unaltered as possible.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}