Posted to ruleqa@spamassassin.apache.org by Marc Andre Selig <a2...@sedacon.com> on 2012/08/12 20:06:31 UTC

Timing; report_safe messages; mbox files

Hi all,

Now that it's Sunday, I'm finally getting around to setting up the mass
check scripts.  Thanks for setting up the account, by the way. :)

I've got three questions:

1. My work machine is a laptop that does not run continuously.  What do
I do if it happens to be sleeping at 9 a.m. UTC?  Skip the mass check
for that day, or just run it at the earliest point possible?

2. Do I understand the code correctly when I assume that I can just leave
report_safe messages as they are?  I.e. there's no need to remove the
report_safe encapsulation before putting the messages in the spam corpus?

3. I am having trouble using corpus files in mbox format.  I just started
with a handful of messages to try things out, namely 108 ham messages
and 288 spam messages.  If I put the messages into maildir folders, the
log files have 114 lines for ham (seeing that there are 6 header lines,
that seems to be all right) and 291 lines for spam (so I assume there are
a few duplicates left).  However, if I put the same messages into two
mbox files (and change the config file correspondingly), the log files have
13 lines for ham and 291 lines for spam.  Is there anything special I
have to do to use mbox?

Thanks in advance!

Regards
Marc

Re: Timing; report_safe messages; mbox files

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 8/12/2012 2:06 PM, Marc Andre Selig wrote:
> Hi all,
>
> Now that it's Sunday, I'm finally getting around to setting up the mass
> check scripts.  Thanks for setting up the account, by the way. :)
>
> I've got three questions:
>
> 1. My work machine is a laptop that does not run continuously.  What do
> I do if it happens to be sleeping at 9 a.m. UTC?  Skip the mass check
> for that day, or just run it at the earliest point possible?
I would say go ahead and run it.  The worst that can happen if you submit
it too late is that it isn't used.
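
The "run it at the earliest point possible" approach could be sketched
roughly as below.  This is only an illustration, not part of the mass check
scripts: the function name should_run and the idea of remembering the date
of the last run are assumptions; only the 9 a.m. UTC start time comes from
the question.

```python
from datetime import date, datetime, time, timezone
from typing import Optional

# From the question: the daily mass check is expected at 9 a.m. UTC.
DAILY_START_UTC = time(9, 0)


def should_run(now: datetime, last_run: Optional[date]) -> bool:
    """Return True if today's mass check is due and has not been run yet.

    `now` must be timezone-aware.  `last_run` is the UTC date of the
    previous run (hypothetical bookkeeping), or None if there was none.
    """
    now_utc = now.astimezone(timezone.utc)
    if now_utc.time() < DAILY_START_UTC:
        return False  # today's start time has not been reached yet
    return last_run is None or last_run < now_utc.date()
```

A laptop could call something like this on wake-up or from a frequent cron
job, so that a missed 9 a.m. slot is caught up later the same day.
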
> 2. Do I understand the code correctly when I assume that I can just leave
> report_safe messages as they are?  I.e. there's no need to remove the
> report_safe encapsulation before putting the messages in the spam corpus?
That's my understanding.
>
> 3. I am having trouble using corpus files in mbox format.  I just started
> with a handful of messages to try things out, namely 108 ham messages
> and 288 spam messages.  If I put the messages into maildir folders, the
> log files have 114 lines for ham (seeing that there are 6 header lines,
> that seems to be all right) and 291 lines for spam (so I assume there are
> a few duplicates left).  However, if I put the same messages into two
> mbox files (and change the config file correspondingly), the log files have
> 13 lines for ham and 291 lines for spam.  Is there anything special I
> have to do to use mbox?
>
> Thanks in advance!
I believe mbox support is currently broken, going by another ticket, sorry.
Too many fronts being battled right now!  We will get mbox working.
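
In the meantime, one way to check that an mbox file itself is intact is to
count its messages independently and compare that with the number of
result lines in the log.  The following is a minimal sketch using Python's
standard mailbox module, not anything taken from the mass check scripts.
The function names are made up for illustration, and it assumes that the
header lines in the log start with '#'.

```python
import mailbox


def count_mbox(path: str) -> int:
    """Number of messages Python's mailbox module finds in an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def count_maildir(path: str) -> int:
    """Number of messages in a maildir folder."""
    return len(mailbox.Maildir(path, create=False))


def count_result_lines(log_path: str) -> int:
    """Count non-empty log lines that are not header lines.

    Assumption: header lines in the log start with '#'.
    """
    with open(log_path, encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.strip() and not line.startswith("#"))
```

With the numbers from the question, 114 log lines minus 6 header lines
gives 108, which matches the ham count.  If the mbox run writes the same 6
header lines, its 13 lines would correspond to only 7 ham messages.
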

-- 
*Kevin A. McGrail*
President

Peregrine Computer Consultants Corporation
3927 Old Lee Highway, Suite 102-C
Fairfax, VA 22030-2422

http://www.pccc.com/

703-359-9700 x50 / 800-823-8402 (Toll-Free)
703-359-8451 (fax)
KMcGrail@PCCC.com