Posted to users@spamassassin.apache.org by crisppy fernandes <cr...@gmail.com> on 2005/03/06 13:27:58 UTC

SpamAssassin not working. How to train Bayes and make it work from the beginning.

My question, which has now become a problem, is this: if a new user has
installed SpamAssassin 3.0.2 and wants it to identify spam, how can they
get it working quickly?
I have been working on it since last week, but things do not seem to click.

I have run these commands manually on the corpus available at spamassassin.org:
sa-learn --spam spam_2/*
sa-learn --ham easy_ham/*

Then sa-learn --dump magic shows this:
-bash-2.05b$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       1387          0  non-token data: nspam
0.000          0       1412          0  non-token data: nham
0.000          0     142988          0  non-token data: ntokens
0.000          0 1017036637          0  non-token data: oldest atime
0.000          0 1109068323          0  non-token data: newest atime
0.000          0 1109064441          0  non-token data: last journal sync atime
0.000          0 1109059988          0  non-token data: last expiry atime
0.000          0   22118400          0  non-token data: last expire atime delta
0.000          0      10864          0  non-token data: last expire reduction count
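
As a quick sanity check on the counts above, here is a minimal sketch that reads saved `sa-learn --dump magic` output and reports whether nspam and nham reach a minimum. The function name `check_bayes_counts` is made up for illustration, and the threshold of 200 is an assumption based on SpamAssassin's documented defaults for bayes_min_spam_num and bayes_min_ham_num; adjust it if your configuration differs.

```shell
# Sketch only: check_bayes_counts is a hypothetical helper, not part of SpamAssassin.
# Reads `sa-learn --dump magic` output on stdin; optional first argument is the
# minimum number of learned messages per class (assumed default: 200).
check_bayes_counts() {
  min="${1:-200}"
  awk -v min="$min" '
    # The third column of the "magic" lines holds the value.
    /non-token data: nspam$/ { nspam = $3 }
    /non-token data: nham$/  { nham  = $3 }
    END {
      printf "nspam=%d nham=%d min=%d\n", nspam, nham, min
      if (nspam >= min && nham >= min) {
        print "bayes: enough training data"
      } else {
        print "bayes: not enough training data"
      }
    }
  '
}
```

Usage would be `sa-learn --dump magic | check_bayes_counts`. With the counts shown above (1387 spam, 1412 ham) it reports enough training data, so a too-small corpus is probably not the cause here.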


But even after sending around 500 spam mails, it still does not identify
incoming mail as spam.
Moreover, even when I lower the required score (required_score 1), it
still does not work.
The same setup works absolutely fine on version 2.63, but not here.

Any help is welcome.

Crisppy f.

Output of spamassassin -D --lint:
-------------------------------------------------------------------------------------------------------------------------------
 
-bash-2.05b$ spamassassin -D --lint

debug: SpamAssassin version 3.0.2
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/home/admin17/bin', which doesn't exist, dropping.
debug: Final PATH set to: /usr/local/bin:/bin:/usr/bin
debug: diag: module installed: DBI, version 1.37
debug: diag: module installed: DB_File, version 1.808
debug: diag: module installed: Digest::SHA1, version 2.01
debug: diag: module installed: IO::Socket::UNIX, version 1.21
debug: diag: module installed: MIME::Base64, version 2.21
debug: diag: module installed: Net::DNS, version 0.45
debug: diag: module not installed: Net::LDAP ('require' failed)
debug: diag: module not installed: Razor2::Client::Agent ('require' failed)
debug: diag: module installed: Storable, version 2.09
debug: diag: module installed: URI, version 1.21
debug: ignore: using a test message to lint rules
debug: using "/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /etc/mail/spamassassin/init.pre
debug: using "/usr/share/spamassassin" for default rules dir
debug: config: read file /usr/share/spamassassin/10_misc.cf
debug: config: read file /usr/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/share/spamassassin/20_compensate.cf
debug: config: read file /usr/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/share/spamassassin/20_drugs.cf
debug: config: read file /usr/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/share/spamassassin/20_phrases.cf
debug: config: read file /usr/share/spamassassin/20_porn.cf
debug: config: read file /usr/share/spamassassin/20_ratware.cf
debug: config: read file /usr/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/share/spamassassin/23_bayes.cf
debug: config: read file /usr/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/share/spamassassin/25_spf.cf
debug: config: read file /usr/share/spamassassin/25_uribl.cf
debug: config: read file /usr/share/spamassassin/30_text_de.cf
debug: config: read file /usr/share/spamassassin/30_text_fr.cf
debug: config: read file /usr/share/spamassassin/30_text_nl.cf
debug: config: read file /usr/share/spamassassin/30_text_pl.cf
debug: config: read file /usr/share/spamassassin/50_scores.cf
debug: config: read file /usr/share/spamassassin/60_whitelist.cf
debug: using "/etc/mail/spamassassin" for site rules dir
debug: config: read file /etc/mail/spamassassin/local.cf
debug: using "/home/admin17/.spamassassin" for user state dir
debug: using "/home/admin17/.spamassassin/user_prefs" for user prefs file
debug: config: read file /home/admin17/.spamassassin/user_prefs
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x93a7d30)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x93a7d30) implements 'parse_config'
debug: using "/home/admin17/.spamassassin" for user state dir
debug: bayes: 10692 tie-ing to DB file R/O /home/admin17/.spamassassin/bayes_toks
debug: bayes: 10692 tie-ing to DB file R/O /home/admin17/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: using "/home/admin17/.spamassassin" for user state dir
debug: Score set 3 chosen.
debug: ---- MIME PARSER START ----
debug: main message type: text/plain
debug: parsing normal part
debug: added part, type: text/plain
debug: ---- MIME PARSER END ----
debug: metadata: X-Spam-Relays-Trusted:
debug: metadata: X-Spam-Relays-Untrusted:
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc) implements 'parsed_metadata'
debug: is Net::DNS::Resolver available? yes
debug: Net::DNS version: 0.45
debug: trying (3) cingular.com...
debug: looking up NS for 'cingular.com'
debug: NS lookup of cingular.com succeeded => Dns available (set dns_available to hardcode)
debug: is DNS available? 1
debug: decoding: no encoding detected
debug: URIDNSBL: domains to query:
debug: all '*From' addrs: ignore@compiling.spamassassin.taint.org
debug: Running tests for priority: 0
debug: running header regexp tests; score so far=0
debug: registering glue method for check_hashcash_double_spend (Mail::SpamAssassin::Plugin::Hashcash=HASH(0x93a7d30))
debug: registering glue method for check_for_spf_helo_pass (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: SPF: message was delivered entirely via trusted relays, not required
debug: registering glue method for check_hashcash_value (Mail::SpamAssassin::Plugin::Hashcash=HASH(0x93a7d30))
debug: all '*To' addrs:
debug: registering glue method for check_for_spf_softfail (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: SPF: message was delivered entirely via trusted relays, not required
debug: registering glue method for check_for_spf_pass (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: registering glue method for check_for_spf_helo_softfail (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: registering glue method for check_for_spf_fail (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: registering glue method for check_for_spf_helo_fail (Mail::SpamAssassin::Plugin::SPF=HASH(0x93841fc))
debug: running body-text per-line regexp tests; score so far=-3.174
debug: running uri tests; score so far=-3.174
debug: registering glue method for check_uridnsbl (Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc))
debug: bayes corpus size: nspam = 1387, nham = 1412
debug: tokenize: header tokens for *F = "U*ignore D*compiling.spamassassin.taint.org D*spamassassin.taint.org D*taint.org D*org"
debug: tokenize: header tokens for *m = "  1109068933 lint_rules "
debug: tokenize: header tokens for *RT = " "
debug: tokenize: header tokens for *RU = " "
debug: bayes token 'somewhat' => 0.00356291390728477
debug: bayes token 'H*F:D*org' => 0.070175390012658
debug: bayes: score = 0.0923704635181808
debug: bayes: 10692 untie-ing
debug: bayes: 10692 untie-ing db_toks
debug: bayes: 10692 untie-ing db_seen
debug: Razor2 is not available
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc) implements 'check_tick'
debug: running raw-body-text per-line regexp tests; score so far=-5.125
debug: running full-text regexp tests; score so far=-5.125
debug: Razor2 is not available
debug: Current PATH is: /usr/local/bin:/bin:/usr/bin
debug: Pyzor is not available: pyzor not found
debug: DCCifd is not available: no r/w dccifd socket found.
debug: DCC is not available: no executable dccproc found.
debug: Running tests for priority: 500
debug: RBL: success for 1 of 1 queries
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8c0f7dc) implements 'check_post_dnsbl'
debug: running meta tests; score so far=-5.125
debug: running header regexp tests; score so far=-3.899
debug: running body-text per-line regexp tests; score so far=-3.899
debug: running uri tests; score so far=-3.899
debug: running raw-body-text per-line regexp tests; score so far=-3.899
debug: running full-text regexp tests; score so far=-3.899
debug: Running tests for priority: 1000
debug: running meta tests; score so far=-3.899
debug: running header regexp tests; score so far=-3.899
debug: using "/home/admin17/.spamassassin" for user state dir
debug: lock: 10692 created /home/admin17/.spamassassin/auto-whitelist.lock.svohra13.india.ensim.com.10692
debug: lock: 10692 trying to get lock on /home/admin17/.spamassassin/auto-whitelist with 0 retries
debug: lock: 10692 link to /home/admin17/.spamassassin/auto-whitelist.lock: link ok
debug: Tie-ing to DB file R/W in /home/admin17/.spamassassin/auto-whitelist
debug: auto-whitelist (db-based): ignore@compiling.spamassassin.taint.org|ip=none scores 0/0
debug: AWL active, pre-score: -3.899, autolearn score: -3.899, mean: undef, IP: undef
debug: DB addr list: untie-ing and unlocking.
debug: DB addr list: file locked, breaking lock.
debug: unlock: 10692 unlink /home/admin17/.spamassassin/auto-whitelist.lock
debug: Post AWL score: -3.899
debug: running body-text per-line regexp tests; score so far=-3.899
debug: running uri tests; score so far=-3.899
debug: running raw-body-text per-line regexp tests; score so far=-3.899
debug: running full-text regexp tests; score so far=-3.899
debug: is spam? score=-3.899 required=1
debug: tests=ALL_TRUSTED,BAYES_20,MISSING_HEADERS,MISSING_SUBJECT,NO_REAL_NAME
debug: subtests=__HAS_MSGID,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__SANE_MSGID,__UNUSABLE_MSGID
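
Since the debug output is long, here is a minimal sketch for pulling out just the final verdict and the rules that fired from saved `spamassassin -D --lint` output. The function name `summarize_lint` is made up for illustration, and it assumes the "is spam?" and "tests=" lines have the same layout as in the 3.0.2 output above; other versions may format them differently.

```shell
# Sketch only: summarize_lint is a hypothetical helper, not part of SpamAssassin.
# Reads saved debug output on stdin and prints the score, the threshold,
# a verdict (spam when score >= required), and one fired rule per line.
summarize_lint() {
  awk '
    /^debug: is spam[?] / {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^score=/)    { split($i, kv, "="); score = kv[2] }
        if ($i ~ /^required=/) { split($i, kv, "="); required = kv[2] }
      }
    }
    /^debug: tests=/ { sub(/^debug: tests=/, ""); tests = $0 }
    END {
      verdict = (score + 0 >= required + 0) ? "spam" : "not spam"
      printf "score=%s required=%s verdict=%s\n", score, required, verdict
      n = split(tests, t, ",")
      for (i = 1; i <= n; i++) print "  " t[i]
    }
  '
}
```

Usage would be something like `spamassassin -D --lint 2>&1 | summarize_lint`. For the output above, the first line it prints is `score=-3.899 required=1 verdict=not spam`, followed by the five rules from the tests= line, including BAYES_20 and ALL_TRUSTED. This only summarizes the built-in lint test message, not a real incoming mail.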


-- 
Crisppy Fernandes