Posted to server-dev@james.apache.org by "Vincenzo Gianferrari Pini (JIRA)" <se...@james.apache.org> on 2005/12/12 15:35:47 UTC

[jira] Resolved: (JAMES-387) Exception in BayesianAnalysis

     [ http://issues.apache.org/jira/browse/JAMES-387?page=all ]
     
Vincenzo Gianferrari Pini resolved JAMES-387:
---------------------------------------------

    Fix Version: 2.3.0
     Resolution: Fixed

The corpus reload could run concurrently with an ongoing analysis of messages, and the corpus could be corrupted as a result.
The reload is now done on a new hashmap, which becomes the actual corpus only when the reload completes. In the meantime any analysis is done on the old corpus, so no conflict occurs. The old corpus will eventually be garbage collected.
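
For readers unfamiliar with the pattern, here is a minimal sketch of the build-then-swap approach described above. It is not the code committed for this issue: the class name CorpusHolder, its methods, and the use of generics and a volatile field are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder illustrating the swap; not the actual BayesianAnalyzer. */
class CorpusHolder {
    // volatile (an assumption of this sketch) so that analysis threads see
    // the fully built map once the reference is replaced.
    private volatile Map<String, Double> corpus = new HashMap<>();

    /** Analysis reads the current corpus reference once and works on that map. */
    double probabilityOf(String token, double defaultValue) {
        Map<String, Double> snapshot = corpus;
        Double p = snapshot.get(token);
        return p != null ? p : defaultValue;
    }

    /** Reload fills a new map, then makes it the corpus in a single assignment. */
    void reload(Map<String, Double> freshlyLoaded) {
        Map<String, Double> next = new HashMap<>(freshlyLoaded);
        corpus = next; // the old map is never modified and is later garbage collected
    }
}
```

Because the map that analysis threads read is never modified after it is published, a reload in progress cannot leave them looking at a half-built corpus.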

> Exception in BayesianAnalysis
> -----------------------------
>
>          Key: JAMES-387
>          URL: http://issues.apache.org/jira/browse/JAMES-387
>      Project: James
>         Type: Bug
>   Components: Matchers/Mailets (bundled)
>     Versions: 3.0
>  Environment: James from svn-trunk 2005-08-01.
> MySQL 4.0
>     Reporter: Stefano Bagnara
>     Assignee: Vincenzo Gianferrari Pini
>     Priority: Minor
>      Fix For: 2.3.0

>
> Got this exception for every incoming mail:
> 02/08/05 00:39:25 INFO  James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
> java.lang.ClassCastException: java.lang.Integer
>         at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
>         at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
>         at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
>         at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
>         at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
>         at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
>         at java.lang.Thread.run(Unknown Source)
> If I clean my spam/ham db the exceptions disappear, but they start again when the spam/ham db becomes large.
> My bayesiananalysis_spam contains 200000 rows.
> The following are the spam tokens with the highest "occurrences".
> +---------------------------+-------------+
> | token                     | occurrences |
> +---------------------------+-------------+
> | 3D                        |       82151 |
> | a                         |       59953 |
> | the                       |       45295 |
> | FONT                      |       42771 |
> | Content-Type              |       39058 |
> | to                        |       36626 |
> | com                       |       32902 |
> | http                      |       32886 |
> | of                        |       32504 |
> | font                      |       31803 |
> | and                       |       31577 |
> | Content-Transfer-Encoding |       31576 |
> | p                         |       29746 |
> | text                      |       29482 |
> | in                        |       29418 |
> | it                        |       28498 |
> | br                        |       28037 |
> | DIV                       |       27431 |

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira