Posted to server-dev@james.apache.org by "Vincenzo Gianferrari Pini (JIRA)" <se...@james.apache.org> on 2005/09/03 19:36:30 UTC

[jira] Commented: (JAMES-387) Exception in BayesianAnalysis

    [ http://issues.apache.org/jira/browse/JAMES-387?page=comments#action_12322588 ] 

Vincenzo Gianferrari Pini commented on JAMES-387:
-------------------------------------------------

I took a careful look at the code and couldn't find anything wrong. I have a spam table with more than 258000 rows, and everything works fine for me.

IMHO a possible explanation of Stefano's exceptions is the following:

The ham/spam corpus hashmaps may take a lot of memory, so I gave the JVM a large -Xmx heap.
Some time ago, in a Java (non-James) application, I saw unpredictable JVM behaviour (strange exceptions thrown) when the available heap was only just about what the application needed. With a slightly smaller -Xmx I got an OutOfMemoryError, and with a larger one everything was fine.
Stefano, can you try with more memory?
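
To check how close the JVM is to its -Xmx ceiling, something like the following minimal sketch could be run (or adapted into a log statement) once the corpus is loaded. It is not James code: the class name HeapHeadroom, its helper methods and the 90% warning threshold are assumptions made for illustration. Only the java.lang.Runtime calls are standard.

```java
// Hypothetical helper, not part of James: reports heap usage against -Xmx.
public class HeapHeadroom {

    /** Bytes currently in use: heap allocated by the JVM minus the free part. */
    public static long usedBytes(long totalMemory, long freeMemory) {
        return totalMemory - freeMemory;
    }

    /**
     * Fraction of the maximum heap in use.
     * Returns 0.0 when the JVM reports no usable limit
     * (Runtime.maxMemory() returns Long.MAX_VALUE in that case).
     */
    public static double usedFraction(long usedBytes, long maxMemory) {
        if (maxMemory <= 0 || maxMemory == Long.MAX_VALUE) {
            return 0.0;
        }
        return (double) usedBytes / (double) maxMemory;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = usedBytes(rt.totalMemory(), rt.freeMemory());
        long max = rt.maxMemory(); // the ceiling set by -Xmx
        double fraction = usedFraction(used, max);

        long mb = 1024L * 1024L;
        System.out.println("heap used: " + (used / mb) + " MB of " + (max / mb) + " MB");

        // The 0.9 threshold is an arbitrary assumption for this sketch.
        if (fraction > 0.9) {
            System.out.println("Heap is nearly full; a larger -Xmx may be needed.");
        }
    }
}
```

If the reported usage sits close to the maximum after the ham/spam corpus has been loaded, that would be consistent with the explanation above, although it would not prove it.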

> Exception in BayesianAnalysis
> -----------------------------
>
>          Key: JAMES-387
>          URL: http://issues.apache.org/jira/browse/JAMES-387
>      Project: James
>         Type: Bug
>   Components: Matchers/Mailets (bundled)
>     Versions: 3.0
>  Environment: James from svn-trunk 2005-08-01.
> MySQL 4.0
>     Reporter: Stefano Bagnara
>     Assignee: Vincenzo Gianferrari Pini
>     Priority: Minor

>
> Got this exception for every incoming mail:
> 02/08/05 00:39:25 INFO  James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
> java.lang.ClassCastException: java.lang.Integer
>         at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
>         at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
>         at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
>         at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
>         at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
>         at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
>         at java.lang.Thread.run(Unknown Source)
> If I clean my spam/ham db the exceptions disappear, but they start again when the spam/ham db becomes large.
> My bayesiananalysis_spam contains 200000 rows.
> The following are the spam tokens with the highest "occurrences".
> +---------------------------+-------------+
> | token                     | occurrences |
> +---------------------------+-------------+
> | 3D                        |       82151 |
> | a                         |       59953 |
> | the                       |       45295 |
> | FONT                      |       42771 |
> | Content-Type              |       39058 |
> | to                        |       36626 |
> | com                       |       32902 |
> | http                      |       32886 |
> | of                        |       32504 |
> | font                      |       31803 |
> | and                       |       31577 |
> | Content-Transfer-Encoding |       31576 |
> | p                         |       29746 |
> | text                      |       29482 |
> | in                        |       29418 |
> | it                        |       28498 |
> | br                        |       28037 |
> | DIV                       |       27431 |


