Posted to users@spamassassin.apache.org by Amir Caspi <ce...@3phase.com> on 2019/06/28 02:11:57 UTC

Machine learning with or vs. Bayes?

Hi all,

I don't suppose anyone has a neural-net-based SA Machine Learning plugin or external program, to complement or replace Bayes?  There are a number of fairly compact Python ML packages that would greatly ease this task nowadays, like TensorFlow.  It looks like rspamd has a neural net module... I wonder if it would be relatively portable.

I guess there's a bunch of ML in use for QA/masscheck and auto-scoring... but is there anything for actual rule generation, not just scoring?  Or, like Bayes, where the "rule generation" is embedded in the neural net, and it just kicks out a spamminess indicator/probability?

Of course, Gmail and the other big providers have their own ML solutions that seem to be pretty good, though they have an enormous user base and near-infinite resources...

Granted, reliance on python means it's not embedded in SA, but SA already calls other external programs like pyzor/razor/DCC, so that wouldn't seem to necessarily be a big knock against it.
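
For illustration only, here is a minimal sketch of the kind of external scorer described above: a model that learns from labelled mail and just kicks out a spam probability. It uses a single-neuron logistic model over word tokens, which is the simplest possible stand-in for a neural net; it is not SA's Bayes implementation and not rspamd's neural module. The names TinySpamScorer, tokenize, train and CORPUS are hypothetical, the four-message corpus is made up, and how SA would actually invoke such a program is not shown.

```python
import math
import re


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real feature extraction."""
    return re.findall(r"[a-z0-9$!]+", text.lower())


class TinySpamScorer:
    """Single-neuron logistic model over word tokens (hypothetical example)."""

    def __init__(self, learning_rate=0.5):
        self.weights = {}  # token -> learned weight
        self.bias = 0.0
        self.learning_rate = learning_rate

    def probability(self, text):
        """Return the estimated probability (0..1) that the text is spam."""
        z = self.bias + sum(self.weights.get(tok, 0.0) for tok in tokenize(text))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn(self, text, is_spam):
        """One stochastic-gradient step on a single labelled message."""
        error = (1.0 if is_spam else 0.0) - self.probability(text)
        step = self.learning_rate * error
        self.bias += step
        for tok in tokenize(text):
            self.weights[tok] = self.weights.get(tok, 0.0) + step


def train(scorer, corpus, epochs=20):
    """Repeatedly pass over the labelled corpus."""
    for _ in range(epochs):
        for text, is_spam in corpus:
            scorer.learn(text, is_spam)


# Made-up training data, purely for demonstration.
CORPUS = [
    ("cheap pills buy now", True),
    ("win money now click here", True),
    ("meeting agenda for monday", False),
    ("patch review for the plugin", False),
]

scorer = TinySpamScorer()
train(scorer, CORPUS)
print(scorer.probability("buy cheap pills now"))
print(scorer.probability("agenda for the plugin meeting"))
```

As with Bayes, there are no hand-written rules here: whatever "rules" exist live in the learned weights, and the only output is a probability that a caller could map to a score.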

Cheers.

--- Amir


Re: Machine learning with or vs. Bayes?

Posted by "Shreyansh Shrivastava." <sh...@nitk.edu.in>.
On Fri, 28 Jun 2019, 07:42 Amir Caspi, <ce...@3phase.com> wrote:

> Hi all,
>
> I don't suppose anyone has a neural-net-based SA Machine Learning plugin
> or external program, to complement or replace Bayes?  There are a number of
> fairly compact Python ML packages that would greatly ease this task
> nowadays, like TensorFlow.  It looks like rspamd has a neural net module...
> I wonder if it would be relatively portable.
>
Hi Amir, I am developing a plugin with two or three statistical
classifiers (including SVM and neural nets) under the Google Summer of Code
programme, with Kevin McGrail as my mentor.

> I guess there's a bunch of ML in use for QA/masscheck and auto-scoring...
> but is there anything for actual rule generation, not just scoring?  Or,
> like Bayes, where the "rule generation" is embedded in the neural net, and
> it just kicks out a spamminess indicator/probability?
>
> Of course, Gmail and the other big providers have their own ML solutions
> that seem to be pretty good, though they have an enormous user base and
> near-infinite resources...
>
> Granted, reliance on python means it's not embedded in SA, but SA already
> calls other external programs like pyzor/razor/DCC, so that wouldn't seem
> to necessarily be a big knock against it.
>
With Python, as you said, it won't be embedded into SA, and hence I'm worried
about plugin integration. We (me + mentors) have come up with a couple of
feasible solutions. I will post any updates on the list soon.

Note: if you have any information that you think might help with this issue,
please let me know.


> Cheers.
>
> --- Amir
>

Regards,
Shreyansh Shrivastava
