Posted to dev@spamassassin.apache.org by "Shreyansh Shrivastava." <sh...@nitk.edu.in> on 2019/07/28 20:49:25 UTC

NeuralNet plugin

I am a Google Summer of Code student working on developing a statistical
classifier plugin.

As of now, I'm taking the Bayes.pm file as the baseline. Is there any other
documentation I can follow, or a plugin I can take as a reference? (I want
to capture the mail, shell out to the models, get the spam prediction, and
preferably add it as an additional header.)

Thanks,
Shreyansh Shrivastava

Re: NeuralNet plugin

Posted by "Shreyansh Shrivastava." <sh...@nitk.edu.in>.
I am calling the Python script the same way you mentioned (I haven't
finished yet). I will check Pyzor.pm for enter_helper_run_mode /
helper_app_pipe_open. Thanks for the help.

Regards,
Shreyansh Shrivastava


On Mon, Jul 29, 2019 at 1:58 PM Henrik K <he...@hege.li> wrote:

>
> I already gave you an example in CRM114.pm, but a working plugin can be
> very minimal. For now you really only need to finish your existing Svm.pm
> scan() function.
>
> At first you could even use something simple like
>
> my $tmpfile = $pms->create_fulltext_tmpfile();
> my $svm_command_output = `/path/to/classifier $tmpfile`;
> # parse output from classifier, etc.
>
> But a real plugin following SA standards would use enter_helper_run_mode /
> helper_app_pipe_open etc. for execution.  Examples of these can be found
> in Pyzor.pm etc.
>
> Of course some rule needs to be defined like
>
> full BAYES_SVM eval:check_svm()
>
> Or, if you have some "classifier ranges", you could imitate SA Bayes. In
> that case the check_svm() logic needs to make sure scan() is only run once,
> and that repeated calls to check_svm() simply compare the cached result to
> their arguments.
>
> full BAYES_SVM_00 eval:check_svm('0.00', '0.01')
> full BAYES_SVM_05 eval:check_svm('0.01', '0.05')
>
> Cheers,
> Henrik

Re: NeuralNet plugin

Posted by Henrik K <he...@hege.li>.
I already gave you an example in CRM114.pm, but a working plugin can be very
minimal. For now you really only need to finish your existing Svm.pm scan()
function.

At first you could even use something simple like

my $tmpfile = $pms->create_fulltext_tmpfile();
my $svm_command_output = `/path/to/classifier $tmpfile`;
# parse output from classifier, etc.

But a real plugin following SA standards would use enter_helper_run_mode /
helper_app_pipe_open etc. for execution.  Examples of these can be found
in Pyzor.pm etc.
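
For the classifier end of that call, a rough sketch of the contract might look
like the following: take the temp file path as the only argument and print one
score line for the plugin to parse. This is only an assumption about how the
script could be shaped. predict_spam_probability() is a hypothetical
placeholder for the real SVM / Neural Net model, and the "one float on stdout"
output format is likewise made up for illustration.

```python
import sys


def predict_spam_probability(text):
    """Hypothetical placeholder for the real SVM / neural-net model."""
    # Stand-in only: always answers "undecided" so the sketch runs without a model.
    return 0.5


def main(argv):
    """Read the mail file named in argv[1] and print one score line."""
    if len(argv) != 2:
        print("usage: classifier MAILFILE", file=sys.stderr)
        return 2
    # Read bytes and decode leniently: mail is not guaranteed to be valid UTF-8.
    with open(argv[1], "rb") as fh:
        text = fh.read().decode("utf-8", errors="replace")
    # A single fixed-format line keeps the parsing on the plugin side trivial.
    print(f"{predict_spam_probability(text):.4f}")
    return 0
```

Calling sys.exit(main(sys.argv)) under the usual __main__ guard would turn this
into the command; a non-zero exit code lets the caller tell a failed run apart
from a real score.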

Of course, some rule also needs to be defined, like

full BAYES_SVM eval:check_svm()

Or, if you have some "classifier ranges", you could imitate SA Bayes. In that
case the check_svm() logic needs to make sure scan() is only run once, and
that repeated calls to check_svm() simply compare the cached result to their
arguments.

full BAYES_SVM_00 eval:check_svm('0.00', '0.01')
full BAYES_SVM_05 eval:check_svm('0.01', '0.05')
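
The "scan once, compare many times" logic can be sketched roughly as below.
It is written in Python purely to illustrate the idea; in the plugin itself
this would be Perl, with the cached value kept per message. The names
SvmCheck, classify and scans are hypothetical, and treating each range as the
half-open interval [lo, hi) is an assumption of this sketch.

```python
class SvmCheck:
    """Runs the expensive scan at most once and caches the score."""

    def __init__(self, classify):
        self._classify = classify  # callable returning a score in [0, 1]
        self._score = None         # cached result; None means "not scanned yet"
        self.scans = 0             # counts real classifier runs, for illustration

    def scan(self):
        # Only the first call reaches the classifier; later calls hit the cache.
        if self._score is None:
            self._score = self._classify()
            self.scans += 1
        return self._score

    def check(self, lo, hi):
        # Rule arguments arrive as strings such as '0.01', so convert first.
        return float(lo) <= self.scan() < float(hi)
```

With a score of 0.03, check('0.00', '0.01') is false and check('0.01', '0.05')
is true, and the classifier has still only been run once.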

Cheers,
Henrik

On Mon, Jul 29, 2019 at 01:22:39PM +0530, Shreyansh Shrivastava. wrote:
> Hey Henrik,
> 
> I have finished my work on the python classifier. The pipeline is as follows [
> mbox to dataframe for training -> extracting text/plain from MIME format mail
> -> Text processing if any -> spam prediction ]. Also, I have 2 models as of now
> SVM and Neural Net. Now I want to integrate it with SA.
> 
> Regards,
> Shreyansh Shrivastava

Re: NeuralNet plugin

Posted by "Shreyansh Shrivastava." <sh...@nitk.edu.in>.
Hey Henrik,

I have finished my work on the Python classifier. The pipeline is as
follows: [ mbox to dataframe for training -> extracting text/plain from MIME
format mail -> text processing, if any -> spam prediction ]. Also, I have two
models as of now: SVM and Neural Net. Now I want to integrate it with SA.
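
To make the "extracting text/plain from MIME format mail" step concrete, here
is a minimal sketch using only Python's standard email package. It is not the
actual project code: the function name extract_plain_text is made up for
illustration, and skipping parts marked as attachments is an assumption about
what the step should keep.

```python
from email import message_from_bytes, policy


def extract_plain_text(raw_mail):
    """Return the concatenated text/plain parts of a raw mail (bytes)."""
    msg = message_from_bytes(raw_mail, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Multipart containers and HTML parts are skipped by the type check.
        if part.get_content_type() != "text/plain":
            continue
        # Assumption: text attachments are not part of the visible body.
        if part.get_content_disposition() == "attachment":
            continue
        # With policy.default, get_content() returns the decoded text as str.
        parts.append(part.get_content())
    return "\n".join(parts)
```

A part that declares an unknown charset can still make get_content() raise, so
real code would probably want a fallback around that call.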

Regards,
Shreyansh Shrivastava


On Mon, Jul 29, 2019 at 10:35 AM Henrik K <he...@hege.li> wrote:

> On Mon, Jul 29, 2019 at 02:19:25AM +0530, Shreyansh Shrivastava. wrote:
> > I am a Google summer of code student working on developing a statistical
> > classifier plugin.
> >
> > As of now, I'm taking the Bayes.pm file as the baseline. Any other
> > documentation which I can follow or a plugin which I can take as a
> > reference? ( I want to capture the mail, shell out to the models, get the
> > spam prediction and add it as an additional header preferably ).
>
> I don't understand, haven't we looked at this already?  Are you working on
> the Python classifier or something completely different now?
>
> Cheers,
> Henrik
>
>

Re: NeuralNet plugin

Posted by Henrik K <he...@hege.li>.
On Mon, Jul 29, 2019 at 02:19:25AM +0530, Shreyansh Shrivastava. wrote:
> I am a Google summer of code student working on developing a statistical
> classifier plugin.
> 
> As of now, I'm taking the Bayes.pm file as the baseline. Any other
> documentation which I can follow or a plugin which I can take as a reference? (
> I want to capture the mail, shell out to the models, get the spam prediction
> and add it as an additional header preferably ).

I don't understand, haven't we looked at this already?  Are you working on
the Python classifier or something completely different now?

Cheers,
Henrik