Posted to dev@community.apache.org by "Kevin A. McGrail (JIRA)" <ji...@apache.org> on 2018/02/05 04:40:00 UTC
[jira] [Updated] (COMDEV-267) GSOC 2018 SpamAssassin Improve Headers for RBL research

     [ https://issues.apache.org/jira/browse/COMDEV-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin A. McGrail updated COMDEV-267:
------------------------------------
    Description: 
Add subtests to the SpamAssassin (SA) headers for logging and research purposes.

Parse out URIs from the $report _URIDOMAINS(,)_ list.

add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ subtests=_SUBTESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
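
For the research side, a header in this shape is easy to consume downstream. The following is a minimal Python sketch, not SpamAssassin code. The function name parse_status and the sample header value are made up for illustration, and it assumes the template tags have already been expanded into comma-separated values that contain no spaces.

```python
import re

def parse_status(value):
    """Split an X-Spam-Status style value into a verdict and key=value fields."""
    # Rejoin lists that were folded after a comma, then unfold remaining lines.
    value = re.sub(r",\s*\n\s+", ",", value.strip())
    value = re.sub(r"\s*\n\s+", " ", value)
    verdict, _, rest = value.partition(",")
    fields = {"verdict": verdict.strip()}
    for key, val in re.findall(r"(\w+)=(\S*)", rest):
        # List-valued fields are comma-separated; "none" means no hits.
        if key in ("tests", "subtests", "rbls"):
            fields[key] = [] if val in ("", "none") else val.split(",")
        else:
            fields[key] = val
    return fields

# Made-up header value, folded across lines the way mail headers often are.
sample = (
    "Yes, score=7.3 required=5.0 tests=URIBL_BLACK,\n"
    "\tRCVD_IN_XBL subtests=__HAS_MSGID\n"
    "\tautolearn=no version=3.4.1"
)
parsed = parse_status(sample)
print(parsed["verdict"], parsed["tests"], parsed["subtests"])
```

Keeping tests, subtests, and rbls as lists means later analysis can work on set membership rather than on substring matches.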

Add logic, such as a tflag that describes an RBL, and then expose rbls=_RBLS_.  Every RBL test for PCCC would carry a tflag that gives it a name like PCCC, even if there are 20 such tests.  The same goes for URIBL, SURBL, or Spamhaus.  The header would then show rbls=PCCC,SURBL,URIBL as an overarching indication of which RBLs hit.
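
The grouping described here can be illustrated with a short Python sketch. It is only a model of the idea: the dictionary RBL_LABELS stands in for the proposed per-rule tflag labels, the rule names in it are hypothetical, and rbls_for is an invented helper, not an existing SpamAssassin function.

```python
# Hypothetical mapping from rule name to the RBL label its tflag would carry.
# The rule names are illustrative, not taken from a real ruleset.
RBL_LABELS = {
    "PCCC_RBL_A": "PCCC",
    "PCCC_RBL_B": "PCCC",
    "URIBL_EXAMPLE": "URIBL",
    "SURBL_EXAMPLE": "SURBL",
}

def rbls_for(tests_hit):
    """Return the distinct RBL labels for the tests that hit, in first-seen order."""
    seen = []
    for test in tests_hit:
        label = RBL_LABELS.get(test)
        if label is not None and label not in seen:
            seen.append(label)
    return seen

hits = ["PCCC_RBL_A", "HTML_MESSAGE", "PCCC_RBL_B", "SURBL_EXAMPLE"]
print("rbls=" + ",".join(rbls_for(hits)))  # prints "rbls=PCCC,SURBL"
```

Two PCCC rules collapse into a single PCCC entry, which is the point of the label: the header reports which lists hit, not how many rules per list.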

With this we can better identify overlap of RBLs.
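
As a rough illustration of that overlap analysis, the sketch below counts how often each RBL label and each pair of labels appear together across logged messages. It is an assumption about how the data might be used, not part of the proposal: rbl_overlap is an invented name and the log entries are made-up data.

```python
from collections import Counter
from itertools import combinations

def rbl_overlap(messages):
    """Count, per RBL and per pair of RBLs, how many messages they hit."""
    singles, pairs = Counter(), Counter()
    for rbls in messages:
        # Deduplicate and sort so each pair is always counted under one key.
        labels = sorted(set(rbls))
        singles.update(labels)
        pairs.update(combinations(labels, 2))
    return singles, pairs

# Each entry is the rbls= list from one logged message (made-up data).
log = [["PCCC", "SURBL"], ["PCCC"], ["PCCC", "SURBL", "URIBL"], ["URIBL"]]
singles, pairs = rbl_overlap(log)
for (a, b), n in sorted(pairs.items()):
    print(f"{a}+{b}: {n} of {singles[a]} {a} hits")
```

A pair count close to the single count for one list suggests that list adds little beyond the other, which is the kind of overlap the header is meant to expose.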

Also, it appears that _URIDOMAINS(,)_ is not being interpolated in the report body.

 

Apache SpamAssassin is an intelligent email filter that uses a diverse range of tests to identify unsolicited bulk email, more commonly known as spam. These tests are applied to email headers and content to classify email using advanced statistical methods.

In addition, SpamAssassin has a modular architecture that allows other technologies to be quickly wielded against spam and is designed for easy integration into virtually any email system. 

It is primarily written in Perl with a few bits in C and shell scripts for system integration.

The compendium at https://raptor.pccc.com/raptor.cgim?template=email_spam_compendium is helpful for understanding some of the concepts behind SpamAssassin.

It will help for a student on this project to understand SMTP, but a willingness to learn and to set up your own mail server on a Linux distribution with SpamAssassin for a personal test domain is highly desirable. Assistance will be provided to get the basic framework of a sandbox in place for learning.

As email becomes more commoditized by major providers, knowledge of email systems and their security is dwindling.  This opportunity can provide real-world experience with an email security product that is employed by countless commercial systems in the world.

  was:
add subtests to headers for SA for logging / research purposes

parse out URIs from $report _URIDOMAINS(,)_ list

add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ subtests=_SUBTESTS_ autolearn=_AUTOLEARN_ version=_VERSION_

logic such as a tflag describe an RBL and then have rbls=_RBLS_.  So every rbl test for PCCC will have a tflag that gives it a name like PCCC even if I have 20 tests.  Same for URIBL or SURBL or SpamHause.  So the rbls=PCCC,SURBL,URIBL as an overarching hit about which RBL hit.

With this we can better identify overlap of RBLs.

Also, it appears that URIDOMAINS(,)_ is not being interpolated in report body. 


> GSOC 2018 SpamAssassin Improve Headers for RBL research
> -------------------------------------------------------
>
>                 Key: COMDEV-267
>                 URL: https://issues.apache.org/jira/browse/COMDEV-267
>             Project: Community Development
>          Issue Type: Project
>          Components: GSoC/Mentoring ideas
>            Reporter: Kevin A. McGrail
>            Priority: Major


