Posted to users@opennlp.apache.org by Martin Wunderlich <ma...@gmx.net> on 2014/03/20 20:17:46 UTC

Off-topic (slightly): Need help in categorizing the Enron corpus

Hi all, 

I hope you will forgive me for posting a request here that is only indirectly related to OpenNLP. I presume, though, that some people here are interested in this sort of thing. 
I am currently working on a small project at the University of Munich (LMU) in the area of automatic text classification with neural networks. I am developing a system that automatically categorizes emails into various classes. It uses the "Enron corpus", but the system can of course be applied to other texts, too. 

However, the application uses supervised learning, so it needs to be trained on manually categorized emails. This is where you come in: I have set up a website to help categorize a subset of 1000 Enron emails. Every categorized email helps! So far, we are at 707 out of 1000, so nearly there. The categorized corpus will also be published for others to use. 

So, you can simply click on this link here and get started: 

http://www.martinwunderlich.com/enron/

Thanks so much in advance!!

Kind regards, 

Martin