I am looking to modify (and contribute to) SpamBayes, a free spam-control tool, and then test the results. I have installed it in Outlook, trained it with my spam and ham mail, and am now observing the outcome. I want to make some modifications and updates to the tool, which is allowed because it is freeware, and then study the results. The tool is written in Python.
The link to the tool is http://spambayes.sourceforge.net/i
There is a wiki at http://www.entrian.com/sbwiki where all the documentation, code, etc. is available for developers who want to contribute.
The code modifications that I need are:
1. An add-on that lets the user set some preferences, such as email addresses, word preferences, etc.
2. An add-on that lets the user supply a list of keywords to be used for filtering messages as spam or ham.
3. Filtering based on 2-3 words together, or on a phrase, rather than on single words only.
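To make items 2 and 3 more concrete, here is a minimal sketch of the kind of phrase matching I have in mind. It is not SpamBayes code: the function names `phrase_tokens` and `matches_user_keywords` are hypothetical, and the simple whitespace splitting is only an assumption for illustration. The real add-on would have to plug into the tool's own tokenizer.

```python
def phrase_tokens(text, max_words=3):
    """Return single words plus every run of 2..max_words adjacent words.

    Hypothetical helper for illustration; splitting on whitespace is an
    assumption and is much cruder than a real mail tokenizer.
    """
    words = text.lower().split()
    tokens = []
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens


def matches_user_keywords(text, keywords, max_words=3):
    """Return the user-supplied keywords or phrases found in the text.

    Phrases longer than max_words are never matched in this sketch.
    """
    found = set(phrase_tokens(text, max_words))
    return sorted(k for k in keywords if k.lower() in found)
```

For example, `matches_user_keywords("Buy cheap pills now", ["cheap pills", "lottery"])` would report only the phrase "cheap pills". How such matches should influence the spam/ham decision is part of what I need designed.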
Apart from the code add-ons and updates, I need a few things for documentation:
1. Documentation on the current architecture of the application.
2. The algorithm describing how the probability is currently calculated in the tool.
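As a rough indication of the level of detail I am after for the probability algorithm, the sketch below shows a Robinson-style per-word probability combined with a Fisher chi-squared test. This is only my assumption of the general approach, not the tool's actual code: the function names and the default values `s=0.45` and `x=0.5` are illustrative, and the documentation should confirm or correct every step against the real source.

```python
import math


def token_prob(spam_count, ham_count, nspam, nham, s=0.45, x=0.5):
    """Smoothed spam probability of one token (Robinson-style, assumed).

    s is the weight given to the prior x for rarely seen tokens.
    """
    n = spam_count + ham_count
    if n == 0 or nspam == 0 or nham == 0:
        return x  # nothing known about this token
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    return (s * x + n * p) / (s + n)


def chi2q(x2, v):
    """Upper tail of the chi-squared distribution for even v."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine token probabilities into one score in [0, 1] (assumed method)."""
    n = len(probs)
    if n == 0:
        return 0.5
    spam_evidence = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0
```

In this sketch a score near 1 means spam, a score near 0 means ham, and a score near 0.5 means unsure. Token probabilities of exactly 0 or 1 would break the logarithms, which the smoothing in `token_prob` avoids.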
This is the initial idea of what I am looking for...
I can also send you some research papers published on SpamBayes, if you want. Thanks