GSoC/GCI Archive
Google Summer of Code 2013 Wikimedia

Bayesian Spam Filter

by Anubhav Agarwal for Wikimedia

A token (word) based Bayesian spam classifier for combating wiki spam problems. Besides words, it takes into account other factors such as capital letters and punctuation marks. It also adds support for concurrent edits on large wikis.
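
As an illustration of the idea only, here is a minimal sketch of a token-based classifier, assuming a naive Bayes model with Laplace smoothing. The class name, the pseudo-token names, and the thresholds for capital letters and punctuation are hypothetical choices made for this example; they are not taken from the project's code.

```python
import math
import re
from collections import Counter


def extract_features(text):
    """Return lowercase word tokens plus pseudo-tokens for style signals."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    letters = [c for c in text if c.isalpha()]
    # Assumed threshold: flag text where more than half the letters are capitals.
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.5:
        tokens.append("__MOSTLY_CAPS__")
    # Assumed signal: a run of two or more '!' or '?' characters.
    if re.search(r"[!?]{2,}", text):
        tokens.append("__REPEATED_PUNCT__")
    return tokens


class NaiveBayesSpamFilter:
    """Hypothetical two-class (spam/ham) naive Bayes filter."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labelled edit; label must be 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.token_counts[label].update(extract_features(text))

    def spam_probability(self, text):
        """Return the estimated probability that the text is spam."""
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        features = extract_features(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_tokens = sum(self.token_counts[label].values())
            for tok in features:
                count = self.token_counts[label][tok]
                # Laplace-smoothed token likelihood; the extra +1 leaves
                # room for tokens never seen during training.
                score += math.log((count + 1) / (total_tokens + len(vocab) + 1))
            log_scores[label] = score
        # Normalise in log space to avoid underflow on long texts.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


if __name__ == "__main__":
    spam_filter = NaiveBayesSpamFilter()
    spam_filter.train("BUY CHEAP PILLS NOW!!!", "spam")
    spam_filter.train("Fixed a typo in the history section.", "ham")
    print(spam_filter.spam_probability("CHEAP PILLS!!!"))
```

Treating capitalisation and punctuation as extra tokens lets them be counted by the same model as ordinary words. This sketch does not attempt to cover the concurrent-edit support mentioned above.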