How Can Spammers Get Past Bayesian Filters?


We have studied this issue for a while now.  The question is not how spam gets through at the very beginning, while the Bayesian filter is still building its database, but how it gets through once the keyword database is already built.

Legitimate email

First, we have to understand what a legitimate email is.  A legitimate email is one that you exchange with your co-workers, family, friends and customers. Its main characteristics are that it is not stuffed with keywords (you are not writing for SEO), it is not a commercial offer and it does not contain too many links. Instead, it is plain, boring and personal. Once the spam filter has learned that this type of email is ham, it becomes almost impossible for spammers to get through.

The challenge for the spammers

Spammers will only make it past your well-trained Bayesian filters if they make their messages look exactly like the ordinary, plain email everybody gets.  But this is a real challenge for them. Spammers are usually rebels who don’t work in a corporation and don’t keep a fixed working schedule; chances are they won't bother if ordinary, boring emails are the only way past the anti-spam filters. If they did, we would all see a lot of spam, and email would become as frustrating as it was in the pre-Bayesian era. Moreover, the current type of spam would disappear.

Normally, if a Bayesian algorithm sees a word that appears frequently in good email, it treats that word as strong evidence of legitimacy, and it may classify an email containing that word as “ham”. If spammers find a way to determine your valid words, for example by using HTML return receipts to see which messages you opened, they can include one of them in a junk mail and reach you even through a well-trained Bayesian filter.
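To make the effect of a single valid word concrete, here is a minimal sketch in Python. It assumes the filter combines per-word spam probabilities with the simple naive Bayes product rule and flags a message as spam above a threshold of 0.9. The function name, the threshold and all the probabilities are hypothetical values chosen for illustration; they are not taken from any particular product.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Combine per-word spam probabilities with the naive Bayes product rule.

    Each value is the (assumed) probability that a message containing
    that word is spam, as learned by the filter.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Hypothetical threshold: scores above it are treated as spam.
SPAM_THRESHOLD = 0.9

# Hypothetical probabilities for two typical spam words.
spam_words = [0.95, 0.90]

# Hypothetical probability for one word the recipient uses a lot in
# legitimate mail, which the spammer has managed to discover.
stolen_ham_word = [0.01]

before = combined_spam_probability(spam_words)
after = combined_spam_probability(spam_words + stolen_ham_word)

print(f"without the valid word: {before:.3f}, flagged: {before > SPAM_THRESHOLD}")
print(f"with the valid word:    {after:.3f}, flagged: {after > SPAM_THRESHOLD}")
```

With these assumed numbers, the message is flagged without the valid word and falls below the threshold once the word is added. Real filters look at many more words and often use more robust combining rules, so a single word will not always be enough.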


What you need to do to improve your spam detection rate

1.     Decline the return receipt requests that you receive from unknown senders

2.     Add every email address that looks suspicious to the spam list

3.     Make sure that your Bayesian engine works in different languages as well

4.     Choose a Bayesian algorithm that combines more than one probability calculation method

5.     Finally, don’t forward emails that you receive internally to people outside your organization

Visendo Mail Checker Server is an advanced email gateway with a complex Bayesian filter that combines multiple methods of calculating probabilities. It supports multiple languages and has an integrated antivirus component, yielding a high success rate in filtering spam, phishing, viruses and malware.

