goodarticlelist.com
So What Makes a Good Spam Filter Anyway?

 
Author: Alan Hearnshaw
 


Spam filters. Most of us know we need one. Some of us know we need a better one, but how many stop to think what actually makes a good spam filter in the first place?

This is not just a rhetorical question. It is a question that many users - and many developers - do not ask, and that consequently goes unanswered.

Maybe this could be better answered by defining the qualities of the perfect spam filter. We'll call our perfect spam filter the 'SpamSplatter 3000'. Here are some of the defining qualities of the 'SpamSplatter 3000':

1. It requires zero interaction from the user.
2. It produces zero false positives (good messages identified as bad) and zero false negatives (bad messages identified as good).
3. It is transparent - that is, you only ever see good messages and never need even be aware that spam exists.

That's it. Not much of a shopping list, is it? Of course, the 'SpamSplatter 3000' hasn't been invented yet (and if it ever is, I want a piece of the action), but it does give us a frame of reference when looking for the best filter we can find.

Let's take each point in turn:

It requires zero interaction from the user

Two kinds of filters currently come close to this ideal: Bayesian filters and community filters. Bayesian filters strip messages down to small 'word bites', or tokens, and maintain a database containing lists of good and bad tokens. When a new message arrives, the filter strips it down to tokens, compares them to the database, and applies a formula based on the probability theorem of the British mathematician Thomas Bayes. Over time, the Bayesian filter 'learns' the characteristics of spam messages.
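The token-and-database idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any particular product: the class name `BayesFilter`, the simple word tokenizer, and the add-one smoothing are all hypothetical choices, and it assumes the database stores how many good and bad messages each token has appeared in.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class BayesFilter:
    """Hypothetical minimal Bayesian filter: counts tokens in good and spam mail."""

    def __init__(self):
        # For each label, how many messages each token has appeared in.
        self.counts = {"spam": Counter(), "good": Counter()}
        # How many messages of each label have been seen in training.
        self.messages = {"spam": 0, "good": 0}

    def train(self, text, label):
        """Record one message the user has marked as 'spam' or 'good'."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Combine per-token evidence with Bayes' rule, working in log-odds."""
        # Prior odds of spam, smoothed so an untrained filter starts at 50/50.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["good"] + 1))
        for token in set(tokenize(text)):
            # Add-one smoothing keeps unseen tokens from producing zero probabilities.
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_good = (self.counts["good"][token] + 1) / (self.messages["good"] + 2)
            log_odds += math.log(p_spam / p_good)
        return 1 / (1 + math.exp(-log_odds))
```

Each call to `train` corresponds to the user pressing a 'this is spam' or 'this is good' button. The more messages the filter has seen, the further `spam_probability` moves away from 0.5 for typical mail.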

Community filters simply work on a voting system whereby every user who receives a spam message 'votes' it as spam. This information is stored on a central server, and when enough votes are received the message is blocked for all users in the community.

As can be seen, user interaction with these types of filters is mainly limited to a two-button operation - correcting wrongly identified messages - and the more accurate the filter, the less those buttons are used.
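The voting system can be sketched as follows. This is a minimal illustration only: the class name `CommunityFilter`, the threshold of three votes, and the use of a SHA-256 hash of the normalized text as the message fingerprint are all assumptions made for the example, and an in-memory dictionary stands in for the central server.

```python
import hashlib


class CommunityFilter:
    """Hypothetical in-memory stand-in for a community filter's central server."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed number of votes needed to block
        self.votes = {}             # message fingerprint -> set of voting user ids

    @staticmethod
    def fingerprint(message):
        """Identify a message by a hash of its whitespace- and case-normalized text."""
        normalized = " ".join(message.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def vote_spam(self, user_id, message):
        """Record one user's spam vote; repeat votes by the same user count once."""
        self.votes.setdefault(self.fingerprint(message), set()).add(user_id)

    def is_blocked(self, message):
        """A message is blocked for everyone once enough distinct users voted."""
        voters = self.votes.get(self.fingerprint(message), set())
        return len(voters) >= self.threshold
```

Storing a set of user ids rather than a bare counter is a design choice in this sketch: it stops a single user from blocking a message by voting repeatedly.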

OK, so that's pretty good. Not exactly zero interaction, but if the filter is accurate enough, then it should be pretty near. That brings us to point two:

It produces zero false positives or negatives

This is the area in which most spam filter development is concentrated, and things are getting pretty good nowadays. It is not at all unusual to see an efficient modern filter achieve accuracy of 96% or better. It is, of course, far better to have a false negative than a false positive if you are ever going to tear yourself away from the killed mail folder!
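To make the terms concrete, here is a small sketch of how accuracy and the two error rates could be computed from a batch of filtered messages. The function name `error_rates` and the 'spam'/'good' labels are assumptions for illustration only.

```python
def error_rates(results):
    """Summarize filter performance.

    results: iterable of (actual, predicted) pairs, each label 'spam' or 'good'.
    """
    results = list(results)
    # Predictions grouped by what the message really was.
    good = [predicted for actual, predicted in results if actual == "good"]
    spam = [predicted for actual, predicted in results if actual == "spam"]
    false_positives = good.count("spam")  # good mail wrongly flagged as spam
    false_negatives = spam.count("good")  # spam wrongly let through
    total = len(results)
    return {
        "accuracy": (total - false_positives - false_negatives) / total if total else 0.0,
        "false_positive_rate": false_positives / len(good) if good else 0.0,
        "false_negative_rate": false_negatives / len(spam) if spam else 0.0,
    }
```

For a hypothetical batch of 100 messages (60 spam, 40 good) in which 3 spam messages slip through and 1 good message is flagged, this gives an accuracy of 96%, but the single false positive is the error that costs the user a real message.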

Of course, by definition, community filters cannot reach 100% accuracy, as someone has to be getting the spam in order to vote it as such! Theoretically, a Bayesian filter may eventually be able to get quite close to 100% accuracy, so at least there is hope there. Content-based filters (those that look for certain words, phrases or other indicators in a message to identify it as spam) will almost certainly not get much higher accuracy figures than the best of them can achieve today. Adapting to changing spam requires new filters to be created on an ongoing basis.

And finally, we come to the holy grail of spam filtering:

It is transparent

Strangely enough, not enough work seems to be done in trying to achieve this goal. Some of the best filters on the market today identify spam with impressive accuracy and then simply place it in a 'killed mail' folder for your later perusal. Now, forgive me if I'm missing something here, but isn't the point to save you having to wade through the junk mail? Isn't that what you bought the filter for? With the 'SpamSplatter 3000', you don't need to do that.

As we haven't achieved 100% accuracy yet (and probably never will), the only way to free us from checking the killed mail folder is a challenge/response system. This is where a message is automatically sent back to the sender requiring them to take some action for their message to actually be delivered.

Some systems tend to go overboard with the challenge/response system. These systems - often called 'Whitelist' systems - block messages from anyone that isn't in the user's friends list. Guaranteed 100% effective, but too drastic a measure for most users.

Now, it seems that the most intelligent use of this system would be to send challenges only in response to messages that were flagged as 'questionable'. Good messages can be delivered, definite spam can be deleted, and questionable ones would earn themselves a challenge message.
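This three-way split can be sketched as below. It assumes the filter produces a spam probability between 0 and 1. The cut-off values 0.2 and 0.95, the `ChallengeQueue` class, and the `handle` function are hypothetical names and numbers chosen only for illustration, and actually sending the challenge e-mail is left out.

```python
class ChallengeQueue:
    """Hypothetical holding area for questionable messages awaiting a response."""

    def __init__(self):
        self.pending = {}  # sender address -> list of held messages

    def hold(self, sender, message):
        self.pending.setdefault(sender, []).append(message)

    def confirm(self, sender):
        """The sender answered the challenge: release everything held for them."""
        return self.pending.pop(sender, [])


def handle(sender, message, spam_probability, queue,
           deliver_below=0.2, delete_above=0.95):
    """Deliver good mail, delete definite spam, challenge anything in between."""
    if spam_probability < deliver_below:
        return "deliver"
    if spam_probability > delete_above:
        return "delete"
    # Questionable: hold the message until the sender responds to a challenge.
    queue.hold(sender, message)
    return "challenge"
```

The narrower the 'questionable' band between the two cut-offs, the fewer challenges go out, so an accurate filter keeps the challenge/response step rare.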

So, to sum up, let's rewrite the qualities of our perfect filter and get a shopping list of what to look for while we wait for the 'SpamSplatter 3000' to arrive:

1. Simple, minimal setup and maintenance.
2. Extremely low rate of false positives and as few false negatives as possible.
3. A transparent 'fail-safe' mechanism whereby the victims of those false positives can force the message through to you.

It's simple really. Now, who's going to build me this 'SpamSplatter 3000'?

Alan Hearnshaw is the owner of http://www.WhichSpamFilter.com, a site which provides weekly in-depth spam filter reviews, user help and guidance and a community forum. alan@whichspamfilter.com

 
 
 

Copyright © 2006-2008 www.goodarticlelist.com - All Rights Reserved.