by aarrott, 04-11-2011 09:04 PM - edited 04-12-2011 06:50 AM
PC Magazine's Neil Rubenking has written an article describing how he uses the results of independent test labs to write product reviews of consumer AV software. It isn't how we'd do it, but his opinion carries enormous weight in consumer security software sales.
How PC Magazine interprets independent AV lab tests
Neil Rubenking of PC Magazine probably swings more sales of consumer AV products than any other single person (except maybe some key decision maker at Best Buy/Geek Squad). Last month, he wrote a magazine article: “How We Interpret Antivirus Lab Tests”.
My own observation is that Rubenking’s results-mixing algorithm does little to account for the quality of a test: that is, how well a particular lab’s test measures the actual protection a security product provides to customers against today’s mostly web-borne malware threats. If it did, NSS Labs and Dennis Tech Labs would be included among his reference test labs, and he would adopt AMTSO’s guidelines for “whole product testing” – making sure that all of a security product’s multi-layer protection technologies are given an opportunity to block an attack during a test. Most of the tests that Rubenking relies on measure only on-demand scanning of malware artificially copied onto the test computer’s hard drive. Trend Micro insists that independent test labs source threats live from the internet at the time of the test.
But Rubenking’s product reviews carry enormous influence.
However, if we modified our anti-malware solution technologies with laser-focus to do well as measured by Rubenking’s criteria, Trend Micro would get creamed in enterprise assessments of OfficeScan and, more importantly, we would not be protecting our customers from the real threats most likely to hurt them. Rubenking’s criteria have little to do with measuring protection from the Smart Protection Network.
(I know, I know, “Let’s do both: good actual protection for customers and good according to Rubenking old-fashioned metrics.” – easier said than done.)
The good news is that the independent test labs are adjusting to bad-guy innovation faster than Rubenking is. AV-Test and AV-Comparatives both now run more realistic tests, but Rubenking does not use those results in his mixing algorithm.
When reviewing an antivirus or security suite product, I always perform hands-on testing of the product's ability to clean up malware-infested systems and to protect a clean system against attack. I'm just one person, though, so I can't come near the exhaustive evaluations performed by the independent testing labs. To supplement my own tests I look at results from five major labs, all of them members of the Anti-Malware Testing Standards Organization (AMTSO).
Five Independent Labs
West Coast Labs and ICSA Labs will check the ability of a vendor's technology to detect a vast number of malware samples, and will separately evaluate how well it cleans up the infestation. Virus Bulletin regularly tests security products against their list of viruses in the wild. To attain the VB100 certification, a product must detect all the threats without erroneously flagging any good programs. I look at the ten most recent tests. If a vendor's security technology has achieved VB100 certification in all ten, that's an impressive achievement.
AV-Test.org, based in Germany, keeps inventing new and better tests. Their latest set involves certifying products for antivirus protection under Windows XP, Windows Vista, and Windows 7. Each product gets from 0 to 6 points for protection, repair, and usability, with a total of 12 required for certification. A surprising number of products have failed to reach certification in one or more tests. Others have scored as high as 16 of 18 possible points in all three.
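The AV-Test certification rule described above is simple arithmetic; a minimal sketch (the function name and signature are illustrative, not AV-Test's own):

```python
def av_test_certified(protection: float, repair: float, usability: float) -> bool:
    """AV-Test awards 0-6 points in each of three categories (protection,
    repair, usability) for a maximum of 18; a product is certified when
    its total reaches at least 12."""
    scores = (protection, repair, usability)
    if not all(0 <= s <= 6 for s in scores):
        raise ValueError("each category score must be between 0 and 6")
    return sum(scores) >= 12

# A product scoring 6 + 5 + 5 = 16 is certified; 4 + 4 + 3 = 11 is not.
```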
Austrian lab AV-Comparatives.org also keeps inventing new tests, but they run two specific types of test several times a year. The on-demand tests check a vendor's ability to detect a large collection of viruses and other malware samples. The retrospective tests force each product to use virus definitions from before the first appearance of the samples, thus testing the product's ability to detect new and unknown malware. They rate each tested product ADVANCED+, ADVANCED, or STANDARD; occasionally a product fails to even meet the criteria for a STANDARD rating.
When looking over results from the labs I have to consider the vendor rather than a specific program. Different tests may use different products or versions from the same vendor, so I take each test as an evaluation of the vendor's technology.
ICSA and West Coast Labs report on a vendor's certification only after success is achieved. Having their certifications is definitely good, but not having them typically means the vendor just didn't choose to participate. Likewise some vendors choose not to participate in Virus Bulletin's testing.
Because of the intensive nature of their testing, AV-Test.org and AV-Comparatives.org typically include just 15 to 20 products in a test. Here again, if a product isn't included I can't count its absence against it. On the other hand, I'm impressed with a product that all five labs consider important enough to test.
Keeping these facts in mind I've devised a system for aggregating test results into a rough overall score. This system may well change as the labs invent new tests, which is why it has to be rough. For AV-Comparatives.org I take the average of the on-demand and retrospective scores, counting ADVANCED+ as 3, ADVANCED as 2, and STANDARD as 1. I map the average of the three AV-Test.org results onto a range from 0 to 3. For Virus Bulletin I calculate the percentage of VB100 successes in the ten latest tests and also map that to 0, 1, 2, or 3. Then I average the three ratings. If a product doesn't have at least two of these three ratings I consider that there's just not enough information.
As noted, I have more confidence in a product that's tested by many labs. To account for that, I add to the average one tenth of a point for each lab that tested the product. At present, that's how I reach the aggregate rating of POOR, FAIR, GOOD, or EXCELLENT. This system will surely evolve with time, but I believe it does offer an easy way to summarize what the various labs have to say.
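The aggregation described above can be sketched in Python. Note the hedges: the article does not specify how the AV-Test average or the VB100 percentage is mapped onto 0-3, nor the thresholds for POOR/FAIR/GOOD/EXCELLENT, so those parts (a linear mapping, nearest-integer rounding, and guessed bucket boundaries) are assumptions, and all function names are hypothetical:

```python
RATING_POINTS = {"ADVANCED+": 3, "ADVANCED": 2, "STANDARD": 1}

def av_comparatives_score(on_demand: str, retrospective: str) -> float:
    """Average of the on-demand and retrospective ratings on a 0-3 scale."""
    return (RATING_POINTS[on_demand] + RATING_POINTS[retrospective]) / 2

def av_test_score(results: list[int]) -> float:
    """Map the average of the AV-Test results (0-18 points each) onto 0-3.
    A linear mapping is assumed; the article doesn't give the formula."""
    return (sum(results) / len(results)) * 3 / 18

def vb100_score(passes: int, tests: int = 10) -> int:
    """Map the VB100 pass rate over the latest ten tests onto 0, 1, 2, or 3.
    Nearest-integer rounding is assumed."""
    return round(3 * passes / tests)

def aggregate(scores: list[float], labs_tested: int) -> str:
    """Average the available 0-3 lab scores (at least two required),
    add 0.1 per lab that tested the product, and bucket the result.
    Bucket thresholds are not given in the article; these are guesses."""
    if len(scores) < 2:
        return "INSUFFICIENT DATA"
    value = sum(scores) / len(scores) + 0.1 * labs_tested
    for threshold, label in [(2.5, "EXCELLENT"), (1.5, "GOOD"), (0.75, "FAIR")]:
        if value >= threshold:
            return label
    return "POOR"
```

For example, a product rated ADVANCED+ on-demand and ADVANCED retrospectively contributes (3 + 2) / 2 = 2.5 from AV-Comparatives, and a perfect VB100 record contributes 3; tested by all five labs, the +0.5 bonus then pushes a strong average comfortably into the top bucket.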
Anthony Arrott is product manager for security analytics at Trend Micro. Among other duties, he coordinates Trend Micro’s participation in external benchmark testing programs that measure the protection commercial security software products provide to their customers.
Arrott was Director of Threat Research at anti-spyware vendor InterMute prior to its acquisition by Trend Micro in 2005. In 2007 Dr. Arrott led the project team for Trend Micro HijackThis v2.0, enhancing the popular malware diagnostic tool originally developed by Merijn Bellekom. Dr. Arrott earned his degrees at McGill University and M.I.T.