
Measuring Security
PC Magazine's Neil Rubenking has written an article describing how he uses the results of independent test labs to write product reviews of consumer AV software. It isn't how we'd do it, but his opinion carries enormous weight in consumer security software sales.
How PC Magazine interprets independent AV lab tests
Neil Rubenking of PC Magazine probably swings more sales of consumer AV products than any other single person (except maybe some key decision maker at Best Buy/Geek Squad). Last month, he wrote a magazine article: “How We Interpret Antivirus Lab Tests”.
My own observation is that Rubenking’s results-mixing algorithm does little to account for the quality of a test – that is, how well a particular lab’s test measures the actual protection a security product provides to customers against today’s mostly web-borne malware threats. If it did, NSS Labs and Dennis Tech Labs would be included among his reference test labs and he would adopt AMTSO’s guidelines for “whole product testing” – that is, making sure in a test that all of a security product’s multi-layer protection technologies are given an opportunity to block an attack. Most of the tests that Rubenking relies on test only on-demand scanning of malware artificially copied onto the test computer’s hard drive. Trend Micro insists that independent test labs source threats live from the internet at the time of the test.
But Rubenking’s product reviews carry enormous influence.
However, if we tuned our anti-malware technologies with laser focus to do well by Rubenking’s criteria, Trend Micro would get creamed in enterprise assessments of OfficeScan and, more importantly, we would not be protecting our customers from the real threats most likely to hurt them. Rubenking’s criteria have little to do with measuring protection from the Smart Protection Network.
(I know, I know, “Let’s do both: good actual protection for customers and good according to Rubenking’s old-fashioned metrics.” – easier said than done.)
The good news is that the independent test labs are adjusting to bad-guy innovation faster than Rubenking. AV-Test and AV-Comparatives both perform better tests that Rubenking does not use in his mixing algorithm.
More info at:
trendmicro.com: http://us.trendmicro.com/us/trendwatch/core-techno
iShare: http://tw.ishare2.trendmicro.com/sites/xb/SitePage
_________________________________________________
http://www.pcmag.com/article2/0,2817,2381924,00.as
PC Magazine
14 March 2011
How We Interpret Antivirus Lab Tests
by Neil J. Rubenking
When reviewing an antivirus or security suite product, I always perform hands-on testing of the product's ability to clean up malware-infested systems and to protect a clean system against attack. I'm just one person, though, so I can't come near the exhaustive evaluations performed by the independent testing labs. To supplement my own tests I look at results from five major labs, all of them members of the Anti-Malware Testing Standards Organization (AMTSO).
Five Independent Labs
West Coast Labs and ICSA Labs will check the ability of a vendor's technology to detect a vast number of malware samples, and will separately evaluate how well it cleans up the infestation. Virus Bulletin regularly tests security products against their list of viruses in the wild. To attain the VB100 certification, a product must detect all the threats without erroneously flagging any good programs. I look at the ten most recent tests. If a vendor's security technology has achieved VB100 certification in all ten, that's an impressive achievement.
AV-Test.org, based in Germany, keeps inventing new and better tests. Their latest set involves certifying products for antivirus protection under Windows XP, Windows Vista, and Windows 7. Each product gets from 0 to 6 points for protection, repair, and usability, with a total of 12 required for certification. A surprising number of products have failed to reach certification in one or more tests. Others have scored as high as 16 of 18 possible points in all three.
Austrian lab AV-Comparatives.org also keeps inventing new tests, but they run two specific types of test several times a year. The on-demand tests check a vendor's ability to detect a large collection of viruses and other malware samples. The retrospective tests force each product to use virus definitions from before the first appearance of the samples, thus testing the product's ability to detect new and unknown malware. They rate each tested product ADVANCED+, ADVANCED, or STANDARD; occasionally a product fails to even meet the criteria for a STANDARD rating.
Interpreting Results
When looking over results from the labs I have to consider the vendor rather than a specific program. Different tests may use different products or versions from the same vendor, so I take each test as an evaluation of the vendor's technology.
ICSA and West Coast Labs report on a vendor's certification only after success is achieved. Having their certifications is definitely good, but not having them typically means the vendor just didn't choose to participate. Likewise some vendors choose not to participate in Virus Bulletin's testing.
Because of the intensive nature of their testing, AV-Test.org and AV-Comparatives.org typically include just 15 to 20 products in a test. Here again, if a product isn't included I can't count its absence against it. On the other hand, I'm impressed with a product that all five labs consider important enough to test.
Keeping these facts in mind I've devised a system for aggregating test results into a rough overall score. This system may well change as the labs invent new tests, which is why it has to be rough. For AV-Comparatives.org I take the average of the on-demand and retrospective scores, counting ADVANCED+ as 3 and STANDARD as 1. I map the average of the three AV-Test.org results onto a range from 0 to 3. For Virus Bulletin I calculate the percentage of VB100 successes in the ten latest tests and also map that to 0, 1, 2, or 3. Then I average the three ratings. If a product doesn't have at least two of these three ratings I consider that there's just not enough information.
As noted, I have more confidence in a product that's tested by many labs. To account for that, I add to the average one tenth of a point for each lab that tested the product. At present, that's how I reach the aggregate rating of POOR, FAIR, GOOD, or EXCELLENT. This system will surely evolve with time, but I believe it does offer an easy way to summarize what the various labs have to say.
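To make the mixing algorithm described above concrete, here is a rough Python sketch of it. This is our reading of the article, not Rubenking's actual worksheet. The article gives only ADVANCED+ = 3 and STANDARD = 1; treating ADVANCED as 2 and a failed test as 0 is an assumption. The linear scaling of the AV-Test totals and the rounding used for the VB100 percentage are also assumptions, and all function names are hypothetical. The article does not give the cutoffs for POOR, FAIR, GOOD, or EXCELLENT, so the sketch stops at the numeric score.

```python
from typing import Optional, Sequence

# ADVANCED+ = 3 and STANDARD = 1 come from the article. ADVANCED = 2 and
# FAILED = 0 are assumptions that fill in the rest of the scale.
AVC_POINTS = {"ADVANCED+": 3, "ADVANCED": 2, "STANDARD": 1, "FAILED": 0}


def avc_rating(on_demand: str, retrospective: str) -> float:
    """Average the on-demand and retrospective AV-Comparatives ratings."""
    return (AVC_POINTS[on_demand] + AVC_POINTS[retrospective]) / 2


def avtest_rating(totals: Sequence[float], max_points: float = 18.0) -> float:
    """Scale the mean AV-Test total (0..18) onto 0..3 (assumed linear mapping)."""
    average_total = sum(totals) / len(totals)
    return average_total * 3 / max_points


def vb100_rating(passes: int, tests: int = 10) -> int:
    """Map the VB100 pass fraction onto 0, 1, 2, or 3 (assumed rounding rule)."""
    return int(passes / tests * 3 + 0.5)  # round half up


def aggregate_score(
    avc: Optional[float],
    avtest: Optional[float],
    vb100: Optional[float],
    labs_tested: int,
) -> Optional[float]:
    """Average the available ratings and add 0.1 per lab that tested the vendor.

    Returns None when fewer than two of the three ratings are available,
    matching the article's "not enough information" case.
    """
    ratings = [r for r in (avc, avtest, vb100) if r is not None]
    if len(ratings) < 2:
        return None
    return sum(ratings) / len(ratings) + 0.1 * labs_tested


if __name__ == "__main__":
    # Hypothetical vendor, invented purely for illustration.
    score = aggregate_score(
        avc_rating("ADVANCED+", "ADVANCED"),
        avtest_rating([15, 14, 16]),
        vb100_rating(passes=9),
        labs_tested=5,
    )
    print(f"aggregate score: {score:.2f}")
```

Note what the sketch makes obvious: every lab's rating counts equally, and nothing in the formula reflects how a lab sources its samples or whether it exercises all of a product's protection layers.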
Copyright 1996-2011 Ziff Davis, Inc.
Copyright (c) 1989-2012 Trend Micro Incorporated. All rights reserved.
