Nearly Everyone Cheats on Android Benchmarks

By Wesley Fenlon

Samsung, HTC, LG, Asus--they're all detecting when certain benchmarks are running and messing with CPU or GPU performance.

With the Galaxy S 4, and now the Galaxy Note 3, Samsung has been caught cheating on benchmarks. Perhaps cheating is a strong word, but investigations discovered that Samsung's devices detected when benchmarks were being run and raised their GPU thermal limits and CPU voltages to the max to deliver the best possible performance immediately. The Galaxy S 4 and Galaxy Note 3 were turning in performance numbers they were capable of attaining, then--the scores weren't a complete lie. But were they reflective of real-world performance? Not at all.

Anandtech investigated Samsung's benchmark scores back in July, and now they're taking another look thanks to the Galaxy Note 3. Turns out, Samsung isn't the only company that tries to make Android devices look better on benchmark scores than they perform in real use. Just about everyone does it.

Nexus devices don't cheat in 3DMark, but almost everyone else does.

"With the exception of Apple and Motorola, literally every single OEM we’ve worked with ships (or has shipped) at least one device that runs this silly CPU optimization," writes Anandtech. In a large table of mobile devices and benchmarks, Asus, HTC, and Samsung all have devices that cheat on certain benchmarks. Amusingly, they don't cheat on all of the benchmarks--just certain ones, and which ones they target differs from device to device. The Galaxy Note 3 is the worst perpetrator, cheating on six of the seven benchmarks Anandtech tested.

Interestingly, even Samsung's Galaxy Tab 3, which runs on an Intel Atom processor, exhibits this behavior. "I know internally Intel is quite opposed to the practice (as I’m assuming Qualcomm is as well), making this an OEM level decision and not something advocated by the chip makers (although none of them publicly chastise partners for engaging in the activity...)," Anandtech writes.

CPU and GPU benchmarks see different forms of cheating. Almost everyone is cheating on the CPU tests by detecting when benchmarks are being run; ironically, this means almost every device is seeing a similarly small bump to its performance numbers. "The hilarious part of all of this is we’re still talking about small gains in performance," writes Anandtech. "The impact on our CPU tests is 0 - 5%, and somewhere south of 10% on our GPU benchmarks as far as we can tell. I can't stress enough that it would be far less painful for the OEMs to just stop this nonsense and instead demand better performance/power efficiency from their silicon vendors."
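To put those percentages in concrete terms, here is a rough sketch of the arithmetic behind a figure like "0 - 5%": compare a score from a run the device detects with a score from a run it doesn't. The function name and both scores below are made up for illustration; they are not Anandtech's data or tooling.

```python
def inflation_percent(detected_score: float, undetected_score: float) -> float:
    """Percent by which the detected run beats the undetected run."""
    if undetected_score <= 0:
        raise ValueError("undetected_score must be positive")
    return (detected_score - undetected_score) * 100.0 / undetected_score


# Made-up scores, purely illustrative: the same benchmark run twice,
# once under a name the device recognizes and once under a name it doesn't.
gain = inflation_percent(detected_score=1050.0, undetected_score=1000.0)
print(f"{gain:.1f}%")  # prints "5.0%"
```

A gap that small is easy to lose in run-to-run noise, which is part of why Anandtech calls the whole exercise silly.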

On the GPU side, Anandtech points specifically to Samsung and HTC for detecting when a benchmark is being run and fudging performance. The only solution, for now, is continuously evolving benchmarks to prevent the OEMs from cheating. Thankfully, most of the tests Anandtech performs aren't affected by the benchmark trickery. And a big part of defeating benchmark detection will be renaming the tests to make them more difficult to detect.

Anandtech writes: "We’ve been working with all of the benchmark vendors to try and stay one step ahead of the optimizations as much as possible. [GFX/GLBench] is working on some neat stuff internally, and we’ve always had a great relationship with all of the other vendors - many of whom are up in arms about this whole thing and have been working on ways to defeat it long before now. There’s also a tremendous amount of pressure the silicon vendors can put on their partners (although not quite as much as in the PC space, yet), not to mention Google could try to flex its muscle here as well. The best we can do is continue to keep our test suite a moving target, avoid using benchmarks that are very easily gamed and mostly meaningless, continue to work with the OEMs in trying to get them to stop (though tough for the international ones) and work with the benchmark vendors to defeat optimizations as they are discovered. We're presently doing all of these things and we have no plans to stop. Literally all of our benchmarks have either been renamed or are in the process of being renamed to non-public names in order to ensure simple app detects don't do anything going forward."
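Why does renaming help? A "simple app detect" can be as crude as matching the running app's name against a hard-coded list. The sketch below is a toy model of that idea, not code from any real device: the package names, the profile labels, and the function are all hypothetical.

```python
# Hypothetical package names; not taken from any real firmware.
BOOSTED_PACKAGES = {
    "com.example.cpubench",
    "com.example.gpubench",
}


def select_profile(foreground_package: str) -> str:
    """Return the performance profile a name-matching detector would pick."""
    if foreground_package in BOOSTED_PACKAGES:
        # Stand-in for "pin clocks high and relax thermal limits".
        return "boost"
    return "normal"


# The public build is recognized; the same benchmark shipped under a
# private name falls through to the normal profile.
print(select_profile("com.example.cpubench"))
print(select_profile("com.reviewer.private.cpubench"))
```

Under that assumption, a benchmark distributed to reviewers under a non-public name never matches the list, so the device behaves the way it would for any other app. A detector that looked at workload behavior rather than names would not be fooled so easily, which is presumably why Anandtech describes its test suite as a moving target.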

Anandtech's article offers more detail on the CPU and GPU optimizations, including a boatload of charts and graphs. For now, if you want to buy an Android device and don't want to support this benchmark business--well, you'll probably want a Nexus.