Why Perfect Hardware SNES Emulation Requires a 3GHz CPU

By Wesley Fenlon

A 3GHz processor runs at about 140 times the clock speed of the chip inside the Super Nintendo. Here's where all that power goes.

Properly emulating old video game and computer hardware requires an enormous amount of work. Instructions have to be translated from one platform to another, hardware systems have to be recreated in software, and doing all that accurately requires a processor far more powerful than the original chip chugging along inside decades-old consoles. The Wii's processor doesn't even reach 1GHz, yet emulating some of its games takes a PC running well beyond 3GHz to maintain a solid framerate.

And emulation isn't just about speed. For purists interested in experiencing a game just as it once was, accuracy is everything. byuu, creator of the SNES emulator bsnes, recently wrote an extensive article explaining the ins and outs of hardware-perfect emulation. Read on for the highlights, including why bsnes requires a 3GHz processor to emulate the Super Nintendo's 21.47MHz CPU.

byuu lays out two reasons he works so hard to achieve accurate emulation: performance and preservation. While most SNES games run fine on emulators like Snes9x and ZSNES, that's partially because such emulators use a few tricks and speed hacks to mimic perfect compatibility:

It is possible for a well-optimized, speed-oriented SNES emulator to run at full speed using only 300MHz of processing power. You will also end up with hundreds of obscure bugs.

What typically happens is that the problems are specifically hacked around. Both ZSNES and Snes9X contain internal lists of the most popular fifty or so games. When you load those games, the emulators tweak their timing values and patch out certain areas of code to get these games running. It's an improvement over the Nesticle days of the games themselves being hacked externally, but it is still cheating, regardless of the visual end results.

The casual gamer who only plays the most popular twenty or so titles will see no visible differences between an emulator requiring 300MHz and another requiring 3GHz, so they will of course go with the former. Although I do respect and appreciate speed-oriented emulators, one concerned with accuracy can't help but lament the way this approach stalls progress. Without more players using the more accurate emulators, we won't find the bugs in all the games the emulator supports. The more people we have playing the games in the way they were intended, the better the emulator can become as issues are found and stomped out—not by fixing specific code for each game, but by fixing the accuracy of the emulator.

byuu's article cites multiple examples of games that don't perform properly on other emulators. For instance, the spinning triangles that form the Triforce on the A Link to the Past title screen move too quickly in ZSNES, since the emulator actually runs faster than the original hardware. These are often small bugs that affect timing and audio in obscure games, but purists will notice and care that the experience isn't perfect.

byuu holds preservation in high regard as well, and makes the point that someday these games will be impossible to play on legacy hardware. Technology evolves, and hardware even older and far more obscure than the SNES, like Nintendo's Game & Watch handhelds, will eventually be gone forever. The only way to preserve those experiences--and the technology itself--is to emulate everything as accurately as possible. A focus on speed hacks and custom emulator design makes preservation more difficult and dooms ROM hacks and fan translation projects to limited lifespans:

You have to realize that emulators, too, have shelf-lives. That's especially true for ones such as ZSNES that are written in pure x86 assembly. You simply can't run this on your cell phone. By locking a hack to run only on ZSNES, you are dooming your hack to irrelevance. As soon as Windows drops 32-bit backwards compatibility, just as it has already done with 16-bit backwards compatibility, these fan translations and hacks will be lost forever. At that point the emulator itself becomes almost like another dead console, instead of a way to keep the old games alive.

And there's that whole nagging issue that SNES games should run on, you know, an actual SNES console. By creating emulators that mimic the original hardware perfectly, or as close as we can get, we create a platform that can work on multiple systems and allow games and hacks to run on future versions. The idea is to keep the specifics of the SNES hardware alive, not just the idea of the games.

So why does perfect emulation take 3GHz of processing power? This is the most interesting part for tech heads:

The primary demands of an emulator are the amount of times per second one processor must synchronize with another. An emulator is an inherently serial process. Attempting to rely on today's multi-core processors leads to all kinds of timing problems. Take the analogy of an assembly line: one person unloads the boxes, another person scans them, another opens them, another starts putting the item together, etc. Synchronization is the equivalent of stalling out and clearing the entire assembly line, then starting over on a new product. It's an incredible hit to throughput. It completely negates the benefits of pipelining and out-of-order execution. The more you have to synchronize, the faster your assembly line has to move to keep up.

Let us compare the synchronization ratios between ZSNES and bsnes:

Chip     ZSNES      bsnes

S-CPU    600,000    21,477,272

S-SMP    256,000    24,576,000

S-DSP    32,000     24,576,000

S-PPU    15,720     21,477,272

Total    903,720    92,106,544
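As a quick sanity check on those figures, the totals and the overall ratio can be recomputed in a few lines of Python. This is only an illustration: the dictionary and function names are made up for this sketch, and the numbers are simply the ones listed above, read as synchronizations per second.

```python
# Synchronizations per second as listed above: (ZSNES, bsnes).
# The names SYNCS_PER_SECOND, totals and ratio are hypothetical,
# chosen for this sketch only.
SYNCS_PER_SECOND = {
    "S-CPU": (600_000, 21_477_272),
    "S-SMP": (256_000, 24_576_000),
    "S-DSP": (32_000, 24_576_000),
    "S-PPU": (15_720, 21_477_272),
}


def totals(table):
    """Return (ZSNES total, bsnes total) across all four chips."""
    zsnes = sum(z for z, _ in table.values())
    bsnes = sum(b for _, b in table.values())
    return zsnes, bsnes


def ratio(table):
    """Return how many times more often bsnes synchronizes overall."""
    zsnes, bsnes = totals(table)
    return bsnes / zsnes


if __name__ == "__main__":
    for chip, (z, b) in SYNCS_PER_SECOND.items():
        print(f"{chip}: bsnes synchronizes {b / z:,.0f}x as often")
    z_total, b_total = totals(SYNCS_PER_SECOND)
    print(f"Total: {z_total:,} vs {b_total:,} ({ratio(SYNCS_PER_SECOND):.1f}x)")
```

The overall ratio comes out at roughly 102 to 1, and the per-chip breakdown shows that the gap is widest for the S-PPU and S-DSP.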

Let's start with the CPU, which is typically assumed to be running at 3.58MHz. This rate applies to number of cycles executed per second. A typical instruction can consume four to eight cycles, and ZSNES synchronizes once per instruction. But to get even more technical, cycles are broken down into bus hold delays which require timing at the raw oscillator level. The SNES CPU oscillator is rated at 21.47MHz. The same applies to the SMP, whose oscillator is rated at 24.58MHz...

So although bsnes runs ten times slower than ZSNES, it is literally up to one hundred times more precise. In truth, it is actually very impressive that this is possible at a mere 3GHz. Only because I've utilized cooperative multithreading and just-in-time synchronization, techniques I've never before seen used in emulation, have I managed to eke out the performance bsnes currently has.
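To make "cooperative multithreading" and "just-in-time synchronization" a little more concrete, here is a minimal Python sketch of the general idea. It is not bsnes code, and everything in it is an assumption made for illustration: the two toy "chips," their cycle costs, and the shared port are all invented. Each chip is a generator that runs ahead on private work and only hands control back right before it touches shared state, and the scheduler always resumes whichever chip's next shared access comes earliest in emulated time.

```python
def cpu(shared, log):
    """Hypothetical CPU thread: private work, then one read of a shared port."""
    clock = 0
    while True:
        clock += 3 * 6          # private work that no other chip can observe
        yield clock             # sync point: let the other chip catch up first
        log.append((clock, "cpu read", shared["port"]))
        clock += 6              # the read itself costs time


def smp(shared, log):
    """Hypothetical audio-chip thread: writes an increasing counter to the port."""
    clock = 0
    value = 0
    while True:
        clock += 10             # private work
        yield clock             # sync point before touching shared state
        value += 1
        shared["port"] = value
        log.append((clock, "smp write", value))


def run(until):
    """Run both chips until every pending shared access lies past `until`."""
    shared = {"port": 0}
    log = []
    threads = {"cpu": cpu(shared, log), "smp": smp(shared, log)}
    # Prime each generator so it stops at its first sync point.
    pending = {name: next(thread) for name, thread in threads.items()}
    switches = 0
    while min(pending.values()) <= until:
        # Resume the chip whose next shared access is earliest in emulated time.
        name = min(pending, key=pending.get)
        pending[name] = next(threads[name])
        switches += 1
    return log, switches


if __name__ == "__main__":
    events, switches = run(60)
    for clock, what, value in events:
        print(f"t={clock:3d}  {what:9s}  {value}")
    print("context switches:", switches)
```

In this toy version the chips only switch when one of them is about to touch the shared port, so the number of switches tracks the number of shared accesses rather than the number of clock ticks. The finer the granularity at which chips can observe each other, the more sync points there are, which is the cost the figures above describe.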

byuu goes on to detail the performance differences between low-level and high-level emulation, which is especially fascinating when you factor in the specialized digital signal processors built into select Super Nintendo carts like Super Mario Kart and Pilotwings. bsnes' accuracy in this area stems from an incredibly complex process of decapping the DSP chips, stripping away their packaging, and scanning the exposed surfaces with an electron microscope to determine the program code. That's dedication. If you're fascinated by the effort that goes into preserving legacy hardware, give the entire story a read.

Image credit: Ars Technica