Tales from Prismata development: The Shift Bug


Sometimes, in large software projects, bugs appear for seemingly no reason, and efforts to fix them go absolutely nowhere.

This is a tale of one such bug. It’s an ugly one.

The Shift Bug

On March 8th, we deployed an update to Prismata that included a number of in-game graphical enhancements. The same week, we also debuted a prerelease version of our Windows build, which was subsequently made officially available on March 14th. Around that time, we received a number of troubling bug reports, most of which looked like the following:

———  Steps to reproduce (please be as detailed as possible):    ———-

Buy 1 cryo, press shift and click it.

———  Expected result:    ———-

It does everything normally.

———  Actual result:    ———-

Game slows down, the targeting is super floaty.

———  Any other information (please include replay codes if relevant):    ———-

If I stop holding down shift game returns to normal speed.

Side note: not all of our bug reports arrive in such good condition…

Using shift with big amount of units during targeting couses laag in the new client.
Try byuying 10 cryo and using them through shift (have to target different groups, laag satrs after the dirst group of units is targeted.)

These bug reports immediately worried us. Many of our graphical improvements were intended to reduce sources of lag and low frame rates, so seeing reports of new lag was quite troubling. The timing of these reports led us to initially suspect that something had gone wrong in the March 8th patch, but further testing would later reveal something much more sinister.

 

Initial Investigation

At first, nobody at Lunarch could reproduce the bug. We followed the same steps that were described in the bug reports, but didn’t observe any noticeable slowdown. This can sometimes happen with rare bugs that are highly specific to a given user’s configuration. However, we continued to receive several reports of terrible lag during shift-clicking, breaching, and Cryo Ray usage, so we became increasingly determined to track down the problem.

When bugs turn out to be difficult to reproduce, we often turn to members of our community to help provide clues as to what might be causing the problem. Prismata itself tracks some information that can help; for example, each bug report is tagged with a time and date, user ID, and which browser is being used. However, this is often not enough; sometimes we need more information from users, and ask them to try various things. (Does it still happen if you switch browsers? What about incognito mode? Can you try it with a different Prismata account?) 

Bugfinder

If you see somebody with a bugfinder badge like the one shown above, they’ve probably helped us with one of these types of investigations.

As luck would have it, we soon gained the ability to investigate the bug on our own, as I encountered the slowdown myself! Spontaneously, during this game, I experienced a staggering drop in frame rate while using Cryo Rays. While shift-clicking them and targeting my opponent’s Walls, Prismata suddenly dropped from a buttery smooth 60fps to a stuttering, hideous, unplayable 2-frame-per-second slide show. The behaviour subsided immediately afterwards.

Another side note: the game itself was played against Apooche, whom you might remember from his appearance at the Prismata Alpha World Championships (you can read his profile here). Apooche thoroughly kicked my ass that game and currently occupies the #1 spot on the Prismata leaderboard, and I would highly recommend checking out his Twitch channel if you’re interested in high-level Prismata.

Incidentally, Apooche was streaming on Twitch at the time I played him, and I had the stream open during the game and was communicating with him via the Twitch chat. As it turned out, this would become critical as we learned more about the bug.

After the game, some further experimentation revealed a number of other interesting phenomena—it turned out that the slowdown in gameplay didn’t merely occur when shift-clicking Cryo Rays or shift-breaching. In fact, I was able to get Prismata to slow down simply by holding shift in the lobby!

 

Digging Deeper

Unfortunately, the bug remained elusive; it didn’t happen all the time. However, it didn’t take too long to find a reliable way to reproduce the problem. As it turned out, simply having a Twitch stream open was sufficient to cause terrible lag in Prismata whenever the shift key was held.

Further experimentation revealed the following:

  • Twitch streams weren’t the only triggers for the problem; having almost any other Flash content open would be sufficient to cause Prismata to stutter when shift was held. Even a second Prismata window could trigger the issue (but holding shift would only slow the Prismata game that was in focus at the time; they wouldn’t all slow down).
  • The problem was not unique to the shift key. Holding down any key on the keyboard seemed to cause the same problem (however, bug reports focused on the shift key because it’s the only key that is routinely held during actual Prismata games).
  • The problem never occurred in our in-house debug version of Prismata or the Windows desktop version of Prismata; it was a web-only bug.
  • Even among web versions, we’ve only ever been able to witness the problem in the Chrome browser. Firefox and IE seemed fine.
  • Reverting to an older version of Prismata did not help!  We even tried deploying old Prismata versions from 2015, but the bug persisted across all of them. (This ruled out the March patches as potential culprits, but left us with more questions than answers—why had nobody reported the shift-lag sooner? Did it only start happening in March?)

We began to suspect that multiple Flash instances were inducing some kind of strain, either on Chrome or on the ActionScript Virtual Machine that runs Flash content in the browser, and that Prismata just couldn’t operate correctly under this kind of strain. Perhaps a keyboard-triggered bug caused a sequence of failures leading to low frame rate in these conditions.

As it turned out, the problem goes well past Prismata.

 

Diagnosis, Perhaps?

We use a number of industry-standard tools to test and profile our code to make sure everything is running as it should be. One of those tools—Adobe Scout—revealed something pretty unusual when profiling Prismata in the browser:

scout

As you can see above, the frame rate sank to an abysmal 2.5fps over a period of several seconds. (You can see in the graph on the left that Prismata was only able to render about 15 frames during 6 seconds from 0:34 to 0:40.) However, the actual CPU time used by Prismata was very small, and only consumed 31% of the available budget for running at 60fps. We only expect to see frame rate drops when the blue bars on the graph on the left grow high enough to pass the dark red line above them.

The scary bit is that “Inactive” entry in the bottom right. It shows the number of milliseconds during which Prismata was doing absolutely nothing, simply waiting for the browser to give it a chance to draw the next frame.
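
To make the numbers from the Scout capture a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from Prismata's code; the function and variable names are made up for illustration, and it assumes the 31% figure refers to the per-frame time budget at 60fps.

```python
# Back-of-the-envelope check of the figures quoted from the Scout capture.
# All names here are hypothetical; none of this comes from Prismata's code.

TARGET_FPS = 60  # frame rate Prismata aims for


def observed_fps(frames: int, seconds: float) -> float:
    """Average frame rate over a measurement window."""
    return frames / seconds


def frame_budget_ms(target_fps: float = TARGET_FPS) -> float:
    """Time available per frame, in milliseconds, at the target frame rate."""
    return 1000.0 / target_fps


fps = observed_fps(frames=15, seconds=6)  # about 15 frames from 0:34 to 0:40
budget_ms = frame_budget_ms()             # per-frame budget at 60fps
interval_ms = 1000.0 / fps                # average time between rendered frames

# Assumption: "31% of the budget" means 31% of the per-frame budget at 60fps.
active_ms = 0.31 * budget_ms
waiting_ms = interval_ms - active_ms      # time per frame not spent on Prismata's own work

print(f"observed frame rate: {fps:.1f} fps")
print(f"frame budget at 60fps: {budget_ms:.1f} ms")
print(f"average frame interval: {interval_ms:.0f} ms")
print(f"estimated active time per frame: {active_ms:.1f} ms")
print(f"estimated waiting time per frame: {waiting_ms:.1f} ms")
```

Under those assumptions, Prismata was doing roughly 5 ms of work per frame while about 400 ms passed between frames, which is consistent with nearly all of the time being spent waiting.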

At this point, I began to suspect that the problem had nothing to do with Prismata at all. However, I’m always hesitant to arrive at that conclusion—Chrome and Flash are much more mature pieces of software than Prismata, so for any given defect, it’s astronomically more likely that Prismata is at fault. Plus, the slowdown was only ever observed in Prismata itself, and Prismata had to be in focus for keyboard presses to cause a problem.

One simple experiment would provide the final answer.

I loaded up a bunch of other Flash content to see if I could observe the same behaviour in other games. And sure enough, it seemed that the shift bug was everywhere! For example, I was able to get Punch the Trump to stall to 1fps by opening up several copies of the game and holding shift (with no Prismata open at all).

We noticed a few other things:

  • Some Chrome extensions made things worse. I noticed a considerable improvement when I disabled one called “Video Speed Controller” (which I use to add custom keyboard controls for YouTube and to watch videos at speeds higher than 200%).
  • The beta and developer versions of Chrome still seem to have the problem.

 

Final Verdict

At this stage, my best guess is that either Chrome or the Flash Player itself has a problem with keyboard events. It seems that if there are multiple running Flash games or Chrome extensions that listen for keyboard input, the whole system can be abruptly brought to a grinding halt if it encounters any strain. It seems to be a recent problem, possibly dating from around March 8th, 2016, when our users started reporting it.

For this one, Prismata is exonerated. Given that other Flash content has the same problem, it’s unlikely to be us.

If you run into this bug, our best advice is the following: check your Chrome extensions. Try disabling them. Try closing Twitch or other video streaming applications that might be using Flash.

Finally, if you can find a way to prevent this bug from happening, let us know and we’ll spread the word!

 


About Elyot Grant

A former gold medalist in national competitions in both mathematics and computer science, Elyot has long refused to enjoy anything except video games. Elyot took more pride in winning the Reddit Starcraft Tournament than he did in earning the Computing Research Association's most prestigious research award in North America. Decried for wasting his talents, Elyot founded Lunarch Studios to pursue his true passion.