Enterprise IT administrators are dealing with a confusing array of patches and advice as they look for effective ways to fix the Meltdown and Spectre processor vulnerabilities.
A growing number of reports say that the patches and updates have bugs, including at least one that’s causing serious issues, and there are now reports that Intel is telling vendors to stop issuing fixes until it has time to sort the problems out.
This leaves IT departments trying to decide what systems to update, which systems to leave untouched and what actions to take next.
In an effort to get ahead of the issue, Intel CEO Brian Krzanich has acknowledged that the fixes that Intel has provided to prevent exploitation may be impacting performance.
In a statement issued on January 11, Krzanich said, “As we roll out software and firmware patches, we are learning a great deal. We know that impact on performance varies widely, based on the specific workload, platform configuration and mitigation technique.”
This is a significant change from earlier statements in which Intel said that any impact on the performance of systems with Intel processors would be minimal and that most users would never notice it. It turns out that not only is the impact significant in some cases, but the new patches may also cause some computers to reboot randomly.
Navin Shenoy, Intel’s general manager for its data center group, posted in a blog entry that the company is focusing on the unexpected reboot issue. “We have received reports from a few customers of higher system reboots after applying firmware updates. Specifically, these systems are running Intel Broadwell and Haswell CPUs for both client and data center.”
Shenoy said that if the reboot issue requires a firmware update, it will be distributed through normal channels. He also said that users should continue to apply updates provided by the maker of their computers or operating systems. However, this may not be as easy as it sounds.
According to a report in the Wall Street Journal, Intel is quietly advising system manufacturers to hold off on issuing patches to customers while it investigates bugs in the code that was developed to fix the problem. This is related to the Shenoy blog entry, but goes much further than the simple statement that the company is aware of a reboot problem and is working on it.
Apparently, the rumors that the processor security problem is worse than it seems may have a basis in reality. It also appears that the existence of this set of processor problems has been known for a while.
A blog entry by Google vice president Ben Treynor Sloss makes it clear that Google’s Project Zero uncovered the problem months before it was announced publicly. According to Sloss, Google began patching its cloud servers against part of the problem as early as September 2017.
However, Sloss also said that Google discovered that patching the most severe vulnerabilities would result in significant system slowdowns on the Google Cloud Platform and services. So the company looked for another solution.
After an effort that Sloss described as a “moonshot,” Google engineers were able to devise a software fix for the Intel processor problem that didn’t result in a slowdown and that had no other significant side effects. Google has released its solution as publicly available open source code so it can be widely adopted by other organizations.
However, the actual process of implementing Google’s fix, assuming it works in other cloud environments, is up to each organization. Even if it turns out to be as good as Google’s folks say it is, it’s still a solution that is a ways off, because vendors need time to incorporate Google’s suggestions, validate them, and then turn them into updates. At this point, it’s not even clear if other cloud providers are planning on adopting it.
So where does this leave you and your IT department? To some extent, you’re in limbo. If you have systems that have processors designed in the past five years, then Intel is going to issue a fix if it hasn’t already. Operating system vendors are rolling out their part of the fix where they can, although it’s not clear whether Google’s approach can be integrated with that.
For the time being, you’re going to have to live with the fact that Intel’s patches may contain bugs, like the reboot situation Shenoy discusses. But there is talk online that there may be other bugs in some patches. And worse, you may not get patches at the rate you’d expect if Intel is telling system manufacturers not to issue them right now.
Here is an approach that might work best. If you have machines that are fairly new and not in compute-intensive environments, then apply the patches as they come from the vendor, along with the patches that come from Microsoft or Apple or that are issued for your Linux distribution. They will lend an important level of security.
However, if you’re running machines with compute-intensive tasks, say mining bitcoins or rendering high-resolution video, you might want to avoid the microcode updates entirely until you know they’re stable and won’t significantly slow down your systems. This is one situation where the security fix might not be worth it.
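For an inventory of more than a handful of machines, the rule of thumb above can be written down as a small triage function. This is only a sketch of the advice in this column, not a vendor-endorsed policy. The `Machine` record, its two flags and the `patch_plan` function are hypothetical names made up for illustration, and it assumes you have already decided which machines count as "fairly new" and "compute-intensive."

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    fairly_new: bool          # your own judgment call, not a vendor definition
    compute_intensive: bool   # e.g. bitcoin mining or high-resolution rendering


def patch_plan(machine: Machine) -> dict:
    """Return a rough plan following the rule of thumb in this column."""
    if machine.compute_intensive:
        # Hold microcode updates until they are known to be stable.
        microcode = "defer"
    elif machine.fairly_new:
        # Apply vendor patches as they arrive.
        microcode = "apply"
    else:
        # The column gives no specific advice for older, lightly loaded machines.
        microcode = "no guidance"
    # Operating system patches are recommended in every case.
    return {"os_patches": "apply", "microcode": microcode}


if __name__ == "__main__":
    fleet = [
        Machine("office-laptop", fairly_new=True, compute_intensive=False),
        Machine("render-node", fairly_new=True, compute_intensive=True),
    ]
    for m in fleet:
        print(m.name, patch_plan(m))
```

The third branch is deliberately left as "no guidance" rather than guessed at, since that is a call only you can make for your own environment.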
After all, there’s currently no known exploit for Spectre or Meltdown, and while exploits will certainly emerge now that the details are known, it’s unclear how effective they will be and whether they can be successfully deployed in the real world.
This is one time when caution is probably more important than an immediate patch. But for those computers that really are performing compute-intensive tasks, you will want to make sure all other critical security measures are in place, including operating system patches. Then, once chip makers start making processors that aren’t subject to the Spectre and Meltdown vulnerabilities, upgrade your servers, which you will probably do anyway if you’re doing that much data crunching.
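On Linux machines, one way to confirm that the operating system side of the fix is in place is to read the status files that patched kernels publish under /sys/devices/system/cpu/vulnerabilities. The sketch below assumes that directory exists, which is only true on kernels that include the reporting interface. The function names are hypothetical, and the exact file names and status wording vary by kernel version, so treat the output as a hint rather than an audit.

```python
from pathlib import Path

# Location used by patched Linux kernels; absent on older or non-Linux systems.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(base: Path = SYSFS_DIR) -> dict:
    """Map each vulnerability file name to the kernel's one-line status."""
    if not base.is_dir():
        return {}  # interface not available on this system
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


def unmitigated(status: dict) -> list:
    """Names whose status line starts with 'Vulnerable'."""
    return [name for name, text in status.items() if text.startswith("Vulnerable")]


if __name__ == "__main__":
    status = read_mitigation_status()
    if not status:
        print("No vulnerability status reported by this kernel.")
    for name, text in status.items():
        print(f"{name}: {text}")
```

An empty result means only that the kernel does not report status, not that the machine is safe.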
I know it’s weird for me to say to be cautious about updates, but in this case it might make sense. But only you can make the call about the balance between performance and security.