A bug in CrowdStrike’s quality-control system let through the faulty software update that crashed computers globally last week, the U.S. firm said on Wednesday, as losses mount following the outage, which disrupted services from aviation to banking.
The extent of the damage from the botched update is still being assessed. On Saturday, Microsoft said about 8.5 million Windows devices had been affected, and the U.S. House of Representatives Homeland Security Committee has sent a letter to CrowdStrike CEO George Kurtz asking him to testify.
The financial cost was also starting to come into focus on Wednesday. Insurer Parametrix said U.S. Fortune 500 companies, excluding Microsoft, will face $5.4 billion in losses as a result of the outage, and Malaysia’s digital minister called on CrowdStrike and Microsoft to consider compensating affected companies.
The outage happened because CrowdStrike’s Falcon, an advanced platform that protects systems from malicious software and hackers, contained a fault that forced computers running Microsoft’s Windows operating system to crash and show the “Blue Screen of Death”.
“Due to a bug in the Content Validator, one of the two Template Instances passed validation despite containing problematic content data,” CrowdStrike said in a statement, referring to the failure of an internal quality control mechanism that allowed the problematic data to slip through the company’s own safety checks.
CrowdStrike did not say what that content data was, nor why it was problematic. A “Template Instance” is a set of instructions that guides the software on what threats to look for and how to respond. CrowdStrike said it had added a “new check” to its quality control process in a bid to prevent the issue from occurring again.
There is no sign Microsoft plans to limit CrowdStrike’s access to the Windows operating system in the wake of the outage, a person familiar with the issue said on Wednesday.
CrowdStrike released information to fix affected systems last week, but experts said getting them back online would take time because doing so required manually weeding out the flawed code.
Wednesday’s statement was in line with a widely held assessment from cybersecurity experts that something in CrowdStrike’s quality control process had gone badly wrong.
The incident has also raised concerns among experts that many organisations are not well-prepared to implement contingency plans when a single point of failure such as an IT system, or a piece of software within it, goes down.