CrowdStrike Apologizes for IT Outage, Defends Microsoft Kernel Access

Cybersecurity giant CrowdStrike has apologized for “letting customers down” after a faulty update to its Falcon sensor disabled millions of PCs on July 19.

Adam Meyers, VP for counter-adversary operations at CrowdStrike, appeared before a US congressional committee on September 24 to answer questions about the mistake that crashed roughly 8.5m computers running Windows and forced them to display Microsoft’s infamous blue screen of death (BSOD).

The US House Committee on Homeland Security had requested public testimony from CrowdStrike CEO George Kurtz on July 22.

Kurtz promised to testify once the incident was fully resolved, but the company eventually chose to send Meyers instead.

CrowdStrike Falcon Sensor’s Fault Explained

Meyers told the committee that the "perfect storm" stemmed from a "mismatch between input parameters and predefined rules" in the update.

“On July 19, 2024, new threat detection configurations were validated through regular validation procedures and sent to sensors running on Microsoft Windows devices. However, the configurations were not understood by the Falcon sensor’s rules engine, leading affected sensors to malfunction until the problematic configurations were replaced,” Meyers explained.

CrowdStrike’s Efforts to Reboot Affected Systems

Meyers also provided details on how CrowdStrike helped restore systems affected by the outage.

On July 22, the company introduced automated techniques to accelerate remediation.

Following this, CrowdStrike staff were deployed to assist customers in recovering their systems.

The main challenge with this outage was that rebooting an affected machine required physical access to it.

“As of July 29, virtually all of our customers’ systems were back up and running,” Meyers confirmed.

CrowdStrike is still facing multiple lawsuits following the July outage.

These include suits brought by the company’s own shareholders as well as by Delta Air Lines.

Delta has accused CrowdStrike of "negligence" and claims to have lost $500 million due to the outage, which caused thousands of flight cancellations.

CrowdStrike’s Measures to Prevent Similar Incidents

Meyers shared some of CrowdStrike’s efforts to ensure such an incident never happens again.

These improvements include:

  • Validation: CrowdStrike has introduced new validation checks to help ensure that the number of inputs expected by the sensor and its predefined rules matches the number of threat detection configurations provided
  • Testing: The company has enhanced existing testing procedures to cover a broader array of scenarios
  • Customer Control: CrowdStrike’s customers now have more control over the deployment of configuration updates to their systems
  • Rollouts: CrowdStrike now uses a phased approach to rollouts of threat-detection updates, which means customers do not have to implement updates immediately
  • Safeguards: The company has added runtime checks to the system, designed to ensure that the data provided matches the system’s expectations before any processing occurs
  • Third-Party Reviews: Two independent third-party software security vendors have been hired to conduct further reviews of the Falcon sensor code and of the end-to-end quality control and release process
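
As a rough illustration of the validation and safeguard items above, the sketch below shows what a release-time input-count check and a runtime bounds check could look like. It is not CrowdStrike's code: the names RuleTemplate, ConfigMismatchError, validate_before_release and read_input are all hypothetical, and the input counts are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RuleTemplate:
    """Hypothetical rule definition: how many input fields the rule refers to."""
    name: str
    expected_inputs: int


class ConfigMismatchError(ValueError):
    """Raised when a rule does not match what the sensor supplies."""


def validate_before_release(template: RuleTemplate, sensor_inputs: int) -> None:
    """Release-time check (assumed design): refuse to ship a rule whose
    input count differs from the number of inputs the sensor provides."""
    if template.expected_inputs != sensor_inputs:
        raise ConfigMismatchError(
            f"{template.name}: rule expects {template.expected_inputs} inputs, "
            f"sensor supplies {sensor_inputs}"
        )


def read_input(inputs: List[str], index: int) -> Optional[str]:
    """Runtime safeguard (assumed design): check the index before reading,
    and return None so the caller can skip the rule instead of reading
    past the end of the data."""
    if 0 <= index < len(inputs):
        return inputs[index]
    return None


if __name__ == "__main__":
    sensor_inputs = 3  # made-up number of inputs the sensor supplies
    validate_before_release(RuleTemplate("rule_ok", 3), sensor_inputs)
    try:
        validate_before_release(RuleTemplate("rule_bad", 4), sensor_inputs)
    except ConfigMismatchError as err:
        print(f"blocked before release: {err}")
    # The runtime check returns None for an index the data cannot satisfy.
    print(read_input(["a", "b", "c"], 3))
```

The two checks are deliberately redundant: the first stops a mismatched rule from shipping, while the second limits the damage if one slips through anyway.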

CrowdStrike’s Microsoft Kernel Access Discussed

Members of Congress asked Meyers whether software like CrowdStrike's Falcon sensor should have access to the Windows kernel.

Kernel access refers to the ability of a software program or process to directly interact with the kernel of an operating system. The kernel is the core component of an operating system, responsible for managing hardware resources, processes, and memory.

While most software applications operate in user space, some critical applications, including antivirus software, endpoint detection and response (EDR) solutions and other security products, run inside the Windows kernel.

This level of access is necessary for many cybersecurity solutions to effectively monitor and protect systems. However, it also raises concerns about the potential for system crashes in the event of a fault with a solution granted kernel access.

The CrowdStrike incident reportedly prompted Microsoft to contemplate moving antivirus and other threat-detection updates into user mode to reduce the likelihood of significant incidents.

However, Meyers argued against such a move, saying that without kernel access, CrowdStrike's security products might be less effective.

He said that kernel access gives products like Falcon "visibility into everything happening on that operating system." This allows for threat prevention and helps "ensure anti-tampering."

Meyers mentioned that Scattered Spider, the group responsible for the Las Vegas casino network intrusions, often uses "new techniques to elevate their privilege in order to disable security tools on a regular basis."

Meyers stated that CrowdStrike will "continue to leverage the architecture of the operating system."

Photo credits: VDB Photos/Ascannio/Shutterstock
