CrowdStrike Windows Outage: What We Can Learn

On Friday, July 19, 2024, tens of thousands of workers in hospitals, airlines, banks and other industries around the world stared at the “Blue Screen of Death” (BSOD) when trying to complete essential daily tasks on their Windows computers.

The widespread outage affected Windows systems running CrowdStrike's Falcon Sensor.

Let’s explore what happened and the lessons we can learn from the incident.

CrowdStrike and Falcon Sensor

CrowdStrike is a US-headquartered cybersecurity technology company. They offer a suite of cybersecurity software products, used by dozens of industries, including airlines, hospitals, banks, and retailers, to prevent hacks and data breaches.

Falcon Sensor is one of CrowdStrike's software products. It protects systems from cyber-attacks by monitoring computers for signs of malicious activity and helping to lock down threats.

What Happened on July 19

CrowdStrike pushes updates to Falcon Sensor automatically and silently, every Friday. In an update on July 24, the firm revealed the incident was caused by a Rapid Response Content update containing an undetected error. This crashed machines running the Microsoft Windows operating system and caused worldwide chaos.

The outages were not the result of a security incident or cyber-attack.

The incident had significant real-world impacts. Flights were canceled, broadcasters went off air, trains didn’t run and medical procedures were delayed worldwide. Frustrated workers were confronted with blue computer screens, with no available workaround or solution to get back online. Customers and consumers were left hanging and stranded.

At 5:29 EST on July 20, CrowdStrike put out a statement saying "the issue has been identified, isolated and a fix has been deployed." Organizations using Falcon Sensor were urged to manually deploy the fix to get their services back online.

Learnings from the CrowdStrike Incident

Software Developers

Testing: Code can have flaws. This is precisely why the testing phase is vital. Unit testing, automated testing, and regression testing are non-negotiable parts of the software development lifecycle (SDLC). This is not even a secure development problem, but SDLC-101. While we wait for more information on the root cause from CrowdStrike, it is time to go back and review testing plans, procedures and environments.
Automated silent full updates: Cybersecurity organizations are fighting a daily battle with cybercrime. They aim to push updates to consumers of their services as soon as possible, so that the software is always protected against the latest threats. However, the flip side is the risk of outages. One option is a balanced update policy that uses staggered updates, releasing to a small group of systems first and widening the release only when that group stays healthy.
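To make the idea of a staggered update concrete, here is a minimal sketch of a vendor-side rollout gate. It is an illustration only, not a description of CrowdStrike's release process: the function name `staged_rollout`, the `deploy` and `is_healthy` callbacks, and the ring layout are all hypothetical. The update goes to one ring of hosts at a time, and the rollout halts as soon as a ring reports more failures than the allowed rate.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> List[str]:
    """Deploy ring by ring and stop when a ring looks unhealthy.

    rings: ordered groups of host names, smallest (canary) group first.
    deploy: hypothetical callback that installs the update on one host.
    is_healthy: hypothetical callback that reports whether a host is
        still running normally after the update.
    Returns the hosts that received the update, in deployment order.
    """
    updated: List[str] = []
    for ring in rings:
        for host in ring:
            deploy(host)
            updated.append(host)
        # Check the whole ring before touching the next one.
        failures = sum(1 for host in ring if not is_healthy(host))
        if ring and failures / len(ring) > max_failure_rate:
            break  # halt: later rings never receive the update
    return updated
```

With rings such as `[["canary-1"], ["a", "b"], ["c", "d", "e"]]`, a failure on host `b` stops the rollout before the third and largest ring is touched, which is the damage-limiting property a staggered policy is meant to provide.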

Software End Users

Kernel-level access: The reason Falcon Sensor could take down the entire Windows system, along with all other non-CrowdStrike software on it, is that Falcon Sensor has access to the system’s kernel. While cybersecurity vendors may tell you that it is essential to have access to the kernel to protect the system, allowing third-party software access to a system's kernel is essentially surrendering all control. Exploring non-invasive cybersecurity options should be part of the process before deciding on a cybersecurity vendor.
Testing: Consumers of updates should also be carrying out testing before rolling out updates to their systems.
Staggered updates: Consumers of software updates should have a staggered roll-out plan, which can help limit the damage.
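On the consumer side, a staggered roll-out plan can be as simple as deciding in advance which machines take an update first. The sketch below is one hypothetical way to do that, assuming an organization has a list of host names and a small pilot group it is willing to update first. The names `bucket` and `plan_waves` are illustrative and not part of any vendor tool. Pilot hosts go into wave 0, and the remaining hosts are spread across later waves by hashing the host name, so the same machine always lands in the same wave.

```python
import hashlib
from typing import Dict, Iterable, List


def bucket(hostname: str, wave_count: int) -> int:
    """Map a host name to a stable bucket in the range 0..wave_count-1."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % wave_count


def plan_waves(
    hostnames: Iterable[str],
    pilot: Iterable[str],
    wave_count: int,
) -> Dict[int, List[str]]:
    """Build a roll-out plan: wave 0 is the pilot group, waves 1..wave_count
    hold the remaining hosts."""
    if wave_count < 1:
        raise ValueError("wave_count must be at least 1")
    pilot_set = set(pilot)
    plan: Dict[int, List[str]] = {0: sorted(pilot_set)}
    for wave in range(1, wave_count + 1):
        plan[wave] = []
    for name in sorted(set(hostnames) - pilot_set):
        plan[1 + bucket(name, wave_count)].append(name)
    return plan
```

An update would then be applied to wave 0, observed for a period the organization chooses, and only then released to wave 1 and onward. The hash-based split is a design choice for repeatability. An organization could just as reasonably group waves by site, department or business criticality.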

The latest CrowdStrike update extensively notes the types of testing the company performs on its software. However, there is clearly a flaw in the testing plan, as two of the newly developed template files were allowed to be packaged and deployed on the assumption that all would be OK, given everything else was previously tested. CrowdStrike, and the rest of the world, learned that it wasn’t.

While the CrowdStrike CEO was emphatic about not calling this a cyber issue, and while it is primarily a development and SDLC-process issue, we often forget that the basic tenets of security, the confidentiality, integrity and availability (CIA) triad, include the “A” for availability. Any availability issue, then, becomes a cyber issue, especially when “availability” is at risk because of a cyber update.

It is vital that both software developers and end users learn lessons from this incident, to prevent such a widespread outage from occurring in the future.


Divya Aradhya
