11/08/2024 by Willy Rojas and Dustin O'Bier

Survive Digital Disruption: CrowdStrike Global Outage Takeaways

In a world so reliant on digital technology, we often expect (and hope) that our software will be stable. Yet even the most reliable technology platforms can falter. Take, for example, the disruption businesses felt during the CrowdStrike global outage.

The recent global outage involving CrowdStrike, one of the world’s leading cybersecurity companies, was a stark reminder that no system is entirely immune to disruption. It’s a harsh reality, but one we need to face head-on. Here’s how the recent outage affected businesses and, most importantly, some essential tips to ensure your operations remain resilient. Read on to safeguard your company’s digital future.

CrowdStrike Global Outage Event

On July 19, 2024, CrowdStrike released an update for its Falcon Sensor software. The update caused a significant global IT outage, crashing millions of Windows computers and displaying what you might otherwise know as the “Blue Screen of Death.”

Around 8.5 million systems worldwide were affected. The outage interrupted businesses of all kinds, including airlines, healthcare, banks, and more. Here are a few examples of the disruption the outage caused.

Airlines

At LaGuardia Airport, the outage took down the baggage handling system, leading to significant delays and widespread operational disarray. Wait times were extensive, and many passengers missed their flights.

Delta Air Lines was the largest airline affected by the outage. The company had to reset over 40,000 servers and manually cancel 5,000 flights, losing over $500 million in revenue.

Healthcare

Hospitals like the Mayo Clinic, Cleveland Clinic, and Mass General Brigham faced system crashes that affected patient care and administrative functions. Electronic health records went offline, delaying medical procedures and patient admissions. The estimated financial impact on the healthcare sector alone was around $1.94 billion.

Banking

Financial institutions like JPMorgan Chase and Bank of America suffered considerable downtime. Transactions, online banking services, and customer support were affected. The inability to process transactions led to customer dissatisfaction and financial losses. The estimated impact on the banking sector contributed heavily to the global economic damage, which totaled at least $10 billion.

Preventing Digital Disruption in Your Business

The CrowdStrike global outage underscored the importance of being prepared for the unexpected. While such events may be rare, businesses should understand that no service is exempt from disruption. But don’t panic. You can use the practices below to reduce any impact should an incident like the CrowdStrike outage ever happen.

Have a Strong Incident Response Plan

A well-structured Incident Response Plan (IRP) is crucial for navigating outages. For many businesses, the CrowdStrike outage was a wake-up call about the importance of having a detailed IRP. Most organizations now have plans for cyber threats but remain unprepared for a service outage.

Organizations need well-defined and practiced IRPs. An effective IRP ensures faster recovery and coordinated actions during outages.

A proper incident response plan should have the following components:

The organization’s incident response strategy and how it supports business objectives
Roles and responsibilities involved in incident response
Procedures for each phase of the incident response process
Communication procedures within the incident response team, with the rest of the organization, and external stakeholders
How to learn from previous incidents to improve the organization’s security posture

Without a solid IRP, chaos can arise when essential tools and services go down. Time is critical in these situations. Companies that respond to incidents slowly often face increased downtime and direct revenue loss.

According to a SANS report, companies without a proper IRP take 54 percent longer to contain incidents that cause downtime. Additionally, a study from the Ponemon Institute found that organizations without an effective IRP team experienced 54 percent more downtime compared to those with one.

Having an IRP in place is crucial, but testing it often is just as important! Regular testing ensures that the plan is effective, team members are familiar with their roles, and potential gaps are identified before an actual incident occurs.

Practice the IRP at least annually and after any major change to your process. Planning and organization are the best ways to mitigate significant disruption. It’s always best to be prepared for the worst!

Review Your Software Deployment Practices

Deploying new software can be a very complex process, especially when it depends on other applications or systems. The CrowdStrike global outage is a direct example: the Falcon Sensor update ran on top of the Windows operating system, so a faulty update was able to crash the machines it was installed on.

Here are some best practices for establishing an effective deployment process.

Develop a Patch Management Policy

Define a comprehensive policy that details the procedure for managing patches, specifying roles, responsibilities, and schedules.

Inventory Assets

Keep an updated inventory of all hardware and software assets that need patching.

Prioritize Patches

Assess and schedule patches based on the severity of vulnerabilities and the importance of the systems they impact.
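
As a rough illustration of this kind of prioritization, the sketch below ranks patches by multiplying a severity rating by a system importance rating. The `Patch` fields, the rating scales, and the sample patch names are all hypothetical; a real policy would define its own scoring.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    name: str
    severity: float   # hypothetical 0-10 rating of the vulnerability
    criticality: int  # hypothetical 1-5 rating of the affected system's importance


def urgency(patch):
    """Illustrative urgency score: severity weighted by system importance."""
    return patch.severity * patch.criticality


def prioritize(patches):
    """Return patches ordered from most to least urgent."""
    return sorted(patches, key=urgency, reverse=True)


# Made-up sample data for demonstration only.
patches = [
    Patch("browser-update", severity=5.0, criticality=2),
    Patch("database-fix", severity=9.8, criticality=5),
    Patch("print-driver", severity=3.1, criticality=1),
]

for patch in prioritize(patches):
    print(patch.name, round(urgency(patch), 1))
```

In this example, the database fix is scheduled first because it combines a severe vulnerability with a business-critical system.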

Test Patches

Before deploying patches to production environments, test them in a controlled setting. This way, you can catch issues before they reach the systems your business depends on.

Automate Patch Deployment

Automated tools are excellent for streamlining the patch deployment process. This can reduce the risk of human error and ensure timely updates.

Monitor and Audit

Continuously track the patching process and audit patch deployments to ensure compliance and effectiveness.

Have a Rollback Plan

Have a rollback plan in place. This allows you to revert to a previous state should a patch cause problems.
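
One way to picture a rollback plan is a staged rollout that patches one group of systems at a time and reverts if a health check fails. This is a minimal sketch under assumptions: the group names, the `apply_patch`, `is_healthy`, and `rollback` functions, and the in-memory `state` dictionary are hypothetical stand-ins for real deployment tooling.

```python
def staged_rollout(rings, apply_patch, is_healthy, rollback):
    """Patch one ring at a time; undo every patched ring if one becomes unhealthy."""
    patched = []
    for ring in rings:
        apply_patch(ring)
        patched.append(ring)
        if not is_healthy(ring):
            # Revert in reverse order so the most recent change is undone first.
            for done in reversed(patched):
                rollback(done)
            return False
    return True


# Hypothetical in-memory model: each group of systems and its current version.
state = {"test-lab": "v1", "pilot-group": "v1", "production": "v1"}


def apply_patch(ring):
    state[ring] = "v2"


def rollback(ring):
    state[ring] = "v1"


def is_healthy(ring):
    # Simulate a fault that only shows up in the pilot group.
    return ring != "pilot-group"


succeeded = staged_rollout(list(state), apply_patch, is_healthy, rollback)
print(succeeded, state)
```

Because the simulated fault appears in the pilot group, the rollout stops there, the earlier groups are reverted, and production is never touched.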

Keep Vendor Patch Schedules

Stay informed about each vendor’s patch release schedule to plan and prepare for upcoming updates.

Document Everything

Keep detailed records of all patching activities, including what was patched, when, and by whom.
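
A patch record can be as simple as one structured entry per change. The sketch below, with hypothetical field names and sample values, captures what was patched, when, and by whom as a line of JSON.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PatchRecord:
    system: str      # what was patched
    patch: str       # which patch was applied
    applied_by: str  # who applied it
    applied_at: str  # when, as an ISO 8601 timestamp


def record_patch(log, system, patch, applied_by, when=None):
    """Append one JSON-encoded record to the log and return the record."""
    when = when or datetime.now(timezone.utc)
    entry = PatchRecord(system, patch, applied_by, when.isoformat())
    log.append(json.dumps(asdict(entry)))
    return entry


# Made-up sample values; a list stands in for a file or database table.
log = []
record_patch(
    log,
    system="file-server-01",
    patch="example-patch-001",
    applied_by="jdoe",
    when=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc),
)
print(log[0])
```

Keeping each entry in a consistent, machine-readable shape makes later audits much easier than searching through free-form notes.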

Assess Your Vendor Relationships

Periodic assessment of third-party vendors is necessary to ensure resilience. The CrowdStrike outage has prompted many organizations to reconsider their vendors. Businesses should assess vendor relationships to confirm they meet the organization’s risk tolerance and operational needs.

Scheduling annual assessments can help keep this task from being forgotten. Each assessment should include a vendor risk assessment and a security questionnaire.

Vendor Risk Assessment

This assessment evaluates the potential risks that the vendor may introduce to your organization. This includes understanding the vendor’s operations, data handling practices, and risk profile.

Security Questionnaire

This should be comprehensive and help you understand the vendor’s security policies, practices, and standards. Topics may include encryption, incident response, access controls, and employee security training.
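
To make questionnaire results easier to compare across vendors, the answers can be tallied against a fixed list of topics. The sketch below assumes simple yes/no answers and an equal weight per topic, which is an illustrative simplification; the function name and sample answers are hypothetical.

```python
# Topics taken from the list above; equal weighting is an assumption.
TOPICS = [
    "encryption",
    "incident response",
    "access controls",
    "employee security training",
]


def review_questionnaire(answers):
    """Return (score, gaps): the share of topics satisfied and the topics that were not."""
    # A topic the vendor left unanswered counts as a gap.
    gaps = [topic for topic in TOPICS if not answers.get(topic, False)]
    score = (len(TOPICS) - len(gaps)) / len(TOPICS)
    return score, gaps


# Made-up answers from a hypothetical vendor.
answers = {
    "encryption": True,
    "incident response": False,
    "access controls": True,
}

score, gaps = review_questionnaire(answers)
print(score, gaps)
```

The list of gaps gives you specific follow-up questions to raise with the vendor before the next review.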

Consider Diversification

Companies may wish to review their diversification processes when relying on critical software to run their operations. As highlighted by the CrowdStrike outage, over-dependence on a single software solution can expose the business to significant risks. This can include operational disruptions due to software failures, security vulnerabilities, or vendor instability.

Software diversification keeps you from relying on a single system. It can provide contingency options and flexibility in the face of unexpected challenges. By incorporating several complementary software tools or services, companies can enhance resilience, maintain business continuity, and mitigate potential risks.

Building Digital Resilience in the Face of Uncertainty

While incidents like the CrowdStrike outage can be rare, their impact can be severe. The unpredictability of such events can be scary, but with these proactive practices, there’s little to fear. Remember, a resilient business is a prepared business. By taking these steps, you can protect your operations and buckle in for a smooth ride in the digital landscape.

ABOUT THE AUTHORS

Willy Rojas

Engineer II, Infrastructure

Willy has been with Trinity Logistics for eight years. He has held several IT positions in that time, starting as a Service Desk Intern and moving through Senior Service Desk and IT Systems Administrator II to Infrastructure Engineer II. Willy finds cybersecurity fascinating because it changes often and always gives him something new to learn. He also finds satisfaction in knowing that the work he does every day is important, from keeping confidential information secure to keeping business operations running smoothly.

Dustin O’Bier

Manager, Infrastructure

Dustin has been working at Trinity for 21 years. Previous positions he’s held include Help Desk Specialist, System Administrator I, Senior System Administrator, and IT Systems Manager. Dustin enjoys the collaborative aspect of his role. He loves working alongside a team of people to solve complex problems. Dustin’s main focuses as Manager in Infrastructure cover three distinct aspects: the Core Infrastructure of the Regional Service Centers (RSCs), Security, and the company’s Cloud practice. He finds each focus brings a unique set of challenges, making his role dynamic and engaging.
