Microsoft Downtime: The Risk of Cloud Services (CrowdStrike Error & DDoS Attack on Microsoft July 2024)
July 2024 was a tumultuous month in the software industry. While Microsoft's sales pitches are louder than ever, the company faced significant criticism due to the downtime of several crucial systems. This month saw multiple incidents at Microsoft with far-reaching consequences. In this article, we delve deeper into the two major incidents: the CrowdStrike update and the DDoS attack on Microsoft 365 and Azure.
CrowdStrike update downtime
In July 2024, a large-scale disruption occurred due to a faulty update to CrowdStrike's Falcon Sensor. The update, released on July 19, introduced a critical error: the sensor read data without properly validating it first. Early analyses described this as a missing null check, and CrowdStrike's later root cause analysis described it as an out-of-bounds memory read. Either way, the system attempted to access an invalid memory address, leading to the infamous "Blue Screen of Death" (BSOD) on millions of Microsoft Windows devices worldwide. Critics called this a scandal, arguing that such an update should have been tested far more thoroughly before release.
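The class of bug described above, reading data without first checking that it is really there, can be illustrated with a minimal Python sketch. This is not CrowdStrike's code: the Falcon Sensor runs as a Windows kernel driver, where such a read crashes the whole machine instead of raising an exception. The function names and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def read_field_unchecked(fields: Sequence[str], index: int) -> str:
    # No validation: raises IndexError when index is past the end.
    # In kernel-mode native code the equivalent read touches invalid
    # memory and takes the whole system down.
    return fields[index]


def read_field_checked(fields: Optional[Sequence[str]], index: int) -> Optional[str]:
    # Validate the input before touching it: reject missing data
    # and any index outside the valid range.
    if fields is None or not 0 <= index < len(fields):
        return None
    return fields[index]


if __name__ == "__main__":
    update = ["rule-a", "rule-b", "rule-c"]  # hypothetical content update
    print(read_field_checked(update, 1))  # valid index
    print(read_field_checked(update, 3))  # one past the end
    print(read_field_checked(None, 0))    # missing data
```

The checked variant returns None for bad input, so the caller can skip a malformed update instead of crashing on it.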
Impact of the faulty CrowdStrike update
Global Outage
The outage caused critical disruptions to business operations, health services, airlines, and even stock exchanges worldwide. Microsoft estimated that approximately 8.5 million devices were affected by the update. Because the rollout began at 04:09 UTC, the disruption hit businesses during working hours in Oceania and Asia, in the early morning in Europe, and around midnight on the east coast of the Americas.
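To see what the 04:09 UTC rollout time meant locally, the conversion can be sketched in a few lines of Python. The four cities and their fixed UTC offsets are assumptions chosen for illustration (the offsets are those in effect on 19 July 2024); they do not come from any incident report.

```python
from datetime import datetime, timedelta, timezone

# Rollout time as given in the article.
ROLLOUT_UTC = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)

# Assumed fixed UTC offsets in effect on 19 July 2024 (no DST lookup).
OFFSETS_HOURS = {
    "Sydney": 10,    # AEST
    "Tokyo": 9,      # JST
    "London": 1,     # BST
    "New York": -4,  # EDT
}


def local_times(moment: datetime) -> dict:
    """Return the local wall-clock time per city as 'YYYY-MM-DD HH:MM'."""
    return {
        city: moment.astimezone(timezone(timedelta(hours=hours))).strftime("%Y-%m-%d %H:%M")
        for city, hours in OFFSETS_HOURS.items()
    }


if __name__ == "__main__":
    for city, local in local_times(ROLLOUT_UTC).items():
        print(f"{city:9s} {local}")
```

Under these offsets the rollout lands at 14:09 in Sydney, 13:09 in Tokyo, 05:09 in London, and 00:09 in New York.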
Financial Damage
A specialist in cloud outage insurance estimated that the 500 largest American companies suffered nearly $5.4 billion in losses, of which only between $540 million and $1.08 billion was insured. These 500 companies represent just a small fraction of all affected companies, which illustrates the financial impact of the disruption on businesses worldwide.
Specific consequences for sectors
- Aviation: Globally, 5,078 flights were canceled, representing 4.6% of the flights scheduled for that day. Australian airlines like Qantas, Virgin Australia, and Jetstar were heavily affected, as were airports in cities like Sydney, Melbourne, and Brisbane.
- Healthcare and Business Operations: Critical systems in hospitals and other healthcare facilities were disrupted, severely affecting service delivery. Businesses experienced issues with their IT infrastructure, leading to reduced productivity and operational challenges.
Recovery attempts for the CrowdStrike update
- CrowdStrike: CrowdStrike acknowledged the problem and issued a public statement along with a workaround solution. They advised affected users to manually delete specific files from safe mode or the Windows Recovery Environment.
- Microsoft: Microsoft collaborated with CrowdStrike and external developers to expedite a solution. They provided technical guidance and support to help users securely restore their systems. This included rebooting affected virtual machines up to 15 times and restoring a backup from before July 18.
Both recommendations were labor-intensive and unrealistic for large companies.
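The manual workaround boiled down to finding and deleting the faulty files on every affected machine. The Python sketch below only illustrates that logic. The directory and file pattern are assumptions taken from CrowdStrike's publicly circulated workaround, not from this article, and the function name is hypothetical. In practice the step was carried out by hand or with a command prompt from safe mode or the Windows Recovery Environment, where Python is normally not available.

```python
from pathlib import Path

# Assumptions: directory and file pattern from CrowdStrike's public
# workaround; neither appears in this article.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(directory: Path = DEFAULT_DIR,
                                dry_run: bool = True) -> list:
    """List (and, unless dry_run is set, delete) files matching PATTERN."""
    if not directory.is_dir():
        return []
    matches = sorted(directory.glob(PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()  # delete the faulty file
    return matches


if __name__ == "__main__":
    # Dry run by default: only report what would be deleted.
    for path in remove_faulty_channel_files():
        print("would delete:", path)
```

Even as a script, someone still has to reach each machine and start it, which is why the recovery was so labor-intensive at scale.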
DDoS Attack on Microsoft 365 and Azure
On July 30, 2024, Microsoft was hit by a large-scale Distributed Denial-of-Service (DDoS) attack, resulting in outages of various Microsoft 365 and Azure services worldwide. The attack lasted approximately nine hours and caused significant service disruptions for Microsoft Entra, Microsoft Purview, Azure App Services, Application Insights, Azure IoT Central, and the Azure Portal.
Details of the DDoS attack
Attack vector
The DDoS attack targeted the application layer (Layer 7) of the OSI model, meaning the attack specifically aimed to disrupt Microsoft's web application services. The attackers used various techniques such as HTTP(S) floods, cache-bypass, and Slowloris attacks to overwhelm the servers and disrupt normal operations.
Attackers
Microsoft has not publicly named the group behind the July 2024 attack. After a similar wave of Layer 7 DDoS attacks in June 2023, it identified the threat actor as Storm-1359, a group suspected to be pro-Russian and possibly connected to the hacktivist group Anonymous Sudan. This group has previously attacked organizations in Sweden, the Netherlands, Australia, and Germany, using a collection of botnets, rented cloud infrastructure, and open proxies.
Impact of the DDoS Attack on Microsoft 365 and Azure
Global outage
The attack had a global impact, with users worldwide reporting issues accessing their Microsoft 365 and Azure services. The affected services included critical business applications such as Intune, Power BI, and Power Platform, leading to widespread operational disruptions for businesses reliant on these services.
Microsoft's response
Microsoft's initial response to the attack exacerbated the impact rather than mitigating it. An error in the implementation of its DDoS protection mechanisms caused the defenses to amplify the attack, which led to additional outages and delays. Microsoft eventually made network configuration changes and failed over to alternative network routes to provide relief. Later in this article, you will read more about how to secure your business.
Recovery attempts for the DDoS attack
Technical solutions
To prevent further disruptions, Microsoft adjusted the settings of their Azure Web Application Firewall (WAF). They also advised customers to implement geographical restrictions to limit incoming traffic and minimize the impact of future attacks. Additionally, Microsoft confirmed that there was no evidence of customer data being accessed or compromised during these attacks.
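The idea behind geographical restrictions, combined with a basic request limit, can be sketched as follows. This is a simplified illustration in plain Python, not the Azure WAF configuration or API. The class name, the country allowlist, and the limits are all hypothetical, and the sketch assumes the country of each request has already been determined elsewhere.

```python
import time
from collections import defaultdict, deque
from typing import Optional

ALLOWED_COUNTRIES = {"NL", "BE", "DE"}  # hypothetical allowlist
MAX_REQUESTS = 5                        # hypothetical limit per window
WINDOW_SECONDS = 10.0                   # hypothetical window length


class RequestFilter:
    """Reject traffic from other countries and throttle busy clients."""

    def __init__(self) -> None:
        # Timestamps of recently accepted requests, per client IP.
        self._recent = defaultdict(deque)

    def allow(self, client_ip: str, country: str,
              now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Geographical restriction: drop anything outside the allowlist.
        if country not in ALLOWED_COUNTRIES:
            return False
        window = self._recent[client_ip]
        # Forget requests that have aged out of the sliding window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        # Rate limit: too many recent requests from this client.
        if len(window) >= MAX_REQUESTS:
            return False
        window.append(now)
        return True


if __name__ == "__main__":
    f = RequestFilter()
    print(f.allow("203.0.113.7", "NL"))  # allowed country, first request
    print(f.allow("203.0.113.8", "XX"))  # country not on the allowlist
```

A filter like this reduces the volume that reaches the application, but a distributed attack from many addresses inside allowed countries can still get through, so it is one layer of defense and not a complete one.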
Looking ahead and improvements
Microsoft announced that it would publish a Preliminary Post-Incident Review (PIR) within 72 hours and a final incident report within two weeks. These reports are meant to provide further details and lessons learned from the outage. Microsoft continues to evaluate and improve its security mechanisms to reduce the impact of such attacks in the future.
Conclusion on DDoS Attacks
The DDoS attack on Microsoft 365 and Azure in July 2024 underscores the threat of cyberattacks on major cloud service providers. The incident highlights the importance of security measures and rapid, effective response strategies to minimize the impact of such attacks. Microsoft's experience shows that even the largest technology companies are vulnerable and must continually work to strengthen their defenses against increasingly sophisticated cyber threats. Have you, like many, lost confidence in Microsoft's security guarantees? We recommend continuing with on-premise software to ensure the security of your business and all its data. Explore our range of on-premise licenses such as Windows Server 2022, SQL Server 2022, and Office 2021.
What can you do against these online threats and uncertainties? Take control
In light of the recent incidents with CrowdStrike and the DDoS attack on Microsoft 365 and Azure, there is a growing discussion about the benefits of on-premise software versus cloud-based solutions. Here are some reasons why companies might consider taking control by choosing on-premise software:
- Manageability and Control: With on-premise software, companies have full control over their IT environment. This means they are not dependent on external vendors for updates, security patches, or bug fixes. In the case of the CrowdStrike error of July 2024, affected companies needed physical access to their machines for days to manually correct the error. This kind of dependency can be avoided with on-premise solutions where IT teams can directly intervene and respond to issues more quickly.
- Security and Data Protection: The DDoS attack on Microsoft 365 and Azure demonstrated how vulnerable cloud-based services can be to cyberattacks. On-premise systems can be designed with specific security protocols tailored to the unique needs of a business. Additionally, sensitive data can be stored internally, reducing the risk of data breaches from external threats.
- Reliability and Availability: While cloud providers often guarantee high availability, incidents like the DDoS attack can cause prolonged outages, resulting in loss of productivity and revenue. On-premise systems can be designed and maintained redundantly to ensure business continuity even during internet outages or external attacks.
- Customizability and Flexibility: On-premise software allows companies to tailor their IT infrastructure to specific business needs. Cloud solutions are often standardized and may impose limitations on customization. Companies can optimize their on-premise systems for better performance and integration with existing applications and processes.
- Long-Term Cost Management: While cloud solutions often appear attractive due to lower initial costs, recurring subscription fees and additional costs for bandwidth, storage, and security can add up. On-premise solutions require a higher initial investment but can be more cost-effective long-term, especially for large enterprises with significant IT needs.
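The long-term cost argument can be made concrete with a simple break-even calculation. The sketch below assumes a one-time on-premise investment, a fixed monthly running cost for each option, and no price changes over time. The function name and all amounts are hypothetical and serve only to show the arithmetic.

```python
import math
from typing import Optional


def break_even_month(upfront_cost: float, onprem_monthly: float,
                     cloud_monthly: float) -> Optional[int]:
    """First whole month in which cumulative on-premise cost is no
    higher than cumulative cloud cost, or None if that never happens."""
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        # On-premise running costs are not lower, so the upfront
        # investment is never recovered.
        return None
    return math.ceil(upfront_cost / monthly_saving)


if __name__ == "__main__":
    # Hypothetical figures: 60,000 upfront, 500 per month on-premise
    # upkeep, 2,500 per month in cloud subscriptions.
    print(break_even_month(60_000, 500, 2_500))
```

With these example figures the on-premise investment pays for itself after 30 months. A real comparison should also account for hardware replacement, staffing, and licensing terms.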
Conclusion
On-premise software gives you certainty and lets you choose your own security, which makes Windows Server 2022 and SQL Server 2022 a way to sidestep failing cloud security. Many companies are discovering the disadvantages of cloud solutions and then find it difficult to revert to on-premise software. That difficulty works in Microsoft's favor, as it benefits more from the recurring revenue of Azure, Microsoft 365, and other pay-as-you-go or subscription models. If you still want to experiment with cloud solutions, we highly recommend preparing a watertight cloud exit strategy in advance, so you are not caught off guard if you decide to switch to on-premise.