Uninterrupted uptime thanks to redundant power systems, hot-swappable fuel tanks, and the coordinated efforts of operations teams and CORS facilities

On April 28, 2025, a major power outage disrupted large areas of Spain and Portugal, putting the region’s digital infrastructure to the test. While millions were affected by the blackout, critical IT services, cloud platforms, and data center operations remained online and fully functional—thanks to robust engineering, contingency planning, and seamless execution by on-site and remote operations teams.

The incident has highlighted the importance of infrastructure-level resilience, particularly in an era where uninterrupted digital service is expected across all industries. Providers like Stackscale (Grupo Aire), a European cloud and data center company, demonstrated how a well-architected environment can sustain full operation—even in the face of a prolonged regional power failure.

Power resilience: more than just backup generators

Modern Tier III and Tier IV data centers are built with high availability in mind. During the April outage, data centers switched automatically from grid supply to UPS (uninterruptible power supply) systems, which carried the load without interruption until diesel generators started and took over for sustained operation.

Providers like Stackscale activated redundant N+1 or 2N generator systems with:

Autonomy exceeding 24 hours at full load
Hot-refueling capability, so tanks can be refilled while generators keep running
High-capacity fuel tanks monitored in real time
Pre-arranged priority fuel supply agreements for emergencies
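
As a rough illustration of how an autonomy figure like the one above can be reasoned about, the sketch below estimates generator runtime from usable tank volume and fuel burn rate, and flags when a refuel should be requested. The `GeneratorSet` type, the tank size, the consumption figure, and the safety margin are all hypothetical values chosen for the example, not Stackscale's actual numbers, and the linear burn model is a simplification.

```python
from dataclasses import dataclass


@dataclass
class GeneratorSet:
    """Hypothetical generator description; values are illustrative only."""
    tank_liters: float          # usable fuel currently in the tank
    burn_lph_full_load: float   # liters per hour at 100% load


def autonomy_hours(gen: GeneratorSet, load_fraction: float = 1.0) -> float:
    """Estimate runtime, assuming fuel burn scales linearly with load."""
    if not 0 < load_fraction <= 1:
        raise ValueError("load_fraction must be in (0, 1]")
    return gen.tank_liters / (gen.burn_lph_full_load * load_fraction)


def needs_refuel(gen: GeneratorSet, load_fraction: float,
                 hours_until_delivery: float,
                 safety_margin_hours: float = 4.0) -> bool:
    """True if the fuel left when the delivery arrives would be below the margin."""
    remaining = autonomy_hours(gen, load_fraction) - hours_until_delivery
    return remaining < safety_margin_hours


if __name__ == "__main__":
    gen = GeneratorSet(tank_liters=12_000, burn_lph_full_load=400)
    print(autonomy_hours(gen))          # runtime at full load
    print(needs_refuel(gen, 1.0, 28))   # delivery 28 hours away
```

With real-time tank monitoring, a check of this kind could run continuously and trigger the pre-arranged fuel supply agreement well before the margin is reached.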

“From the very first moment, our infrastructure responded exactly as designed,” said David Carrero, co-founder of Stackscale. “Thanks to a highly redundant setup and a committed technical team, none of our clients experienced downtime or disruption.”

CORS: nerve centers built for continuity

Behind every resilient data center is an equally robust Network and Service Operations Center (CORS, by its Spanish acronym, comparable to a NOC/SOC). These facilities have their own dedicated UPS systems, so operations can continue without interruption even if the main power infrastructure is compromised.

During the blackout, CORS teams:

Maintained real-time monitoring of systems, network loads, and power metrics
Used redundant remote access platforms to manage infrastructure independently of local conditions
Communicated continuously with clients, fuel providers, and internal escalation chains
Ensured environmental parameters, such as temperature and humidity, remained within thresholds to protect equipment

In many cases, these teams work in geo-redundant configurations, with mirrored centers across regions or countries that can take over operations if needed.
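
The environmental checks mentioned above can be sketched as a simple threshold comparison. This is a minimal example under stated assumptions: the metric names and the limits in `THRESHOLDS` are hypothetical, and a real operations center would rely on its monitoring platform rather than a standalone script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    low: float
    high: float


# Hypothetical limits chosen for illustration only.
THRESHOLDS = {
    "temperature_c": Threshold(18.0, 27.0),
    "humidity_pct": Threshold(20.0, 80.0),
}


def out_of_range(readings: dict[str, float]) -> list[str]:
    """Return one alert message per metric that falls outside its threshold."""
    alerts = []
    for metric, value in readings.items():
        limit = THRESHOLDS.get(metric)
        if limit is None:
            continue  # no threshold defined for this metric, so skip it
        if not limit.low <= value <= limit.high:
            alerts.append(f"{metric}={value} outside [{limit.low}, {limit.high}]")
    return alerts


if __name__ == "__main__":
    for alert in out_of_range({"temperature_c": 31.5, "humidity_pct": 45.0}):
        print(alert)
```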

Technical pillars of a resilient data center

The ability to remain operational during a grid-wide failure depends on layered redundancy and robust design. Key features of high-resilience infrastructure include:

Dual power feeds with independent UPS systems and backup generators
Modular cooling systems that can operate on backup power or with free-cooling when applicable
Carrier-neutral network redundancy, with BGP-based multihoming and automated failover
Highly available storage clusters with active-active replication across sites
Virtualization and orchestration platforms (e.g., VMware, Proxmox, OpenStack) that support live migrations and automated failover
On-site and remote monitoring systems for power, environment, and service status

These components are not optional—they are core to guaranteeing the availability of digital services, especially for mission-critical sectors like finance, healthcare, and government.
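
To make the N+1 and 2N redundancy labels used earlier in this article concrete, the sketch below classifies a set of identical power units (generators or UPS modules) against a given load. The function names and the load and capacity figures are assumptions for illustration; real capacity planning also accounts for derating, load distribution, and maintenance windows.

```python
import math


def units_required(load_kw: float, unit_kw: float) -> int:
    """N: the minimum number of identical units needed to carry the load."""
    return math.ceil(load_kw / unit_kw)


def redundancy_level(load_kw: float, unit_kw: float, installed_units: int) -> str:
    """Classify an installation as 2N, N+1, N, or insufficient."""
    n = units_required(load_kw, unit_kw)
    if installed_units >= 2 * n:
        return "2N"      # a fully duplicated set of units
    if installed_units >= n + 1:
        return "N+1"     # one spare unit beyond the minimum
    if installed_units >= n:
        return "N"       # no spare capacity
    return "insufficient"


if __name__ == "__main__":
    # Hypothetical 900 kW load served by 500 kW units.
    for installed in (1, 2, 3, 4):
        print(installed, redundancy_level(900, 500, installed))
```

Note that when a single unit can carry the whole load (N = 1), N+1 and 2N coincide, and this sketch reports the configuration as 2N.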

Lessons for system administrators: resilience starts at design

For system administrators and infrastructure engineers, the April 28 outage reinforced a long-standing principle: power failure is not a theoretical risk—it’s a real-world scenario that must be accounted for at every level of design and operations.

“Resilience isn’t a luxury—it’s a necessity,” Carrero emphasized. “At Stackscale, our mission is to deliver European cloud infrastructure with maximum uptime, security, and autonomy. Incidents like this one show the true value of that investment.”

The Iberian blackout of April 28 was not a moment of panic—it was a real-world stress test that the data center industry passed with flying colors. With well-maintained infrastructure, intelligent automation, and dedicated teams, providers ensured that the digital economy continued running seamlessly. For systems professionals, it’s a strong reminder that uptime isn’t accidental—it’s engineered.

References: Noticias Cloud y Redes Sociales