ROLE

Product Designer

TIMELINE

6 months

SCOPE

End-to-end product design

TEAM

3 Designers & Engineering

Designing an Outage Management System for Enterprise Operations

Outcome

The platform replaced fragmented tools with a centralized system for outage planning, execution, and reporting. Teams gained clearer visibility into project status, more consistent execution through structured workflows, and the ability to track progress in real time. This enabled faster issue identification and more informed decision-making during high-risk operations.

Overview

I worked on the design of an internal platform used to plan and manage large-scale operational outages—high-risk events that require tight coordination across multiple teams. The platform brings everything into one place through Projects, which act as the single source of truth for defining scope, aligning tasks, and tracking progress in real time. It replaces the usual mix of spreadsheets, email, and Slack with a more structured way to manage execution and communication. I focused on designing key parts of this workflow within an existing Material Design system, balancing usability with strict UI patterns and engineering constraints.

Core Problem

Teams managing large-scale operational outages relied on fragmented tools such as spreadsheets, email, and Slack to coordinate execution. This led to:

➞ Unclear task ownership across teams
➞ Limited visibility into real-time progress
➞ Inconsistent workflows between projects
➞ Delays in communication during critical operations

There was no centralized system to coordinate execution, track progress, and align stakeholders during high-risk operations.

Key System 1 — Project & Scope Setup

Each outage is set up as a Project, the single source of truth for defining scope, aligning tasks, and tracking progress.

Key System 2 — Execution Management

Execution is coordinated through tasks within each project. Tasks define ownership, deadlines, and status, making progress visible and reducing reliance on informal coordination.

Key System 3 — Real-Time Operations

Real-time visibility is structured across levels to help teams quickly understand status and act on issues. The Overview surfaces system-wide risk and activity, while the Summary consolidates project-level details and updates. At the execution level, timelines and outage updates track planned versus actual progress. Together, these views provide a clear, connected understanding of operations during outages.

Key System 4 — Coordination & Communication

Meetings and prompts bring structure to team communication. Instead of ad hoc discussions, teams align around predefined topics tied to execution and safety, ensuring critical decisions are consistently addressed.

Key System 5 — Reporting & Visibility

Reporting provides visibility into performance across projects. Teams can monitor progress, track key metrics, and generate reports for stakeholders, improving accountability and decision-making.
