Overview
DevOps1 partnered with a leading Australian superannuation fund to conduct a
comprehensive observability health check and establish a robust Center of Excellence (CoE)
framework. This initiative transformed their monitoring capabilities from reactive incident
response to proactive, business-aligned observability practices that directly support member
experience and operational efficiency.
The project followed a structured three-phase approach: current-state assessment,
foundation building, and an optimisation roadmap. Through systematic evaluation
and implementation, we established sustainable observability practices that align technical
capabilities with business outcomes while building internal capability through governance
frameworks and community-driven adoption models.
Challenges
The organisation faced significant observability challenges that were
impacting their ability to deliver reliable digital experiences to their members:
Tool and Technology Limitations
- Current APM tools couldn't provide distributed tracing for critical Boomi integration
platforms
- Over 20 APIs in production exhibited concerning failure rates exceeding 5%
- Existing monitoring tools created significant blind spots, particularly around member-facing
applications
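To make the 5% failure-rate figure concrete, the following is a minimal sketch of how per-API failure rates could be computed and compared against that threshold. It is illustrative only: the `ApiStats` type, the `apis_over_threshold` helper, the API names, and the request counts are all hypothetical and are not taken from the engagement or from any monitoring tool's API.

```python
from dataclasses import dataclass

# The 5% figure comes from the case study; everything else here is assumed.
FAILURE_RATE_THRESHOLD = 0.05


@dataclass
class ApiStats:
    """Hypothetical request counts for one API over a reporting window."""
    name: str
    total_requests: int
    failed_requests: int

    @property
    def failure_rate(self) -> float:
        # Avoid division by zero for APIs that received no traffic.
        if self.total_requests == 0:
            return 0.0
        return self.failed_requests / self.total_requests


def apis_over_threshold(stats, threshold=FAILURE_RATE_THRESHOLD):
    """Return APIs whose failure rate exceeds the threshold, worst first."""
    flagged = [s for s in stats if s.failure_rate > threshold]
    return sorted(flagged, key=lambda s: s.failure_rate, reverse=True)


# Made-up sample data for illustration.
sample = [
    ApiStats("member-balance", 10_000, 820),
    ApiStats("contribution-history", 4_000, 120),
    ApiStats("boomi-payroll-sync", 2_500, 300),
]

for api in apis_over_threshold(sample):
    print(f"{api.name}: {api.failure_rate:.1%}")
```

With the sample data, only the two APIs above 5% are listed, ordered by severity. In practice the counts would come from whatever telemetry source is in use, and the threshold would likely vary by API criticality.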
Organisational and Process Gaps
- Observability function managed by a single individual, creating critical key-person risk
- Lack of formal governance structure or shared responsibilities across teams
- No standardised approach for onboarding new applications to monitoring frameworks
- Teams operated independently with inconsistent monitoring practices between business units
Operational Inefficiencies
- Reactive incident response with slow mean time to detection and resolution
- Limited proactive monitoring capabilities across the technology stack
- Alert fatigue from high-volume, low-quality notifications (500+ alerts per month)
- Technical metrics weren't aligned with business outcomes or member experience indicators
Strategic Alignment Issues
- Fragmented approach to observability with no unified language between business and
technology teams
- Missing connection between technical performance metrics and business impact analysis
- Early-stage observability and SRE Center of Excellence with unclear direction
- Lack of comprehensive documentation, standards, and best practices repository
Solution
DevOps1 implemented a comprehensive three-phase transformation approach to
establish mature observability practices:
Phase 1: Current State Assessment (4-6 Weeks)
- Conducted thorough implementation review evaluating existing Dynatrace deployment and
configuration quality
- Performed comprehensive Observability CoE assessment reviewing team structure, processes,
documentation status, and skill gaps
- Executed detailed technical review covering infrastructure assessment, application coverage,
integration points, and performance baselines
- Developed prioritised recommendations roadmap with identified quick wins, risk areas, and
remediation plans
Phase 2: Foundation Building (14-16 Weeks)
- Established Observability Center of Excellence with defined roles, responsibilities, and
governance framework
- Created comprehensive documentation repository including standard operating procedures,
naming conventions, and troubleshooting guides
- Implemented role-specific training curriculum with self-service learning resources and
internal certification processes
- Developed application onboarding framework with checklists, success criteria, and cost
allocation models
- Established tiered support structure with clear escalation paths and incident response
procedures
Phase 3: Optimisation & Scale (DevOps1 SRE embedded champion)
- Advanced monitoring capabilities including synthetic testing strategy, RUM implementation,
and custom metrics approach
- Business integration through value stream mapping, cost modelling, and executive dashboard
strategy
- Technical uplifts including dashboard templating and license management optimisation
Benefits
The transformation delivered measurable improvements across technical,
operational, and business dimensions:
Operational Excellence Improvements
- Established governance framework through ITLT forum for structured observability
decision-making
- Implemented standardised processes and documentation for consistent monitoring practices
- Created role-based training programs improving team capability and reducing knowledge
concentration risk
Strategic Business Value
- Enabled data-driven decision-making with clear visibility from infrastructure to business
impact
- Improved cross-team collaboration through unified observability language and shared
dashboards
- Established foundation for advanced capabilities including capacity forecasting and
infrastructure optimisation
Organisational Maturity Advancement
- Began advancing observability maturity from reactive monitoring towards strategic business
alignment
- Built sustainable Center of Excellence with champion networks across business units
- Created scalable onboarding processes reducing time-to-value for new application monitoring
- Established cost allocation models enabling proper budgeting and FinOps integration