8:00 am - 9:00 am Registration & Coffee

9:00 am - 9:30 am KEYNOTE PRESENTATION: Adopting a Focus on Holistic Observability to Maximise Developer and SRE Value

This session will explore how holistic observability can enhance the visibility of engineering and SRE functions, streamline tech stacks to reduce cost and complexity, and provide valuable metrics on developer and SRE activities to identify where the highest value work is being done.

  • Tracking and analysing hard metrics to understand what developers and SREs are doing and where their efforts yield the most significant impact.
  • Simplifying complexity and enhancing security: Discover methods to reduce complexity within your systems while simultaneously increasing security measures.
  • Rationalising your tech stack to ensure that all tools and platforms are necessary and contribute to overall objectives.
  • Discussing best practices for migrating observability tools to improve efficiency and data integration.
  • Combining disparate data sources to provide a comprehensive view that enhances decision-making and operational efficiency.

This session focuses on strategies for balancing the costs associated with observability and OpenTelemetry tooling, while maximizing its value for both SRE and development teams, ensuring cost-effectiveness without compromising system performance and reliability.

  • Assessing the cost of observability and OpenTelemetry at scale for both SRE and development teams.
  • Understanding the hidden costs of data storage, eBPF programs, processing, and analysis.
  • Identifying essential metrics and logs to minimize data overload and associated costs.
  • Leveraging open-source tools and integrated solutions to reduce expenditures.
  • Demonstrating return on investment for observability to justify expenditures to stakeholders.

Shery Brauner

Director of Engineering (SRE)
Delivery Hero


Surabhi Mahajan

Executive Director, Software Engineering
JPMorgan Chase & Co


Ivan Katliarchuk

Engineering Manager, SRE & Platforms
Holland & Barrett

10:10 am - 10:40 am PRESENTATION: Observability as Code: Defining Consistent & Reproducible Observability Configurations

Introducing the principles of Observability as Code and how it can automate and replicate observability configurations for SRE teams.

  • Outlining best practices for writing and maintaining observability configurations to ensure consistency and fault tolerance across different deployment environments.
  • Understanding the trade-offs in implementing Observability as Code, balancing value versus complexity
  • Exploring the challenges of integrating third-party tools to harness the full potential of Observability as Code within SRE practices.
  • Demonstrating how Observability as Code enhances reliability, reduces manual intervention, and improves collaboration between SRE and development teams.

10:40 am - 11:10 am Morning Coffee Break

11:10 am - 11:40 am PRESENTATION: Navigating Observability at Scale Without Crushing Devs With Corporate IT

Paul Miller - AIOps & Observability Technical Product Manager, National Grid

Addressing how you can achieve enterprise observability that developers trust, this session will explore effective strategies that balance the rigorous demands of corporate IT with the demands and needs of your developer population.

  • Enterprise Observability & AIOps delivering a proactive safety net, pre-emptively resolving issues and minimizing disruptions for developers.
  • Utilising advanced tools for integrating robust security measures that do not overwhelm developers with extensive rule management.
  • Addressing the existing mistrust between DevOps, SRE, and IT departments by refining communication and collaboration, facilitated by shared tools such as Grafana and Prometheus.
  • Redefining traditional service management practices to make them more agile and responsive
  • Developing metrics to assess the effectiveness of change management by its ability to deliver services that are both impactful and well-received by users.

Paul Miller

AIOps & Observability Technical Product Manager
National Grid

11:40 am - 12:10 pm PRESENTATION: Transforming From Software To Service With SRE & Observability

Chris Marks - Site Reliability & Response Director, WTW

In this session, we will explore the comprehensive journey of integrating Site Reliability Engineering (SRE) and Observability into an enterprise environment, to take your business from software-focused to SaaS. The presentation will address the strategic shift from being a software-centric organization to becoming a service-oriented enterprise, emphasizing the critical role of SRE in this transformation.

  • Introducing the foundational concepts of SRE and how to adapt them within an enterprise context.
  • Evaluating the benefits and challenges of implementing SRE, including its impact on team dynamics and operational efficiency.
  • Highlighting the importance of effective tooling in supporting SRE practices and ensuring robust observability.
  • Discussing the balance between enhancing observability and maintaining the feature set of your offerings.

Chris Marks

Site Reliability & Response Director

12:10 pm - 12:40 pm PRESENTATION: Empowering Developers with Out-of-the-box Observability Tooling

Jean Burellier - Technical Lead, Sanofi

This session will explore how out-of-the-box observability tooling can empower developers to enhance system stability and accelerate problem resolution. Attendees will learn how to leverage these tools to proactively manage system health and reduce cost and wasted time.

  • Leveraging pre-configured solutions that can be customised for immediate use.
  • Building an ecosystem that fosters collaborations and solutions rationalised for global use
  • Automating the detection and resolution of system issues, reducing the need for manual intervention and accelerating development cycles.
  • Building developer-centric tools with native security and monitoring protocols built in
  • Defining observability standards and limitations

Jean Burellier

Technical Lead

12:40 pm - 1:40 pm Networking Lunch

1:40 pm - 2:10 pm PRESENTATION: Observability at a Fast-Paced Airline: Strategies for Monitoring and Managing Complex Systems

Juan Valdes Gayo - Senior Engineering Manager - Head of Software Development, Ryanair

This talk will explore the unique challenges and opportunities of implementing observability in the airline industry. It will cover the different technical stacks used at Ryanair and the strategies used to monitor and manage Ryanair’s complex IT infrastructure in real time. Key topics will include:

  • Overview of the Ryanair Labs mission: Ecommerce vs Airline Operations
  • Different needs, different tech stacks: From serverless to Kubernetes
  • Data-driven decisions: Using observability as a driver for strategic decisions
  • Real-time monitoring of IT and/or operational issues
  • Alignment and distributed logging: OpenTelemetry
  • Cost control

Juan Valdes Gayo

Senior Engineering Manager - Head of Software Development

2:10 pm - 2:40 pm PRESENTATION: Driving System Reliability & Awareness With AIOps

This session will delve into how AIOps supports enhanced system visibility and predictive capabilities, offering you an unparalleled view of performance and potential issues. Attendees will learn how integrating AIOps with observability tools not only sharpens awareness but also empowers businesses to proactively address challenges and optimise operational processes.

  • Automatically analysing operational data and initiating responses, streamlining the identification and resolution of system issues.
  • Anticipating system disruptions before they occur, minimising operational impact.
  • Exploring the role of AIOps in achieving a deeper, more comprehensive view of system health and performance, aiding in quicker, more informed decision-making.
  • Optimising systems with AI to refine alert management, effectively reducing noise and focusing efforts on high-priority issues that affect system stability.

2:40 pm - 3:10 pm PRESENTATION: Demonstrating the Value of Platform Engineering & SRE in Achieving Business Objectives

This session explores how you can demonstrate platform and site reliability engineering value in driving business success. By connecting technical initiatives to wider business goals and employing the right tools and strategies, organisations can enhance visibility, improve incident response, and track developer behaviours for better outcomes.

  • Connecting technical initiatives and improvements to key business goals and metrics to demonstrate the value of platform engineering to broader objectives
  • Identifying essential tools for visibility and observability to monitor and optimize system performance.
  • Implementing self-service Kubernetes to empower teams and streamline workflows.
  • Enhancing incident identification and disaster recovery processes to minimize downtime and maintain reliability.
  • Tracking developer behaviours and building developer scorecards to measure impact and drive improvements.

3:10 pm - 3:40 pm Afternoon Coffee Break

3:40 pm - 4:10 pm PRESENTATION: Advanced Kubernetes Monitoring for Site Reliability Engineering: SRE Golden Signal Tracking – Measuring What Matters

Daniel Murphy - Head of SRE, PWC

In the realm of Site Reliability Engineering, understanding and tracking the right observability metrics is crucial for maintaining system health and performance. This session will delve into the concept of Golden Signals, highlighting their importance in effective monitoring and incident response.

  • Identifying the Golden Signals for Your SRE: Understand latency, traffic, errors, and saturation, and their impact on system reliability.
  • Implementing Effective Monitoring: Learn strategies to integrate Golden Signal tracking into your existing observability stack.
  • Prioritising Alerts: Develop methods to set appropriate thresholds and alerting mechanisms based on Golden Signal metrics.
  • Data Interpretation and Response: Enhance skills in interpreting signal data to make informed decisions during incidents.
  • Continuous Improvement: Explore practices for refining Golden Signal tracking to adapt to evolving system demands and complexities.

Daniel Murphy

Head of SRE

4:10 pm - 4:40 pm PRESENTATION: Identifying The Right Observability & SRE Tooling and Metrics To Transform Production

Divya Sharma - Director - Head of Service Management, SRE/DevOps Strategy, Natwest Group

This session addresses how integrating observability with SRE practices can drastically improve service reliability, performance, and incident response in complex IT environments. It covers finding the right tools for your requirements, exploring open source, and applying SLOs to transform operating models for production teams.

  • Establishing a foundational framework that incorporates observability into SRE tasks to enhance system reliability and accountability.
  • Assessing observability tools that are right for your requirements – Corporate, Commercial, Institutional
  • Defining and refining SLOs, applying concepts to observability, ensuring that performance metrics align with business objectives and customer expectations.
  • Launching an innersource community to explore open-source tooling
  • Techniques for continuous improvement through data-driven insights gained from observability, enhancing decision-making and strategic planning in SRE practices.

Divya Sharma

Director - Head of Service Management, SRE/DevOps Strategy
Natwest Group

4:40 pm - 5:10 pm CLOSING KEYNOTE PRESENTATION: Enhancing Developer Confidence through Robust Observability Practices

Don Tran - VP Engineering & Platform, Trinny London

Good observability enables developers to do what they want to do. This session will delve into the advanced techniques of synthetic test sweeping, testing in production, and structured logging to provide comprehensive observability. By leveraging these practices, developers can gain deeper insights and greater control, enabling them to innovate with confidence.

  • Implementing synthetic test sweeps in production to proactively identify and mitigate potential issues before they impact users.
  • Exploring strategies for safely conducting tests in production environments to validate performance and reliability in real-world conditions.
  • Enhancing the granularity and accessibility of log data for faster debugging and analysis.
  • Establishing robust observability frameworks that provide comprehensive visibility into system behaviour and performance.
  • Empowering developers to deploy with confidence and focus on creating innovative solutions.

Don Tran

VP Engineering & Platform
Trinny London

5:15 pm - 5:20 pm End of Conference Day One

5:20 pm - 6:20 pm Drinks Reception