Engineering Manager at Datadog, working on observability and application performance monitoring. Previously at AWS and in finance.
All engineers are intelligent people, and everyone is doing their best to get things done right.
In the world of web development, things go wrong. A lot. Disasters can happen to anyone, no matter how experienced they are. In fact, even the biggest companies in the world aren't immune to outages and other failures. These types of disasters can cost businesses a lot of money in terms of lost revenue and customers. So what can be done to prevent these disasters from happening?
In this talk, we will examine the factors that lead to failures. We will also discuss how to reduce the likelihood of errors and shorten system recovery time.
Gather around the campfire, brave DevOps practitioners, for an evening of spine-chilling tales from the world of software operations. In this unique and engaging talk, our two seasoned storytellers will regale you with a series of “scary campfire stories” that highlight common pitfalls in software operations practices. Each tale is based on real-world experiences and is designed to illuminate the challenges, missteps, and lessons learned in the ever-evolving field of DevOps.
Over the course of the presentation, our storytellers will offer valuable insights and practical advice for avoiding these common pitfalls in your own DevOps journey. We'll discuss the importance of automation, monitoring, continuous improvement, and collaboration in building resilient and high-performing software operations.
So, join us by the fire and listen closely to these cautionary tales from the DevOps trenches. Whether you're a seasoned veteran or a newcomer to the field, these stories are sure to leave you with valuable lessons and a renewed commitment to excellence in software operations.