September 27, 2024
Unlocking key learnings in DevOps and SRE roles
In this edition of AmnicCast, we are joined by Satendar Singh Patwal, a distinguished Senior SRE and DevOps leader with a proven track record of building and leading successful, lean DevOps and SRE teams. With 20 years of industry experience, Satendar walks us through his journey from system administrator supporting various flavors of Linux to leading DevOps and SRE teams at Adobe and RingCentral.
Embracing the Evolution: From On-Premises to Cloud
Satendar's career transformation mirrors the tech industry's shift to the cloud. Around 2011, while at Adobe, he was part of the team that moved services from on-premises to the cloud. This pivotal moment gave him the opportunity to manage scalable cloud-based services, work with SaaS, and navigate the complexity of multi-cloud environments.
He notes that during the early days when he was adapting to the cloud, scaling was a big challenge. Moving from managing physical servers to leveraging AWS meant learning to handle issues of over-provisioning and under-provisioning. As Satendar describes it, while auto-scaling looked straightforward during setup, real-world scenarios brought unexpected production loads that led to either over-allocated resources or latency-induced incidents.
Satendar reflects on how technology has since evolved. Kubernetes, for example, now offers far more out-of-the-box scaling capabilities and solves many of the earlier challenges around optimization. He highlights that today, while the complexity remains, the ecosystem of tools and platforms supporting cloud-native technologies makes it easier to address those issues.
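To make the over- and under-provisioning problem concrete, here is a minimal sketch of the proportional scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of the observed metric to the target metric, rounded up. The function name, parameter names, and the default replica bounds are illustrative assumptions, and the real autoscaler adds tolerances and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to hypothetical replica bounds.

    current_metric and target_metric use the same unit, for example
    average CPU utilization in percent.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamping guards against both runaway scale-out and scaling to zero.
    return max(min_replicas, min(max_replicas, raw))


# Load above target: scale out. Load below target: scale in.
print(desired_replicas(4, 90, 60))
print(desired_replicas(4, 30, 60))
```

A target set too low keeps more replicas running than the load needs (over-provisioning), while a target set too high delays scale-out until latency has already suffered (under-provisioning).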
Cultural Shift: From Operations to DevOps
A major theme Satendar touches on is the cultural transition from traditional operations teams to a DevOps mindset. Initially, operations teams were brought in late in the development lifecycle, often when services were already heading to production. The shift to DevOps required changing this approach and getting involved earlier in the development process, which he calls “shifting left.”
This shift, though, was not without challenges. A key step was convincing developers that DevOps was a mindset, not just a set of tools like CI/CD. Over time, the collaboration between development and DevOps evolved, with SRE teams being integrated earlier in the infrastructure provisioning process, helping to avoid under-provisioned resources and better anticipate scaling needs.
Satendar highlights that the goal of DevOps is not to slow down velocity but to act as effective gatekeepers, ensuring stability while enabling faster, safer deployments.
Cost Optimization and Automation in the Cloud
In his current role, cost observability is another critical focus for Satendar. Managing cloud infrastructure and optimizing its cost are core responsibilities of SRE teams. He shares how his team works closely with FinOps, continuously monitoring cloud costs and implementing automation to optimize resource usage.
One area of attention is development environments, which can often get out of control with unused resources. Satendar advocates for automation as a key to cost management, implementing tools that shut down dev environments over weekends or when they’re no longer in use. The self-service nature of these tools allows development teams to maintain their pace without constant manual oversight.
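The shutdown policy described above could be sketched roughly as follows. This is only an illustration of the decision logic, not the tooling Satendar's team uses: the `DevEnvironment` type, the `should_shut_down` function, and the 12-hour idle limit are all hypothetical, and the actual stop call to a cloud provider is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DevEnvironment:
    name: str
    last_activity: datetime  # last deploy, login, or request seen (assumed signal)


def should_shut_down(env, now, idle_limit=timedelta(hours=12)):
    """Return True if the environment should be stopped.

    Stops on weekends, or on any day once the environment has been
    idle for longer than idle_limit (an assumed default).
    """
    is_weekend = now.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
    is_idle = now - env.last_activity > idle_limit
    return is_weekend or is_idle


env = DevEnvironment("demo", last_activity=datetime(2024, 9, 27, 9, 0))
print(should_shut_down(env, datetime(2024, 9, 28, 10, 0)))
```

A scheduled job could run this check against every dev environment and stop the ones that match. A self-service override lets developers keep an environment alive when they need it.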
AI and the Future of DevOps
As AI becomes more integrated into DevOps workflows, Satendar sees huge potential for AI to revolutionize SRE practices. He shares how AI tools are already making a difference in monitoring and incident management. For example, machine learning can be used to adjust monitoring thresholds dynamically or proactively alert based on patterns in logs.
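As a rough illustration of a dynamically adjusted threshold, the sketch below derives the alert level from recent samples rather than a fixed number. It uses a plain statistical baseline (mean plus k standard deviations) as a simple stand-in for a learned model, so it is not machine learning in the full sense. The function names and the default `k=3.0` are assumptions for the example.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Alert threshold: mean of recent samples plus k sample standard deviations."""
    return mean(history) + k * stdev(history)


def is_anomalous(value, history, k=3.0):
    """True if value exceeds the threshold derived from history."""
    return value > dynamic_threshold(history, k)


# Recent latency samples in milliseconds (made-up values for illustration).
recent = [100, 102, 98, 101, 99]
print(is_anomalous(104, recent))
print(is_anomalous(110, recent))
```

Because the threshold is recomputed from a sliding window of recent samples, it follows gradual changes in normal behavior instead of firing on every shift past a hard-coded limit.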
Satendar predicts that AI could take over much of the manual work of writing infrastructure-as-code, such as generating Terraform templates or managing GitOps. With AI's ability to handle large amounts of data and understand context, he sees it becoming a critical tool in helping SRE teams deal with high-pressure incident response situations.
Key Takeaways
Adaptability is essential: The technology landscape is dynamic, with new tools and frameworks constantly emerging. Staying up-to-date is crucial in SRE and DevOps roles.
Embrace change as an opportunity: The rapid pace of innovation, especially with AI, presents both challenges and opportunities for professionals to grow and adapt.
Focus on higher-level problem-solving: It’s important to shift focus from simply mastering tools to solving critical business problems like scalability, reliability, and cloud cost management.
Leverage automation and tools effectively: Let automation handle repetitive tasks, while professionals concentrate on strategic and creative problem-solving that drives real business value.
Don’t define yourself by tools alone: Being an expert in Kubernetes, Git, or any other tool is less important than understanding how to solve key business issues.
Utilize frameworks like AWS’s Well-Architected Framework: Align your work with its six pillars (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability) to ensure you’re addressing the most important aspects of your role.
The core problems remain the same: While technology changes, the fundamental business challenges like reliability, performance, and cost-efficiency remain constant.
Think long-term: The tools may evolve, but the underlying goal of solving business problems through technology will always be central to SRE and DevOps roles.
At the end of the day, while the tools we use may change, the core problems we’re managing will remain largely the same. What we focus on today might look different in three years as new technologies emerge, but the abstract goal of solving business problems through the smart use of computing and other resources will always be central to what we do.
And with that, we wrap up this blog. Thank you for reading, and remember to stay open, curious, and adaptable as we continue to navigate this fast-moving, exciting technology landscape. Watch the full episode here!