On Your Path To Site Reliability Engineering

Cristiano Cunha

Solution Architect & Testing Advocate

Challenge Description

Site reliability engineering (SRE) is, by definition, an engineering approach to IT operations: site reliability engineers are development-focused engineers who solve operational, scale, and reliability problems. Knowing that SREs are vital to supporting the DevOps change, and being an SDET yourself, how can you apply what you already know from your engineering approach to testing in this scenario? Using the context of your company, sit down and think about what making such a change could mean for you.

Instructions:

Define context

Sit down with your teammates (or alone) and describe the context of a company that would benefit from the creation of an SRE team. Use aspects of your own company to bring some realism to this activity, or be creative and include problems you would like to discuss and get suggested solutions for.

If you prefer you can use the following example:

“In this company, the infrastructure/operations team is responsible for everything that happens in infrastructure and in production. The team is drowning in tickets and resolves issues through manual actions. They use some scripts, but these are spread across various machines with no versioning. They are also on call and support production 24x7.”

Generate plan

Reflect on the situation described and discuss it with your teammates. What different approaches will you take to implement such a change? Define 3 to 5 points that you and your team think are the most important to address, and for each point explain how to implement it and what outcome you expect.

Start using a source control tool

  • How
    • Online training on source control.
    • Sharing sessions on how we can save scripts in source control.
    • Make sure all scripts are now pulled from source control and all contributions are made there (no more local scripts).
    • Ensemble programming so everyone can see how it should be used.
  • Outcome
    • No more scripts on local machines.
    • Code starts to flow into the source control tool, and a process for sharing and contributing starts to take shape.
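
One way to check progress on the "no more local scripts" outcome is to compare the scripts found on a machine with a checkout of the shared repository. The sketch below is only an illustration under stated assumptions: the function name `find_unversioned_scripts`, the list of script extensions, and the idea of matching files by content hash are all hypothetical choices, not something prescribed by this activity.

```python
import hashlib
from pathlib import Path

# Assumed set of script types; adjust to whatever your team actually uses.
SCRIPT_SUFFIXES = {".sh", ".py", ".ps1"}


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_unversioned_scripts(local_dir: Path, repo_dir: Path) -> list[Path]:
    """List scripts under local_dir whose contents do not appear in repo_dir.

    Files are matched by content, not by name, so a renamed copy of a
    versioned script is not reported, while a locally modified one is.
    """
    known_digests = {
        file_digest(p)
        for p in repo_dir.rglob("*")
        if p.is_file()
        and p.suffix in SCRIPT_SUFFIXES
        # Skip the repository's own metadata directory.
        and ".git" not in p.relative_to(repo_dir).parts
    }
    return sorted(
        p
        for p in local_dir.rglob("*")
        if p.is_file()
        and p.suffix in SCRIPT_SUFFIXES
        and file_digest(p) not in known_digests
    )
```

Running a check like this on each operations machine gives the team a concrete list of scripts that still need to be moved into source control, which can feed the sharing sessions described above.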

Share

Choose 2-3 volunteers to describe their context and their plan for making the change, then open a Q&A to discuss the approach.

Wrap-up:

Understanding what an SRE is and the set of responsibilities they are accountable for will help you decide whether to consider a role in this area. You will also get an overview of the challenges to expect when making such a change and possible solutions to try (or adapt) as you invest in the change while moving forward.

What you’ll learn
  • Identify approaches for adopting site reliability engineering
