Prerequisites
The Coaching team needs to have access to the TDD Sandbox Project Review.
ATDD Team Coaching (Real Life Project)
Read more: Overview of Optivem Coaching for details about the program.
Context: Software Delivery has a strong reliance on Manual QA Testing. Automated E2E Testing may have been attempted but failed (E2E Test problems - scenario limitations, test flakiness, unmaintainable test suite).
Impact:
Unsafe Delivery (High bug count): Lots of bugs in production because Manual QA Engineers can execute only a small subset of the regression test suite due to time constraints
Business Loss: Losing customers (which reduces revenue) and suffering a negative brand reputation (which loses potential future revenue).
Slow Delivery (Long release cycles): Release cycles are slow because every QA Testing cycle is time-consuming (e.g. 2 weeks) and needs to be repeated for multiple cycles (e.g. 5 cycles).
Business Loss: Slow to release to market, can't keep up with business demands, losing competitive advantage
DORA Metrics:
Deployment Frequency (DF) - Low
Lead Time for Changes (LTC) - High
Change Failure Rate (CFR) - High
Mean Time to Restore (MTTR) - High
Note: See Financial Impact - Cost Calculations in the context of Manual QA Testing.
Context: Zero Defect Software Delivery; zero known user-facing bugs in production, because system level testing is built into the delivery process. No Manual QA Regression Testing (humans would only do exploratory & usability testing, not regression testing).
Impact:
Safe delivery (Low bug count): Software is shipped without user-facing regression bugs
Business Benefit: Higher customer retention (maximizing Customer Lifetime Value -> increased revenue), and a positive brand image that attracts additional customers (further revenue increase)
Faster Delivery (Shorter release cycles): Release cycles are faster because the feedback loop for effective system level automated testing is < 1hr.
Business Benefit: Faster release cycles means we can release features sooner to the market, gaining competitive advantage
DORA Metrics:
Deployment Frequency (DF) - Increased
Lead Time for Changes (LTC) - Decreased
Change Failure Rate (CFR) - Decreased
Mean Time to Restore (MTTR) - Decreased
Wasted Time (Salary Wasted):
QA Engineers waste time on time-consuming & repetitive Regression Testing
Developers face higher rework costs due to late bug detection
Developers face higher rework costs due to unclear requirements
QA Engineers face higher retest costs due to unclear requirements
Developers face switching costs due to long delays in QA Engineer feedback
Pipeline, Smoke Tests & E2E Tests for existing functionality
Acceptance Tests & External System Contract Tests for existing functionality
ATDD for new functionalities & bug fixes
Duration: 6 months
Pipeline Setup - Commit Stage, Acceptance Stage, Release Stage
Metrics Setup (and Initial Snapshot) - Regression Bug Count, New Bug Count, Test Execution Time
Overview
Phase 1: Pipeline, Smoke Tests & E2E Tests for existing functionality
Phase 2: Acceptance Tests & External System Contract Tests for existing functionality
Phase 3: ATDD for new functionalities & bug fixes
Phase 1: Pipeline, Smoke Tests & E2E Tests (1 month)
Outputs: Smoke Tests & E2E Tests for existing functionalities, as a replacement for Manual QA Regression Testing (an illustrative test sketch follows this phase summary)
Scope: Limited happy paths; Existing functionalities
Impact: Eliminate regression bugs going to production related to happy path scenarios
Metrics: Regression Bug Count REDUCED, Test Execution Time REDUCED
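To make Phase 1 concrete, here is a minimal sketch (assumptions, not the client's actual tests) of what a smoke test and an E2E happy-path test could look like in pytest; the base URL, the /health and /orders endpoints, and the payloads are hypothetical placeholders for the client's real application and pipeline.

```python
# Hypothetical Phase 1 sketch: one smoke test plus one E2E happy-path test,
# intended to be run by the pipeline against a deployed test environment.
# APP_BASE_URL and the /health and /orders endpoints are placeholders.
import os

import requests

BASE_URL = os.environ.get("APP_BASE_URL", "https://test.example.com")


def test_smoke_application_is_up():
    """Smoke test: the deployed application responds on its health endpoint."""
    response = requests.get(f"{BASE_URL}/health", timeout=10)
    assert response.status_code == 200


def test_e2e_place_order_happy_path():
    """E2E happy path: place an order through the public API and read it back."""
    created = requests.post(
        f"{BASE_URL}/orders",
        json={"productId": "SKU-1", "quantity": 2},
        timeout=10,
    )
    assert created.status_code == 201
    order_id = created.json()["id"]

    fetched = requests.get(f"{BASE_URL}/orders/{order_id}", timeout=10)
    assert fetched.status_code == 200
    assert fetched.json()["quantity"] == 2
```

Such tests replace the manual happy-path regression checks and run on every pipeline execution, which is what drives the Test Execution Time reduction.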
Phase 2: Acceptance Tests & External System Contract Tests (2 months)
Outputs: Acceptance Tests & External System Contract Tests for existing functionalities (an illustrative contract test sketch follows this phase summary)
Scope: Happy, alternative, error paths; Existing functionalities
Impact: Eliminate regression bugs going to production related to happy path scenarios AND alternative/error scenarios
Metrics: Regression Bug Count REDUCED
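As a hypothetical illustration of the Phase 2 contract tests, the sketch below calls an assumed external rates API and asserts only the part of the response shape that our own code (and the stubs used in the Acceptance Tests) relies on; the URL and field names are invented placeholders.

```python
# Hypothetical Phase 2 sketch: contract test against a real external system.
# It verifies only the response structure our application and its test stubs
# depend on; the URL and field names are placeholders.
import requests

EXCHANGE_RATE_API = "https://api.exchange.example.com/rates"  # placeholder URL


def test_contract_exchange_rate_response_shape():
    """The external rates API still returns the fields our adapter relies on."""
    response = requests.get(EXCHANGE_RATE_API, params={"base": "EUR"}, timeout=10)
    assert response.status_code == 200

    body = response.json()
    # Our adapter (and the stub it is tested against) reads only these fields.
    assert body["base"] == "EUR"
    assert isinstance(body["rates"], dict)
    assert all(isinstance(rate, (int, float)) for rate in body["rates"].values())
```

The Acceptance Tests then exercise happy, alternative, and error paths against a stub of this external system, while the contract test keeps the stub honest.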
Phase 3: ATDD (3 months)
Outputs: Acceptance Tests & External System Contract Tests for new functionalities & bug fixes (as well as continuing to add those tests for existing functionalities, where missing) - see the ATDD sketch after this phase summary
Scope: Happy, alternative, error paths; Existing functionalities, new functionalities & bug fixes
Impact: Zero Defect Software - Eliminate both regression bugs & new bugs for happy/alternative/error scenarios
Metrics: Regression Bug Count REDUCED, New Bug Count REDUCED
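For illustration only, the sketch below shows the Phase 3 (ATDD) flow: an acceptance test written together with the PO and QA Engineer before the new functionality exists, expected to fail until the feature is implemented. The endpoint, fields, and the discount rule are assumptions, not the client's actual requirements.

```python
# Hypothetical Phase 3 sketch: an ATDD acceptance test written BEFORE the feature,
# derived from an agreed acceptance criterion. Endpoint and business rule are placeholders.
#
# Acceptance criterion (agreed with the PO and QA Engineer):
#   Given a customer order with a total above 100 EUR
#   When the order is submitted
#   Then a 10% loyalty discount is applied
import os

import requests

BASE_URL = os.environ.get("APP_BASE_URL", "https://test.example.com")


def test_loyalty_discount_applied_to_large_orders():
    # Given: an order whose total (2 x 60.00 = 120.00 EUR) exceeds the threshold
    order = {"items": [{"productId": "SKU-1", "unitPrice": 60.0, "quantity": 2}]}

    # When: the order is submitted through the application's API
    response = requests.post(f"{BASE_URL}/orders", json=order, timeout=10)
    assert response.status_code == 201

    # Then: the 10% loyalty discount is reflected in the returned totals
    body = response.json()
    assert body["discount"] == 12.0  # 10% of 120.00 EUR
    assert body["total"] == 108.0
```

The test fails first, the team implements the functionality, and the same test then guards against both new and regression bugs from that point on.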
Kickoff
2hr Kickoff with Core Team & Management for a Big Picture overview of the current state, target state, and transformation roadmap
Coaching:
24 x 1hr sessions with Core Team (i.e. approx 1 session per week)
12 x 30min sessions with Management Team (i.e. approx 1 session per fortnight)
Retrospectives:
1hr retrospective at the end of each phase
Core Team:
The Core Team size is 5-7 people maximum; we recommend at least one Backend developer, at least one Frontend developer, and one QA Engineer. One of these people will also take on the role of Team Lead. (We do not take on additional people because a larger group limits the efficiency of our sessions.)
Note: Additional participants (not part of the count) are: DevOps Engineer (needed for Pipeline review), and PO (needed for ATDD, and might be needed even prior to that). Engineering Manager & CTO can also join (muted).
Management Team:
Includes Engineering Leadership (CTO & Engineering Manager); may also include the Team Lead.
Concepts:
Regression Bug = functionality was previously working, but now is no longer working correctly
New Bug = new functionality is not working as expected
Test Execution Time = time to execute the system level tests (manual or automated)
Metrics:
Safety Metrics: Regression Bug Count & New Bug Count
Speed Metrics: Test Execution Time (part of total Delivery Time)
How to measure Metrics (examples with JIRA using ticket types/labels):
Regression Bug Count
Counting number of JIRA tickets of type/label "QA Regression Bug"
Counting number of JIRA tickets of type/label "CUSTOMER Regression Bug" (created based on Customer Support tickets)
New Bug Count
Counting number of JIRA tickets of type/label "QA New Bug"
Counting number of JIRA tickets of type/label "CUSTOMER New Bug" (created based on Customer Support tickets)
Test Execution Time
If Manual System Testing: Time for QA Engineer to manually execute tests based on documented test procedures, covering existing functionality and new functionality
If Automated System Testing: Time for the Test Suite to be executed on the Pipeline
Note: The above are just examples; we will align with the client on how they'll measure it. The measures need to be accessible to the Team, Management & the Coach (see the illustrative Jira API sketch below).
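For example, assuming the Jira REST search endpoint and hyphenated label names along the lines of those above (Jira labels cannot contain spaces; the base URL, credentials, and labels below are placeholders to be replaced with the client's own setup), the counts could be pulled automatically roughly like this:

```python
# Hypothetical sketch: count bug tickets per label via the Jira REST API.
# Base URL, credentials, and label names are placeholders for the client's setup.
import requests

JIRA_BASE_URL = "https://your-company.atlassian.net"  # placeholder
AUTH = ("metrics-bot@your-company.com", "api-token")  # placeholder user + API token


def count_issues(jql: str) -> int:
    """Return the number of issues matching a JQL query (we only need the total)."""
    response = requests.get(
        f"{JIRA_BASE_URL}/rest/api/2/search",
        params={"jql": jql, "maxResults": 0},  # maxResults=0: fetch only the count
        auth=AUTH,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["total"]


if __name__ == "__main__":
    regression_bugs = (
        count_issues('type = Bug AND labels = "QA-Regression-Bug"')
        + count_issues('type = Bug AND labels = "CUSTOMER-Regression-Bug"')
    )
    new_bugs = (
        count_issues('type = Bug AND labels = "QA-New-Bug"')
        + count_issues('type = Bug AND labels = "CUSTOMER-New-Bug"')
    )
    print(f"Regression Bug Count: {regression_bugs}")
    print(f"New Bug Count: {new_bugs}")
```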
Measuring Metrics over time:
We can track the measures per iteration/sprint/release, since we expect each measure to improve from one iteration to the next
We can also track the cumulative measure (see the calculation sketch after this list), whereby we expect the following:
Slowing down the growth of cumulative bugs (i.e. if you're falling down a mountain fast, you try to slow down the rate at which you're falling down)
Stopping the growth of cumulative bugs (i.e. you're no longer falling down, you're standing still)
Reducing the number of cumulative bugs (i.e. you now start improving, going uphill)
Reaching Zero Defect Software, whereby there are zero known unhandled bugs in production (this means we've reached the top of the hill) - this is the expected effect of practicing ATDD
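As a small calculation sketch (with invented numbers) of the difference between the per-iteration and the cumulative view, assuming the team also records how many bugs are resolved each sprint:

```python
# Hypothetical sketch: per-sprint bug counts vs the cumulative open-bug backlog.
# The numbers are invented purely to illustrate the calculation.
from itertools import accumulate

bugs_found_per_sprint = [9, 7, 5, 2, 1, 0]  # regression + new bugs raised per sprint
bugs_fixed_per_sprint = [3, 4, 5, 4, 3, 1]  # bugs resolved in the same sprint

# Cumulative open-bug backlog: growth slows, stops, then the backlog shrinks.
open_backlog = list(accumulate(
    found - fixed for found, fixed in zip(bugs_found_per_sprint, bugs_fixed_per_sprint)
))

print("Found per sprint:", bugs_found_per_sprint)  # per-iteration measure, trending down
print("Open bug backlog:", open_backlog)           # [6, 9, 9, 7, 5, 4] -> heading toward zero
```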
Note: Aside from the above, we can add further metrics where needed, for example bug criticality - e.g. tracking the number of critical bugs, which we expect to fall over time because our system tests are intended to cover critical functionality first. Other metrics that are affected: maintenance cost of the test suite is reduced (effective automated tests are more maintainable than manual tests), developer productivity is increased (less miscommunication about requirements & less back-and-forth with QA when applying ATDD, hence lower switching costs), team morale is improved (greater employee satisfaction due to reduced stress & overtime), developer retention increases, key-man risk is reduced, and onboarding time is reduced.
Current State: QA Engineers spend a lot of time manually executing tests for existing functionality (to uncover Regression Bugs) and for changes in functionality (to uncover New Bugs). This is very repetitive, time-consuming, wasteful work.
Target State: QA Engineers will be involved in better definition of Acceptance Criteria (working together with the team, no silos) so that they contribute by using their skillset to think up scenarios (this is where the human mind is valuable) rather than repetitively executing those scenarios by hand (low-value work that can be automated). Instead of doing Manual Regression Testing, they will do Exploratory Testing & Usability Testing, because that type of testing is well-suited to humans but not to automation. In this way, QA Engineers will contribute much more to improving the End User Experience.
Communication Channels:
For Sandbox Project Review: Async communication via Substack (but can also discuss during Sessions if needed)
For Real Life Project Transformation: Communication during Sessions only (screen sharing real-life project - source code, test code)
Timelines: Timelines are for illustrative purposes; teams may go at different paces. 6 months is the effective time, but calendar-wise the sessions can be spread over up to 9 months from the coaching Start Date, after which any unused sessions expire.
Impact & Metrics: The extent to which bugs are eliminated depends on the extent of functional coverage: the higher the functional coverage, the greater the reduction in regression bugs
Inter-Phase Dependencies: Phases are sequential and cannot be skipped, because skills are built up cumulatively. To move from one phase to the next, the previous phase must have been completed. A phase counts as completed when, for each test type covered in that phase, at least one instance was done successfully on the Real Life Project.
Intra-Phase Dependencies: Within each Phase, the team is required to complete the tasks on the Sandbox first. Once the Sandbox tasks have been marked as Done in the Sandbox Project Dashboard, the team can then apply the transformation to the Real Life Project
Pipeline Setup: We recommend hiring an external DevOps consultant if you don't have the skillset in-house
Team Collaboration: We recommend the Mob Programming approach when working on the Sandbox Project and when doing a test type for the first time on the Real Life Project. The team can then split the work between themselves as they scale out such tests across the rest of the Real Life Project.
Risks:
The Pipeline prerequisite isn't satisfied because the team doesn't have the DevOps Engineer skillset, so it takes too long to build the Pipeline, or the team gets stuck, etc.
Mitigation:
To avoid delays in the Pipeline setup, involve the DevOps Engineer on time; if you don't have a DevOps Engineer and the team doesn't have the Pipeline skillset, then it's best to hire an external DevOps Consultant to do the setup faster and then train the team in maintaining the Pipeline. In the worst case, if the Pipeline isn't set up, we'll start the coaching program anyway, but there will be inefficiencies due to manually executing deployments and the system level tests.
Risks:
Limited Phases completed: If it takes too long to complete a Sandbox activity, progress onto the Real Life application is delayed; if it takes too long to complete the Real Life transfer, progress onto the next Phase is delayed.
Limited Metric Results: Within a Phase, we introduce some test type. From the coaching perspective, we require that at least ONE instance of that test type be written & reviewed by the coach. It is up to the team to continue writing MORE instances of that test type beyond the coaching sessions. If the team does just the minimum, writing only ONE instance of the test type, there won't be a visible improvement in the metric.
Mitigation:
Ensure that the team gets sufficient time allocation to work on activities beyond the Coaching sessions, especially uninterrupted time so that they can focus. If the team is constantly context-switching, constantly fire-fighting production bugs, facing strict feature release deadlines from the business, working overtime, or under a lot of stress, then they will move slowly and we won't see much metric improvement. It is essential that during this transformation any feature scope from the business is reduced as much as possible and that the team is given uninterrupted time blocks, so that the coaching team can spend the majority of their time working in the "new" way. Regarding firefighting support (for production bugs), we recommend that other developers be assigned, and possibly that new developers be hired and trained to support production maintenance, so that the coaching team is freed (as much as possible) from those issues.
Furthermore, even though the program is 6 months of effective time, sessions can be rescheduled if there are delays for whatever reason (e.g. vacations, urgent issues). Sessions expire at the end of Nov 2025.
To achieve visible improvement in metrics, for each phase it is essential that the team replicates / scales out the knowledge as much as possible. For example, even though during coaching sessions we may cover ONE instance of a test type, it is expected that the team will continue to write MULTIPLE instances of that test type outside of the coaching sessions. Since time is a constraint, we recommend Pareto-style prioritization: it's not about 100% functional coverage, but rather identifying the top 20% most important functionality and covering that first. The sooner this is done, the sooner we'll start seeing metric improvements. Furthermore, the greater the scope of coverage, the greater the protection and hence the greater the improvement.
During sessions, there is a 5 min break at the end of each hour
At the start of any session, the host can start the recording if they want to save the training for future internal reference.