We never have enough time for testing, so let’s just write the test first.

—Kent Beck


Test-First is a Built-In Quality practice derived from Extreme Programming (XP) that recommends building tests before writing code to improve delivery by focusing on the intended results.

Agile testing differs from the big-bang, deferred testing approach of traditional development. Instead, the code is developed and tested in small increments, often with the development of the test done ahead of writing the code. This way, tests help elaborate and better define the intended system behavior even before the system is coded. Quality is built in from the beginning. This just-in-time approach to the elaboration of the proposed system behavior also mitigates the need for overly detailed requirement specifications and sign-offs that are often used in traditional software development to control quality. Even better, these tests, unlike conventionally written requirements, are automated wherever possible. And when they’re not, they still provide a definitive statement of what the system does, rather than a statement of early thoughts about what it was supposed to do.


Agile testing is a continuous process that’s integral to Lean and built-in quality. In other words, Agile Teams and Agile Release Trains (ARTs) can’t go fast without high quality, and they can’t achieve that without continuous testing and, wherever possible, testing first.

The Agile Testing Matrix

Brian Marick, an Extreme Programming (XP) proponent and Agile Manifesto author, helped pioneer Agile testing by describing a matrix that guides the reasoning behind such tests. This approach was further developed in Agile Testing and extended for the scaling Agile paradigm in Agile Software Requirements [1, 2].

Figure 1 describes and extends the original matrix with guidance on what to test and when.

Figure 1. Agile Testing Matrix

The horizontal axis of the matrix contains business- or technology-facing tests. Business-facing tests are understandable by the user and written using business terminology. Technology-facing tests are written in the language of the developer and are used to evaluate whether the code delivers the behaviors the developer intended.

The vertical axis contains tests supporting development (evaluating internal code) or critiquing the solution (evaluating the system against the user’s requirements).

Classification into these four quadrants (Q1 – Q4) enables a comprehensive testing strategy that helps ensure quality:

  • Q1 – Contains unit and component tests.  Tests are written to run before and after code changes to confirm that the system works as intended.
  • Q2 – Contains functional tests (user acceptance tests) for Stories, Features, and Capabilities, to validate that they work the way the Product Owner (or Customer/user) intended. Feature- and capability-level acceptance tests confirm the aggregate behavior of many user stories. Teams automate these tests whenever possible and use manual ones only when there is no other choice.
  • Q3 – Contains system-level acceptance tests to validate that the behavior of the whole system meets usability and functionality requirements, including scenarios that are often encountered in actual system use. These may include exploratory tests, user acceptance tests, scenario-based tests, and final usability tests. Because they involve users and testers engaged in real or simulated deployment scenarios, these tests are often manual. They’re frequently the final system validation before delivery of the system to the end user.
  • Q4 – Contains system qualities testing to verify the system meets its Nonfunctional Requirements (NFRs), as exhibited in part by Enabler tests. These tests are typically supported by a suite of automated testing tools, such as load and performance testing tools, designed specifically for this purpose. Since any system change can violate conformance with NFRs, these tests must be run continuously, or at least whenever it’s practical.

Quadrants one and two define the functionality of the system. Test-first practices include both Test-Driven Development (TDD) and Acceptance Test–Driven Development (ATDD). Both involve creating the test before developing the code, and both use test automation to support continuous integration, team velocity, and development effectiveness. The following sections describe quadrants one and two. The companion articles, Release on Demand and NFRs, describe quadrants three and four, respectively.

Test-Driven (Test-First) Development

Beck and others have defined a set of XP practices under the umbrella label of TDD [3]:

  • Write the test first, which ensures that the developer understands the required behavior.
  • Run the test and watch it fail. Because there is no code yet, this may seem silly initially, but it accomplishes two useful objectives: it verifies that the test works, including its harnesses, and it demonstrates how the system will behave if the code is incorrect.
  • Write the minimum amount of code needed to pass the test. If the test fails, rework the code or the test until it routinely passes.
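
The three steps above can be sketched with Python’s standard unittest module. This is only an illustration under stated assumptions, not a prescribed workflow: the discount rule and the names build_suite, run_against, not_implemented_yet, and discount are hypothetical and were invented for this example.

```python
import unittest


def build_suite(discount_fn):
    """Step 1: the tests are written before any production code exists."""

    class DiscountTest(unittest.TestCase):
        def test_no_discount_below_threshold(self):
            self.assertEqual(discount_fn(50), 0)

        def test_ten_percent_at_or_above_threshold(self):
            self.assertEqual(discount_fn(200), 20)

    return unittest.defaultTestLoader.loadTestsFromTestCase(DiscountTest)


def run_against(discount_fn):
    """Run the suite against one implementation and return the TestResult."""
    result = unittest.TestResult()
    build_suite(discount_fn).run(result)
    return result


def not_implemented_yet(order_total):
    """Step 2: a placeholder so the tests can run, and fail, before the code exists."""
    raise NotImplementedError("discount rule not written yet")


def discount(order_total):
    """Step 3: the minimum code that makes both tests pass (hypothetical rule)."""
    return order_total // 10 if order_total >= 100 else 0


if __name__ == "__main__":
    # The first run is expected to fail, the second to pass.
    print("before code:", run_against(not_implemented_yet).wasSuccessful())
    print("after code: ", run_against(discount).wasSuccessful())
```

Running the suite against the placeholder first shows that the harness works and that the tests really do detect missing behavior. Only then is the real function written.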

In XP, this practice was designed primarily to operate in the context of unit tests, which are developer-written tests (also code) that evaluate the classes and methods used. These are a form of ‘white-box testing,’ because they test the internals of the system and the various code paths. Pair work is when two people simultaneously collaborate to develop the code and tests, providing a built-in peer review, which helps assure high quality. Even when not pairing, the tests give another set of eyes that review the code. Developers often refactor the code to pass the test as simply and elegantly as possible, which is one of the main reasons that SAFe relies on TDD.

Unit Tests

Most TDD involves unit testing. Catching code-level bugs this way keeps quality assurance (QA) and test personnel from spending most of their time finding and reporting them, and frees them to focus on system-level testing challenges, where more complex behaviors emerge from the interactions between unit code modules. The open source community has built unit testing frameworks for most languages and technologies, including Java, C, C#, C++, XML, HTTP, and Python, so there is now a framework for most coding constructs a developer is likely to encounter. These frameworks provide a harness for developing and maintaining unit tests and for automatically executing them against the system.

Because unit tests are written before or concurrently with the code, and their frameworks include test execution automation, unit testing can occur within the same Iteration. Moreover, the unit test frameworks hold and manage the accumulated unit tests. As a result, regression testing automation for unit tests is mostly free for the team. Unit testing is a cornerstone of software agility, and any investment made in comprehensive unit testing usually improves quality and productivity.
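
As a rough sketch of how a framework holds accumulated unit tests and replays them as a regression suite, the following uses Python’s unittest module. The Stack class, the test classes, and the split into an "earlier" and a "current" iteration are hypothetical and serve only as illustration.

```python
import unittest


class Stack:
    """Tiny class under test (hypothetical)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def __len__(self):
        return len(self._items)


class StackPushTest(unittest.TestCase):
    """Assumed to come from an earlier iteration; kept as a regression test."""

    def test_push_increases_size(self):
        stack = Stack()
        stack.push("a")
        self.assertEqual(len(stack), 1)


class StackPopTest(unittest.TestCase):
    """Assumed to be written in the current iteration, alongside pop()."""

    def test_pop_returns_last_pushed_item(self):
        stack = Stack()
        stack.push("a")
        stack.push("b")
        self.assertEqual(stack.pop(), "b")

    def test_pop_on_empty_stack_raises(self):
        # White-box: exercises the explicit empty-stack branch in pop().
        with self.assertRaises(IndexError):
            Stack().pop()


def regression_suite():
    """Collect every accumulated test case into one suite."""
    loader = unittest.defaultTestLoader
    return unittest.TestSuite(
        loader.loadTestsFromTestCase(case) for case in (StackPushTest, StackPopTest)
    )


def run_regression():
    """Run old and new tests together and return the TestResult."""
    result = unittest.TestResult()
    regression_suite().run(result)
    return result
```

Because the older tests stay in the suite, each run of run_regression re-checks earlier behavior alongside the new code at no extra authoring cost.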

Component Tests

Similarly, teams use tests to evaluate larger-scale components of the system. Many of these are present in various architectural layers, where they provide services needed by features or other modules. Testing tools and practices for implementing component tests vary. For example, testing frameworks can hold complicated unit tests written in the framework’s language (e.g., Java, C, C#, and so on). As a result, many teams use their unit testing frameworks to build component tests. They may not even think of them as separate functions, as it’s merely part of their testing strategy. In other cases, developers may incorporate other testing tools or write entirely customized tests in any language or environment that is most productive for them to test broader system behaviors. These tests are automated as well, where they serve as a primary defense against unanticipated consequences of refactoring and new code.
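
One way a team might reuse its unit testing framework for a component test is sketched below. It assumes a hypothetical InventoryService that sits on top of a repository layer. Neither class comes from any real library, and the in-memory repository stands in for whatever persistence the real component would use.

```python
import unittest


class InMemoryStockRepository:
    """Stand-in for the persistence layer (hypothetical)."""

    def __init__(self, stock):
        self._stock = dict(stock)

    def quantity(self, sku):
        return self._stock.get(sku, 0)

    def remove(self, sku, amount):
        self._stock[sku] = self.quantity(sku) - amount


class InventoryService:
    """Component under test: a service layer used by features (hypothetical)."""

    def __init__(self, repository):
        self._repository = repository

    def reserve(self, sku, amount):
        """Reserve stock if enough is available; return True on success."""
        if amount <= 0 or self._repository.quantity(sku) < amount:
            return False
        self._repository.remove(sku, amount)
        return True

    def available(self, sku):
        return self._repository.quantity(sku)


class InventoryComponentTest(unittest.TestCase):
    """Exercises the service and its repository together, through the service API."""

    def setUp(self):
        self.service = InventoryService(InMemoryStockRepository({"widget": 5}))

    def test_reserving_available_stock_reduces_availability(self):
        self.assertTrue(self.service.reserve("widget", 3))
        self.assertEqual(self.service.available("widget"), 2)

    def test_reserving_more_than_available_changes_nothing(self):
        self.assertFalse(self.service.reserve("widget", 6))
        self.assertEqual(self.service.available("widget"), 5)


def run_component_tests():
    """Run the component tests and return the TestResult."""
    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(InventoryComponentTest).run(result)
    return result
```

The tests look like unit tests, but each one crosses the boundary between two collaborating classes. That crossing is what lets them catch unintended side effects of refactoring either class.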

Acceptance Test–Driven Development

Quadrant two of the Agile Testing Matrix shows that test-first applies to testing stories, features, and capabilities as well as it does to unit testing. Applied at this level, the practice is called Acceptance Test–Driven Development (ATDD). Whether it’s adopted formally or informally, many teams find it more efficient to write the acceptance test first, before developing the code. After all, the goal is to have the whole system work as intended. Ken Pugh notes that the emphasis is more on expressing requirements in unambiguous terms than on focusing on the test per se [4]. He further observes that there are three alternative labels for this detailing process: ATDD, Specification By Example (SBE), and Behavior-Driven Development (BDD). There are some slight differences in these approaches, but they all emphasize understanding requirements before implementation. In particular, SBE suggests that Product Owners should provide realistic examples instead of abstract statements, as they often do not write the acceptance tests themselves.

Whether acceptance tests are viewed as a form of requirements expression or as tests, the result is the same. Acceptance tests serve to record the decisions made in the conversation between the team and the Product Owner so that the team understands the specifics of the intended behavior the story represents. (See the 3Cs in the “Writing Good Stories” section of Story, referring to the card, conversation, and confirmation.)

Functional Tests

Story acceptance tests confirm that each new user story implemented delivers its intended behavior during the iteration. If these stories work as intended, then it’s likely that each increment of software will ultimately satisfy the needs of the users.

During a Program Increment (PI), feature and capability acceptance testing is performed using similar tests. The difference is that these tests operate at the next level of abstraction, typically showing how several stories work together to deliver a more significant amount of value to the user. Of course, a more complex feature can easily have multiple feature acceptance tests associated with it. The same goes for stories, so that the tests verify the system works as intended at all levels of abstraction.

The following are characteristics of functional tests:

  • Written in the language of the business
  • Developed in a conversation between developers, testers, and the Product Owner
  • ‘Black-box tested’ to verify only the outputs of the system meet its conditions of satisfaction, without concern for the internal workings of the system
  • Run in the same iteration as the code development
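
The characteristics above can be illustrated with a small black-box story acceptance test written in plain Python. The story (orders of 50 or more ship free, otherwise a flat fee of 5), the Cart class, and the given/when/then helper names are all assumptions made up for this sketch. A team would normally agree on such wording in the conversation with the Product Owner.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """System under test, treated as a black box by the acceptance tests."""

    prices: list = field(default_factory=list)

    def add(self, price):
        self.prices.append(price)

    def shipping_fee(self):
        # Hypothetical business rule: free shipping from 50 upward, else a flat 5.
        return 0 if sum(self.prices) >= 50 else 5


# Helpers named in business language so the tests read like the story.
def given_a_cart_containing(*prices):
    cart = Cart()
    for price in prices:
        cart.add(price)
    return cart


def when_the_customer_checks_out(cart):
    return cart.shipping_fee()


def then_shipping_is(expected, actual):
    assert actual == expected, f"expected shipping {expected}, got {actual}"


def test_orders_of_50_or_more_ship_free():
    cart = given_a_cart_containing(30, 20)
    fee = when_the_customer_checks_out(cart)
    then_shipping_is(0, fee)


def test_orders_below_50_pay_the_flat_fee():
    cart = given_a_cart_containing(30, 19)
    fee = when_the_customer_checks_out(cart)
    then_shipping_is(5, fee)
```

The tests check only the observable outcome (the shipping fee) and never inspect how Cart computes it, so the implementation can be refactored without touching them.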

Although anyone can write tests, the Product Owner, as the Business Owner/customer proxy, is responsible for the efficacy of the tests. If a story does not pass its test, the team gets no credit for that story, and it’s carried over into the next iteration to fix either the test or the code.

Features, capabilities, and stories must each pass one or more acceptance tests to meet their Definition of Done. Stories realize the intended features and capabilities, and there can be multiple tests associated with a particular work item.

Automating Acceptance Testing

Because acceptance tests run at a level above the code, there are a variety of approaches to executing them, including handling them as manual tests. However, manual tests pile up very quickly. The faster you go, the faster they grow, and then the slower you go. Eventually, the amount of manual work required to run regression testing slows down the team and causes delays in value delivery.

Teams know that to avoid this, they have to automate most of their acceptance tests. They can use a variety of tools, including the target programming language (e.g., Perl, PHP, Python, Java), natural language as supported by specific testing frameworks such as Cucumber, or table formats like the ‘Framework for Integrated Testing’ (FIT). The preferred approach is to use a higher level of abstraction that works against the business logic of the application, which prevents the presentation layer or other implementation details from blocking testing.
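
To make the table-based idea concrete, here is a minimal sketch in plain Python of examples kept in a business-readable table and executed directly against the business logic. It is inspired by table formats such as FIT but does not use the FIT API. The table layout, the discount rule, and the names parse_table and run_examples are assumptions for illustration only.

```python
# Examples a Product Owner could read and edit (hypothetical rule and values).
EXAMPLES = """
| order total | customer type | discount |
| 50          | regular       | 0        |
| 200         | regular       | 20       |
| 200         | member        | 30       |
"""


def discount(order_total, customer_type):
    """Business logic under test (hypothetical rule); no presentation layer involved."""
    if order_total < 100:
        return 0
    rate = 15 if customer_type == "member" else 10
    return order_total * rate // 100


def parse_table(text):
    """Turn the pipe-delimited table into a list of row dictionaries."""
    lines = [line.strip() for line in text.strip().splitlines()]
    rows = [[cell.strip() for cell in line.strip("|").split("|")] for line in lines]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def run_examples(text):
    """Check every example row; return the rows whose actual result differs."""
    failures = []
    for row in parse_table(text):
        actual = discount(int(row["order total"]), row["customer type"])
        if actual != int(row["discount"]):
            failures.append((row, actual))
    return failures


if __name__ == "__main__":
    failed = run_examples(EXAMPLES)
    print("all examples pass" if not failed else f"failing rows: {failed}")
```

Because the examples call the discount function directly rather than driving a user interface, changes to screens or page layout do not break them.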

Acceptance Test Template/Checklist

An ATDD checklist can help the team consider a simple list of things to do, review, and discuss each time a new story appears. Agile Software Requirements provides an example of a story acceptance-testing checklist [2].

Learn More

[1] Crispin, Lisa and Janet Gregory. Agile Testing: A Practical Guide for Testers and Agile Teams. Addison-Wesley, 2009.

[2] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.

[3] Beck, Kent. Test-Driven Development. Addison-Wesley, 2003.

[4] Pugh, Ken. Lean-Agile Acceptance Test-Driven Development: Better Software Through Collaboration. Addison-Wesley, 2011.

Last update: 19 November 2017