Solving: How to set up complex relationships? #GivenWhenThenWithStyle

The challenge for this week was to manage complex relationships in test set-ups, especially when creating a whole network of collaborator objects, but keep the test easy to read and understand.

For a detailed explanation of the problem, check out the original challenge post. This article is a summary of the community responses and has some additional ideas on how to solve similar problems.

Manage storage constraints outside Given-When-Then files

Consistency requirements enforced by a database, or by the object model, are a large part of the problem with complex data setup. An inventory item needs a provider, the provider requires a purchase contract, with each contract in turn depending on a billing schedule. We might not care about the details of all those objects for a specific test case, but we can’t avoid setting them up. The typical – but not so good – solution is to list all these objects with all their properties explicitly in the background section of a feature file. There are two major perceived benefits of that approach:

  1. the data is completely visible to the readers of a Given-When-Then file
  2. the set-up step implementations can be relatively generic and simple

The first perceived benefit is in theory great, but it’s usually wasted because of the overwhelming amount of information. Complex object networks tend to be difficult to read and understand, so even though the information might be in a feature file, readers can’t consume it easily.

The second perceived benefit is plainly wrong; it is a misguided local optimisation. By creating generic test set-ups, we might be saving programming time, but we’ll lose a lot more trying to understand and maintain the complexity in plain text. As a general guideline, avoid trying to do complex coding tasks in Given-When-Then scenarios. Push that complexity to a programming language environment where you have proper support for loops, conditions, type checking and full IDE tooling. Focus on clarity and understanding in executable specifications.

To make the important data visible to the readers of a feature file, we’ll need to deal with all the transitive relationships and storage constraints in the step implementations, not in the feature descriptions or scenario set-ups. There are three good ways of achieving that:

  • Object factories
  • Golden Source databases
  • Object finders

I’ll explain each of these in the following sections.

Use object factories to construct complex networks from attributes

Factory methods are one of the traditional object design patterns, mentioned in the original Gang-of-Four book. The pattern is a typical solution for situations where the process of creating an object is complex, and not appropriate for the local class constructor. In that sense, it matches the situation of complex data set-ups perfectly.

To implement this pattern for Given-When-Then scenarios, I usually create a separate utility class, so I can use it from many step implementations. This allows me to limit the scenario description to the bare essentials needed for a test, such as the one below:


Given a "not-delivered" refund request 

The implementation of this step could call the RefundRequestFactory object and just pass ‘not-delivered’ as the reason. The RefundRequestFactory would set up the customer, the orders, the payment methods, the inventory items, the providers and the billing contracts as needed. For situations where we need to specify a bit more about the scenario starting point, for example in order to test that the refunded amount matches the order amount minus the fees, we can make the factory take a few more parameters. I usually do that by allowing a table of properties that will be passed directly to the factory.


Given a "not-delivered" refund request for
| order amount |     customer email |
|      100 USD |  |

The major benefit of this approach is that it can be very flexible. Factories can provide default values for all non-essential properties and collaborators, and ensure that the provided attributes are correctly mapped. Although the order amount and customer email belong to different levels of a hierarchy, we can specify them in a flat list in the Given-When-Then scenario. The factory can deal with distributing the property values to the right objects.

Factories are also a very effective way to reduce duplication. For example, the order amount may be copied to invoices, refund requests and account postings, but we don’t need to specify it three times. The factory can ensure that the dependent objects match the request.

With more complex object relationships, the collaborators might have their own factories. So a RefundRequestFactory might just call the OrderRequestFactory to build the bulk of its dependencies. This is another good reason for pushing the object construction into code, and away from feature files. Other similar objects can just reuse the OrderRequestFactory when needed.
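To make the idea concrete, here is a minimal sketch of such a factory in Python (SpecFlow step bindings would be C#, but the pattern is the same; the class, defaults and attribute names below are made up for this example):

```python
class RefundRequestFactory:
    # Illustrative defaults for all non-essential attributes, so scenarios
    # only need to mention the values that matter for the test.
    DEFAULTS = {"order amount": "50 USD", "customer email": "test@example.com"}

    @classmethod
    def create(cls, reason, overrides=None):
        # Merge scenario-provided attributes over the defaults. The flat
        # attribute list gets distributed to the right objects below.
        attrs = {**cls.DEFAULTS, **(overrides or {})}
        customer = {"email": attrs["customer email"]}
        order = {"customer": customer, "amount": attrs["order amount"]}
        # The order amount is copied to dependent objects here, so the
        # scenario never has to repeat it.
        invoice = {"order": order, "amount": order["amount"]}
        return {"reason": reason, "order": order, "invoice": invoice}
```

A step implementation for the table-based example could then call something like `RefundRequestFactory.create("not-delivered", {"order amount": "100 USD"})`, and the invoice amount would automatically match the order.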

Another benefit of object factories compared to other approaches is that the set-up process is easy to version, and relatively easy to change. It’s all contained in a single class, so programmers can easily update it, and track changes through history.

The downside of the factory approach is that the process can be quite slow if the collaborator objects need to be saved to an external storage (for example, a database). Combining factories and databases can also cause problems for multiple concurrent test runs, as factories may be creating overlapping objects.

Use Golden Source databases for external storage

‘Golden Source’ (also known as ‘Golden Record’ or ‘Master Copy’) databases are a polar opposite approach to object factories. Instead of relying on dynamic creation, these databases contain a well-known starting point for the key reference data of an application. For example, we might pre-populate a database with a set of inventory items, providers, billing schedules and contact information. An individual scenario does not need to set up any of that data, as long as it knows what to expect in the database.

The key trick for using golden source data is to use identifiers that imply the underlying references. For example, “Unavailable_Item” could be a good identifier for an inventory item that is no longer available. The key risk, conversely, is to use generic identifiers that make it difficult to understand the scenario.
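A step implementation can then resolve such identifiers against the pre-populated data, and fail loudly if the golden source has drifted. Here is a small illustrative sketch in Python (the identifiers and record layout are invented for this example):

```python
# Pre-populated golden-source reference data; each well-known identifier
# implies its own state, so scenarios don't need to set anything up.
GOLDEN_INVENTORY = {
    "Unavailable_Item": {"in_stock": 0, "discontinued": True},
    "Standard_Item": {"in_stock": 25, "discontinued": False},
}

def find_inventory_item(identifier):
    """Look up a well-known item; fail clearly if the golden data drifts."""
    try:
        return GOLDEN_INVENTORY[identifier]
    except KeyError:
        raise AssertionError(f"golden source is missing {identifier!r}")
```

The explicit failure message matters: a missing or renamed golden record should break the test with a clear explanation, not with an obscure downstream error.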

The benefit of this approach is that database setup for individual scenarios is usually very fast, so it speeds up feedback.

The downside of this approach is that the data can become obscure, and that people may have incorrect assumptions about the relationships. “Unavailable_Item” might mean a completely different thing to different parts of the business.

Another common issue with golden source databases is that versions are very difficult to control. Database storages are usually binary files, and they don’t work nicely with modern version control systems.

A potential way to manage golden data sources in a more controlled way is to use a set of SQL scripts as the primary source, and then create the binary database files from scratch. The SQL scripts are easy to store in version control systems. However, this requires setting up the database from scratch every time, so it can slow down the testing process. Because of that, full data set-up is usually not done for each test, but instead just once for the entire test suite, or even just when the SQL scripts change. Keeping SQL scripts in a version control system, and using a live “testing” database that is automatically built from those scripts but kept outside version control, often provides a good balance between confidence and feedback speed.
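The rebuild step can be very simple. Here is a hedged sketch in Python using an in-memory sqlite3 database as a stand-in for whatever store you actually use (in a real suite the scripts would be `*.sql` files read from version control, and the rebuild would run once per suite or only when the scripts change):

```python
import sqlite3

# Illustrative version-controlled SQL scripts; in practice these would be
# loaded from files, not inlined.
SCRIPTS = [
    "CREATE TABLE inventory (id TEXT PRIMARY KEY, in_stock INTEGER);",
    "INSERT INTO inventory VALUES ('Unavailable_Item', 0);",
    "INSERT INTO inventory VALUES ('Standard_Item', 25);",
]

def build_golden_database():
    # Rebuild the golden data set from scratch, so the binary database
    # file itself never needs to live in version control.
    db = sqlite3.connect(":memory:")
    for script in SCRIPTS:
        db.executescript(script)
    return db
```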

Another issue with a single shared golden source is that the data is easy to mess up. One test can change the inventory status of the “Unavailable_Item” and all of a sudden we’ll get a whole bunch of unexpected test failures. There are two good workarounds for that:

  • using database transactions to reverse changes
  • limiting golden data sources only to immutable reference data

Wrap tests into database transactions

Most relational databases provide transactions as a way of isolating concurrent processes and batching operations. By wrapping a test run into a database transaction, we can roll back the transaction at the end and just undo any changes to the data. With SpecFlow, the usual way to implement this would be to set up before/after scenario hooks.
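The hook bodies are tiny. Here is a runnable sketch of the idea in Python with sqlite3 (in SpecFlow the same two calls would go into `[BeforeScenario]` and `[AfterScenario]` hooks; the function names here are illustrative):

```python
import sqlite3

def open_test_connection():
    # autocommit mode, so the hooks control transactions explicitly
    return sqlite3.connect(":memory:", isolation_level=None)

def before_scenario(db):
    db.execute("BEGIN")  # everything the scenario does happens inside this transaction

def after_scenario(db):
    db.execute("ROLLBACK")  # undo every data change the scenario made
```

The framework doesn’t need to track what each scenario changed; rolling back the transaction restores the starting state wholesale.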

The benefit of this approach is that it is very easy to implement technically, and that it’s relatively generic. A test framework doesn’t need to know or care about data changes in individual tests. It can just roll back the current database transaction.

The downside of this approach is that it cannot be used to test processes that explicitly manage transactions, or coordinate distributed systems. For example, an API call might explicitly commit the changes to a database, and a subsequent rollback might then not completely clean up everything. With distributed systems, collaborators may not be able to see uncommitted data, so this approach is not applicable.

Many databases also lock out readers when records have uncommitted data changes, so transactions can also limit our ability to run concurrent tests.

Split transactional and reference data

An alternative workaround is to commit the changes to the database, but ensure that the key data is not modified. Usually, this involves splitting the data into reference and transactional information.

Reference information is relatively static, key set-up information, designed to be the same for all tests. For example, billing schedules, provider information and item inventory set-up could be reference data for an order management system. That kind of information could be relatively small and generic.

Transactional information is dynamic, created or modified by individual test cases. For example, orders, refunds and notifications would be transactional data for an order management system.

A well designed split between transactional and reference data also makes it easier to manage the golden data set, since we can keep the SQL scripts minimal and restore databases more easily.

The problem with this approach is that it’s difficult to make a clean cut between the two categories. Customer information may fit into both, depending on the perspective. We can optimise the test set-up by creating a few customers upfront, and then use them to create orders and request refunds. Alternatively, we can make tests more isolated by creating a new customer for each test. This is further complicated if the same test suite runs different types of tests. For example, the inventory items might need to behave like transactional data for tests related to inventory management, but they can be reference data for tests related to order refunds.

Use object finders

Factories work well for in-memory systems. Golden data sources work well for databases, but can get tricky with modifications. A third popular approach to solving complex data set-up issues is to combine the two, and use a pre-populated database that can contain partial information, complementing it with a factory that knows how to fill in the missing information.

Object finders are usually responsible for creating an object with all its dependencies, but unlike factories they start from an existing data source. For example, if a scenario requires a valid credit card, the finder might look into the database to find a customer with a valid card and return it. If it doesn’t find anything, it can use any existing customer, just create a new valid credit card and associate it with the customer. If there are no customers, it can create a valid customer object and so on. The process resembles a factory, but it treats the existing database like a temporary cache. You can decide how deep the finder goes, and at what point it gives up.
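The credit-card example above can be sketched as a short find-or-create function. This is a simplified Python illustration working on in-memory records rather than a real database, and all the names are invented for the example:

```python
def find_customer_with_valid_card(customers):
    # 1. Best case: an existing customer already has a valid card.
    for customer in customers:
        if customer.get("card", {}).get("valid"):
            return customer
    # 2. Next best: reuse any existing customer, just attach a valid card.
    if customers:
        customers[0]["card"] = {"valid": True}
        return customers[0]
    # 3. Last resort: behave like a factory and create everything from scratch.
    customer = {"name": "generated", "card": {"valid": True}}
    customers.append(customer)
    return customer
```

Each step falls back one level deeper only when the existing data can’t satisfy the request, which is what makes the database behave like a temporary cache.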

The benefit of the finder approach is that it’s much easier to ensure data isolation than just with a golden data source. For example, a step implementation may suggest to a finder that it wants to later modify a customer object, so the finder can clone an existing customer and return a modifiable copy instead of a shared reference. This allows finders to deal with different use cases, and avoid polluting reference data. A finder can treat inventory information as reference data for refund tests, but it can work with the same records as transactional data for inventory tests.

The downside of the finder approach is that it is the most complicated of all. We need to manage both object construction and database maintenance, and scenarios need to correctly report what they want to modify, and what they just want to read.


Although there is no perfect solution for all cases, the three approaches to constructing complex object networks all push the mess away from feature files into step implementations. They differ in terms of performance and ease of maintenance.

If you want to test in-memory systems, go with the object factory. If you must talk to an external database, consider how much creating the whole network takes each time, and whether this is too slow for your tests. If not, use an object factory again. If this would be too slow, then check if you can run tests in transactions easily. If so, a golden data source might be a better option. If you can’t run transactions easily, or if the golden data sources would take too long to set up, object finders are probably the best option.

Next challenge

Come back tomorrow for the next Given-When-Then with Style challenge. Meanwhile, we’d love your feedback on how to improve this article series. Please take a few moments to fill in a quick three-question form.

The next Given-When-Then with style challenge is to remove duplication from similar scenarios, in particular when groups of steps are shared between different scenarios.

Stay up to date with all the tips and tricks and follow SpecFlow on Twitter or LinkedIn.

PS: … and don’t forget to share the challenge with your friends and team members by clicking on one of the social icons below 👇