Dealing with complex workflows #GivenWhenThenWithStyle

Describing complex workflows is one of the biggest hurdles for beginners with Given-When-Then scenarios. Complex workflows imply a huge number of parameter combinations, usually too many to fit into a single feature file. Many of the techniques mentioned so far in this series don’t help with complex workflows. The trick we touched upon last week can actually work, but it needs a slightly specific interpretation. Because of the summer schedule, there will not be a new challenge this week, so I’ll use the opportunity to dive a bit deeper into this topic.

The problem with complex workflows

Workflows have always been a tricky subject for executable specifications, usually because they are misused to describe how something is tested instead of describing what it should do. (This is the difference between declarative and imperative style that was key to solving the fifth challenge.) The first book ever published on specification by example, Fit for Developing Software, even suggests that if a spec looks like a flow, it’s probably wrong. The issue with specifying a workflow, though, is that the flow is what is being tested, not just how, so the common advice does not apply. We need to capture the workflow itself, not just the decisions inside it.

Trying to specify the whole workflow in the same place usually leads to long, messy feature files. The complexity of the flow, combined with the interactions between the steps, produces too many key examples. Even when we choose the right exemplars, many of them will have similar parameter values with slight differences, but they will also differ in structure, so the usual suggestions on grouping into scenario outlines won’t work easily.

Here’s a relatively simple flow for order purchasing:

With 10 non-terminal steps, two or three outgoing connections in each step, and a possibility of cycles due to re-offers, this workflow would take at least 2000-3000 key examples to fully cover. The picture might look messy, but this is quite a simple flow compared to the ones I see in real projects. If each step had 5 or 6 outgoing links, and the flow had a few more steps, the number of cases could quickly jump to millions. Figuring out the exact number is tricky because of all those interactions. Even with a comprehensive set, it will be a big challenge to tell whether we have enough cases, or whether we need to add more.
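To get a feel for how the number of paths grows, here is a minimal Python sketch that counts start-to-finish paths through a flow graph. The graph is a made-up, much smaller flow: the step names and connections are my assumptions for illustration, not the order-purchasing diagram, and `count_paths` and `max_visits` are hypothetical names. It shows the counting idea only and does not reproduce the 2000-3000 estimate.

```python
# Hypothetical mini-flow (an assumption for illustration, NOT the article's diagram).
# Each key is a step; each value lists the steps it can lead to.
FLOW = {
    "check_id": ["check_credit", "rejected"],
    "check_credit": ["reserve", "rejected"],
    "reserve": ["send_contract", "re_offer", "rejected"],
    "re_offer": ["reserve", "rejected"],  # the re-offer loops back to reservation
    "send_contract": ["completed", "rejected"],
    "completed": [],  # terminal step
    "rejected": [],   # terminal step
}


def count_paths(flow, start, max_visits=1):
    """Count start-to-terminal paths, visiting each step at most max_visits times."""

    def walk(node, visits):
        if not flow[node]:  # terminal step: exactly one way to finish
            return 1
        total = 0
        for nxt in flow[node]:
            if visits.get(nxt, 0) < max_visits:
                visits[nxt] = visits.get(nxt, 0) + 1
                total += walk(nxt, visits)
                visits[nxt] -= 1  # backtrack before trying the next branch
        return total

    return walk(start, {start: 1})


for limit in (1, 2, 3):
    print(limit, count_paths(FLOW, "check_id", max_visits=limit))
# prints 6, 10 and 14 paths for limits 1, 2 and 3
```

Even this toy flow with five non-terminal steps has 6 paths before any re-offer is allowed, and every extra pass through the re-offer loop adds more. With more outgoing links per step and more than one loop, the counts tend to multiply rather than add.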

Single Responsibility Principle in action

The right way to approach this complex situation is to divide the cases into groups based on responsibility. In the previous post, I hinted at the division of responsibilities between the overall flow and the individual decisions. We should create one feature file that just deals with the flow, without going into the details of what happens inside each box. For each workflow step, we can then write a separate feature file with a smaller number of scenarios.

The overall feature file would need to show enough representative flow cases to prove that all the transitions work correctly, and that the right components are connected. Although the flow tests will provide some proof that individual steps work, they do not need to prove that any of the decision points or step implementations work fully, and should not even try to do that.

For example, one of the representative flows could be order processing for a new customer, who will try to lease something that is available in the inventory, and successfully complete the process. This would correspond to the blue path in the following image:

Testing this specific path does not prove that the item reservation fully works, just that it works enough for the chosen flow to pass. We still need to have a separate feature file for item reservation. Splitting specifications in this way allows us to describe various functions on different levels. In the flow feature file, we can show reservation with just a quick note, showing that we got into the right step at the right time. In the individual feature file for item reservation, the examples will be a lot more detailed and explore the right boundaries.

Instead of trying to cover the whole flow at once, this approach leads to more feature files, but overall far fewer cases. For this flow, we’ll need to work with twelve files containing 10 or so examples each, so roughly 120 examples in total instead of thousands.

Flow scenarios

When describing the overall flow, focus on path coverage. You don’t have to avoid duplication completely, since that is usually difficult, but try to minimise it.
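One way to think about picking representative flows is as a covering problem: choose a small set of paths so that every transition appears in at least one of them. The Python sketch below does this greedily on a made-up mini-flow. The graph, the step names and the helper names (`all_paths`, `pick_representative_flows`) are all my assumptions for illustration. A greedy choice is not guaranteed to be minimal in general, so treat it as a starting point for a discussion rather than a definitive method.

```python
# Hypothetical mini-flow (an assumption for illustration, NOT the article's diagram).
FLOW = {
    "check_id": ["check_credit", "rejected"],
    "check_credit": ["reserve", "rejected"],
    "reserve": ["send_contract", "re_offer", "rejected"],
    "re_offer": ["reserve", "rejected"],
    "send_contract": ["completed", "rejected"],
    "completed": [],
    "rejected": [],
}


def all_paths(flow, start, max_visits=2):
    """Yield every start-to-terminal path, visiting each step at most max_visits times."""

    def walk(path):
        node = path[-1]
        if not flow[node]:  # terminal step ends the path
            yield tuple(path)
            return
        for nxt in flow[node]:
            if path.count(nxt) < max_visits:
                yield from walk(path + [nxt])

    yield from walk([start])


def transitions(path):
    """Return the set of (from, to) transitions that a path exercises."""
    return set(zip(path, path[1:]))


def pick_representative_flows(flow, start):
    """Greedily pick paths until every reachable transition is covered."""
    uncovered = {(a, b) for a, nexts in flow.items() for b in nexts}
    candidates = list(all_paths(flow, start))
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda p: len(transitions(p) & uncovered))
        gained = transitions(best) & uncovered
        if not gained:
            break  # whatever is left cannot be reached from the start step
        chosen.append(best)
        uncovered -= gained
    return chosen


for path in pick_representative_flows(FLOW, "check_id"):
    print(" -> ".join(path))
```

For this toy flow the sketch selects 6 paths that together cover all 11 transitions. The selected paths still share their early steps, which is the kind of duplication that is hard to avoid in flow scenarios.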

It’s usually very beneficial to have the picture of the flow easily accessible from the file. If you use SpecFlow and LivingDoc, you can add markdown links into the feature description, including images. For other tools, perhaps just include the image on the same file system, or host it on a wiki and put the link into the feature file.

Here is an example of what the flow feature file might look like. Note that I’m breaking the rule of not repeating the data in the scenario title, because there is no specific reason why a representative flow was chosen. We’re just trying to cover all the paths. I could have written “left-side scenario” here, but that’s even more meaningless.

Feature: Purchase flow

  This is a high-level specification of the overall purchase flow; for individual
  steps see the [purchase flow step specs](./purchase-flow-steps/).

  The overall flow looks like this:

  ![](flow-image.png)

  Scenario: new customer, successful completion

  Given a new customer 
  And the customer ID is valid
  And the customer credit history is not risky
  And the inventory contains the following items
  | SKU         | Quantity   |
  | US-77-77-77 |     1000   |
  And the customer immediately signs "lease" contracts
  When the purchase system receives a "lease" order containing
  | SKU         | Quantity   |
  | US-77-77-77 |      100   |
  Then the customer ID should be checked
  Then the customer credit history should be checked
  Then the 100 items with SKU US-77-77-77 should be reserved for the order
  Then the "lease" contract should be sent to the customer
  Then the customer should sign the contract
  Then the contract should be processed after signing
  Then the 100 items with SKU US-77-77-77 should be scheduled for shipping
  Then a successful order notification should be sent to the customer

  Scenario: old customer, unsuccessful completion
  ...

It’s perfectly valid to ask why we need to use so much text here, especially because we’ll likely have similar information in the next scenario. Resist the urge to fit several scenarios into an outline. Because they take different paths through the flow, the scenarios will have different structures. (If they have the same structure, we’re very likely just duplicating a test through the same path to check decisions, which we should not be doing here.)

Flow scenarios can get very long and complex, but within the Given-When-Then tool constraints, this is unfortunately something we’ll have to live with. I’ve seen attempts to use images instead of words. For example, the automation layer could track steps through the flow and generate a text description of a sequence diagram, which could be converted into a picture using OmniGraffle or Graphviz. This picture could then be used as the test output, and compared to a baseline for approval testing. This approach definitely helps with complex diagrams, where an image truly says more than a thousand words, but it can also be very tricky to maintain. And it’s outside the scope of Given-When-Then tools.
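As a rough idea of what the tracking part could look like, here is a minimal Python sketch that turns a recorded list of step names into Graphviz DOT text. The function names (`to_dot`, `matches_baseline`) and the step names are hypothetical, and the sketch assumes step names contain no double quotes. It compares the generated text with a stored baseline instead of comparing rendered pictures, which is my simplification, not something the tools do for you.

```python
def to_dot(steps, name="flow"):
    """Render a recorded sequence of step names as Graphviz DOT text.

    Assumes step names contain no double quotes (no escaping is done).
    """
    lines = [f"digraph {name} {{"]
    # Number each transition so the visiting order is visible in the picture.
    for order, (src, dst) in enumerate(zip(steps, steps[1:]), start=1):
        lines.append(f'  "{src}" -> "{dst}" [label="{order}"];')
    lines.append("}")
    return "\n".join(lines)


def matches_baseline(actual, baseline):
    """Approval-style check: the run passes if the text equals the approved baseline."""
    return actual.strip() == baseline.strip()


# Hypothetical steps, as an automation layer might record them during a run.
recorded = ["customer ID check", "credit history check", "item reservation"]
print(to_dot(recorded))
```

The printed text can be saved as a `.dot` file and rendered with the Graphviz `dot` command-line tool, for example `dot -Tpng flow.dot -o flow.png`. Someone still has to review and approve the baseline whenever the flow changes, which is where the maintenance cost comes from.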

Step scenarios

After creating a separate feature file for the overall flow, we can create more focused specifications for individual steps in the process. This makes it easy to define and try out boundaries specific to a part of the workflow. For example, combining reservations and re-offers could in theory lead to deadlock situations, where two customers are competing for the same items and block each other indefinitely. Testing the reservation process to avoid deadlocks can be incredibly difficult if we need to somehow set up and coordinate two complete flows. The same test is relatively easy to describe and execute if we can just focus on the reservation part. We can also easily check the various ways a reservation can fail, or partially complete, without having to worry about the overall flow context.

Feature: Inventory reservation

  The inventory system can reserve items for a specific order, for a period 
  of time. The reservations can be fully successful, or partial to support
  re-offers. In case of partial reservations, we must take care to avoid
  mutual deadlock situations between two orders.

  Background:

  Given the inventory contains the following items
  | SKU         | Total quantity | Available quantity |
  | US-1000     |           1000 |               1000 |
  | US-0500     |           1000 |                500 |
  | US-0010     |           1000 |                 10 |
  | US-0000     |           1000 |                  0 |


  Scenario: Multi-item reservation

  Given an order with ID O-12345 and the following items
  | SKU         | Quantity |
  | US-1000     |      100 |
  | US-0500     |       99 |
  When the order reservation is processed
  Then the inventory reservation list contains
  | SKU       | Order ID | Quantity |
  | US-1000   |  O-12345 |      100 |
  | US-0500   |  O-12345 |       99 |
  And the inventory contains the following items
  | SKU         | Total quantity | Available quantity |
  | US-1000     |           1000 |                900 |
  | US-0500     |           1000 |                401 |
  | US-0010     |           1000 |                 10 |
  | US-0000     |           1000 |                  0 |

Separating feature files for the individual steps from the specification of the overall flow also lets us use a different structure for different scenarios. We do not have to enforce any kind of consistency across them.
