Is performance testing in non-representative environments a complete waste of time?

Date: July 2017

What is a non-representative test environment?

It is highly unusual for performance testing to be carried out in a perfectly live-like environment. There are normally at least a couple of differences between the performance test and live environments, such as data sizing, backup and storage infrastructure, interfaces to other (live) systems, or cross-site data logging and data replication.

More realistically, compromises driven by the cost of creating and maintaining a full-size non-production environment may mean that CPU and disk write speeds are specified at a lower capability in the performance test environment than for the production system, even if the number of physical devices is equivalent between the two environments.
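As a rough illustration of how such a capability gap might be accounted for, the sketch below extrapolates measured test-environment throughput to production using the ratio of CPU capability. The function name and figures are invented for the example, and the linear-scaling assumption is a deliberate simplification: real systems rarely scale this cleanly, so treat any such figure as a planning estimate rather than a prediction.

```python
# Hypothetical extrapolation of test-environment results to production
# capacity. Assumes (unrealistically) linear scaling with CPU capability,
# so the output is a planning estimate, not a guarantee.

def extrapolate_throughput(test_tps: float,
                           test_cpu_ghz: float,
                           prod_cpu_ghz: float) -> float:
    """Scale measured throughput by the ratio of CPU capability."""
    return test_tps * (prod_cpu_ghz / test_cpu_ghz)

# Example: 120 tps measured on 2.0 GHz cores; production runs 3.0 GHz cores.
estimate = extrapolate_throughput(120.0, 2.0, 3.0)
print(f"Estimated production throughput: {estimate:.0f} tps")  # 180 tps
```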

Any one of the examples mentioned above could arguably be said to make a performance test environment non-representative. However, this does not mean that we shouldn't execute performance testing; it does mean that we have to acknowledge that the performance testing will not necessarily represent the outcome of testing in a fully live-like environment.

Working with limitations, such as time, is an accepted part of a risk-based approach; in performance testing this often manifests as focussing on the core transactions of the system rather than all possible transactions. While server resource, test data and architectural limitations in a test environment are not ideal, the risks these limitations present simply need managing, as with any other testing risk.
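One way to make that focus on core transactions explicit is to score each candidate transaction by how often it is used and how business-critical it is, then build the load model around the highest-scoring ones. The sketch below illustrates this; the transaction names, frequencies and criticality weights are invented for the example, and a real selection would also factor in technical risk.

```python
# Illustrative risk-based transaction selection: score each candidate
# transaction by frequency x business criticality, then keep the
# highest-scoring ones for the load model. All data here is invented.

transactions = {
    # name: (uses per hour, business criticality 1-5)
    "login":          (5000, 5),
    "search":         (8000, 3),
    "checkout":       (1200, 5),
    "update_profile": (150,  2),
    "export_report":  (20,   4),
}

def risk_score(frequency: int, criticality: int) -> int:
    """Simple composite score; a real model might weight the factors."""
    return frequency * criticality

ranked = sorted(transactions.items(),
                key=lambda kv: risk_score(*kv[1]),
                reverse=True)

# Keep the core transactions that carry the bulk of the risk.
core = [name for name, _ in ranked[:3]]
print(core)  # ['login', 'search', 'checkout']
```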

As Rob Munford states in his blog on ‘What is performance load modelling?’:

“It is worth noting that representing system usage in a 100% accurate way is, for all intents and purposes, impossible. Therefore, like all testing, the level of detail, areas of focus and level of effort allocated to this activity is based on risk, such as business criticality, technical risk and frequency of use.”

How should we mitigate the differences and limitations of a performance test environment?

The first step is to identify all the limitations during performance test planning, and to address as many as practical during the performance test preparation phase. For example, there may be ways to make test data and interfaces more live-like, thereby narrowing the gap between the behaviour of the system under test in the performance test environment and in production.

The key challenge, however, is likely to be around the actual architecture of the system.

[Figure: live system architecture diagram]
[Figure: test system architecture diagram]

The modern technology landscape, including the use of containers and vertical scaling of cloud-based resources, makes performance testing against a representative environment much easier, subject to any restrictions imposed by your cloud provider. The ability to temporarily scale a cloud environment to a production-like state mitigates many risks without a significant cost impact.

Where the cloud isn’t an option, the physical infrastructure should be made as production-like as possible, using a risk-based approach. While scaling an entire physical test environment to production size is likely to be costly, many risks can be mitigated by scaling specific parts of the system.

What are the benefits of running performance testing on non-representative environments?

If the performance test is well simulated through performance load modelling, and the testing is executed against a spread of both transactional input data and historic data (users / products / accounts etc.) such that there is a representative mix of reads and writes, then the following benefits / system behaviours can be demonstrated, even in a non-representative environment:

  • A close approximation of the split between cache memory and disk requests
  • Suitability of the DB indexing approach
  • Evidence of whether row / table locking behaviours occur under load and how the system responds to the resulting contention
  • How queuing behaves, how queue depths increase and how the system responds to the resulting increased response times
  • How load-balancing algorithms perform for the SUT
  • The impact of increasing user concurrency
  • How back-office / scheduled tasks are impacted by increased system loads
  • Where the performance bottlenecks may be seen in the full-scale system
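Several of the behaviours above depend on the load model actually exercising a representative read/write mix. A minimal sketch of driving such a mix is shown below: each virtual user picks its next transaction with weights reflecting observed usage, so caching, indexing and locking behaviour is exercised in realistic proportions. The transaction names and the 70/20/10 split are invented for the example.

```python
# Minimal sketch of a weighted transaction mix for a load model.
# Weights would come from observed production usage; these are invented.
import random

MIX = [("read_account", 0.70), ("create_order", 0.20), ("update_account", 0.10)]

def next_transaction(rng: random.Random) -> str:
    """Pick the next transaction for a virtual user, weighted by usage."""
    names, weights = zip(*MIX)
    return rng.choices(names, weights=weights, k=1)[0]

# Sample 1,000 picks and check the read proportion tracks the model.
rng = random.Random(42)  # fixed seed for a repeatable demonstration
sample = [next_transaction(rng) for _ in range(1000)]
reads = sample.count("read_account") / len(sample)
print(f"Read proportion in sample: {reads:.2f}")
```

In a real load test each virtual-user thread would call something like `next_transaction` in its main loop, executing the chosen transaction against the system under test.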

It is always better to include performance testing as part of the SDLC: even with limitations, it is still possible to characterise the behaviour of the SUT in a controlled environment and find problems that would otherwise hit your live service, harming both your reputation and the bottom line.