
How digital developers can simplify evaluation choices

Dr Paulina Bondaronek is Principal Behavioural Insights Advisor in Research, Translation and Innovation at Public Health England. Here, Dr Bondaronek shares her advice for digital health developers on which evaluation method to choose. 

Many users have asked us to provide a tool to help to choose an evaluation method. This is not an easy task because each evaluation is different and there are no set rules. This represents both the challenge and the beauty of evaluating digital health products. It means you can be flexible and tailor your evaluation based on the need. 

Decide the focus of your evaluation 

Here, we’ve organised study methods into different families of designs. These are based on: 

  • what you want to get out of your evaluation 
  • practicalities and constraints 

The 3 families of designs are: 

  • understanding how users experience your product 
  • assessing your product’s effectiveness 
  • analysing value for money 

You will see that the groups are not mutually exclusive and within each method you have various options. 

All research is a pragmatic compromise between some ideal research design and what is practical. Therefore, before we look at the different sorts of designs available, there are some useful questions to consider in terms of practicalities. 

Do you already have data available on users of your digital product?

Some digital systems collect data on their users’ actions as they go; others will ask users questions about their progress. Users may be within a clinical service that routinely collects data. If this data is readily available and may be useful for an evaluation, it would be sensible to start with it. 

How easy would it be to collect data from users?

For example, it may be easy to put a question in front of users through the digital product itself. That works well to understand users’ experiences, but you might want to measure a clinical indicator that requires an examination or lab tests. These will be harder to obtain and probably cost more. If you want to make a comparison between a group of participants using a digital product and a control group not using it, remember that you will need to collect data outside of the digital product itself. 

It is easier to collect data from users when you have lots of users! If a product is still in development or has not launched yet, an evaluation will have to put resource into recruiting participants. 

It can be useful to have a comparison group. How easy will it be to identify a comparison group?

If you are going to introduce the digital product or service somewhere, it may be easy to do a before-and-after comparison. If you are rolling out the product or service in stages, then places getting the product later might work as control groups. 

1. Focus on understanding experience

If you’re interested in understanding the experiences and thought processes of your users, there are various qualitative methods that you can use. Here are some to explore:  

In practice, it is good to include more in-depth questions about experiences of your product in any evaluation, even if you are also using other methods as described below. This is called a mixed methods study. 

2. Focus on effectiveness

There is one thing common to all evaluations of effectiveness: you want to gather data to assess whether the changes you observe in the outcomes you're measuring can be attributed to your digital product or service. The more you can control for other influencing factors that could cause the change, the stronger your conclusion that what you have found can be attributed to your intervention. On the other hand, the less control you have over factors that may distort your results, the weaker your conclusions about the cause (your product) and the effect (the change).

Here, we’ve organised effectiveness studies into 3 families of designs based on the practicalities and constraints of what you can do to minimise these influencing factors. The 3 statements will direct you to groups of appropriate methods to use, explored in detail in our methods library.  

EXPERIMENTAL DESIGN  – If you can randomise participants into different groups

Generally, methods where participants are randomly assigned to different groups provide the most confidence in the findings. These methods minimise influencing factors that are out of your control but that may distort your results and conclusions. This means they can provide the strongest evidence and the most definitive answers to demonstrate cause and effect.

Bear in mind that randomising participants does not mean that you automatically remove the influencing factors that may mislead your findings. Read about risks of bias to consider when designing a randomised study. 
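
As a minimal sketch of what random assignment could look like in practice, the Python snippet below shuffles a list of participant IDs and splits it into two groups of near-equal size. The function name, the group labels and the participant IDs are all hypothetical and chosen only for illustration. A real study would normally follow a pre-specified allocation procedure.

```python
import random


def randomise_participants(participant_ids, seed=None):
    """Randomly split participants into an intervention and a control group."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {
        "intervention": shuffled[:midpoint],
        "control": shuffled[midpoint:],
    }


# Hypothetical participant IDs, for illustration only
groups = randomise_participants([f"P{n:03d}" for n in range(1, 21)], seed=42)
print(len(groups["intervention"]), len(groups["control"]))
```

Random allocation like this balances known and unknown influencing factors only on average, which is why the risks of bias mentioned above still need to be considered.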

Methods to explore and compare:   

QUASI-EXPERIMENTAL DESIGNS – If you cannot randomise to different groups but there is an appropriate comparison group

Let’s say that, for the evaluation you are doing, randomisation is not possible or practical. In this case, you want to structure your research to be as similar to a randomised design as possible. This will minimise the problems related to not having a control group. Read about risks of bias to consider when designing a non-randomised study. 

One way to increase the strength of your results is by finding a comparison group with similar attributes to your participants. For example, suppose you gave participants on a hospital ward an avatar that provides social support, and you have access to routinely collected data showing clinical indicators. You could retrospectively find a comparison group of patients who match your participants but received care as usual, i.e. no avatar. This is called a case-control study.
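
To make the idea of a matched comparison group more concrete, here is a simplified sketch in Python. It pairs each participant with the closest unused candidate on a single attribute (age). The function name, the record fields and the data are hypothetical. Real studies usually match on several attributes at once and use more robust techniques than this greedy approach.

```python
def match_comparison_group(participants, candidates, key="age"):
    """Pair each participant with the closest unused candidate on one attribute."""
    available = list(candidates)
    matches = []
    for person in participants:
        if not available:
            break  # fewer candidates than participants
        closest = min(available, key=lambda c: abs(c[key] - person[key]))
        available.remove(closest)  # match without replacement
        matches.append((person["id"], closest["id"]))
    return matches


# Hypothetical records, for illustration only
avatar_users = [{"id": "A1", "age": 70}, {"id": "A2", "age": 55}]
usual_care = [
    {"id": "U1", "age": 54},
    {"id": "U2", "age": 82},
    {"id": "U3", "age": 71},
]
print(match_comparison_group(avatar_users, usual_care))
```

In this example, each avatar user is paired with the usual-care patient closest in age, and the unmatched patient is left out of the comparison.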

Another example would be an interrupted time series design where your digital product is introduced and then withdrawn, and the periods are compared to see if the outcomes have changed. You could also strengthen your design by introducing randomisation. For example, in a multiple baseline time-series study you could randomise participants to different baseline periods, i.e. how long they wait until you introduce the avatar.

Methods to explore and compare:  

If you cannot randomise or use a comparison group but you can compare participants before and after your digital product was introduced

You may find that both randomisation and finding an appropriate comparison group are difficult. Although the strength of evaluation will be lower, there is still a spectrum of research methods you can use. One way to minimise the influencing factors associated with not having a control or a comparison group is to measure your outcome before and after participants use your digital product or service. You can then compare the differences.
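
As an illustration of comparing the differences, the sketch below summarises the change in an outcome score for the same participants before and after using a product. The function name and the scores are hypothetical. A summary like this describes the change but cannot, on its own, show that the product caused it.

```python
from statistics import mean, stdev


def before_after_change(before, after):
    """Summarise per-participant change between baseline and follow-up."""
    if len(before) != len(after):
        raise ValueError("each participant needs a before and an after score")
    differences = [a - b for b, a in zip(before, after)]
    return {
        "mean_before": mean(before),
        "mean_after": mean(after),
        "mean_change": mean(differences),
        # the standard deviation needs at least two participants
        "sd_change": stdev(differences) if len(differences) > 1 else 0.0,
    }


# Hypothetical outcome scores for the same five participants
baseline = [12, 15, 9, 14, 10]
follow_up = [14, 15, 12, 17, 11]
print(before_after_change(baseline, follow_up))
```

Keeping the scores paired by participant, rather than comparing two overall averages, means each person acts as their own comparison.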

Methods to explore and compare: 

If baseline and follow-up data are available: 

If you cannot use a comparison group and you have only done an after assessment

You collected some data after your participants used your digital product. There is no appropriate comparison group and you cannot compare participants before your digital product was introduced. Drawing any conclusions around cause and effect of your digital product will be challenging.  

This type of quasi-experimental design is most susceptible to biases and confounders that may affect the results of your evaluation. Still, using a design with one group and only testing participants after they receive the intervention will give you some insights about how your product is performing and will give you valuable directions for designing a stronger evaluation plan.

If you are not planning a comparison

Have a look at the descriptive studies in our methods library. These studies would not generally be used to determine whether your product has an impact on outcome in terms of cause and effect. However, they are a great starting point to find out about the general effect of your digital product or service and to get some ideas for what to focus on when planning your formal evaluation. Look at some of the descriptive studies in our library:  

Some of these studies involve using data that already exists (analysing routinely collected data, clinical audit). A behaviour change techniques review involves analysing the product, so you do not need participants. In a user feedback study, you do need to collect data from participants, but you don’t have any comparison group. 

3. Focus on the value for money

If your outcomes focus on the economic value of your interventions in comparison to others, then you may want to use a health economic evaluation. 

Methods to consider: 

4. Focus on patient safety

Clinical systems need to be effective and cost-effective, but they also have to be safe. NHSX, NHS Digital and NHS England and NHS Improvement have just published a Digital Clinical Safety Strategy.

Evaluating safety is an important part of that strategy. We can use similar methods as for evaluating effectiveness, particularly audit, but there are also evaluation approaches specific to safety and risk management. Retrospective hazard analysis methods look backwards. Root cause analysis, for example, considers a specific past incident and what happened. Prospective hazard analysis methods use a variety of structured techniques to try to determine where problems might arise, as discussed here. 

Choose your method

Here, we have provided initial guidance to lead you through the methods you might want to use, depending on your context. In the real world, you may want to mix and match different method types. Do you want to show that your product works (focus on effectiveness)? Do you also want to understand how and why it works (focus on understanding experience)? Do you need to show that your product is cost-effective (focus on value for money)?

The guide is most relevant for someone who is developing, or has already developed, a digital health product. However, the earlier you think about evaluation the better, and considering evaluation in your design process helps you to plan and execute your evaluation study down the line. The step-by-step guide will help you to plan your evaluation. The Methods Library includes over 30 different evaluation methods. The structure of each method is based on our research on user needs and includes a short description of the method, when to use it, pros and cons, a case study, and links to curated further resources.

Whichever method you decide to use, remember to tell others about it. This way we can build the evidence around evaluating digital health products. If you have conducted an evaluation using our guide, please let us know. We want to hear from you, to improve this resource, and to accumulate case studies of different evaluations in the digital health space.

Developing a national solution for digital health evaluation: have your say

Navigating the evaluation landscape can be tricky for digital health companies, especially when they need help from academic institutions and evaluation experts. Researchers at University College London are working with various partner organisations, including DigitalHealth.London, on a project to understand how to support SMEs with generating high-quality evidence about their products. The aim is to develop consensus about the type of national solution that has broad appeal to digital health companies, the NHS and patient groups. If you are a digital health company and would like to know more about this work, and possibly feed into the consensus-building process, please contact Dr Paulina Bondaronek at 
