
Comparing methods for handling missing cost and quality of life data in the Early Endovenous Ablation in Venous Ulceration trial



This study compares methods for handling missing data to conduct cost-effectiveness analysis in the context of a clinical study.


Patients in the Early Endovenous Ablation in Venous Ulceration (EVRA) trial had between 1 year and 5.5 years (median 3 years) of follow-up under early or deferred endovenous ablation. This study compares complete-case analysis (CCA), multiple imputation using linear regression (MILR) and using predictive mean matching (MIPMM), a Bayesian parametric approach (BPA) using the R package missingHE, repeated measures fixed effect (RMFE) and repeated measures mixed model (RMM). The outcomes were total mean costs and total mean quality-adjusted life years (QALYs) at different time horizons (1 year, 3 years and 5 years).


All methods found no statistically significant difference in cost at the 5% level in all time horizons, and all methods found statistically significantly greater mean QALY at year 1. By year 3, only BPA showed a statistically significant difference in QALY between treatments. Standard errors differed substantially between the methods employed.


CCA can be biased if data are MAR and is wasteful of the data. Hence the results for CCA are likely to be inaccurate. Other methods coincide in suggesting that early intervention is cost-effective at a threshold of £30,000 per QALY at 1, 3 and 5 years. However, the variation in the results across the methods does generate some additional methodological uncertainty, underlining the importance of conducting sensitivity analyses using alternative approaches.


Missing data occur when one or more variables are missing for a given subject. This often happens in longitudinal studies and can be a particular problem in within-study cost-effectiveness analysis (CEA), because accurate estimates of total mean cost and quality-adjusted life years require full data to be collected on each subject at each follow-up time point [1,2,3].

This study compares six different methods for handling missing data in a cost-effectiveness analysis comparing early endovenous ablation versus delayed ablation for venous leg ulcer treatment [4]. The original cost-effectiveness analysis employed a repeated measure mixed model (RMM), and reported mean total cost of − £155 (95% CI − £1262 to £953) and mean total QALY of 0.073 (95% CI − 0.06 to 0.20) at 3 years [4]. RMM has been shown to have acceptable properties in simulation studies [5]. However, as missing data are always unknown, it is recommended to conduct sensitivity analyses to see how robust the results are to alternative methods, and this is the primary aim of this paper [1]. This work is unable to demonstrate which approach is “correct” because we do not know the values of the missing data. Nevertheless, this paper provides an interesting case study of “revisional research” in health economics [6], in which the original findings are challenged by employing more extensive methods to assess modelling uncertainty. The methods outlined in this paper may also be useful more generally to investigators wishing to explore the different ways that missing data approaches can be implemented with standard statistical software (STATA or R).

Due to the design of the trial, there was very low loss to follow-up, but considerable item missingness (see “Methods”: “Data”). There are several ways in which the chosen missing data approach might influence the results: different subjects used in the analysis, different numbers of observations used per subject, different statistical models of the missing data mechanism and the latent correlation between observed and missing observations, or different estimation models to estimate total mean costs and QALYs and the correlation between them [7, 8]. This paper addresses this challenge using six alternative methods: complete-case analysis (CCA) [9,10,11], multiple imputation by linear regression (MILR), multiple imputation by predictive mean matching (MIPMM) [9, 11,12,13], repeated measure mixed model (RMM), also known as random effect, repeated measure fixed effect (RMFE) [14], and a Bayesian parametric approach (BPA) using the selection model in the R package missingHE [15]. All methods assume data are Missing Completely at Random given covariates (CD-MCAR) or Missing at Random (MAR). Under CD-MCAR, the probability that data are missing depends only on observed baseline covariates, and under MAR, the probability depends only on values of observed outcome data and baseline covariates [1]. The package missingHE also provides models to explore missing not at random (MNAR) situations but this is not considered here [5]. Results are estimated over different time horizons (and hence with different quantities of missing data) of 1, 3 and 5 years. In each case we calculate the mean incremental total cost and QALY, standard errors, the incremental cost-effectiveness ratio (ICER) and the cost-effectiveness acceptability curve (CEAC). The focus in this paper is on alternative statistical methods for handling missing data. We do not explore other sources of modelling uncertainty, such as use of different sets of covariates to make predictions or alternative statistical distributions of dependent variables [16, 17].



The Early Endovenous Ablation in Venous Ulceration (EVRA) randomised clinical trial evaluated the cost-effectiveness of early versus deferred endovenous ablation to treat venous leg ulcers. The trial methods and patients are described elsewhere [4]. Briefly, resource use items in hospital, primary and community care and medications related to the treatment of venous ulceration, adverse events or complications were collected by case note review and questionnaires completed at baseline and monthly thereafter up to 1 year, plus one further telephone follow up between October 2018 and March 2019.

The covariates included in all the estimation models were as follows. TREAT is the randomised treatment (“early” coded as 1, “delayed” coded as 0). \({WEEK}_{t}\) is the time variable, coded as a set of categorical (dummy or factor) variables representing the week after randomisation at which data are observed, from t = 0 (baseline) to t = 16 (week 260). SIZE, AGE and DURATION are the ulcer size (cm²), the subject’s age (years) and the length of time with the ulcer (years), respectively, measured at baseline and centred at their means. SITE was coded as a factor variable.

Each item of resource use was multiplied by UK unit costs obtained from published literature, NHS reference costs, and manufacturers’ list prices to calculate overall costs within each of these categories for each patient [4]. The costs for each individual over their follow-up (from randomisation to the date of censoring for that individual) were assigned or apportioned into discrete time periods that corresponded to 12 monthly periods during the first year (as follow-ups were monthly) and yearly periods thereafter. This allowed discounting to be applied (3.5% per year), and facilitated analysis using the MI and mixed models in long format (see below).
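The apportioning of costs into periods makes discounting mechanical. The sketch below is illustrative only. It assumes that costs falling in year 1 are left undiscounted and that costs in year n are divided by 1.035 raised to the power n − 1, a common convention that the text does not spell out. The function names and the example costs are hypothetical.

```python
from __future__ import annotations

ANNUAL_RATE = 0.035  # discount rate stated in the text


def period_year(period: int) -> int:
    """Return the follow-up year (1-5) in which a cost period falls.

    Periods 1-12 are the monthly periods of year 1; periods 13-16 are the
    yearly periods covering years 2-5.
    """
    if not 1 <= period <= 16:
        raise ValueError("period must be between 1 and 16")
    return 1 if period <= 12 else period - 11


def discounted_total(period_costs: dict[int, float], rate: float = ANNUAL_RATE) -> float:
    """Sum period costs after discounting.

    Assumed convention (not stated in the source): year 1 is undiscounted and
    year n is discounted by (1 + rate) ** (n - 1).
    """
    return sum(
        cost / (1.0 + rate) ** (period_year(period) - 1)
        for period, cost in period_costs.items()
    )


# Hypothetical patient: 100 in month 1 and 103.50 during year 2.
example = {1: 100.0, 13: 103.5}
total = discounted_total(example)
```

Under these assumptions the year 2 cost of 103.50 is worth 100 in present-value terms, so the discounted total for this hypothetical patient is 200.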

EQ-5D-5L was collected at baseline, 6 weeks, 6 months, 12 months, plus one further telephone follow up between October 2018 and March 2019, and a utility index was calculated at each time point using a published tariff [18]. SF-36 was also administered but only up to 1 year, so was not used in this paper.

Patients who died during the study were assigned zero costs and HRQOL thereafter. Code and example data are available in Additional file 1,

Missing data

Due to rigorous trial design and conduct procedures [19], there were very few withdrawals or failures to complete questionnaires as planned in the study (see Additional file 1: Table S5). Nevertheless, data are incomplete in this study for two reasons. First, recruitment of the 450 patients into the clinical study across the 20 vascular centres took place between October 2013 and September 2016, and the study ended in March 2019. This “staggered” recruitment into the trial meant that patients had a minimum of 1 year of follow-up and a maximum of 5.5 years (median 3 years).

Second, all patients had regular, scheduled follow-up during the first year after recruitment, but to keep the cost of the research study low, only one further telephone follow-up per patient was conducted. This took place between October 2018 and March 2019. Figure 1 shows how this study design influences the missing data pattern. A patient recruited in 2014 (patient A) will have complete follow-up during the first year, missing data at years 2, 3 and 4, and one follow-up at year 5. A patient recruited in 2015 (patient B) will have complete follow-up during the first year, missing data at years 2 and 3, one follow-up at year 4, and missing data for year 5. A patient recruited in 2016 (patient C) will have complete follow-up during the first year, missing data at year 2, one follow-up at year 3, and missing data for years 4 and 5. This mainly affected collection of EQ-5D, because in the absence of telephone questionnaire data, most types of resource use and clinical outcomes could still be obtained from case notes.

Fig. 1 Schematic relation between recruitment date and missing data pattern for 3 hypothetical patients

The pattern of missingness was examined using descriptive statistics and a logistic regression of indicators of missing cost and EQ-5D data on treatment allocation and a selection of baseline variables (Eq. 1) [1].

$$\operatorname{logit}\left({\pi }_{it}\right)={\gamma }_{1}{TREAT}_{i}+{\gamma }_{2}{DURATION}_{i}+{\gamma }_{3}{AGE}_{i}+{\gamma }_{4}{SIZE}_{i}+{\gamma }_{5}{SITE}_{i}+{\gamma }_{6}{WEEK}_{t}$$

where \({\pi }_{it}\) denotes the probability that an observation is missing in individual i at time t.

Cost-effectiveness analysis was conducted using aggregated data (CCA and BPA) and disaggregated (longitudinal) data (MI, RMM and RMFE). Table 1 summarises the approaches. Further details are also given in Additional file 1.

Table 1 Overview of approaches employed to handle missing data

Repeated measure: mixed model and fixed effect

The effects of the events on HRQOL and costs were computed using a repeated measures regression model with the differences between subjects (\({\varsigma}_{i}\)) modelled as a random effect (RMM) or a fixed effect (RMFE) (Eq. 2). The RMFE method eliminates unobserved time-invariant confounders without imposing any additional assumptions on \({\varsigma}_{i}\). The RMM method assumes that the unobserved heterogeneity \({\varsigma}_{i}\) is not correlated with the other covariates [20].

$${Y}_{it}={\beta }_{0}+{\beta }_{1}{TREAT}_{i}+{\beta }_{2}{DURATION}_{i}+{\beta }_{3}{AGE}_{i}+{\beta }_{4}{SIZE}_{i}+{\beta }_{5}{SITE}_{i}+{\beta }_{6}{WEEK}_{t}+\delta \,{TREAT}_{i}\times {WEEK}_{t}+{\varsigma }_{i}+{\epsilon }_{it}$$

\({Y}_{it}\) is the outcome variable (one model for costs during each period t and another for EQ-5D tariff at the end of each period t) for each subject i at time point t. Hence for the model where the dependent variable is cost, \({Y}_{i0}\) is set to be zero for all subjects, \({Y}_{i1}\) is the cost for patient i during the first 4 weeks, \({Y}_{i2}\) is the cost between the 4th and the 8th week, and so on up to \({Y}_{i12}\) (week 52). After that, the periods are set to be yearly, so that \({Y}_{i13}\) is the cost between week 52 and week 104 (year 2), and so on up to \({Y}_{i16}\) (year 5 or week 260). \({\varsigma}_{i}\) is the random deviation of subject i’s mean costs or EQ-5D tariff from the overall mean \({\beta }_{0}\), and \({\epsilon }_{it}\), often called the within-subject residual across time, is the random deviation of \({Y}_{it}\) from subject i’s mean costs or EQ-5D tariff [21, 22]. \({Y}_{i}\) is the outcome variable for each subject i.

In RMM and RMFE, \(\widehat{\delta }\) is a vector of estimated coefficients for the interactions between treatment assignment and period number, and hence represents the mean incremental cost of early treatment (versus delayed) during period t (in the cost model) or the mean incremental EQ-5D tariff at follow-up time point t (in the EQ-5D model). These analyses were implemented using the mixed and xtreg commands in STATA 15. To estimate total mean incremental cost per patient over a desired time horizon (e.g., 3 years), the relevant period coefficients are simply added up (lincom). Thus, for example, where the dependent variable is cost accrued during the preceding period, and \({\widehat{\delta }}_{1}\) is the time-treatment interaction coefficient at 4 weeks (~ month 1), \({\widehat{\delta }}_{2}\) at 8 weeks (~ month 2), and \({\widehat{\delta }}_{3}\) at 13 weeks (month 3), then the total mean incremental cost over the first 3 months is \({\widehat{\delta }}_{1}\) + \({\widehat{\delta }}_{2}\) + \({\widehat{\delta }}_{3}\). To estimate mean total incremental QALY over a given time horizon, the “area under the curve” is calculated by applying the trapezium rule. Hence, using the coefficients from the EQ-5D model over the first 3 months (where \({\widehat{\beta }}_{1}\) is the difference in EQ-5D at baseline, \({\widehat{\delta }}_{1}\) at 4 weeks and \({\widehat{\delta }}_{2}\) at 3 months), the estimated mean total incremental QALY over the first 3 months would be \(0.5\times \left(({\widehat{\beta }}_{1}+{\widehat{\delta }}_{1})\times \frac{4}{52}+({\widehat{\delta }}_{1}+{\widehat{\delta }}_{2})\times \frac{9}{52}\right)\), where 4/52 and 9/52 are the widths in years of the intervals from baseline to week 4 and from week 4 to week 13.
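The two aggregation rules described above (summing period coefficients for cost, and the trapezium rule for QALY) can be sketched as follows. This is a minimal illustration, not the authors' STATA code: the function names are hypothetical and the coefficient values are invented purely to exercise the arithmetic.

```python
def total_incremental_cost(period_coefficients):
    """Add up treatment-by-period coefficients (the role lincom plays in the text)."""
    return sum(period_coefficients)


def incremental_qaly(weeks, utility_differences):
    """Area under the incremental EQ-5D curve by the trapezium rule, in QALYs.

    weeks: follow-up times in weeks, strictly increasing, starting at baseline.
    utility_differences: incremental EQ-5D (early minus delayed) at each time.
    """
    if len(weeks) != len(utility_differences) or len(weeks) < 2:
        raise ValueError("need matching sequences with at least two time points")
    area = 0.0
    for i in range(1, len(weeks)):
        width_years = (weeks[i] - weeks[i - 1]) / 52.0
        if width_years <= 0:
            raise ValueError("weeks must be strictly increasing")
        area += 0.5 * (utility_differences[i - 1] + utility_differences[i]) * width_years
    return area


# Hypothetical coefficients: baseline difference, then differences at 4 and 13 weeks.
beta1, delta1, delta2 = 0.0, 0.02, 0.04
qaly_3_months = incremental_qaly([0, 4, 13], [beta1, delta1, delta2])
cost_3_months = total_incremental_cost([10.0, -5.0, 2.5])
```

With time points at weeks 0, 4 and 13, `incremental_qaly` reproduces the expression given in the text, with interval widths of 4/52 and 9/52 years.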

Uncertainty was estimated by bootstrapping incremental mean costs and QALYs [23] and shown by the cost-effectiveness acceptability curve (CEAC). The bootstrap is used here because in the RMFE and RMM approaches, we run separate regressions for period costs and EQ-5D. In the MI, BPA and CCA approaches, we are able to analytically calculate the variance–covariance matrix using a joint regression of total costs and total QALY (assuming a bivariate normal distribution of the dependent variables) and so could estimate the CEAC parametrically. In the case of the RMM and RMFE models, this option is not available and so the bootstrap presents a pragmatic, numerical solution to this problem.
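The bootstrap route to the CEAC can be sketched as below. This is a simplified illustration under stated assumptions: it resamples patients within each arm and uses unadjusted differences in means, whereas the analysis described above re-estimates regression models for period costs and EQ-5D. All function names and inputs are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_increments(early, delayed, n_boot=1000, seed=1):
    """Resample (cost, qaly) pairs within each arm; return (d_cost, d_qaly) draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        e = rng.choices(early, k=len(early))      # sample with replacement
        d = rng.choices(delayed, k=len(delayed))
        d_cost = fmean([c for c, _ in e]) - fmean([c for c, _ in d])
        d_qaly = fmean([q for _, q in e]) - fmean([q for _, q in d])
        draws.append((d_cost, d_qaly))
    return draws


def prob_cost_effective(draws, threshold):
    """Share of draws with positive incremental net monetary benefit."""
    wins = sum(1 for d_cost, d_qaly in draws if threshold * d_qaly - d_cost > 0)
    return wins / len(draws)


def ceac(draws, thresholds):
    """One (threshold, probability) point of the CEAC per threshold value."""
    return [(lam, prob_cost_effective(draws, lam)) for lam in thresholds]
```

Because costs and QALYs are resampled together as pairs, each draw preserves the within-patient correlation between the two outcomes, which is what the separate regressions cannot supply analytically.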

Multiple imputation

We implemented MI using three steps. Firstly [24], M imputations (completed datasets) were generated under an imputation model replacing missing values with “plausible” substitutes, based on the distribution of the observed data, using linear regression (MILR) and predictive mean matching (MIPMM). The variables included in the imputation models for costs and EQ-5D were treatment, age, duration, site, ulcer size, ethnicity, diabetes, history of deep vein thrombosis, trial leg and EQ-5D at baseline [25].

This step was performed by multivariate imputation by chained equations (MICE) (also known as fully conditional specification (FCS) [26] or sequential regression multivariate imputation [27]), which is a practical approach to generating imputations based on a set of inter-linked imputation models. The process using MILR begins by choosing the first variable to impute, say costs in the first period (\({Y}_{1}\)). Values for all other variables to be imputed (both EQ-5D at each follow-up and period costs) were then filled in using a simple rule (simple random sampling with replacement from the observed values). Then, \({Y}_{1}\) was regressed on all other variables and baseline covariates, and missing values for \({Y}_{1}\) were replaced by simulated draws from the corresponding posterior predictive distribution of \({Y}_{1}\). The process was then repeated for the next variable (e.g., \({Y}_{2}\)), which was regressed on all other variables, using the newly imputed values in \({Y}_{1}\). Again, missing values in \({Y}_{2}\) were replaced by draws from the posterior predictive distribution of \({Y}_{2}\). The process was repeated for all other variables with missing values in turn: this is called a cycle. In order to stabilise the results, the procedure was repeated for 20 cycles to produce a single imputed data set, and the whole procedure was repeated M times to give M imputed data sets [28,29,30,31].

A second method for MI, predictive mean matching (PMM), was also used. PMM is an ad hoc method of imputing missing values which combines standard linear regression with nearest-neighbour imputation. For each missing value \({Y}_{i}\) with covariates \({X}_{i}\), PMM uses the linear predictions as a distance measure: it identifies the k individuals with observed outcomes whose predictions are closest to the prediction for the incomplete case. These nearest neighbours form the set of suitable “donors”, and the imputed value is drawn at random from their observed values. By drawing from the observed data, PMM preserves the distribution of the observed values in the missing part of the data, which makes it more robust than the fully parametric linear approach [32]. The donor pool was set to the 10 closest neighbours, as suggested by Morris et al. [33].
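The donor-selection step of PMM can be sketched as follows. This is a deliberately reduced illustration: it takes the linear predictions as given, whereas a full PMM implementation (such as the one in STATA's mi suite) also draws the regression parameters from their posterior before predicting. The function name and data layout are hypothetical.

```python
import random


def pmm_impute(pred_missing, donors, k=10, rng=None):
    """Impute one missing value by predictive mean matching.

    pred_missing: linear prediction for the incomplete case.
    donors: list of (prediction, observed_value) pairs for complete cases.
    k: size of the donor pool (10 in the analysis described above).
    """
    if not donors:
        raise ValueError("at least one donor is required")
    rng = rng or random.Random()
    # Rank complete cases by distance between predictions, keep the k closest.
    nearest = sorted(donors, key=lambda d: abs(d[0] - pred_missing))[:k]
    # The imputed value is an actually observed value from a random donor.
    return rng.choice(nearest)[1]


# Hypothetical donors: predictions 1..20, observed values ten times the prediction.
donors = [(float(p), 10.0 * p) for p in range(1, 21)]
imputed = pmm_impute(5.2, donors, k=3, rng=random.Random(0))
```

Every imputed value is one that was observed in the data, which is why PMM cannot produce implausible values such as negative costs when none were observed.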

In step 2, M = 40 imputations were performed [34]. Finally, in step 3, the results obtained from the 40 completed-data analyses were combined into a single multiple-imputation result using Rubin’s rules [35]. Analyses were implemented using the mi suite of commands in STATA 15.
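Rubin's rules for a single scalar estimate can be written in a few lines. The sketch below is illustrative: the function name is hypothetical, and the inputs would be the M point estimates and their squared standard errors from the completed-data analyses.

```python
from math import sqrt
from statistics import fmean, variance


def rubin_pool(estimates, variances):
    """Pool M completed-data estimates of one parameter using Rubin's rules.

    estimates: point estimates, one per imputed dataset.
    variances: squared standard errors, one per imputed dataset.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = fmean(estimates)
    within = fmean(variances)        # average sampling variance
    between = variance(estimates)    # variance across imputations (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, sqrt(total)


# Hypothetical inputs from three imputed datasets.
estimate, se = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what distinguishes the pooled standard error from that of any single completed dataset: it carries the extra uncertainty due to the values being imputed rather than observed.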

Monte Carlo Errors (MCE) and the fraction of missing information (FMI) were calculated to indicate the stability of the model. FMI and MCE reflect the variability of MI results across repeated uses of the same imputation procedure and are useful for determining an adequate number of imputations to obtain stable MI results [13].

For each of the M completed datasets, total cost and total QALY over 1 year, 3 years and 5 years for each subject were imputed passively using the same formulas given in the section for repeated measures. The difference between repeated measures and MI is that in the RMM and RMFE approaches, estimates of total mean cost and QALY for the group as a whole were made by linear combination (lincom) of the coefficients, while MI imputes a total cost and QALY for each subject, and then proceeds to estimate mean incremental cost and QALYs for the group as a whole using bivariate normal regression (sureg in STATA 15). Coefficients from this regression were then combined across the multiple imputed datasets using Rubin’s rules (mi estimate) [34]. The bootstrap was not used with MI as this can be complex and time-consuming [36]. Instead, the CEAC was calculated parametrically from the coefficients and covariance matrix of the bivariate normal regression.
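A parametric CEAC of the kind described above follows from treating incremental cost and incremental QALY as bivariate normal: incremental net monetary benefit at threshold λ is then normal with mean λΔQ − ΔC and variance λ²Var(ΔQ) + Var(ΔC) − 2λCov(ΔC, ΔQ). The sketch below is an illustration under that assumption; the function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def parametric_ceac_point(d_cost, d_qaly, var_cost, var_qaly, cov, threshold):
    """Probability that incremental net monetary benefit is positive.

    d_cost, d_qaly: estimated incremental mean cost and QALY.
    var_cost, var_qaly, cov: their sampling variances and covariance.
    threshold: willingness to pay per QALY.
    """
    mean_nmb = threshold * d_qaly - d_cost
    var_nmb = threshold ** 2 * var_qaly + var_cost - 2.0 * threshold * cov
    if var_nmb <= 0:
        raise ValueError("net benefit variance must be positive")
    return NormalDist().cdf(mean_nmb / sqrt(var_nmb))


# Hypothetical estimates, evaluated at the 30,000 per QALY threshold.
p = parametric_ceac_point(
    d_cost=-100.0, d_qaly=0.05,
    var_cost=250000.0, var_qaly=0.0025, cov=0.0,
    threshold=30000.0,
)
```

Evaluating the function over a grid of thresholds traces out the full curve.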

Complete case analysis

Total cost and total QALY were calculated for each individual i over the relevant time horizon T (1, 3 or 5 years). Any subject with a missing period cost or EQ-5D value within the relevant time horizon was dropped (as total cost and total QALY for individual i at time T cannot be calculated if any period costs or EQ-5D values up to T are missing). A bivariate normal regression was performed at each time horizon for total costs and total QALY (Eq. 3), where \({Y}_{i}\) is a (cost, QALY) pair for individual i. The CEAC was calculated using the bootstrap (parametric estimates were also tried and made no noticeable difference to the results so are not reported).

$${Y}_{i}={\beta }_{0}+{\beta }_{1}{eq5d0}_{i}+{\beta }_{2}{TREAT}_{i}+{\beta }_{3}{DURATION}_{i}+{\beta }_{4}{AGE}_{i}+{\beta }_{5}{SIZE}_{i}+{\beta }_{6}{SITE}_{i}+{\varepsilon }_{i}$$
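The case-deletion rule used by CCA can be sketched as below, here for period costs only (QALYs would be aggregated by the trapezium rule rather than summed). The function name, subject labels and values are hypothetical.

```python
def complete_case_totals(period_values, n_periods):
    """Total per subject over the first n_periods, dropping incomplete subjects.

    period_values: dict mapping subject id to a list of period values,
    with None marking a missing period.
    """
    totals = {}
    for subject, values in period_values.items():
        window = values[:n_periods]
        # A total is only defined if every period in the horizon is observed.
        if len(window) == n_periods and all(v is not None for v in window):
            totals[subject] = sum(window)
    return totals


# Hypothetical data: B misses period 2, C misses period 3.
data = {"A": [1.0, 2.0, 3.0], "B": [1.0, None, 3.0], "C": [1.0, 2.0, None]}
short_horizon = complete_case_totals(data, 2)   # keeps A and C
long_horizon = complete_case_totals(data, 3)    # keeps A only
```

The example shows why the analysed sample changes with the time horizon: a subject can be a complete case for a short horizon and be dropped for a longer one.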

Bayesian parametric approach (BPA)

The dataset for BPA consists of total observed cost and total observed QALY for each individual over the time period of interest (1, 3 or 5 years), along with baseline control variables. Hence one total cost and one total QALY observation per subject are used as dependent variables in the analyses, in the same way as the CCA approach. However, unlike CCA, all individuals are included in the analysis dataset. In BPA each unobserved quantity (total cost or total QALY) in the model is handled as if it were a parameter [37,38,39,40].

The BPA was implemented based on Markov Chain Monte Carlo (MCMC) using the R function selection, within the missingHE package [37]. BPA requires the specification of four models: the first two are the estimation models for the total QALY and total cost variables (Y) assuming these data are bivariate normally distributed (as Eq. 3) and the last two are the auxiliary models which are fitted (similarly to Eq. 1) to estimate the probability Y is missing using logistic regressions.

The four models include baseline covariates of treatment allocation, ulcer duration, ulcer size, age, and site. The auxiliary models also include the length of follow-up in the study, as the probability of missingness increases with time since baseline. Non-informative priors were used for the precision of the dependent variables, which were varied from 0.001 to 0.01 in sensitivity analyses. Incremental mean costs and QALY were computed from the estimation models and the CEAC was calculated parametrically from the variance–covariance matrix.

The original cost-effectiveness analysis for the EVRA trial coded SITE as a random effect. The documentation for BPA states that covariates can be included either as fixed or random effects [41], but despite our best efforts and attempting to contact the software authors for advice without reply, we were unable to implement this feature. Hence in this paper we implemented all models using fixed effects for SITE for comparability.


Pattern of missingness

No baseline data were missing. 74% of subjects had complete data (costs and EQ-5D) at 1 year, 10% at year 3 and 25% at year 5 (Table 2). This pattern arises from the staggered recruitment and because the final questionnaire was administered at a fixed calendar point irrespective of when the subject was recruited.

Table 2 Missing data pattern

The logistic model showed that the probability that a value is missing in costs and EQ-5D is related to the time in follow-up, age at baseline and site (p < 0.0001); see Additional file 1: Tables S6, S7. As EQ-5D tends to change over time since surgery (see Additional file 1: Table S9), and EQ-5D is more likely to be missing at longer follow-up, this suggests that the probability of an item being missing may be correlated with values of observed outcomes (MAR). However, it cannot be ruled out that data might be MNAR (that is, missingness correlated with unobserved outcomes).

Only subjects with complete aggregate data were used in CCA: year 1, n = 338; year 3, n = 44 and year 5, n = 147. The BPA included all the 450 subjects. The data for RMM and MI included all the longitudinal observations for all follow-ups as an unbalanced panel.

Cost effectiveness analysis

Table 3 shows a summary of the results of the cost-effectiveness analysis with the six different approaches at each time horizon. All methods agreed that there was no statistically significant difference in cost at the 5% level at any time horizon. Early intervention was associated with statistically significantly greater mean QALY among all methods at year 1. BPA showed a statistically significant difference at year 3, while other methods tended towards greater QALY for the intervention, but this did not reach statistical significance.

Table 3 Results of the models

At 3 years, early intervention dominated according to the RMM, RMFE and BPA methods. The ICER was £6075/QALY according to CCA, £319/QALY using MIPMM and £627/QALY using MILR. All methods suggested that early intervention is cost-effective at a threshold of £30,000 per QALY at 1- and 3-year time horizons. At a threshold of £30,000/QALY, the estimated probability that the intervention was cost-effective was 93% using RMM, 91% using RMFE and 58% using CCA (see Fig. 2).

Fig. 2 Cost-effectiveness acceptability curves at 3 years. RMM repeated measure mixed model, RMFE repeated measure fixed effect, MIPMM multiple imputation using predictive mean matching, MILR multiple imputation using linear regression, CCA complete-case analysis, BPA Bayesian parametric approach
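The distinction drawn above between a dominant result and an ICER can be made explicit with a small helper. This is an illustrative sketch with hypothetical function names; the example uses the 3-year point estimates from the original RMM analysis quoted in the introduction (incremental cost −£155, incremental QALY 0.073).

```python
def summarise_increment(d_cost, d_qaly):
    """Classify an incremental (cost, QALY) pair and return (label, icer).

    The ICER is only reported when it is a meaningful ratio; dominance and a
    zero QALY difference are flagged instead.
    """
    if d_qaly > 0 and d_cost <= 0:
        return ("dominant", None)      # more effective and no more costly
    if d_qaly < 0 and d_cost >= 0:
        return ("dominated", None)     # less effective and no less costly
    if d_qaly == 0:
        return ("undefined", None)     # ratio cannot be computed
    # Note: when both increments are negative the ratio needs separate interpretation.
    return ("icer", d_cost / d_qaly)


def net_monetary_benefit(d_cost, d_qaly, threshold):
    """Incremental net monetary benefit at a given threshold per QALY."""
    return threshold * d_qaly - d_cost


label, icer = summarise_increment(-155.0, 0.073)
nmb = net_monetary_benefit(-155.0, 0.073, 30000.0)
```

Net monetary benefit avoids the sign ambiguities of the ratio: the intervention is cost-effective at a given threshold whenever the value is positive, which is the quantity the CEAC summarises.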

When we compare the two methods for multiple imputation using M = 40, MIPMM shows a loss of efficiency of 0.03% in costs and 0.8% in QALY, while MILR shows 0.20% and 1.3% for costs and QALY, respectively. MCE were less than 10% of the standard errors (SE) in both methods, indicating reasonable stability of the models. As would be expected, imputations with MIPMM correspond more closely than MILR to the distribution of observed data (Additional file 1: Fig. S1).
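The text does not state how loss of efficiency was computed. One common definition, following Rubin, takes the relative efficiency of M imputations compared with infinitely many as 1 / (1 + FMI / M), with the loss being one minus that quantity. The sketch below assumes this definition; the function names and the FMI value are hypothetical.

```python
def relative_efficiency(fmi, m):
    """Relative efficiency of m imputations versus infinitely many.

    fmi: fraction of missing information, between 0 and 1.
    m: number of imputations.
    """
    if not 0.0 <= fmi <= 1.0 or m < 1:
        raise ValueError("fmi must lie in [0, 1] and m must be at least 1")
    return 1.0 / (1.0 + fmi / m)


def efficiency_loss_percent(fmi, m):
    """Loss of efficiency, in percent, under the assumed definition."""
    return 100.0 * (1.0 - relative_efficiency(fmi, m))


# Hypothetical example: FMI of 0.4 with the M = 40 imputations used here.
loss = efficiency_loss_percent(0.4, 40)
```

Under this definition the loss shrinks as M grows, which is why FMI is useful for judging whether the number of imputations is adequate.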

For incremental mean costs at year 1, RMM and RMFE showed greater standard errors (SE) than the other methods, 482 and 525, respectively (Fig. 3a). CCA showed the greatest SE at year 3 (831) and BPA at year 5 (807). MIPMM showed the lowest SE at all time horizons. Regarding QALY at year 1, CCA and BPA showed greater SE than other methods (Fig. 3b). BPA presented the highest SE at years 3 and 5. Other methods showed similar SE for incremental mean QALY at years 1, 3 and 5.

Fig. 3 Standard errors of (a) incremental mean costs and (b) incremental mean QALY. RMM repeated measure mixed model, RMFE repeated measure fixed effect, CCA complete-case analysis, MIPMM multiple imputation using predictive mean matching, MILR multiple imputation using linear regression, BPA Bayesian parametric approach


This paper compared six methods for handling missing data empirically, some in common use and others less so, using a real data set with several follow-up points over a long time period. We have attempted to use a similar estimation model in each case, so that differences arise mainly from the number of subjects and observations per subject that comprise the data, and the assumed latent correlation between observed and missing data.

The original cost-effectiveness analysis employed RMM, and reported mean total cost of − £155 (95% CI − £1262 to £953) and mean total QALY of 0.073 (95% CI − 0.06 to 0.20) at 3 years [4]. The very small differences arise in this paper because the original paper coded SITE as a random effect. In this paper, we code SITE as a factor variable (fixed effect). All the approaches coincide in estimating statistically significantly greater QALY at 1 year, but only BPA showed a statistically significant difference in QALY at 3 years. RMM, RMFE, MILR, MIPMM and BPA suggest the mean difference in QALY is positive (in favour of early intervention). However, the mean coefficient for incremental cost is negative in some methods and positive in others, leading to differences in the ICER.

CCA is the simplest method to implement. However, because subjects with any incomplete observations are discarded, it can be considered wasteful of the available data. Hence it is likely that the standard errors are over-estimates, arising from the low number of observations. CCA can also be biased if data are MAR. Hence the ICER for CCA could be inaccurate. Other methods coincide in suggesting that early intervention is cost-effective at a threshold of £30,000 per QALY at 1-, 3- and 5-year time horizons. However, the variation in the ICER across the methods does generate some additional methodological uncertainty, underlining the importance of conducting sensitivity analyses using alternative methods.

BPA offers a principled framework for handling missing data under the assumption of MAR. BPA includes all individuals but uses aggregate data for the dependent variables. This means that if a subject has one missing EQ-5D follow-up, then the QALY for that individual would be recorded as missing, and previous (or future) follow-ups for EQ-5D for that individual would be ignored. This means BPA can also be considered wasteful when (as is the case here) many individuals have some missing EQ-5D, in the sense that some relevant data is ignored. Hence it might be reasonable to conclude that the large standard errors generated by BPA at 3 and 5 years in this example are over-estimates.

MI, RMM and RMFE employ all the available longitudinal period cost and EQ-5D observations in all the subjects. Hence, they can be considered efficient methods in the sense that every item of observed data is used in the analysis model. This is important when there is substantial item missingness, as we have in this dataset. They are straightforward to implement using standard software. RMM and RMFE would not be a suitable option if there were considerable missing baseline covariates that needed to be included in the analysis model (the BPA selection model and CCA share this limitation). There were slight differences between RMM and RMFE. This may be due to the cluster size.

MI has been widely recommended for cost-effectiveness analysis [1, 42,43,44]. MI can impute both missing outcome data and missing baseline data. Also, simulation studies have found that MIPMM offers a better fit to the data [45]. Some caution is needed when using MIPMM if there are few donors in the vicinity of an incomplete case, leading to a risk of bias [33]. Also, if a donor is selected for many individuals or repeatedly used by the same individual across imputations, this will lead to inefficiency, underestimating the between-imputation variance. MI can compute the variance–covariance matrix of total mean cost and total mean QALY using parametric assumptions, while RMM and RMFE estimate costs and EQ-5D separately and use bootstrap simulations to estimate the correlation between total mean cost and total mean QALY. This makes both RMM and RMFE rather slow to compute, though some analysts may favour semi-parametric methods such as the bootstrap when data are not normally distributed.

Strengths and limitations

This study has compared the missing data approaches reported in Gohel et al. [4] against a wider set of methods for handling missing data. We included approaches that are commonly used, and others less so [1, 9], in a case study with a long follow-up and a high proportion of item missingness. There are also some limitations that need to be taken into account. First, other missing data approaches are available [46,47,48]. We only examined MAR mechanisms here. If data are MNAR then this may give rise to bias. The data could have been modelled as a three-level multilevel MI (time, subject and site). When the percentage of missing data is large, MI strategies that do not take into account the intra-cluster correlation can underestimate the variance of the treatment effect [7, 49, 50]. Other Bayesian models could also have been tried to model sites as random effects [5, 51]. Also, costs and QALY were assumed to be normally distributed for simplicity of modelling [52]. In this case study the standard errors for the RM models were generally greater than for MIPMM. However, since we do not know the true values of the missing data, we cannot generalise about which method is “correct”.


The variation in the results across the methods underlines the importance of conducting sensitivity analyses using alternative approaches to missing data. Further work might consider models for handling non-normal distributions and more complex missing data mechanisms.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request. Also, codes of STATA and R used in this study are available in Mendeley Data, V1,


  1. Faria R, Gomes M, Epstein D, White IR. A guide to handling missing data in cost-effectiveness analysis conducted within randomised controlled trials. Pharmacoeconomics. 2014;32(12):1157–70.

  2. Myers WR. Handling missing data in clinical trials: an overview. Ther Innov Regul Sci. 2000;34(2):525–33.

  3. Fitzmaurice G, Laird N, Ware J. Applied longitudinal analysis (2nd edition). Wiley. 2011.

  4. Gohel MS, Mora J, Szigeti M, Epstein DM, Heatley F, Bradbury A, et al. Long-term clinical and cost-effectiveness of early endovenous ablation in venous ulceration. JAMA Surg. 2020;155(12):1113.

  5. Gabrio A, Hunter R, Mason AJ, Baio G. Joint longitudinal models for dealing with missing at random data in trial-based economic evaluations. Value Health. 2021;24(5):699–706.

  6. Laxy M, Wilson ECF, Boothby CE, Griffin SJ. Incremental costs and cost effectiveness of intensive treatment in individuals with type 2 diabetes detected by screening in the ADDITION-UK trial: an update with empirical trial-based cost data. Value Health. 2017;20(10):1288–98.

  7. Gomes M, Díaz-Ordaz K, Grieve R, Kenward MG. Multiple imputation methods for handling missing data in cost-effectiveness analyses that use data from hierarchical studies: An application to cluster randomized trials. Med Decis Making. 2013;33(8):1051–63.

  8. Groenwold RHH, Moons KGM, Vandenbroucke JP. Randomized trials with missing outcome data: how to analyze and what to report. CMAJ. 2014;186(15):1153–7.

  9. Leurent B, Gomes M, Carpenter JR. Missing data in trial-based cost-effectiveness analysis: an incomplete journey. Health Econ (UK). 2018;27(6):1024–40.

  10. Carroll OU, Morris TP, Keogh RH. How are missing data in covariates handled in observational time-to-event studies in oncology? A systematic review. BMC Med Res Methodol. 2020;20(1):1–15.

  11. Manca A, Palmer S. Handling missing data in patient-level cost-effectiveness analysis alongside randomised clinical trials. Appl Health Econ Health Policy. 2005;4(2):65–75.

  12. Hayati Rezvan P, Lee KJ, Simpson JA. The rise of multiple imputation: a review of the reporting and implementation of the method in medical research. BMC Med Res Methodol. 2015;15(1):1–14.

  13. White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011;30(4):377–99.

  14. Cox E, Saramago P, Kelly J, Porta N, Hall E, Tan WS, et al. Effects of bladder cancer on UK healthcare costs and patient health-related quality of life: evidence from the BOXIT trial. Clin Genitourin Cancer. 2020;18(4):e418–42.

  15. Ma Z, Chen G. Bayesian methods for dealing with missing data problems. J Korean Stat Soc. 2018;47(3):297–313.

  16. Kreif N, Grieve R, Sadique MZ. Statistical methods for cost-effectiveness analyses that use observational data: a critical appraisal tool and review of current practice. Health Econ. 2013;22(4):486–500.

  17. Hoch JS. All dressed up and know where to go: an example of how to use net benefit regression to do a cost-effectiveness analysis with person-level data (The “A” in CEA). Clin Neuropsychiatry. 2008;5(4):175–83.

  18. van Hout B, Janssen MF, Feng YS, Kohlmann T, Busschbach J, Golicki D, et al. Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets. Value Health. 2012;15(5):708–15.

  19. Mason AJ, Gomes M, Grieve R, Carpenter JR. A Bayesian framework for health economic evaluation in studies with missing data. Health Econ. 2018;27(11):1670–83.

  20. Kennedy P. A guide to econometrics. 6th ed. Oxford: Blackwell; 2008.

  21. Rabe-Hesketh S, Skrondal A. Multilevel and longitudinal modeling using Stata. 2nd ed. College Station: Stata Press; 2008.

  22. Monsalves MJ, Bangdiwala AS, Thabane A, Bangdiwala SI. LEVEL (Logical Explanations & Visualizations of Estimates in Linear mixed models): recommendations for reporting multilevel data and analyses. BMC Med Res Methodol. 2020;20(1):1–9.

  23. Briggs AH, Wonderling DE, Mooney CZ. Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ. 1997;6(4):327–40.

  24. van Buuren S. Flexible imputation of missing data. 2nd ed. Boca Raton: Chapman and Hall/CRC; 2018.

  25. Schafer JL. Analysis of incomplete multivariate data. Boca Raton: Chapman and Hall; 2000.

  26. van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med. 1999;18(6):681–94.

  27. Raghunathan T, Lepkowski J, Hoewyk J, Solenberger P. A multivariate technique for multiply imputing missing values using a sequence of regression models. Surv Methodol. 2000;27:85–96.

  28. Royston P. Multiple imputation of missing values: update of ice. Stata J. 2005;5(4):527–36.

  29. van Buuren S. Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res. 2007;16(3):219–42.

  30. Bartlett JW, Seaman SR, White IR, Carpenter JR. Multiple imputation of covariates by fully conditional specification: accommodating the substantive model. Stat Methods Med Res. 2015;24(4):462–87.

  31. Bartlett JW, Morris TP. Multiple imputation of covariates by substantive-model compatible fully conditional specification. Stata J. 2015;15(2):437–56.

  32. Little RJA, Rubin DB. Statistical analysis with missing data. 3rd ed. Hoboken, NJ: Wiley; 2020. (Wiley series in probability and statistics).

  33. Morris TP, White IR, Royston P. Tuning multiple imputation by predictive mean matching and local residual draws. BMC Med Res Methodol. 2014;14(1):75.

  34. Bodner TE. What improves with increased missing data imputations? Struct Equ Model. 2008;15(4):651–75.

  35. Asch DA, Troxel AB, Stewart WF, Sequist TD, Jones JB, Hirsch AG, et al. Effect of financial incentives to physicians, patients, or both on lipid levels: a randomized clinical trial. 2015.

  36. Brand J, van Buuren S, le Cessie S, van den Hout W. Combining multiple imputation and bootstrap in the analysis of cost-effectiveness trial data. Stat Med. 2019;38(2):210–20.

  37. Daniels MJ, Hogan JW. Missing data in longitudinal studies: strategies for Bayesian modeling and sensitivity analysis. Boca Raton: Chapman & Hall/CRC; 2008. (Monographs on statistics and applied probability; 109).

  38. Baio G, Dawid AP. Probabilistic sensitivity analysis in health economics. Stat Methods Med Res. 2015;24(6):615–34.

  39. Baio G. Bayesian methods in health economics. 2012. pp. 1–223.

  40. Gabrio A, Mason AJ, Baio G. A full Bayesian model to handle structural ones and missingness in economic evaluations from individual-level data. Stat Med. 2019;38(8):1399–420.

  41. CRAN—Package missingHE. Accessed 1 Feb 2022.

  42. Marshall A, Billingham LJ, Bryan S. Can we afford to ignore missing data in cost-effectiveness analyses? Eur J Health Econ. 2009;10(1):1–3.

  43. Briggs A, Clark T, Wolstenholme J, Clarke P. Missing.... presumed at random: cost-analysis of incomplete data. Health Econ. 2003;12(5):377–92.

  44. Burton A, Billingham LJ, Bryan S. Cost-effectiveness in clinical trials: using multiple imputation to deal with incomplete cost data. Clin Trials. 2007;4(2):154–61.

  45. Marshall A, Altman DG, Royston P, Holder RL. Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study. BMC Med Res Methodol. 2010;10:7.

  46. Tilling K, Williamson EJ, Spratt M, Sterne JAC, Carpenter JR. Appropriate inclusion of interactions was needed to avoid bias in multiple imputation. J Clin Epidemiol. 2016;80:107–15.

  47. Shah AD, Bartlett JW, Carpenter J, Nicholas O, Hemingway H. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study. Am J Epidemiol. 2014;179(6):764–74.

  48. O’Kelly M. Multiple imputation and its application. In: Carpenter J, Kenward M (2013). Chichester: Wiley. 345 p, ISBN: 9780470740521. Biom J. 2014;56(2):352–3.

  49. Ma J, Akhtar-Danesh N, Dolovich L, Thabane L. Imputation strategies for missing binary outcomes in cluster randomized trials. 2011.

  50. Andridge RR. Quantifying the impact of fixed effects modeling of clusters in multiple imputation for cluster randomized trials. Biom J. 2011;53(1):57–74.

  51. Lambert PC, Billingham LJ, Cooper NJ, Sutton AJ, Abrams KR. Estimating the cost-effectiveness of an intervention in a clinical trial when partial cost information is available: a Bayesian approach. Health Econ. 2008;17(1):67–81.

  52. Mihaylova B, Briggs A, O’Hagan A, Thompson SG. Review of statistical methods for analysing healthcare resources and costs. Health Econ. 2010.



We are grateful to the EVRA trial members for the availability of the data: Manjit S. Gohel, MD, Cambridge University Hospitals NHS Foundation Trust, United Kingdom, Jocelyn Mora, M.Sc, Francine Heatley, B.Sc, Alun H. Davies, D.Sc, Department of Surgery and Cancer, Imperial College London, United Kingdom; Matyas Szigeti, M.Sc, Jane Warwick, Ph.D, Imperial Clinical Trials Unit, School of Public Health, Imperial College London, United Kingdom; Andrew Bradbury, M.D, University of Birmingham, United Kingdom; Richard Bulbulia, M.D, Keith R. Poskitt, M.D, Gloucestershire Hospitals National Health Service Foundation Trust, United Kingdom; Nicky Cullum, Ph.D, University of Manchester & Manchester University National Health Service Foundation Trust, United Kingdom; Isaac Nyamekye, M.D, Worcestershire Acute Hospitals National Health Service Trust, United Kingdom; Sophie Renton, MS, North West London Hospitals National Health Service Trust, United Kingdom.

Additional contribution: We thank the patient focus group who helped ascertain the importance of the EVRA research question, identify the most important outcome measures, and confirm the acceptability of trial interventions and follow-up protocols. As the Early Venous Reflux Ablation Trial Group, we thank the National Health Service centers and participating principal investigators and their colleagues for recruiting and monitoring trial participants: Addenbrooke’s Hospital: Manjit Gohel, M.D, D. Read, P. Hayes, S. Hargreaves, K. Dhillon, M. Anwar, A. Liddle, and H. Brown. Bradford Royal Infirmary, Bradford: K. Mercer, F. Gill, A. Liu, W. Jepson, A. Wormwell, H. Rafferty, and K. Storton. Charing Cross & St Mary’s Hospitals, London: A.H. Davies, K. Dhillon, R. Kaur, E. Solomon, K. Sritharan, R. Velineni, C. S. Lim, A. Busuttil, R. Bootun, C. Bicknell, M. Jenkins, T. Lane, and E. Serjeant. Cheltenham General Hospital: K. Poskitt, R. Bulbulia, J. Waldron, G. Wolfrey, F. Slim, C. Davies, L. Emerson, M. Grasty, M. Whyman, C. Wakeley, A. Cooper, J. Clapp, N. Hogg, J. Howard, J. Dyer, S. Lyes, D. Teemul, K. Harvey, M. Pride, A. Kindon, H. Price, L. Flemming, G. Birch, H. Holmes, and J. Weston. Cumberland Infirmary: T. Joseph, R. Eiffel, T. Ojimba, T. Wilson, A. Hodgson, L. Robinson, J. Todhunter, D. Heagarty, A. Mckeane, and R. McCarthy. Derriford Hospital, Plymouth: J. Barwell, C. Northcott, A. Elstone, and C. West. Frimley Park Hospital: P. Chong, D. Gerrard, A. Croucher, S. Levy, C. Martin, and T. Craig. Hull Royal Infirmary: D. Carradice, A. Firth, E. Clarke, A. Oswald, J. Sinclair, I. Chetter, J. El-Sheikha, S. Nandhra, C. Leung, and J. Hatfield. Leeds General Infirmary: J. Scott, N. Dewhirst, J. Woods, D. Russell, R. Darwood, M. Troxler, J. Thackeray, D. Bell, D. Watson, L. Williamson, and M. Todd. Musgrove Park Hospital, Taunton: J. Coulston, P. Eyers, K. Darvall, I. Hunter, A. Stewart, A. Moss, J. Rewbury, C. Adams, L. Vickery, L. Foote, H. Durman, F. Venn, P. Hill, K. James, F. Luxton, D. Greenwell, K. Roberts, S. Mitchell, M. Tate, and H. Mills. New Cross Hospital, Wolverhampton: A. Garnham, D. McIntosh, M. Green, K. Collins, J. Rankin, P. Poulton, V. Isgar, and S. Hobbs. Northwick Park Hospital, Harrow: S. Renton, K. Dhillon, M. Trivedi, M. Kafeza, S. Parsapour, H. Moore, M. Najem, S. Connarty, H. Albon, C. Lloyd, J. Trant, and S. Chhabra. Queen Elizabeth Hospital, Birmingham: R. Vohra, J. McCormack, J. Marshall, V. Hardy, R. Rogoveanu, W. Goff, and D. Gardiner. Russells Hall Hospital, Dudley: A. Garnham, R. Gidda, S. Merotra, S. Shiralkar, A. Jayatunga, R. Pathak, A. Rehman, K. Randhawa, J. Lewis, S. Fullwood, S. Jennings, S. Cole, and M. Wall. Salisbury District Hospital: C. Ranaboldo, S. Hulin, C. Clarke, R. Fennelly, R. Cooper, R. Boyes, C. Draper, L. Harris, and D. Mead. Solihull Hospital (part of the University Hospitals Birmingham National Health System Foundation Trust): A. Bradbury, L. Kelly, G. Bate, H. Davies, M. Popplewell, M. Claridge, M. Gannon, H. Khaira, M. Scriven, T. Wilmink, D. Adam, and H. Nasr. Northern General Hospital, Sheffield: D. Dodd, S. Nawaz, J. Humphreys, M. Barnes, J. Sorrell, D. Swift, P. Phillips, H. Trender, N. Fenwick, H. Newell, and C. Mason. Royal Bournemouth General Hospital: D. Rittoo, S. Baker, R. Mitchell, S. Andrews, S. Williams, J. Stephenson, and L. Vamplew. Worcester Royal Hospital: I. Nyamekye, S. Holloway, W. Hayes, J. Day, C. Clayton, and D. Harding. York Hospital: A. Thompson, A. Gibson, Z. Murphy, T. Smith, and J. Whitwell.
We thank members of our 2 oversight committees, the Trial Steering Committee (Professor Julie Brittenden [Chair]; Miss Rebecca Jane Winterborn [Consultant Vascular Surgeon]; Professor Andrea Nelson [Head of School and Professor of Wound Healing]; Dr Richard Haynes [Research Fellow and Honorary Consultant Nephrologist] and Mr Bruce Ley-Greaves [lay member] who provided invaluable input and advice as the independent lay member over the course of the study) and the Data Monitoring Committee (Professor Gerard Stansby [Chair, Professor of Vascular Surgery]; Professor Frank Smith [Professor of Vascular Surgery & Surgical Education]; Professor Marcus Flather [Professor of Medicine—Clinical Trials]; Dr. Ian Nunney [Medical Statistician]) for their support and guidance. There was no financial compensation for these contributions.


This study was funded by the National Institute for Health Research (NIHR HTA) Programme (EVRA, project number 11/129/197) and European Union’s Horizon 2020 research under Grant agreement 733203.

Author information

Authors and Affiliations



All authors contributed to the study conception, design and analysis. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Modou Diop.

Ethics declarations

Ethics approval and consent to participate

This study is a secondary analysis that did not involve human subjects beyond already existing data.

Consent for publication

Consent was not needed.

Competing interests

The authors declare they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. 

Further description of methods, data quality and results of regression analyses.

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence can be viewed on the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Diop, M., Epstein, D. Comparing methods for handling missing cost and quality of life data in the Early Endovenous Ablation in Venous Ulceration trial. Cost Eff Resour Alloc 20, 18 (2022).
