Comparing the performance of the EQ-5D and SF-6D when measuring the benefits of alleviating knee pain

Barton, Garry R; Sach, Tracey H; Avery, Anthony J; Doherty, Michael; Jenkinson, Claire; Muir, Kenneth R

doi:10.1186/1478-7547-7-12

Research
Open access
Published: 17 July 2009


Cost Effectiveness and Resource Allocation volume 7, Article number: 12 (2009)

Abstract

Objective

To assess the practicality, validity and responsiveness of using each of two utility measures (the EQ-5D and SF-6D) to measure the benefits of alleviating knee pain.

Methods

Participants in a randomised controlled trial, which was designed to compare four different interventions for people with self-reported knee pain, were asked to complete the EQ-5D, SF-6D, and Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) both pre- and post-intervention. For both utility measures, we assessed their practicality (completion rate), construct validity (ability to discriminate between baseline WOMAC severity levels), and responsiveness (ability to discriminate between three groups: those whose total WOMAC score i) did not improve, ii) improved by <20%, and iii) improved by ≥20%).

Results

The EQ-5D was completed by 97.7% of the 389 participants, compared to 93.3% for the SF-6D. Both the EQ-5D and SF-6D were able to discriminate between participants with different levels of WOMAC severity (p < 0.001). The mean EQ-5D change was -0.036 for group i), 0.091 for group ii), and 0.127 for group iii), compared to 0.021, 0.023 and 0.053 on the SF-6D. These change scores were significantly different according to the EQ-5D (p < 0.001), but not the SF-6D.

Conclusion

The EQ-5D and SF-6D had largely comparable practicality and construct validity. However, in contrast to the EQ-5D, the SF-6D could not discriminate between those who improved post-intervention, and those who did not. This suggests that it is more appropriate to use the EQ-5D in future cost-effectiveness analyses of interventions which are designed to alleviate knee pain.

Trial registration

Current Controlled Trials ISRCTN93206785

Background

In the UK it has been estimated that nearly 50% of those aged >50 years experience knee pain each year, and that 33% of these consequently consult their general practitioner [1]. Economic evaluations have accordingly been undertaken to assess whether interventions which alleviate knee pain represent a cost-effective use of scarce health care resources [2]. Within such studies outcomes are often measured on a utility scale, where 0 is equivalent to death and 1 is equal to full health, in order to enable the benefits of different interventions to be compared on a common scale [3, 4]. There are, however, a number of different utility measures that can be used within such evaluations, including the EQ-5D [5], health utilities index [6], and SF-6D [7], all of which aim to measure utility on the same scale. Nevertheless, as each of these measures is based on different health descriptions [8] and different valuation methods [9], there is increasing evidence that they produce different results [10]. Fitzpatrick et al. [11] outlined a number of criteria (appropriateness, reliability, validity, responsiveness, precision, interpretability, acceptability, feasibility) on which evidence should be provided in order to select an appropriate outcome measure, but very few papers have assessed utility measures with regard to such criteria [12]. Thus, within this paper we seek to compare the performance of the EQ-5D and SF-6D with regard to the criteria of practicality, validity and responsiveness. The results of this study are particularly important as both of these measures have recently been used in a randomised controlled trial which compared four interventions for people with knee pain (diet and strengthening exercise advice, dietary advice, strengthening exercise advice, and leaflet provision). We wish to select the preferred outcome measure for the cost-effectiveness analysis of that trial in a systematic and transparent way.

The importance of such an analysis is further highlighted by two potentially opposed views. Firstly, the National Institute for Health and Clinical Excellence (NICE) has recently stated that the EQ-5D is the preferred measure of utility to be used in economic evaluations [13]. However, it did acknowledge that an alternative measure could be used if empirical evidence shows that the EQ-5D is not suitable for a particular patient group, where relevant properties include practicality, validity and responsiveness [13]. Secondly, it has been argued that it might not be appropriate to use the EQ-5D in a rheumatology clinic group [14]. The basis for this latter view was that the EQ-5D was less responsive to change than other measures in the study [14]. Moreover, the possible inappropriateness of the EQ-5D is also highlighted by two recent studies [2, 15], in similar population groups, which found that the interventions in question were associated with an improvement according to a clinical measure, but a reduced post-intervention score according to the EQ-5D. Thus, here we seek to compare the performance of the EQ-5D and SF-6D in a group of patients with self-reported knee pain.

Methods

Participants

All participants were taking part in the Lifestyle Interventions for Knee Pain (LIKP) study, which was designed to compare the effectiveness and cost-effectiveness of four different interventions (receipt of a leaflet, dietary advice, guidance on knee strengthening exercises, or dietary advice and guidance on knee strengthening exercises). Ethical approval for this study was granted by the UK Nottingham Research Ethics Committee. In order to recruit people into the LIKP study all registered patients in five Nottingham general practices who were aged ≥45 years, and deemed (by their general practitioner) to be well enough to complete a questionnaire, were sent an ascertainment questionnaire, and a local media campaign was also conducted. Responding individuals were recruited into the LIKP study if they reported that they had had knee pain on most days of the last month, were aged ≥45 years, had a body mass index (BMI) >28.0 kg/m², and gave consent to be randomised to one of the four interventions.

Outcome Measures

Participants in the LIKP study were asked to complete, amongst other things, three outcome questionnaires both pre-intervention and 6 months post-intervention – the WOMAC (Western Ontario and McMaster Universities Osteoarthritis Index), EQ-5D and SF-36 (the latter was used to calculate the SF-6D score). The WOMAC was chosen as the primary outcome measure within the LIKP study as its pain subscale was considered to be the best way of capturing knee pain severity.

The WOMAC measures the amount of pain (5 questions), stiffness (2 questions), and difficulty in physical functioning (17 questions), where the response options are none (0), mild (1), moderate (2), severe (3) or extreme (4) [16]. Scores can thereby range between 0 and 20 on the pain scale, 0 and 8 on the stiffness scale, 0 and 68 on the functioning scale, and 0 to 96 on the total WOMAC (WOMAC₉₆) scale, where higher scores denote a worse response [17]. Previous evidence of the adequate performance of the WOMAC has been shown for construct validity [18] and responsiveness [19, 20].

The EQ-5D has five questions, where the respondent is asked to report the level of problems they have (no problems, some/moderate problems, and severe/extreme problems) with regard to mobility, self-care, usual activities, pain, and anxiety/depression [5]. Responses to these five dimensions are converted into one of 243 different EQ-5D health state descriptions, which range between no problems on all five dimensions (11111) and severe/extreme problems on all five dimensions (33333). A utility score was assigned to each health state using the York A1 tariff [21], which was based on the preferences elicited from a survey of 3395 UK residents – EQ-5D scores range between -0.594 and 1 (full health).
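The scoring step described above can be illustrated with a minimal sketch. It assumes the commonly published UK time trade-off decrements associated with the tariff in [21]; the coefficient values, the dimension keys and the function name `eq5d_utility` are not given in this paper and are illustrative assumptions that should be checked against the tariff source before any real use.

```python
from itertools import product

# ASSUMPTION: commonly published UK TTO decrements for the three-level EQ-5D.
# These values are not listed in the paper itself.
CONSTANT = 0.081  # subtracted once if any dimension is above level 1
N3 = 0.269        # subtracted once if any dimension is at level 3
DECREMENTS = {
    "mobility":           {1: 0.0, 2: 0.069, 3: 0.314},
    "self_care":          {1: 0.0, 2: 0.104, 3: 0.214},
    "usual_activities":   {1: 0.0, 2: 0.036, 3: 0.094},
    "pain":               {1: 0.0, 2: 0.123, 3: 0.386},
    "anxiety_depression": {1: 0.0, 2: 0.071, 3: 0.236},
}
DIMENSIONS = list(DECREMENTS)  # order matches the five-digit state code


def eq5d_utility(state: str) -> float:
    """Convert a five-digit EQ-5D state such as '22333' into a utility score."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid three-level EQ-5D state: {state!r}")
    levels = [int(ch) for ch in state]
    if all(level == 1 for level in levels):
        return 1.0  # full health, no decrements apply
    score = 1.0 - CONSTANT
    for dimension, level in zip(DIMENSIONS, levels):
        score -= DECREMENTS[dimension][level]
    if 3 in levels:
        score -= N3
    return round(score, 3)


# Every combination of three levels on five dimensions: 3**5 = 243 states.
ALL_STATES = ["".join(map(str, combo)) for combo in product((1, 2, 3), repeat=5)]
```

Under these assumed coefficients, `eq5d_utility("33333")` reproduces the lower bound of -0.594 quoted above, and `len(ALL_STATES)` gives the 243 possible health states.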

In a similar way, responses to eleven of the questions on the SF-36 [22] were used to estimate a score on the SF-6D [7]. The SF-6D is composed of six dimensions (physical functioning, role limitations, social functioning, pain, mental health and vitality) which have between four and six levels. We used the consistent [23] version of the SF-6D algorithm [8] to estimate utility scores for each of 18,000 potential health states – SF-6D scores range between 0.296 and 1.00.
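As a quick check on the size of the SF-6D descriptive system, the number of health states is simply the product of the number of levels on each dimension. The sketch below assumes the level counts implied by the worst state code 645655 reported in the Results (six, four, five, six, five and five levels, in the dimension order listed above); the dictionary keys are illustrative names only.

```python
from math import prod

# ASSUMPTION: number of levels per dimension, read off the worst state 645655
# and taken in the order the dimensions are listed in the text.
SF6D_LEVELS = {
    "physical_functioning": 6,
    "role_limitations": 4,
    "social_functioning": 5,
    "pain": 6,
    "mental_health": 5,
    "vitality": 5,
}

# Total number of distinct SF-6D health states.
n_states = prod(SF6D_LEVELS.values())
```

With these level counts `n_states` equals the 18,000 potential health states mentioned above, compared with 243 for the three-level EQ-5D.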

Comparing the EQ-5D and SF-6D – performance criteria

Choice of Analysis

There are many approaches to assessing validity and, as Fitzpatrick et al. [11] point out, these criteria are not uniformly described. Indeed, Streiner & Norman [24] suggest that the myriad terms used to describe such approaches mean that terminology is one of the most difficult aspects of validity testing. In the light of this, we attempt to provide clear definitions of the type of validity that we are testing for in order to avoid the possibility of misinterpretation. References to previous studies which have used similar techniques are also provided. Finally, it should be noted that, rather than assessing the predictive ability of certain variables [25], we focus solely on the relationship between the WOMAC and the utility measures of the EQ-5D and SF-6D because, to our knowledge, such relationships have not previously been investigated. This is in contrast to a number of previous studies (e.g. [26, 27]) which have looked at the effect that different socio-demographic characteristics and clinical conditions have on measures of utility.

Practicality

Practicality was assessed in terms of completion rates, where the SF-36 appeared before the EQ-5D in the ascertainment questionnaire. We assessed whether sufficient information was provided in order to calculate a utility score for the EQ-5D and SF-6D, as outlined by Gerard et al. [28].

Validity

Validity was assessed in terms of both construct and convergent validity. Construct validity relates to whether a measure can discriminate between two patient groups, one which has a certain trait, and the other which does not [24]. This has also been referred to as known groups validity [29, 30], based on the principle that certain specified groups of patients may be expected to score differently from one another. We assessed whether the EQ-5D and SF-6D could discriminate between participants with different levels of (pre-intervention) severity on the WOMAC. In accordance with the WOMAC response options (none (0), mild (1), moderate (2), severe (3) or extreme (4)), four severity levels were created on the overall WOMAC₉₆ score – i) none to mild (total score of 0 to ≤24 on the WOMAC₉₆ scale), ii) >mild to moderate (total score of >24 to ≤48 on the WOMAC₉₆ scale), iii) >moderate to severe (total score of >48 to ≤72 on the WOMAC₉₆ scale), and iv) >severe to extreme (total score of >72 to ≤96 on the WOMAC₉₆ scale). Similar severity levels were also created for each of the three subscales: i) none to mild was denoted by total scores of 0 to ≤5 (pain), 0 to ≤2 (stiffness), and 0 to ≤17 (functioning), ii) >mild to moderate was denoted by total scores of >5 to ≤10 (pain), >2 to ≤4 (stiffness), and >17 to ≤34 (functioning), iii) >moderate to severe was denoted by total scores of >10 to ≤15 (pain), >4 to ≤6 (stiffness), and >34 to ≤51 (functioning), and iv) >severe to extreme was denoted by total scores of >15 to ≤20 (pain), >6 to ≤8 (stiffness), and >51 to ≤68 (functioning). A one-way analysis of variance (ANOVA) was conducted on the overall WOMAC₉₆ score, and on each of the three WOMAC subscales, in order to assess whether there were significant (p < 0.05) differences between the utility scores of participants in each of these four severity levels. This analysis is akin to that conducted previously [30, 31].

Convergent validity is determined by how closely a measure is related to other measures of the same construct [24]. Thus, in line with previous studies [30, 32, 33], we assessed whether (pre-intervention) scores on the EQ-5D and SF-6D were significantly correlated with the WOMAC₉₆ score according to the Spearman rank test.
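A minimal sketch of the Spearman rank correlation is given below, using only the standard library. It assigns average ranks to tied values and then computes the Pearson correlation of the ranks; it does not compute the p-value, and the function names and example data are illustrative assumptions, not study data.

```python
from math import sqrt
from statistics import fmean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = fmean(rx), fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sx * sy)


# Hypothetical example: higher WOMAC (worse) paired with lower utility.
womac_total = [10, 30, 50, 70, 90]
utility = [0.90, 0.70, 0.60, 0.30, 0.10]
rho = spearman_rho(womac_total, utility)  # close to -1 for this toy data
```

A negative coefficient is the expected direction here, because higher WOMAC scores denote worse health whereas higher utility scores denote better health.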

Responsiveness

Responsiveness, which is different to sensitivity [29], is determined by the ability an instrument has to detect a meaningful or clinically important change [34], where one seeks to discriminate between those who change a lot and those who change a little [24]. Previously, a 20% improvement on each of the three subscales of the WOMAC has been deemed to equate to the minimum clinically important difference [35–37]. Thus, we sought to assess whether the EQ-5D and SF-6D could discriminate between three groups: i) those who did not improve according to the WOMAC (≤0% change post-intervention), ii) those who improved by <20% (>0% to <20% change post-intervention), and iii) those who improved by ≥20% (≥20% change post-intervention). Because higher WOMAC scores denote a worse response, the change was estimated by subtracting the post-intervention score from the pre-intervention score, so that a positive change denotes an improvement; those who had a worse WOMAC score post-intervention were included in the first group. The mean change scores for the EQ-5D and SF-6D were calculated for each of these three post-intervention groups, for both the overall WOMAC₉₆ scale and each of the three WOMAC subscales, and a paired t-test was conducted to assess whether there was a significant change in the mean utility score. For both the EQ-5D and the SF-6D a one-way ANOVA was also conducted to assess whether there was a significant difference between the mean change in utility across each of the three groups. The above analysis is in line with that undertaken previously [30].
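The grouping step can be sketched as below. Because higher WOMAC scores are worse, the sketch treats a fall in score as a positive improvement and, as an assumption not stated explicitly in the text, expresses it as a percentage of the pre-intervention score. The function names and the record layout are hypothetical.

```python
from statistics import fmean


def womac_percent_improvement(pre: float, post: float) -> float:
    """Percentage improvement in WOMAC score (positive = less severe post-intervention).

    ASSUMPTION: the change is expressed relative to the pre-intervention score.
    """
    if pre <= 0:
        raise ValueError("percentage change is undefined for a pre-intervention score of 0")
    return 100.0 * (pre - post) / pre


def responsiveness_group(pre: float, post: float) -> str:
    """Assign a participant to one of the three post-intervention groups."""
    pct = womac_percent_improvement(pre, post)
    if pct <= 0:
        return "i) did not improve"  # includes those who got worse
    if pct < 20:
        return "ii) improved by <20%"
    return "iii) improved by >=20%"


def mean_utility_change_by_group(records):
    """records: iterable of (womac_pre, womac_post, utility_pre, utility_post)."""
    changes = {}
    for womac_pre, womac_post, utility_pre, utility_post in records:
        group = responsiveness_group(womac_pre, womac_post)
        changes.setdefault(group, []).append(utility_post - utility_pre)
    return {group: fmean(values) for group, values in changes.items()}
```

The per-group lists of utility changes built in this way are what a paired t-test (within group) and a one-way ANOVA (between groups) would then be applied to.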

Results

Participants

Questionnaires were returned by 8,044 of the 12,500 people (64.4%) who were sent an ascertainment questionnaire. Of these, 318 were eligible to take part in the LIKP study and consented to be randomised to one of the four interventions. An additional 71 participants were recruited via the media campaign. The mean age of these 389 participants was 62.0 years, 66.0% were female, and 23.4% were classified as overweight (BMI 25 to <30 kg/m²), 50.4% as class I obese (30 to <35 kg/m²), 16.9% as class II obese (35 to <40 kg/m²), and 9.9% as class III obese (≥40 kg/m²). Pre-intervention the mean score on each of the pain, stiffness and functioning dimensions of the WOMAC was 7.81 (N = 360), 3.92 (N = 360), and 27.90 (N = 359), respectively, the mean WOMAC₉₆ score was 39.59 (N = 359).

Comparing the performance of the EQ-5D and SF-6D

Practicality

Pre-intervention the EQ-5D was wholly completed by 378 of the 389 participants (97.2%). Four of these participants were categorised as being in full health (11111), and a total of 36 different EQ-5D health states were reported – 29 participants had health states rated as worse than death, the lowest score was -0.239 (22333), and the mean score was 0.550 (95% confidence interval 0.521 to 0.578).

Pre-intervention, SF-6D scores could be calculated for 366 of the 389 participants (94.1%). None were categorised in full health, but one person did report that they had the lowest score on all six dimensions (645655). SF-6D scores ranged from 0.296 to 0.948, 194 different health states were reported, and the mean score was 0.646 (95% confidence interval 0.631 to 0.660).

Validity

In terms of construct validity the results in Table 1 show that, for the 359 participants for whom the WOMAC₉₆ could be calculated (pre-intervention), participants who had higher scores (increased severity) tended to have lower utility scores on both the EQ-5D and SF-6D – the mean EQ-5D (SF-6D) utility score for those with a WOMAC₉₆ score between 0 and 24 was 0.722 (0.731), compared to 0.069 (0.460) for those with a WOMAC₉₆ score between 73 and 96. These differences were significant according to the one-way ANOVA, and similar results were also obtained for each of the three WOMAC subscales (Table 1). With regard to convergent validity, scores on both the EQ-5D and SF-6D were highly correlated (p < 0.001) with scores on the WOMAC₉₆ scale (r = -0.576 and r = -0.501, respectively).

Table 1 Construct validity: Mean EQ-5D and SF-6D utility scores for each of the four baseline severity levels.


Responsiveness

Pre- and post-intervention WOMAC₉₆ scores could be calculated for 324 participants. Post-intervention the WOMAC₉₆ score did not improve for 33.8% of the participants, for 25.2% the WOMAC₉₆ score improved by <20%, and for 40.9% the WOMAC₉₆ score improved by ≥20%. The mean change in utility (post-intervention) for each of these three groups was -0.036, 0.091 and 0.127 for the EQ-5D, compared to 0.021, 0.023 and 0.053 for the SF-6D (Table 2). The one-way ANOVA showed that the differences between these three groups were significant according to the EQ-5D (p < 0.001), but not the SF-6D (p = 0.084). Similar results were also obtained across the three subscales of the WOMAC (see Table 2).

Table 2 Responsiveness: Mean EQ-5D and SF-6D change scores for each of the three post-intervention groups.


Discussion

When comparing the performance of the EQ-5D and SF-6D with regard to the criteria of practicality we found that the completion rate (pre-intervention) was lower for the SF-6D, even though the SF-36 appeared first in the ascertainment questionnaire. In terms of validity both the EQ-5D and SF-6D were able to discriminate between groups with different levels of severity according to the WOMAC, and were also highly correlated with the WOMAC₉₆ score. However, in contrast to the EQ-5D, the SF-6D could not discriminate between participants whose condition had not improved according to the WOMAC and those who had improved by ≥20%.

Comparisons with other studies

We are aware of only one other study which has compared the performance of the EQ-5D and SF-6D with regard to similar criteria in a similar clinical area [38]. In line with our results it was found that the EQ-5D had a higher completion rate, and that both measures were able to discriminate between groups of patients with different levels of self-reported severity, and control, of rheumatoid arthritis [38]. However, when assessing the responsiveness of the EQ-5D and SF-6D (in relation to a self-reported assessment of disease severity) they found that, on the basis of the effect size [24], the EQ-5D was more responsive in patients who (post-intervention) were classified as 'worse' but that the SF-6D was more responsive in those patients classified as 'better' [38]. For the EQ-5D, other results in similar clinical areas concur with our findings – the EQ-5D was able to discriminate between patients with different severity levels of knee osteoarthritis [14] and patients with different levels of functional class according to the Stanford Health Assessment Questionnaire [39]. EQ-5D scores have also been shown to be highly correlated with many measures from the American College of Rheumatology (ACR) disease activity set [39], and the WOMAC [40–42]. Finally, in terms of responsiveness, scores on the EQ-5D have been shown to increase for those who reported an improvement in their arthritis [39], and for rheumatoid arthritis patients who reported an improvement in pain after receiving infliximab [43].

Limitations

Within this paper we have used the WOMAC to assess the validity and responsiveness of the EQ-5D and SF-6D. Evidence of adequate performance of the WOMAC on these criteria [18–20] justifies such an approach. However, in other patient groups, when a different condition-specific measure acts as a so-called 'gold standard', the results may be different, and one should therefore be cautious about generalising the results of this study beyond patients with knee pain. Similarly, we have only assessed the performance of the two utility measures on a limited number of criteria (the design of our study did not permit us to assess the remaining criteria outlined by Fitzpatrick et al. [11]), and thus we cannot wholly conclude that the EQ-5D is superior to the SF-6D.

Implications

Economic evaluation plays a major role in decision making [44]. Our finding that the SF-6D is less responsive to interventions designed to alleviate knee pain suggests i) that these two measures will provide different estimates of the effectiveness of different health care interventions, as has been demonstrated elsewhere [45–47], and ii) that it is more appropriate to use the EQ-5D to estimate the cost-effectiveness of interventions designed to alleviate knee pain.

The SF-6D was less responsive even though it had a greater descriptive ability (pre-intervention patients were assigned to 194 different health states on the SF-6D, compared to 36 on the EQ-5D). This is an important finding as researchers are currently investigating whether to expand the number of responses within each of the five dimensions of the EQ-5D from 3 to 5 levels [48]. One potential implication of our research is therefore that further validity checks, akin to those outlined in this paper, are needed in order to ascertain the extent to which an increased descriptive ability (which was argued to be one of the main advantages of the SF-6D [7]) results in a corresponding increase in the level of responsiveness.

Conclusion

Though the construct and convergent validity of the EQ-5D and SF-6D were similar, the EQ-5D had a higher completion rate and was more responsive. This suggests that these two measures may provide different estimates of effectiveness, and that it is more appropriate to use the EQ-5D to estimate the cost-effectiveness of alleviating knee pain.

References

1. Jinks C, Jordan K, Ong BN, Croft P: A brief screening tool for knee pain in primary care (KNEST). 2. Results from a survey in the general population aged 50 and over. Rheumatology 2004, 43: 55–61. 10.1093/rheumatology/keg438
2. Hurley MV, Walsh NE, Mitchell HL, Pimm TJ, Williamson E, Jones RH, Reeves BC, Dieppe RA, Patel A: Economic evaluation of a rehabilitation program integrating exercise, self-management, and active coping strategies for chronic knee pain. Arthritis Rheum 2007, 57: 1220–1229. 10.1002/art.23011
3. Drummond MF, Sculpher MJ, Torrance GW, O'Brien BJ, Stoddart GL: Methods for the Economic Evaluation of Health Care Programmes (3rd Edition). New York: Oxford University Press; 2005.
4. Sach TH, Barton GR, Doherty M, Muir K, Jenkinson C, Avery AJ: The relationship between BMI and health related quality of life: comparing the EQ-5D, EuroQol VAS, and SF-6D. Int J Obes (Lond) 2007, 31(1): 189–196. 10.1038/sj.ijo.0803365
5. Brooks R: EuroQol: the current state of play. Health Policy 1996, 37: 53–72. 10.1016/0168-8510(96)00822-6
6. Feeny D, Furlong W, Torrance GW, Goldsmith CH, Zhu Z, DePauw S, Denton M, Boyle M: Multi-attribute and single attribute utility functions for the Health Utilities Index Mark 3 system. Med Care 2002, 40: 113–128. 10.1097/00005650-200202000-00006
7. Brazier JE, Roberts J, Deverill M: The estimation of a preference-based measure of health from the SF-36. J Health Econ 2002, 21: 271–292. 10.1016/S0167-6296(01)00130-8
8. Brazier JE, Roberts J, Tsuchiya A, Busschbach J: A comparison of the EQ-5D and SF-6D across seven patient groups. Health Econ 2004, 13: 873–884. 10.1002/hec.866
9. Tsuchiya A, Brazier J, Roberts J: Comparison of valuation methods used to generate the EQ-5D and the SF-6D value sets. J Health Econ 2006, 25: 334–346. 10.1016/j.jhealeco.2005.09.003
10. Bryan S, Longworth L: Measuring health-related utility: Why the disparity between EQ-5D and SF-6D? Eur J Health Econ 2005, 6: 253–260. 10.1007/s10198-005-0299-9
11. Fitzpatrick R, Davey C, Buxton MJ, Jones DR: Criteria for assessing patient based outcome measures for use in clinical trials. Health Technol Assess 1998, 14: 1–74. 10.1017/S0266462300010473
12. Marra CA, Esdaile JM, Guh D, Kopec JA, Brazier JE, Koehler BE, Chalmers A, Anis AH: A comparison of four indirect methods of assessing utility values in rheumatoid arthritis. Med Care 2004, 42: 1125–1131. 10.1097/00005650-200411000-00012
13. National Institute for Health and Clinical Excellence: Guide to the Methods of Technology Appraisal. NICE publications 2008.
14. Brazier JE, Harper R, Munro J, Walters SJ, Snaith ML: Generic and condition-specific outcome measures for people with osteoarthritis of the knee. Rheumatology 1999, 38: 870–877. 10.1093/rheumatology/38.9.870
15. Hout WB, de Jong Z, Munneke M, Hazes JM, Breedveld FC, Vliet Vlieland TP: Cost-utility and cost-effectiveness analyses of a long-term, high-intensity exercise program compared with conventional physical therapy in patients with rheumatoid arthritis. Arthritis Rheum 2005, 53: 39–47. 10.1002/art.20903
16. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW: Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988, 15: 1833–1840.
17. McCarthy CJ, Mills PM, Pullen R, Richardson G, Hawkins N, Roberts CR, Silman AJ, Oldham JA: Supplementation of a home-based exercise programme with a class-based programme for people with osteoarthritis of the knees: a randomised controlled trial and health economic analysis. Health Technol Assess 2004, 8(46): 1–76.
18. Miller GD, Rejeski WJ, Williamson JD, Morgan T, Sevick MA, Loeser RF, Ettinger WH, Messier SP, Investigators A: The Arthritis, Diet and Activity Promotion Trial (ADAPT): design, rationale, and baseline results. Control Clin Trials 2003, 24: 462–480. 10.1016/S0197-2456(03)00063-1
19. Davies GM, Watson DJ, Bellamy N: Comparison of the responsiveness and relative effect size of the western Ontario and McMaster Universities Osteoarthritis Index and the short-form Medical Outcomes Study Survey in a randomized, clinical trial of osteoarthritis patients. Arthritis Care Res 1999, 12(3): 172–179. 10.1002/1529-0131(199906)12:3<172::AID-ART4>3.0.CO;2-Y
20. Theiler R, Bischoff-Ferrari HA, Good M, Bellamy N: Responsiveness of the electronic touch screen WOMAC 3.1 OA Index in a short term clinical trial with rofecoxib. Osteoarthritis Cartilage 2004, 12: 912–916. 10.1016/j.joca.2004.08.006
21. Dolan P: Modelling valuations for EuroQol health states. Med Care 1997, 35: 1095–1108. 10.1097/00005650-199711000-00002
22. Ware JE, Sherbourne C: The MOS 36 item short-form health survey: Conceptual framework and item selection. Med Care 1992, 30: 473–483. 10.1097/00005650-199206000-00002
23. Badia X, Roset M, Herdman M: Inconsistent responses in three preference-elicitation methods for health states. Soc Sci Med 1999, 49: 943–950. 10.1016/S0277-9536(99)00182-3
24. Streiner DL, Norman GR: Health Measurement Scales: a practical guide to their development and use (3rd edition). New York: Oxford University Press; 2003.
25. Bansback N, Marra C, Tsuchiya A, Anis A, Guh D, Hammond T, Brazier J: Using the health assessment questionnaire to estimate preference-based single indices in patients with rheumatoid arthritis. Arthritis Rheum 2007, 963–971. 10.1002/art.22885
26. Barton GR, Sach TH, Avery AJ, Jenkinson C, Doherty M, Muir KR: An assessment of the discriminative ability of the EQ-5D index, SF-6D and EQ VAS, using socio-demographic factors and clinical conditions. Eur J Health Econ 2008, 9: 237–249. 10.1007/s10198-007-0068-z
27. Lubetkin EI, Jia H, Franks P, Gold MR: Relationship among sociodemographic factors, clinical conditions, and health-related quality of life: Examining the EQ-5D in the U.S. general population. Qual Life Res 2005, 14: 2187–2196. 10.1007/s11136-005-8028-5
28. Gerard K, Nicholson T, Mullee M, Mehta R, Roderick P: EQ-5D versus SF-6D in an older, chronically ill patient group. Appl Health Econ Health Policy 2004, 3: 91–102. 10.2165/00148365-200403020-00005
29. Fayers P, Machin D: Quality of life: assessment, analysis and interpretation. Chichester: Wiley; 2000.
30. Aggarwal R, Wilke CT, Pickard AS, Vats V, Mikolaitis R, Fogg L, Block JA, Jolly M: Psychometric Properties of the EuroQol-5D and Short Form-6D in Patients with Systemic Lupus Erythematosus. J Rheumatol 2009, 36(6): 1209–16. Epub 2009 Apr 15. 10.3899/jrheum.081022
31. Schag CA, Ganz PA, Kahn B, Petersen L: Assessing the needs and quality of life of patients with HIV infection: development of the HIV Overview of Problems-Evaluation System (HOPES). Qual Life Res 1992, 1: 397–413. 10.1007/BF00704434
32. McCrone P, Patel A, Knapp M, Schene A, Koeter AM, Amaddeo F, Ruggeri M, Giessler A, Puschner B, Thornicroft G: A comparison of SF-6D and EQ-5D utility scores in a study of patients with schizophrenia. J Ment Health Policy Econ 2009, 12: 27–31.
33. Barton GR, Hodgekins J, Mugford M, Jones PB, Croudace T, Fowler D: Measuring the benefits of treatment for psychosis: The validity and responsiveness of the EQ-5D. Br J Psych, in press.
34. Liang MH: Longitudinal construct validity: establishment of clinical meaning in patient evaluative instruments. Med Care 2000, 38(9 Suppl): II84–II90.
35. Raynauld JP, Torrance GW, Band PA, Goldsmith CH, Tugwell P, Walker V, Schultz M, Bellamy N, Group CKOAS: A prospective, randomized, pragmatic, health outcomes trial evaluating the incorporation of hylan G-F 20 into the treatment paradigm for patients with knee osteoarthritis: clinical results. Osteoarthritis Cartilage 2002, 10: 506–517. 10.1053/joca.2002.0798
36. Torrance GW, Raynauld JP, Walker V, Goldsmith CH, Bellamy N, Band PA, Schultz M, Tugwell P, Group CKOAS: A prospective, randomized, pragmatic, health outcomes trial evaluating the incorporation of hylan G-F 20 into the treatment paradigm for patients with knee osteoarthritis: economic results. Osteoarthritis Cartilage 2002, 10: 518–527. 10.1053/joca.2001.0513
37. Bellamy N, Bell MJ, Goldsmith CH, Pericak D, Walker V, Raynauld JP, Torrance GW, Tugwell P, Polisson R: Evaluation of WOMAC 20, 50, 70 response criteria in patients treated with hylan G-F 20 for knee osteoarthritis. Ann Rheum Dis 2005, 64: 881–885. 10.1136/ard.2004.026443
38. Marra CA, Rashidi AA, Guh D, Kopec JA, Abrahamowicz M, Esdaile JM, Brazier JE, Fortin PR, Anis AH: Are indirect utility measures reliable and responsive in rheumatoid arthritis patients? Qual Life Res 2005, 14: 1333–1344. 10.1007/s11136-004-6012-0
39. Hurst NP, Kind P, Ruta D, Hunter M, Stubbings A: Measuring health-related quality of life in rheumatoid arthritis: validity, responsiveness and reliability of EuroQol (EQ-5D). Br J Rheumatol 1997, 36(5): 551–559. 10.1093/rheumatology/36.5.551
40. Fransen M, Edmonds J: Reliability and validity of the EuroQol in patients with osteoarthritis of the knee. Rheumatology 1999, 38: 807–813. 10.1093/rheumatology/38.9.807
41. Ostendorf M, van Stel HF, Buskens E, Schrijvers AJ, Marting LN, Verbout AJ, Dhert WJ: Patient-reported outcome in total hip replacement: A comparison of five instruments of health status. J Bone Joint Surg Br 2004, 86(6): 801–808. 10.1302/0301-620X.86B6.14950
42. Conner-Spady B, Estey A, Arnett G, Ness K, McGurran J, Bear R, Noseworthy T, Steering Committee of the Western Canada Waiting List P: Prioritization of patients on waiting lists for hip and knee replacement: validation of a priority criteria tool. Int J Technol Assess Health Care 2004, 20: 509–515. 10.1017/S0266462304001436
43. Russell AS, Conner-Spady B, Mintz A, Maksymowych WP: The responsiveness of generic health status measures as assessed in patients with rheumatoid arthritis receiving infliximab. J Rheumatol 2003, 30: 941–947.
44. Buxton MJ: Economic Evaluation and Decision Making in the UK. Pharmacoeconomics 2006, 24: 1133–1142. 10.2165/00019053-200624110-00009
45. Longworth L, Bryan S: An empirical comparison of EQ-5D and SF-6D in liver transplant patients. Health Econ 2003, 12: 1061–1077. 10.1002/hec.787
46. Pickard AS, Johnson JA, Feeny DH: Responsiveness of generic health-related quality of life measures in stroke. Qual Life Res 2005, 14: 207–219. 10.1007/s11136-004-3928-3
47. Conner-Spady B, Suarez-Almazor ME: Variation in the estimation of quality-adjusted life-years by different preference-based instruments. Med Care 2003, 41: 791–801. 10.1097/00005650-200307000-00003
48. Lamers L: Adjustment of existing EQ-5D TTO values for use of an EQ-5D five level descriptive system. Eur J Health Econ 2006, 7(S1): S57.

Download references

Acknowledgements

We thank all participants who completed the Lifestyle Interventions for Knee Pain (LIKP) study questionnaire. The LIKP study was funded by the UK Arthritis Research Campaign (ARC) (grant number 13550).

Author information

Authors and Affiliations

Health Economics Group, School of Medicine, Health Policy and Practice, University of East Anglia, Norwich, UK
Garry R Barton & Tracey H Sach
School of Chemical Sciences and Pharmacy, University of East Anglia, Norwich, UK
Tracey H Sach
School of Community Health Sciences, University of Nottingham, Nottingham, UK
Tracey H Sach, Anthony J Avery, Claire Jenkinson & Kenneth R Muir
Academic Rheumatology, University of Nottingham, Nottingham, UK
Michael Doherty

Authors

Garry R Barton
Tracey H Sach
Anthony J Avery
Michael Doherty
Claire Jenkinson
Kenneth R Muir

Corresponding author

Correspondence to Tracey H Sach.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

GB and TS conceived the idea for the paper, undertook the analysis and drafted the paper. CJ, AA, MD, and KM assisted in the acquisition of data, interpretation of the analysis, and commented on drafts of the manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Barton, G.R., Sach, T.H., Avery, A.J. et al. Comparing the performance of the EQ-5D and SF-6D when measuring the benefits of alleviating knee pain. Cost Eff Resour Alloc 7, 12 (2009). https://doi.org/10.1186/1478-7547-7-12

Received: 06 November 2008
Accepted: 17 July 2009
Published: 17 July 2009
DOI: https://doi.org/10.1186/1478-7547-7-12
