Skip to main content

Using informative prior based on expert opinion in Bayesian estimation of the transition probability matrix in Markov modelling—an example from the cost-effectiveness analysis of the treatment of patients with predominantly negative symptoms of schizophrenia with cariprazine



When patient health state transition evidence is missing from clinical literature, analysts are inclined to make simple assumptions to complete the transition matrices within a health economic model. Our aim was to provide a solution for estimating transition matrices by the Bayesian statistical method within a health economic model when empirical evidence is lacking.


We used a previously published cost-effectiveness analysis of the use of cariprazine compared to that of risperidone in patients with predominantly negative symptoms of schizophrenia. We generated the treatment-specific state transition probability matrices in three different ways: (1) based only on the observed clinical trial data; (2) based on Bayesian estimation where prior transition probabilities came from experts’ opinions; and (3) based on Bayesian estimation with vague prior transition probabilities (i.e., assigning equal prior probabilities to the missing transitions from one state to the others). For the second approach, we elicited Dirichlet prior distributions by three clinical experts. We compared the transition probability matrices and the incremental quality-adjusted life years (QALYs) across the three approaches.


The estimates of the prior transition probabilities from the experts were feasible to obtain and showed considerable consistency with the clinical trial data. As expected, the estimated health benefit of the treatments was different when only the clinical trial data were considered (QALY difference 0.0260), its combination with the experts’ beliefs were used in the economic model (QALY difference 0.0253), and when vague prior distributions were used (QALY difference 0.0243).


Imputing zeros to missing transition probabilities in Markov models might be untenable from the clinical perspective and may result in inappropriate estimates. Bayesian statistics provides an appropriate framework for imputing missing values without making overly simple assumptions. Informative priors based on expert opinions might be more appropriate than vague priors.


One of the most widely used methods of cost-effectiveness modelling in healthcare is Markov modelling [1]. Markov models are state transition models in which the life course of a cohort of subjects or a series of individuals are modelled by placing patients into discrete and mutually exclusive health states. Utilities and costs are assigned to each state and time period so that expected utilities and costs can be estimated [2].

In these models, the health state transition probabilities (from one time period to the next) are usually estimated from empirical studies (clinical trials, observational epidemiological studies, or a meta-analysis of these studies). Sometimes data on some transition probabilities might be lacking entirely, simply because those transitions were not observed or reported in the clinical studies. An easy modelling solution for unobserved transitions is to assume that they never happen and assign a zero value to these transition probabilities. Nevertheless, this zero-value assignment approach may not be reasonable from the clinical perspective since such health state transitions occur in clinical practice within large samples or long time periods.

A natural way to address this missing evidence issue comes from Bayesian statistics, which allows us to combine prior beliefs and evidence formally and quantitatively to estimate posterior probabilities [3]. Briggs et al. proposed a Bayesian estimation of the transition probability matrix for models with multibranch nodes [4]. In their paper flat Dirichlet prior distributions were assumed with high level of uncertainty. In their approach these flat prior probabilities from each state to the other states were combined with the clinical trial data to obtain the posterior probabilities of the transition probability matrix with Markov chain Monte Carlo (MCMC) simulation using WinBUGS [5].

The method has several advantages: (1) it solves the problem of having zero values in the transition probability matrix derived from the clinical trial, (2) assuming a high level of uncertainty of the prior probabilities ensures that the clinical trial data influences most of the posterior probabilities, (3) estimation and probabilistic sensitivity analysis can be performed in one step if the Markov modelling process is also performed within the same framework, and (4) MCMC methods allow samples to be drawn from the joint posterior density, fully considering parameter uncertainty and the correlations between parameters.

The major limitation of the method applied by Briggs et al. is that the transition probabilities that did not occur in the clinical trials are influenced only by the prior probabilities, and in this case, applying vague priors might result in inappropriate estimates. A solution for the problem may be the use of informative priors based on expert opinions. Unfortunately, eliciting a Dirichlet prior distribution is not straightforward [6]. A key challenge is satisfying all the constraints of mathematical coherence. For example, the probabilities of each category must sum to one. Another challenge is that clinical experts cannot easily and directly estimate the probabilities of multiple outcomes, as “human limitations of memory and information processing capacity often lead to subjective probabilities that are poorly calibrated or internally inconsistent, even when assessed by experts”, so the primary questions must be divided into simple questions that are easy to understand and answer [7].

Recently, a freeware application was published by Elfadaly and Garthwaite to aid in eliciting Dirichlet and Gaussian copula prior distributions [8]. The proposed method elicits hyperparameters of the Dirichlet distribution from those of its marginal beta distributions through forms of reconciliation that use least-squares techniques.

Using both of the two abovementioned methods, we estimated the transition probability matrix of the patients with predominantly negative symptoms of schizophrenia in a cost-effectiveness analysis comparing the effect of cariprazine to that of risperidone. We illustrated case-specific differences between the clinical expert elicited priors and those of more common approaches (i.e., applying non-informative priors or assuming unobserved transitions have zero probability).


The context of the case study is as follows

The cost-effectiveness model has been described elsewhere (Fig. 1) [9]. Briefly, a Markov cohort model was built in Microsoft Excel with eight health states for schizophrenia (hereinafter referred to as the Mohr-Lenert health states) defined by Mohr et al. in 2004 and a death state [10]. The definition of the health states can also be found in a concise format in Table 2 of the paper presenting the cost-effectiveness model [9]. As the pivotal clinical trial in which the model was based on provided no data on mortality, as there were not any participants who died during the study period, the age- and sex-specific mortality rates of the general population were used in the model, and no difference in the mortality between the two treatment groups was assumed [11]. Considering the pharmacokinetic properties of cariprazine [12], the modelled time period was split into two periods: an initial 6-week time period with weekly cycles and 12-week long cycles occurring thereafter. Because the full clinical effects of cariprazine are expected to occur after the first 6 weeks, different transition probabilities were necessary for the first 6 weeks and for the subsequent model time period. Because the aim of this paper is to present a method of estimating the transition probability matrix for Markov models based on both expert opinion and clinical trial data, we used only the first 6-week period as an illustration of the method. The same method was applied for the subsequent period but with different observed data and prior probabilities.

Fig. 1

Source: Németh et al. 2017

Cost-effectiveness model structure. [9]

Data to estimate the weekly transition probabilities for the first 6 weeks of the modelled time period for both the cariprazine and the risperidone arms were available from the first 4-week period of the Németh et al. clinical trial [11]. In the original publication, the model was used to estimate the utility of the cariprazine treatment compared to the risperidone treatment.

Prior elicitation

The prior probabilities in the transition matrix were elicited with the involvement of three leading clinical experts from Hungary who are actively involved in treating patients with schizophrenia and have deep insight into the typical courses of the disease (IB, BM, JR), and who were involved in the project. An additional criterion at the selection was to have considerable research experience. The prior probabilities were elicited using Prior Elicitation Graphical Software (PEGS) [8]. The feasibility of using the application, and the process of the elicitation was pilot tested without the involvement of the experts. The definitions of the Mohr-Lenert health states were explained to the experts. Besides this information given verbally, and the technical description of the exercise there were not any additional information given to the experts regarding the transitional probabilities between the studied health states. Then, the experts were asked individually face-to face to give their opinion about the probabilities of patients with predominantly negative symptoms of schizophrenia moving from a given Mohr-Lenert health state to another state conditional on the comparator treatment (i.e., risperidone). The experts were aware of main results of the clinical trial regarding the efficacy of cariprazine, but they were unaware of the trial results about the transition probabilities between different Mohr-Lenert health states. At each step we verified whether the experts understood what they had to assess (i.e., the descriptions of the patients in a certain state, which was used as the current state, and that they had to estimate the proportion of patients moving from this state to another state). In the assessment process, all experts were asked to assess the marginal medians and quartiles of the probability of each transition. Because directly estimating these statistics is difficult for clinical experts, we elicited this information by asking the following questions. For the median, the question was phrased as follows: “Considering patients in state A, what is the most likely proportion of patients moving to state B within one week during the first six weeks of treatment? Consider it equally likely that the true proportion is above this value or below this value. For example, suppose you assess this value as 0.4, you should think it is equally likely that the true proportion will be above 0.4 as it will be below 0.4.” For the lower quartiles, the question was phrased as follows: “If you were told that the true proportion was smaller, what is the value that you think is still reasonable? Estimate a proportion for which it holds that the probability that the true proportion is below equals the probability that the true proportion is between it and 0.4 (the median)”. Similar questions were asked to assess the upper quartiles.

In the next step, the experts were told that “the probabilities of the different patient movements from a given health state must add to one, and the assessments must also meet certain other requirements to be internally consistent. The application gives you three options for reconciling your assessments to meet these requirements. Select the one which best represents your opinion.” Once a clinical expert made a choice about these options, he or she was confronted with the result. Then, he or she still had an opportunity to change any quartiles. After the modifications, the application again calculated the coherent marginal quartiles for the Dirichlet distribution and presented them together with the expert’s revised values. Finally, when the expert thought that the proposition was in line with his or her view, the application presented the estimated transition probabilities with their variance and the hyperparameters of the Dirichlet distribution.

As the same prior was later used for modelling patients’ paths in the cariprazine arm, the difference in treatment efficacy originated only from the trial data. The estimated transition probabilities by the three experts were averaged and scaled to one person in each source state, ensuring high uncertainty of the prior probabilities and thereby allowing a large influence of the clinical trial data.

Estimation of the transition probability matrix

The transition probability matrix was finally estimated by WinBUGS based on the priors and the clinical evidence from the trial with 1000 burn-in samples and 50,000 estimation samples; see the code in (Additional file 1). Two chains were run, and convergence was assessed by visual inspection of the trace plots and by tracking the Brooks-Gelman-Rubin diagnostics.

We generated the treatment-specific transition probability matrices for the Markov model in three different ways: (1) based only on the observed clinical trial data (transition probabilities not observed within the trial were set to 0); (2) based on Bayesian estimation where prior transition probabilities came from experts’ opinions (as described previously); and (3) based on Bayesian estimation with vague prior transition probabilities (flat Dirichlet prior distributions). Furthermore, we compared the transition probability matrices and the incremental quality-adjusted life years (QALYs) across the three approaches.


The experts found determining the priors mentally exhausting, as the phrasing of the questions were the same for each health state. It took approximately one hour for each expert to complete the two transition probability matrices (for the first 6 weeks and thereafter). The experts found the feedback loops in the application to be very helpful, as reviewing the results of their estimations helped them correct their initial judgements when they felt it was necessary. The estimates of experts 1 and 2 showed considerable consistency (Fig. 2), whereas expert 3 consistently estimated higher probabilities for the patients staying in the state where they were than did the other two experts. Table 1 shows the mean values of the estimates of the three experts.

Fig. 2

Weekly transition probability estimates determined by the experts

Table 1 Weekly transition probability estimates (%) by the experts (mean values)

In Table 2, the observed relative frequencies of the weekly transitions in the first four weeks of the pivotal clinical trial are shown for both treatment arms [11]. There were quite a few transitions that were not observed, and there were no patients in states 7 and 8 (the two most severe Mohr-Lenert health states) in the first four weeks of the trial. In contrast to this observation, the experts generally believed that any of the possible transitions could happen, but some of these transitions were rather unlikely according to them.

Table 2 Observed weekly transition relative frequencies (%) between the eight health states from the first four weeks of the RGH-188-005 clinical trial

The average estimates of the experts showed reasonable consistency with the trial data, with some attenuation of the extreme observed values in the trial (Tables 1 and 2). For example, both patients who were in the risperidone arm who were in state 3 at the beginning of the trial stayed in state 3, resulting in a 100% transition probability, whereas the mean of the expert estimates was 42.9%. Note that if one of the two patients had moved to a different state, the trial transition probability estimate would have dropped to 50%. Similarly, the estimates for the transitions from state 1 to state 1 were 100% in both treatment arms based on the trial data, while the estimate was 82.39% by the experts. The experts did not exclude the possibility of any particular transition; thus, there were no zero estimates in their estimated transition probability matrix.

The estimates of the two chains converged on the trace plot, and the Brooks-Gelman-Rubin diagnostics did not show evidence for non-convergence either.

As expected, the posterior values of the transition probability matrix were very similar to the trial data relative frequencies in cells where there were a reasonably large number of events (states 2, 4, and 6) (Table 3). For example, in the clinical trial, the proportion of patients who stayed in state 2 was 80.99% in the risperidone arm, whereas the estimated posterior probabilities (expressed in %) were 80.72 and 80.43%, depending on the choice of the prior probability. In the case of the mildest disease state (state 1), the pattern of the posterior estimates clearly followed the pattern of the clinical trial data estimates when the informative prior was used. The posterior probability of staying in the mildest state, however, was much lower when the uninformative (flat) prior was used. This result was expected, as the experts’ opinions were in line with the observations in the clinical trial, and only 2 and 3 cases occurred in the risperidone arm and in the cariprazine arm, respectively, in the initial 4 weeks. This behaviour also occurred for the patient movements from state 3 in the risperidone arm. As the prior probability and the relative frequency from the trial had the same level of uncertainty (both were based on one case) in the cariprazine arm for state 3, the posterior probability distribution reflects equal influence of the prior probabilities and clinical trial data. As the same prior was used for both treatment arms, and there were no observations in states 7 and 8 in the initial 4 weeks of the trial, the posterior distributions of the transition probabilities from these health states were essentially the same in the two arms.

Table 3 Posterior estimates of the weekly transition probabilities (%) estimated with the use of uninformative (normal text) and informative (bold text) prior probabilities and clinical trial data

When the informative priors were used, these distributions reflected the opinions of the experts and were not at all uniform, avoiding the bias that would have resulted from the use of vague priors that assumed that the probability of each transition was the same from these health states (i.e., 1/8).

Table 4 shows the results regarding the health benefits with the use of the three different transition probability matrices. The estimated difference in QALYs was small but not negligible over a 54-week-long time period. Compared to the model that incorporated expert opinions, the model including only the clinical trial data estimated 2.8% higher, while the model using vague priors estimated 4% smaller QALY differences between the treatment arms. Nevertheless, these differences were very small compared to the precision of the QALY difference estimate, as the standard deviation of it was 0.014 after 1000 model runs in the probability sensitivity analysis in the cost-effectiveness analysis.

Table 4 Health benefit results of cost-effectiveness modelling with different transition probability matrices


Substituting zeros with plausible and valid values in a transition probability matrix of a state transition model is challenging. In a Bayesian analysis, we combined clinical trial data with informative Dirichlet prior probabilities of transitions between health states defined by Mohr et al. in patients with predominantly negative symptoms of schizophrenia [10]. The elicitation of Dirichlet prior probabilities proved to be feasible and reliable with the application developed by Elfadaly and Garthwaite, as the average values of the experts’ estimates showed considerable consistency with the observed relative frequencies from the clinical trial for transitions that were observed in the trial. The opinions of the different experts could be pooled by linear combinations or by the supra Bayesian procedure [13]. The simple averaging method, which we used, is a widely used and recommended method, especially with small sample sizes [14].

A strength of our approach was that the evidence about treatment efficacy was derived entirely from the clinical trial, as the same prior distribution was used on the two arms. The use of informative priors for treatment efficacy might be justifiable in some cases, but we think that extracting efficacy estimates from clinical trials or effectiveness estimates from well-designed observational studies provides more valid results [15].

Different methods of estimating the transition probability matrix may lead to considerably different results. In our case, the size of the difference between the three results in the incremental QALY was not large. The size of this difference, however, depends on the specifics of the actual model. When the forecasting of the model includes significantly long periods of time in health states or transitions that were not observed or when there are large differences in the quality of life or costs of health states, the size of the difference in the results is likely to be larger. In our case, for example, when the time period of the model was extended to 258 weeks, the 4% difference in the incremental health gain between the approaches using vague and informative priors increased to approximately 6% (QALY difference of 0.0825 and 0.0875, respectively). The differences in the QALY estimates by estimation method were in line with our expectations, as the clinical experts estimated that transitions to the more severe states were less frequent than assumed with the use of the non-informative priors. Thus, we expected larger QALY estimates when informative priors were used.

Although several guidelines have been published about how to perform health economic modelling, the estimation of the transition probability matrix is a neglected area of study, as there are no available guidelines regarding this topic [16]. The Bayesian approach provides more plausible and valid results than simply assuming zero probabilities for the unobserved transitions, when these unobserved transitions still occur in clinical practice. Our case study showed that using vague priors different from the experts’ prior beliefs resulted in different model results. When the selection of the priors largely influences the model results (e.g., when the clinical trial data have large uncertainty), sensitivity analyses with plausible ranges of the priors can help determine the robustness of the results.


In summary, the proposed method by Briggs et al. provides a conceptual framework and practical solution to estimate transition probability matrices when some of the possible transitions did not occur in the empirical studies [4]. Using informative priors rather than vague priors when the Bayesian approach is applied could be an option when the transition probabilities without empirical estimates have a significant impact on the model results.. The software application developed by Elfadaly and Garthwaite is a good practical aid to elicit Dirichlet and Gaussian copula priors by expert interviews [8].

Availability of data and materials

The supplementary material (WinBUGS program code) used to support the findings of this study are included within (Additional file 1).



Markov chain Monte Carlo


Prior Elicitation Graphical Software


Quality adjusted life years


  1. 1.

    Marsh K, Phillips CJ, Fordham R, Bertranou E, Hale J. Estimating cost-effectiveness in public health: a summary of modelling and valuation methods. Health Econ Rev. 2012;2:17.

    Article  Google Scholar 

  2. 2.

    Sonnenberg FA, Beck JR. Markov models in medical decision making: a practical guide. Med Decis Making. 1993;13:322–38.

    CAS  Article  Google Scholar 

  3. 3.

    O’Hagan A, Stevens JW. Bayesian methods for design and analysis of cost-effectiveness trials in the evaluation of health care technologies. Stat Methods Med Res. 2002;11:469–90.

    Article  Google Scholar 

  4. 4.

    Briggs AH, Ades AE, Price MJ. Probabilistic sensitivity analysis for decision trees with multiple branches: use of the Dirichlet distribution in a Bayesian framework. Med Decis Making. 2003;23:341–50.

    Article  Google Scholar 

  5. 5.

    Lunn D, Jackson C, Best N, Spiegelhalter D, et al. The BUGS book: a practical introduction to Bayesian analysis. Boca Raton: Chapman and Hall/CRC; 2012.

    Google Scholar 

  6. 6.

    Zapata-Vázquez RE, O'Hagan A, Soares BL. Eliciting expert judgements about a set of proportions. J Appl Stat. 2014;41:1919–33.

    Article  Google Scholar 

  7. 7.

    Fox CR, Clemen RT. Subjective probability assessment in decision analysis: partition dependence and bias toward the ignorance prior. Manage Sci. 2005;51:1417–32.

    Article  Google Scholar 

  8. 8.

    Elfadaly FG, Garthwaite PH. Eliciting Dirichlet and Gaussian copula prior distributions for multinomial models. Stat Comput. 2017;27:449–67.

    Article  Google Scholar 

  9. 9.

    Németh B, Molnár A, Akehurst R, Horváth M, Kóczián K, Németh G, Götze Á, Vokó Z. Quality-adjusted life year difference in patients with predominant negative symptoms of schizophrenia treated with cariprazine and risperidone. J Comp Eff Res. 2017;6:639–48.

    Article  Google Scholar 

  10. 10.

    Mohr PE, Cheng CM, Claxton K, Conley RR, Feldman JJ, Hargreaves WA, Lehman AF, Lenert LA, Mahmoud R, Marder SR, Neumann PJ. The heterogeneity of schizophrenia in disease states. Schizophr Res. 2004;71:83–95.

    Article  Google Scholar 

  11. 11.

    Németh G, Laszlovszky I, Czobor P, Szalai E, Szatmári B, Harsányi J, Barabássy Á, Debelle M, Durgam S, Bitter I, Marder S. Cariprazine versus risperidone monotherapy for treatment of predominant negative symptoms in patients with schizophrenia: a randomised, double-blind, controlled trial. Lancet. 2017;389:1103–13.

    Article  Google Scholar 

  12. 12.

    Citrome L. Cariprazine: chemistry, pharmacodynamics, pharmacokinetics, and metabolism, clinical efficacy, safety, and tolerability. Expert Opin Drug Metab Toxicol. 2013;9:193–206.

    CAS  Article  Google Scholar 

  13. 13.

    Jacobs RA. Methods for combining experts' probability assessments. Neural Comput. 1995;7:867–88.

    CAS  Article  Google Scholar 

  14. 14.

    de Menezes LMW, Bunn D, Taylor JW. Review of guidelines for the use of combined forecasts. Eur Jour Operational Res. 2000;120:190–204.

    Article  Google Scholar 

  15. 15.

    Cooper NJ, Sutton AJ, Abrams KR. Decision analytical economic modelling within a Bayesian framework: application to prophylactic antibiotics use for caesarean section. Stat Methods Med Res. 2002;11:491–512.

    CAS  Article  Google Scholar 

  16. 16.

    Olariu E, Cadwell KK, Hancock E, Trueman D, Chevrou-Severac H. Current recommendations on the estimation of transition probabilities in Markov cohort models for use in health care decision-making: a targeted literature review. Clinicoecon Outcomes Res. 2017;9:537.

    Article  Google Scholar 

Download references


Not applicable.


Financial support for this study was provided by a contract with Gedeon Richter Plc., Budapest, Hungary and Recordati S.p.A, Milan, Italy. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, and writing and publishing the report. The following authors are employed by Gedeon Richter Plc.: Götze Á, Horváth M, and Kóczián K, and the following authors are employed by Recordati S.p. A: Fonticoli L and Lelli F.

Author information




All authors contributed to the design of this study. ZV, BN and AM performed the analysis. IB, BM and JR provided their expert opinions in the elicitation of the priors. ZV wrote the first draft of the manuscript, which, after review by all co-authors, was revised to its current form. ZV acts as the overall guarantor. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zoltán Vokó.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Permission was granted for reuse from the chief editor of the Journal of Comparative Effectiveness Research, the copyright holder for Tables 23 and Fig. 1.

Competing interests

Financial support was provided by Richter Gedeon Plc., Budapest, Hungary and Recordati S.p. A, Milan, Italy. ÁG, MH and KK are employed by Richter Gedeon Plc., Budapest, Hungary, while LF and FL are employed by Recordati S.p. A, Milan, Italy. ZV, AM, JP and BN are employees of Syreon Research Institute, which was a contracted research partner of Richter Gedeon Ltd. in this project, IB has received fees from Richter Gedeon Ltd. for consulting and as a speaker, and BM and JR have received fees for consulting.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

The WinBUGS code of the analysis.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Vokó, Z., Bitter, I., Mersich, B. et al. Using informative prior based on expert opinion in Bayesian estimation of the transition probability matrix in Markov modelling—an example from the cost-effectiveness analysis of the treatment of patients with predominantly negative symptoms of schizophrenia with cariprazine. Cost Eff Resour Alloc 18, 28 (2020).

Download citation


  • Bayesian statistics
  • Transition probabilities
  • Markov model
  • Schizophrenia