Data source
Three counties were selected as sample counties: Dingyuan in Anhui province (central China), and Huining in Gansu province and Yilong in Sichuan province (both in western China). The reimbursement and payment levels of the new rural cooperative medical scheme (NRCMS) are similar across the three counties.
A cluster sampling method was applied in this study. The largest and most capable public comprehensive hospital in each county was selected as the sample hospital, and medical records were the sampling units. For the sample size calculation, the inappropriate admission rate P was estimated at 16% according to existing research [2]; the relative tolerance was δ = 0.09, giving an absolute tolerance d = 0.09 × P = 1.44%; the significance level was α = 0.05; and the corresponding two-sided standard normal deviate was Zα = 1.96. The sample size (N) was calculated as follows:
$${\text{N}} = ({{\text{Z}}_{\upalpha}}/{\text{d}})^{2} \times {\text{P}} (1 - {\text{P}}) = (1.96/1.44\%)^{2} \times 16\% \times (1-16\%)=2489.93$$
(1)
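The arithmetic in Eq. (1) can be checked with a short script. This is a minimal sketch; the function name `sample_size` and its parameter names are chosen here for illustration and do not come from the study.

```python
import math


def sample_size(p, relative_tolerance, z):
    """Sample size for estimating a proportion, as in Eq. (1).

    p                  -- expected inappropriate admission rate (0.16 in the text)
    relative_tolerance -- relative tolerance delta (0.09 in the text)
    z                  -- standard normal deviate (1.96 in the text)
    """
    d = relative_tolerance * p  # absolute tolerance, 0.0144 for the values in the text
    return (z / d) ** 2 * p * (1.0 - p)


n = sample_size(p=0.16, relative_tolerance=0.09, z=1.96)
print(round(n, 2))   # value of Eq. (1) to two decimals
print(math.ceil(n))  # smallest whole number of records meeting the requirement
```

Rounding up gives a minimum of 2490 records, which the 3 × 900 = 2700 records drawn from the three hospitals exceed.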
To allow for medical records of insufficient quality, 900 medical records from 2017 were selected from each hospital. First, hospital delivery admissions in obstetrics were excluded because the AEP is not pertinent to them. Then, records were selected from the remaining departments in proportion to each department's share of the total number of patients across all departments. Finally, after records with too many missing values or serious logical errors were eliminated, a total of 2575 medical records were retained as the sample (Fig. 1). There were no missing values in the outcome variables in the final sample.
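The proportional selection across departments can be illustrated as follows. This is only a sketch: the department names and patient counts are hypothetical, and the largest-remainder rounding rule is an assumption, since the text does not state how fractional quotas were handled.

```python
def allocate_proportionally(patients_by_department, total_records):
    """Split total_records across departments in proportion to patient counts.

    Fractional quotas are rounded with the largest-remainder rule (an
    assumption made for this sketch) so that the quotas sum to total_records.
    """
    total_patients = sum(patients_by_department.values())
    if total_patients <= 0:
        raise ValueError("total number of patients must be positive")

    exact = {
        dept: total_records * count / total_patients
        for dept, count in patients_by_department.items()
    }
    allocation = {dept: int(quota) for dept, quota in exact.items()}  # round down

    # Hand the remaining records to the departments with the largest remainders.
    leftover = total_records - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda dept: exact[dept] - allocation[dept], reverse=True
    )
    for dept in by_remainder[:leftover]:
        allocation[dept] += 1
    return allocation


# Hypothetical annual patient counts for one hospital (not data from the study).
hypothetical_counts = {
    "internal medicine": 5200,
    "surgery": 3100,
    "paediatrics": 1700,
    "other": 1000,
}
quotas = allocate_proportionally(hypothetical_counts, total_records=900)
print(quotas)
```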
All the medical records were evaluated using an adjusted AEP standard constructed in 2014 for county hospitals in China [17] (Appendix). The records were evaluated by two trained judges, both members of the research team, who received professional training before evaluating admission appropriateness. Among all the records, 609 admissions were regarded appropriate (the control group) and 1966 were classified as inappropriate (the treatment group). In addition to the general factors that affect the utilization of inpatient services (individual basic characteristics, external systems and policies, etc.), the constant development and change of the disease itself is an important factor that cannot be ignored. Based on these considerations, this paper took a dynamic perspective to compare the utilization of health services after hospitalization among patients with different admission appropriateness, which is one of the highlights of this study. Given this perspective and the characteristics of the AEP criteria, the medical records were judged mainly according to the patients' indications at the time of admission (when some disease indications may not be fully manifested) rather than the final discharge results (when the disease indications are relatively comprehensive). Because disease indications are not fully manifested at admission, it may be difficult to meet the AEP criteria for "appropriate" admission, which may lead to an overestimate of the inappropriate admission rate.
Study variables
Outcome variables
In this study, we used LOS, NCI and EOH as the outcome variables. These three indicators describe the patients' utilization of services. LOS is a comprehensive index that directly measures hospital medical quality and management level [18]. NCI is an important index that reflects the service items received by inpatients. EOH is a critical index in health economic evaluation and the most direct reflection of health resource consumption [19]. Because EOH may not follow a normal distribution, it was log-transformed; after the transformation, it conformed to the normal distribution.
Explanatory variables
Because covariate selection in PSM should include, as far as possible, the relevant variables that may affect both the outcome variables and the treatment variable in order to satisfy the ignorability assumption, this study included as many covariates as were available in the medical records. There were 15 patient-level covariates: gender, age, type of medical insurance, profession, marital status, way of admission, frequency of hospitalization, department in charge of treatment, disease system, having more than one disease, status of the patient upon admission, history of disease, chronic diseases, health condition at ordinary times and receiving any surgery. The utilization of health services differs across age groups because of differences in disease severity and other considerations. The type of medical insurance also affects the utilization of health services; in particular, as the NRCMS has developed, the reimbursement ratio has gradually increased, which releases patients' demand for medical services and increases the number of service items [2]. Profession may affect the length of hospital stay; for instance, farmers may shorten the LOS during busy seasons regardless of the severity of the disease [20, 21]. Health condition at ordinary times, status of the patient upon admission, having more than one disease and disease system are closely related to changes in patients' conditions after hospitalization, and these variables require particular attention in this study. Changes in illness can affect LOS and the utilization of services [22]. Receiving surgery may influence hospitalization results because of the risk of nosocomial infections and complications [23].
Propensity score matching (PSM)
There were differences in individual characteristics between the treatment and control groups, which would affect the comparison of service utilization. Propensity scores were therefore used to match inpatients with similar characteristics across the two groups. PSM balances observable covariates and reduces potential selection bias [24, 25]. The samples were matched in two major steps. In the first step, the total sample was matched using the 15 individual covariates to examine the differences in the utilization of hospital services between the two groups. In the second step, PSM was performed within disease systems, because the use of health services varies among disease systems. Disease systems were divided into five groups (circulatory diseases, digestive diseases, respiratory diseases, surgical diseases and others), and inpatients in the treatment and control groups were matched within each group using the 14 individual covariates other than "disease system". This shows whether there are significant differences in service utilization between the two groups within each disease system.
Statistical analysis
First, the propensity score was obtained by entering the covariates into a logit model. Then, kernel matching was used to match each patient in the treatment group with similar counterpart patients in the control group (one-to-one matching) based on the propensity score. According to the health services literature, kernel matching performs well in terms of accuracy [24, 26]. Finally, we calculated the average treatment effect on the treated (ATT), which reflects the average change in the outcome variable after controlling for the covariates.
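The first step, estimating the propensity score from a logit model, can be sketched in plain Python. The study used Stata; the code below is not that routine but an illustrative maximum-likelihood fit by gradient ascent, and the function names, the learning rate, the iteration count and the toy data (one standardized covariate) are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(rows, treated, learning_rate=0.1, iterations=5000):
    """Fit a logit model of treatment on covariates by gradient ascent.

    rows    -- list of covariate vectors, one per patient
    treated -- list of 0/1 treatment indicators (1 = inappropriate admission)
    Returns the coefficients [intercept, beta_1, ..., beta_k].
    """
    n = len(rows)
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(iterations):
        gradient = [0.0] * len(beta)
        for x, d in zip(rows, treated):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            residual = d - p  # score contribution of this patient
            gradient[0] += residual
            for k, v in enumerate(x, start=1):
                gradient[k] += residual * v
        beta = [b + learning_rate * g / n for b, g in zip(beta, gradient)]
    return beta


def propensity_scores(rows, beta):
    """Predicted probability of treatment for each patient."""
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x))) for x in rows]


# Toy data: one standardized covariate and a 0/1 treatment indicator.
toy_rows = [[-1.5], [-1.0], [-0.5], [0.0], [0.0], [0.5], [1.0], [1.5]]
toy_treated = [0, 0, 1, 0, 1, 0, 1, 1]
toy_beta = fit_logit(toy_rows, toy_treated)
toy_scores = propensity_scores(toy_rows, toy_beta)
print(toy_beta)
print(toy_scores)
```

In practice, a statistical package's logit routine would replace `fit_logit`; the sketch only makes explicit that the propensity score is the fitted probability of being in the treatment group given the covariates.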
Assume that each inpatient i has two potential outcomes, \(Y_{i1}\) (treatment, inappropriate admission) and \(Y_{i0}\) (control, appropriate admission). The average effect of the treatment is given by \(E(Y_{i1} - Y_{i0} )\). However, as \(Y_{i0}\) and \(Y_{i1}\) cannot be observed simultaneously for the same inpatient, the ATT is calculated instead:
$$ATT = E(Y_{i1} |D_{i} = 1) - E(Y_{i0} |D_{i} = 1)$$
(2)
where \(D_{i}\) is the dichotomous indicator of treatment, with 1 indicating that inpatient i was admitted inappropriately and 0 indicating that inpatient i was admitted appropriately. Stata 15.0 software (Stata Corp LP, College Station, TX, USA) was used for statistical analysis in a Windows environment. The two-sided statistical significance level was set at 0.05.
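To make Eq. (2) concrete, the sketch below estimates the ATT by kernel matching on given propensity scores. For each treated patient, the unobserved \(Y_{i0}\) is replaced by a kernel-weighted average of control outcomes, with weights that decline as the propensity scores move apart. The Epanechnikov kernel, the bandwidth of 0.06, the handling of treated patients with no nearby controls and the toy numbers are illustrative assumptions, not settings reported in the study.

```python
def epanechnikov(u):
    """Epanechnikov kernel: positive weight only when |u| < 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kernel_matching_att(scores, treated, outcomes, bandwidth=0.06):
    """Estimate the ATT by kernel matching on the propensity score.

    scores   -- propensity score of each patient
    treated  -- 0/1 indicator D_i (1 = inappropriate admission)
    outcomes -- observed outcome of each patient (e.g. LOS)
    Treated patients with no control inside the bandwidth are dropped
    (an assumption of this sketch).
    """
    controls = [(p, y) for p, d, y in zip(scores, treated, outcomes) if d == 0]
    effects = []
    for p_i, d_i, y_i in zip(scores, treated, outcomes):
        if d_i != 1:
            continue
        weights = [epanechnikov((p_i - p_j) / bandwidth) for p_j, _ in controls]
        total = sum(weights)
        if total == 0.0:
            continue  # no comparable control patient for this treated patient
        # Weighted average of control outcomes stands in for Y_i0.
        counterfactual = sum(w * y_j for w, (_, y_j) in zip(weights, controls)) / total
        effects.append(y_i - counterfactual)
    if not effects:
        raise ValueError("no treated patient has a control within the bandwidth")
    return sum(effects) / len(effects)


# Toy example (not data from the study): two treated and two control patients.
example_scores = [0.50, 0.50, 0.52, 0.52]
example_treated = [1, 0, 1, 0]
example_outcomes = [10.0, 7.0, 12.0, 9.0]
example_att = kernel_matching_att(example_scores, example_treated, example_outcomes)
print(example_att)
```

Because every control within the bandwidth contributes to each counterfactual, the estimator uses more of the control group than nearest-neighbour matching does, at the cost of depending on the chosen bandwidth.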