Comparing the Value of Three Main Diagnostic-Based Risk-Adjustment Systems (DBRAS)

by Marc Berlinguet | Mar 01, 2005

Key Implications for Decision Makers

  • Diagnostic-Based Risk-Adjustment Systems (DBRAS) are now widely used in the United States by healthcare payers and providers to identify the health status of individuals and predict their expenditures for the same year or the next year. Doing so requires linking all of an individual's diagnoses over a one-year period (drawn from the same administrative database input files used for diagnosis related groups, or DRGs) and generating one group (for the categorical systems) or several groups (for the so-called dichotomous-variable groupers) for each individual. These systems can be used for funding under a capitation arrangement, identifying high-cost patients for case management, monitoring the health status of groups of enrolees, and planning and evaluating health services.
  • The lead researchers secured access to large development and validation samples from Ontario, Quebec, and Alberta. Evaluation licenses were obtained for the three most relevant DBRAS: the ADG/ACG system from Johns Hopkins University, the HCC/DCG system from DxCG Inc., and the ACRG2/CRG system from 3M Inc. Data were processed successfully. All diagnoses from fee-for-service billings and hospital discharge summaries were pooled for each patient.
  • The design involved measuring an expected cost and an observed cost for each individual in a validation sample for the same year (concurrent model) and for the following year (prospective model). Calculating weights requires retrieving all expenditures from fee-for-service medical billings and/or acute hospital expenditures for inpatient services or ambulatory day surgeries. An initial evaluation in all three provinces tested socio-economic adjustments in addition to age and gender; the three DBRAS were much better predictors of costs than these adjusters. Our core comparative evaluation then showed that the HCC/DCG system slightly outperformed the ACRG2/CRG model, and outperformed the ADG/ACG system by a wider margin, in predicting medical fee-for-service expenditures, hospital inpatient and ambulatory expenditures, and total cost. Some results varied considerably between provinces for the same grouper.
  • These systems are never used to predict individual expenses but rather to estimate expenses for groups of people with similar conditions. Predictive ratios (expected over observed costs) pool expenditures for many individuals, so predictive power is much stronger for groups than for individuals. Still, we observe that these systems over-predict costs for the lower-cost deciles and under-predict costs for the higher-cost deciles (deciles here meaning the whole sampled population divided into 10 equal bins).
  • Three main evaluation criteria were developed in January 2004 and used to rate each DBRAS grouper: 1) clinical and administrative value of categories; 2) discrimination and predictive value of categories; and 3) transparency, ease of use, and simplicity of resource weight calculation (see table 15 in the full report). All groupers are sound, but decision makers should select the one that best fits their needs. Since then, clinical risk groups (CRGs) were proposed in 2004 by the Quebec Ministry of Health for severity-adjusting capitation payment of GPs, and the Calgary Health Region has acquired an operational license for CRGs.

Executive Summary

This research project was initiated in July 2000, when a group of public servants from five provinces west of New Brunswick met in Calgary to share funding mechanisms for acute healthcare and identify research priorities. Encounter groupers such as Diagnosis Related Groups (DRG) or the Canadian CMG (TM CIHI) have been used extensively to measure the products of hospitals. A newer type of grouper, the Diagnostic-Based Risk-Adjustment System (DBRAS), was more widely used south of the border by healthcare payers and providers to identify the health status of individuals and predict their expenditures for the same year or the next year. That requires linking together all diagnoses (and some interventions, for at least one grouper) over a one-year period for an individual (from the same administrative database input files as DRGs) and generating one group (for the categorical systems) or several groups (for the so-called dichotomous-variable groupers, which are based on additive multiple linear regression models) for that individual. It also involves retrieving all expenditures from fee-for-service medical billings and/or acute hospital expenditures for inpatient services or ambulatory day surgeries. These systems can be used for funding under a capitation arrangement, identifying high-cost patients for case management, monitoring the health status of groups of enrolees over many years, and planning and evaluating health services.

The lead researchers, based in three provinces at the Calgary Health Region, the Régie de l'Assurance Maladie du Québec, and the Ontario Ministry of Health and Long-Term Care, secured access to large and representative development and validation samples from each province for the years 1997/1998 (Quebec and Ontario only), 1998/1999, and 1999/2000 (all three provinces). The clinical information for all those individuals was linked together, and the medical fee-for-service expenditures and the acute inpatient and ambulatory surgery expenditures for the same year and the following year were linked and estimated. Evaluation licenses were obtained from the three most relevant American providers of such DBRAS: the ADG/ACG system from Johns Hopkins University, the HCC/DCG system from DxCG Inc., and the ACRG2/CRG system from 3M Inc. Data were processed successfully. All diagnoses from fee-for-service settings (private offices, clinics, and emergency rooms) and from hospital discharge summaries were pooled for each patient. Invalid diagnoses accounted for less than one percent of diagnoses in each province. The frequency distributions in each province were comparable with one another and with American databases; they were reviewed by the developer of each system and found valid.
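The pooling step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's processing code: the record layout `(patient_id, source, code)`, the function name `pool_diagnoses`, and the tiny `VALID_CODES` set are all hypothetical stand-ins for each grouper's real input format and code tables.

```python
from collections import defaultdict

# Hypothetical stand-in for a grouper's table of accepted diagnosis codes.
VALID_CODES = {"250", "401", "486"}

def pool_diagnoses(records, valid_codes):
    """Pool one year of diagnoses per patient across all sources.

    records: iterable of (patient_id, source, code) tuples, where source is
    e.g. "billing" or "discharge" (assumed layout).
    Returns (pooled, invalid_share): a dict of patient_id -> set of valid
    codes, and the share of records whose code was not recognised.
    """
    records = list(records)
    pooled = defaultdict(set)
    invalid = 0
    for patient_id, _source, code in records:
        if code in valid_codes:
            pooled[patient_id].add(code)  # duplicates across sources collapse
        else:
            invalid += 1
    invalid_share = invalid / len(records) if records else 0.0
    return dict(pooled), invalid_share

# Toy records, invented for illustration only.
toy_records = [
    ("p1", "billing", "250"),
    ("p1", "discharge", "401"),
    ("p1", "billing", "250"),
    ("p2", "billing", "486"),
    ("p2", "discharge", "XXX"),
]
pooled, invalid_share = pool_diagnoses(toy_records, VALID_CODES)
```

A grouper would then take each patient's pooled set as input and return one category (categorical systems) or several flags (dichotomous-variable systems); that proprietary logic is not reproduced here.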

The predictive power of the two dichotomous-variable systems (ADG and HCC) was evaluated using their best predictive models. For the CRG system (a mutually exclusive categorical model), a lower-performing model, ACRG2, was selected for two reasons: its number of categories (maximum 149) was a better match with the other two systems, and 16 age and gender sub-groups were added to the explanatory models, which would have made the total number of possible combinations too high had the most detailed model, with a maximum of 1,075 cells, been used.

The methods involved measuring an expected cost and an observed cost for each individual in a validation sample for the same year (concurrent model) and for the following year (prospective model). Expected costs were estimated beforehand on a separate, independent random sample. The CRG weights were calculated by averaging the costs of all individuals in the same ACRG2 group, severity level, and age and gender sub-group, much as is done for the encounter groupers DRG/CMG. Costs were also capped (truncated) at the 99th percentile, and all analyses used both the raw expenditures and the truncated values for each of three buckets: medical fee-for-service expenditures, acute care hospital expenditures (inpatient and ambulatory surgeries), and total expenditures (the sum of fee-for-service and hospital expenditures). For the dichotomous-variable groupers ADG and HCC, one individual may be described by one or several groups at the same time, so multiple linear regression was used to derive coefficients that were then added together to obtain a final scoring weight and estimated cost. Once expected and observed costs had been obtained for each individual, the final step in quantifying the predictive power of each system was a simple linear regression in which the variable to be explained is the observed cost, either raw or truncated, for same-year or following-year expenditures.
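The categorical weighting, the 99th-percentile cap, and the final regression step can be sketched as below. This is a minimal illustration under stated assumptions rather than the study's code: the function names, the cell labels, and the toy figures are invented, the percentile uses a simple nearest-rank rule, and the additive multiple regression used for ADG and HCC is not shown. The explained variance is computed as the squared correlation between expected and observed cost, which equals the R-squared of a simple linear regression with an intercept.

```python
from collections import defaultdict

def cap_at_percentile(costs, pct=99):
    """Truncate costs at the pct-th percentile (nearest-rank rule, an assumption)."""
    ranked = sorted(costs)
    rank = max(1, (pct * len(ranked) + 99) // 100)  # integer ceiling of pct*n/100
    cap = ranked[rank - 1]
    return [min(c, cap) for c in costs]

def cell_mean_weights(dev_records):
    """Average cost per cell in the development sample.

    dev_records: iterable of (cell, cost), where cell stands for a
    (group, age/gender sub-group) combination (hypothetical labels).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for cell, cost in dev_records:
        totals[cell][0] += cost
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}

def r_squared(expected, observed):
    """R-squared of a simple linear regression of observed on expected cost."""
    n = len(observed)
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    return (sxy * sxy) / (sxx * syy)

# Toy development and validation samples, invented for illustration only.
dev = [(("A", "F40"), 100.0), (("A", "F40"), 300.0),
       (("B", "M60"), 1000.0), (("B", "M60"), 3000.0)]
val = [(("A", "F40"), 150.0), (("A", "F40"), 250.0),
       (("B", "M60"), 1500.0), (("B", "M60"), 2500.0)]

weights = cell_mean_weights(dev)                 # expected cost per cell
expected = [weights[cell] for cell, _ in val]    # applied to validation sample
observed = [cost for _, cost in val]
r2 = r_squared(expected, observed)
capped = cap_at_percentile([float(i) for i in range(1, 101)])
```

Keeping the development and validation samples separate, as the text describes, means the explained variance is measured on individuals who did not contribute to the weights.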

When all three systems were compared with a model using only the 16 age and gender sub-groups as predictors, the difference in explained variance at the individual level (maximum 1.00) was large. For same-year total costs in Quebec, explained variance on raw costs was only 0.04 with the age/gender adjusters versus 0.43 with the best model there, ACRG2/CRG; on truncated costs it was 0.07 versus 0.55. For following-year costs (prospective model), the age/gender adjusters reached 0.07 on untruncated (raw) costs and 0.04 on truncated costs, against 0.17 and 0.12 respectively for the best-performing DBRAS grouper in that test in Quebec, ACRG2/CRG.

The use of socio-economic status (SES) adjustments in addition to age and gender was initially evaluated in all three provinces. Ontario and ADG serve as the example in this report; results were of the same magnitude in Quebec, and slightly higher in Alberta, where SES is measured for each individual (means test). The explained variance was much lower with ecological SES values (measured on geographic location rather than on individuals) plus age/gender adjusters than with a DBRAS, here ADG. For the concurrent total cost models, the results were 0.03 (truncated costs) and 0.01 (untruncated costs) for the SES plus age/gender adjusters versus 0.37 and 0.21 for the ADG models. For the prospective model (explaining next-year costs), the results were 0.03 (truncated) and 0.01 (untruncated) for the SES plus age/gender adjusters versus 0.14 and 0.16 for the ADG models with age and gender adjustments.

Tables 10 and 11 (in the full report) summarize the results for all cost buckets in the three provincial tests. In general, the HCC/DCG system slightly outperformed the ACRG2/CRG model and outperformed the ADG/ACG system by a wider margin. Some results varied between provinces for the same grouper. One possible explanation for the relatively poorer performance of the ACRG2/CRG models in Ontario and Alberta is that the distinction between principal and secondary diagnoses was not retained in the grouping process for these two provinces, while it was in Quebec. Another factor may have been the higher variability of expenditures in those two provinces, for both medical fee-for-service and hospital costs. Because the explained variance from the regression is measured by squaring the differences between observed and expected costs, this variability may have had a larger impact on CRGs, especially since this classification retains only one mutually exclusive group per individual rather than one or several, as the dichotomous-variable models (ADG and HCC) do. Finally, more diagnoses were available for each patient in Ontario, and even more so in Alberta: in Ontario, up to two diagnoses could be documented on a medical billing, and in Alberta, diagnosis information from the hospital administrative systems of emergency rooms and outpatient clinics was also available. This may also have favoured the two other groupers relative to ACRG2/CRG.

Overall, the relevance of a higher proportion of explained variance has to be put in perspective. First, an explained variance of 0.50 for one system at the individual level roughly means that predicting the right amount of same-year expenditures is almost like tossing a coin, and an explained variance of 0.20 for next-year expenditures is that much lower. There is more here than meets the eye, however, and that is why predictive ratios are so useful to consider (see Figure 15 in the full report): they pool expenditures for many individuals, and at that level the predictive power is much stronger. Indeed, such systems are never used to predict individual expenses but rather to estimate expenses for groups of people with similar conditions. Prediction is much more accurate for groups, even though these systems usually over-predict costs for the lower-cost deciles and under-predict costs for the higher-cost deciles (deciles here meaning the whole sampled population divided into 10 equal bins). One exception is that regression models containing negative coefficients artificially create negative costs if the corresponding groups are not pruned from the models tested. We did not prune them, in order to keep the comparison on the same cases for all systems, without manipulation.
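A predictive ratio by decile can be sketched as follows. This is a hedged illustration, not the report's procedure: it assumes deciles are formed by ranking individuals on observed cost (the text does not say which cost defines them), the function name is hypothetical, and the toy data use a deliberately flat predictor (everyone is assigned the overall mean), which exaggerates the pattern.

```python
def predictive_ratios_by_decile(expected, observed):
    """Return sum(expected) / sum(observed) for each of 10 equal-sized bins.

    Individuals are ranked by observed cost (an assumption) and split into
    10 bins from lowest to highest cost.
    """
    pairs = sorted(zip(observed, expected))
    n = len(pairs)
    ratios = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        total_observed = sum(o for o, _ in chunk)
        total_expected = sum(e for _, e in chunk)
        # A bin with zero observed cost has no defined ratio.
        ratios.append(total_expected / total_observed if total_observed else float("nan"))
    return ratios

# Toy data: 20 individuals with observed costs 10, 20, ..., 200, and a flat
# prediction equal to the overall mean cost of 105 for everyone.
toy_observed = [10.0 * i for i in range(1, 21)]
toy_expected = [105.0] * 20
ratios = predictive_ratios_by_decile(toy_expected, toy_observed)
```

A ratio above 1 in a bin means costs are over-predicted for that bin, and a ratio below 1 means they are under-predicted. With the flat toy predictor the lowest decile is heavily over-predicted and the highest is under-predicted; a real grouper narrows those gaps without necessarily removing them.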

In the final analysis, the investigators used a semi-structured consensus methodology (quasi-Delphi) to arrive at three main evaluation criteria for rating each grouper: 1) clinical and administrative value of categories (face value/clinical relevance and level of granularity for epidemiological applications); 2) discrimination and predictive value of categories (accuracy and precision for cost prediction); and 3) convenient resource weighting (transparency, ease, and simplicity of calculation). Table 11 in the full report provides our collective rating for each DBRAS.

Criterion/Product    Clinical Relevance    Resources Prediction    Convenient Resource Weighting
ADG/ACG              +                     ++                      +
DCG-HCC              ++                    +++                     +
CRG                  +++                   ++/+++                  +++

The Calgary Health Region has since acquired an operational license of CRGs; and CRGs have been selected by the Quebec Ministry of Health for capitation payment of GPs.