The current study represents one of the largest evaluations of variations in ICU mortality and, to our knowledge, the first to include all hospitals providing critical care services in a single metropolitan region. In analyses of more than 116,000 patients admitted to ICUs in 28 hospitals over a 4-year period, several important findings emerge. First, an existing ICU risk stratification tool can be successfully implemented in a diverse spectrum of hospitals as part of an ongoing collaboration between purchasers and providers to evaluate health-care quality and reward high-performing institutions. The longevity of this initiative may, in part, be due to the use of a previously validated method. Other important and explicit steps were taken under the initiative to ensure long-term participation, including the following: rigorous independent examination of the validity and reliability of the method in the current population prior to public reporting; a commitment to refine data collection and analysis based on feedback from local ICU physicians; development of training workshops for purchasers and local media to review important methodologic limitations of the data being disseminated; and allowing adequate time to scrutinize results prior to public release.
Second, a substantial amount of the variation in observed mortality rates was explained by the previously developed APACHE III normative risk predictions. The data confirm the powerful predictive value of weighted abnormal physiology for patients admitted to the ICU. Moreover, the discrimination of the APACHE III method in our community-based cohort was nearly identical to its discrimination in a development cohort assembled using strict research protocols.
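Discrimination, as used above, is the ability of a model to assign higher predicted risk to patients who die than to those who survive, and it is commonly summarized by the c-statistic (area under the ROC curve). The following minimal Python sketch shows how such a value can be computed from predicted risks and outcomes. The function name `c_statistic` and the four-patient example are hypothetical illustrations, not the analysis code or data used in this study.

```python
def c_statistic(predicted, died):
    """Fraction of (death, survivor) pairs in which the patient who died
    had the higher predicted risk; tied predictions count as one half."""
    deaths = [p for p, d in zip(predicted, died) if d]
    survivors = [p for p, d in zip(predicted, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for risk_death in deaths:
        for risk_survivor in survivors:
            if risk_death > risk_survivor:
                concordant += 1.0
            elif risk_death == risk_survivor:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


# Hypothetical example: four patients, two of whom died.
example_risks = [0.10, 0.40, 0.35, 0.80]
example_died = [0, 0, 1, 1]
print(c_statistic(example_risks, example_died))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for cohorts as large as the one described here.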
Calibration analyses indicated that APACHE III systematically overestimated the risk of death in the current population. This may reflect the likelihood that a prediction model will almost always perform better in the data set from which it was derived than when it is applied to a new population. If both models were prospectively applied to a new population, the APACHE III model might prove better calibrated than the current model, or there might be no appreciable difference in performance. It is also important to note that, in large data sets, the Hosmer-Lemeshow test is an extremely sensitive method for discerning differences between observed and predicted outcome rates.
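The sensitivity of the Hosmer-Lemeshow test to sample size can be illustrated with a minimal Python sketch. It assumes a purely synthetic cohort in which predicted risk exceeds the observed death rate by a fixed 10% in every risk stratum. The function names, the 10% figure, the risk levels, and the cohort sizes are all hypothetical and are not drawn from the study data.

```python
def hosmer_lemeshow(predicted, died, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized risk groups.
    Assumes every group's mean predicted risk lies strictly between 0 and 1."""
    pairs = sorted(zip(predicted, died))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected deaths in the group
        observed = sum(d for _, d in chunk)   # observed deaths in the group
        mean_risk = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_risk * (1 - mean_risk))
    return statistic


def synthetic_cohort(scale):
    """Hypothetical cohort: ten risk levels (5%, 15%, ..., 95%), each with
    200 * scale patients, where observed deaths are 90% of those predicted."""
    predicted, died = [], []
    for step in range(1, 20, 2):
        risk = step * 5 / 100
        size = 200 * scale
        deaths = 9 * step * scale             # equals 0.9 * risk * size
        predicted += [risk] * size
        died += [1] * deaths + [0] * (size - deaths)
    return predicted, died


for scale in (1, 10):
    risks, outcomes = synthetic_cohort(scale)
    ratio = sum(outcomes) / sum(risks)        # observed-to-predicted deaths
    print(len(risks), round(ratio, 3), round(hosmer_lemeshow(risks, outcomes), 1))
```

In this sketch the observed-to-predicted ratio is identical in both cohorts, yet the statistic grows in direct proportion to the number of patients. Because the statistic is compared against a fixed chi-square reference distribution, a degree of miscalibration that passes unnoticed in a small cohort can be flagged as highly significant in a large one.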