Modeling Strategies for Risk Prediction in Clinical Medicine with Restricted Data: Application to Cardiovascular Disease

Lee, Junyoung; Chan, Wai Kin (Victor)

doi:10.1007/978-981-33-4359-7_2

Abstract

This paper describes modeling strategies for risk prediction in clinical medicine, focusing on survival analysis. The strategies assume restricted data, as is common in early-stage clinical research. Cox’s proportional hazards model is combined with modern statistical approaches. Detailed modeling strategies for clinical risk prediction are proposed and demonstrated through a case study on cardiovascular disease. Experiments were conducted using stepwise selection and Elastic Net with bootstrapping. The results offer insights into risk prediction and modeling when clinical data are limited.
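
For orientation, the ingredients named in the abstract can be written in their standard textbook form. The abstract does not state the paper's own notation or parameterization, so the symbols below are assumptions: $h_0$ is the baseline hazard, $\delta_i$ the event indicator, $R(t_i)$ the risk set at time $t_i$, $\lambda$ and $\alpha$ the Elastic Net penalty strength and mixing weight, and $B$ the number of bootstrap resamples. The partial likelihood is shown in its no-ties form.

```latex
% Cox proportional hazards model (standard form)
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right)

% Log partial likelihood, assuming no tied event times
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]

% Elastic Net estimate: penalized partial likelihood (one common parameterization)
\hat{\beta} = \arg\max_{\beta}
  \left\{ \ell(\beta) - \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right) \right\}

% Bootstrap inclusion frequency of predictor k over B resamples
\hat{\pi}_k = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\!\left\{ \hat{\beta}_k^{(b)} \neq 0 \right\}
```

Setting $\alpha = 1$ gives the lasso and $\alpha = 0$ gives ridge regression. The inclusion frequency $\hat{\pi}_k$ is one common way to summarize how stably a predictor is selected across resamples; whether the paper uses exactly this summary is not stated in the abstract.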

Acknowledgements

The samples used for the case study were provided by AUSA company. Clinical guidance from the company's professionals was of great importance, and we particularly appreciate their contribution to our research.

This paper was funded by a grant from the National Natural Science Foundation of China (Grant No. 71971127) and a grant from the Shenzhen Municipal Development and Reform Commission, Shenzhen Environmental Science and New Energy Technology Engineering Laboratory (Grant Number: SDRC [2016]172).

Author information

Authors and Affiliations

Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, China
Junyoung Lee & Wai Kin (Victor) Chan

Corresponding author

Correspondence to Wai Kin (Victor) Chan.

Editor information

Editors and Affiliations

Beijing Jiaotong University, Beijing, China
Shifeng Liu
Budapest University of Technology and Economics, Budapest, Hungary
Gábor Bohács
Beijing Jiaotong University, Beijing, China
Xianliang Shi
International Center for Informatics Research, Beijing Jiaotong University, Beijing, China
Xiaopu Shang
Beijing Logistics Informatics Research Base, Beijing Jiaotong University, Beijing, China
Anqiang Huang

Cite this paper

Lee, J., Chan, W.K. (2021). Modeling Strategies for Risk Prediction in Clinical Medicine with Restricted Data: Application to Cardiovascular Disease. In: Liu, S., Bohács, G., Shi, X., Shang, X., Huang, A. (eds) LISS 2020. Springer, Singapore. https://doi.org/10.1007/978-981-33-4359-7_2

DOI: https://doi.org/10.1007/978-981-33-4359-7_2
Published: 11 April 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4358-0
Online ISBN: 978-981-33-4359-7
eBook Packages: Business and Management, Business and Management (R0)
