Prediction of Tacrolimus Blood Concentration in Liver Transplantation
Abstract and Introduction
Abstract
Introduction: Tacrolimus is an important immunosuppressive drug for organ transplantation patients. It has a narrow therapeutic range, toxic side effects, and a blood concentration with wide intra- and interindividual variability. Hence, it is of the utmost importance to monitor tacrolimus blood concentration, thereby ensuring clinical effect and avoiding toxic side effects. Prediction models for tacrolimus blood concentration can improve clinical care by optimizing monitoring of these concentrations, especially in the initial phase after transplantation during the intensive care unit (ICU) stay. This is the first study in the ICU in which support vector machines, a new data modeling technique, are investigated and tested for their ability to predict tacrolimus blood concentration. Linear support vector regression (SVR) and nonlinear radial basis function (RBF) SVR are compared with multiple linear regression (MLR).
Methods: Tacrolimus blood concentrations, together with 35 other relevant variables from 50 liver transplantation patients, were extracted from our ICU database. This resulted in a dataset of 457 blood samples (on average between 9 and 10 samples per patient) and a database of more than 16,000 data values. Nonlinear RBF SVR, linear SVR, and MLR were performed after selection of clinically relevant input variables and model parameters. Differences between observed and predicted tacrolimus blood concentrations were calculated. Prediction accuracy of the three methods was compared after fivefold cross-validation using the Friedman test and Wilcoxon signed rank analysis.
Results: Linear SVR and nonlinear RBF SVR had mean absolute differences between observed and predicted tacrolimus blood concentrations of 2.31 ng/ml (standard deviation [SD] 2.47) and 2.38 ng/ml (SD 2.49), respectively. MLR had a mean absolute difference of 2.73 ng/ml (SD 3.79). The difference between linear SVR and MLR was statistically significant (p < 0.001). RBF SVR had the advantage of requiring only 2 input variables to perform this prediction in comparison to 15 and 16 variables needed by linear SVR and MLR, respectively. This is an indication of the superior prediction capability of nonlinear SVR.
Conclusion: Prediction of tacrolimus blood concentration with linear and nonlinear SVR was excellent, and accuracy was superior in comparison with an MLR model.
Introduction
Purpose. Tacrolimus blood concentrations demonstrate a wide intra- and interindividual variability. Therefore, monitoring of these concentrations remains an issue of pivotal importance to safeguard therapeutic efficacy and to manage the risk of nephrotoxicity, other toxicities, and rejection in liver transplantation patients. This study examines the feasibility and clinical benefits of using a support vector regression (SVR) algorithm in comparison with a multiple linear regression (MLR) algorithm in predicting tacrolimus blood concentration. Tacrolimus blood concentration is predicted from a selected set of clinically relevant input variables.
Background. Hospital information systems in intensive care medicine generate large datasets on a daily basis. These rapidly increasing amounts of data make it difficult to extract correct and relevant clinical information about intensive care unit (ICU) patients. Data modeling techniques based on machine learning, such as support vector machines (SVMs), can partially reduce workload, aid clinical decision-making, and lower the frequency of human error. Fundamental research in clinical data modeling forms the basis on which later validation can be performed in multicenter clinical trials. This is the first study to use SVM for data modeling in the ICU domain. SVMs have been applied, however, in molecular biology and bioinformatics as well as in genetics and proteomics. In cancer research, kernel methods (or SVM) have been used to predict malignancy in brain tumors and to stage certain forms of breast and prostate cancer. In cardiology, heart valve disease has been predicted with SVMs, and in fundamental cardiology research, nucleotide polymorphisms of candidate genes for ischemic heart disease have been modeled by kernel methods. Clinical decision-making has been compared for prospective performance with logistic regression and SVM. In contrast with the absence of data concerning SVM applications in the ICU, artificial neural networks (ANNs), a less recent statistical learning technique, have been studied thoroughly in the ICU environment: they have been used for prediction of ICU mortality and prognosis in septic shock, clinical decision-making, and prediction of plasma drug concentrations. The management of infectious diseases, real-time analysis of hemodynamics, and research in cardiology and oncology have also benefited from recent developments in artificial intelligence (AI) and ANNs.
Underlying theory. The roots of SVM lie in statistical learning theory, which describes the properties of learning machines that enable them to generalize well to unseen data. During the 1990s, SVM was developed by Vapnik and coworkers at Bell Labs (formerly AT&T Bell Laboratories, Murray Hill, NJ, USA). A thorough overview of the underlying theory and the SVM algorithm itself is given by Guyon and Elisseeff. In the case of SVR, the goal is to find a function that predicts the target values of the training data with a deviation of at most ε, while requiring this function to be as flat as possible. The core of the support vector algorithm does this for linear functions f(x) = <w,x> + b, where <w,x> denotes the dot product of vectors w and x, and enforces flatness by minimizing ||w||, the Euclidean norm of vector w. By using a dual representation of the minimization problem, the algorithm requires only dot products of the input patterns. This allows the application of nonlinear regression by using a kernel function that represents the dot product of the two transformed vectors. MLR and the linear support vector algorithm are both linear approaches, but they differ in their underlying theoretical heuristics: the MLR method fits a model using the least-squares criterion (that is, the sum of the squared distances to the regression line is minimized). The support vector algorithm fits a flat-as-possible function by searching for a separating hyperplane (Figure 1). The radial basis function (RBF) SVR method fits a nonlinear function onto the data, again aiming for maximum flatness. The RBF kernel is also often called a Gaussian kernel because the kernel function has the same functional form as the Gaussian distribution function. Smola and Schölkopf give an excellent overview of many details of the SVR procedure.
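The quantities named in this paragraph can be made concrete with a small sketch. The Python below is an illustration only, not the implementation used in the study: the function names, the parameter names `epsilon` and `gamma`, and all numeric values are assumptions chosen for the example. It shows the linear function f(x) = <w,x> + b, the ε-insensitive penalty that ignores deviations of at most ε, the squared penalty used by least squares, the norm of w as a measure of flatness, and the Gaussian form of the RBF kernel.

```python
import math


def dot(u, v):
    """Dot product <u,v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_predict(w, b, x):
    """Linear function f(x) = <w,x> + b."""
    return dot(w, x) + b


def flatness(w):
    """Euclidean norm of w; smaller values mean a flatter function."""
    return math.sqrt(dot(w, w))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """SVR-style penalty: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def squared_loss(y_true, y_pred):
    """Least-squares penalty for a single observation."""
    return (y_true - y_pred) ** 2


def rbf_kernel(u, v, gamma):
    """Gaussian (RBF) kernel exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


if __name__ == "__main__":
    # Hypothetical weights and inputs, not values from the study.
    w, b = [0.5, -0.25], 1.0
    x = [2.0, 4.0]
    y_pred = linear_predict(w, b, x)
    for y_true in (1.5, 3.5):
        print(
            y_true,
            epsilon_insensitive_loss(y_true, y_pred, epsilon=1.0),
            squared_loss(y_true, y_pred),
        )
    print(flatness(w), rbf_kernel(x, [2.5, 3.0], gamma=0.1))
```

The contrast between the two penalties is the point of the sketch: an observation that lies within ε of the prediction contributes nothing to the SVR objective, whereas every nonzero residual contributes to the least-squares sum, and large residuals contribute quadratically.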