
Cyclosporine Concentration Prediction using Clustering and Support Vector Regression Methods

G. Camps-Valls∗, E. Soria-Olivas∗, J. J. Pérez-Ruixo†, F. Pérez-Cruz‡, A. R. Figueiras-Vidal‡, A. Artés-Rodríguez‡

∗ Dept. Enginyeria Electrònica. Universitat de València, Spain.
C/ Dr. Moliner, 50, 46100 - Burjassot (València), Spain.

† Pharmacy Service. Dr. Peset University Hospital. València, Spain.

‡ Dpto. Teoría de la Señal y las Comunicaciones. Universidad Carlos III de Madrid, Spain.

This paper proposes a combined strategy of clustering and Support
Vector Regression (SVR) methods to predict Cyclosporine A (CyA)
concentration in renal transplant recipients. The approach deals with the high
variability and non-stationarity of the time series and reports knowledge
gain in the problem. The SVR outperforms other classical neural network approaches.

∗ J. J. Pérez-Ruixo is also with the Global Clinical Pharmacokinetics and Clinical
Pharmacodynamics Department, Janssen Research Foundation (Belgium).

Despite progress with newer agents, Cyclosporine A (CyA) is still the cornerstone
of immunosuppression in patients who have undergone kidney transplantation.
However, CyA is generally considered a critical-dose drug: under-dosing
may result in graft loss, while overdosing causes kidney damage and increases
opportunistic infections, systolic and diastolic blood pressure, and cholesterol.
Moreover, the pharmacokinetic behavior of CyA presents a substantial inter- and
intra-individual variability, which appears to be particularly evident in the early
post-transplantation period (<3 months), when the risk and clinical consequences
of acute rejection are higher than in stable renal patients (>6 months)
[1]. Several factors, such as clinical drug interactions and patient compliance, can
also significantly alter blood CyA concentrations; intensive therapeutic drug
monitoring of CyA thus becomes necessary, but it affects the patient's
quality of life and the cost of care.

Models capable of predicting the future concentration and determining the
optimal dosage of CyA can aid in individualizing therapy. Few studies have
addressed this problem and none, to our knowledge, using machine learning or
neural networks. We propose the use of Support Vector Machines (SVM) for solving
this task, since they do not rely on any a priori assumption about the problem
and have proven to be effective techniques in a wide range of applications [2].
To deal with the non-uniform sampling (NUS), the presence of non-stationary
processes, and the high variability in the time series, we have previously
clustered the data.
SVMs are state-of-the-art tools for nonlinear input-output knowledge discovery
[2]. The Support Vector Regressor (SVR) is their formulation for regression and
function approximation. Given a labelled training data set ({(x_i, y_i)}, i = 1, …, n,
where x_i ∈ R^d and y_i ∈ R) and a nonlinear mapping to a higher-dimensional
space φ(·) (x ∈ R^d −→ φ(x) ∈ R^H, d ≤ H), the SVR solves

    min  (1/2)‖w‖² + C (νε + (1/n) Σ_i (ξ_i + ξ_i^*))            (1)

subject to

    y_i − φ^T(x_i)w − b ≤ ε + ξ_i
    φ^T(x_i)w + b − y_i ≤ ε + ξ_i^*
    ξ_i, ξ_i^* ≥ 0,   ε ≥ 0,
where w and b define a linear regressor in the feature space, which is nonlinear
in the input space unless φ(x_i) = x_i. In addition, ξ_i and ξ_i^* are positive
slack variables that deal with training samples whose prediction error is larger
than ε, and C is the penalization applied to them. The tube size ε is
traded off against model complexity and slack variables via a constant ν ∈ [0, 1),
which can be regarded as an upper bound on the fraction of errors and a lower
bound on the fraction of Support Vectors (SV). This formulation is known as
the ν-SVR [3]. The usual procedure for solving SVRs introduces the linear
restrictions of Eq. (1) using Lagrange multipliers, imposes the Karush-Kuhn-Tucker
conditions and solves the Wolfe dual problem using quadratic
programming procedures [2, 3]. We will instead use an alternative procedure
that iteratively solves a series of weighted least squares problems
[4], known as the Iterative Re-Weighted Least Squares (IRWLS) procedure.
In order to work with reproducing kernels in a Hilbert space, we
require w to be a linear combination of a subset of the training samples,
w = Σ_i β_i φ(x_i). The procedure works with the errors
e_i = y_i − φ^T(x_i)w − b − ε and e_i^* = φ^T(x_i)w + b − y_i − ε.
The column vectors y, a, a^*, β and 1 have the obvious definitions, and H,
with entries H_ij = φ^T(x_i)φ(x_j), is known as the kernel matrix, since it is
formed only by inner products of the training samples in the feature space.
Consequently, neither the minimization procedure nor the use of the regressor
needs to know the nonlinear mapping φ(·) explicitly, but only its kernel
representation κ(·, ·). The transformations needed to obtain the IRWLS procedure
from the minimization of Eq. (1) can be found in [4].
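As a concrete illustration of the ν-SVR formulation, the following sketch uses scikit-learn's NuSVR (a modern library, not the implementation used in this work; the kernel choice and parameter values are purely illustrative). It checks numerically that ν acts as a lower bound on the fraction of Support Vectors:

```python
import numpy as np
from sklearn.svm import NuSVR

# Synthetic 1-D regression problem standing in for a real time series
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 4.0, size=(120, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(120)

# nu in [0, 1) upper-bounds the fraction of errors and
# lower-bounds the fraction of Support Vectors
model = NuSVR(nu=0.5, C=10.0, kernel="rbf", gamma=1.0)
model.fit(X, y)
frac_sv = len(model.support_) / len(X)  # close to (and at least about) nu
```

With nu = 0.5, roughly half of the training samples end up as Support Vectors, and the tube size ε is found automatically as part of the optimization.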

Fifty-seven renal allograft recipients treated in the Nephrology Service of the
Hospital Universitari Dr. Peset in the city of València were included in
this study. Patients received a standard immunosuppressive regimen of CyA.
Steady-state blood samples were withdrawn 12-14
hours after dose administration and measured by a specific monoclonal
fluorescence polarization immunoassay. We collected 11 patient factors to build
the models: age, gender, creatinine plasma levels, creatinine clearance,
alkaline phosphatase, hematocrit, urea and bilirubin, along with dosage, CyA blood
concentration and post-transplantation days. Each pattern was formed by the
present and past values of these variables in order to perform one-step-ahead
prediction. We split the data into two groups: two-thirds of the patients
were used to train the models and the rest for their validation using
cross-validation.
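The pattern construction just described (present and past values of the covariates predicting the next concentration) can be sketched as follows; the helper name and lag depth are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def make_patterns(visits, target_col, n_past=1):
    """Build one-step-ahead patterns for one patient.

    visits: (n_visits, n_factors) array, time-ordered; target_col indexes
    the CyA blood concentration among the collected factors.
    """
    X, y = [], []
    for t in range(n_past, len(visits) - 1):
        X.append(visits[t - n_past:t + 1].ravel())  # past and present factors
        y.append(visits[t + 1, target_col])         # next CyA concentration
    return np.array(X), np.array(y)
```

Each row of X stacks n_past + 1 consecutive visits; the corresponding y entry is the concentration at the following visit.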
The high inter-subject variability (coefficient of variation, CV = 31%) led us to
first build clusters and then individual predictive models for each of
them. We used the well-known K-means clustering algorithm and selected
the optimal partition by evaluating the root-mean-square error (RMSE) of
dedicated models through 3-fold cross-validation experiments. Four clusters were
identified with this methodology. Since the second cluster was the largest (42%
of the patterns), models yielded poor results on it (RMSE > 60 ng/mL), and its
variability remained high (CV = 27%), we decided to re-cluster it.
Once again, a four-cluster partition was employed for the subsequent
prediction. The clustering reduced the RMSE and revealed post-operative days,
creatinine clearance, CyA blood concentration and serum alkaline phosphatase
as decisive factors. In Table 1, results are benchmarked against a multilayer
perceptron (MLP) trained with the familiar back-propagation algorithm and against
the Elman recurrent neural network [5], both with and without a previous clustering stage.
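A minimal sketch of this partition-selection loop (K-means plus per-cluster 3-fold cross-validated RMSE); scikit-learn, all parameter values, and the synthetic data standing in for the patient patterns are assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import NuSVR

# Hypothetical patterns: 5 covariates, target nearly linear in the first one
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(200)

best_k, best_rmse = None, np.inf
for k in (2, 3, 4, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    rmses = []
    for c in range(k):
        m = labels == c
        if m.sum() < 6:  # too few samples for 3-fold cross-validation
            continue
        scores = cross_val_score(NuSVR(nu=0.5, C=10.0), X[m], y[m],
                                 cv=3, scoring="neg_root_mean_squared_error")
        rmses.append(-scores.mean())  # scores are negated RMSEs
    rmse = float(np.mean(rmses))
    if rmse < best_rmse:
        best_k, best_rmse = k, rmse
```

The partition whose dedicated per-cluster models achieve the lowest mean cross-validated RMSE is retained, mirroring the selection criterion described above.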

The Elman network fails with the clustering approach since NUS becomes more
evident (samples from the same patient can fall in different clusters) and thus
its context neurons do not deal efficiently with past time samples. The ν-SVR
model outperforms the MLP since the CV[%] is successively reduced in each
partition. The percentage of blood levels accurately predicted (%BLAP), within
a fixed error margin of 20%, reached 70%, which is an excellent result considering
the time-series characteristics of our population. Figure 1 shows predictions
for two validation patients.
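The %BLAP figure of merit can be computed as follows (the helper name is hypothetical; the 20% margin is the one fixed above):

```python
import numpy as np

def blap(y_true, y_pred, margin=0.20):
    """Percentage of Blood Levels Accurately Predicted: a prediction
    counts as a hit when its relative error is within the margin."""
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return 100.0 * np.mean(rel_err <= margin)
```

For example, with measured levels [100, 200, 300] ng/mL and predictions [110, 250, 301] ng/mL, two of the three predictions fall within the 20% margin.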
There is, nevertheless, a 14% of patients with poor predictions, which can be
due to errors in drug dosage administration, to errors in recording blood sampling
times, or to abrupt changes in the patient's clinical condition. An additional
hypothesis is some kind of liver dysfunction, since alkaline phosphatase
proved to be a critical clustering factor. If we discard these patients, %BLAP
increases to 88% with the ν-SVR and to 75% with the MLP. Support Vectors
are mainly placed in the early post-operative period (68% of them within the
first three months) and change cluster in accordance with high CV[%] values.

In this paper we have proposed the combination of clustering and a state-of-the-art
regression technique for knowledge gain and accuracy improvement in a complex
pharmacokinetic prediction problem. By means of clustering we can identify
patients' state and their future evolution, specify confidence intervals for each
cluster prediction, and distinguish relevant from irrelevant patient factors at
every stage. The power and versatility of SVR machines allowed fast and reliable
predictions. Our future work is aimed at benchmarking Incremental Learning using
SVR against Non-Linear Mixed-Effects Modelling (NONMEM), a method commonly
used in population pharmacokinetics.

[1] Lindholm A., “Factors influencing the pharmacokinetics of cyclosporine in
man,” Therapeutic Drug Monitoring, vol. 13, no. 6, pp. 465–477, Nov 1991.

[2] V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, New
York, 1998.

[3] B. Schölkopf, P. L. Bartlett, A. Smola, and R. Williamson, "Shrinking the
tube: a new support vector regression algorithm," in Advances in Neural
Information Processing Systems 11, M. J. Kearns, S. A. Solla, and D. A.
Cohn, Eds., Cambridge, MA, 1999, pp. 330-336, MIT Press.

[4] F. Pérez-Cruz and A. Artés-Rodríguez, "An IRWLS procedure for ν-SVR,"
in Proceedings of the ICASSP'01, Salt Lake City, Utah, U.S.A., May 2001.

[5] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall.

Figure 1: CyA trough concentration predictions within two validation patients.

Table 1: Mean error (ME [ng/mL]) and root-mean-square error (RMSE [ng/mL])
of the models, both for training and validation.

Source: http://www.kernel-machines.org/papers/upload_23902_IEE2002CyA.pdf
