## Doi:10.1016/j.theochem.2005.06.032

Journal of Molecular Structure: THEOCHEM 731 (2005) 73–81
Prediction of antifungal activity by support vector machine approach
Shi-Wei Chen^a, Ze-Rong Li^b,*, Xiang-Yuan Li^a,**
^a College of Chemical Engineering, Sichuan University, Chengdu 610064, People’s Republic of China
^b College of Chemistry, Sichuan University, Chengdu 610065, People’s Republic of China
Received 13 December 2004; revised 30 June 2005; accepted 30 June 2005
A set of molecular descriptors, including electronic descriptors, topological descriptors, geometric descriptors and molecular shape indices, is calculated to characterize the structural and physicochemical properties of 94 chemical compounds: 42 antifungal active and 52 inactive. The Support Vector Machine (SVM) classification method is employed to model the discrimination between antifungal activity and inactivity for these compounds. The leave-one-out (LOO) cross-validation method is used to optimize the SVM model, and a genetic algorithm is used for variable selection, which reduces the number of molecular descriptors from 67 to 30. A five-fold cross-validation method and an independent evaluation set are used to test the SVM model. The training sets are effectively and evenly chosen in the descriptor space by clustering based on chemical similarity, and both test methods give results consistent with the LOO method. Compared to the LOO method, the 5-fold cross-validation method or the independent test method requires much less time for SVM model optimization, and LOO cross-validation is impractical for a very large data set. Our work suggests that a proper choice of training set for the 5-fold cross-validation method or the independent test method can give results consistent with the LOO method. A comparison of the SVM results with those of other statistical classification methods, for example k-nearest neighbor (k-NN) and C4.5 decision tree, using the same pre-selected molecular descriptors, is also conducted. Our investigation indicates the potential of SVM in facilitating the prediction of antifungal activity.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Support vector machines; Antifungal activity; Variable selection; Training set design; Genetic algorithm

* Corresponding authors. Tel.: +86 28 8540 3231; fax: +86 28 8540
** Tel.: +86 28 8540 5233; fax: +86 28 8540 7797.
E-mail addresses: lizrscu@yahoo.com.cn (Z.-R. Li), xyli@scu.edu.cn

During the past two decades, the prevalence of systemic fungal infections increased significantly due to the increasing use of broad-spectrum antibiotics, immunosuppressive agents, hyperalimentation products and central venous catheters, intensive care of low birth weight infants, organ transplantation, and the acquired immunodeficiency syndrome (AIDS) epidemic. Invasive fungal infections have become important causes of morbidity and mortality in immunocompromised patients. None of the existing systemic antifungals satisfies medical need completely. It is urgent to find new chemical compounds with a broad antifungal spectrum that can be taken as a starting point for development.

Quantitative structure-activity relationship (QSAR) represents an attempt to correlate structure descriptors of compounds with their biological activity. Conventional QSAR approaches have been challenged by combinatorial chemistry and high throughput screening (HTS), which are innovative techniques adopted by the pharmaceutical and agrochemical industries in an effort to reduce costs and shorten drug discovery timelines. HTS produces a large amount of screening data, which in most cases identifies compounds as either active or inactive. Compounds in HTS assays have diverse structures, which makes it difficult to analyze HTS data using conventional QSAR methods and make reliable predictions. Classification and pattern recognition have become a necessary step in QSAR to analyze the large amount of data produced by combinatorial chemistry and HTS. Since Lipinski’s rule of 5 should not be applied to antibiotics, antifungals, vitamins and cardiac
glycosides, a correspondingly simple classification into antifungal active and inactive is not advocated.

A lot of computational methods are available nowadays to discover new drugs with antifungal activity, such as multiple linear regression (MLR), linear discriminant analysis (LDA), comparative molecular similarity indices analysis (CoMSIA), genetic function approximation (GFA), artificial neural network (ANN), and so forth. In particular, the LDA model showed promising capability of antifungal activity prediction and achieved a prediction accuracy of 60% for the actives and of 98% for the inactives. A new statistical learning method, support vector machine (SVM), is very useful for classification of systems with multiple mechanisms, such as the prediction of blood-brain barrier penetration, P-glycoprotein substrates, HIV protease cleavage sites in protein, and protein fold recognition. In those cases, SVM is found to be superior to other methods.

Since thousands of molecular descriptors are available for QSAR analysis, and only a subset of them is statistically significant in terms of correlation with biological activity for a particular QSAR, deriving the optimal subset for a QSAR model through variable selection needs to be addressed. Several variable selection techniques, including simulated annealing, recursive feature elimination, and genetic algorithm, have been used to select variables to improve classification performance.

This work applies SVM as a pattern recognition method for the antifungal activity of compounds. A set of molecular descriptors, including a series of molecular shape descriptors defined by us, is calculated to characterize the structural and physicochemical properties of known antifungals and inactive compounds. The compounds are used for developing the SVM system of prediction. A widely used feature selection method, genetic algorithm (GA), is adopted for the variable selection in order to find the informative features. The trained SVM system is then used to classify the chemical compounds as active or inactive. The classification accuracy of this system is evaluated using two methods: an independent set and 5-fold cross-validation. The training sets are effectively and evenly chosen in the descriptor space by clustering based on the structure similarity of the compounds, and the results by the SVM method are compared with those by the decision tree method and the k-nearest neighbor method, using the same sets of compounds and molecular descriptors.

The compounds investigated in this work are listed in Table 1: 42 compounds with antifungal activity and 52 inactive. The antifungal compounds contain allylamines, imidazoles, thiocarbamates, triazoles and other derivatives, showing a great diversity in molecular structure.

In our statistical investigation, 67 descriptors are calculated to encode structural and physicochemical properties of molecules, including topological descriptors, electronic descriptors, geometrical descriptors, descriptors based on charged-partial surface area and a series of molecular shape descriptors defined by us.

The shape descriptors are built from moments of order K, with $K = n_1 + n_2 + n_3$ ($n_1, n_2, n_3 = 0, 1, 2, 3, \ldots$), where $\int d\tau$ represents an integral over the body, which may be a volume or a surface. As defined, these quantities depend on rotation and translation of the molecules. Therefore, Covell and co-workers defined the K order invariant moment ($K = 2n$), $G_{2n}$, which is invariant with respect to rotations and translation after translating the molecules to the center-of-mass of the body. Based on the above, we define a new K order moment shape index, given in Eq. (3). In Eq. (3), K(real) is the moment of the actual molecule and K(sphere) the moment of a sphere with the same volume as the actual molecule. c(0) is the usually defined ovality. These molecular shape descriptors encode the degree of the deviation of the shape of one molecule from a sphere. The programs computing the new descriptors described here are written by the authors using Fortran, but they are still under development.

In this work, atomic charges are calculated according to the partial equalization of orbital electronegativity (PEOE) method of Gasteiger et al. Electronic descriptors are all based on these PEOE charges. The descriptors used in this work are tabulated. All of these descriptors for each compound are computed using our own molecular descriptor program.

SVM is a supervised machine learning technique for learning classification and regression rules from data. A good introduction to SVM can be found in the literature. Here we give only the main idea of SVM for classification.
Table 1. Prediction from SVM by LOO and 5-fold cross-validation
* Incorrectly classified compounds in the test set and independent evaluation set.
a The number of the antifungal active compound in this paper.
c Tr, training set; Ts, test set; Ind, independent evaluation set.
d The number of the inactive compound in this paper.

Second order moment shape index (based on molecule surface area)
Consider a training data set $\{x_i, y_i\}$, $i = 1, 2, \ldots, N$, where $y_i \in \{-1, 1\}$ is the class label of an arbitrary sample $x_i \in R^d$, $d$ being the dimension of the input space. If the training data are linearly separable, the original SVM finds the separating hyperplane which maximizes the margin between positive samples ($y_i = 1$) and negative samples ($y_i = -1$). This can be solved through the minimization of $\|w\|$, the norm of the normal vector of the hyperplane, under the constraint that every training sample lies on the correct side of the margin.
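
For reference, the maximum-margin problem described here is usually written in the following textbook form. This is a sketch of the standard formulation, not a reproduction of the paper's own numbered equations; the symbols $b$, $\alpha_i$ and the particular parameterization of the Gaussian kernel by $\sigma$ are assumptions.

```latex
% Standard SVM formulation (textbook form; b, \alpha_i and the \sigma
% parameterization of the Gaussian kernel are assumed notation).
\begin{align}
  & w \cdot x + b = 0
    && \text{separating hyperplane} \\
  & \min_{w,\,b} \ \tfrac{1}{2}\,\|w\|^{2}
    \quad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1,\ \ i = 1, \ldots, N
    && \text{margin maximization} \\
  & f(x) = \operatorname{sgn}\!\Big( \sum_{i=1}^{N} \alpha_i\, y_i\, (x_i \cdot x) + b \Big)
    && \text{linear decision function} \\
  & f(x) = \operatorname{sgn}\!\Big( \sum_{i=1}^{N} \alpha_i\, y_i\, k(x_i, x) + b \Big)
    && \text{kernel decision function} \\
  & \max_{\alpha} \ \sum_{i=1}^{N} \alpha_i
      - \tfrac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j\, y_i y_j\, k(x_i, x_j)
    \quad \text{s.t.} \quad \alpha_i \ge 0,\ \ \sum_{i=1}^{N} \alpha_i y_i = 0
    && \text{dual problem} \\
  & k(x, y) = \exp\!\Big( -\frac{\|x - y\|^{2}}{2\sigma^{2}} \Big)
    && \text{Gaussian kernel}
\end{align}
```

Minimizing $\tfrac{1}{2}\|w\|^2$ is equivalent to minimizing $\|w\|$, and the margin between the two classes equals $2/\|w\|$, which is why a smaller norm means a wider margin.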
In the resulting decision function, sgn is the sign function. A linear classifier may not be the most suitable hypothesis for the two classes. For a non-linear classifier, the SVM maps the data to some higher dimensional feature space by introducing a kernel function k(x, y) and constructing a separating hyperplane in this space. The resulting decision function is expressed through the kernel, and the solution is obtained by maximizing the corresponding objective function. As for the kernel function, widely used forms include the Gaussian kernel adopted in this work.

For the model selection in SVM, the SVM minimizes a bound on the expected generalization error for a test set, instead of minimizing error on the training set. This is accomplished by minimizing a composite error, comprised of the training error plus a regularization term relating to the complexity of the model.

Genetic algorithm (GA) models the process of natural evolution in which species with a high fitness can prevail and survive to the next generation, and the best species can be adapted by crossover and/or mutation to search for better individuals. In this work, a chromosome and its fitness in the species represent the encoding of a set of molecular descriptors and the predictive accuracy of the SVM model, respectively. We use the leave-one-out (LOO) cross-validation method to evaluate the average generalization ability of the SVM model. The optimization of the model includes the optimization of the exponent σ in the Gaussian kernel and the optimal choice of molecular descriptors. Our algorithm for the model optimization consists of four steps for a given σ parameter in SVM.

The first step is the creation of an initial population. The initial population of chromosomes is created by setting all bits in each chromosome to a random value (1 or 0). Bit ‘1’ denotes a selection of a variable, and bit ‘0’ denotes a non-selection. The size of the initial population is 30 in this work. Secondly, the fitness of each chromosome is evaluated by the cross-validated predictive accuracy of the SVM model. Thirdly, a new population is created by one-point crossover and mutation of the chromosomes with the highest fitness, which are selected from the population in an arbitrary proportion. A typical mutation rate of 1% and crossover rate of 25% are chosen. Finally, we return to the second step until the number of generations reaches a given maximum. After these steps, an optimal subset of molecular descriptors is obtained for a given σ. By the optimization of σ, the SVM model which gives the best prediction accuracy is obtained.

There are various ways to measure the prediction performance, and some of them are more suitable than others, depending on the application considered. The most common measure of overall performance is Qtotal, which is the fraction of correctly predicted antifungal active and inactive compounds among all predictions:

$$Q_{total} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively. To get a measure of the sensitivity of prediction performance, $SE = TP/(TP + FN)$, the fraction of correctly predicted antifungal compounds among observed antifungal compounds, is used. Similarly, to get a measure of the specificity, $SP = TN/(TN + FP)$, the fraction of correctly predicted inactive compounds among observed inactive compounds, is used. These measures quantify the prediction ability in the SVM model optimization through the LOO cross-validation.

Because of the limitation of the LOO method for SVM model optimization on a large data set, two widely used additional methods are employed in the present work to evaluate the SVM model. In the first method, the compounds are divided into three sets: a training set, a test set and an independent set. The training set is used to train the SVM, the test set is used to optimize the exponent in the Gaussian kernel function by maximizing the generalization ability, and the independent set is used to test the prediction ability of the final model.

In the selection of compounds for a training set, one molecular descriptor corresponds to one dimension of the multidimensional space. In order to sample the space effectively and evenly, we use the k-means clustering algorithm for the training set design. At first the compounds are divided into a given number of clusters according to the size of the training set, using the k-means clustering algorithm. The distance between two compounds in chemical space is the Euclidean distance. In this way, compounds that are close in the space are in the same cluster. Then the compound nearest to the center of each cluster is chosen as a member of the training set. The compounds left after the selection of the training set are finally divided into test set and independent set at random.
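
The training set design described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names (`farthest_point_init`, `kmeans`, `design_sets`), the deterministic farthest-point seeding of the cluster centers, and the even split of the remaining compounds between test and independent sets are all hypothetical choices that the text does not specify.

```python
import math
import random


def farthest_point_init(points, k):
    """Seed k centers deterministically (assumed heuristic): start at the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, n_iter=100):
    """Plain Lloyd-style k-means using the Euclidean distance."""
    centers = farthest_point_init(points, k)
    assign = None
    for _ in range(n_iter):
        new_assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign


def design_sets(points, n_train, seed=0):
    """One cluster per training compound: pick the compound nearest to each
    cluster center, then split the rest at random (50/50 is an assumption)."""
    centers, assign = kmeans(points, n_train)
    train = []
    for c, center in enumerate(centers):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            train.append(min(members, key=lambda i: math.dist(points[i], center)))
    rest = [i for i in range(len(points)) if i not in train]
    random.Random(seed).shuffle(rest)
    half = len(rest) // 2
    return sorted(train), sorted(rest[:half]), sorted(rest[half:])


# Toy descriptor space: three well-separated groups of three "compounds".
points = [(0, 0), (1, 0), (0, 1),
          (10, 0), (11, 0), (10, 1),
          (0, 10), (1, 10), (0, 11)]
train, test, ind = design_sets(points, n_train=3)
print("training:", train)  # one compound from each group: [0, 3, 6]
print("test:", test, "independent:", ind)
```

In practice the descriptors would normally be scaled to comparable ranges before clustering, since the Euclidean distance is otherwise dominated by the descriptors with the largest numerical spread; the text does not say whether such scaling was applied.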

The compounds of the training set, the test set and the independent set are listed in Table 1 and labeled as Tr, Ts and Ind, respectively.

The second method is the 5-fold cross-validation method. The data set of 94 compounds is divided into five subsets of nearly equal size. One subset is used as the test set and all samples in the other four subsets are used as the training set. This procedure is repeated until every subset has been used once as the test set.

Clustering technique is also used for the subset design in the 5-fold cross-validation. However, here the clustering is done manually rather than by computer, based on the structure similarity. The antifungal compounds are divided into five clusters: the first cluster contains imidazoles and triazoles, the second cluster is composed of derivatives of benzothiazoles and benzimidazoles, allylamines and thiocarbamates are classified into the third cluster, the fourth cluster contains the derivatives of naphthyl and quinoline, and finally the compounds left enter the fifth cluster. In the same way, the inactive compounds are divided into eight clusters based on their structural similarity. The compounds in every cluster are divided evenly into five subsets at random, so the compounds in every subset can span the chemical space well. In this way, each training set in the 5-fold cross-validation can sample the space evenly.

2.7. Comparison with other statistical classification methods

In order to evaluate the SVM method for the antifungal activity prediction, we also used the k-nearest neighbor (k-NN) and C4.5 decision tree to model the antifungal activity of these compounds for comparison.

In k-NN, the Euclidean distance between the unclassified vector x and each individual vector x_i in the training set is measured, and the k nearest vectors to x are used to determine its class. The class of the majority of the k nearest neighbors is chosen as the predicted class of x. The important parameter, the number of nearest neighbors, can affect the outcome. We found that the nearest neighbor (k = 1) gives the best prediction.

A C4.5 decision tree is a formalism expressing mappings from feature values to classes (predictions), and consists of attribute nodes that link to subtrees and leaves labeled with a class. The C4.5 decision tree uses recursive partitioning to examine every attribute of the data and rank them according to their ability to partition the remaining data, and thus a decision tree is constructed. Instances are sorted down the tree from the root node to some leaves. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of this attribute.
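
The k-NN rule, leave-one-out evaluation and the performance measures Qtotal, SE and SP used in this paper can be sketched as follows. This is an illustrative sketch in plain Python, not the authors' code; the function names and the toy data are hypothetical. The second usage example plugs in the counts reported for the LOO run of the SVM model (9 of 42 antifungals and 1 of 52 inactives misclassified) to check the quoted percentages.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k training vectors nearest to x (Euclidean).
    On a tied vote the class seen first, i.e. the nearer one, wins."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out(xs, ys, k=1):
    """Predict every sample from a model built on all the other samples."""
    return [knn_predict(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], k)
            for i in range(len(xs))]


def performance(y_true, y_pred, positive=1):
    """Overall accuracy, sensitivity and specificity from the four counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "Q_total": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn) if tp + fn else float("nan"),
        "SP": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy example: two well-separated classes, 1-NN evaluated by leave-one-out.
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
ys = [0, 0, 0, 1, 1, 1]
print(performance(ys, leave_one_out(xs, ys, k=1)))

# Counts from the reported LOO run: TP = 33, FN = 9, TN = 51, FP = 1.
y_true = [1] * 42 + [0] * 52
y_pred = [1] * 33 + [0] * 9 + [0] * 51 + [1] * 1
m = performance(y_true, y_pred)
print({name: round(100 * value, 1) for name, value in m.items()})
```

With those counts the three measures come out at 89.4%, 78.6% and 98.1%, which agrees with the overall accuracy of 89.4% and the per-class accuracies of 78.6% and 98% quoted for LOO.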
The overall prediction accuracy (Qtotal) is 89.4% after 242 GA generations using the LOO cross-validation method for an optimal Gaussian exponent, and 30 molecular descriptors are selected (marked with asterisks in the descriptor table). These descriptors mainly consist of topological and shape descriptors. To evaluate the effect of variable selection on the classification accuracy of the SVM model, a 5-fold cross-validation is conducted using the selected 30 descriptors (termed the SVM+GA model) and then all the 67 descriptors (termed the SVM model). The results are given in Table 5. The average accuracies for antifungals and inactive compounds, and the overall accuracy are, respectively, 91.0, 81.8 and 84.0% for the SVM model and 97.1, 85.2 and 89.4% for the SVM+GA model, showing an obvious improvement with GA variable selection. Our investigations suggest that GA is useful for removing redundant descriptors, and helpful for the computational efficiency of the SVM.

Table 5. SVM and SVM+GA prediction accuracy by using 5-fold cross-validation

In this work, the test with SVM is performed using the methods described in Section 2.6, i.e. the independent set and 5-fold cross-validation methods. For the independent set and 5-fold cross-validation methods, the prediction accuracies of the antifungal activity are found to be 100 and 97.1% and those of the inactive compounds 77.8 and 85.0%, respectively. Both methods give consistent results and demonstrate the stability of the SVM system.

Only a tentative comparison can be made to provide some crude estimate regarding the approximate level of accuracy of our method with respect to those obtained by other studies, because of differences in the use of descriptors and classification methods. García-Domenech et al. modeled the antifungal activity for these compounds using LDA with topological descriptors by LOO and got a prediction accuracy of over 60% for the active and 98% for the inactive. Our SVM modeling with the descriptors selected using GA gives an overall prediction accuracy of 89.4% and a prediction accuracy of 78.6% for the active and 98% for the inactive by LOO. Results show that for both of the validation methods, nine antifungal compounds, Chlordantoin, Ciclopirox, Cloxyquine, Fluconazole, Flucytosine, Hexetidine, Itraconazole, Nifuratel and Saperconazole, are misclassified as inactive compounds, and one inactive compound, Tamoxifen, is misclassified as antifungal (the compounds misclassified by LOO cross-validation). These ten compounds are always misclassified by the SVM system using the 30 selected descriptors. Therefore, our analysis suggests that their incorrect classification arises from an inadequate description of the detailed configuration. In the same way, the six compounds incorrectly classified using the independent validation are included in the ten compounds, except Clonidine.

Table 6. Performance comparison of different classification methods by using independent evaluation set
See the text for the definitions of symbols.

3.3. Effect of cluster analysis on classification accuracy

In 5-fold cross-validation, each subset is built through clustering and reflects the distribution of the data approximately in the same manner. With SVM+GA, the prediction accuracies of antifungal activity are 100.0, 100.0, 100.0, 85.7 and 100.0%, whereas the accuracies of inactive groups are 91.7, 83.3, 83.3, 83.3 and 83.3% for the five subsets. It is clear that prediction for antifungal activity is better than for inactive groups. Also, it can be seen from Table 5 that SVM without
feature selection, i.e. the SVM model, shows the same trend. Therefore, it is suggested that the subsets designed through clustering can span the sample space effectively and evenly.

The misclassified compounds through LOO cross-validation and 5-fold cross-validation by the SVM+GA model are the same, although LOO takes more time than 5-fold cross-validation. This implies that cluster analysis can reduce the time for SVM model building without reducing the accuracy.

3.4. Comparison to k-NN and C4.5 decision tree methods

We compare the results of SVM with those of the k-NN and C4.5 decision tree methods by 5-fold cross-validation. The same compound sets and descriptors are used. The overall prediction accuracy is 89.4, 76.5 and 75.6% for SVM, k-NN and decision tree, respectively, clearly showing the superior classification ability of SVM. For the present data set of 42 antifungals and 52 inactive compounds, SVM, k-NN and C4.5 decision tree lead to false negatives of 2, 27 and 21%, and false positives of 21, 19 and 29%, respectively. This shows the considerably lower probability of a false negative than of a false positive with the SVM model. Furthermore, from the perspective of a computerized screening of molecular structures for subsequent synthesis and experimental testing, a greater risk to overlook an active structure is probably preferred to a greater risk for false prediction of biological activities. In Table 6, the prediction results of SVM are compared with those of the k-NN and C4.5 decision tree methods by using the independent evaluation set. The prediction accuracies for antifungal activity are 100.0, 75.0 and 80.0%, respectively, for SVM, k-NN and decision tree. Therefore, we can conclude that the SVM method is superior to the other methods when considering the risk for prediction.

Performance comparison of different classification methods by 5-fold cross-validation

In this work, the SVM method is employed for modeling the discrimination between antifungal activity and inactivity, and GA is applied for variable selection in order to reduce the noise generated by the use of overlapping and redundant molecular descriptors. Our investigations show that variable selection is helpful in enhancing the prediction of antifungal activity of chemical agents. The prediction ability of SVM is tested by the independent validation and 5-fold cross-validation. Clustering technique is employed to design the subsets for 5-fold cross-validation and for independent set validation, and our calculations show that the clustering technique is very helpful for improving the efficiency of SVM model building. Comparison of the SVM method with k-NN and C4.5 decision tree shows that the SVM method is a potential computation method for screening antifungal drug candidates.

This work is supported by the National Natural Science Foundation of China.

[1] I. Al-Mohsen, W.T. Hughes, Ann. Saudi. Med. 18 (1998) 28.
[2] H.G. Nafsika, Curr. Opin. Microbiol. 1 (1998) 547.
[3] A.H. Groll, T.J. Walsh, Swiss Med. Wkly. 132 (2002) 303.
[4] R. García-Domenech, I. Rios-Santamarina, A. Catalá, C. Calabuig, L. del Castillo, J. Gálvez, J. Mol. Struct. (Theochem) 624 (2003) 97.
[5] M.C. Mar, G.R. Félix, Lancet Infect. Dis. 2 (2002) 550.
[6] R. García-Domenech, A. Catalá-Gregori, C. Calabuig, G. Antón-Fos, L. del Castillo, J. Gálvez, Internet Electron J. Mol. Des. 1 (2002) 339–
[7] M.D. De-Bacher, P.T. Magee, J. Pla, Annu. Rev. Microbiol. 54
[8] J. Wölcke, D. Ullmann, Drug Discov. Today 6 (2001) 637.
[9] H. Gao, M.S. Lajiness, J.V. Drie, J. Mol. Graph. Model. 20 (2002)
[10] C.A. Lipinski, F. Lombardo, B.W. Dominy, P.J. Feeney, Adv. Drug.
[11] A.A.C. Pinheiro, R.S. Borges, L.S. Santos, C.N. Alves, J. Mol. Struct.
[12] V.M. Gokhale, V.M. Kulkarni, J. Med. Chem. 42 (1999) 5348.
[13] V.M. Gokhale, V.M. Kulkarni, Bioorgan. Med. Chem. 8 (2000) 2487.
[14] K. Hasegawa, T. Deushi, O. Yaegashi, Y. Miyashita, S. Sasaki, Eur.
[15] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer,
[16] C.J.C. Burges, Data Min. Knowl. Disc. 2 (1998) 127.
[17] M.W.B. Trotter, B.F. Buxton, S.B. Holden, Meas. Control 34
[18] Y. Xue, C.W. Yap, L.Z. Sun, Z.W. Cao, J.F. Wang, Y.Z. Chen, J. Chem. Inf. Comput. Sci. 44 (2004) 1497.
[19] R. Czermiński, A. Yasri, D. Hartsough, Quant. Struct.-Act. Relat. 20
[20] Y.D. Cai, X.J. Liu, X.B. Xu, K.C. Chou, J. Comput. Chem. 23
[21] C.H.Q. Ding, I. Dubchak, Bioinformatics 17 (2001) 349.
[22] J.M. Sutter, S.L. Dixon, P.C. Jurs, J. Chem. Inf. Comput. Sci. 35
[23] J.M. Sutter, J.H. Kalivas, Microchem. J. 47 (1993) 60.
[24] Y. Xue, Z.R. Li, C.W. Yap, L.Z. Sun, X. Chen, Y.Z. Chen, J. Chem.
[25] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Mach. Learn. 46
[26] H. Yu, J. Yang, W. Wang, J. Han, Proc. IEEE Comput. Soc.
[27] T.J. Hou, J.M. Wang, N. Liao, X.J. Xu, J. Chem. Inf. Comput. Sci. 39
[28] K. Hasegawa, T. Kimura, K. Funatsu, Quant. Struct. Act. Relat. 18
[29] J.K. Wegner, H. Fröhlich, A. Zell, J. Chem. Inf. Comput. Sci. 44
[30] J.K. Wegner, H. Fröhlich, A. Zell, J. Chem. Inf. Comput. Sci. 44
[31] H. Fröhlich, J.K. Wegner, A. Zell, QSAR Comb. Sci. 23 (2004) 311.
[32] M.L. Mansfield, D.G. Covell, J. Chem. Inf. Comput. Sci. 42
[33] J. Gasteiger, M. Marsili, Tetrahedron 36 (1980) 3219.
[34] M. Mortier, K. Van Genechten, J. Gasteiger, J. Am. Chem. Soc. 107
[35] C.J.C. Burges, Tutorial on Support Vector Machines for Pattern Recogni-
[36] B. Schölkopf, A Short Tutorial on Kernels, 2000
[37] B. Schölkopf, Support Vector Learning 1999 (
[38] M.W.B. Trotter, B.F. Buxton, S.B. Holden, Meas. Control 34 (2001) 235.
[39] R. Burbidge, M. Trotter, B. Buxton, S. Holden, Comput. Chem. 26
[40] R. Czerminski, A. Yasri, D. Hartsough, Quant. Struct.-Act. Relat. 20
[41] J.H. Holland, Adaptation in Natural and Artificial Systems, University
[42] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[43] I.P. Androulakis, V.A. Venkatasubramanian, Comput. Chem. Eng. 15
[44] J.E. Roulston, Mol. Pharmacol. 20 (2002) 153.
[45] P. Baldi, S. Brunak, Y. Chauvin, C.A. Andersen, H. Nielsen,
[46] C.J. Huberty, Applied Discriminant Analysis, Wiley, New York,
[47] R.A. Johnson, D.W. Wichern, Applied Multivariate Statistical Analysis, Prentice Hall, Englewood Cliffs, NJ, 1982.
[48] L. Breiman, J. Friedman, R. Olshen, P. Stone, Classification and Regression Trees, Wadsworth, Belmont, CA, 1984.
[49] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan
[50] R. Todeschini, V. Consonni, Handbook of Molecular Descriptors,
[51] H. Wiener, J. Am. Chem. Soc. 69 (1947) 17.
[52] L.H. Hall, L.B. Kier, Issues in representation of molecular structure,
[53] K.B. Lipkowitz, D.B. Boyd, Rev. Comput. Chem. 2 (1991) 401.
Source: http://ce.scu.edu.cn/xueyuanjiaoshizhuye/LiXiangyuan/script/PDF/Theochem-Chen.pdf
