A small tree depth tends to increase the applicability of a tree to new data sets, but at the risk of decreased accuracy and of failing to identify important features in the training data. The best tree depth should therefore be chosen according to the predictions for the test data. In the present study, the tree depth was varied from 3 to 20 and the corresponding performance of 234 RP models on the training and test sets was evaluated. Five-fold cross-validation was used to evaluate model robustness.
Molecular properties (MP) can depict whole-molecule properties, but they cannot characterize the important substructures or molecular fragments that play a key role in mTOR inhibition. Therefore, MP and molecular fingerprints were used in combination to establish RP models. Here, 216 RP models based on different combinations of 12 sets of fingerprints with MP were constructed and evaluated. The addition of fingerprints improves the performance of the RP models: the C values of the RP models based on fingerprints and MP are higher than those of the RP models based solely on MP. For each combination of fingerprints and MP, the best RP model was selected according to the C values obtained at the different tree depths. The best tree depth differs from one fingerprint to another. The best decision tree, with a tree depth of 12 and based on FPFP_4 and MP, is shown in Figure 5. The discriminant descriptors include seven MPs and 18 structural fragments. Of the seven MPs chosen by the decision tree, AlogP and logD describe molecular hydrophobicity, Nrot and nAR describe the molecule's bulkiness, and MFPSA, MPSA and nHBAcc describe its electrostatic properties or hydrogen-bonding ability. In other words, molecular hydrophobicity, size and electrostatic properties are important for mTOR inhibition, which is consistent with previous 3D-QSAR results.
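The model counts quoted above follow from the size of the parameter grid, and the per-fingerprint selection by C value is a simple maximum over depths. The Python sketch below illustrates both steps. It is not the authors' workflow: the fingerprint names are hypothetical placeholders (the text names only FPFP_4 and LCFP_6), the helper names `model_grid` and `best_depth_per_set` are invented for illustration, and the reading that the 234 models are the 216 fingerprint-plus-MP models plus 18 MP-only models is an assumption inferred from the arithmetic (12 × 18 = 216 and 216 + 18 = 234).

```python
from itertools import product

# Tree depths 3..20 inclusive, as stated in the text (18 values).
DEPTHS = range(3, 21)

# Hypothetical placeholder names: the text only names FPFP_4 and LCFP_6.
FINGERPRINTS = [f"fingerprint_{i:02d}" for i in range(1, 13)]


def model_grid(fingerprints, depths, include_mp_only=True):
    """List (descriptor_set, depth) pairs, one per RP model."""
    grid = [(f"MP+{fp}", depth) for fp, depth in product(fingerprints, depths)]
    if include_mp_only:
        # Assumption: the remaining models use MP alone at each depth.
        grid += [("MP", depth) for depth in depths]
    return grid


def best_depth_per_set(c_values):
    """Map each descriptor set to the (depth, C value) pair with the highest C.

    c_values: dict keyed by (descriptor_set, depth) -> C value.
    Ties keep the shallower tree, a choice made here for illustration only.
    """
    best = {}
    for (name, depth), c in sorted(c_values.items()):
        if name not in best or c > best[name][1]:
            best[name] = (depth, c)
    return best
```

With these assumptions, `len(model_grid(FINGERPRINTS, DEPTHS, include_mp_only=False))` gives 216 and `len(model_grid(FINGERPRINTS, DEPTHS))` gives 234, matching the counts in the text.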
Moreover, the eighteen fragments based on the FPFP_4 fingerprint also play a key role in discriminating between mTOR inhibitors and non-inhibitors.
The naïve Bayesian classifier is a learner that has no fitting process and no tuning parameters, unlike the RP method, which is sensitive to predefined parameters such as tree depth. Bayesian learning searches through each feature in an unbiased way for those with separation power. As with the RP analysis, the performance of the naïve Bayesian classifier based on MP and fingerprints was evaluated. Detailed results are summarized in Table 2. The Bayesian score based on MP and LCFP_6 was used to evaluate the discrimination of inhibitors from non-inhibitors via bimodal histograms of the training and test data sets. As shown in Figure 7a, the p value associated with the difference in the mean Bayesian score of training set mTOR inhibitors versus non-inhibitors was 1.176 × 10^-221 at the 95% confidence level, suggesting that the two distributions are significantly different. The Bayesian scores of inhibitors tend to be more positive, while those of non-inhibitors tend to be more negative. Similar results were found for the 300 test compounds. For virtual screening, the Bayesian score can serve as a quantitative criterion for selecting new potential mTOR inhibitors. For both the training and test sets, the Bayesian scores of the two classes overlap between -20 and 0. This region can be defined as the “uncertain zone”: when the Bayesian score of a compound falls in this region, the prediction for that compound may not be reliable. In other words, a Bayesian score greater than zero can be used as a cutoff value to select new mTOR inhibitors in a virtual screening project.
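The score-based selection rule described above can be written as a small triage function. The Python sketch below is illustrative only: the function name `triage_by_bayesian_score` and the label strings are hypothetical, the lower bound of the uncertain zone is read here as -20, and treating both zone boundaries as part of the uncertain zone is an assumption, since the text does not say how boundary scores are handled.

```python
def triage_by_bayesian_score(score, zone_low=-20.0, cutoff=0.0):
    """Assign a screening label from a Bayesian score.

    Scores above `cutoff` are selected as predicted inhibitors.
    Scores from `zone_low` up to and including `cutoff` fall in the
    overlap region, where the prediction may be unreliable.
    Scores below `zone_low` are treated as predicted non-inhibitors.
    """
    if score > cutoff:
        return "predicted inhibitor"
    if score >= zone_low:
        return "uncertain"
    return "predicted non-inhibitor"


# Made-up example scores, not values from the study.
labels = [triage_by_bayesian_score(s) for s in (12.4, -3.0, -35.2)]
# labels == ["predicted inhibitor", "uncertain", "predicted non-inhibitor"]
```

Keeping the two thresholds as parameters makes it easy to tighten or widen the uncertain zone for a different data set.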
The nitrogen atoms can serve as strong hydrogen acceptors and form stable H-bonding interactions with the mTOR