Small Molecule Drug and Biotech Drug Interaction Prediction Based On Multi-Modal Representation Learning
*Correspondence: dongxinsmmu@126.com; jiangx@shu.edu.cn
1 School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
2 School of Medicine, Shanghai University, Shanghai 200444, China
Abstract
Background: Drug–drug interactions (DDIs) occur when two or more drugs are taken
simultaneously or successively. Early detection of adverse drug interactions can be
essential in preventing medical errors and reducing healthcare costs. Many computational
methods already predict interactions between small molecule drugs (SMDs). As
the number of biotechnology drugs (BioDs) increases, so does the threat of interactions
between SMDs and BioDs. However, few computational methods are available to
predict their interactions.
Results: Considering the structural specificity and relational complexity of SMDs
and BioDs, a novel multi-modal representation learning method called Multi-SBI is
proposed to predict their interactions. First, multi-modal features are used to ade-
quately represent the heterogeneous structure and complex relationships of SMDs
and BioDs. Second, an undersampling method based on Positive-unlabeled learning
(PU-sampling) is introduced to obtain negative samples with high confidence from
the unlabeled data set. Finally, both learned representations of SMD and BioD are fed
into DNN classifiers to predict their interaction events. In addition, we also conduct a
retrospective analysis.
Conclusions: Our proposed multi-modal representation learning method can extract
drug features more comprehensively in heterogeneous drugs. In addition, PU-sampling
can effectively reduce the noise in the sampling procedure. Our proposed method
significantly outperforms other state-of-the-art drug interaction prediction methods. In
a retrospective analysis of DrugBank 5.1.0, 14 out of the 20 predictions with the highest
confidence were validated in the latest version of DrugBank 5.1.8, demonstrating that
Multi-SBI is a valuable tool for predicting new drug interactions through effectively
extracting and learning heterogeneous drug features.
Keywords: Drug–drug interactions, Multi-modal representation learning, PU-sampling
Introduction
DDIs refer to the phenomenon in which one drug alters the pharmacological effects
of another drug when two or more drugs are taken simultaneously or sequentially [1].
DDIs may lead to unexpected adverse drug side effects [2]. Early detection of DDIs
© The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits
use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original
author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third
party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the mate-
rial. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or
exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://
creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publi
cdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Huang et al. BMC Bioinformatics (2022) 23:561 Page 2 of 16
can effectively prevent medical errors and reduce healthcare costs. Early on, research-
ers identified DDIs by wet experiments and later used high-throughput screening and
in vivo models. However, these methods are time-consuming and labor-intensive, so
systematic combinatorial screening of potential DDIs remains challenging. To reduce
the cost in time and money, computational methods are gaining more attention. Early
researchers collected drug data from the literature, reports, etc., to predict DDIs, and
some proposed machine learning methods to predict DDIs [3].
The current DDI prediction methods based on machine learning are broadly classified
into similarity-based and network-based methods. Similarity-based methods assume
that drugs with similar properties interact with the same drugs [4]. Early research used
molecular structure similarity information to identify new DDIs [4]. Since molecular
structure information alone is insufficient to express drug characteristics, Gottlieb et al. [5]
established a DDI prediction model by integrating multiple drug similarity measures. Moreover, four
classifiers were adopted to construct predictive models simultaneously [6]. With the
advancement of deep learning research, DeepDDI [7] used the drug name and chemical
structure as inputs to the deep neural network (DNN) to predict the DDI types of drug
pairs and drug-food component pairs. DDIMDL [8] constructed four sub-models
using features of each drug and used joint deep learning DNNs to predict DDI-related
events. The latest study combines two drugs in four different ways. It feeds the combined
drug feature representation into four different drug fusion networks to obtain the latent
feature vectors of the drug pairs [9]. The network-based method converts the graph into
a low-dimensional space that preserves the information of the structural graph and then
uses the learned low-dimensional representation as a feature for prediction. Zhang et al. [10]
constructed a network based on chemical structure and side effect similarities of drugs and
applied a label propagation algorithm to identify DDIs. Decagon, a graph convolutional
neural network, was designed to run on large multi-modal graphs [11]. Based on
this model, the tri-graph information propagation (TIP) model improved prediction
accuracy as well as time and space efficiency [12].
Generally, most of the state-of-the-art methods mentioned above only predict whether
there exists a DDI between a pair of SMDs. As the number of biotech drugs (BioDs)
increases, so does the threat of adverse interactions between SMDs and BioDs. Biologics
are medicines derived from living cells or biological processes [13, 14]. Unlike the
relatively simple structure of SMDs, the structural complexity of biologics makes the
characterization of SMD and BioD drug pairs difficult [15]. Besides that, most methods
straightforwardly employ random sampling in unlabeled data for generating negative
samples, resulting in many false negatives in the sampled negative samples [16, 17].
To overcome these limitations, we propose a multi-modal representation learning
method called Multi-SBI for predicting the interaction between SMDs and BioDs. Con-
sidering the structural specificity and relational complexity of SMDs and BioDs, we first
apply multi-modal representation learning to learn drug features thoroughly. On the one
hand, it takes the one-dimensional sequence information of two types of drugs as input.
It learns the sequence features separately through traditional methods such as convo-
lutional neural networks (CNN). On the other hand, the association information of all
drug nodes in the heterogeneous network is encoded as a one-dimensional feature vec-
tor. Then, we adopt the PU-sampling to select high-confidence negative samples, which
can reduce sampling noise. Finally, the dimensionality-reduced drug pair features of the
different modalities are fed into DNN classifiers to predict new SMD-BioD interactions (SBIs).
In the SBI prediction experiment on the public data set, the fully designed Multi-SBI
has a higher accuracy rate and performs better than several state-of-the-art methods. In
addition, in retrospective analysis, the high-confidence SBI predicted by the Multi-SBI
model has been verified by the latest version of the DrugBank database, proving that
our model has solid predictive capabilities. To summarize, the main contributions of this
paper are:
The rest of this paper is structured as follows. The “Methods” section introduces the
basic concepts and processes of Multi-SBI. In addition, the experiments are analyzed in
the “Experiments” section. Next, the Multi-SBI is analyzed and verified through various
experiments in the “Discussion” section, finally showing the retrospective analysis. In
the “Conclusion” section, the work that has been carried out and the direction of future
research are summarized.
Methods
Problem description
As shown in Fig. 1a, conventional DDI prediction focuses on SMDs, containing only one
type of drug node and drug-protein association, and drug features consist only of structural
forms such as SMILES. In comparison, as shown in Fig. 1b, adding BioDs yields three types of
nodes and five types of associations, which makes SBI prediction more complex. Furthermore,
BioDs are composed of amino acid sequences, which differ from the structure of SMDs. The other
problem is that there are no accurately annotated negative samples in the database,
which means the prediction results depend on the sampling strategy. To solve the above
problems, we use multi-modal representation learning to learn complex drug pair features
and apply the PU-sampling method to deal with imbalanced data.
Fig. 1 Two DDI diagrams. a The traditional drug interaction (SSI) prediction task contains one type of drug
node and two types of node associations. b Two types of drug nodes and five types of node associations are
included in the SMD-BioD interaction (SBI) prediction task
Fig. 2 The overall workflow of Multi-SBI. a Multi-modal representation learning obtains structure and
network topology features from the diverse drug types. b PU-sampling is introduced to obtain negative
samples with high confidence from the unlabeled data set. c Combining multi-modal data into the DNN
classifiers provides a complementary view of SBI
features from the sequence input (Structure/Sequence). After one-hot encoding the four
interconnected networks (SMD-protein interaction (SPI), BioD-protein interaction
(BPI), SMD-SMD interaction (SSI), and BioD-BioD interaction (BBI)), the similarity is
encoded into a heterogeneous network to fully characterize the relational topology of
the drugs.
Fig. 3 Two independent three-layer 1D-CNN blocks extract context structure information from different
drug sequence inputs. The length of the convolution filters is fixed to 8, while the filter numbers are 64, 128,
and 192, respectively
many filters as the first. The last layer is the maximum pooling layer. The outputs of the
maximum pooling layers are concatenated and fed into the three-layer DNN classifier.
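The block described in Fig. 3 can be illustrated with a minimal pure-Python sketch: three stacked 1D convolutions with filter length 8 and 64, 128, and 192 filters, followed by max pooling over the sequence axis. The random weights, the ReLU activation, the absence of padding, the input embedding size of 4, the sequence length of 30, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import random

def conv1d_relu(x, weights, bias):
    """Valid (unpadded) 1D convolution followed by ReLU.

    x:       list of time steps, each a list of in_channels floats
    weights: weights[f][k][c] for filter f, kernel offset k, input channel c
    bias:    one float per filter
    """
    kernel = len(weights[0])
    out = []
    for t in range(len(x) - kernel + 1):
        step = []
        for f, w in enumerate(weights):
            s = bias[f]
            for k in range(kernel):
                s += sum(wi * xi for wi, xi in zip(w[k], x[t + k]))
            step.append(max(0.0, s))
        out.append(step)
    return out

def global_max_pool(x):
    """Maximum over the time axis for every channel."""
    return [max(column) for column in zip(*x)]

def random_layer(rng, in_channels, n_filters, kernel=8):
    """Random weights for one convolution layer (illustration only)."""
    scale = 1.0 / (in_channels * kernel) ** 0.5
    w = [[[rng.uniform(-scale, scale) for _ in range(in_channels)]
          for _ in range(kernel)] for _ in range(n_filters)]
    return w, [0.0] * n_filters

def cnn_block(x, layers):
    """Apply the stacked convolutions, then pool to one value per filter."""
    for w, b in layers:
        x = conv1d_relu(x, w, b)
    return global_max_pool(x)

rng = random.Random(0)
in_channels, seq_len = 4, 30          # assumed toy sizes
layers, channels = [], in_channels
for n_filters in (64, 128, 192):      # filter numbers from Fig. 3
    layers.append(random_layer(rng, channels, n_filters))
    channels = n_filters

sequence = [[rng.random() for _ in range(in_channels)] for _ in range(seq_len)]
features = cnn_block(sequence, layers)
print(len(features))                  # one pooled value per filter of the last layer
```

With unpadded convolutions each layer shortens the sequence by 7 positions, so the toy input of length 30 shrinks to 23, 16, and 9 steps before pooling. Two such blocks, one per drug type, would produce the two sequence feature vectors.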
A critical problem of direct one-hot encoding is that the calculated topological rela-
tionship is not entirely accurate, partly because of the noisy, incomplete, and high-
dimensional nature of biological data. To speed up the prediction process and eliminate
noise as much as possible, we compress features to reduce sparsity. Instead of using bit
vectors, we use the Jaccard similarity metric to calculate paired drug–drug similarity
from bit vectors. Jaccard similarity is calculated by Eq. (1):
$$J(A, B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \qquad (1)$$
where A and B are the set forms of the bit vectors of the two drugs and |A ∩ B| is the
size of the intersection of A and B. Using Jaccard similarity, we convert the topological
features of SMDs and BioDs to 1941 and 148 dimensions, respectively (determined by the
number of drugs). Because the SMD features have 1941 dimensions, we use PCA to reduce
them to 512 dimensions.
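As a small illustration of Eq. (1), the sketch below turns bit vectors into sets of active positions and computes one Jaccard similarity per pair of drugs. The drug names and bit vectors are made-up toy data, and the helper names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets, as in Eq. (1)."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0

def bits_to_set(bits):
    """Set form of a bit vector: the positions whose bit is 1."""
    return {i for i, bit in enumerate(bits) if bit}

def similarity_profiles(bit_vectors):
    """Map each drug to its vector of similarities to all drugs."""
    names = sorted(bit_vectors)
    sets = {name: bits_to_set(bit_vectors[name]) for name in names}
    return {n: [jaccard(sets[n], sets[m]) for m in names] for n in names}

# Toy association bit vectors (hypothetical drugs and targets).
bit_vectors = {
    "drug_a": [1, 1, 0, 0, 1],
    "drug_b": [1, 0, 0, 0, 1],
    "drug_c": [0, 0, 1, 1, 0],
}
profiles = similarity_profiles(bit_vectors)
print(profiles["drug_a"])
```

Each drug receives one similarity value per drug in its group, which is why the feature dimension equals the number of drugs (1941 for SMDs and 148 for BioDs in the text).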
Finally, we obtain the drug pair feature consisting of two types of sequence features
and two types of topological features.
PU‑sampling
In some applications, such as drug interaction prediction, only positive cases are known
and labeled, while unlabeled data may include negative and unlabeled positive cases.
Previous methods used experimentally verified DDI as positive samples and randomly
generated negative samples to learn predictive models. However, randomly generated
negative samples may include unknown true positive samples. A classifier trained with
such randomly generated negative samples may produce high cross-validation accuracy,
but it is likely to perform poorly on an independent real test data set. Therefore, screening
highly reliable negative samples is essential to improve the effectiveness of computational
prediction methods [32].
As shown in Fig. 2b, to address the unbalanced data set problem in DDI prediction, we
introduce an undersampling method, PU-sampling, based on Positive-unlabeled learn-
ing (PU Learning) [33]. The core concept of PU Learning is converting positive and unla-
beled examples into a series of supervised binary classification problems discriminating the
known positive examples from random subsamples of the unlabeled set. As detailed in Fig. 4,
positive samples are labeled with red triangles. First, PU-sampling scores all unlabeled
examples with many simple decision tree classifiers. It then removes the low-confidence
negative drug pairs, which are drawn as light green circles. Finally, during training,
high-confidence samples are selected from the remaining unlabeled set in the same number
as the positives to compose a 1:1 balanced data set. As
introduced in the “Experiments” section, there are 148 BioDs and 1,941 SMDs in the data
set, generating 287,268 potential SBI drug pairs. However, only 40,959 SBI are verified posi-
tive in DrugBank. The remaining 246,309 are unlabeled. Here, we denote positive drug pairs
as set P, unlabeled drug pairs as set U, and selected high-confidence negative drug pairs as
N, correspondingly. The PU-sampling algorithm is as follows:
Finally, as the positive samples are 40,959, the same number of negative samples were
retained from 246,309 unlabeled drug pairs.
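The procedure described above can be sketched as a bagging loop in the spirit of PU learning [33]. This is a simplified illustration, not the authors' algorithm: depth-1 decision trees (stumps) stand in for the "simple decision tree classifiers", the removal threshold of 0.5, the number of rounds, and the two-dimensional toy features are all assumptions, and every function name is hypothetical.

```python
import random

def train_stump(X, y):
    """Fit a depth-1 decision tree by exhaustive search over thresholds."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                errors = sum(
                    (1 if polarity * (row[f] - thr) >= 0 else 0) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, thr, polarity)
    _, f, thr, polarity = best
    return lambda row: 1.0 if polarity * (row[f] - thr) >= 0 else 0.0

def pu_scores(P, U, n_rounds=50, seed=0):
    """Average out-of-bag 'positive' vote for every unlabeled example."""
    rng = random.Random(seed)
    totals, counts = [0.0] * len(U), [0] * len(U)
    k = min(len(P), len(U) - 1)      # keep at least one example out of bag
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(U)), k))
        X = P + [U[i] for i in bag]  # known positives vs. sampled unlabeled
        y = [1] * len(P) + [0] * len(bag)
        clf = train_stump(X, y)
        for i in range(len(U)):
            if i not in bag:
                totals[i] += clf(U[i])
                counts[i] += 1
    return [t / c if c else 0.5 for t, c in zip(totals, counts)]

def pu_sample(P, U, threshold=0.5, n_rounds=50, seed=0):
    """Drop likely positives, then keep the |P| most confident negatives."""
    scores = pu_scores(P, U, n_rounds, seed)
    kept = [i for i in range(len(U)) if scores[i] < threshold]
    kept.sort(key=lambda i: scores[i])
    return kept[:len(P)]

# Toy data: the last two unlabeled rows resemble the known positives.
P = [[0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
U = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [0.1, 0.0], [0.95, 0.9], [0.85, 1.0]]
selected = pu_sample(P, U)
print(selected)

# Pair counts reported in the text: 148 BioDs x 1,941 SMDs.
candidate_pairs = 148 * 1941
unlabeled_pairs = candidate_pairs - 40959
```

In the toy data the two unlabeled rows that look like positives receive high scores and are excluded, so the balanced negative set is drawn only from the four rows near the origin. The last two lines reproduce the counts in the text: 287,268 candidate pairs, of which 246,309 are unlabeled.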
DNN construction
Multi-SBI is designed as a multi-classification model that can predict multiple SBI types for
a given drug pair (multiple output neurons are activated simultaneously, and each neuron
represents one SBI type). In this work, we adopt a DNN as the multivariate classifier. Since
there are four types of features, we construct four sub-models, one for each type of feature,
using the DNN. The average operator combines the outputs from sub-models to produce
the final prediction.
Figure 2c shows that each prediction sub-model concatenates a pair of SMD and BioD
embedding vectors, which is input to the fully connected layer to calculate the interacting
probability. The output layer has 49 output neurons, representing the 49 classification types
considered in this study. These output neurons have activity values between 0 (no interac-
tion) and 1 (possible interaction), which can be considered a probability [34].
As shown in Fig. 2c, the DNN consists of three layers, with the number of nodes being
512, 256, and 49.
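The averaging step that combines the four sub-models can be sketched as follows. The example uses 5 output neurons instead of 49 to stay readable, the probability values are invented, and the decision threshold of 0.5 is an assumption that the text does not state.

```python
def average_predictions(sub_model_outputs):
    """Element-wise mean of the per-class outputs of all sub-models."""
    n_models = len(sub_model_outputs)
    return [sum(column) / n_models for column in zip(*sub_model_outputs)]

def predicted_events(probabilities, threshold=0.5):
    """Indices of output neurons whose averaged activity reaches the threshold."""
    return [i for i, p in enumerate(probabilities) if p >= threshold]

# One row per sub-model (sequence and topology features), one column per class.
outputs = [
    [0.9, 0.1, 0.7, 0.0, 0.2],
    [0.8, 0.2, 0.4, 0.1, 0.1],
    [0.7, 0.0, 0.6, 0.0, 0.3],
    [1.0, 0.1, 0.5, 0.1, 0.2],
]
final = average_predictions(outputs)
print(predicted_events(final))
```

Because each neuron is thresholded independently, several event types can be reported for the same drug pair, which matches the multi-output design described above.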
Experiments
Data resources
The number of drugs in the database has dramatically increased in the past few years.
The DrugBank [35] database integrates bioinformatics and chemoinformatics resources,
providing detailed drug data. We collect features about SBI and drugs from DrugBank 5.1.8
released in January 2021: molecular structure of SMD, amino acid sequence of BioD, SMD-
SMD interaction (SSI), BioD-BioD interaction (BBI), SMD-Protein Interaction (SPI), BioD-
Protein Interaction (BPI) and known SBI. We select drugs with at least one SBI and SPI, and
the experimental data obtained are shown in Table 1.
For SBI classification categories, we use a similar method in [8] to extract SBI and define
the expression of SBIs as a quaternary structure: (drug A, drug B, mechanism, action).
The "mechanism" means the effect of drugs in terms of metabolism, serum concentration,
therapeutic efficacy, and other aspects. The "action" means an increase or decrease of the
corresponding mechanism. With the above definition, we obtain 48 events to describe the
existing SBI types. It is worth noting that, to facilitate analysis, Deng et al. [8] deleted
the DDIs related to a single event and selected only events with more than 10 DDIs. Although
such label preprocessing simplifies program design and improves the accuracy of drug
interaction prediction, it is unreasonable in actual clinical trials. Therefore, to retain all
DDIs and perform cross-validation, we merged events with no more than 10 DDIs into a
single category to facilitate subsequent experiments.
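A minimal sketch of this labeling scheme is given below, assuming each SBI is stored as the quadruple (drug A, drug B, mechanism, action) and that an event is the (mechanism, action) pair. The record contents, the label numbering, and the function name are hypothetical; label 0 is left free for negative samples.

```python
from collections import Counter, namedtuple

SBI = namedtuple("SBI", ["drug_a", "drug_b", "mechanism", "action"])

def merge_rare_events(records, max_rare=10):
    """Give frequent events their own label and pool the rare ones."""
    counts = Counter((r.mechanism, r.action) for r in records)
    frequent = sorted(e for e, c in counts.items() if c > max_rare)
    label_of = {event: i + 1 for i, event in enumerate(frequent)}
    rare_label = len(frequent) + 1   # single category for all rare events
    labels = [label_of.get((r.mechanism, r.action), rare_label)
              for r in records]
    return labels, label_of, rare_label

records = [
    SBI("smd_1", "biod_1", "serum concentration", "increase"),
    SBI("smd_2", "biod_1", "serum concentration", "increase"),
    SBI("smd_3", "biod_2", "therapeutic efficacy", "decrease"),
    SBI("smd_4", "biod_2", "therapeutic efficacy", "decrease"),
    SBI("smd_5", "biod_3", "metabolism", "decrease"),
    SBI("smd_6", "biod_3", "metabolism", "increase"),
]
# max_rare=1 only so that the toy data shows the merge; the text uses 10.
labels, label_of, rare_label = merge_rare_events(records, max_rare=1)
print(labels)
```

No record is discarded: the two metabolism events, each seen once, end up in the shared rare category instead of being dropped.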
The sample counts for the 48 SBI events and for the negative samples (category 0) are shown
in Fig. 5. Because the data distribution is unbalanced, the negative samples and most positive
samples are concentrated on the left side of the histogram.
Evaluation metrics
We evaluate the prediction performance of Multi-SBI using a five-fold cross-validation pro-
cedure, in which 80% of the drug pairs are randomly selected as the training set, and the
remaining 20% of the drug pairs are used as the test set. The final performance of the model
takes the average of the five-fold results. For each fold of each prediction model, the follow-
ing indicators are calculated:
$$\mathrm{ACC} = \frac{TP + TN}{TP + FP + TN + FN} \qquad (2)$$
$$\mathrm{AUC} = \sum_{i=1}^{n} \mathrm{TPR}_i \, \Delta \mathrm{FPR}_i \qquad (3)$$
[Figure 5: histogram of sample counts. x-axis: Event Types (0 to 48); y-axis: Count. Bar values for
categories 0 to 48, in order: 40959, 9102, 8074, 3631, 3597, 2723, 1803, 1685, 1529, 1143, 1083, 978,
787, 564, 456, 354, 349, 319, 314, 239, 234, 219, 206, 164, 137, 110, 92, 91, 89, 84, 82, 71, 68, 67,
66, 64, 63, 46, 43, 40, 37, 33, 27, 22, 18, 17, 16, 12, 11]
Fig. 5 All classification categories (category 0 for negative samples and 1 to 48 for SBI types)
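$$\mathrm{AUPR} = \sum_{j=1}^{n} \mathrm{Pre}_j \, \Delta \mathrm{Rec}_j \qquad (4)$$
$$F1 = \frac{2 \cdot \mathrm{Sen} \cdot \mathrm{Pre}}{\mathrm{Sen} + \mathrm{Pre}} \qquad (5)$$
$$\mathrm{Pre} = \frac{TP}{TP + FP} \qquad (6)$$
$$\mathrm{Rec} = \mathrm{TPR} = \frac{TP}{TP + FN} \qquad (7)$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \qquad (8)$$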
where TP means true positive, TN means true negative, FP means false positive, FN
means false negative, Sen (sensitivity) is equal to Rec, i indexes the ith
true-positive/false-positive operating point, and j indexes the jth precision/recall operating point.
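The indicators above can be computed from confusion counts as in the following sketch. The area function reads the sums in Eqs. (3) and (4) as a value multiplied by the step between successive operating points, which is an interpretation rather than something the text spells out; the counts and curve points are toy values and the function names are hypothetical.

```python
def confusion_metrics(tp, fp, tn, fn):
    """ACC, Pre, Rec (= TPR = Sen), FPR and F1 from confusion counts."""
    def ratio(num, den):
        return num / den if den else 0.0
    pre = ratio(tp, tp + fp)
    rec = ratio(tp, tp + fn)
    return {
        "ACC": ratio(tp + tn, tp + fp + tn + fn),
        "Pre": pre,
        "Rec": rec,
        "FPR": ratio(fp, fp + tn),
        "F1": ratio(2 * rec * pre, rec + pre),
    }

def step_area(points):
    """Sum of y_i * (x_i - x_{i-1}) over operating points sorted by x."""
    points = sorted(points)
    area = 0.0
    for (x_prev, _), (x, y) in zip(points, points[1:]):
        area += y * (x - x_prev)
    return area

metrics = confusion_metrics(tp=8, fp=2, tn=6, fn=4)
# (FPR, TPR) operating points of a toy staircase ROC curve.
roc = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
auc = step_area(roc)
print(metrics, auc)
```

Passing (Rec, Pre) pairs to the same `step_area` function gives the corresponding estimate for AUPR.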
Experimental setup
There are four essential hyper-parameters in our model, namely the layer number,
optimizer, learning rate, and dropout rate on the model.
First, we discuss the number of DNN layers. We set a rule that the number of neu-
rons in a layer is half the previous layer and then fixed the number of neurons in the
last hidden layer to 256. We consider 2, 3, 4, and 5 hidden layers and adopt a three-
layer structure (the number of nodes is 512, 256, and 49, respectively) because it can
achieve the best performance.
To optimize the model, we use the Adam optimizer [36] to train for up to 100
epochs (training iterations) with a learning rate of 0.3 and stop training if the validation
loss does not decrease for 10 epochs [37]. This strategy can prevent over-fitting
while considerably speeding up the training process.
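The stopping rule can be illustrated without any training framework by replaying a sequence of validation losses. The loss values below are invented, and the function name and return convention are hypothetical; only the limits of 100 epochs and a patience of 10 come from the text.

```python
def early_stopping(val_losses, max_epochs=100, patience=10):
    """Return (epochs_run, best_epoch) for a stream of validation losses."""
    best_loss, best_epoch, waited, epochs_run = float("inf"), 0, 0, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if epoch > max_epochs:
            break
        epochs_run = epoch
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:   # no improvement for `patience` epochs
                break
    return epochs_run, best_epoch

# Toy curve: the loss improves for three epochs and then plateaus.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
print(early_stopping(losses))
```

On this toy curve training stops after epoch 13, ten epochs after the best validation loss at epoch 3, so most of the 100-epoch budget is never used.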
In order to make the model generalize well to the unobserved drug pairs, we apply
regular dropout [38] to hidden layer units. We set the dropout rate from 0 to 0.5 in steps
of 0.1 and get the highest Accuracy (ACC) when dropout is equal to 0.3.
Feature evaluation
Here, we first evaluate the impact of multi-modal features on model performance. While
keeping other parameters constant, we use different drug features for drug representation.
Specifically, four types of features (CNN, daylight/EMS, SPI/BPI, and SSI/BBI) are
compared. We then test the following 15 drug feature combinations to make
predictions.
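The count of 15 follows from taking every non-empty subset of the four feature types, that is, 2^4 - 1 combinations. A short enumeration, with the feature names copied from the text:

```python
from itertools import combinations

features = ["CNN", "daylight/EMS", "SPI/BPI", "SSI/BBI"]

# All non-empty subsets, from single features up to the full set.
feature_sets = [
    combo
    for size in range(1, len(features) + 1)
    for combo in combinations(features, size)
]
for combo in feature_sets:
    print(" + ".join(combo))
```

The list contains 4 single features, 6 pairs, 4 triples, and the full set of all four modalities.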
Table 2 shows that, when only CNN features are used, the performance indicators of the
model are significantly higher than with any other single feature. The results show that CNN
can more effectively represent long-distance associations and global information in
long sequences, thereby improving the performance of predicting SBI. The performance
of the feature combination of daylight/EMS and CNN is higher than that of daylight/
EMS or CNN alone, which indicates that the combination of different feature repre-
sentations of the same data source can extract features from different perspectives and
thus improve prediction accuracy. In addition, the best results can be obtained when all
modalities are used, which supports the benefit of our proposed multi-modal representation
learning framework, combining drug structure information with the relevant information
of heterogeneous networks. Therefore, we choose CNN + daylight/EMS + SPI/BPI + SSI/BBI
as the model feature.
PU‑sampling evaluation
In related work, randomly selected instances from unlabeled data are used as nega-
tive DDI [7, 8]. This approach may introduce noisy data and lead to a lack of dis-
tinction between positive and negative samples. To test whether PU-sampling can
accurately screen out high-confidence negative samples, we compare PU-sampling
with traditional random sampling and the classical sampling method SMOTE [39]. As
shown in Table 3, the results of traditional random sampling are significantly lower
than the other two methods, proving the necessity of sampling negative samples in
the DDI data set. In addition, PU-sampling outperforms SMOTE, verifying the effec-
tiveness of PU-sampling in identifying noise in negative samples.
other advanced methods in five out of six metrics. It is found that all evaluation indi-
cators obtained by Multi-SBI are higher than other methods. We can conclude that
our method improves further with the enhancement of PU-sampling.
In addition, the precision-recall curves of the above methods are shown in Fig. 7. The
area under the precision-recall curve of Multi-SBI is larger than that of all other
methods. These results go beyond previous reports, showing that Multi-SBI
can effectively predict SBIs.
During the experiments, we noticed that all the AUC metrics in different models were
high (close to 1). So we analyzed the data distribution in Fig. 5. Most of the samples were
concentrated in a few categories on the left side of the histogram (the first ten classes
containing 90% data), which played a decisive role in the multi-classification tasks.
Although the AUC metrics of the models were close to each other, our model performed
well on the recall metric (Rec in Table 4) under both sampling mechanisms. The recall
metric can reflect the ability to predict "Right" without considering the negative differ-
ence, which is acceptable to illustrate the capability of our model.
Discussions
Very few computational methods can currently predict the interaction between SMDs
and BioDs. Although determining the precise SBI is critical to improving patient care, it
remains a challenging task that has not been fully studied through predictive modeling.
This study proposes a multi-modal representation learning framework called Multi-SBI
to predict potential SBI.
The feature representation of SMD and BioD drug pairs is much more complex than
that of SMD drug pairs. We use multi-modal representation learning to represent drug
pair features adequately. On the other hand, no specific database records non-interacting
drugs, so we apply PU-sampling to select reliable negative samples from the unlabeled data.
The experiments demonstrate that PU-sampling mitigates the data imbalance and that
multi-modal features improve the performance of drug interaction prediction.
To fully demonstrate the ability of Multi-SBI to discover potential drug interactions,
we perform a retrospective analysis. From DrugBank 5.1.0, we obtained 8,547 drug interactions
between 1,249 SMDs and 105 BioDs, used them as the training set, and scored the
unlabeled drug pairs. Of the 20 drug pairs with the highest prediction scores, 14
can be found in the latest version, DrugBank 5.1.8, indicating the effectiveness of
our model in predicting unknown drug interactions. The results are shown in Table 5.
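The retrospective check amounts to ranking the unlabeled pairs by predicted score and looking up the top entries in a later database release. A minimal sketch with invented scores and hypothetical pair identifiers:

```python
def top_k_validated(scores, later_release, k=20):
    """Rank unlabeled pairs by score and report which top-k pairs were later confirmed."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = [pair for pair in ranked if pair in later_release]
    return ranked, hits

# Scores for pairs that were unlabeled in the older release (toy values).
scores = {
    ("smd_1", "biod_1"): 0.99,
    ("smd_2", "biod_1"): 0.95,
    ("smd_3", "biod_2"): 0.40,
    ("smd_4", "biod_2"): 0.90,
}
# Interactions recorded in the newer release (toy values).
later_release = {("smd_1", "biod_1"), ("smd_3", "biod_2"), ("smd_4", "biod_2")}

ranked, hits = top_k_validated(scores, later_release, k=3)
print(len(hits), "of", len(ranked), "top predictions confirmed")
```

In the analysis reported above, the same kind of count gives 14 confirmed pairs among the top 20.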
Conclusions
Identifying novel drug interactions is critical for improving clinical care. This paper pre-
sents a multi-modal representation learning method for interaction prediction between
SMDs and BioDs. To our knowledge, this work is the first attempt to predict the interac-
tion between SMDs and BioDs computationally.
On the one hand, in addition to the traditional method, we use two independent CNN-based
blocks to extract features from the SMD and BioD sequences. On the other hand, we obtain
the heterogeneous network information of the drugs through one-hot encoding. Then,
we use PU-sampling to obtain a balanced data set. Compared with previous methods of
predicting drug interactions, Multi-SBI not only digs deep into the structural informa-
tion of drugs but also considers node associations in heterogeneous networks. At the
same time, the high-confidence negative sample set is selected. The prediction perfor-
mance of our model in experiments has been significantly improved, and some new SBI
predictions have been confirmed. These results show that Multi-SBI can provide a valu-
able tool for extracting and learning drug features to predict new SBI. It can provide
biologists with SBI candidates, reduce the workload of wet laboratory experiments, and
promote the development of new drug discovery and drug repositioning.
Table 5 Top 20 prediction results from the retrospective analysis on DrugBank 5.1.0
No SMD A BioD B *Event Type Evidence
Despite the promising performance described above, our method still has limitations that
suggest directions for future research. First, the lengths of BioD sequences in the DrugBank
database vary widely. How to uniformly extract and characterize protein drugs of different
lengths is still a complex problem, which we will address in future work. In addition, we
will conduct biological experiments on the newly predicted drug pairs to verify them.
Abbreviations
DDIs Drug–drug interactions
SMDs Small molecule drugs
BioDs Biotechnology drugs
PU-sampling Positive-unlabeled sampling
DNN Deep neural network
TIP Tri-graph information propagation
CNN Convolutional neural network
AUC Area under the ROC curve
AUPR Area under the precision-recall curve
SMILES Simplified Molecular Input Line Entry System
CDK Chemistry Development Kit
SPI SMD-Protein Interaction
BPI BioD-Protein Interaction
SSI SMD-SMD interaction
BBI BioD-BioD interaction
Acknowledgements
Not applicable.
Author contributions
DH and HH conducted the experiments and wrote the paper. JO, CZ, XD and JX helped revise this paper and conceived
the experiments. All authors read and approved the final manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (No. 61873156).
Declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare that they have no competing interests.
References
1. Foucquier J, Guedj M. Analysis of drug combinations: current methodological landscape. Pharmacol Res Perspe.
2015;3(3):e00149.
2. Edwards IR, Aronson JK. Adverse drug reactions: definitions, diagnosis, and management. Lancet.
2000;356(9237):1255–9.
3. Percha B, Garten Y, Altman RB. Discovery and explanation of drug–drug interactions via text mining. Biocomput-Pac
Sym 2012:410–421.
4. Vilar S, Harpaz R, Uriarte E, Santana L, Rabadan R, Friedman C. Drug-drug interaction through molecular structure
similarity analysis. J Am Med Inform Assn. 2012;19(6):1066–74.
5. Gottlieb A, Stein GY, Oron Y, Ruppin E, Sharan R. INDI: a computational framework for inferring drug interactions and
their associated recommendations. Mol Syst Biol. 2012;8:592.
6. Cheng FX, Zhao ZM. Machine learning-based prediction of drug-drug interactions by integrating drug phenotypic,
therapeutic, chemical, and genomic properties. J Am Med Inform Assn. 2014;21(E2):E278–86.
7. Ryu JY, Kim HU, Lee SY. Deep learning improves prediction of drug-drug and drug-food interactions. Proc Natl Acad
Sci USA. 2018;115(18):E4304–11.
8. Deng YF, Xu XR, Qiu Y, Xia JB, Zhang W, Liu SC. A multimodal deep learning framework for predicting drug-drug
interaction events. Bioinformatics. 2020;36(15):4316–22.
9. Lin SG, Wang YJ, Zhang LF, Chu YY, Liu YT, Fang YT, Jiang MM, Wang QK, Zhao BW, Xiong Y, Wei DQ. MDF-SA-DDI:
predicting drug–drug interaction events based on multi-source drug fusion, multi-source feature fusion and trans-
former self-attention mechanism. Brief Bioinform. 2020;23(1):bbab421.
10. Zhang P, Wang F, Hu J, Sorrentino R. Label propagation prediction of drug-drug interactions based on clinical side
effects. Sci Rep-Uk. 2015;5(1):1–10.
11. Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinfor-
matics. 2018;34(13):457–66.
12. Xu H, Sang S, Lu H: Tri-graph information propagation for polypharmacy side effect prediction. arXiv preprint
arXiv:200110516 2020.
13. Dabrowska A. Biologics and biosimilars: background and key issues. Congressional Res Service 2019:27–66.
14. Sengupta A. Biological drugs: challenges to access: Third World Network; 2018.
15. Makurvet FD. Biologics vs. small molecules: drug costs and patient access. Med Drug Discov. 2021;9(1):100075.
16. Cheng F, Zhao Z. Machine learning-based prediction of drug-drug interactions by integrating drug phenotypic,
therapeutic, chemical, and genomic properties. J Am Med Inform Assn. 2014;21(2):278–86.
17. Cami A, Manzi S, Arnold A, Reis BY. Pharmacointeraction network models predict unknown drug-drug interactions.
PLoS ONE. 2013;8(4): e61468.
18. Zhao Q, Zhao H, Zheng K, Wang J. HyperAttentionDTI: improving drug-protein interaction prediction by sequence-
based deep learning with attention mechanism. Bioinformatics. 2021;38(3):655–62.
19. Öztürk H, Özgür A, Ozkirimli E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics.
2018;34(17):i821–9.
20. Wen M, Zhang Z, Niu S, Sha H, Yang R, Yun Y, Lu H. Deep-learning-based drug-target interaction prediction. J Pro-
teome Res. 2017;16(4):1401–9.
21. Zhang W, Chen Y, Li D. Drug-target interaction prediction through label propagation with linear neighborhood
information. Molecules. 2017;22(12):2056–69.
22. Shi Z, Li J. Drug-target interaction prediction with weighted bayesian ranking. In: International conference on
biomedical engineering and bioinformatics 2018;19–24.
23. Chu YY, Shan XQ, Chen TH, Jiang MM, Wang YJ, Wang QK, Salahub DR, Xiong Y, Wei DQ. DTI-MLCD: predict-
ing drug-target interactions using multi-label learning with community detection method. Brief Bioinform.
2021;22(3):bbaa205.
24. Chu YY, Kaushik AC, Wang XG, Wang W, Zhang YF, Shan XQ, Salahub DR, Xiong Y, Wei DQ. DTI-CDF: a cascade
deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform.
2021;22(1):451–62.
25. Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N, Kuhn S, Pluskal T, Miquel RC, Spjuth O,
Torrance G, Evelo CT, Guha R, Steinbeck C. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction,
molecular formulas, and substructure searching. J cheminform. 2017;9(1):1–19.
26. Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding
rules. J Chem Inf Comput Sci. 1988;28(1):31–6.
27. Yang L, Xia J, Gui J. Prediction of protein-protein interactions from protein sequence using local descriptors. Protein
Pept Lett. 2010;17(9):1085–90.
28. Sun T, Zhou B, Lai L, Pei J. Sequence-based prediction of protein protein interaction using a deep-learning algo-
rithm. BMC Bioinform. 2017;18(1):1–8.
29. Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, Guo D, Ott M, Zitnick CL, Ma J. Biological structure and function
emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci. 2021;118(15):
e2016239118.
30. Zhou D, Xu Z, Li W, Xie X, Peng S: MultiDTI: drug–target interaction prediction based on multi-modal representation
learning to bridge the gap between new chemical entities and known heterogeneous network. Bioinformatics
2021.
31. Xie J, Ouyang J, Zhao C, He H, Dong X: A deep learning approach based on feature reconstruction and multi-dimen-
sional attention mechanism for drug-drug interaction prediction. In: International Symposium on Bioinformatics
Research and Applications: 2021. Springer, p. 400–410.
32. Liu H, Sun J, Guan J, Zheng J, Zhou S. Improving compound-protein interaction prediction by building up highly
credible negative samples. Bioinformatics. 2015;31(12):221–9.
33. Mordelet F, Vert J-P. A bagging SVM to learn from positive and unlabeled examples. Pattern Recogn Lett.
2014;37:201–9.
34. Wan EA. Neural network classification: a bayesian interpretation. IEEE Trans Neural Netw. 1990;1(4):303–5.
35. Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, Sajed T, Johnson D, Li C, Sayeeda Z, et al. DrugBank 5.0: a
major update to the DrugBank database for 2018. Nucleic Acids Res. 2018;46(D1):1074–82.
36. Kingma DP, Ba J: Adam: a method for stochastic optimization. arXiv preprint arXiv:14126980 2014.
37. Prechelt L: Early stopping-but when? In: Neural Networks: Tricks of the trade. Springer; 1998: 55–69.
38. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks
from overfitting. J Mach Learn Res. 2014;15(1):1929–58.
39. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell
Res. 2002;16(1):321–57.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.