Identifying drug interactions using machine learning



Advances in clinical and experimental medicine : official organ Wroclaw Medical University, vol.32, no.8, pp.829-838, 2023 (SCI-Expanded)


A majority of Americans (51% of the population) take 2 or more drugs daily. Nearly 100,000 people die annually as a result of adverse drug reactions (ADRs), making ADRs the 4th most common cause of mortality in the USA. Drug-drug interactions (DDIs) and their impact on patients represent critical challenges for the healthcare system. To reduce the incidence of ADRs, this study focuses on identifying DDIs using a machine learning approach. Drug-related information was obtained from several free databases, including DrugBank, BioGRID and the Comparative Toxicogenomics Database. Eight drug-drug similarity matrices were created and used as covariates in the model in order to assess their influence on DDIs. Three distinct machine learning algorithms were considered: logistic regression (LR), eXtreme Gradient Boosting (XGBoost) and neural network (NN). Our study examined 22 notable drugs and their interactions with 841 other drugs from DrugBank. The accuracy of the machine learning approaches ranged from 68% to 78%, while the F1 scores ranged from 78% to 83%. Our study indicates that enzyme similarity and target similarity are the most significant parameters in identifying DDIs. Finally, our data-driven approach shows that machine learning methods can predict DDIs with reasonable accuracy and provide additional insights in a timely and cost-effective manner.
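To make the idea of similarity matrices as covariates concrete, the following is a minimal sketch and not the authors' implementation. It assumes each drug is described by a set of annotations (for example, enzymes or targets) and that similarity is measured with the Jaccard index; the abstract does not state which similarity measure was used. The drug names, annotation sets and the helper names `jaccard`, `similarity_matrix` and `pair_features` are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(features):
    """Build a symmetric drug-by-drug similarity matrix as a nested dict.

    `features` maps a drug name to its set of annotations.
    The diagonal is set to 1.0 by convention.
    """
    drugs = sorted(features)
    sim = {d: {d: 1.0} for d in drugs}
    for d1, d2 in combinations(drugs, 2):
        s = jaccard(features[d1], features[d2])
        sim[d1][d2] = s
        sim[d2][d1] = s
    return sim


def pair_features(d1, d2, matrices):
    """Covariate vector for one drug pair: one similarity value per matrix."""
    return [m[d1][d2] for m in matrices]


# Hypothetical toy annotations (assumed for illustration, not taken from the study).
enzymes = {
    "drug_a": {"CYP3A4", "CYP2D6"},
    "drug_b": {"CYP3A4"},
    "drug_c": {"CYP2C9"},
}
targets = {
    "drug_a": {"T1", "T2", "T3"},
    "drug_b": {"T2", "T3"},
    "drug_c": {"T1"},
}

enzyme_sim = similarity_matrix(enzymes)
target_sim = similarity_matrix(targets)

# Two covariates for the pair (drug_a, drug_b); the study used eight matrices.
x = pair_features("drug_a", "drug_b", [enzyme_sim, target_sim])
print(x)
```

Each drug pair then becomes one row of covariates, with a binary label indicating whether an interaction is recorded, which is the form of input that a classifier such as LR, XGBoost or an NN would be trained on.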