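Abstract: New drug development suffers from long development cycles, high costs, and low success rates. To address these problems and improve the efficiency of early-stage drug discovery, a virtual screening method based on a graph convolutional neural network is proposed and applied to the EGFR (Epidermal Growth Factor Receptor) target. Data for the EGFR target are first collected and processed for model training. The trained model is then used to screen a large number of compounds, and the small molecules it selects are run through a compound similarity search against drug molecules to verify whether they resemble known EGFR drugs. The graph convolutional network (GCN) model is also compared with other traditional machine learning models, showing that it outperforms them on every metric. Experimental results indicate that the proposed method has good predictive power and accuracy and can support the discovery of potential drugs.
Keywords: graph convolutional neural network; virtual screening; EGFR; compound similarity search; machine learning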
CLC number: TP391
Document code: A
Funding: National Natural Science Foundation of China (82127807); Shanghai Key Laboratory of Molecular Imaging Construction Project (18DZ2260400).
Virtual Screening of Small Molecules Based on Graph Convolutional Neural Network
ZHANG Kairui1,2, HUANG Gang1,2
(1. School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Shanghai Key Laboratory of Molecular Imaging, Shanghai University of Medicine and Health Sciences, Shanghai 201318, China)
zhangkarry0328@163.com; huanggang@sumhs.cn
Abstract: New drug research and development is hampered by long development cycles, high costs, and low success rates. To address these problems and improve the efficiency of early-stage drug discovery, this paper proposes a virtual screening method based on a graph convolutional neural network (GCN) and uses the model to perform virtual screening against the EGFR (Epidermal Growth Factor Receptor) target. First, data for the EGFR target are obtained and, after processing, used for model training. The model is then used to screen a large number of compounds, and the small molecules it selects are compared with drug molecules in a compound similarity search to verify whether they are similar to known EGFR drugs. The GCN model is also compared with other traditional machine learning models and outperforms them on all metrics. Experimental results show that the proposed method has good predictive performance and accuracy, which facilitates the discovery of potential drugs.
Keywords: graph convolutional neural network; virtual screening; EGFR; compound similarity search; machine learning