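Abstract: Accurate prediction of TCR (T cell receptor)-peptide binding sites is important for immunotherapy and related drug discovery. This paper compiles a TCR-peptide binding site dataset from multiple publications and databases and introduces Propep-TCR, a prediction method based on convolutional neural networks. The method takes both the sequence features and the structural features of the input TCR into account and extracts a feature vector for each target residue with a residue-variable sliding window. To address the imbalance between positive and negative samples in the dataset, it also uses an improved loss function and oversampling. Experimental results show that Propep-TCR successfully predicts potential binding sites in TCR sequences and outperforms traditional algorithms, reaching a prediction accuracy of 0.98 and an AUROC of 0.95.
Keywords: influenza forecasting; wavelet decomposition; seasonal autoregressive integrated moving average model; long short-term memory neural network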
CLC number: TP183
Document code: A
Funding: Innovation Team and Talents Cultivation Program of the National Administration of Traditional Chinese Medicine (ZYYCXTD-D-202208)
Prediction of TCR-peptide Binding Sites Based on Dual-Module Convolutional Neural Network
HU Zhaohui, CHEN Zhaoxue
(School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
hzh_xy666@163.com; chenzhaoxue@163.com
Abstract: Accurate prediction of TCR (T cell receptor)-peptide binding sites is of great significance for immunotherapy and related drug discovery. This paper compiles a TCR-peptide binding site dataset from multiple publications and databases and introduces Propep-TCR, a prediction method based on convolutional neural networks. The method considers both the sequence and the structural features of the input TCR and extracts the feature vector of each target residue with a residue-variable sliding window. To address the imbalance between positive and negative samples in the dataset, an improved loss function and oversampling are also employed. Experimental results show that Propep-TCR successfully predicts potential binding sites in TCR sequences and outperforms traditional algorithms, with a prediction accuracy of 0.98 and an AUROC of 0.95.
Keywords: influenza forecasting; Discrete Wavelet Transform (DWT); Seasonal Autoregressive Integrated Moving Average (SARIMA); Long Short-Term Memory (LSTM)