Abstract: To address the high training cost and long fine-tuning time of large language model fine-tuning under limited computational resources, this paper proposes a fine-tuning optimization method for pretrained large models based on transfer learning. First, a weight direction penalty factor is introduced into the original self-attention validation loss function to accelerate model convergence. Second, the self-attention validation optimizer is improved to balance the decay rates of different weight parameters. Experimental results show that the improved fine-tuning optimization method effectively reduces the number of fine-tuning iterations and improves fine-tuning efficiency, thereby enhancing the transferability of large language models to downstream tasks.
Keywords: large language model; fine-tuning optimization; transfer learning
CLC number: TP312
Document code: A

Funding: Basic Scientific Research Project of the Education Department of Liaoning Province (LJKQZ20222447)

Fine-Tuning Optimization Method for LLaMA 2 Large Language Models Based on Transfer Learning |
SUN Qian1, SHI Jingze1, PEI Lijun1, ZHANG Qianyi1, XU Fengqiang2
|
(1. Dalian Neusoft University of Information, Dalian 116023, China; 2. School of Software, Dalian Jiaotong University, Dalian 116028, China)
sunqian@neusoft.edu.cn; losercheems@gmail.com; peilijun@neusoft.edu.cn; zhangqianyi@neusoft.edu.cn; xfq@djtu.edu.cn
|
Abstract: In the context of limited computational resources, this paper proposes a fine-tuning optimization method for large language models that addresses problems such as high training costs and long fine-tuning times. Firstly, a weight direction penalty factor is introduced into the existing self-attention validation loss function to enhance the convergence speed of the model. Secondly, an improved self-attention validation optimizer is proposed to balance the decay rates of different weight parameters. Experimental results demonstrate that the improved fine-tuning optimization method effectively reduces the number of iterations required for fine-tuning, increases fine-tuning efficiency, and thereby enhances the transferability of large language models to downstream tasks.
Keywords: large language model; fine-tuning optimization; transfer learning
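The abstract names a weight direction penalty factor added to the loss function to speed up convergence. The paper's exact formulation is not given on this page; as a minimal hypothetical sketch, one common form of a direction penalty is a cosine-distance term that discourages fine-tuned weights from rotating away from the pretrained weight direction (the function names, the penalty factor `lam`, and the cosine form are all assumptions for illustration):

```python
import numpy as np

def direction_penalty(w, w_pretrained, lam=0.01):
    """Hypothetical weight-direction penalty: grows as the fine-tuned
    weight vector w rotates away from the pretrained direction.
    lam plays the role of the penalty factor."""
    cos = np.dot(w, w_pretrained) / (
        np.linalg.norm(w) * np.linalg.norm(w_pretrained) + 1e-12
    )
    return lam * (1.0 - cos)

def total_loss(task_loss, w, w_pretrained, lam=0.01):
    """Task loss augmented with the direction penalty term."""
    return task_loss + direction_penalty(w, w_pretrained, lam)

# Same direction as the pretrained weights: no penalty.
w0 = np.array([1.0, 0.0])
print(direction_penalty(w0, w0))                      # ~0.0
# Orthogonal direction: maximal rotation, penalty = lam * (1 - 0).
print(direction_penalty(np.array([0.0, 1.0]), w0))    # 0.01
```

Under this assumed form, gradients of the penalty push updates back toward the pretrained direction, which is one plausible mechanism for the faster convergence the abstract reports.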