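Abstract: In the era of big data, how to use data mining to process massive data and thereby predict and analyze credit risk has become a pressing problem. This paper builds a credit risk analysis model with the XGBoost algorithm and tunes the XGBoost parameters using grid search and related methods. Using AUC, accuracy, and ROC curves as evaluation metrics, the model is compared with decision tree, GBDT, and support vector machine models, and its effectiveness and efficiency are verified on the German credit data set.
Keywords: credit risk analysis; XGBoost; data mining; grid search
CLC number: TP39
Document code: A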
A Study of the Credit Risk Analysis Based on XGBoost
ZHAO Tianao, ZHENG Shanhong, LI Wanlong, LIU Kai
(School of Computer Science & Engineering, Changchun University of Technology, Jilin 130012, China)
Abstract: In the era of big data, using data mining to process massive data sets in order to predict and analyze credit risk has become an important problem. This paper uses the XGBoost algorithm to build a credit risk analysis model and tunes the XGBoost parameters with grid search and other methods. Using evaluation metrics such as AUC, accuracy, and ROC curves, the model is compared with decision tree, GBDT, and support vector machine models. Its validity and efficiency are verified on the German credit data set.
Keywords: credit risk analysis; XGBoost; data mining; grid search