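Abstract: At present, most classifiers for sea surface oil spill detection are supervised learning classifiers. When classifying a specific type of oil, however, labeled data are scarce, so supervised classifiers struggle to achieve good recognition performance. To improve recognition accuracy, the max-relevance and min-redundancy (mRMR) feature selection method is adopted; to address the shortage of labeled samples, an adaptive semi-supervised decision tree learning model is selected. Classification experiments are conducted on a public sea surface oil spill sample set at different labeled-sample ratios. With only 5.0%, 7.5%, 10.0%, 15.0%, and 20.0% of the samples labeled, the recognition accuracy of the adaptive semi-supervised decision tree learning model exceeds that of the supervised classification models SVM and decision tree by an average of 26.22% and 16.22%, respectively. The experimental results show that the method is highly practical when labeled samples are scarce.
Keywords: automatic oil spill identification; semi-supervised decision tree; adaptive confidence
CLC number: TP753
Document code: A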
Oil Spill Automatic Identification Model Using Adaptive Semi-Supervised Decision Tree
LIU Hongyu, ZHOU Hui
(Department of Big Data Science, Dalian Neusoft Information University, Dalian 116023, China)
liuhongyu@nou.cn; zhouhui@neusoft.edu.cn
Abstract: Most classifiers currently used for sea surface oil spill detection are supervised learning classifiers. When a specific type of oil is to be classified, however, labeled data are limited, which makes it difficult for supervised classifiers to achieve good recognition results. To improve recognition accuracy, the Max-Relevance and Min-Redundancy (mRMR) feature selection method is adopted. To address the shortage of labeled samples, an adaptive semi-supervised decision tree learning model is selected, and classification experiments are conducted on a publicly available sea surface oil spill sample set at different labeled-sample ratios. When only 5.0%, 7.5%, 10.0%, 15.0%, and 20.0% of the samples are labeled, the recognition accuracy of the adaptive semi-supervised decision tree learning model is higher than that of the supervised classification models SVM and decision tree by an average of 26.22% and 16.22%, respectively. The experimental results show that the proposed method is highly practical when few labeled samples are available.
Keywords: automatic oil spill identification; semi-supervised decision tree; adaptive confidence