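Abstract: To enable a manipulator to assemble satellites automatically, the manipulator must know the operation type of every step in the assembly process. This paper applies text mining to the process specification documents written for manual satellite assembly, treating the content of each assembly work step as a short text to be classified by operation type. Keywords are extracted with the TF-IDF and TextRank algorithms commonly used in natural language processing and, together with a hierarchical weighting method based on assembly process terms, are used to construct three different word vector models and bag-of-words spaces. Finally, the K-means clustering algorithm is applied, and the clustering results under the three schemes are compared and evaluated. The results show that the hierarchical weighting scheme based on assembly technical terms performs best, with average accuracy, recall, and F value of 88.67%, 88.71%, and 88.66%, respectively. The short text clustering method based on assembly technical terms not only classifies complex operation types automatically, greatly reducing manual intervention, but also greatly improves classification accuracy.
Keywords: operation type; TF-IDF; TextRank; hierarchical weighting; K-means
CLC number: TP391.1
Document code: A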
Research on Short Text Clustering Based on Satellite Assembly Process
CUI Qingyang, LIANG Xiaofeng, NI Jing, LI Shuai, ZHANG Sheng, ZHONG Liangwei
(1. School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Aerospace Dongfanghong Satellite Co., Ltd., Beijing 100094, China)
840661306@qq.com; 15866585@qq.com; nijing501@126.com; LS0387@163.com; zhangsheng@usst.edu.cn; zlv@usst.edu.cn
Abstract: For a manipulator to assemble a satellite automatically, it must know the operation type of each step in the assembly process. This paper mines the text of manual satellite assembly process documents and classifies operation types, using the content of each assembly step as a short text. Keywords are extracted with the TF-IDF and TextRank algorithms commonly used in natural language processing. Combined with a hierarchical weighting method based on assembly technical terms, these yield three different word vector models and bag-of-words spaces. Finally, the K-means clustering algorithm is applied, and the clustering results under the three schemes are compared and evaluated. The results show that the hierarchical weighting scheme based on assembly technical terms performs best, with average accuracy, recall, and F value of 88.67%, 88.71%, and 88.66%, respectively. The method based on assembly technical terms classifies complex operation types automatically, reduces manual intervention, and significantly improves classification accuracy.
Keywords: operation type; TF-IDF; TextRank; hierarchical weighting; K-means
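
As a rough illustration of the term-weighted scheme summarized in the abstract, the following minimal sketch scales TF-IDF features by graded term weights and clusters the resulting vectors with K-means. It is not the authors' implementation: the work-step texts, the weight table `TERM_WEIGHTS`, the weight values, the farthest-point seeding, and the helper names are all assumptions made for illustration, and TextRank is omitted.

```python
import math
from collections import Counter

# Hypothetical, pre-tokenized work-step texts (illustrative only).
STEPS = [
    ["install", "bracket", "with", "screw"],
    ["install", "antenna", "with", "screw"],
    ["tighten", "screw", "to", "torque"],
    ["tighten", "bolt", "to", "torque"],
    ["bond", "cable", "with", "adhesive"],
    ["bond", "sensor", "with", "adhesive"],
]

# Hypothetical graded weights: operation verbs weigh most, fasteners and
# materials less, all other words default to 1.0.
TERM_WEIGHTS = {"install": 3.0, "tighten": 3.0, "bond": 3.0,
                "screw": 1.5, "bolt": 1.5, "adhesive": 1.5}


def weighted_tfidf(docs, term_weights):
    """Return L2-normalized TF-IDF vectors scaled by per-term weights."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        v = [tf[t] / len(d) * idf[t] * term_weights.get(t, 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's K-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


labels = kmeans(weighted_tfidf(STEPS, TERM_WEIGHTS), k=3)
print(labels)
```

On this toy input the two "install" steps, the two "tighten" steps, and the two "bond" steps fall into three separate clusters. The seeding is deterministic only to keep the sketch reproducible; random or k-means++ initialization is the more common choice in practice.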