
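Intelligent analysis of power industry accident texts based on the BERT-BILSTM-CRF model*
Cite this article: LIU Fei, WEN Zhong, WU Yi. Intelligent analysis of power industry accident texts based on the BERT-BILSTM-CRF model*[J]. Journal of Safety Science and Technology, 2023, 19(1): 209-215. DOI: 10.11731/j.issn.1673-193x.2023.01.031
Authors: LIU Fei  WEN Zhong  WU Yi
Affiliation: (School of Electrical and New Energy, China Three Gorges University, Yichang, Hubei 442003)
Funding: * National Natural Science Foundation of China (51877122)
Abstract: Accident reports in the power industry are long and semantically complex, which makes effective text recognition difficult. To address this, a pre-trained model with BERT as the underlying layer is proposed, and a dual-attention encoder is designed and combined with BILSTM-CRF to mine the semantic features of accident text in depth, thereby enabling intelligent text analysis. First, a power-industry dictionary is constructed, BERT is pre-trained, and BIO annotation is performed. Then the BILSTM-CRF model is introduced to classify text labels automatically. Finally, the model is compared with four other current deep learning models. The results show that the model's precision, recall, and F1 score all reach about 97%, which is 0.02, 0.03, and 0.02 higher, respectively, than the best-performing of the other four models. The results offer a new approach to analyzing accident report texts in the power industry.

Keywords: BERT-BILSTM-CRF  entity recognition  power industry  pre-training  text classification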

Intelligent analysis on text of power industry accident based on BERT-BILSTM-CRF model
LIU Fei, WEN Zhong, WU Yi. Intelligent analysis on text of power industry accident based on BERT-BILSTM-CRF model[J]. Journal of Safety Science and Technology, 2023, 19(1): 209-215. DOI: 10.11731/j.issn.1673-193x.2023.01.031
Authors: LIU Fei  WEN Zhong  WU Yi
Affiliation: (School of Electrical and New Energy, China Three Gorges University, Yichang, Hubei 442003, China)
Abstract: Accident reports in the power industry are long and semantically complex, which makes effective text recognition difficult. To address this problem, a pre-trained model with BERT as the underlying layer was proposed, and a dual-attention encoder was designed and combined with BILSTM-CRF to mine the semantic features of accident text in depth, so as to realize intelligent text analysis. First, a power-industry dictionary was constructed, BERT was pre-trained, and BIO annotation was carried out. Second, the BILSTM-CRF model was introduced to classify text labels automatically. Finally, the model was compared with four existing deep learning models. The results showed that the precision, recall, and F1 score of the model all reached about 97%, which is 0.02, 0.03, and 0.02 higher, respectively, than the best of the other four models. The model provides a new approach to the analysis of accident report texts in the power industry.
Keywords: bidirectional encoder representation from transformers (BERT)-bidirectional long short-term memory (BILSTM)-conditional random field (CRF)   entity recognition   power industry   pre-training   text classification
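
Illustrative note: the abstract refers to BIO annotation and to precision, recall, and F1 without showing how they relate. The following is a minimal sketch, not the authors' code, of how entity spans can be read from BIO tags and scored at the entity level. The label names `EQUIP` and `CAUSE` and the helper names `bio_to_spans` and `prf1` are hypothetical; the paper's actual label set and evaluation script are not given on this page.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                # "O", or a stray "I-" without a matching "B-": treated as outside
                start, label = None, None
    return spans


def prf1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 (exact span and type match)."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical six-token sentence with two gold entities.
gold = ["B-EQUIP", "I-EQUIP", "O", "B-CAUSE", "I-CAUSE", "O"]
# The prediction truncates the second entity, so only one span matches exactly.
pred = ["B-EQUIP", "I-EQUIP", "O", "B-CAUSE", "O", "O"]
print(prf1(gold, pred))
```

F1 is the harmonic mean of precision and recall, so it is a separate quantity from precision itself. Exact-match span scoring is one common convention for entity recognition; whether the paper scores at the span level or the token level is not stated here.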