A Keywords Extraction Method for Public Safety Domain Texts Based on Deep Reinforcement Learning

GAO Yuxuan; SUN Lijuan; DING Hongxin; XIONG Ziqi

doi:10.3724/j.gyjzG23121201

Volume 54 Issue 2

Feb. 2024


GAO Yuxuan, SUN Lijuan, DING Hongxin, XIONG Ziqi. A Keywords Extraction Method for Public Safety Domain Texts Based on Deep Reinforcement Learning[J]. INDUSTRIAL CONSTRUCTION, 2024, 54(2): 155-160. doi: 10.3724/j.gyjzG23121201


GAO Yuxuan¹, SUN Lijuan²,³, DING Hongxin²,³, XIONG Ziqi²,³

1. Chengdu River and Lake Protection and Smart Water Service Center, Chengdu 610072, China;
2. CETC Big Data Research Institute Co., Ltd., Guiyang 550022, China;
3. National Engineering Research Center of Big Data Application to the Improvement of Governance Capacity, Guiyang 550022, China

Received Date: 2023-12-12
Available Online: 2024-04-23

Abstract


With the rapid development of big data in China's government affairs, large volumes of unlabeled text have accumulated in the field of public safety. Making full use of this data and effectively extracting key information from it is of great significance for strengthening urban safety governance. To this end, a keyword extraction model for public safety domain texts based on deep reinforcement learning was proposed. The model labels text content quickly and in an unsupervised manner, making it easier for users to retrieve documents and events in the public safety domain. A log-sum norm regularization term was used as the sparse constraint in the model's loss function, guiding the policy network to learn a policy that retains important words and discards unimportant ones. In addition, a training method with variable mini-batch sizes was designed: setting different mini-batch sizes controls how difficult the learning task is for the policy network, which improves its generalization capacity. Comparative experiments showed that the model outperformed traditional unsupervised methods on the keyword extraction task.
Keywords:
- deep reinforcement learning
- keyword extraction
- log-sum norm
- public safety big data
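
The abstract does not give the exact form of the log-sum norm term, so the following is only a minimal sketch of the general idea, assuming the common form sum_i log(1 + |p_i| / eps) applied to per-word keep probabilities. The function names, the `eps` value, and the example vectors are hypothetical and not taken from the paper. The sketch compares two selection vectors with the same L1 norm and shows that the log-sum penalty is smaller for the one that keeps a few words confidently and drops the rest.

```python
import math


def log_sum_penalty(keep_probs, eps=0.1):
    """Assumed log-sum sparsity penalty: sum_i log(1 + |p_i| / eps).

    keep_probs: hypothetical per-word keep probabilities from a policy network.
    eps: small positive constant (an assumption, not a value from the paper).
    """
    return sum(math.log1p(abs(p) / eps) for p in keep_probs)


def l1_penalty(keep_probs):
    """Plain L1 norm, shown only for comparison."""
    return sum(abs(p) for p in keep_probs)


# Two hypothetical selection vectors with the same L1 norm.
sparse = [1.0, 1.0, 0.0, 0.0]  # two words kept confidently, two dropped
dense = [0.5, 0.5, 0.5, 0.5]   # every word half-kept

print("L1      :", l1_penalty(sparse), l1_penalty(dense))
print("log-sum :", log_sum_penalty(sparse), log_sum_penalty(dense))
```

Under these assumptions the L1 norm cannot distinguish the two vectors, while the log-sum term prefers the sparse one, which is the behavior a sparsity constraint on word selection is meant to encourage.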

