Water conservancy field text retrieval method and system based on topic mining

A topic mining technique for text retrieval methods and systems in the water conservancy field. It addresses problems such as lack of scalability, increased inference complexity, and the inability to train jointly with other networks.

Pending Publication Date: 2022-05-13
HOHAI UNIV

AI Technical Summary

Problems solved by technology

However, this method still has the following shortcomings. First, the inference process must be custom-designed; inference complexity grows significantly as model complexity increases, and the design of the inference process is difficult to automate.
Second, the method is difficult to scale to large text collections or to parallelize on GPUs.
Third, the method lacks scalability and cannot be jointly trained with other deep neural networks.


Examples


Embodiment Construction

[0058] The technical solution of the present invention will be further described below in conjunction with the accompanying drawings.

[0059] As shown in Figure 1, a text retrieval method for the water conservancy field based on topic mining includes the following steps:

[0060] (1) Assemble the experimental data sets, mainly the public data sets THUCnews and 20newsgroups and the constructed data set of water conservancy official documents; desensitize the data sets and preprocess the text data;

[0061] The text preprocessing applied to the data sets in step (1) mainly comprises the following steps:

[0062] (1.1) Remove stop words. First build a stop-word set for the water conservancy field and add an existing stop-word list to it. Then use the Jieba word segmentation tool to segment the input text data; during word segmentation, the constructed stop-word set is queried, and if the ...
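The stop-word step in (1.1) can be illustrated with a minimal sketch. The function names `build_stop_word_set` and `remove_stop_words`, the sample words, and the choice to simply drop matching tokens are assumptions for illustration; the source text is truncated before it states what happens on a match. The segmenter is passed in as a parameter so the sketch runs without third-party packages.

```python
from typing import Callable, Iterable, List, Set


def build_stop_word_set(domain_words: Iterable[str],
                        general_words: Iterable[str]) -> Set[str]:
    """Merge a domain stop-word list with an existing general list.

    Both inputs are hypothetical word lists; blank entries are ignored.
    """
    merged = list(domain_words) + list(general_words)
    return {w.strip() for w in merged if w.strip()}


def remove_stop_words(text: str,
                      stop_words: Set[str],
                      segment: Callable[[str], List[str]] = str.split) -> List[str]:
    """Segment the text, then drop every token found in the stop-word set.

    `segment` defaults to whitespace splitting (an assumption made so the
    sketch is self-contained); a real word segmenter can be passed instead.
    """
    tokens = segment(text)
    return [t for t in tokens if t.strip() and t not in stop_words]


# Hypothetical stop-word lists, not taken from the source.
stop_words = build_stop_word_set(["hereby", " notice "], ["the", "of", ""])
print(remove_stop_words("the reservoir of the river", stop_words))
```

For Chinese text, the Jieba tool named in the source could be plugged in by passing `segment=jieba.lcut`, assuming the `jieba` package is installed.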


Abstract

The invention discloses a text retrieval method and system for the water conservancy field based on topic mining. The method comprises the following steps: collecting a data set; preprocessing the data set; constructing the GAN-BiGRU Topic Attention Model, a topic attention model that combines a topic-mining-based bidirectional adversarial neural network with a bidirectional GRU; validating and testing on the test set; performing topic-related ranking; and performing topic retrieval. The system comprises a data crawling module, an index construction module and a data retrieval module. With this method, topic diversity, topic coherence, and the accuracy and recall of downstream classification tasks are significantly higher than with existing methods, offering a new solution for research in related fields. Supported by a sufficient reserve of domain data, the network model combining the bidirectional adversarial neural network and the bidirectional GRU is applied to water conservancy retrieval for the first time, contributing to the application of topic-model retrieval in the water conservancy field.
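The topic-related ranking step can be pictured with a minimal sketch. The abstract does not say how documents are scored, so ranking by cosine similarity between topic distributions is an assumption here, as are the names `cosine_similarity` and `rank_by_topic` and the sample vectors. In the described method the topic distributions would come from the trained model; this sketch takes them as given.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_topic(query_topics: Sequence[float],
                  doc_topics: Dict[str, Sequence[float]],
                  top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Sort documents by similarity of their topic vector to the query's.

    Scoring by cosine similarity is an illustrative assumption,
    not the patent's stated ranking function.
    """
    scored = [(doc_id, cosine_similarity(query_topics, vec))
              for doc_id, vec in doc_topics.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical three-topic distributions for three documents.
docs = {
    "flood": [0.9, 0.1, 0.0],
    "budget": [0.1, 0.1, 0.8],
    "dam": [0.6, 0.4, 0.0],
}
for doc_id, score in rank_by_topic([1.0, 0.0, 0.0], docs, top_k=2):
    print(doc_id, round(score, 3))
```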

Description

Technical field

[0001] The invention relates to natural language processing and information retrieval, and in particular to a text retrieval method and system for the water conservancy field based on topic mining.

Background

[0002] We live in an information age. With the spread of computers, big data, cloud computing, artificial intelligence and other information technologies, the digitization of all kinds of materials and documents has had a huge impact on traditional information retrieval; information retrieval and related technologies therefore remain widely used today and continue to develop and innovate. In modern times, China's water conservancy industry has developed vigorously, and a large amount of text data has accumulated in related water conservancy fields. The problem of water conservancy information overload has also become increasingly severe. It has become m...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F40/216, G06F40/289, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/3335, G06F16/3344, G06F16/3346, G06F16/35, G06F40/289, G06F40/216, G06N3/08, G06N3/047, G06N3/044, G06F18/241, G06F18/2415
Inventor: 冯钧, 苏栋, 陆佳民
Owner: HOHAI UNIV