NLP text security auditing multi-level retrieval system

A multi-level retrieval system for NLP text security review. It addresses limited model generalization, a large storage footprint, and the inconvenience of large-scale deployment on a single server, with the effect of improving model generalization and achieving high query accuracy.

Pending Publication Date: 2022-06-03
GUANGZHOU QUWAN NETWORK TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a multi-level retrieval system for NLP text security review. It addresses the following problems in existing NLP text security review systems: storing data in a Trie tree occupies a large amount of storage space, which increases the server's memory cost and makes large-scale deployment on the same server inconvenient; optimal performance is difficult to achieve; the generalization ability of the model is limited; and the prediction accuracy is unstable.

Examples

Detailed Description of the Embodiments

[0026] To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0027] For ease of understanding, referring to Figure 1, the present invention provides an embodiment of an NLP text security audit multi-level retrieval system, comprising an environment inspection module, a text preprocessing module, a text classification processing module, and a result parsing module connected in sequence.
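The sequential connection of the four modules can be illustrated with a minimal Python sketch. Only the module names and their order come from the text; the `AuditContext` structure, the function names, and the body of every stage are hypothetical placeholders, since this excerpt does not describe what each module actually does.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AuditContext:
    """Carries one text through the pipeline (hypothetical structure)."""
    text: str
    label: str = "unknown"
    notes: List[str] = field(default_factory=list)


Stage = Callable[[AuditContext], AuditContext]


def environment_inspection(ctx: AuditContext) -> AuditContext:
    # Placeholder: the excerpt does not say what is inspected.
    ctx.notes.append("environment ok")
    return ctx


def text_preprocessing(ctx: AuditContext) -> AuditContext:
    # Assumed normalization: lowercase and collapse whitespace.
    ctx.text = " ".join(ctx.text.lower().split())
    ctx.notes.append("preprocessed")
    return ctx


def text_classification(ctx: AuditContext) -> AuditContext:
    # Toy rule standing in for the real classification processing.
    ctx.label = "advertisement" if "buy now" in ctx.text else "normal"
    ctx.notes.append("classified")
    return ctx


def result_parsing(ctx: AuditContext) -> AuditContext:
    # Placeholder: record the final label in a readable form.
    ctx.notes.append(f"result={ctx.label}")
    return ctx


# The four modules, connected in the order given in the text.
PIPELINE: List[Stage] = [
    environment_inspection,
    text_preprocessing,
    text_classification,
    result_parsing,
]


def run_pipeline(text: str) -> AuditContext:
    """Pass one text through every stage in sequence."""
    ctx = AuditContext(text=text)
    for stage in PIPELINE:
        ctx = stage(ctx)
    return ctx
```

Keeping the stages in a plain list makes the "connected in sequence" relationship explicit: each stage receives the output of the previous one, and a stage can be swapped out without touching the others.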

[0028] The environment ...

Abstract

The invention discloses an NLP text security auditing multi-level retrieval system. It stores and searches data in a compressed prefix tree, which offers fast queries, reduces memory usage by a factor of more than two thousand compared with a dictionary tree (Trie) data structure, and improves retrieval efficiency. A keyword matching sub-module, a sentence similarity matching sub-module, and a text classification deep learning sub-module form a three-level hierarchical search structure. Query accuracy is high: the system covers explicit sensitive words and can also audit text content at the semantic level, which safeguards accuracy, fault tolerance, and coverage while improving search efficiency. The invention thereby addresses the problems of existing NLP text security auditing systems that store data in a Trie tree: the occupied storage space is large, the server's memory cost increases, large-scale deployment on the same server is inconvenient, optimal performance is difficult to achieve, the model's generalization ability is limited, system reliability is poor, and prediction accuracy is unstable.
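To make the storage idea concrete, here is a minimal sketch of a compressed prefix tree (often called a radix tree), in which chains of single-child nodes are merged into one edge labelled with a substring. The class and function names are hypothetical, and this is not the patent's implementation; the toy node counts only show the direction of the saving and do not reproduce the memory figure quoted in the abstract.

```python
from typing import Dict, Iterable


def _common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class RadixNode:
    def __init__(self) -> None:
        self.children: Dict[str, "RadixNode"] = {}  # edge label -> child
        self.is_word = False


class RadixTree:
    def __init__(self) -> None:
        self.root = RadixNode()

    def insert(self, word: str) -> None:
        node = self.root
        while True:
            for label, child in list(node.children.items()):
                common = _common_prefix_len(label, word)
                if common == 0:
                    continue
                if common < len(label):
                    # Split the edge at the point where the strings diverge.
                    mid = RadixNode()
                    mid.children[label[common:]] = child
                    del node.children[label]
                    node.children[label[:common]] = mid
                    child = mid
                node, word = child, word[common:]
                break
            else:
                # No edge shares a prefix: attach the remainder as a leaf.
                if word:
                    leaf = RadixNode()
                    leaf.is_word = True
                    node.children[word] = leaf
                else:
                    node.is_word = True
                return
            if not word:
                node.is_word = True
                return

    def contains(self, word: str) -> bool:
        node = self.root
        while word:
            for label, child in node.children.items():
                if word.startswith(label):
                    node, word = child, word[len(label):]
                    break
            else:
                return False
        return node.is_word

    def node_count(self) -> int:
        stack, count = [self.root], 0
        while stack:
            current = stack.pop()
            count += 1
            stack.extend(current.children.values())
        return count


def plain_trie_node_count(words: Iterable[str]) -> int:
    """Nodes in an uncompressed trie: the root plus one per distinct prefix."""
    prefixes = {w[:i] for w in words for i in range(1, len(w) + 1)}
    return 1 + len(prefixes)
```

For the three words `spam`, `spammer`, and `span`, the compressed tree needs 5 nodes where a character-per-node trie needs 9; the gap widens as keywords share fewer branching points relative to their length.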

Description

Technical field

[0001] The invention relates to the technical field of text security auditing, and in particular to an NLP text security audit multi-level retrieval system.

Background

[0002] Text content security auditing is essentially a text classification problem: given a text, determine its security intent, where the security intent is the text's label. An NLP text security auditing system is mainly used to audit user text chats. The audited categories generally include advertisements, blacklisted content, and prohibited content. Existing NLP text security auditing systems combine three technologies, namely a Trie tree, sentence similarity matching, and a deep learning text classification model, to perform a hierarchical search, together with customized text preprocessing. The advantage of the Trie tree data structure is that queries are very fast, but the problem is that the storage space is very large, wh...
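The hierarchical search described in the background can be sketched as a cascade in which a cheaper level short-circuits the more expensive ones. Everything below is an illustrative assumption: the substring check stands in for the tree-based keyword lookup, `difflib.SequenceMatcher` stands in for whatever sentence similarity measure the system uses, the classifier is a caller-supplied stub rather than a deep learning model, and the label strings and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional, Tuple


def keyword_level(text: str, keywords: Iterable[str]) -> Optional[str]:
    """Level 1: exact keyword hit (stand-in for a prefix-tree lookup)."""
    for kw in keywords:
        if kw in text:
            return kw
    return None


def similarity_level(
    text: str, known_bad: Iterable[str], threshold: float = 0.8
) -> Optional[str]:
    """Level 2: near-duplicate of a known violating sentence."""
    for sentence in known_bad:
        if SequenceMatcher(None, text, sentence).ratio() >= threshold:
            return sentence
    return None


def audit(
    text: str,
    keywords: Iterable[str],
    known_bad: Iterable[str],
    classifier: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (label, level that decided). Later levels run only if needed."""
    if keyword_level(text, keywords) is not None:
        return ("blocked", "keyword")
    if similarity_level(text, known_bad) is not None:
        return ("blocked", "similarity")
    # Level 3: fall back to the (stubbed) text classification model.
    return (classifier(text), "classifier")
```

A hypothetical call such as `audit("get free coins today", ["free coins"], [], lambda t: "pass")` is decided at the keyword level and never reaches the classifier, which is the point of ordering the levels from cheapest to most expensive.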

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F40/211, G06F40/284, G06F40/289, G06F40/30, G06K9/62
CPC: G06F16/3344, G06F16/35, G06F40/211, G06F40/284, G06F40/289, G06F40/30, G06F18/22, Y02D10/00
Inventors: 曾锐鸿, 马金龙, 熊佳, 王伟喆, 吴文亮, 罗箫, 盘子圣, 焦南凯, 黎子骏, 徐志坚, 谢睿, 陈光尧
Owner: GUANGZHOU QUWAN NETWORK TECH CO LTD