Training text collection method, system and equipment for sensitive content quality inspection model

A training text collection technology, applied in fields such as text database clustering/classification, unstructured text data retrieval, and character and pattern recognition. It improves the accuracy and collection efficiency of training texts, lowers collection cost and difficulty, and reduces the amount of manual screening.

Pending Publication Date: 2021-09-21
CHINA PING AN LIFE INSURANCE CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, it is necessary to provide a training text acquisition method, system, device and readable storage medium for a sensitive content quality inspection model, so as to solve the problems that training texts for such a model are difficult to obtain and that acquisition speed and efficiency are low.



Examples

Embodiment 1

[0052] Referring to Figure 1, a flow chart of the steps of the training text acquisition method for the sensitive content quality inspection model according to this embodiment of the present invention is shown. It can be understood that the flow chart in this method embodiment does not limit the order in which the steps are executed. The training text acquisition system for the sensitive content quality inspection model in this embodiment can be implemented in the computer device 2, and the following description uses the computer device 2 as the executing entity by way of example. The details are as follows.

[0053] Step S100: acquire account data of multiple users and relationship data between the users, so as to obtain multiple pieces of account data and multiple pieces of relationship data.

[0054] The account data can be the user ID of each user, and the relationship data can be data used to record the association relationship between the users. The association relationship can include whether the user accounts ...
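
As an illustration of the inputs to step S100, the following is a minimal Python sketch, assuming account data is a list of user ID strings and each relationship record links two user IDs. The `Relationship` class, its `kind` field, and the `acquire` helper are hypothetical names introduced here for illustration; the source does not specify a schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One association record between two accounts (hypothetical schema)."""
    user_a: str
    user_b: str
    kind: str  # hypothetical label such as "follows" or "messaged"


def acquire(account_ids, raw_relationships):
    """Deduplicate account IDs and keep only relationships between known accounts."""
    accounts = sorted(set(account_ids))
    known = set(accounts)
    relationships = [
        Relationship(a, b, kind)
        for a, b, kind in raw_relationships
        if a in known and b in known and a != b
    ]
    return accounts, relationships


# Toy input: "u2" is listed twice and "u9" is not a known account.
accounts, relationships = acquire(
    ["u1", "u2", "u3", "u2"],
    [("u1", "u2", "follows"), ("u2", "u3", "messaged"), ("u3", "u9", "follows")],
)
```

Dropping records that point at unknown accounts is a design choice of this sketch, so that later steps only see relationships whose endpoints both exist in the account data.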

Embodiment 2

[0081] Figure 2 is a schematic diagram of the program modules of Embodiment 2 of the training text acquisition system for the sensitive content quality inspection model of the present invention. The training text acquisition system 20 for the sensitive content quality inspection model may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to carry out the present invention and implement the above training text collection method for sensitive content quality inspection models. A program module, as referred to in the embodiments of the present invention, is a series of computer program instruction segments capable of completing a specific function, and is better suited than the program itself to describing the execution process of the training text acquisition system 20 for sensitive content quality inspection models in the storage medium. The f...

Embodiment 3

[0095] Referring to Figure 3, a schematic diagram of the hardware architecture of the computer device according to Embodiment 3 of the present invention is shown. In this embodiment, the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server or a cabinet server (including an independent server, or a server cluster composed of multiple servers) and the like. As shown in the figure, the computer device 2 includes at least, but is not limited to, a memory 21, a processor 22, a network interface 23, and a training text collection system 20 for sensitive content quality inspection models, which can communicate with each other through a system bus.

[0096] In this embodiment, the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, ...

Abstract

The invention relates to the field of data collection and provides a training text collection method for a sensitive content quality inspection model. The method comprises the steps of: obtaining account data of a plurality of users and relationship data between the users, so as to obtain a plurality of pieces of account data and a plurality of pieces of relationship data; constructing an account contact graph from the account data and the relationship data; clustering the account data based on the account contact graph to obtain a plurality of user sets; selecting a sensitive account set from the plurality of user sets, wherein the sensitive account set comprises a plurality of sensitive users; collecting the historical texts of each sensitive user within a preset time window to obtain a plurality of historical texts; and performing a screening operation on the plurality of historical texts to obtain a plurality of training texts for training the sensitive content quality inspection model. The invention reduces the cost and difficulty of collecting training texts and improves their accuracy and collection efficiency.
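
The steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a clustering algorithm, a rule for choosing the sensitive account set, or the screening criteria, so this sketch substitutes connected components for clustering, picks the cluster containing the most seed accounts already known to be sensitive, and screens by dropping duplicates and very short texts. All function names, the `known_sensitive` seed set, and the sample data are hypothetical.

```python
from datetime import datetime


def build_contact_graph(account_ids, relationships):
    """Build an undirected adjacency map from account IDs and (a, b) pairs."""
    graph = {account: set() for account in account_ids}
    for a, b in relationships:
        if a in graph and b in graph:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def cluster_accounts(graph):
    """Group accounts into connected components (a stand-in for clustering)."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


def select_sensitive_set(clusters, known_sensitive):
    """Assumed rule: take the cluster with the most known sensitive seeds."""
    return max(clusters, key=lambda cluster: len(cluster & known_sensitive))


def collect_texts(users, history, start, end):
    """Collect each user's texts whose timestamp lies inside [start, end]."""
    return [
        text
        for user in sorted(users)
        for timestamp, text in history.get(user, [])
        if start <= timestamp <= end
    ]


def screen_texts(texts, min_len=5):
    """Assumed screening: strip whitespace, drop short texts and duplicates."""
    seen, kept = set(), []
    for text in texts:
        text = text.strip()
        if len(text) >= min_len and text not in seen:
            seen.add(text)
            kept.append(text)
    return kept


# Toy data, invented for illustration only.
accounts = ["u1", "u2", "u3", "u4", "u5"]
relationships = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]
known_sensitive = {"u1"}
history = {
    "u1": [(datetime(2021, 3, 1), "buy followers cheap"),
           (datetime(2020, 1, 1), "message outside the window")],
    "u2": [(datetime(2021, 3, 2), "buy followers cheap"),
           (datetime(2021, 3, 3), "ok")],
    "u4": [(datetime(2021, 3, 4), "hello from another cluster")],
}

graph = build_contact_graph(accounts, relationships)
clusters = cluster_accounts(graph)
sensitive_users = select_sensitive_set(clusters, known_sensitive)
historical_texts = collect_texts(
    sensitive_users, history, datetime(2021, 1, 1), datetime(2021, 6, 30)
)
training_texts = screen_texts(historical_texts)
```

In this toy run the accounts split into two components, the one containing the seed `u1` is chosen, and only texts from its members inside the time window reach the screening step. A real system would likely replace each assumed rule with whatever the full description specifies.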

Description

Technical field

[0001] Embodiments of the present invention relate to the field of data collection, in particular to a training text collection method, system and device for sensitive content quality inspection models.

Background technique

[0002] With the rapid development of the Internet and its universal application, Internet public opinion has become a very important part of social public opinion. Compared with traditional media (television, newspapers, radio, etc.), the Internet, which carries online public opinion, is characterized by high freedom of speech, suddenness, fast transmission, and a wide audience, with correspondingly high accuracy requirements. Therefore, quality inspection of sensitive content that is maliciously distributed on the network is particularly important.

[0003] Sensitive content quality inspection can also be regarded as a short text classification problem, that is, judging whether a text message sent by a user is normal text or illegal te...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/36, G06F16/335, G06K9/62
CPC: G06F16/35, G06F16/367, G06F16/335, G06F18/214, Y02P90/30
Inventor: 成杰峰
Owner: CHINA PING AN LIFE INSURANCE CO LTD