Methods for training sample generation, text data classification, and public opinion event classification, and related equipment

A technology for text data and training samples, applied in the Internet field, that addresses problems such as time-consuming data labeling.

Active Publication Date: 2019-02-01
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] It follows that the more accurate the classification result is required to be, the more data needs to be labeled. Improving classification accuracy therefore requires labeling a large amount of data, which takes a lot of time in practice.

Examples

Embodiment Construction

[0051] The method and related equipment provided in the embodiments of the present application are applied in a network public opinion monitoring system. The network public opinion monitoring system is briefly described first.

[0052] The public opinion monitoring system monitors hot issues and information concentrated on websites in key areas, such as web pages, forums, and BBS, and downloads the latest news and opinions at any time. Monitoring hot issues and key areas presupposes intelligent analysis of hot issues. First, using traditional feature analysis technology based on vector space, the text content of the captured web pages is classified, clustered, and summarized to complete an initial reorganization. Then, under the guidance of the monitoring knowledge base, semantic analysis based on public opinion is carried out, so that the public opinion seen by managers is more effective and mor...
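As an illustration of the vector-space feature analysis mentioned in [0052], the following is a minimal sketch that assumes a plain bag-of-words term-frequency representation compared by cosine similarity. The source does not state which features or similarity measure the system uses, so the function names `term_vector` and `cosine_similarity` and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Map a text to a term-frequency vector (assumed bag-of-words features)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(u, v):
    """Cosine of the angle between two term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical captured page texts, for illustration only.
    page_a = term_vector("River flood damages homes in the city")
    page_b = term_vector("Flood warning issued for the river city")
    page_c = term_vector("Phone maker announces new release date")
    print(cosine_similarity(page_a, page_b))
    print(cosine_similarity(page_a, page_c))
```

Texts whose vectors point in similar directions score close to 1 and can be grouped together; texts with no shared terms score 0.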

Abstract

A training sample generation method, a text data classification method, a public opinion event classification method, and related equipment are disclosed. In the training sample generation method provided by an embodiment of the present application, the text data is first clustered. Because the text data has been clustered, once the clustering result corresponding to a target category is found, training samples for the target category can be obtained simply by selecting, from that clustering result, the text data that meets the target category conditions and labeling it with the target category; there is no need to analyze whether the text data in other clusters meets the target category conditions. The selection range of text data is therefore greatly narrowed, annotation efficiency and sample accuracy are improved, and the time spent annotating text data is shortened. This in turn improves the efficiency and accuracy of text data classification and public opinion event classification.
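The cluster-then-label flow described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not name a clustering algorithm or define the target category conditions, so the sketch substitutes a greedy single-pass clustering over Jaccard similarity and a simple keyword match. All function names, the `threshold` parameter, and the sample texts are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_texts(texts, threshold=0.2):
    """Greedy single-pass clustering (a stand-in for the unspecified method).

    Each text joins the cluster whose first member is most similar to it,
    provided the similarity reaches `threshold`; otherwise it starts a new
    cluster. Clusters are lists of indices into `texts`.
    """
    tokens = [tokenize(t) for t in texts]
    clusters = []
    for i, tok in enumerate(tokens):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(tok, tokens[cluster[0]])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def generate_training_samples(texts, category, keywords, threshold=0.2):
    """Label only texts from the cluster that corresponds to `category`."""
    kw = {k.lower() for k in keywords}
    clusters = cluster_texts(texts, threshold)
    if not clusters:
        return []
    # Assumed rule: the matching cluster is the one with the most keyword hits.
    target = max(
        clusters,
        key=lambda c: sum(len(tokenize(texts[i]) & kw) for i in c),
    )
    # Texts in other clusters are never checked against the condition.
    return [(texts[i], category) for i in target if tokenize(texts[i]) & kw]


if __name__ == "__main__":
    # Hypothetical texts, for illustration only.
    sample_texts = [
        "flood warning issued for the river city",
        "river flood damages homes in the city",
        "new phone release draws long queues",
        "phone maker announces new release date",
        "the river city opens a new bridge",
    ]
    for text, label in generate_training_samples(sample_texts, "flood", ["flood"]):
        print(label, "|", text)
```

The point of the design is the narrowing step: the category condition is evaluated only inside the one cluster that matches the target category, so the number of texts a labeler has to inspect shrinks from the whole corpus to a single cluster.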

Description

Technical field

[0001] The present application relates to the field of Internet technology, and more specifically, to methods for training sample generation, text data classification, and public opinion event classification, and related equipment.

Background

[0002] In recent years, with the rapid development of the Internet, network media has become a new form of information dissemination. Netizens are more active in expressing their views than ever before. Whether an event is a major domestic or international one, online public opinion can form immediately. If the content of public opinion is not effectively monitored and managed, it is likely to cause negative social effects.

[0003] The core of public opinion monitoring is to capture the information of concern from the complicated information on the Internet according to certain rules and methods, classify the captured information, and use the classified information to identify the information represented by this type of information....

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/9535, G06F17/27
CPC: G06F40/289, G06F40/30
Inventor: 袁恺村
Owner: TENCENT TECH (SHENZHEN) CO LTD