Text data labeling method and device, storage medium and electronic equipment

A text data labeling technique, applied in the field of data processing, that addresses the low labeling accuracy and noisy sample text data of existing approaches, with the effect of improving labeling accuracy, improving user experience, and reducing noise.

Pending Publication Date: 2020-02-18
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present disclosure is to provide a text data labeling method, a text data labeling device, an electronic device, and a computer-readable storage medium that overcome, at least to some extent, the limitations and defects of the related art, in particular the low labeling accuracy and high noise of sample text data when text data is labeled.


Embodiment Construction

[0068] Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or that other methods, components, devices, steps, etc. may be adopted. In other instances, well-known technical solution...

Abstract

The invention provides a text data labeling method and device, electronic equipment, and a storage medium, and relates to the technical field of data processing. The text data labeling method comprises the steps of: obtaining text data to be labeled, and performing conversion processing on the text data according to a pre-trained topic model to determine vector representation data corresponding to the text data; determining the similarity between the text data through the vector representation data; determining similar text data whose similarity exceeds a preset threshold, and extracting first text data and second text data of the similar text data in a preset similarity interval; and presenting the first text data and the second text data on a display interface, so that a target object labels the similar text data according to the first text data and the second text data. The method can improve the efficiency of labeling sample text data and improve the user experience.
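The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the pre-trained topic model has already produced one topic-distribution vector per text, it uses cosine similarity as the similarity measure (the abstract does not name one), and it treats the "first" and "second" text data as the two members of a similar pair. The function names, the sample texts, the vectors, the threshold of 0.5, and the interval [0.9, 1.0] are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_pairs(vectors, threshold):
    """Return (i, j, similarity) for every pair whose similarity exceeds threshold."""
    pairs = []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim > threshold:
            pairs.append((i, j, sim))
    return pairs


def in_interval(pairs, low, high):
    """Keep only the pairs whose similarity lies in the interval [low, high]."""
    return [(i, j, sim) for i, j, sim in pairs if low <= sim <= high]


# Hypothetical texts and the topic vectors a pre-trained topic model might assign.
texts = ["refund not received", "where is my refund", "app crashes on login"]
topic_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.0, 0.1, 0.9]]

candidates = similar_pairs(topic_vectors, threshold=0.5)
to_review = in_interval(candidates, low=0.9, high=1.0)

# Stand-in for the display interface: show each pair to the human labeler.
for i, j, sim in to_review:
    print(f"{texts[i]!r} <-> {texts[j]!r}  similarity={sim:.3f}")
```

In this sketch the labeler sees only pairs that pass both the threshold and the interval check, so one labeling decision can be applied to texts the topic model already considers close.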

Description

Technical field

[0001] The present disclosure relates to the technical field of data processing, and in particular, to a text data labeling method, a text data labeling device, electronic equipment, and a computer-readable storage medium.

Background

[0002] With the rapid development of artificial intelligence technology, the construction and training of learning models have attracted more and more attention. Training most learning models requires labeled sample data.

[0003] At present, most sample text data is labeled manually in batches, in combination with keyword queries or clustering algorithms. The sample text data is classified only by keyword screening, without considering the association between keywords and text topics or the ambiguity of keywords. This results in a low accuracy rate when labeling sample text data, introduces a lot of noise, and reduces learning efficiency. The t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35
CPC: G06F16/334, G06F16/35
Inventor: 李快
Owner: TENCENT TECH (SHENZHEN) CO LTD