Text data enhancement method and device and electronic equipment

A text data enhancement technology, applied in electronic digital data processing, special data processing applications, instruments, etc., that can solve the problem of low semantic accuracy in text data enhancement.

Pending Publication Date: 2019-09-10
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] To solve the problem of low semantic accuracy of text data enhancement in the related art, the present invention provides a text data enhancement method and device, and electronic equipment.



Examples


Embodiment 1

[0054] The implementation environment of the present invention may be an electronic device, such as a smart phone, a tablet computer, or a desktop computer.

[0055] Figure 1 is a structural schematic diagram of a device disclosed in an embodiment of the present invention. The apparatus 100 may be the above-mentioned electronic device. As shown in Figure 1, apparatus 100 may include one or more of the following components: processing component 102, memory 104, power component 106, multimedia component 108, audio component 110, sensor component 114, and communication component 116.

[0056] The processing component 102 generally controls the overall operations of the device 100, such as operations associated with display, phone calls, data communications, camera operations, and recording operations, among others. The processing component 102 may include one or more processors 118 to execute instructions to complete all or part of the steps of the methods descr...

Embodiment 2

[0065] Referring to Figure 2, Figure 2 is a schematic flowchart of a text data enhancement method disclosed in an embodiment of the present invention. As shown in Figure 2, the text data enhancement method may include the following steps:

[0066] 201. Acquire the original text.

[0067] 202. Perform word segmentation processing on the original text to obtain several candidate words.

[0068] 203. For the target candidate word, based on the context information of the target candidate word, use a bidirectional long short-term memory (BiLSTM) network model to obtain N replacement words from the preset dictionary.

[0069] In the embodiment of the present invention, the target candidate word is any one of the above-mentioned candidate words, and the semantic label corresponding to each of the N replacement words matches the semantic label corresponding to the original text. N is a positive integer; its value is configurable and is not specifically limited.
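Steps 201 to 203 can be illustrated with a minimal Python sketch. It is not the patented implementation: `PRESET_DICTIONARY`, `segment`, `toy_score`, `top_n_replacements`, and `extend_texts` are hypothetical names, whitespace splitting stands in for real word segmentation, and `toy_score` is a placeholder for the score a trained BiLSTM would assign to a word given its context. The final substitution step follows the abstract's description of generating N extended texts from the N replacement words.

```python
from typing import Callable, Dict, List

# Hypothetical preset dictionary: word -> semantic label (illustrative data only).
PRESET_DICTIONARY: Dict[str, str] = {
    "great": "positive",
    "wonderful": "positive",
    "excellent": "positive",
    "terrible": "negative",
    "awful": "negative",
}


def segment(text: str) -> List[str]:
    """Stand-in for word segmentation (step 202): split on whitespace."""
    return text.split()


def toy_score(left: List[str], right: List[str], word: str) -> float:
    """Placeholder for the BiLSTM context score; here it is just the word length."""
    return float(len(word))


def top_n_replacements(
    words: List[str],
    index: int,
    text_label: str,
    n: int,
    score: Callable[[List[str], List[str], str], float] = toy_score,
) -> List[str]:
    """Step 203: pick up to n dictionary words whose label matches the text's label."""
    left, right = words[:index], words[index + 1:]
    scored = [
        (score(left, right, word), word)
        for word, label in PRESET_DICTIONARY.items()
        if label == text_label and word != words[index]
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:n]]


def extend_texts(words: List[str], index: int, replacements: List[str]) -> List[str]:
    """Build one extended text per replacement word by substituting at `index`."""
    return [" ".join(words[:index] + [r] + words[index + 1:]) for r in replacements]


words = segment("the movie was good")                      # steps 201-202
replacements = top_n_replacements(words, 3, "positive", 2)  # step 203 with N = 2
extended = extend_texts(words, 3, replacements)
```

Filtering on the semantic label before ranking means a replacement can never flip the category of the original text, whatever the scoring model prefers. In practice the `score` argument would wrap a trained model, and the segmenter would need to handle languages without whitespace word boundaries.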

...

Embodiment 3

[0076] Referring to Figure 3, Figure 3 is a schematic flowchart of another text data enhancement method disclosed in an embodiment of the present invention. As shown in Figure 3, the text data enhancement method may include the following steps:

[0077] 301. Acquire the original text.

[0078] 302. Perform word segmentation processing on the original text to obtain several candidate words.

[0079] 303. For the target candidate word, based on the word order information of the original text, forward-encode the context information of the target candidate word from left to right to obtain forward-encoded information.

[0080] 304. Backward-encode the context information of the target candidate word from right to left to obtain backward-encoded information.

[0081] In the embodiment of the present invention, the way of forward coding the context information of the target candidate words is mainly: forward numbering of the candidate words included in the context infor...
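Steps 303 and 304 can be sketched in Python under stated assumptions. The source text is truncated before it fully specifies the encoding, so this shows only one common reading: the words to the left of the target are read left to right, and the words to the right of the target are read right to left. The `step` function is a deliberately simplified recurrent update, not a full LSTM cell, and `encode_context`, `step`, and the toy embeddings are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def step(state: List[float], vec: List[float]) -> List[float]:
    """Simplified recurrent update (NOT an LSTM cell): tanh of the averaged state and input."""
    return [math.tanh(0.5 * s + 0.5 * x) for s, x in zip(state, vec)]


def encode_context(
    words: List[str],
    index: int,
    embeddings: Dict[str, List[float]],
    dim: int,
) -> Tuple[List[float], List[float]]:
    """Return (forward, backward) encodings of the context around words[index]."""
    zero = [0.0] * dim

    # Step 303: read the left context in word order, left to right.
    forward = list(zero)
    for word in words[:index]:
        forward = step(forward, embeddings.get(word, zero))

    # Step 304: read the right context in reverse word order, right to left.
    backward = list(zero)
    for word in reversed(words[index + 1:]):
        backward = step(backward, embeddings.get(word, zero))

    return forward, backward


# Toy two-dimensional embeddings, for illustration only.
toy_embeddings = {"a": [1.0, 0.0], "c": [0.0, 1.0]}
forward_info, backward_info = encode_context(["a", "b", "c"], 1, toy_embeddings, 2)
```

Because each update depends on the previous state, swapping two context words changes the result, which is how the encoding retains word order information. A model would typically combine the two encodings before scoring candidate replacement words.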


Abstract

The invention relates to the technical field of machine learning, and discloses a text data enhancement method and device, and electronic equipment. The method comprises the steps of: performing word segmentation processing on the original text to obtain a plurality of candidate words; for a target candidate word, based on context information of the target candidate word, obtaining N replacement words from a preset dictionary by using a bidirectional long short-term memory network model, wherein the target candidate word is any one of the plurality of candidate words, the semantic label corresponding to each of the N replacement words matches the semantic label corresponding to the original text, and N is a positive integer; and generating N first extended texts according to the N replacement words and the original text. By implementing the embodiment of the invention, the semantic accuracy of text data enhancement can be improved.

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a text data enhancement method and device, and electronic equipment.

Background

[0002] In the field of machine learning, data enhancement (data augmentation) is an important means of expanding the training set. It is often used to generate new data for training a model, making the model more accurate and more generalizable. The core requirement of data enhancement is that new data replaces the original data while still belonging to the same category as the original data. For data enhancement applied to images, this is easy to achieve. For example, if a new image is obtained by horizontally flipping the original image, randomly cropping it, or adjusting its RGB channels, the content of the new image still belongs to the same category as the original image. However, for the data enhancement technology applied...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04, G06F17/27
CPC: G06F40/205, G06F40/30, G06N3/045, G06F18/214
Inventor: 于凤英, 王健宗
Owner: PING AN TECH (SHENZHEN) CO LTD