
Sample data processing method, sample data processing device and electronic equipment

A sample data processing method, applied in the field of data processing, that addresses problems such as sample data imbalance, training failure, and large differences in the number of training corpora.

Pending Publication Date: 2020-05-26
UBTECH ROBOTICS CORP LTD

AI Technical Summary

Problems solved by technology

At present, when an intent recognition model for smart devices is trained, the number of training corpora often differs greatly between intent categories. For example, some common intent categories have hundreds or even thousands of training corpora, while some uncommon intent categories have only a few; the difference can reach hundreds or even thousands of times. Such imbalanced sample data may cause training to fail.



Examples


Embodiment 1

[0032] A sample data processing method provided by an embodiment of the present application is described below. Referring to Figure 1, the sample data processing method in the embodiment of the present application includes:

[0033] Step 101, obtaining all sample data for training a preset intent recognition model;

[0034] In the embodiment of the present application, the electronic device may first obtain all sample data used for training the intent recognition model during a preset training phase of the intent recognition model. Optionally, the sample data may be set by corpus personnel; alternatively, after the corpus personnel have set several pieces of sample data, a developer may trigger the electronic device to generate further sample data based on the pieces already set. The source of the sample data is not limited here. Specifically, the above-mentioned sample data is in the form of a template corpus, and the above-mentioned tem...

Embodiment 2

[0091] Embodiment 2 of the present application provides a sample data processing device, which can be integrated into electronic equipment. As shown in Figure 3, the sample data processing device 300 in the embodiment of the present application includes:

[0092] An acquisition unit 301, configured to acquire all sample data for training a preset intent recognition model;

[0093] A category determination unit 302, configured to determine the intent category and language model category to which each sample data belongs according to the intent label and language model label of each sample data, wherein the above language model categories include positive samples and negative samples;

[0094] A statistical unit 303, configured to count the number of sample data under each intent category, count the number of sample data under each language model category, and count the total number of all sample data;

[0095] The first calculation unit 304 is configured to calculate t...
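The category determination and statistics units listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the `Sample` record, its field names, the label values `"positive"` and `"negative"`, and the example intents are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical record layout; the source only says that each sample
    # carries an intent label and a language model label.
    text: str
    intent_label: str
    lm_label: str  # assumed values: "positive" or "negative"


def count_samples(samples):
    """Return (per-intent counts, per-language-model counts, total count)."""
    intent_counts = Counter(s.intent_label for s in samples)
    lm_counts = Counter(s.lm_label for s in samples)
    return intent_counts, lm_counts, len(samples)


# Toy data; the intent names and the "none" intent for the negative
# sample are made up for illustration.
samples = [
    Sample("play some music", "play_music", "positive"),
    Sample("play a song", "play_music", "positive"),
    Sample("put on some jazz", "play_music", "positive"),
    Sample("what is the weather", "query_weather", "positive"),
    Sample("asdf qwer", "none", "negative"),
]
intent_counts, lm_counts, total = count_samples(samples)
```

Here the label of each sample directly determines its category, so category determination and counting collapse into two `Counter` passes over the data.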

Embodiment 3

[0123] Embodiment 3 of the present application provides an electronic device. Referring to Figure 4, the electronic device 4 in the embodiment of the present application includes: a memory 401, one or more processors 402 (only one is shown in Figure 4), and a computer program stored in the memory 401 and executable on the processor 402. The memory 401 is used to store software programs and modules, and the processor 402 executes various functional applications and data processing by running the software programs and units stored in the memory 401, so as to obtain resources corresponding to the above preset events. Specifically, the processor 402 implements the following steps by running the above-mentioned computer program stored in the memory 401:

[0124] Obtain all sample data used to train the preset intent recognition model;

[0125] Determining the intent category and language model category to which each sample data belongs according to the intent label and l...



Abstract

The invention discloses a sample data processing method and device, electronic equipment and a computer readable storage medium. The method comprises the steps of obtaining all sample data used for training a preset intention recognition model; determining an intention category and a language model category to which each piece of sample data belongs according to the intention label and the language model label of each piece of sample data; counting the quantity of the sample data under each intention category, the quantity of the sample data under each language model category and the total quantity of all the sample data, and calculating the intention weight of each intention category and the language model weight of each language model category based on the quantity of the sample data under each intention category; and determining a loss function of the intention recognition model based on the intention weight of each intention category and the language model weight of each language model category, and training the intention recognition model according to the loss function. By means of the scheme, the influence difference of large-data-volume sample data and small-data-volume sample data on the intention recognition model can be reduced, and training effectiveness is guaranteed.
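The weighting and loss steps in the abstract can be sketched roughly as follows. The text available here does not give the weight formula or the exact form of the loss, so this example assumes a common inverse-frequency weighting, total / (number of categories × category count), and a weighted cross-entropy that simply sums an intent term and a language model term. All function names, the formula, and the way the two terms are combined are assumptions for illustration only.

```python
import math


def inverse_frequency_weights(counts, total):
    """Assumed formula: weight = total / (num_categories * count)."""
    k = len(counts)
    return {cat: total / (k * n) for cat, n in counts.items()}


def weighted_cross_entropy(probs, true_label, weights):
    """Weighted negative log-likelihood for one sample.

    probs maps each category to its predicted probability.
    """
    eps = 1e-12  # guards against log(0)
    return -weights[true_label] * math.log(max(probs[true_label], eps))


def sample_loss(intent_probs, intent_label, intent_weights,
                lm_probs, lm_label, lm_weights):
    # Assumed combination: plain sum of the two weighted terms.
    return (weighted_cross_entropy(intent_probs, intent_label, intent_weights)
            + weighted_cross_entropy(lm_probs, lm_label, lm_weights))


# Made-up counts showing a 90x imbalance between intent categories.
intent_counts = {"play_music": 900, "set_alarm": 90, "book_taxi": 10}
lm_counts = {"positive": 800, "negative": 200}
total = 1000

intent_weights = inverse_frequency_weights(intent_counts, total)
lm_weights = inverse_frequency_weights(lm_counts, total)
```

Under this assumed weighting, a category with few samples receives a larger weight, so each of its samples contributes more to the loss. That is one way to reduce the gap in influence between large-volume and small-volume categories that the abstract describes.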

Description

Technical field

[0001] The present application belongs to the technical field of data processing, and in particular relates to a sample data processing method, a sample data processing device, electronic equipment, and a computer-readable storage medium.

Background

[0002] More and more smart devices now have human-computer interaction functions. These functions depend on the smart device first understanding the user's intent. Therefore, the ability of the smart device to recognize intent affects, to a certain extent, the quality of its human-computer interaction. At present, when an intent recognition model for smart devices is trained, the number of training corpora often differs greatly between intent categories. For example, some common intent categories have hundreds or even thousands of training corpora, and som...


Application Information

IPC(8): G06F16/332, G06F16/33, G06F40/279, G06F40/30
CPC: G06F16/3329, G06F16/3344
Inventor: 黄日星, 熊友军
Owner: UBTECH ROBOTICS CORP LTD