
Dialect sample data extraction method, device and equipment and storage medium

A dialect sample data extraction technology, applied in the field of data processing. It addresses the problem that a dialogue robot cannot recognize dialects and therefore cannot accurately answer the user.

Active Publication Date: 2020-06-16
XIAMEN KUAISHANGTONG TECH CORP LTD
Cites: 5; Cited by: 0

AI Technical Summary

Problems solved by technology

Training sample data lacks dialect data, so the dialogue robot cannot recognize the dialect and therefore cannot accurately answer the user's question.



Examples


Detailed Description of the Embodiments

[0043] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0044] An embodiment of the present invention provides a method for extracting dialect sample data. The method is applied to electronic equipment, including, but not limited to, medical aesthetics robots, terminals, and other electronic devices. The electronic device acquires the first dialect of each of multiple dialect areas and the city data corresponding to each dialect area, where one dialect area corresponds to one city; classifies the dialect areas that share the same first dialect into the same dialect group, obtaining multiple dialect groups; sorts each dialect group according to the city data corresponding to each dialect area, and determines the target dialect area of each dialect group from the sorted group; and obtains, for each dialect group, the medical aesthetics dialogue data o...
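The grouping and sorting steps in this paragraph can be sketched as follows. This is a minimal illustration, not the patented implementation. The source does not say what the city data contains or how the target area is chosen from the sorted group, so the single numeric `city_score` field, the descending sort order, and the "take the first area after sorting" rule are all assumptions made for the example. The class and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DialectArea:
    city: str            # one dialect area corresponds to one city
    first_dialect: str   # the area's "first dialect" label
    city_score: float    # ASSUMPTION: city data reduced to one sortable number


def group_by_first_dialect(areas):
    """Put dialect areas that share a first dialect into the same group."""
    groups = defaultdict(list)
    for area in areas:
        groups[area.first_dialect].append(area)
    return dict(groups)


def pick_target_areas(groups):
    """Sort each group by city data and take the top-ranked area as target.

    ASSUMPTION: descending order and "first after sorting" are illustrative;
    the source only says the target is determined from the sorted group.
    """
    targets = {}
    for dialect, members in groups.items():
        ranked = sorted(members, key=lambda a: a.city_score, reverse=True)
        targets[dialect] = ranked[0]
    return targets


# Placeholder data; city and dialect names are invented for the example.
areas = [
    DialectArea("city_a", "dialect_x", 3.0),
    DialectArea("city_b", "dialect_x", 8.0),
    DialectArea("city_c", "dialect_y", 5.0),
]
targets = pick_target_areas(group_by_first_dialect(areas))
target_cities = {dialect: area.city for dialect, area in targets.items()}
print(target_cities)
```

Keeping grouping and target selection as separate functions makes it easy to swap in a different ranking rule once the actual city data fields are known.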



Abstract

The invention discloses a dialect sample data extraction method. The method comprises the following steps: obtaining the first dialect of each of a plurality of dialect regions and the city data corresponding to each dialect region, wherein one dialect region corresponds to one city; classifying the dialect regions with the same first dialect into the same dialect group, thereby obtaining a plurality of dialect groups; sorting each dialect group according to the city data corresponding to each dialect region, and determining a target dialect region of each dialect group from the sorted dialect group; acquiring the medical aesthetics dialogue data of the city corresponding to the target dialect region of each dialect group; and taking the medical aesthetics dialogue data obtained for each dialect group as dialect sample data. In this way, the data selected for machine learning can in theory cover all official dialect areas, which enhances the generalization ability of the model.
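The last two steps of the abstract (acquiring the dialogue data of each target city and using it as dialect sample data) might look like the sketch below. It assumes the target city of each dialect group has already been determined and that dialogue data is available as an in-memory mapping from city to a list of utterances. Both the data layout and the function name `build_dialect_samples` are hypothetical, and the strings are placeholders rather than real dialogue data.

```python
def build_dialect_samples(target_city_by_group, dialogues_by_city):
    """Map each dialect group to the dialogue data of its target city.

    ASSUMPTION: dialogues are stored as plain lists keyed by city name.
    """
    samples = {}
    for group, city in target_city_by_group.items():
        # A target city with no recorded dialogues yields an empty list.
        samples[group] = list(dialogues_by_city.get(city, []))
    return samples


# Placeholder inputs; names and utterances are invented for the example.
target_city_by_group = {"dialect_x": "city_b", "dialect_y": "city_c"}
dialogues_by_city = {
    "city_a": ["utterance a1"],
    "city_b": ["utterance b1", "utterance b2"],
    "city_c": ["utterance c1"],
}
samples = build_dialect_samples(target_city_by_group, dialogues_by_city)
print(samples)
```

Only the target cities contribute data, so each dialect group is represented once in the sample set regardless of how many cities belong to it.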

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a dialect sample data extraction method, device, equipment and storage medium.

Background

[0002] In natural language processing and task-oriented dialogue robots, dialects are often a headache: China has roughly 7 official dialects and hundreds of dialects overall. Mainstream natural language processing handles Chinese; in non-official areas, however, errors may occur in task recognition, such as intent recognition. For example, people in most places say "how much money" when asking about a price, while people in some dialect areas use a different, dialect-specific expression for the same question. Because the training sample data lacks such dialect data, the dialogue robot cannot recognize the dialect and therefore cannot accurately answer the user.

Summary of the invention

[0003] The present inv...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33
CPC: G06F16/3344
Inventor: 陈鑫, 肖龙源, 蔡振华, 李稀敏, 刘晓葳, 谭玉坤
Owner: XIAMEN KUAISHANGTONG TECH CORP LTD