
Speech recognition test data generation method based on context in user comments

A technology for speech recognition testing and test data generation, applicable to speech recognition, natural language data processing, speech analysis, and related fields. It addresses problems such as unrealistic test data, with the effects of improving test efficiency and reducing workload.

Pending Publication Date: 2020-12-18
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Cites: 0 | Cited by: 3

AI Technical Summary

Problems solved by technology

[0004] In view of the deficiencies of the prior art, the object of the present invention is to provide a method for generating speech recognition test data based on the context in user comments, so as to overcome the shortage of existing test data and the unrealistic test data generated by existing methods. The invention makes full use of the user comments of speech recognition-related apps, mines the contextual factors in speech recognition and the relationships between them, and extracts the values of the contextual factors from the data set to build a test context classification tree model, which guides the generation of speech test data. The method has research significance and application value for the quality assurance of speech recognition software undergoing frequent version evolution.




Embodiment Construction

[0029] In order to facilitate the understanding of those skilled in the art, the present invention will be further described below with reference to the embodiments and the accompanying drawings, and the contents mentioned in the embodiments are not intended to limit the present invention.

[0030] As shown in Figure 1 and Figure 2, the method of the present invention for generating speech recognition test data based on the context in user comments includes the following steps:

[0031] 1) Extraction of contextual factors: collect the comment text of speech recognition apps (including dedicated speech recognition apps and apps with speech recognition functions, such as speech translation) from a mobile application store (Google Play Store in the example, though other application stores may also be used); perform preprocessing and keyword extraction on the comment text, and screen the keywords to construct context factors; according to the co...
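The preprocessing and keyword screening in step 1 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the method disclosed in the patent: the stop-word list, the sample reviews, the helper names `preprocess` and `extract_keywords`, and the choice of document frequency as the screening criterion are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {
    "the", "a", "an", "is", "it", "in", "my", "to", "and", "when", "i",
    "of", "with", "not", "does", "this", "at", "all", "very", "app", "works",
}


def preprocess(review: str) -> list[str]:
    """Lowercase the review, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def extract_keywords(reviews: list[str], min_reviews: int = 2) -> list[str]:
    """Return tokens found in at least `min_reviews` distinct reviews.

    Results are ordered by descending review count, then alphabetically.
    This frequency screen is an assumed stand-in for the patent's screening.
    """
    doc_freq = Counter()
    for review in reviews:
        # Count each token once per review (document frequency).
        doc_freq.update(set(preprocess(review)))
    kept = [(word, n) for word, n in doc_freq.items() if n >= min_reviews]
    return [word for word, _ in sorted(kept, key=lambda p: (-p[1], p[0]))]


# Made-up reviews, for illustration only.
reviews = [
    "It fails when there is background noise in the car",
    "Does not understand my accent at all",
    "Noise in the street makes it useless",
    "Works with my accent but not with noise",
]
keywords = extract_keywords(reviews)
print(keywords)
```

With these sample reviews, the surviving keywords are candidate context factors (here, noise and accent); a real pipeline would likely need stemming, phrase detection, and manual screening on top of this.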


Abstract

The invention discloses a speech recognition test data generation method based on context in user comments. The method comprises the following steps: extracting the context factors of speech recognition, together with the priority and semantic association relationships among the factors, from the user comments of a plurality of speech recognition apps; and extracting the value domains of the context factors according to a selected data set to construct a context classification tree model. The model can modify original test data within the value domains according to specific test requirements to generate more usable test data. The method addresses the severe shortage of test data caused by the high update frequency of intelligent systems. Because it uses the value domains of the context factors from a real data set, it provides a way to rapidly generate realistic test data for testing speech recognition systems, which effectively increases speech recognition test efficiency. It also helps developers understand the performance of the speech recognition system more easily, so that the system can be upgraded and updated in a more targeted manner.
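As a rough illustration of how a context classification tree could drive test data generation, the sketch below treats each context factor as a classification with a value domain and enumerates combinations, optionally narrowed by test requirements. The factor names, their values, and the function `generate_contexts` are hypothetical; the abstract does not list the actual factors, and the patent's handling of priorities and semantic associations is not modeled here.

```python
from itertools import product

# Hypothetical context factors and value domains. The source derives these
# from user comments and a selected data set but does not list them here.
context_tree = {
    "noise": ["none", "street", "car"],
    "accent": ["native", "non_native"],
    "speed": ["slow", "normal", "fast"],
}


def generate_contexts(tree, requirements=None):
    """Enumerate test contexts, choosing one value per factor.

    `requirements` optionally maps a factor name to the subset of its
    value domain that the tester wants to cover.
    """
    requirements = requirements or {}
    domains = []
    for factor, values in tree.items():
        allowed = requirements.get(factor, values)
        # Reject requested values that fall outside the factor's domain.
        unknown = set(allowed) - set(values)
        if unknown:
            raise ValueError(f"{factor}: values outside domain: {sorted(unknown)}")
        domains.append([(factor, value) for value in allowed])
    return [dict(combo) for combo in product(*domains)]


all_contexts = generate_contexts(context_tree)
noisy_fast = generate_contexts(
    context_tree, {"noise": ["street", "car"], "speed": ["fast"]}
)
print(len(all_contexts), len(noisy_fast))
```

Each resulting dictionary describes one context in which an original utterance would be modified (for example, by mixing in street noise). Full enumeration grows multiplicatively with the number of factors, so a practical tool would probably prune combinations using the factor priorities the abstract mentions.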

Description

Technical field

[0001] The invention belongs to the technical field of intelligent software testing, and in particular relates to a method for generating test data for speech recognition based on context in user comments.

Background art

[0002] With the development of artificial intelligence and big data technology, more and more products based on artificial intelligence have emerged in the market, such as face recognition, speech recognition, and machine translation. Intelligent software adds intelligent functional attributes on top of traditional software, which brings many new problems and difficulties to its testing and creates greater market and research demand for intelligent software testing. Because intelligent software currently iterates rapidly, it generally suffers from problems such as insufficient test data and insufficient reliability, and it is difficult to meet the test requirements. Meanwhile, the test ...


Application Information

IPC(8): G10L15/01, G06F40/216, G06F40/289, G06F40/30, G06N3/04, G06K9/62
CPC: G10L15/01, G06F40/216, G06F40/289, G06F40/30, G06N3/045, G06F18/24323
Inventors: 陶传奇, 曹冬玉, 黄志球
Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS