
Phonetic notation method for input method material, method for generating evaluation corpus, and electronic device

This input method corpus technology, applied in the field of input methods, addresses problems such as an unreasonable word segmentation mechanism, loss of corpus entries, and the lack of phonetic annotation for the corpus. It reduces the workload of manual review and improves the accuracy of phonetic annotation.

Active Publication Date: 2019-01-15
BAIDU INT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

At present, manual collection of evaluation corpora is inefficient, and evaluation corpora generated by general automated methods have at least the following problems. First, the word segmentation mechanism is unreasonable, so most of the corpus actually entered by users is lost; the acquired corpus is therefore unsuitable and distorts the evaluation results of the input method. Second, there is no mature phonetic annotation tool that can annotate the corpus accurately.



Examples


Embodiment

[0073] Referring to Figure 4, another embodiment of the method for generating an input method evaluation corpus according to the present invention includes:

[0074] Step S401: Capture content of a predetermined field or type on the network as historical input content.

[0075] Content of a predetermined field or type on the network is crawled and treated as the user's historical input content. For example, content from sections such as "technology", "tourism", and "sports" on a portal website can be captured as historical input content. Logs or signatures that users have entered on social networking sites such as Twitter, Facebook, or microblogs can also be captured and used as the user's historical input content.

[0076] Step S402: Segment the captured historical input content into at least one corpus entry entered by the user.

[0077] The captured historical input content can be segmented for the first time according to the threshold separated by punctuation marks, and the corpus after the first segme...
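The first segmentation pass mentioned in [0077] might be sketched as follows. This is only an illustrative Python sketch that assumes the first pass simply splits the captured text at punctuation marks; the function name first_pass_segment and the punctuation set are hypothetical, and the later passes, which are cut off in the text above, are not modeled.

```python
import re

# Hypothetical punctuation set (Chinese and ASCII marks plus newlines).
# The patent excerpt does not list the marks it uses.
_PUNCTUATION = re.compile(r"[，。！？；：、,.!?;:\n]+")


def first_pass_segment(text: str) -> list[str]:
    """Split captured text at punctuation and drop empty pieces."""
    pieces = (piece.strip() for piece in _PUNCTUATION.split(text))
    return [piece for piece in pieces if piece]


segments = first_pass_segment("今天天气不错，我们去爬山吧！")
print(segments)
```

Each returned piece is a candidate corpus entry that a later step could refine further.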



Abstract

The invention discloses a phonetic annotation method for an input method corpus, a method for generating an evaluation corpus, and an electronic device. The phonetic annotation method comprises the following steps: at least two different phonetic annotation tools are each used to annotate every corpus entry, so that every corpus entry has at least two corresponding phonetic annotations; the method then judges whether the at least two phonetic annotations of each corpus entry are the same. If they are not, the phonetic annotation with the best evaluation result is selected as the correct phonetic annotation of the corpus entry; if they are, that phonetic annotation is used as the correct one. In this way, the workload of manually checking the phonetic annotations of the corpus is greatly reduced, and both the efficiency and the accuracy of phonetic annotation are improved.
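The selection logic in the abstract can be illustrated with a short Python sketch. It is not the patented implementation: the two dictionary-backed tools, the toy_score function, and the KNOWN_GOOD table are hypothetical stand-ins, since this excerpt does not say which annotation tools are used or how the "evaluation result" is computed.

```python
from typing import Callable, Sequence

Annotator = Callable[[str], str]      # corpus entry -> pinyin string
Scorer = Callable[[str, str], float]  # (entry, pinyin) -> higher is better


def annotate(entry: str, annotators: Sequence[Annotator], score: Scorer) -> str:
    """Annotate with every tool; keep the shared result or the best-scored one."""
    if len(annotators) < 2:
        raise ValueError("at least two phonetic annotation tools are required")
    candidates = [tool(entry) for tool in annotators]
    if all(candidate == candidates[0] for candidate in candidates):
        return candidates[0]  # all tools agree, so accept the annotation
    # The tools disagree: pick the candidate with the best evaluation result.
    return max(candidates, key=lambda candidate: score(entry, candidate))


# Hypothetical toy tools backed by small lookup tables (assumption).
TOOL_A = {"你好": "ni hao", "重庆": "zhong qing"}
TOOL_B = {"你好": "ni hao", "重庆": "chong qing"}


def tool_a(entry: str) -> str:
    return TOOL_A[entry]


def tool_b(entry: str) -> str:
    return TOOL_B[entry]


# Hypothetical evaluation: prefer readings found in a trusted table (assumption).
KNOWN_GOOD = {("重庆", "chong qing")}


def toy_score(entry: str, pinyin: str) -> float:
    return 1.0 if (entry, pinyin) in KNOWN_GOOD else 0.0


print(annotate("你好", [tool_a, tool_b], toy_score))
print(annotate("重庆", [tool_a, tool_b], toy_score))
```

Only the entries on which the tools disagree need any evaluation, which is where the reduction in manual review would come from.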

Description

Technical field

[0001] The invention relates to the technical field of input methods, in particular to a phonetic annotation method for an input method corpus, a method for generating an evaluation corpus, and an electronic device.

Background

[0002] An input method is the encoding method used to enter various symbols into computers or other devices (such as mobile phones). The performance of the input method directly affects input efficiency on the computer or other device. It is therefore necessary to evaluate the performance of the input method to provide a basis for its continuous improvement.

[0003] Evaluating an input method consists of performing operations such as input and word selection on the evaluation corpus, recording the position of the ideal candidate result and the number of edits needed to obtain it, and finally counting the ideal candidates in the process of multiple input and...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/27
Inventor: 景富香
Owner: BAIDU INT TECH (SHENZHEN) CO LTD