Method and device for detecting wrongly written characters, computer storage medium and electronic equipment

A typo detection method, applied in the field of data processing, that addresses the complex process and low efficiency of existing methods for identifying typos.

Pending Publication Date: 2020-01-17
上海斑马来拉物流科技有限公司

AI Technical Summary

Problems solved by technology

[0005] Existing methods for identify

Method used


Examples

Embodiment 1

[0034] Figure 1 shows a schematic flowchart of the typo detection method in Embodiment 1 of the present application.

[0035] As shown in the figure, the typo detection method includes:

[0036] Step 101: determining the text data to be detected;

[0037] Step 102: converting the text data into pinyin data;

[0038] Step 103: generating a feature template of the pinyin data based on an ngram model;

[0039] Step 104: inputting the feature template of the pinyin data into a pre-built typo detection model, where the typo detection model is obtained by training a conditional random field (CRF) model with feature templates based on the ngram model;

[0040] Step 105: determining, according to the output of the typo detection model, whether the text data to be detected contains a typo.
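As an illustration of Step 103, the following is a minimal sketch of how ngram-style context features might be built for each position of a pinyin sequence. The function names, the window size, the feature-key format (`U[...]`, `B[...]`, `T[...]`), and the boundary markers are all assumptions made for this example; the source does not specify the layout of its feature template.

```python
from typing import Dict, List


def ngram_features(pinyin: List[str], i: int, window: int = 2) -> Dict[str, str]:
    """Build unigram, bigram, and trigram context features for position i.

    Hypothetical feature layout; the patent text does not define one.
    """
    def tok(j: int) -> str:
        # Positions outside the sequence are replaced by boundary markers.
        if j < 0:
            return "<BOS>"
        if j >= len(pinyin):
            return "<EOS>"
        return pinyin[j]

    feats: Dict[str, str] = {}
    # Unigram features for every offset in the window around position i.
    for off in range(-window, window + 1):
        feats[f"U[{off}]"] = tok(i + off)
    # Bigram features for every adjacent pair inside the window.
    for off in range(-window, window):
        feats[f"B[{off},{off + 1}]"] = tok(i + off) + "/" + tok(i + off + 1)
    # One trigram feature centered on position i.
    feats["T[-1,0,1]"] = "/".join(tok(i + k) for k in (-1, 0, 1))
    return feats


def sequence_features(pinyin: List[str]) -> List[Dict[str, str]]:
    """Apply the feature template to every position of the sequence."""
    return [ngram_features(pinyin, i) for i in range(len(pinyin))]


features = sequence_features(["wo", "men", "qu", "chi", "fan"])
```

Each position yields one feature dictionary, which is the kind of per-token input a CRF toolkit typically expects. A wider window or higher-order ngrams would follow the same pattern.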

[0041] In a specific implementation, the text data to be detected consists of Chinese characters or Chinese text. The conversion of text data into pinyin data can specifi...
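The conversion in Step 102 can be pictured with a small sketch. The lookup table below is a hypothetical toy dictionary covering a handful of characters, and the use of toneless pinyin and an `<UNK>` placeholder are assumptions for illustration only; the source does not say which dictionary or conversion tool is used, and a real converter would also have to handle polyphonic characters.

```python
from typing import Dict, List

# Hypothetical toy table (toneless pinyin); a real system would use a full dictionary.
TOY_PINYIN: Dict[str, str] = {
    "我": "wo",
    "在": "zai",
    "再": "zai",
    "见": "jian",
    "家": "jia",
}


def to_pinyin(text: str,
              table: Dict[str, str] = TOY_PINYIN,
              unknown: str = "<UNK>") -> List[str]:
    """Map each character to its pinyin, or to a placeholder if it is not in the table."""
    return [table.get(ch, unknown) for ch in text]


correct = to_pinyin("再见")  # intended word
typo = to_pinyin("在见")     # homophone typo: 在 typed instead of 再
```

Both inputs map to the same pinyin sequence, which suggests why working on pinyin can help with homophone typos: the pinyin view relates characters whose stored values are otherwise unrelated.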

Embodiment 2

[0095] Based on the same inventive concept, an embodiment of the present application provides a typo detection device. The device solves the technical problem on a principle similar to that of the typo detection method, so overlapping details are not repeated here.

[0096] Figure 2 shows a schematic structural diagram of the typo detection device in Embodiment 2 of the present application.

[0097] As shown in the figure, the typo detection device includes:

[0098] Data determining module 201, for determining the text data to be detected;

[0099] Pinyin conversion module 202, for converting the text data into pinyin data;

[0100] Template generating module 203, for generating a feature template of the pinyin data based on an ngram model;

[0101] Model detection module 204, for inputting the ngram-based feature template of the pinyin data into the pre-built typo detection model; the typo detection model is obtained according to the conditional rand...
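One way to picture how modules 201 to 204 fit together is as four pluggable stages chained in order. The class name, field names, stub implementations, and the `O` label below are hypothetical and chosen only for this sketch; the source describes the modules functionally and does not prescribe a software structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = List[Dict[str, str]]


@dataclass
class TypoDetectionDevice:
    """Hypothetical composition of the four modules as callables."""
    determine_data: Callable[[str], str]                # data determining module 201
    convert_pinyin: Callable[[str], List[str]]          # pinyin conversion module 202
    generate_template: Callable[[List[str]], Features]  # template generating module 203
    detect: Callable[[Features], List[str]]             # model detection module 204

    def run(self, raw: str) -> List[str]:
        # Each module consumes the output of the previous one.
        text = self.determine_data(raw)
        pinyin = self.convert_pinyin(text)
        feats = self.generate_template(pinyin)
        return self.detect(feats)


# Stub modules, for illustration only; none of them performs real detection.
device = TypoDetectionDevice(
    determine_data=str.strip,
    convert_pinyin=lambda text: [f"py({ch})" for ch in text],
    generate_template=lambda py: [{"U[0]": p} for p in py],
    detect=lambda feats: ["O"] * len(feats),  # "O" = no typo (assumed label)
)
labels = device.run("  再见 ")
```

Keeping each module behind a plain callable means a stub can be swapped for a real converter or a trained model without changing the rest of the chain.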

Embodiment 3

[0126] Based on the same inventive concept, an embodiment of the present application further provides a computer storage medium, which will be described below.

[0127] The computer storage medium stores a computer program thereon, and when the computer program is executed by a processor, the steps of the typo detection method as described in the first embodiment are implemented.

[0128] The computer storage medium provided in the embodiment of the present application converts the text data to be detected into pinyin, then generates a feature template of the pinyin data and inputs it into a pre-built typo detection model to determine whether the text data contains typos. The embodiment applies the CRF model to typo detection and adds a feature template based on the ngram language model, which effectively combines the characteristics of the language model and the scalability of the CRF feature function, making the process of typo detection simple an...
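To make the role of the CRF more concrete, here is a minimal sketch of how a linear-chain CRF could label a feature sequence once its weights are known, using Viterbi decoding. The label set (`O` for correct, `T` for typo), the weight dictionaries, and the example weights are invented for illustration; in the described scheme the weights would come from training, which is not shown here.

```python
from typing import Dict, List, Tuple

LABELS = ("O", "T")  # assumed tag set: O = correct character, T = typo


def viterbi(feats: List[Dict[str, str]],
            state_w: Dict[Tuple[str, str, str], float],
            trans_w: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring label sequence under a linear-chain model."""
    def emit(i: int, y: str) -> float:
        # Sum the weights of all (feature name, feature value, label) triples.
        return sum(state_w.get((k, v, y), 0.0) for k, v in feats[i].items())

    if not feats:
        return []
    score = [{y: emit(0, y) for y in LABELS}]
    back: List[Dict[str, str]] = []
    for i in range(1, len(feats)):
        row: Dict[str, float] = {}
        ptr: Dict[str, str] = {}
        for y in LABELS:
            # Best previous label for reaching label y at position i.
            prev = max(LABELS, key=lambda p: score[-1][p] + trans_w.get((p, y), 0.0))
            row[y] = score[-1][prev] + trans_w.get((prev, y), 0.0) + emit(i, y)
            ptr[y] = prev
        score.append(row)
        back.append(ptr)
    # Trace back from the best final label.
    y = max(LABELS, key=lambda label: score[-1][label])
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]


# Made-up weights: the pinyin "zai" at the current position favors the typo label.
feats = [{"U[0]": "wo"}, {"U[0]": "zai"}, {"U[0]": "jia"}]
state_w = {("U[0]", "zai", "T"): 2.0}
trans_w = {("T", "T"): -1.0}
labels = viterbi(feats, state_w, trans_w)
```

New ngram features can be added to the feature dictionaries without touching the decoder, which is one reading of the "scalability of the CRF feature function" mentioned in the text.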


Abstract

The invention discloses a wrongly written character detection method and apparatus, a computer storage medium, and an electronic device. The method comprises the steps of: determining to-be-detected text data; converting the text data into pinyin data; generating a feature template of the pinyin data based on an ngram model; inputting the feature template of the pinyin data into a pre-constructed wrongly written character detection model, wherein the wrongly written character detection model is obtained by training a conditional random field (CRF) model with a feature template based on an ngram model; and determining, according to an output result of the wrongly written character detection model, whether the text data to be detected contains wrongly written characters. By adopting the scheme of the invention, wrongly written characters can be detected simply and efficiently.

Description

Technical field

[0001] The present application relates to data processing technology, and in particular to a typo detection method and device, a computer storage medium, and electronic equipment.

Background

[0002] With the spread of smartphones and other mobile devices, people communicate largely by typing in pinyin. Accidental factors during typing, such as typing too fast, failing to find a rare character, or a slip of the hand, can introduce typos into the communication. Humans can recognize and correct typos mentally, but for machines typos can cause serious problems. In a computer, characters are stored as 0s and 1s. Different characters have different values, and these values are independent of one another, carrying none of the relationships that the characters themselves have (such as identical pronunciation or similar shape). This leads to the need for typos to be corrected when the computer performs natural language processing ...

Claims


Application Information

IPC(8): G06F40/151, G06F40/186, G06F40/232
CPC: Y02D10/00
Inventor: 龚伟松, 郭得庆
Owner: 上海斑马来拉物流科技有限公司