Chinese proofreading and error-correction method and system based on Chinese word segmentation

A Chinese word segmentation and error-correction technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as poor proofreading results, the intricate interplay of characters, words, and sentences, and the difficulty of ensuring completely correct text, with the effect of improving accuracy and work efficiency.

Inactive Publication Date: 2018-10-30
北京一览群智数据科技有限责任公司

AI Technical Summary

Problems solved by technology

At present, none of the four input methods (traditional coding input, optical scanning input, intelligent voice input, and intelligent handwriting input) can ensure that the text entered into the computer is completely correct.
The traditional method of language proofreading is manual text proofreading, which requires a lot of manpower, material, and financial resources.
Although foreign text proofreading has achieved certain results in English ...

Examples

Detailed Description of the Embodiments

[0028] Preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0029] The present invention provides a Chinese proofreading and error-correction method based on Chinese word segmentation, which includes the following steps: performing Chinese word segmentation on the input Chinese text sentence by sentence to obtain a word array, where the word array includes single-character, two-character, three-character, or four-character words; recombining the word array to form short sentences; judging whether the number of occurrences of a short sentence in the preset text library is greater than the first threshold and, if so, marking the short sentence as correct; if not, performing glyph and Pinyin error-correction processing on the short sentence. In this way, by performing Chinese word segmentation and retrieval and matching on Chinese texts, it is possible to identify and judge text errors contained in the te...
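The segmentation, recombination, and threshold check described in [0029] can be sketched roughly as follows. This is only an illustration under stated assumptions: the text available here does not name a segmentation algorithm or a recombination rule, so the sketch assumes forward maximum matching against a toy lexicon and joins adjacent word pairs. `LEXICON`, `CORPUS`, and every function name are hypothetical.

```python
# Hypothetical toy data; a real system would use a large lexicon and text library.
LEXICON = {"我们", "今天", "去", "公园", "散步", "天气", "很", "好"}
CORPUS = ["我们今天去公园散步", "今天天气很好", "我们去公园", "今天去公园散步"]
MAX_WORD_LEN = 4  # the source limits words to one to four characters


def segment(sentence, lexicon=LEXICON, max_len=MAX_WORD_LEN):
    """Forward maximum matching (an assumed algorithm, not named in the source)."""
    words, i = [], 0
    while i < len(sentence):
        # Try the longest window first; fall back to a single character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def recombine(words, n=2):
    """Join each run of n adjacent words into a short phrase (assumed rule)."""
    if len(words) <= n:
        return ["".join(words)]
    return ["".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def count_in_corpus(phrase, corpus=CORPUS):
    """Count occurrences of the phrase across the preset text library."""
    return sum(text.count(phrase) for text in corpus)


def check_sentence(sentence, threshold=0):
    """Return (phrase, count, is_correct) for each recombined phrase."""
    results = []
    for phrase in recombine(segment(sentence)):
        n = count_in_corpus(phrase)
        # A phrase is marked correct only if its count exceeds the threshold.
        results.append((phrase, n, n > threshold))
    return results
```

Calling `check_sentence("我们今天去公圆散步")` with this toy data flags the phrases that contain the wrong character 圆, because they never occur in the text library. Those flagged phrases are what would be handed to the glyph and Pinyin correction step.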

Abstract

The invention provides a Chinese proofreading and error-correction method and system based on Chinese word segmentation. The method comprises the steps of: carrying out Chinese word segmentation on input Chinese text according to single sentences to obtain word arrays, wherein the word arrays comprise single-character words, two-character words, three-character words or four-character words; recombining the word arrays to form short sentences; judging whether occurrence frequency of a short sentence in a preset text library is greater than a first threshold, and if yes, marking the short sentence as correct; and if not, carrying out glyph and Pinyin error-correction processing on the short sentence. The method realizes automatic proofreading and error correction of wrongly written characters in the text, and improves accuracy and work efficiency of Chinese proofreading and error correction.
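The glyph and Pinyin error-correction step is not detailed in the text available here. One plausible reading, offered only as a sketch, is to replace each character of a flagged short sentence with characters that sound the same or look similar, and to keep the replacement that occurs most often in the text library. The confusion tables, the phrase counts, and the function names below are all hypothetical.

```python
# Hypothetical confusion tables; real ones would come from a Pinyin dictionary
# and a list of visually similar characters.
SAME_PINYIN = {"圆": ["园", "元", "原"], "在": ["再"], "再": ["在"]}
SIMILAR_GLYPH = {"圆": ["园", "图"], "末": ["未"], "未": ["末"]}
# Hypothetical occurrence counts standing in for the preset text library.
PHRASE_COUNTS = {"公园": 120, "公元": 15, "再见": 80}


def candidates(phrase):
    """Yield phrases that differ from `phrase` by one confusable character."""
    seen = set()
    for i, ch in enumerate(phrase):
        for alt in SAME_PINYIN.get(ch, []) + SIMILAR_GLYPH.get(ch, []):
            cand = phrase[:i] + alt + phrase[i + 1:]
            if cand not in seen:
                seen.add(cand)
                yield cand


def correct(phrase, counts=PHRASE_COUNTS, threshold=10):
    """Return the most frequent candidate above the threshold, or None."""
    best = max(candidates(phrase), key=lambda c: counts.get(c, 0), default=None)
    if best is not None and counts.get(best, 0) > threshold:
        return best
    return None
```

With these toy tables, `correct("公圆")` proposes 公园. Reusing the same kind of frequency threshold for accepting a candidate is a design assumption made here, not something the source states.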

Description

Technical Field

[0001] The invention relates to the technical field of text correction, and in particular to a Chinese proofreading and error-correction method and system based on Chinese word segmentation.

Background

[0002] Chinese text information enters the computer mainly in four ways: traditional coding input, optical scanning input, intelligent voice input, and intelligent handwriting input. At present, none of these four input methods can ensure that the text entering the computer is completely correct. The traditional method of language proofreading is manual text proofreading, which requires a lot of manpower, material, and financial resources. Although foreign text proofreading has achieved certain results in English spelling proofreading, and some of the results have been commercialized, due to the complexity of the Chinese language structure and the diversity of word collocations, combined with the context, characters, words, and sentences have chan...

Application Information

IPC(8): G06F17/27
CPC: G06F40/211, G06F40/289
Inventors: 窦志成, 曾泽群, 谢峰
Owner: 北京一览群智数据科技有限责任公司