A Chinese typo correction method and system based on word segmentation enhancement
A Chinese typo correction method built on word segmentation technology, applicable to neural learning methods, semantic analysis, electrical digital data processing, and related fields. It addresses the problem that word segmentation tools cannot predict correct word segmentation results, and helps ensure the correctness of the result.
Examples
Embodiment 1
[0053] As shown in the flowchart of Figure 1, the method for correcting Chinese typos based on word segmentation enhancement provided by the disclosed embodiments of the present invention includes:
[0054] S1. Obtain the original text containing Chinese typos;
[0055] S2. Use the first text encoding module in the word segmentation module to obtain the first hidden state of the original text, and predict the word segmentation result of the target text from that first hidden state;
[0056] The word segmentation module includes a first text encoding module and a word segmentation network module, and the first text encoding module includes a first embedding layer and an encoder;
[0057] Obtain the character sequence, segment sequence, and position sequence corresponding to the original text;
[0058] Calculate the first embedding vector by using the first embedding layer according to the character sequence, segmen...
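The three input sequences named in [0057] can be illustrated with a minimal sketch. It assumes a toy vocabulary, a single-segment input, and zero-based positions; the function name `build_input_sequences`, the vocabulary, and the sample sentence are hypothetical and do not come from the patent text.

```python
from typing import Dict, List, Tuple


def build_input_sequences(
    text: str, vocab: Dict[str, int], unk_id: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Return (character ids, segment ids, position ids) for one text.

    Assumptions (not stated in the source): unknown characters map to
    unk_id, the whole text is one segment, and positions start at 0.
    """
    char_ids = [vocab.get(ch, unk_id) for ch in text]
    segment_ids = [0] * len(text)          # single-sentence input: one segment
    position_ids = list(range(len(text)))  # 0, 1, ..., n-1
    return char_ids, segment_ids, position_ids


# Hypothetical toy vocabulary; a real system would use the encoder's own.
toy_vocab = {"[UNK]": 0, "我": 1, "喜": 2, "欢": 3, "平": 4, "果": 5}

# "平果" is a typo for "苹果" (apple) in this made-up example.
chars, segments, positions = build_input_sequences("我喜欢平果", toy_vocab)
print(chars, segments, positions)
```

All three sequences have the same length as the original text, so an embedding layer can look each of them up position by position.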
Embodiment 2
[0080] As shown in Figure 2, another method for correcting Chinese typos based on word segmentation enhancement provided by the disclosed embodiments of the present invention includes:
[0081] S1. Obtain the original text containing Chinese typos;
[0082] Here, the original text is a sequence of n characters, where n is the length of the original text and the characters in the original text are indexed by i∈{1,2,…,n}.
[0083] S2. Use the first text encoding module in the word segmentation module to obtain the first hidden state of the original text, and predict the word segmentation result of the target text from that first hidden state;
[0084] The word segmentation module includes a first text encoding module and a word segmentation network module; the first text encoding module is a BERT module, which includes a first embedding layer and an encoder; the encoder is a BERT model.
[0085] According to the input requirements of the first embedding layer of the BERT module, t...
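To make the idea of a predicted word segmentation result concrete, the sketch below turns one label per character into word spans. The B/M/E/S label scheme (begin, middle, end, single) is an assumption: the text shown here does not name a label scheme, and the function `decode_bmes` and the sample labels are hypothetical.

```python
from typing import List


def decode_bmes(chars: str, tags: List[str]) -> List[str]:
    """Group characters into words from per-character B/M/E/S tags.

    Assumed scheme (not from the source): B = word begin, M = word middle,
    E = word end, S = single-character word.
    """
    if len(chars) != len(tags):
        raise ValueError("one tag per character is required")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            words.append(current)  # flush a word that never saw its E tag
            current = ""
        current += ch
        if tag in ("E", "S"):
            words.append(current)  # word is complete
            current = ""
    if current:
        words.append(current)      # trailing unfinished word
    return words


# Hypothetical labels for a 5-character input whose last two characters
# form one word in the target text.
spans = decode_bmes("我喜欢平果", ["S", "B", "E", "B", "E"])
print(spans)
```

Because the labels describe the target text, the last span marks the two characters that the correction step would treat as a single word.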
Embodiment 3
[0144] As shown in Figure 3, this exemplary embodiment also provides a Chinese typo correction system 100 based on word segmentation enhancement, which includes a word segmentation module 110 and a correction module 120. The word segmentation module 110 predicts the word segmentation result of the target text from the original text; the correction module 120 corrects the original text according to the word segmentation result and outputs the target text.
[0145] In this exemplary embodiment, the word segmentation module 110 includes:
[0146] The first text encoding module 111, which includes a first embedding layer and an encoder; the first embedding layer is used to obtain a first text embedding vector; the encoder is used to obtain the first hidden state of the original text according to the first text embedding vector;
[0147] The word segmentation network module 112 is used to predict the word segmentation result of the target text through the fully conn...
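The two-module structure of system 100 can be sketched as a small pipeline. This is only an illustration of the data flow described in [0144]; the class `TypoCorrectionSystem`, the stub functions, and the lookup table are hypothetical stand-ins, not the neural modules of the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces: a segmenter maps the original text to predicted
# words; a corrector maps (original text, predicted words) to target text.
Segmenter = Callable[[str], List[str]]
Corrector = Callable[[str, List[str]], str]


@dataclass
class TypoCorrectionSystem:
    segment: Segmenter  # stands in for word segmentation module 110
    correct: Corrector  # stands in for correction module 120

    def run(self, original_text: str) -> str:
        segmentation = self.segment(original_text)
        return self.correct(original_text, segmentation)


def toy_segment(text: str) -> List[str]:
    # Stub: a fixed segmentation for the sample sentence only.
    return ["我", "喜欢", "平果"] if text == "我喜欢平果" else list(text)


TOY_FIXES: Dict[str, str] = {"平果": "苹果"}  # made-up confusion table


def toy_correct(text: str, segmentation: List[str]) -> str:
    # Stub: fix each predicted word through a lookup table.
    return "".join(TOY_FIXES.get(word, word) for word in segmentation)


system = TypoCorrectionSystem(segment=toy_segment, correct=toy_correct)
result = system.run("我喜欢平果")
print(result)
```

Passing the segmentation into the corrector mirrors the described design, in which correction is conditioned on the predicted word boundaries rather than on the raw characters alone.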