A feature-aligned Chinese word segmentation method
A feature-aligned Chinese word segmentation technique, applied in the fields of instruments, computing, and electrical digital data processing. It addresses problems such as parameter growth, avoids over-fitting, and alleviates differences in feature distribution.
Examples
Embodiment 1
[0055] To further illustrate the solution of the present invention, the technical solution is described in detail below, taking as an example the labeled and unlabeled data of the PKU text in the commonly used Chinese word segmentation corpus SIGHAN-2005. Figure 1 is a flowchart of the feature-aligned Chinese word segmentation method provided in this embodiment:
[0056] Step 1: Extract the bigrams formed by adjacent characters from the labeled PKU data and from the unlabeled data, respectively, and count how many times each bigram appears in the text. A bigram that occurs only once is removed; a bigram that contains a punctuation mark is also removed. The result is the labeled and unlabeled data used to build the model.
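The filtering in Step 1 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names `extract_bigrams` and `is_punctuation` are hypothetical, the input is assumed to be a list of sentence strings, segmentation whitespace in the labeled data is assumed to be stripped before pairing adjacent characters, and punctuation is assumed to mean any character whose Unicode category starts with "P".

```python
from collections import Counter
import unicodedata


def is_punctuation(ch):
    """Assumption: treat every Unicode 'P*' category character as punctuation."""
    return unicodedata.category(ch).startswith("P")


def extract_bigrams(sentences, min_count=2):
    """Count character bigrams and drop rare or punctuation-bearing ones.

    sentences: iterable of strings (whitespace between words is ignored).
    min_count: bigrams seen fewer than this many times are removed;
               the default of 2 drops bigrams that occur exactly once.
    """
    counts = Counter()
    for sentence in sentences:
        # Remove segmentation spaces so bigrams span adjacent characters.
        chars = "".join(sentence.split())
        for left, right in zip(chars, chars[1:]):
            counts[left + right] += 1
    return {
        bigram: n
        for bigram, n in counts.items()
        if n >= min_count and not any(is_punctuation(c) for c in bigram)
    }


# Hypothetical toy inputs; the real inputs would be the PKU labeled
# and unlabeled texts, each processed separately.
labeled_bigrams = extract_bigrams(["我们 喜欢 北京。", "我们 在 北京，"])
unlabeled_bigrams = extract_bigrams(["好。好。"])
```

Calling the function once per data set mirrors the "respectively" in Step 1: the labeled and unlabeled texts each get their own bigram table.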
[0057] Step 2: Extract the following 19 features for the bigrams of the labeled data and unlabeled data from Step 1: count the number of times the current bigram appears in the document; calculate the multipli...