A Chinese word segmentation method with feature alignment
A Chinese word segmentation and feature alignment technology, applied in the fields of special data processing applications, instruments, and electrical digital data processing, that can solve problems such as parameter growth.
Examples
Embodiment 1
[0056] To further illustrate the scheme of the present invention, the labeled and unlabeled data of the PKU text in the commonly used Chinese word segmentation corpus SIGHAN-2005 are taken as an example. Refer to Figure 1, a flow chart of the feature-aligned Chinese word segmentation method provided in this embodiment:
[0057] Step 1: Extract the bigrams formed by adjacent characters from the labeled data and the unlabeled data in PKU separately, and count the number of times each bigram appears in the text. If a bigram occurs only once, remove it; if a bigram contains a punctuation mark, remove it as well. This yields the bigrams from the labeled and unlabeled data that are used to build the model.
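The filtering in Step 1 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names `extract_bigrams` and `is_punctuation` are hypothetical, punctuation is assumed to mean any character in a Unicode "P" category, and whitespace (including the word-delimiting spaces of a labeled corpus) is assumed to be stripped before bigrams are formed.

```python
from collections import Counter
import unicodedata


def is_punctuation(ch: str) -> bool:
    # Assumption: Unicode general categories starting with "P"
    # cover both ASCII and CJK punctuation marks.
    return unicodedata.category(ch).startswith("P")


def extract_bigrams(lines):
    """Count adjacent-character bigrams and apply the Step 1 filters."""
    counts = Counter()
    for line in lines:
        # Assumption: whitespace is not part of any bigram, so it is
        # dropped before pairing adjacent characters.
        text = "".join(line.split())
        for left, right in zip(text, text[1:]):
            counts[left + right] += 1
    # Keep bigrams seen more than once that contain no punctuation.
    return {
        bigram: n
        for bigram, n in counts.items()
        if n > 1 and not any(is_punctuation(c) for c in bigram)
    }
```

The same function would be run once on the labeled data and once on the unlabeled data, giving two separate bigram tables for the later steps.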
[0058] Step 2: Extract the following 19 features from the bigrams of the labeled data and unlabeled data obtained in Step 1: count the number of times the current bigram appears in ...