Iteration-based three-step unsupervised Chinese word segmentation method
An unsupervised Chinese word segmentation technique, applicable to character and pattern recognition, special-purpose data processing, instruments, and related fields, that addresses the problem of high complexity.
Examples
Embodiment 1
[0085] This embodiment describes a specific implementation of the iteration-based three-step unsupervised Chinese word segmentation method.
[0086] As shown in Figure 1, the iteration-based three-step unsupervised Chinese word segmentation method comprises three processes: initialization, iterative processing, and adjustment processing. The iterative processing in turn comprises three steps: local segmentation, global word selection, and corpus subtraction.
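The control flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the excerpt gives no details of the individual steps, so every function name here (iterate_segmentation, subtract_words, toy_local_segment, toy_select_words), the stopping rule, the empty initial lexicon, and the identity adjustment are assumptions made purely for illustration. In particular, the toy local segmentation simply emits adjacent character pairs and does not implement the MISC model.

```python
from collections import Counter
from typing import Callable, List, Set

Corpus = List[str]  # each item is one unsegmented text fragment


def subtract_words(corpus: Corpus, words: Set[str]) -> Corpus:
    """Corpus subtraction (assumed form): cut selected words out of each
    fragment and keep the non-empty pieces that remain."""
    remaining: Corpus = []
    for fragment in corpus:
        for word in sorted(words, key=len, reverse=True):  # longest first
            fragment = fragment.replace(word, "\x00")
        remaining.extend(piece for piece in fragment.split("\x00") if piece)
    return remaining


def iterate_segmentation(
    corpus: Corpus,
    local_segment: Callable[[Corpus], List[List[str]]],
    select_words: Callable[[List[List[str]]], Set[str]],
    adjust: Callable[[Set[str]], Set[str]] = lambda lexicon: lexicon,
    max_iters: int = 10,
) -> Set[str]:
    """Initialization, then the three-step loop, then adjustment."""
    lexicon: Set[str] = set()  # initialization (assumed: empty lexicon)
    remaining = list(corpus)
    for _ in range(max_iters):
        if not remaining:
            break
        candidates = local_segment(remaining)            # step 1: local segmentation
        new_words = select_words(candidates) - lexicon   # step 2: global word selection
        if not new_words:  # assumed stopping rule: nothing new was selected
            break
        lexicon |= new_words
        remaining = subtract_words(remaining, new_words)  # step 3: corpus subtraction
    return adjust(lexicon)  # adjustment processing (identity placeholder by default)


def toy_local_segment(corpus: Corpus) -> List[List[str]]:
    """Hypothetical stand-in for the local step: all adjacent character pairs."""
    return [[f[i:i + 2] for i in range(len(f) - 1)] for f in corpus]


def toy_select_words(segmented: List[List[str]], min_count: int = 2) -> Set[str]:
    """Hypothetical stand-in for global selection: keep tokens seen min_count times."""
    counts = Counter(token for segment in segmented for token in segment)
    return {token for token, count in counts.items() if count >= min_count}
```

For instance, calling iterate_segmentation(["中文分词", "中文信息", "分词方法"], toy_local_segment, toy_select_words) selects the two repeated pairs 中文 and 分词 in the first pass, subtracts them from the corpus, and stops in the second pass because no remaining pair recurs. Passing the steps in as callables keeps the loop structure separate from whichever statistical model actually drives each step.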
[0087] In the unsupervised word segmentation framework of the present invention, the first step of each iteration uses the word-formation probability model based on segmentation-context independence (MISC) to perform locally optimal unsupervised segmentation of the text corpus. The MISC model requires no statistical assumptions about the segmentation length, takes both global and local features into account, and is simple in form yet effective; for the long t...