Multilingual word segmentation method based on dictionaries and grammar analysis
A word segmentation method based on grammar analysis, applied in natural language data processing and special data processing applications. It is intended to address garbled characters and segmented words that carry little representative meaning, while reducing storage space.
Embodiment Construction
[0032] In order to make the purpose, technical solution and advantages of the present invention clearer, the technical solution of the present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.
[0033] As shown in Figure 1, according to the first aspect of the present invention, a new word segmentation framework is adopted. The proposed system embeds a Chinese, Japanese, Korean and Cantonese sub-tokenizer, a Chinese quantifier (measure word) sub-tokenizer and a Western-language sub-tokenizer, so that text in each type of language can be identified and segmented accurately. A built-in mechanism that recognizes the code range of each language splits the text to be segmented into fragments; each fragment corresponds to one language family, and the corresponding sub-tokenizer is used to segment it. The system also contains an extended dictionary configuration management u...
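The routing described in paragraph [0033] can be illustrated with a minimal Python sketch. It rests on several assumptions that the source does not confirm: that language recognition works on Unicode code point ranges, that the CJK sub-tokenizer uses dictionary-based forward maximum matching, and that the Western sub-tokenizer simply keeps each alphanumeric run as one word. All names (`CJK_DICT`, `script_of`, `split_by_script`, `segment`) and the toy dictionary are hypothetical, and the quantifier sub-tokenizer is omitted.

```python
from itertools import groupby

# Hypothetical toy dictionary; a real system would load a full lexicon.
CJK_DICT = {"自然", "语言", "自然语言", "处理"}
MAX_WORD_LEN = max(len(word) for word in CJK_DICT)


def script_of(ch):
    """Classify one character by Unicode code point range (simplified)."""
    cp = ord(ch)
    if (0x4E00 <= cp <= 0x9FFF          # CJK Unified Ideographs
            or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
            or 0xAC00 <= cp <= 0xD7AF): # Hangul Syllables
        return "cjk"
    if ch.isalnum():
        return "western"
    return "other"  # whitespace, punctuation, symbols


def split_by_script(text):
    """Split text into (script, fragment) runs of consecutive same-script characters."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]


def tokenize_cjk(fragment):
    """Forward maximum matching; characters not covered by the dictionary become single tokens."""
    tokens, i = [], 0
    while i < len(fragment):
        for size in range(min(MAX_WORD_LEN, len(fragment) - i), 0, -1):
            candidate = fragment[i:i + size]
            if size == 1 or candidate in CJK_DICT:
                tokens.append(candidate)
                i += size
                break
    return tokens


def tokenize_western(fragment):
    """Each alphanumeric run is already delimited by spaces or punctuation, so keep it whole."""
    return [fragment]


# Assumed mapping from language family to sub-tokenizer.
SUB_TOKENIZERS = {"cjk": tokenize_cjk, "western": tokenize_western}


def segment(text):
    """Route each fragment to the sub-tokenizer of its language family."""
    tokens = []
    for script, fragment in split_by_script(text):
        tokenizer = SUB_TOKENIZERS.get(script)
        if tokenizer is not None:  # fragments of type "other" are dropped
            tokens.extend(tokenizer(fragment))
    return tokens


print(segment("自然语言处理 NLP toolkit"))
```

Keeping the sub-tokenizers in a lookup table means a further language family could be supported by adding one code range check and one table entry, which is one plausible reading of the embedded sub-tokenizer design.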