Method and apparatus for cutting large and small granularity of Chinese language text
A method for implementing large- and small-granularity segmentation, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses the problem of being unable to provide Chinese word segmentation results that meet differing processing requirements.
Embodiment Construction
[0029] The basic process of the embodiment of the present invention is shown in Figure 1 and includes the following basic steps:
[0030] Step 101: Formulate recognition rules for pattern words and for named-entity words such as person names, place names, and organization names, together with the corresponding information that distinguishes large granularity from small granularity.
[0031] The recognition rules for pattern words include the following:
[0032] Granularity information, that is, granularity distinction points, is added to the recognition rules. The recognition rules are then expressed as a deterministic finite automaton (DFA), so that during word segmentation the automaton can identify pattern words that match the rules. At final output, the same DFA can be used to divide the pattern words according to the user's large- or small-granularity requirement, and the patter...
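The idea of a DFA that carries granularity distinction points can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the date-like pattern (for example "2008年8月8日"), the state names, the transition table, and the helper functions `match_pattern` and `segment` are hypothetical and are not taken from the patent's actual rule set. Each transition is flagged to say whether a granularity distinction point follows the consumed character. Large granularity returns the matched pattern word whole, while small granularity cuts it at the recorded points.

```python
# Hypothetical DFA for date-like pattern words such as "2008年8月8日".
# States, transitions, and break points are illustrative assumptions only.
DIGITS = "0123456789"

# state -> list of (accepted characters, next state, break point after char?)
# The character sets within each state are disjoint, so the automaton is
# deterministic.
TRANSITIONS = {
    "start": [(DIGITS, "year", False)],
    "year": [(DIGITS, "year", False), ("年", "after_year", True)],
    "after_year": [(DIGITS, "month", False)],
    "month": [(DIGITS, "month", False), ("月", "after_month", True)],
    "after_month": [(DIGITS, "day", False)],
    "day": [(DIGITS, "day", False), ("日", "end", True)],
}
ACCEPTING = {"after_year", "after_month", "end"}


def match_pattern(text):
    """Run the DFA from the start of text.

    Returns (end, break_points) for the longest accepted prefix,
    or None if no prefix is accepted.
    """
    state, pos = "start", 0
    breaks = []
    best = None
    while pos < len(text):
        for chars, next_state, is_break in TRANSITIONS.get(state, []):
            if text[pos] in chars:
                state = next_state
                pos += 1
                if is_break:
                    breaks.append(pos)  # distinction point after this char
                break
        else:
            break  # no transition for this character: stop scanning
        if state in ACCEPTING:
            best = (pos, list(breaks))
    return best


def segment(text, granularity="large"):
    """Return the pattern word at the start of text, whole or split."""
    match = match_pattern(text)
    if match is None:
        return []
    end, breaks = match
    if granularity == "large":
        return [text[:end]]
    pieces, prev = [], 0
    for point in breaks:
        pieces.append(text[prev:point])
        prev = point
    return pieces


print(segment("2008年8月8日开幕", "large"))
print(segment("2008年8月8日开幕", "small"))
```

Storing the distinction points on the transitions means one pass over the text yields both segmentations, so the choice of granularity can be deferred to output time, which matches the intent described in paragraph [0032].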