Dictionary-based Lucene Chinese word segmentation method
A Chinese word segmentation and dictionary technology, applied in the fields of special-purpose data processing applications, instruments, and electrical digital data processing. It addresses problems such as existing word segmentation not supporting Lucene well, and it achieves strong versatility and improved effectiveness.
Embodiment
[0024] The dictionary-based Lucene Chinese word segmentation method of the present invention mainly comprises two stages: construction of a professional dictionary, and text word segmentation. Figure 1 is a flowchart of a specific embodiment of the method. As shown in Figure 1, the method comprises the following steps:
[0025] S101: Build a professional dictionary:
[0026] The present invention first collects a corpus and constructs a professional dictionary. Figure 2 is a flowchart of building the professional dictionary. As shown in Figure 2, the specific steps for constructing the professional dictionary in the present invention are:
[0027] S201: Corpus preprocessing:
[0028] First, the collected corpus needs to be preprocessed, that is, the manually collected stop words are...
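The two stages named in paragraph [0024] (dictionary construction, then text word segmentation) can be illustrated with a minimal sketch. It is not the patented method: the matching strategy (forward maximum matching), the function names `build_dictionary` and `segment`, and the sample terms are all assumptions made for illustration. The way stop words are used is cut off in the text above, so the sketch simply assumes they are excluded from the dictionary. No Lucene API is used.

```python
def build_dictionary(corpus_terms, stop_words):
    """Stage 1 (illustrative): keep collected terms that are not stop words.

    Assumption: stop words are dropped from the dictionary; the source
    text is truncated before it says how they are actually handled.
    """
    return {term for term in corpus_terms if term and term not in stop_words}


def segment(text, dictionary):
    """Stage 2 (illustrative): forward maximum matching against the dictionary.

    Assumption: the source does not name a matching strategy; forward
    maximum matching is a common dictionary-based choice.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit;
        # fall back to a single character when nothing matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical sample data, not taken from the patent.
dictionary = build_dictionary(["中文", "分词", "中文分词", "方法", "的"], {"的"})
tokens = segment("中文分词的方法", dictionary)
```

In a Lucene setting, logic of this kind would normally sit inside a custom tokenizer so that the index and the query parser share the same segmentation; that integration is outside the scope of this sketch.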