Search algorithm for Chinese word segmentation
A search algorithm and Chinese word segmentation technology applied in the field of text search engines. It achieves high search efficiency, balances index construction time and space, and reduces construction time and memory cost.
Embodiment Construction
[0049] To study the search performance of the present invention on data sets of different sizes, we constructed five data sets with data volumes of 10,000, 20,000, 50,000, 100,000 and 200,000 respectively, and ran multiple comparison experiments on each data set against the Lucene engine, which is based on inverted lists.
[0050] For each length from 2 to 4, 25 search strings are randomly generated, giving 75 search strings in total. For each search string, 100,000 searches are performed, and the time consumption of each search is recorded on the premise that the search results are correct.
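The query generation and timing procedure of paragraph [0050] can be sketched roughly as follows. This is a minimal Python illustration, not the harness used in the experiments: the names `make_queries`, `time_searches` and `naive_search`, the character alphabet, and the three-document corpus are all hypothetical stand-ins, and the sketch records the mean time per search for each query rather than every individual measurement.

```python
import random
import time

def make_queries(alphabet, per_length=25, lengths=(2, 3, 4), seed=0):
    """Return per_length random strings for each length in lengths."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(alphabet) for _ in range(n))
        for n in lengths
        for _ in range(per_length)
    ]

def time_searches(search, queries, repeats=100_000):
    """Run each query `repeats` times; return (query, mean seconds) pairs."""
    results = []
    for q in queries:
        start = time.perf_counter()
        for _ in range(repeats):
            search(q)
        results.append((q, (time.perf_counter() - start) / repeats))
    return results

# Hypothetical stand-ins: a tiny alphabet and corpus, and a naive substring
# scan in place of the index under test.
ALPHABET = "的一是不了人我在有他这中大来上国个到说们"
corpus = ["我们在中国", "他说这个人来了", "大国有人"]

def naive_search(query):
    """Return the positions of the corpus entries that contain query."""
    return [i for i, doc in enumerate(corpus) if query in doc]

queries = make_queries(ALPHABET)  # 25 strings for each of 3 lengths
# repeats is reduced here so the sketch runs quickly; [0050] uses 100,000.
timings = time_searches(naive_search, queries, repeats=1_000)
```

Passing the engine's search function as the `search` argument would let the same loop time either index, provided both are first checked to return the same results.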
[0051] So that Lucene can complete the same task as the index of the present invention, a space is inserted between every pair of adjacent characters of the initial sequence when the initial index is built, so that each character is treated as a word. Spaces are likewise inserted between the characters of each search string, so that Lucene performs the same search function as the present invention.
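The preprocessing in paragraph [0051] amounts to turning every character into its own whitespace-delimited token. A minimal Python sketch is given below; the helper names `space_separate` and `phrase_match` are hypothetical, and the contiguous-token match is only a stand-in that assumes the engine tokenizes on whitespace and matches query tokens as an ordered, adjacent sequence. It does not call Lucene.

```python
def space_separate(text):
    """Insert a space between every pair of adjacent characters."""
    return " ".join(text)

def phrase_match(doc_tokens, query_tokens):
    """True if query_tokens occurs as a contiguous run inside doc_tokens."""
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )

doc = "中文分词搜索"
query = "分词"

# Each character becomes one token after whitespace splitting.
doc_tokens = space_separate(doc).split()
query_tokens = space_separate(query).split()

# Under the assumptions above, matching the token run is equivalent to a
# plain substring search on the original, unspaced strings.
found = phrase_match(doc_tokens, query_tokens)
```

With single-character tokens, a contiguous token match and a substring match on the original text give the same answer, which is why the spaced input lets a word-based index emulate character-level search.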
...