Keyword extension method and system and classification corpus labeling method and system
A keyword expansion technology, applied in fields such as instruments, computing, and electronic digital data processing. It addresses the heavy workload of building a thesaurus, the high subjectivity of existing keyword expansion methods, and their low accuracy. It speeds up processing and achieves convenience and high accuracy.
Examples
Embodiment 1
[0054] This embodiment provides a keyword expansion method. Its flow chart is shown in Figure 1, and it includes the following steps:
[0055] (1) Perform retrieval using the pre-given initial keywords to obtain retrieved keywords. In this example, the initial keywords are used to search the article database for highly relevant articles; these articles are then segmented into words, and the segmentation result is taken as the words obtained by the search. The number of occurrences of each retrieved word is counted, and words that occur more than a preset threshold of 50 times are taken as the retrieved keywords (the threshold is set according to the size of the article database and how commonly the retrieved keywords are used). Keywords obtained this way have a certain statistical significance, which makes it easier to find words related to each meaning of the keyword.
[0056] (2) Use the keywords obtained by...
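The counting and thresholding in step (1) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the articles have already been retrieved and segmented into word lists, and the function name `select_frequent_words` and the toy data are hypothetical.

```python
from collections import Counter


def select_frequent_words(segmented_articles, threshold=50):
    """Return the words whose total occurrence count exceeds `threshold`.

    `segmented_articles` is assumed to be an iterable of word lists, one per
    retrieved article (segmentation itself is outside this sketch).
    """
    counts = Counter()
    for words in segmented_articles:
        counts.update(words)
    # "Greater than the preset threshold", as in step (1).
    return {word for word, n in counts.items() if n > threshold}


# Toy data with a small threshold so the effect is visible.
articles = [
    ["apple", "phone", "apple"],
    ["apple", "fruit"],
    ["phone", "screen"],
]
print(sorted(select_frequent_words(articles, threshold=1)))  # ['apple', 'phone']
```

With the threshold of 50 quoted in the text, the same toy data would yield no keywords, which is why the text ties the threshold to the size of the article database.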
Embodiment 2
[0060] (1) Perform retrieval using the pre-given initial keywords to obtain retrieved keywords.
[0061] (2) Use the keywords obtained by retrieval as the basis for the next retrieval, and perform cyclic retrieval through keyword iteration.
[0062] In the retrieval process of (1) and (2) above, the retrieval method is as follows:
[0063] Use the preset keywords to search the article database for highly relevant articles, segment these articles into words, and remove stop words after segmentation. Then obtain the co-occurrence words, that is, the words that appear together with the preset keywords. The co-occurrence words can be obtained through the sliding window method, and they are used as the words obtained by retrieval. The retrieved words are thus obtained through word segmentation, stop-word removal, and co-occurrence extraction. After the above-mentioned step-by-step filtering, unnecessary redundant words are removed to ob...
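A sliding-window co-occurrence count of the kind described above might look like the sketch below. The window size, the stop-word list, and the function name `cooccurring_words` are assumptions made for illustration; the source does not specify them.

```python
from collections import Counter


def cooccurring_words(tokens, keyword, stop_words, window=2):
    """Count words found within `window` positions of each `keyword` hit.

    Stop words are removed first, so the window is measured over the
    filtered token sequence (an assumption about the order of operations).
    """
    filtered = [t for t in tokens if t not in stop_words]
    counts = Counter()
    for i, token in enumerate(filtered):
        if token != keyword:
            continue
        lo = max(0, i - window)
        hi = min(len(filtered), i + window + 1)
        for j in range(lo, hi):
            # Skip the keyword itself, including repeated occurrences.
            if j != i and filtered[j] != keyword:
                counts[filtered[j]] += 1
    return counts


tokens = ["the", "apple", "phone", "has", "a", "new",
          "screen", "and", "apple", "logo"]
stop_words = {"the", "has", "a", "and"}
result = cooccurring_words(tokens, "apple", stop_words, window=1)
print(dict(result))
```

In this toy run, "new" is not counted because it never falls within one position of "apple" after stop words are dropped.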
Embodiment 3
[0070] A keyword expansion system comprising:
[0071] (1) Acquisition unit: performs retrieval according to the predetermined initial keywords to obtain retrieved keywords. In the keyword expansion system, the acquisition unit further includes a retrieved-keyword module, which counts the occurrences of the words obtained through retrieval and takes words whose frequency is greater than a preset threshold as the keywords obtained through retrieval.
[0072] As an alternative implementation, the acquisition unit may include a retrieval and comparison module for obtaining keywords, which counts the words obtained by retrieval and the number of occurrences of each word, sorts them in descending order of occurrence count, and takes a certain top proportion of the words as the keywords obtained by retrieval.
[0073] (2) Circular retrieval unit: use the retrieved keywords as the basis for the next retrieval, and perform cyclic retrieval through keyword i...
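The two units can be sketched together as below, using the top-proportion variant of the acquisition unit. This is a rough illustration under several assumptions: the `retrieve` callback stands in for the article search and segmentation, and the stopping rule (a round cap, or no new keywords) is hypothetical, since the source text is truncated before describing how the loop ends.

```python
from collections import Counter


def top_proportion(words, proportion=0.2):
    """Rank words by descending count and keep the leading `proportion`."""
    counts = Counter(words)
    ranked = [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
    if not ranked:
        return []
    # Keep at least one word so a small result set is not discarded entirely.
    return ranked[:max(1, int(len(ranked) * proportion))]


def expand_keywords(initial_keywords, retrieve, proportion=0.2, max_rounds=3):
    """Iteratively expand keywords; `retrieve(keyword)` returns a word list."""
    known = set(initial_keywords)
    frontier = list(initial_keywords)
    for _ in range(max_rounds):  # assumed cap on iterations
        retrieved = []
        for kw in frontier:
            retrieved.extend(retrieve(kw))
        new = [w for w in top_proportion(retrieved, proportion) if w not in known]
        if not new:  # assumed stopping rule: nothing new was found
            break
        known.update(new)
        frontier = new  # retrieved keywords seed the next retrieval
    return known


# Hypothetical stand-in for searching an article database.
toy_index = {
    "apple": ["phone", "phone", "fruit", "screen"],
    "phone": ["screen", "screen", "battery"],
    "screen": ["phone"],
}


def toy_retrieve(keyword):
    return toy_index.get(keyword, [])


expanded = expand_keywords(["apple"], toy_retrieve, proportion=0.5)
print(sorted(expanded))  # ['apple', 'phone', 'screen']
```

Each round feeds only the newly found keywords back into retrieval, which mirrors the description of using retrieved keywords as the basis for the next retrieval.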