Key phrase extraction method and device
A key phrase extraction technology, applied in the field of text processing, that addresses the problems of inaccurate and low-precision key phrase extraction and improves the precision and accuracy of extraction.
Examples
Embodiment 1
[0054] Referring to Figure 1, which shows a flow chart of the key phrase extraction method in Embodiment 1 of the present invention, the method may include the following steps:
[0055] Step 101: preprocess the text to obtain multiple word segments.
[0056] In this embodiment of the present invention, the text is the text from which key phrases are to be extracted, for example a video title from a video website or article data. The text may be in a common format such as Word or PDF; the embodiment of the present invention does not limit this. A word is the smallest meaningful language unit that can be used independently, but Chinese uses characters as its basic writing unit, so there are no obvious delimiters between words. Therefore, when the text is Chinese, it must be preprocessed to determine the word segments. By preprocessing the text to obtain ...
Embodiment 2
[0066] Referring to Figure 2, which shows a flow chart of the key phrase extraction method in Embodiment 2 of the present invention, the method may include the following steps:
[0067] Step 201: preprocess the text to obtain multiple word segments.
[0068] In this embodiment of the present invention, preprocessing the text may mean segmenting it according to a given rule. For example, segmentation can use a common word segmentation database, such as a common dictionary: the words in the database are traversed one by one in their stored order, and each is matched against the text. If a match succeeds, the current word is determined to be a word segment of the text. This continues until every word in the database has been matched once, at which point the word segments of the text have been determined.
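The dictionary traversal described in paragraph [0068] can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the function name `segment_by_dictionary`, the toy dictionary, and the choice to return matches in reading order are all assumptions made for the example. Overlapping matches are not resolved here.

```python
def segment_by_dictionary(text, dictionary):
    """Return the dictionary words found in `text`, in reading order.

    Hypothetical helper: the source only says that each dictionary word
    is matched against the text in turn; everything else is assumed.
    """
    matches = []
    for word in dictionary:          # traverse the dictionary in its stored order
        if not word:                 # skip empty entries, which would match everywhere
            continue
        start = text.find(word)
        while start != -1:           # record every occurrence of this word
            matches.append((start, word))
            start = text.find(word, start + 1)
    matches.sort(key=lambda m: m[0])  # reorder by position in the text
    return [word for _, word in matches]


# Toy dictionary and text, chosen only for illustration.
toy_dictionary = ["标题", "视频", "网站"]
print(segment_by_dictionary("视频网站标题", toy_dictionary))
```

A production segmenter would also need a rule for overlapping or ambiguous matches, which the visible text does not specify.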
[0069] In specific implem...
Embodiment 3
[0111] Referring to Figure 3, which shows a block diagram of a key phrase extraction device in Embodiment 3 of the present invention, the device 30 may include:
[0112] A preprocessing module 301, configured to preprocess the text to obtain multiple word segments;
[0113] A combination module 302, configured to combine every two adjacent word segments in the plurality of word segments to obtain a plurality of word pairs;
[0114] A first determination module 303, configured to determine co-occurrence information for each word pair in the plurality of word pairs using a preset word collocation feature table;
[0115] A second determination module 304, configured to determine key phrases of the text according to the co-occurrence information of each word pair.
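The data flow through modules 302 to 304 might be sketched as below. This is a minimal sketch under stated assumptions, not the device as claimed: the visible text does not say how the word collocation feature table is stored or how co-occurrence information is turned into a decision, so the dictionary keyed by word pairs, the numeric score, the `threshold` rule, and all function names are hypothetical.

```python
def adjacent_pairs(segments):
    """Combine every two adjacent word segments into a word pair (module 302)."""
    return list(zip(segments, segments[1:]))


def extract_key_phrases(segments, collocation_table, threshold, joiner=""):
    """Look up each pair's co-occurrence score and keep the high-scoring pairs.

    Assumptions: `collocation_table` maps (left, right) pairs to a numeric
    score, and a pair counts as a key phrase when its score reaches
    `threshold`. Neither detail is confirmed by the source.
    """
    phrases = []
    for left, right in adjacent_pairs(segments):
        score = collocation_table.get((left, right), 0.0)  # module 303 (assumed lookup)
        if score >= threshold:                             # module 304 (assumed rule)
            phrases.append(left + joiner + right)
    return phrases


# Toy data, chosen only for illustration.
segments = ["关键", "短语", "提取", "方法"]
table = {("关键", "短语"): 0.9, ("提取", "方法"): 0.2}
print(extract_key_phrases(segments, table, threshold=0.5))
```

The empty default `joiner` suits Chinese text, where words are written without spaces; a space could be passed for languages that separate words.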
[0116] In summary, in the key phrase extraction device provided by Embodiment 3 of the present invention, when determining key phrases, the first determination module can determine...