Chinese text keyword extraction method based on document theme structures and semantics
A keyword extraction method based on document theme structure, applied in the field of keyword extraction. It addresses two problems: labeling keywords takes considerable effort, and people differ in their ability to summarize and generalize keywords. The stated effect is an improvement in keyword extraction.
Embodiment Construction
[0026] The following embodiments will further illustrate the present invention in conjunction with the accompanying drawings.
[0027] The present invention comprises the following steps:
[0028] 1) Text preprocessing steps:
[0029] The text documents used come mainly from sources such as web pages, PDF files, and Word files. Preprocessing is divided into two parts: preprocessing of web pages, and preprocessing of other text types;
[0030] Preprocessing for web pages: preprocessing news web pages aims to extract the title, content, and labeled keywords from each page. Extraction rules and conditional filters are written so that each page is parsed into structured fields and saved as text. Different websites mostly use different templates for their web pages. A survey of websites shows that every news article on Sina News.com provides manually labeled keywords, which can better reflect news conte...
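The rule-based web page preprocessing described in [0030] might be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the title sits in a `<title>` tag, the labeled keywords in a `<meta name="keywords">` tag, and the body text in `<p>` tags. The names `NewsPageExtractor` and `extract_news_page` are hypothetical, and since site templates differ, real extraction rules would need to be written per website.

```python
from html.parser import HTMLParser


class NewsPageExtractor(HTMLParser):
    """Collect the title, paragraph text, and meta keywords of one page.

    Assumption: the page follows the common <title> / <meta name="keywords"> /
    <p> layout. Real news templates vary and need site-specific rules.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self.paragraphs = []
        self._in_title = False
        self._in_paragraph = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            self._in_paragraph = True
            self.paragraphs.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            # Accept both ASCII and full-width commas as separators.
            content = (attrs.get("content") or "").replace("，", ",")
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_paragraph:
            # Text inside nested inline tags (e.g. <b>) is kept as well.
            self.paragraphs[-1] += data


def extract_news_page(html):
    """Return the structured fields of one page as a plain dict."""
    parser = NewsPageExtractor()
    parser.feed(html)
    parser.close()
    # Conditional filter: drop paragraphs that are empty after stripping.
    content = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {
        "title": parser.title.strip(),
        "content": content,
        "keywords": parser.keywords,
    }
```

The returned dict can then be written out as text, one file per article, so that the labeled keywords stay paired with the title and content they describe.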