News key information extraction method and system
A key information extraction technique, applied in the field of news key information extraction methods and systems. It addresses problems in existing methods such as a lack of generality and real-time performance, failure to meet extraction requirements, and overcomplication of simple problems, and it achieves low resource consumption, high accuracy, strong practicality, and robustness.
Embodiment Construction
[0063] The present invention proposes a method for extracting key information from news, named newsExtractor. The method comprises four modules that extract the title, time, source, and body text of a news webpage; the overall process is shown in Figure 3.
[0064] 1. Preprocessing
[0065] Preprocessing mainly removes noise and special HTML symbol entities that are obviously not body content, and simplifies the HTML tags, which reduces the workload of later processing. For this step, the method uses the third-party open-source tool Jsoup (Jsoup [Z]. http://jsoup.org/) for auxiliary processing. Preprocessing covers the following aspects:
[0066] 1) Remove useless tag pairs. The source code of a web page is very mixed and includes many script-language tag pairs, user-interaction tag pairs, and the like. We first remove these tag pairs, which obviously do not contain body content. The ...
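The removal of useless tag pairs in step 1) can be sketched roughly as follows. This is a minimal illustration and not the patent's implementation: it uses only the Java standard library with regular expressions as a stand-in for Jsoup, and the class name `TagPairRemover`, the method `removeTagPairs`, and the tag list `NOISE_TAGS` are all assumptions, since the source text does not preserve the exact tags that are removed.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Stdlib-only sketch of step 1): drop tag pairs that cannot hold body text. */
final class TagPairRemover {

    // Assumed tag list for illustration only; the source does not list the exact tags.
    static final List<String> NOISE_TAGS =
            List.of("script", "style", "noscript", "form", "select", "button");

    /** Removes every <tag ...>...</tag> pair, including its content, for each given tag name. */
    static String removeTagPairs(String html, List<String> tags) {
        String result = html;
        for (String tag : tags) {
            String name = Pattern.quote(tag);
            // (?is): case-insensitive, and '.' also matches line breaks.
            // .*? is non-greedy, so each match stops at the nearest closing tag.
            Pattern pair = Pattern.compile(
                    "(?is)<" + name + "\\b[^>]*>.*?</" + name + "\\s*>");
            result = pair.matcher(result).replaceAll("");
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<div><script type=\"text/javascript\">var a = 1;</script>"
                + "<p>Body text.</p>"
                + "<form action=\"/s\"><select><option>x</option></select></form></div>";
        // Prints: <div><p>Body text.</p></div>
        System.out.println(removeTagPairs(html, NOISE_TAGS));
    }
}
```

A regex pass like this does not handle nested pairs of the same tag or malformed markup. A DOM-based parser such as Jsoup, which the method names as its auxiliary tool, is the more robust choice for real pages.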