Method and system for automatically deriving affiliated organizations from news
A technology for automatic derivation from news, applied in the Internet field. It addresses the inability to effectively remove duplicate or redundant news and events, the low accuracy of event extraction, and the difficulty of event extraction. Its effects are removing duplicate or redundant news and events, improving the accuracy of event extraction, and reducing the requirements placed on syntactic analysis.
Examples
Embodiment 1
[0053] Referring to Figure 1, a method for automatically deriving affiliated organizations from news comprises the following steps:
[0054] S1. Establishing a basic corpus: set the data-source collection scope for news information according to ongoing events, build a basic corpus, and collect information based on the current basic corpus;
[0055] S2. News information collection: use Internet data collection tools to collect the corresponding news sentences from news media, financial media, and financial institutions, according to the basic corpus established in S1;
[0056] S3. News information analysis: after a large amount of news information data has been obtained through S2, perform text analysis on the collected news sentences;
[0057] S4. News information identification: identify multiple entity units in the text-analyzed news information, and mark the identified entity units;
[0058] S5. News information association: judge and analyze t...
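Steps S3 and S4 can be illustrated with a minimal sketch. The source does not specify how text analysis or entity recognition is performed, so the sentence splitter, the `ENTITY_LEXICON` dictionary, and the function names `split_sentences` and `mark_entities` are all hypothetical stand-ins chosen for illustration. Only S3 and S4 are covered here.

```python
import re

# Hypothetical entity lexicon (an assumption): the source does not say
# how entity units are recognised, so a simple lookup table stands in.
ENTITY_LEXICON = {
    "Acme Bank": "ORG",
    "Globex Securities": "ORG",
    "Shanghai": "LOC",
}


def split_sentences(text):
    """S3 (sketch): split raw news text into sentences on ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def mark_entities(sentence, lexicon=ENTITY_LEXICON):
    """S4 (sketch): return (surface, label, start, end) per lexicon hit,
    ordered by position in the sentence."""
    hits = []
    for surface, label in lexicon.items():
        for match in re.finditer(re.escape(surface), sentence):
            hits.append((surface, label, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


news = ("Acme Bank opened a branch in Shanghai. "
        "Globex Securities declined to comment.")
for sent in split_sentences(news):
    print(sent, mark_entities(sent))
```

A production system would likely replace the lexicon lookup with a trained named-entity recognizer; the lookup is used here only because it keeps the marking step easy to follow.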
Embodiment 2
[0061] Referring to Figure 2, as another preferred embodiment of the present invention, the difference from Embodiment 1 is as follows:
[0062] The formation of the basic corpus in step S1 includes the following steps:
[0063] S11. According to the basic corpus to be formed, first build a learning model;
[0064] S12. According to the collection scope, use the learning model to collect corresponding news sentences;
[0065] S13. Analyze the collected news sentences to obtain an analysis result;
[0066] S14. According to the analysis result, conduct attitude analysis to determine whether each news sentence has corpus value; if so, add the corresponding news sentence to the basic corpus, thereby completing the formation of the basic corpus.
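The filtering logic of S13 and S14 can be sketched as follows. The source does not define the learning model, the attitude analysis, or how "corpus value" is measured, so the cue-word set `CUE_WORDS`, the threshold `VALUE_THRESHOLD`, and the functions below are assumptions used purely to show the control flow: analyze each candidate, test it, and keep it only if it passes.

```python
# Hypothetical cue words and threshold (assumptions): the source does not
# define how "corpus value" is decided.
CUE_WORDS = {"acquire", "acquires", "merger", "invests", "subsidiary", "stake"}
VALUE_THRESHOLD = 1


def analyze(sentence):
    """S13 (sketch): reduce a sentence to a set of lowercase tokens."""
    return {tok.strip(".,;:!?\"'").lower() for tok in sentence.split()}


def has_corpus_value(tokens, cues=CUE_WORDS, threshold=VALUE_THRESHOLD):
    """S14 (sketch): stand-in for the attitude analysis; counts cue hits."""
    return len(tokens & cues) >= threshold


def build_basic_corpus(candidates):
    """Keep each candidate sentence that passes the S14 check."""
    corpus = []
    for sentence in candidates:
        if has_corpus_value(analyze(sentence)):
            corpus.append(sentence)
    return corpus


candidates = [
    "Acme Bank acquires a stake in Globex.",
    "The weather was mild on Tuesday.",
]
print(build_basic_corpus(candidates))
```

Keeping the value test in its own function means the cue-word heuristic could later be swapped for a trained classifier without changing `build_basic_corpus`.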
Embodiment 3
[0068] As another preferred embodiment of the present invention, the difference from Embodiment 1 is that data collection by the Internet data collection tool in step S2 includes: crawling from predetermined initial seed samples; crawling from a predetermined web page classification directory and the seed samples corresponding to that directory; capturing the data samples that are displayed and marked while simulating a user's browsing process; and search-style data capture from large vertical websites using preset keywords.
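One way to picture how these collection modes feed a single crawl queue is the sketch below. It is not taken from the source: the `CollectionPlan` structure, the `build_frontier` function, the `q` query parameter, and the `example.com` URLs are all hypothetical. It covers the seed, directory, and keyword-search modes only and performs no network access; the simulated-browsing mode is left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class CollectionPlan:
    """Hypothetical container for the collection inputs (an assumption)."""
    seed_urls: list = field(default_factory=list)
    directory_seeds: dict = field(default_factory=dict)  # category -> URLs
    search_sites: list = field(default_factory=list)     # search endpoints
    keywords: list = field(default_factory=list)


def build_frontier(plan, query_param="q"):
    """Return an ordered, de-duplicated list of URLs to fetch."""
    urls = list(plan.seed_urls)
    # Directory mode: add the seeds of each category, in category order.
    for category in sorted(plan.directory_seeds):
        urls.extend(plan.directory_seeds[category])
    # Keyword mode: one search URL per (site, keyword) pair.
    for site in plan.search_sites:
        for keyword in plan.keywords:
            urls.append(f"{site}?{urlencode({query_param: keyword})}")
    # dict.fromkeys preserves first-seen order while dropping repeats.
    return list(dict.fromkeys(urls))


plan = CollectionPlan(
    seed_urls=["https://news.example.com/"],
    directory_seeds={"finance": ["https://news.example.com/finance",
                                 "https://news.example.com/"]},
    search_sites=["https://vertical.example.com/search"],
    keywords=["bank merger"],
)
print(build_frontier(plan))
```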