News emotion entity extraction method based on remote supervision

An entity extraction technique based on remote supervision (commonly called distant supervision), applied to network data retrieval, semantic analysis, and related fields. It addresses the difficulty of identifying entities in text that is irregularly expressed and follows no fixed rules, reducing the cost of manual annotation and improving efficiency.

Pending Publication Date: 2021-05-11
NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

Social media text, as typified by Weibo, is not standardized: it contains many colloquial expressions and follows no specific rules, which makes entities difficult to identify.
[0008] There are currently no publicly available corpus datasets or entity taxonomies for the news domain, which hampers open-source intelligence (OSINT) research.



Examples


Embodiment Construction

[0041] A news emotion entity extraction method based on remote supervision, as shown in Figure 1, includes the following steps:

[0042] Step 1: Use crawler technology to crawl a news corpus from official news websites and cache it in a local repository;

[0043] For hot news events, use crawler technology to crawl the related news corpus from official news websites such as Huanqiu.com, Netease News, and Xinhua Daily. Specifically: analyze the keyword search results on the official website to obtain the URLs of news pages related to the event, parse the news content at each URL to obtain the title, time, body text, and other data of the news item, and cache them in the local repository.
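
The parsing and caching part of Step 1 could be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the page layout (title in `<h1>`, time in `<time>`, body in `<p>`), the names `NewsPageParser`, `parse_news_page`, and `cache_record`, and the JSON Lines cache file are all hypothetical. Fetching the page over the network is left out; the sketch starts from an HTML string that has already been downloaded.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class NewsPageParser(HTMLParser):
    """Collects text from <h1> (title), <time> (time) and <p> (body).

    The tag-to-field mapping is an assumed page layout, not one
    described in the source; real sites need site-specific rules.
    """

    def __init__(self):
        super().__init__()
        self._current = None  # tag whose text is currently being collected
        self.fields = {"title": "", "time": "", "content": []}

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "time", "p"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current is None:
            return
        if self._current == "h1":
            self.fields["title"] += text
        elif self._current == "time":
            self.fields["time"] += text
        else:
            self.fields["content"].append(text)


def parse_news_page(html):
    """Return a dict with the title, time and body text of one news page."""
    parser = NewsPageParser()
    parser.feed(html)
    record = dict(parser.fields)
    record["content"] = "\n".join(record["content"])
    return record


def cache_record(record, path):
    """Append one news record to a local JSON Lines file (hypothetical cache)."""
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


SAMPLE_HTML = (
    "<html><body><h1>Example headline</h1><time>2021-05-11</time>"
    "<p>First paragraph.</p><p>Second paragraph.</p></body></html>"
)
record = parse_news_page(SAMPLE_HTML)
```

Calling `cache_record(record, "news_cache.jsonl")` would then append the parsed item to the local file; the file name is a placeholder.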

[0044] Step 2: Preprocess the crawled news corpus to obtain a news corpus segmented into sentences;

[0045] Read the crawled news corpus from the local repository and clean it, removing redundant and dirty data irrelevant to the topic. Delete us...
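
A minimal sketch of the cleaning and sentence segmentation in Step 2 is given below. The source text is truncated at this point, so the specific cleaning rules here (stripping leftover HTML tags, collapsing whitespace) and the set of sentence-ending punctuation marks are assumptions chosen for illustration; `clean_text` and `split_sentences` are hypothetical names.

```python
import re

# Assumed sentence terminators (Chinese and Western); the ASCII period is
# deliberately excluded so that decimals such as "90.9" are not split.
_SENT_END = re.compile(r"([。！？!?；;]+)")
_TAG = re.compile(r"<[^>]+>")
_SPACE = re.compile(r"\s+")


def clean_text(raw):
    """Remove leftover HTML tags and collapse runs of whitespace."""
    text = _TAG.sub("", raw)
    text = _SPACE.sub(" ", text)
    return text.strip()


def split_sentences(text):
    """Split text into sentences, keeping the closing punctuation."""
    # re.split with a capturing group alternates body, delimiter, body, ...
    parts = _SENT_END.split(text)
    sentences = []
    for i in range(0, len(parts), 2):
        body = parts[i].strip()
        end = parts[i + 1] if i + 1 < len(parts) else ""
        if body:
            sentences.append(body + end)
    return sentences


sentences = split_sentences(clean_text("<p>今天股市大涨。  投资者很高兴！</p>"))
```

Splitting on punctuation alone is a simple baseline; colloquial text with missing punctuation would need a more robust segmenter.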



Abstract

The invention discloses a news emotion entity extraction method based on remote supervision. The method comprises the following steps: crawling a news corpus from official news websites and caching it in a local repository; preprocessing the crawled corpus to obtain a news corpus segmented into sentences; constructing a key entity knowledge base and automatically labeling the sentence-segmented corpus according to the knowledge base; training an emotion sentence extraction model on the labeled corpus so that the model can automatically judge the emotion of input sentences; using the extracted emotion sentences as the training set for an emotion entity extraction model; and finally, crawling a news corpus, segmenting it into sentences, feeding the sentences to the trained emotion sentence extraction model to extract emotion sentences, and feeding the extracted emotion sentences to the trained emotion entity extraction model to obtain emotion entities. By using remote supervision to generate a noisy labeled data set for a large number of samples, the method improves the efficiency of model training.
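
The automatic labeling step can be illustrated with a small sketch. It assumes the knowledge base is a plain mapping from entity name to entity type and that labels are character-level BIO tags produced by longest-match string lookup; the source does not specify either choice, and `label_sentence` and `build_noisy_dataset` are hypothetical names. Labels produced this way are noisy by construction, since any string match is treated as a true mention.

```python
def label_sentence(sentence, knowledge_base):
    """Return one BIO tag per character via longest-match lookup.

    knowledge_base maps entity name -> entity type (assumed format).
    """
    tags = ["O"] * len(sentence)
    # Longer names first, so "ACME Corp" wins over the nested "ACME".
    names = sorted((n for n in knowledge_base if n), key=len, reverse=True)
    for name in names:
        start = sentence.find(name)
        while start != -1:
            end = start + len(name)
            # Only tag spans that no longer entity has already claimed.
            if all(t == "O" for t in tags[start:end]):
                etype = knowledge_base[name]
                tags[start] = "B-" + etype
                for i in range(start + 1, end):
                    tags[i] = "I-" + etype
            start = sentence.find(name, start + 1)
    return tags


def build_noisy_dataset(sentences, knowledge_base):
    """Label every sentence and record whether it mentions a known entity."""
    dataset = []
    for s in sentences:
        tags = label_sentence(s, knowledge_base)
        dataset.append({
            "sentence": s,
            "tags": tags,
            "has_entity": any(t != "O" for t in tags),
        })
    return dataset


kb = {"ACME": "ORG", "ACME Corp": "ORG"}  # toy knowledge base
data = build_noisy_dataset(["ACME Corp fell", "Nothing here"], kb)
```

The resulting records could serve as training data for the downstream models; how the patent derives emotion labels from the knowledge base is not stated in the available text, so the `has_entity` flag is only a stand-in.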

Description

Technical field

[0001] The invention belongs to the field of computer artificial intelligence, and specifically concerns a method for extracting news emotion entities based on remote supervision.

Background technique

[0002] Because named entity recognition in the news field has a distinctive application background and style of text expression, researchers have studied it specifically. Feng Yuntian et al. proposed classification principles for entities such as personnel, military ranks, military positions, military institutions, and facilities, and constructed a corpus from standardized texts such as combat documents, duty documents, and military documents. They trained a CRF model on a small amount of manually labeled training corpus; the trained model then performed entity recognition on the unlabeled test corpus and achieved an F value of 90.9%. You Fei et al. established a DNN-based weapon entity recognition model for weapon named en...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06F40/30, G06F16/951, G06N3/04
CPC: G06F40/295, G06F40/30, G06F16/951, G06N3/044, G06N3/045
Inventor: 张琨孙琦李寻张李林清刘志敏
Owner: NANJING UNIV OF SCI & TECH