Extraction method of radio and television news event elements based on deep learning

A deep learning technology for radio and television news, classified under neural learning methods, electrical digital data processing, and text database browsing/visualization. It addresses the time-consuming and labor-intensive nature of manual element extraction, achieves good normalization discrimination, and improves extraction efficiency.

Active Publication Date: 2021-08-03
成都索贝视频云计算有限公司

AI Technical Summary

Problems solved by technology

However, relying solely on manual identification of news elements and manual organization of them into structured information is time-consuming and labor-intensive, so the automatic extraction of news elements is of great significance.

Examples

Embodiment 1

[0054] As shown in Figures 1 to 3, the method for extracting elements of radio and television news events based on deep learning includes the following steps:

[0055] S1. Annotate the summary and element information of the radio and television news data to be analyzed, and construct a summary data set and an element data set;

[0056] S2. Use a pre-trained model to construct a summary extraction model and an element extraction model, and train both models on the summary data set and the element data set constructed in step S1;

[0057] S3. Use the summary extraction model and the element extraction model trained in step S2 to construct a two-stage automatic extraction model for radio and television news elements, and use this model to predict on the input radio and television news to obtain a structured element extraction result.
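The two-stage structure in step S3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Summarizer` and `ElementExtractor` interfaces, the `TwoStageExtractor` class, and the toy stand-in functions are all hypothetical names introduced here, and the real stages would be the trained models from step S2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces; the source does not define them.
Summarizer = Callable[[List[str]], List[str]]              # sentences -> summary sentences
ElementExtractor = Callable[[str], Dict[str, List[str]]]   # sentence -> role -> values


@dataclass
class TwoStageExtractor:
    summarize: Summarizer
    extract: ElementExtractor

    def predict(self, sentences: List[str]) -> List[Dict[str, List[str]]]:
        # Stage 1: reduce the lengthy news text to its summary sentences.
        summary = self.summarize(sentences)
        # Stage 2: extract structured elements from each summary sentence.
        return [self.extract(sentence) for sentence in summary]


# Toy stand-ins for the trained models (assumptions for illustration only).
def toy_summarizer(sentences: List[str]) -> List[str]:
    return sentences[:1]


def toy_extractor(sentence: str) -> Dict[str, List[str]]:
    elements: Dict[str, List[str]] = {}
    if "Chengdu" in sentence:
        elements["where"] = ["Chengdu"]
    if "flood" in sentence:
        elements["what"] = ["flood"]
    return elements


pipeline = TwoStageExtractor(toy_summarizer, toy_extractor)
result = pipeline.predict(["A flood hit Chengdu on Monday.", "Officials spoke later."])
```

Keeping the two stages behind plain callables means either model could be swapped or retrained without touching the other, which is one plausible reason to separate summary extraction from element extraction.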

Embodiment 2

[0059] On the basis of Embodiment 1, in step S1 the element data set is constructed on the basis of the summary data set, through the following steps:

[0060] S11. Establish a lexicon of core element words for N target types of news events, where N is a positive integer, and expand the core element words with synonyms. Then revise the expanded lexicon, use it to recall news event data, and at the same time locate the core sentence of each news event summary. The core element words are then classified, and the class serves as the normalized expression of the core element;

[0061] S12. Select the core sentences in the annotated summaries according to the core elements of news events; sample the core sentences, and summarize the characteristics of all other elements in the core sentences and their role categories in the news, so as to provide the subsequent construction of the element labeling system with relevant configuration information and constraint in...
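The lexicon expansion and core sentence location in S11 might look roughly like the sketch below. It assumes English text and whitespace-style tokenization for readability; the seed lexicon, the synonym table, and the function names `expand_lexicon` and `locate_core_sentences` are hypothetical, and the manual revision step described in S11 is not modeled.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical seed lexicon: normalized core element -> seed words.
SEED_LEXICON: Dict[str, List[str]] = {
    "fire": ["fire"],
    "earthquake": ["earthquake"],
}
# Hypothetical synonym table used for expansion.
SYNONYMS: Dict[str, List[str]] = {
    "fire": ["blaze", "conflagration"],
    "earthquake": ["quake", "tremor"],
}


def expand_lexicon(seed: Dict[str, List[str]],
                   synonyms: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every surface word (seed or synonym) to its normalized core element."""
    surface_to_norm: Dict[str, str] = {}
    for norm, words in seed.items():
        for word in words:
            surface_to_norm[word] = norm
            for syn in synonyms.get(word, []):
                surface_to_norm[syn] = norm
    return surface_to_norm


def locate_core_sentences(summary_sentences: List[str],
                          surface_to_norm: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (sentence, surface word, normalized element) for each lexicon hit."""
    hits = []
    for sentence in summary_sentences:
        tokens = set(re.findall(r"\w+", sentence.lower()))
        for surface, norm in surface_to_norm.items():
            if surface in tokens:
                hits.append((sentence, surface, norm))
    return hits


lexicon = expand_lexicon(SEED_LEXICON, SYNONYMS)
hits = locate_core_sentences(["A blaze broke out downtown.", "The mayor visited."], lexicon)
```

Here the sentence containing "blaze" is located as a core sentence and its core element is normalized to "fire"; Chinese news text would need a word segmenter or substring matching in place of the regular expression.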

Embodiment 3

[0064] On the basis of Embodiment 1, the summary extraction model in step S2 is the BertSum model. BertSum is built on the Bert model: it adds a Transformer-based summary extraction layer on top of Bert to obtain sentence-level information, from which sentences are taken as the summary.
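To make the sentence-level idea concrete, the sketch below shows how input is commonly prepared for BertSum as described in the published BertSum work: each sentence gets its own `[CLS]` and `[SEP]` tokens, and segment ids alternate between sentences so that the summary layer can read one vector per sentence at the `[CLS]` positions. The source does not describe its input format, so this is an assumption, and `bertsum_format` is a hypothetical helper operating on already tokenized sentences.

```python
from typing import List, Tuple


def bertsum_format(sentences: List[List[str]]) -> Tuple[List[str], List[int], List[int]]:
    """Build tokens, alternating segment ids, and the [CLS] position of each sentence."""
    tokens: List[str] = []
    segment_ids: List[int] = []
    cls_positions: List[int] = []
    for index, sentence in enumerate(sentences):
        # Record where this sentence's [CLS] token will sit.
        cls_positions.append(len(tokens))
        piece = ["[CLS]"] + sentence + ["[SEP]"]
        tokens.extend(piece)
        # Alternate 0/1 per sentence (interval segment ids).
        segment_ids.extend([index % 2] * len(piece))
    return tokens, segment_ids, cls_positions


tokens, segment_ids, cls_positions = bertsum_format(
    [["flood", "hits", "city"], ["rescue", "continues"]]
)
```

The encoder outputs at `cls_positions` would then be passed to the Transformer-based summary layer, which scores each sentence for inclusion in the summary.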

[0065] The element extraction model in step S2 consists of the following layers connected in series: a text vectorization layer, a core element extraction layer, an other-element extraction layer, and a core element normalized expression layer.

[0066] The text vectorization layer uses the Bert layer fine-tuned during summary extraction training.

[0067] A news core sentence may describe multiple news events and therefore contain multiple core element words, and these words may be nested (that is, one core element word lies inside another as a substring). In this embodiment, the construction process of the core element extraction layer therefore includes the following steps:

...
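The construction steps themselves are elided in this listing and are not reconstructed here. As an illustration of the nesting problem only, the sketch below finds every occurrence of every lexicon word in a sentence and reports which matches sit inside another; a flat one-label-per-token tagging scheme could not represent both members of such a pair. The function names and the example lexicon are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, word), end exclusive


def find_all_matches(text: str, lexicon: List[str]) -> List[Span]:
    """Return every occurrence of every lexicon word, including nested ones."""
    spans: List[Span] = []
    for word in lexicon:
        start = text.find(word)
        while start != -1:
            spans.append((start, start + len(word), word))
            start = text.find(word, start + 1)
    return sorted(spans)


def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return (outer, inner) pairs where inner lies within outer's character range."""
    return [
        (outer, inner)
        for outer in spans
        for inner in spans
        if outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]
    ]


spans = find_all_matches("forest fire spreads", ["forest fire", "fire"])
pairs = nested_pairs(spans)
```

In this example "fire" is nested inside "forest fire", so an extraction layer has to be able to emit both spans for the same sentence.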

Abstract

The invention discloses a method for extracting elements of radio and television news events based on deep learning, including the following steps: S1, annotating the summary and element information of the radio and television news data to be analyzed, and constructing a summary data set and an element data set; S2, using a pre-trained model to construct a summary extraction model and an element extraction model, and training both models on the constructed summary data set and element data set; S3, using the summary extraction model and element extraction model trained in step S2 to build a two-stage automatic extraction model for radio and television news elements, and using this model to predict on the input radio and television news to obtain structured element extraction results. The invention provides intelligent technical support for upper-level analysis and application services involving all-media news content, such as establishing a content knowledge base and sorting out the context of news events.

Description

Technical field

[0001] The present invention relates to the field of radio and television news text structuring, and more specifically, to a method for extracting radio and television news event elements based on deep learning.

Background

[0002] In recent years, with the rapid development of China's radio and television industry, media content data, user service data, and similar data have been growing massively.

[0003] Radio and television news is a kind of unstructured media content data that consists of a title, lead, body, background, and conclusion. Of these, the title, lead, and body are usually indispensable. Some scenes also include synchronous sound, so news is usually relatively lengthy. News elements such as time (when), place (where), person (who), event (what), and reason (why) are the basic elements that a news report must have, and can be used as the basic elements of the event information contained in the news content. Structural ch...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/34, G06F16/31, G06F16/35, G06F40/247, G06F40/30, G06N3/08, G06N3/04
CPC: G06F16/345, G06F16/313, G06F16/353, G06F40/247, G06F40/30, G06N3/08, G06N3/047, G06N3/048
Inventor: 杨瀚, 朱婷婷, 温序铭
Owner: 成都索贝视频云计算有限公司