An Intelligent News Cataloging Method Based on AI Content Analysis and OCR Recognition

A content analysis and news cataloging technology, applied in the fields of selective content distribution, character and pattern recognition, and instruments. It addresses split news segments that have no effective title, differ substantially from the actual content, and read poorly, with the effect of improving work efficiency and meeting business requirements.

Active Publication Date: 2021-05-04
BEIJING DAYANG TECH DEV

AI Technical Summary

Problems solved by technology

In practice, segments obtained through video content analysis may have the following problems: because of the voice analysis module, the split segments differ from the actual ones, with some segments omitted or split too finely; the split segments have no effective title and read poorly; and the abstracts extracted for the split segments differ substantially from the actual content, so the news segments cannot be summarized accurately.
As a result, the intelligently split news segments do not effectively improve the work efficiency of catalogers: catalogers must either re-enter the segment names or browse each segment and then extract and record its summary.


Examples


Embodiment 1

[0027] This embodiment is an intelligent news cataloging method based on AI content analysis and OCR recognition. The method includes the following steps, and the process is shown in Figure 1:

[0028] Step 1, decoding processing: performing decoding processing on the obtained video and audio files to obtain video streams and audio streams.

[0029] Conventional decoding is performed on the video and audio files to be processed to obtain a video stream and an audio stream, in preparation for processing the two streams separately. The following steps are performed in parallel on the video stream and the audio stream respectively.
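The parallel handling of the two decoded streams can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names `process_video_stream`, `process_audio_stream`, and `run_branches` are hypothetical, the streams are stood in for by plain Python lists, and the per-branch work is a placeholder rather than the patent's actual analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def process_video_stream(video_stream):
    # Placeholder for the video branch (key frames, shots, OCR).
    # Hypothetical: the real analysis is not specified at this point in the text.
    return {"branch": "video", "frames": len(video_stream)}


def process_audio_stream(audio_stream):
    # Placeholder for the audio branch (for example, speech analysis).
    return {"branch": "audio", "samples": len(audio_stream)}


def run_branches(video_stream, audio_stream):
    """Run the video and audio branches concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        video_future = pool.submit(process_video_stream, video_stream)
        audio_future = pool.submit(process_audio_stream, audio_stream)
        # result() blocks until each branch has finished.
        return video_future.result(), audio_future.result()
```

A thread pool is used here only for brevity; CPU-heavy analysis would more likely run in separate processes or services.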

[0030] Processing of video streams:

[0031] Step 2, extract video key frames: extract key frames from the video stream obtained in step 1, and extract screen content information from key frames to obtain label data.

[0032] Firstly, the conventional video clustering method is used for the video stream to e...

Embodiment 2

[0049] This embodiment is an improvement of the first embodiment, and is a refinement of step 4 of the first embodiment. The method of extracting high-value key frames described in step 4 of this embodiment is: scoring the results of the content analysis, and extracting highlights with high business value from the identified segments.

[0050] Given the business characteristics of simulcast news, a single shot may contain the title, the host entering or leaving the frame, landmarks, sensitive persons, and other information. Extracting these pictures as key pictures lets the cataloger grasp the content intuitively in the shortest possible time. To that end, this embodiment uses OCR technology and face recognition technology to analyze these business elements in a targeted manner, and scores the structured data for each business feature that is hit, such as 3 points for sensitive persons, 2 points for titles, 2 points for special ...
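The scoring idea in this embodiment can be illustrated with a short sketch. Only the two weights that are fully legible in the text (3 points for sensitive persons, 2 points for titles) are used; the remaining weights are truncated in the source and are left out. The names `KeyFrame`, `score_frame`, and `top_key_frames`, the feature labels, and the `limit` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Weights stated in the text. Other business features are truncated in the
# source, so they are intentionally absent and score 0 here.
FEATURE_WEIGHTS = {"sensitive_person": 3, "title": 2}


@dataclass
class KeyFrame:
    timestamp: float                            # position of the frame in seconds
    features: set = field(default_factory=set)  # business features hit by OCR / face recognition


def score_frame(frame, weights=FEATURE_WEIGHTS):
    """Sum the weights of every business feature detected in the frame."""
    return sum(weights.get(name, 0) for name in frame.features)


def top_key_frames(frames, limit=3, weights=FEATURE_WEIGHTS):
    """Return up to `limit` frames with a positive score, highest score first."""
    scored = [(score_frame(f, weights), f) for f in frames]
    scored = [(s, f) for s, f in scored if s > 0]
    # Ties are broken by earlier timestamp so the output order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1].timestamp))
    return [f for _, f in scored[:limit]]
```

Keeping the weights in a plain dictionary means additional business features can be scored by adding entries, without changing the selection logic.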

Embodiment 3

[0052] This embodiment is an improvement of the above embodiment, and is a refinement of step 5 of the above embodiment. The partition processing described in step 5 of this embodiment is: divide the video key frame into 16 regions to identify station logo, title, logo and channel information.

[0053] The 16-area processing method for the OCR text recognition results of video frames can be combined with business recognition to identify content such as station logos and titles.

[0054] After the news video picture is analyzed, the whole frame is divided into a 4×4 grid of 16 areas, as shown in Figure 3. In combination with the business characteristics, different areas are assigned business attributes for segmentation, yielding the expected station logo, logo, news title, and other information. After this processing, the structured data of the shot carries richer business characteristics, which provides a rich data basis for the content of the fol...
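One way to picture the 4×4 partition is to map the center of each OCR text box to one of the 16 areas and look up a business attribute for that area. The sketch below assumes a row-major area index from 0 to 15 and a purely hypothetical attribute table (station logo in the top-left area, title along the bottom row); the actual assignment of attributes to areas is not given in the text.

```python
GRID_ROWS = 4
GRID_COLS = 4

# Hypothetical mapping from area index to business attribute; the real
# assignment depends on the channel's layout and is not stated in the source.
REGION_ATTRIBUTES = {
    0: "station_logo",
    12: "title", 13: "title", 14: "title", 15: "title",
}


def region_index(cx, cy, frame_w, frame_h):
    """Return the row-major index (0..15) of the area containing point (cx, cy)."""
    col = min(int(cx * GRID_COLS / frame_w), GRID_COLS - 1)
    row = min(int(cy * GRID_ROWS / frame_h), GRID_ROWS - 1)
    return row * GRID_COLS + col


def classify_text_box(box, frame_w, frame_h):
    """Classify an OCR box given as (x, y, width, height) by the area of its center."""
    x, y, w, h = box
    index = region_index(x + w / 2, y + h / 2, frame_w, frame_h)
    return index, REGION_ATTRIBUTES.get(index, "other")
```

Using the box center keeps the lookup simple; a box that spans several areas could instead be assigned by largest overlap.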


Abstract

The invention relates to an intelligent news cataloging method based on AI content analysis and OCR recognition, including: decoding processing; extracting video key frames; extracting shots; extracting high-value key frames; linguistic analysis; and outputting complete segment information. Based on methods such as content analysis, partition processing of OCR text recognition results, and regular expression matching, the invention can extract the titles, abstracts, and highlights of news segments and meet the requirements of business cataloging. On top of conventional content analysis, the invention adds a logical processing method that runs quickly and does not noticeably increase the overall processing time. It meets users' business requirements, applies intelligent data processing to practical use, and ultimately improves the work efficiency of catalogers.

Description

Technical field

[0001] The invention relates to an intelligent news cataloging method based on AI content analysis and OCR recognition, which is a computer processing method and a method for processing digital video signals.

Background

[0002] For news programs, the traditional manual cataloging method requires catalogers to browse the entire news program, find the entry and exit points of each news segment one by one, and manually cut out multiple segments. The cataloger must also check the video content carefully in order to title each segment according to the actual screen content and to describe and record the segment's content with keywords. The whole process relies entirely on the manual work of catalogers, takes a long time, and makes cataloging and recording inefficient. Some existing solutions include intelligent stripping methods based on audio and video s...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04N21/44, H04N21/4402, H04N21/4415, H04N21/84, H04N21/8549, G06F16/78, G06F16/783, G06K9/34
CPC: H04N21/44008, H04N21/4402, H04N21/44016, H04N21/4415, H04N21/84, H04N21/8549, G06F16/7867, G06F16/784, G06V30/153
Inventor: 李永葆, 陈美玲, 严佳, 王彦斌
Owner: BEIJING DAYANG TECH DEV