News event searching method and system based on multistage image-text semantic alignment model

A news search method, applied in the computer field, that addresses the inability to represent image and text inputs independently with top-level embedding vectors, computing performance bottlenecks, and long retrieval times. It improves generalization performance, recall rate, average accuracy, and precision, and shortens retrieval time.

Pending Publication Date: 2022-04-08

BEIJING UNIV OF POSTS & TELECOMM

Cites: 0; Cited by: 7


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0067] Although combining image and text features gives the hidden layers of the model more cross-feature information, the top-level embedding vector can then no longer represent the image and text input signals independently.

Compared with common space feature learning methods, cross-modal similarity measurement methods have a time-consuming search process.

Specifically, when the user enters a text query q, the system must compute online the feature combination of q with every image to obtain the similarity score between q and each image. This computation is a major performance bottleneck and makes the approach impractical.
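The cost difference described above can be illustrated with a small sketch. It is not the patent's implementation: the "models" here are hypothetical toy functions (`toy_joint_model`, `toy_text_encoder`, `toy_image_encoder`) that only compute dot products, and the sketch merely counts how many model forward passes are needed at query time under each approach.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class OnlineCost:
    """Counts model forward passes that happen at query time."""

    def __init__(self):
        self.forward_passes = 0


def toy_joint_model(query_feat, image_feat, cost):
    # Hypothetical stand-in for a network that fuses text and image
    # features; it must run once per (query, image) pair.
    cost.forward_passes += 1
    return dot(query_feat, image_feat)


def toy_text_encoder(query_feat, cost):
    # Hypothetical stand-in for a text encoder; runs once per query.
    cost.forward_passes += 1
    return list(query_feat)


def toy_image_encoder(image_feat):
    # Hypothetical stand-in for an image encoder; runs offline.
    return list(image_feat)


def search_with_joint_scoring(query_feat, image_feats):
    """Cross-modal similarity measurement: one model pass per image."""
    cost = OnlineCost()
    scores = [toy_joint_model(query_feat, img, cost) for img in image_feats]
    return scores, cost.forward_passes


def search_with_common_space(query_feat, image_index):
    """Common space learning: one text pass, then cheap dot products."""
    cost = OnlineCost()
    q = toy_text_encoder(query_feat, cost)
    scores = [dot(q, vec) for vec in image_index]
    return scores, cost.forward_passes


images = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
query = [0.9, 0.1]

# Image vectors are computed once, before any query arrives.
image_index = [toy_image_encoder(img) for img in images]

joint_scores, joint_passes = search_with_joint_scoring(query, images)
common_scores, common_passes = search_with_common_space(query, image_index)
```

In this toy setup the joint approach needs one forward pass per image for every query, while the common-space approach needs a single text-encoder pass regardless of collection size.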


Embodiment Construction

[0133] The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings.

[0134] The news event search method based on the multi-level image-text semantic alignment model proposed by the present invention comprises the following steps:

[0135] Step 1) Build a multimodal news image-text dataset

[0136] Unlike traditional methods, neural network training requires a large number of samples. A usable, high-quality multimodal dataset of news images and text is therefore the first step in researching cross-modal search algorithms for news events. At present, no open-source multimodal image-text dataset of news events exists, so the dataset must be built from scratch.

[0137] The specific steps for constructing the multimodal news image-text dataset are as follows:

[0138] Step 1.1) News event selection

[0139] Aiming at the par...


Abstract

The invention provides MSAVT, a multi-level vision-text semantic alignment model for image-text matching, together with a news event retrieval method based on it. The method achieves cross-modal image-text search of news events and meets current news retrieval requirements. The proposed cross-modal retrieval model aligns images and text more precisely, and when applied to cross-modal image-text retrieval of news events it markedly improves metrics such as recall rates at multiple levels and average accuracy. A pre-trained BERT model is introduced to extract text features, which improves the generalization performance of the algorithm. The model adopts a common space feature learning method, so vector representations of images and texts can be obtained independently. The vector representations of retrieval results can therefore be stored in advance, which keeps retrieval time short and makes the method applicable to real scenarios.
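A minimal sketch of how retrieval over pre-stored vectors and a recall metric might look is given below. It rests on several assumptions: cosine similarity is used as the common-space score, Recall@K is taken in its usual image-text retrieval sense (the fraction of queries whose correct image appears among the top K results), and the vectors are hand-written toy values rather than outputs of MSAVT or BERT. All function and variable names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_top_k(query_vec, index, k):
    """Return the ids of the k stored vectors most similar to the query.

    `index` maps image id -> unit-length vector computed in advance.
    """
    q = l2_normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), image_id)
        for image_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


def recall_at_k(queries, index, ground_truth, k):
    """Fraction of queries whose correct image is in the top-k results."""
    hits = 0
    for query_id, query_vec in queries.items():
        if ground_truth[query_id] in cosine_top_k(query_vec, index, k):
            hits += 1
    return hits / len(queries)


# Toy "pre-stored" image vectors (assumed values, not real model output).
index = {
    "img_a": l2_normalize([1.0, 0.0]),
    "img_b": l2_normalize([0.0, 1.0]),
    "img_c": l2_normalize([1.0, 1.0]),
}

# Toy text-query vectors and the image each query should retrieve.
queries = {"q1": [0.9, 0.1], "q2": [0.2, 0.8]}
ground_truth = {"q1": "img_a", "q2": "img_c"}

r_at_1 = recall_at_k(queries, index, ground_truth, 1)
r_at_2 = recall_at_k(queries, index, ground_truth, 2)
```

Because the image vectors are stored ahead of time, only the query vector has to be computed when a search arrives; the rest is a similarity ranking over the stored index.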

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a news event search method based on a multi-level image-text semantic alignment model.

Background technique

[0002] Cross-modal retrieval

[0003] A modality is the form in which data exists, such as text, pictures, or video. Cross-modal retrieval uses data from one modality as a query to retrieve data from another modality. The most common case is image-text retrieval: given a piece of text, retrieve related images, or conversely, given an image, retrieve related text. The main difficulty of cross-modal retrieval is the "heterogeneous gap": the query input and the retrieval results are of different types, and the two kinds of data lie in different distribution spaces. Although their high-level semantics are related, their similarity cannot be measured directly. Therefore, the focus of research is how to represent ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F16/9535, G06F16/907, G06F16/906, G06N3/04, G06N3/08

CPC: G06N3/04, G06N3/08, G06F16/906, G06F16/907, G06F16/9535

Inventor: 范春晓, 吴岳辛, 孙娟娟, 汤艺, 郭皓洁

Owner: BEIJING UNIV OF POSTS & TELECOMM
