News event searching method and system based on multistage image-text semantic alignment model

A news event search method, applied in the computer field, addresses problems such as the inability of top-level embedding vectors to independently represent image and text inputs, computing performance bottlenecks, and long retrieval times. It improves generalization performance, recall rate, and average accuracy and precision, and shortens retrieval time.

Pending Publication Date: 2022-04-08
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0067] Although combining image and text features gives the model's hidden layers more cross-feature information, the top-level embedding vectors cannot then represent the image and text inputs independently.
Compared with common space feature learning methods, the search process of cross-modal similarity measurement methods is time-consuming.
Specifically, when the user enters a text query q, the system must compute the feature combination of q with every image online to obtain the similarity score between q and each image. This computation is a severe performance bottleneck that makes the approach impractical.
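
The bottleneck described above can be illustrated with a minimal sketch. It assumes a hypothetical pairwise scorer, `toy_joint_score`, standing in for a real cross-modal similarity model; the word-overlap scoring and the toy image tags are illustrative only and are not taken from the patent. The point is that every query triggers one scorer call per stored image.

```python
from typing import Callable, List, Sequence, Tuple

# Records one entry per scorer call so the per-query cost is visible.
call_log: List[int] = []


def toy_joint_score(query: str, image_tags: Sequence[str]) -> float:
    """Hypothetical stand-in for a joint image-text model: counts shared words."""
    call_log.append(1)
    return float(len(set(query.split()) & set(image_tags)))


def search_with_joint_scoring(
    query: str,
    images: List[Sequence[str]],
    joint_score: Callable[[str, Sequence[str]], float],
) -> List[Tuple[int, float]]:
    """Score every (query, image) pair online, then rank by descending score."""
    scored = [(idx, joint_score(query, image)) for idx, image in enumerate(images)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy "images", each represented by a few tags (an assumption for illustration).
images = [["flood", "rescue"], ["election", "vote"], ["flood", "city", "street"]]
ranking = search_with_joint_scoring("flood in the city", images, toy_joint_score)
```

With N stored images, the scorer runs N times for each query, and none of that work can be done before the query arrives.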


Examples

Embodiment Construction

[0133] The technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the accompanying drawings.

[0134] The news event search method based on the multi-level image-text semantic alignment model proposed by the present invention comprises the following steps:

[0135] Step 1) Building a multimodal news image-text dataset

[0136] Unlike traditional methods, training neural networks requires a large number of samples. An available, high-quality multimodal dataset of news images and text is the first step in researching cross-modal search algorithms for news events. At present, there is no open-source multimodal image-text dataset of news events, so the dataset must be built from scratch.

[0137] The specific steps for constructing the multimodal news image-text dataset are as follows:

[0138] Step 1.1) News event selection

[0139] Aiming at the par...

Abstract

A multi-level vision-text semantic alignment model, MSAVT, for image-text matching is provided, together with a news event retrieval method based on it; the method achieves cross-modal image-text search for news events and meets current news retrieval requirements. The proposed cross-modal retrieval model aligns images and text more precisely, and when it is applied to cross-modal image-text retrieval of news events, metrics such as recall at multiple levels and average accuracy improve markedly. A pre-trained BERT model is introduced to extract text features, which improves the generalization performance of the algorithm. The model adopts a common space feature learning method, so vector representations of images and texts can be obtained independently; the vector representations of retrieval candidates can therefore be stored in advance, retrieval time is short, and the method can be applied in practical scenarios.
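
The idea of storing vector representations in advance can be sketched as follows. This is a minimal illustration and not the MSAVT implementation: the three-dimensional vectors, the image identifiers, and the helper names `build_index` and `search` are assumptions made for the example, and cosine similarity is used as one common choice of similarity in a shared space.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_index(image_vectors: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Offline step: normalize and store every image embedding once."""
    return {image_id: l2_normalize(vec) for image_id, vec in image_vectors.items()}


def search(query_vector: Sequence[float], index: Dict[str, List[float]], top_k: int = 2) -> List[str]:
    """Online step: embed only the query, then rank stored vectors by cosine similarity."""
    q = l2_normalize(query_vector)
    scores = {
        image_id: sum(a * b for a, b in zip(q, vec))
        for image_id, vec in index.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical image embeddings, assumed to come from an image encoder.
index = build_index({
    "img_flood": [0.9, 0.1, 0.0],
    "img_vote": [0.0, 0.2, 0.9],
    "img_rescue": [0.7, 0.6, 0.1],
})
# Hypothetical text-query embedding, assumed to come from a text encoder.
results = search([1.0, 0.2, 0.0], index, top_k=2)
```

At query time only the text encoder runs; ranking reduces to dot products against the stored vectors, which is why retrieval is fast under this design.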

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular to a news event search method based on a multi-level image-text semantic alignment model.

Background

[0002] Cross-modal retrieval

[0003] A modality is the form in which data exists, such as text, pictures, or video. Cross-modal retrieval uses data from one modality as the query to retrieve data from another modality. The most common case is image-text retrieval: given a piece of text, retrieve related images, or conversely, given an image, retrieve related text. The main difficulty of cross-modal retrieval is the "heterogeneous gap": the query input and the retrieval result are of different modalities, so the two kinds of data lie in different distribution spaces. Although their high-level semantics are related, their similarity cannot be measured directly. Therefore, the focus of research is how to represent ...
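
As a rough illustration of the heterogeneous gap, the sketch below uses a text feature and an image feature of different dimensionality, which cannot be compared directly, and maps both into a shared two-dimensional space with linear projections. The feature values and the projection matrices `W_TEXT` and `W_IMAGE` are hand-picked assumptions; in a trained model such mappings would be learned, and the patent's actual architecture is not reproduced here.

```python
import math
from typing import List, Sequence


def project(matrix: Sequence[Sequence[float]], vec: Sequence[float]) -> List[float]:
    """Apply a linear map (one output value per matrix row) to a feature vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Raw features live in different spaces: 4-dimensional text, 3-dimensional image.
text_feature = [0.2, 0.8, 0.1, 0.5]
image_feature = [0.9, 0.3, 0.4]

# Hand-picked projections into a shared 2-dimensional space (illustrative only).
W_TEXT = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
W_IMAGE = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

text_common = project(W_TEXT, text_feature)
image_common = project(W_IMAGE, image_feature)
similarity = cosine(text_common, image_common)
```

Once both modalities are expressed in the same space, an ordinary vector similarity becomes meaningful, which is the premise of common space feature learning.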

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/9535, G06F16/907, G06F16/906, G06N3/04, G06N3/08
CPC: G06N3/04, G06N3/08, G06F16/906, G06F16/907, G06F16/9535
Inventor: 范春晓, 吴岳辛, 孙娟娟, 汤艺, 郭皓洁
Owner: BEIJING UNIV OF POSTS & TELECOMM