
Image text matching model training method, bidirectional search method and related device

A technology for training image-text matching models, applied in the field of artificial intelligence, which solves problems such as the inability to comprehensively measure the degree of matching between images and texts, and achieves accurate and comprehensive matching and accurate search results.

Active Publication Date: 2018-07-17
TENCENT TECH (SHENZHEN) CO LTD
Cites: 7 · Cited by: 35

AI Technical Summary

Problems solved by technology

[0007] The embodiments of the present application provide a training method, a search method, and a related device for an image-text matching model, so as to solve problems in the prior art such as the inability to comprehensively measure the degree of matching between an image and a text.



Examples

Experimental program
Comparison scheme
Effect test

Embodiment 1

[0055] Referring to Figure 2, which is a flowchart of the training method for an image-text matching model provided in Embodiment 1 of the present application, the method includes the following steps:

[0056] Step 201: Extract global representation and local representation of image samples.

[0057] Step 202: Extract global and local representations of text samples.

[0058] It should be noted that the execution sequence of step 201 and step 202 is not limited.

[0059] Step 203: Train a pre-built matching model on the extracted global and local representations, so that the matching model can determine the degree of matching between an image and a text based on both representations;

[0060] The matching model maps the global representations of image samples and text samples into a specified semantic space and calculates the global-representation similarity between heterogeneous sample pairs composed of ima...
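The fusion described in steps 201 to 203 can be sketched as follows. This is a minimal illustration, not the patent's exact formulation: the cosine similarity, the 0.5/0.5 default weights, and the function names are all illustrative assumptions.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    # that have been mapped into the shared semantic space.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_degree(img_global, img_local, txt_global, txt_local,
                 w_global=0.5, w_local=0.5):
    # Matching degree as a weighted sum of the global-representation
    # similarity and the local-representation similarity.
    return (w_global * cosine(img_global, txt_global)
            + w_local * cosine(img_local, txt_local))
```

With identical image and text embeddings, both similarities equal 1, so the weighted sum is 1 for any weights summing to 1.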

Embodiment 2

[0109] As shown in Figure 7, which is a schematic flowchart of a specific embodiment of the training method for an image-text matching model provided in the embodiments of the present application, the method includes the following steps:

[0110] Step 701: Extract the global representation of the image sample based on the global image representation CNN.

[0111] Step 702: Divide the image sample into a specified number of image blocks; for each image block, compute, based on the local image CNN, the probability that the block contains image information of each specified category; then, for each specified category, select the maximum of that category's probabilities over the specified number of image blocks, and let these maxima constitute the local representation of the image sample.
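The per-category max-pooling over image blocks in step 702 can be sketched as below (the array layout and function name are illustrative assumptions; the block-level probabilities would come from the local image CNN):

```python
import numpy as np

def local_image_representation(block_probs):
    # block_probs: array of shape (num_blocks, num_categories), where
    # entry [i, c] is the probability that block i contains image
    # information of category c.
    # The local representation keeps, for each category, the maximum
    # probability over all blocks (max-pooling across blocks).
    return np.asarray(block_probs).max(axis=0)
```

For example, with two blocks and two categories, probabilities [[0.1, 0.9], [0.8, 0.2]] pool to [0.8, 0.9].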

[0112] Step 703: Segment the text sample into words; for each word segment, determine the vector of that segment, where the vectors of different word segments have the same length; ...
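The fixed-length word-segment vectors of step 703 can be sketched with a simple lookup table (the embedding dimension, the random initialization for unseen segments, and the function name are illustrative assumptions; a trained embedding would be used in practice):

```python
import numpy as np

def embed_segments(segments, embedding_table, dim=300, seed=0):
    # Map each word segment to a vector of the same fixed length.
    # Unseen segments get a deterministic random vector, so repeated
    # segments always map to the same vector.
    rng = np.random.default_rng(seed)
    vectors = []
    for seg in segments:
        if seg not in embedding_table:
            embedding_table[seg] = rng.standard_normal(dim)
        vectors.append(embedding_table[seg])
    return np.stack(vectors)
```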

Embodiment 3

[0122] As shown in Figure 8, which is a flowchart of the image-text bidirectional search method based on the matching model of Embodiment 1, the method includes the following steps:

[0123] Step 801: Receive a reference sample, where the reference sample is text or an image.

[0124] Step 802: Extract global representation and local representation of the reference sample.

[0125] Step 803: Input the global representation and local representation of the reference sample into the matching model, so that the matching model can calculate the matching degree between the reference sample and the corresponding materials. If the reference sample is text, the corresponding material is an image; if the reference sample is an image, the corresponding material is text. The matching model can determine the matching degree between an image and a text based on their global and local representations.

[0126] A material library may be established, and a matching degree...
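The search over a material library in steps 801 to 803 can be sketched as a simple score-and-rank loop. This is a hedged sketch: the scoring callback stands in for the trained matching model, and the function name and top-k cutoff are illustrative assumptions.

```python
def search_library(score_fn, materials, top_k=5):
    # Score every candidate material against the reference sample
    # (score_fn would wrap the matching model's matching-degree
    # computation) and return the top-k matches, highest degree first.
    scored = [(material, score_fn(material)) for material in materials]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Because the same model scores image-to-text and text-to-image pairs, the identical loop serves both search directions; only the material library changes.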


PUM

No PUM data available

Abstract

The invention relates to the field of artificial intelligence, and in particular to an image-text matching model training method, a bidirectional search method, and a related device. The training method comprises the following steps: global and local representations of an image sample and a text sample are extracted, and a pre-constructed matching model is trained; the matching model maps the global and local representations of the image sample and the text sample into a specified semantic space and calculates the similarity of the global representations and the similarity of the local representations; then, according to default weights for the global-representation similarity and the local-representation similarity, the image-text matching degree is determined by weighted summation. The advantage of the training method is that the resulting matching degree considers both the detailed features and the global features of an image, and is therefore more accurate and comprehensive.

Description

Technical Field

[0001] The present application relates to the technical field of artificial intelligence, and in particular to a training method, a search method, and related devices for an image-text matching model.

Background Technique

[0002] Image and text understanding has always been one of the most important research directions in artificial intelligence, and discovering the relationship between images and text is a key part of it. For example, the news text and the news images in a web news article express the same theme; that is, an image and a text do not exist independently of each other, but rather have a matching relationship. Therefore, how to find the text matching a given image, or the image matching a given text, has become a topic of concern in the industry.

[0003] The inventors have found that, in the related art, the matching of images and texts is usually achieved through the following two methods:

[0004] Method...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06K9/62; G06N3/04
CPC: G06V10/76; G06N3/045; G06F18/22; G06F18/214; G06V30/413; G06V10/454; G06V10/761; G06V30/19173; G06N3/08
Inventor: 马林 (Ma Lin); 姜文浩 (Jiang Wenhao); 刘威 (Liu Wei)
Owner TENCENT TECH (SHENZHEN) CO LTD