
Fashion field-based end-to-end image semantic description method and system

An end-to-end image semantic description method and system for the fashion field, aimed at generating multi-attribute descriptions of fashion commodities that general-domain image captioning methods handle poorly.

Pending Publication Date: 2022-05-17
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Cites: 0 · Cited by: 0

AI Technical Summary

Problems solved by technology

Because search engines and recommendation systems typically match on keywords, such attribute-rich descriptions can attain higher click-through rates.
However, the particularities of the fashion domain mean that image semantic description methods transferred directly from general domains perform poorly on such multi-attribute descriptions. An image semantic description method designed specifically for the fashion field is therefore urgently needed.




Embodiment Construction

[0060] The present invention will be further explained below.

[0061] An end-to-end image semantic description method for the fashion field according to the present invention comprises the following steps:

[0062] Step 1: Data Preparation:

[0063] Fashion datasets are typically constructed by crawling e-commerce websites or by manual photography. The size, angle, and number of images per product are not uniform, image resolutions differ considerably between products, and some noise images are unrelated to the description content. In addition, the description styles of different datasets vary widely. In Fashion-Gen, each description is a templated text composed of multiple sentences; the overall description is long and uses more specialized fashion vocabulary, but the templated form may make it easier for the model to predict the template words while ignoring the meanin...
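The data-preparation step above amounts to a filtering pass over crawled product records. The sketch below is a hypothetical illustration; the thresholds and record fields (`width`, `height`, `caption`) are assumptions for demonstration, not values from the patent:

```python
# Hypothetical cleaning pass over crawled fashion-product records.
# Thresholds and field names are illustrative, not taken from the patent.
MIN_SIDE, MAX_ASPECT, MAX_CAPTION_WORDS = 224, 2.0, 40

def clean_records(records):
    """Drop noisy images (too small or extreme aspect ratio) and
    truncate overly long templated captions to a fixed word budget."""
    kept = []
    for r in records:
        w, h = r["width"], r["height"]
        if min(w, h) < MIN_SIDE:
            continue                      # too low-resolution to describe
        if max(w, h) / min(w, h) > MAX_ASPECT:
            continue                      # likely a banner / noise image
        words = r["caption"].split()
        r = dict(r, caption=" ".join(words[:MAX_CAPTION_WORDS]))
        kept.append(r)
    return kept

records = [
    {"width": 800, "height": 1000, "caption": "red knit sweater with ribbed cuffs"},
    {"width": 100, "height": 120,  "caption": "thumbnail"},    # dropped: too small
    {"width": 2400, "height": 300, "caption": "sale banner"},  # dropped: aspect ratio
]
print(len(clean_records(records)))  # 1
```

A real pipeline would also resize the surviving images to a common resolution before feature extraction, since the patent notes that resolutions differ considerably between products.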



Abstract

The invention discloses a fashion field-based image semantic description generation method and system. The method comprises the following steps: preparing a data set; extracting features; down-sampling commodity features with pooling layers at multiple scales and feeding the pooled features into a multi-layer perceptron to learn high-level visual features; designing a multi-scale Transformer network as a feature encoder for each high-level visual feature, learning the interactions within each feature; and using a decoder with a gated attention mechanism to compute the relative contribution of the multi-scale features to the description, fusing those contributions, and predicting the sentence. Because the method describes commodity pictures from real scenes, the model has greater robustness and application value. By exploiting multi-scale commodity features and a gated attention mechanism designed for the fashion field, the descriptions generated by the model are more accurate and more natural. The end-to-end model framework also lowers the threshold for deployment.
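The multi-scale part of the pipeline in the abstract (pooling at several scales, a perceptron for high-level features, gated fusion across scales) can be illustrated with a minimal NumPy sketch. All dimensions, the two-layer MLP, and the sigmoid gate are assumptions for illustration only; the patent's actual encoder is a multi-scale Transformer, which is omitted here:

```python
import numpy as np

def multiscale_pool(fmap, scales=(1, 2, 4)):
    """Average-pool a C x H x W feature map onto each s x s grid, yielding
    one set of C-dimensional region vectors per scale."""
    C, H, W = fmap.shape
    pooled = []
    for s in scales:
        # split H and W into s x s blocks and average inside each block
        blocks = fmap.reshape(C, s, H // s, s, W // s).mean(axis=(2, 4))
        pooled.append(blocks.reshape(C, s * s).T)  # (s*s, C) region vectors
    return pooled

def mlp(x, W1, b1, W2, b2):
    """Two-layer perceptron lifting pooled regions to high-level features."""
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2

def gated_fuse(scale_feats, w_gate):
    """Gated attention over scales: a sigmoid gate scores each scale's mean
    feature; normalized gates weight the fusion of the scales."""
    means = np.stack([f.mean(axis=0) for f in scale_feats])  # (S, D)
    gates = 1.0 / (1.0 + np.exp(-(means @ w_gate)))          # (S,)
    weights = gates / gates.sum()
    return (weights[:, None] * means).sum(axis=0), weights

rng = np.random.default_rng(0)
fmap = rng.standard_normal((4, 8, 8))            # toy CNN feature map
W1, b1 = rng.standard_normal((4, 16)) * 0.1, np.zeros(16)
W2, b2 = rng.standard_normal((16, 8)) * 0.1, np.zeros(8)
w_g = rng.standard_normal(8)

feats = [mlp(p, W1, b1, W2, b2) for p in multiscale_pool(fmap)]
fused, weights = gated_fuse(feats, w_g)
print(fused.shape, weights.round(3))  # an 8-dim fused vector, 3 scale weights
```

In the patent's full system the fused representation would condition a sentence decoder; here the sketch stops at the fused feature to show only how multi-scale pooling and gated fusion interact.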

Description

technical field

[0001] The invention belongs to the intersection of computer vision and natural language processing, and in particular relates to a method and system for generating semantic descriptions of pictures.

Background technique

[0002] Image semantic description (image captioning) is a comprehensive task combining computer vision and natural language processing; its goal is to translate a picture into a textual description. The task requires a model not only to understand the content of the picture and capture its semantic information, but also to generate readable sentences in natural language. Automatically generating semantic descriptions for images can help visually impaired people understand image content, by converting images into text and then audio, and can also assist in understanding complex images such as spectrograms and remote sensing images. In addition, tasks related to image understanding, such as image-text retri...

Claims


Application Information

IPC(8): G06F40/126; G06K9/62; G06N3/04; G06N3/08
CPC: G06F40/126; G06N3/08; G06N3/045; G06F18/253
Inventors: Zhang Liyan; Tang Yuhao
Owner NANJING UNIV OF AERONAUTICS & ASTRONAUTICS