Method for automatically generating image captions based on multimodal attention
Attention and multimodal techniques, applied at the intersection of computer vision and natural language processing, address problems such as the lack of a strict alignment between visual and semantic information and the limited number of categories, among others. The method alleviates the alignment problem between visual features and semantic features and improves the quality of the generated captions.
Example Embodiment
[0019] The invention provides a method for automatically generating image captions based on multimodal attention. The specific embodiments discussed here only illustrate how the invention may be implemented and do not limit its scope. The embodiments are described in detail below with reference to the drawings. The specific steps are as follows:
[0020] (1) Image preprocessing
[0021] A selective search algorithm is used to extract the image regions that contain objects from the original image. These regions differ in size, however, which makes them unsuitable for direct feature extraction by the ResNet convolutional neural network. The invention therefore scales each extracted region to the required input size and, at the same time, normalizes its pixel values.
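The scaling and normalization in this step can be illustrated with a minimal, dependency-free sketch. It assumes the region proposals are already available as (x, y, w, h) boxes and does not implement selective search itself. The 224-pixel target size, the ImageNet channel statistics, the nearest-neighbor resampling, and all function names are assumptions made for illustration; the source does not specify them.

```python
# Hypothetical settings: the source states neither the target size
# nor the normalization statistics.
TARGET_SIZE = 224                # common ResNet input side length (assumption)
MEAN = (0.485, 0.456, 0.406)     # ImageNet channel means (assumption)
STD = (0.229, 0.224, 0.225)      # ImageNet channel standard deviations (assumption)


def crop(image, box):
    """Cut the region (x, y, w, h) out of an image stored as rows of (R, G, B) pixels."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def resize_nearest(region, size):
    """Scale a region to size x size pixels with nearest-neighbor sampling."""
    src_h, src_w = len(region), len(region[0])
    return [
        [region[(i * src_h) // size][(j * src_w) // size] for j in range(size)]
        for i in range(size)
    ]


def normalize(region):
    """Map 0-255 values to [0, 1], then standardize each channel."""
    return [
        [tuple((v / 255.0 - m) / s for v, m, s in zip(pixel, MEAN, STD))
         for pixel in row]
        for row in region
    ]


def preprocess_regions(image, boxes, size=TARGET_SIZE):
    """Crop, rescale, and normalize every proposed region."""
    return [normalize(resize_nearest(crop(image, box), size)) for box in boxes]
```

Every region comes out with the same spatial size regardless of the shape of its box, which is the property the feature extractor needs. A practical pipeline would likely use an array or image library with bilinear interpolation rather than nested lists.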
[0022] (2...