
Commodity title generation method based on multi-mode GPT2 model

A multi-modal commodity title generation technology, applied in image data processing, instrumentation, electrical digital data processing, etc., that addresses problems such as generated content being difficult to control.

Pending Publication Date: 2021-09-10
FOCUS TECH

AI Technical Summary

Problems solved by technology

[0003] At present, text generation models represented by GPT2, pre-trained with a large number of parameters on a massive corpus, can generate very coherent text that is hard to distinguish from human-written text. However, it is difficult to control the content that this type of model generates, whereas a published product title must be closely related to the content of the product itself, which requires strong conditional control capabilities.



Examples


Detailed Description of the Embodiments

[0023] The present invention is further described below with reference to the accompanying drawings and exemplary embodiments:

[0024] As shown in Figure 1, the present invention discloses a product title generation scheme based on multimodal GPT2, including a unified pre-processing and post-processing process.

[0025] GPT is an NLP (Natural Language Processing) model, and GPT-2 is an upgraded version of GPT. The biggest difference is that GPT-2 has a larger scale and more training data. GPT is a 12-layer transformer and the deepest BERT is a 24-layer transformer, whereas GPT-2 can have up to 48 layers. Its training data is the WebText dataset, which underwent some simple data cleaning and covers a very wide range of domains.

[0026] Step 1: Preprocess the corpus, compile the attribute dictionary and the special markers, and obtain desensitized product titles;

[0027] Step 2: Encode the product content, including encoding pictures with ResNet, encoding category names with GPT2, and encoding attributes w...
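
The preprocessing in Step 1 can be illustrated with a minimal sketch. It assumes that "desensitizing" a title means replacing attribute values that appear in the title with special markers, which is one plausible reading given that post-processing later swaps markers back for attribute values. The marker format (`[MATERIAL]`), the function names, and the sample corpus are all hypothetical and do not come from the patent.

```python
import re
from collections import Counter


def marker_for(attr_name):
    """Build a marker such as "[MATERIAL]" (hypothetical marker format)."""
    return "[" + re.sub(r"\W+", "_", attr_name.strip()).upper() + "]"


def build_attribute_dictionary(products):
    """Count how often each attribute name occurs across the corpus."""
    counts = Counter()
    for product in products:
        counts.update(product["attributes"].keys())
    return counts


def desensitize_title(title, attributes):
    """Replace attribute values found in the title with their markers."""
    # Longer values go first so a short value cannot split a longer one.
    ordered = sorted(attributes.items(), key=lambda kv: -len(kv[1]))
    for name, value in ordered:
        if not value:
            continue
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        # A function replacement avoids any escape handling in the marker.
        title = pattern.sub(lambda _m, n=name: marker_for(n), title)
    return title


# Hypothetical two-item corpus, for illustration only.
corpus = [
    {"title": "Stainless Steel Water Bottle 500ml",
     "attributes": {"material": "Stainless Steel", "capacity": "500ml"}},
    {"title": "Cotton T-Shirt for Men",
     "attributes": {"material": "Cotton"}},
]

attribute_dictionary = build_attribute_dictionary(corpus)
special_markers = {name: marker_for(name) for name in attribute_dictionary}
desensitized = [desensitize_title(p["title"], p["attributes"]) for p in corpus]
print(desensitized)
```

Plain substring matching is used here only for brevity; a real corpus would likely need tokenization-aware matching so that a short attribute value does not match inside an unrelated word.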


Abstract

A commodity title generation method based on a multi-modal GPT2 model fuses commodity information of different modalities into the context state of GPT2 and generates a commodity title on the basis of understanding the commodity content. The method comprises: 1) preprocessing the commodity corpus data; 2) using a commodity information encoding module in which ResNet encodes commodity pictures, an Embedding layer encodes commodity attributes, and GPT2 encodes the commodity category name, yielding encoded representations of three different modalities from the GPT2 network, the ResNet image encoder, and the Embedding attribute encoder; 3) applying a title generation module; and 4) post-processing the generated commodity titles by identifying the special markers in them and replacing those markers with the corresponding commodity attributes, thereby completing the specification and parameter information of the generated title text.
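
The post-processing step in the abstract, which identifies special markers in a generated title and replaces them with the corresponding commodity attributes, might look roughly like the sketch below. The bracketed marker syntax, the function name `restore_markers`, and the choice to drop markers that have no matching attribute are assumptions made for illustration; the source does not specify them.

```python
import re

# Hypothetical marker syntax: an upper-case attribute name in square brackets.
MARKER_PATTERN = re.compile(r"\[([A-Z0-9_]+)\]")


def restore_markers(generated_title, attributes):
    """Replace each marker with the matching attribute value.

    Markers without a matching attribute are dropped (an assumption),
    and any leftover runs of whitespace are collapsed.
    """
    lookup = {
        name.strip().upper().replace(" ", "_"): value
        for name, value in attributes.items()
    }

    def substitute(match):
        return lookup.get(match.group(1), "")

    restored = MARKER_PATTERN.sub(substitute, generated_title)
    return re.sub(r"\s+", " ", restored).strip()


# Hypothetical generated title and product attributes.
title = restore_markers(
    "[MATERIAL] Water Bottle [CAPACITY] [COLOR]",
    {"material": "Stainless Steel", "capacity": "500ml"},
)
print(title)
```

Because the replacement text comes from the product's own attribute record, the specifications in the final title stay tied to the product data rather than to whatever the language model happened to generate.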

Description

Technical field

[0001] The invention relates to multi-modal understanding and text generation technology, in particular to a technology for understanding product content and automatically generating product titles from information in multiple modalities (including natural language processing models, etc.).

Background art

[0002] E-commerce platforms need to release product information frequently, including product categories, attributes, titles, descriptions, and pictures. Editing text information such as titles and descriptions is the most labor-intensive part. A product title must use highly refined language to highlight the product's characteristics and must describe the product accurately, based on an understanding of the product content. At the same time, the large volume of product information on the platform must remain diverse, which places high demands on intelligence.

[0003] At present, the text generation model represented by G...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/258, G06F40/237, G06T9/00
CPC: G06F40/258, G06F40/237, G06T9/002, Y02P90/30
Inventors: 蔡世清, 郭选陵
Owner: FOCUS TECH