Image-text data expansion method and device and electronic equipment

An image-text data technology, applied to data expansion methods, devices, and electronic equipment, that addresses the difficulty of increasing data collection.

Active Publication Date: 2020-01-14
BEIJING BYTEDANCE NETWORK TECH CO LTD
11 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, due to the diversity of real chat scenarios, different words may express the same meaning
In addition, in the process of automatic label gen



Examples


Embodiment Construction

[0059] Embodiments of the present disclosure will be described in detail below in conjunction with the accompanying drawings.

[0060] Embodiments of the present disclosure are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present disclosure from the content disclosed in this specification. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other specific implementations, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present disclosure. It should be noted that, where no conflict arises, the following embodiments and the features in the embodiments can be combined with each other. Based on the embodiments in the present disclosure, al...



Abstract

The embodiments of the invention provide an image-text data expansion method, device, and electronic equipment, belonging to the technical field of image processing. The method comprises: encoding the words in a corpus as vectors to obtain a word code for each word; clustering the word codes to obtain a plurality of word cluster sets; obtaining the image set corresponding to each of the word cluster sets; rejecting unqualified word sets according to the distribution of images in the image set, to obtain qualified word sets; and combining any element of a qualified word set with any element of the image set to obtain expanded image-text data. This scheme increases the amount of high-confidence image-text data and addresses the shortage of weak-label data relating images and text. The data collected through the scheme can be used for subsequent model training, data analysis, algorithm adjustment, and other stages.
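The five steps listed in the abstract can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the toy word vectors, the click log, the greedy cosine-similarity clustering, and the "top image share" rejection rule are all hypothetical stand-ins, since the abstract does not specify the encoder, the clustering algorithm, or the exact distribution criterion.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Hypothetical toy word vectors; a real system would use a trained encoder.
WORD_VECTORS = {
    "hello": (1.0, 0.1),
    "hi": (0.9, 0.2),
    "bye": (0.1, 1.0),
    "goodbye": (0.2, 0.9),
    "ok": (0.7, 0.7),
}

# Hypothetical click log: (typed word, image the user selected).
CLICK_LOG = [
    ("hello", "wave.png"), ("hello", "wave.png"), ("hi", "wave.png"),
    ("hi", "smile.png"),
    ("bye", "door.png"), ("goodbye", "door.png"), ("goodbye", "door.png"),
    ("ok", "thumb.png"), ("ok", "cat.png"), ("ok", "dog.png"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_words(vectors, sim_threshold):
    """Greedy clustering (an assumed stand-in for the unspecified algorithm):
    a word joins the first cluster whose seed word is similar enough."""
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= sim_threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def image_counts(cluster, click_log):
    """Count how often each image was selected for any word in the cluster."""
    members = set(cluster)
    return Counter(img for word, img in click_log if word in members)


def is_qualified(counts, min_top_share):
    """Assumed criterion: clicks must concentrate on one dominant image."""
    total = sum(counts.values())
    return total > 0 and max(counts.values()) / total >= min_top_share


def expand_pairs(vectors, click_log, sim_threshold=0.95, min_top_share=0.6):
    """Return expanded (word, image) pairs from all qualified clusters."""
    pairs = set()
    for cluster in cluster_words(vectors, sim_threshold):
        counts = image_counts(cluster, click_log)
        if is_qualified(counts, min_top_share):
            # Combine any word in the cluster with any image in its image set.
            pairs.update(product(cluster, counts))
    return pairs


if __name__ == "__main__":
    observed = set(CLICK_LOG)
    for pair in sorted(expand_pairs(WORD_VECTORS, CLICK_LOG) - observed):
        print("new pair:", pair)
```

With this toy data, the word "ok" forms its own cluster whose clicks are spread evenly over three images, so that cluster is rejected, while "hello" gains a pairing with an image that was only ever clicked for "hi". Both thresholds are illustrative and would need tuning on real data.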

Description

Technical field

[0001] The present disclosure relates to the technical field of image-text processing, and in particular to an image-text data expansion method, device, and electronic equipment.

Background

[0002] With the development of Internet technology, people increasingly socialize through the Internet. In social chat scenarios, images can be used alongside text to enrich the interaction. Socializing through images requires that users can find or select images that convey their meaning correctly. Currently, this is usually done by clicking: the user inputs text and then selects an image related to the text from the candidate images. In this case, the selected image can be considered to have a certain correlation (a weak label) with the input text.

[0003] Collecting data through such clicks can increase the amount of weak-label data relating images and text on the one hand, and ...


Application Information

IPC(8): G06F16/58, G06F16/35, G06Q50/00
CPC: G06F16/5866, G06F16/355, G06Q50/01
Inventor 范仲悦
Owner BEIJING BYTEDANCE NETWORK TECH CO LTD