Training sample generation method and device, storage medium and electronic equipment

A training sample generation technique based on columnar storage, applied in the field of data processing, that addresses problems such as large memory usage, data redundancy, and data delay.

Active Publication Date: 2021-11-26
BEIJING SANKUAI ONLINE TECH CO LTD
Cites: 9; Cited by: 2

AI Technical Summary

Problems solved by technology

[0003] The training sample collection scheme in related technologies has disadvantages such as large memory usage, data redundancy, and data delay.

Embodiment Construction

[0051] To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.

[0052] Each time a user sends a request to the online business system from a terminal, the system calls the model inference services for recommendation, search, and advertising, and reporting along the entire link generates two types of data: traffic data and request data. Traffic data includes exposure data and user behavior data. Exposure data refers to data displayed on the user terminal and includes fields such as an exposure indication and a request ID. User behavior data refers to displayed data that the user has clicked, saved, or otherwise acted on, or data that has stayed displayed on the page for a long time, and includes fields such as a behavior identifier and a request ID; the re...
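
The two kinds of traffic data described in [0052] can be pictured as small records that share a request ID. The following is a minimal sketch under stated assumptions, not the patent's schema: the class names, the field names `exposed` and `behavior`, and the helper `group_by_request_id` are all hypothetical, and the "other fields" mentioned in the text are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union


@dataclass(frozen=True)
class ExposureRecord:
    """Data that was displayed on the user terminal (hypothetical layout)."""
    request_id: str
    exposed: bool = True  # stands in for the "exposure indication" field


@dataclass(frozen=True)
class BehaviorRecord:
    """Displayed data the user acted on (hypothetical layout)."""
    request_id: str
    behavior: str  # stands in for the "behavior identifier", e.g. "click"


TrafficRecord = Union[ExposureRecord, BehaviorRecord]


def group_by_request_id(
    records: Iterable[TrafficRecord],
) -> Dict[str, List[TrafficRecord]]:
    """Collect traffic records that originate from the same request."""
    grouped: Dict[str, List[TrafficRecord]] = defaultdict(list)
    for record in records:
        grouped[record.request_id].append(record)
    return dict(grouped)
```

Because both record types carry the request ID, it serves as the key for associating traffic data with the data reported for the same request.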

Abstract

The embodiment of the invention provides a training sample generation method and device, a storage medium, and electronic equipment. It relates to the technical field of data processing and aims to provide a method for generating high-quality training samples. The method comprises the following steps: obtaining exposure data that carries a request ID and storing it in a columnar storage engine; querying a cache that stores a plurality of pieces of request data for request data with the same request ID as the exposure data; when such request data is found, storing it in the row of the columnar storage engine where the exposure data is located; and generating a training sample from each row in the columnar storage engine.
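
The steps in the abstract can be sketched as a small in-memory example. This is an illustrative sketch under stated assumptions, not the patented implementation: `ColumnStore` is a toy stand-in for a real columnar storage engine, a plain `dict` stands in for the request-data cache, and the column names and the functions `ingest_exposure` and `generate_samples` are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional


class ColumnStore:
    """Toy column store: one list per column, rows aligned by index."""

    def __init__(self, columns: List[str]) -> None:
        self.columns: Dict[str, List[Any]] = {name: [] for name in columns}
        self._n_rows = 0

    def append_row(self, **values: Any) -> int:
        """Append one row; columns without a value are filled with None."""
        for name, column in self.columns.items():
            column.append(values.get(name))
        self._n_rows += 1
        return self._n_rows - 1

    def set_cell(self, row: int, column: str, value: Any) -> None:
        self.columns[column][row] = value

    def rows(self) -> Iterator[Dict[str, Any]]:
        for i in range(self._n_rows):
            yield {name: column[i] for name, column in self.columns.items()}


def ingest_exposure(
    store: ColumnStore,
    request_cache: Dict[str, Any],
    exposure: Dict[str, Any],
) -> int:
    """Store the exposure data, then attach cached request data to its row."""
    request_id = exposure["request_id"]
    row = store.append_row(request_id=request_id, exposure=exposure)
    request: Optional[Any] = request_cache.get(request_id)
    if request is not None:
        # Matching request data goes into the same row as the exposure data.
        store.set_cell(row, "request", request)
    return row


def generate_samples(store: ColumnStore) -> List[Dict[str, Any]]:
    """Build one training sample per row of the store."""
    return list(store.rows())
```

In this sketch a row whose request data was not found in the cache keeps `None` in the `request` column; how such rows are treated in practice is not stated in the abstract.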

Description

Technical field

[0001] The present application relates to the technical field of data processing, in particular to a training sample generation method, device, storage medium, and electronic equipment.

Background

[0002] In artificial intelligence fields such as Internet search, recommendation, and advertising, neural network models can infer and identify user intentions from users' online requests or behaviors and give personalized responses or feedback, thereby helping users complete actions such as clicking and ordering. Training a neural network model relies on a large number of training samples. Therefore, collecting and processing training samples efficiently and accurately is crucial to the efficiency and effect of neural network model training.

[0003] The training sample collection scheme in the related art has disadvantages such as large memory usage, data redundancy, and data delay.

Summary of the invention

[0004] In view of...

Application Information

IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/214, Y02D10/00
Inventor: 刘磊, 仝晔
Owner: BEIJING SANKUAI ONLINE TECH CO LTD