Data generation method and device, equipment and storage medium

A data generation technology, applied in the field of machine learning, that addresses the difficulty of obtaining training samples in large quantities and improves the accuracy of the generative model.

Pending Publication Date: 2021-06-01
SHANGSHANG TECH INC

AI Technical Summary

Problems solved by technology

However, real training samples are not easy to obtain in large quantities, so it is necessary to generate pseudo data through the mapping relationship obtained by training.

Examples

Embodiment 1

[0058] Figure 1 is a flow chart of a data generation method provided by Embodiment 1 of the present invention. This embodiment is applicable to generating pseudo data. The method can be executed by a data generation device, and the device can be implemented in software and/or hardware.

[0059] As shown in Figure 1, the method specifically includes the following steps:

[0060] Step 110: Determine the Voronoi weight of each first data in the first data set within the spatial convex region.

[0061] In this embodiment, the first data set and the second data set are two data sets for calculating the target mapping relationship, the first data set includes a plurality of first data, and the second data set includes a plurality of second data.

[0062] Specifically, the first data set containing the first data can be obtained, and in the minimum spatial convex region surrounding the first data set, a Voronoi diagram of the first data set in the spatial convex ...

Embodiment 2

[0071] Figure 3 is a flow chart of a data generation method provided by Embodiment 2 of the present invention. This embodiment further optimizes the data generation method on the basis of the foregoing embodiments.

[0072] As shown in Figure 3, the method specifically includes:

[0073] Step 210: Obtain the first data set including the first data, and determine the smallest convex area including all the first data as the spatial convex area.

[0074] In this embodiment, after the first data set is acquired, the minimum d-dimensional spatial convex area surrounding all the first data can be recorded as a hypercube C, where d is the dimension of the first data in the first data set.

[0075] Step 220: Based on the data distribution of the first data, construct a Voronoi diagram of the spatial convex region.

[0076] Wherein, the Voronoi cells in the Voronoi diagram are in one-to-one correspondence with the first data in the first data set, and eac...
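Steps 210 and 220 can be illustrated with a minimal sketch. The source text is truncated before it defines the Voronoi weight, so the sketch assumes the weight of a first data point is the fraction of the enclosing region's volume covered by its Voronoi cell. It also assumes the region is an axis-aligned bounding box and estimates the cell volumes by Monte Carlo sampling rather than exact geometry. The function names `bounding_box`, `nearest_index`, and `voronoi_weights` are hypothetical and do not come from the patent.

```python
import random

def bounding_box(points):
    """Axis-aligned box enclosing all points, as per-dimension (min, max)."""
    dims = len(points[0])
    return [(min(p[k] for p in points), max(p[k] for p in points))
            for k in range(dims)]

def nearest_index(x, points):
    """Index of the point closest to x (squared Euclidean distance)."""
    return min(range(len(points)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, points[i])))

def voronoi_weights(points, n_samples=20000, seed=0):
    """Estimate each point's share of the box volume (assumed weight).

    A uniform sample falls in the Voronoi cell of its nearest point, so
    the hit frequency approximates cell volume / box volume.
    """
    rng = random.Random(seed)
    box = bounding_box(points)
    counts = [0] * len(points)
    for _ in range(n_samples):
        x = [rng.uniform(lo, hi) for lo, hi in box]
        counts[nearest_index(x, points)] += 1
    return [c / n_samples for c in counts]

# Four corners of the unit square plus its center (illustrative data only).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
weights = voronoi_weights(points)
```

In this toy layout the center point's cell is a diamond covering half of the square, so its estimated weight should be close to 0.5 and each corner's close to 0.125. An exact computation would clip the Voronoi cells against the region instead of sampling.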

Embodiment 3

[0115] The data generation device provided by the embodiment of the present invention can execute the data generation method provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects for executing the method. Figure 3 is a structural block diagram of a data generation device provided in Embodiment 3 of the present invention. As shown in Figure 3, the device includes: a Voronoi weight determination module 310, a target mapping relationship determination module 320, and a data generation module 330.

[0116] The Voronoi weight determination module 310 is configured to determine the Voronoi weight of each first data in the first data set within the spatial convex region.

[0117] The target mapping relationship determination module 320 is configured to adjust the initial mapping relationship between the first data set and the second data set according to the Voronoi weights of each of the first data to obtain a targ...

Abstract

The invention discloses a data generation method and device, equipment, and a storage medium. The method comprises the following steps: determining a Voronoi weight of each piece of first data in a first data set in a spatial convex region; adjusting, according to the Voronoi weight of each first data, an initial mapping relation between the first data set and a second data set to obtain a target mapping relation, wherein, among the mapping relations that satisfy the data distribution of the first data set and the data distribution of the second data set, the target mapping relation has the minimum transmission cost; and generating, on the basis of the target mapping relation and the Voronoi weight of each piece of first data, pseudo data conforming to the data distribution of the second data set. The method addresses the problems that generated pseudo data is inaccurate and contains many singular points, and generates pseudo data accurately, thereby improving the accuracy of the generative model in machine learning.
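The idea of a mapping with minimum transmission cost can be illustrated with a deliberately simplified sketch. It is not the patent's procedure: it assumes one-dimensional data, a squared-distance cost, and discrete weights on both data sets. Under those assumptions, pairing the points in sorted order is a minimum-cost transport plan. The names `monotone_transport_plan` and `transport_cost` and all numeric values are hypothetical.

```python
def monotone_transport_plan(src, src_w, dst, dst_w):
    """Greedy sorted coupling of two weighted 1-D point sets.

    Returns a list of (src_index, dst_index, mass). For 1-D points and a
    convex cost such as squared distance, this coupling has minimum cost.
    Both weight lists are assumed to sum to the same total.
    """
    s_order = sorted(range(len(src)), key=lambda i: src[i])
    d_order = sorted(range(len(dst)), key=lambda j: dst[j])
    s_left = [src_w[i] for i in s_order]  # mass not yet shipped
    d_left = [dst_w[j] for j in d_order]  # capacity not yet filled
    eps = 1e-12
    plan, a, b = [], 0, 0
    while a < len(s_order) and b < len(d_order):
        mass = min(s_left[a], d_left[b])
        if mass > eps:
            plan.append((s_order[a], d_order[b], mass))
        s_left[a] -= mass
        d_left[b] -= mass
        advance_src = s_left[a] <= eps
        advance_dst = d_left[b] <= eps
        if advance_src:
            a += 1
        if advance_dst:
            b += 1
    return plan

def transport_cost(plan, src, dst):
    """Total cost of a plan under squared-distance cost."""
    return sum(m * (src[i] - dst[j]) ** 2 for i, j, m in plan)

# Hypothetical first data with assumed Voronoi weights, and second data
# with uniform weights.
src, src_w = [0.0, 1.0, 2.0], [0.5, 0.25, 0.25]
dst, dst_w = [10.0, 11.0], [0.5, 0.5]
plan = monotone_transport_plan(src, src_w, dst, dst_w)
cost = transport_cost(plan, src, dst)
```

Here the first data weights stand in for Voronoi weights, so a point that owns a larger share of the region ships more mass. In higher dimensions the sorted pairing no longer applies and a general optimal transport solver would be needed.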

Description

Technical field

[0001] Embodiments of the present invention relate to machine learning technology, and in particular to a data generation method, device, equipment, and storage medium.

Background art

[0002] With the rise of machine learning, neural networks have been widely used in academia and business as an effective tool for processing data.

[0003] Deep learning can be attributed to two laws. One is the law of manifold distribution: high-dimensional data of the same category in nature are often concentrated near a certain low-dimensional manifold. The other is the law of cluster distribution: different subclasses within a category of high-dimensional data correspond to different probability distributions on the manifold, and the distance between these distributions is large enough to distinguish the subclasses. Therefore, the basic task of deep learning is to learn the manifold structure from the data, establish the parameter expression...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00, G06K9/62
CPC: G06N20/00, G06F18/214
Inventor: 柯景耀, 潘征, 潘燕峰, 刘岚
Owner: SHANGSHANG TECH INC