Structured data sample increment method and device, electronic equipment and medium

A technique for structured data samples, applicable to fields such as unstructured text data retrieval, neural learning methods, and biological neural network models. It addresses problems such as the inability of generative adversarial networks to generate new samples from structured-data originals.

Pending Publication Date: 2020-08-04
SANGFOR TECH INC

AI Technical Summary

Problems solved by technology

[0006] This application provides a structured data sample increment method, device, electronic device, and computer-readable storage medium, aiming to solve the problem that existing generative adversarial networks cannot be used to generate new samples from original samples whose data type is structured data.

Examples

Embodiment 1

[0068] See Figure 1, a flow chart of a structured data sample increment method provided in an embodiment of this application. The method includes the following steps:

[0069] S101: Obtain an original sample whose data type is structured data;

[0070] This step obtains original samples whose data type is structured data. Webpages, PDF files, PE files (Portable Executable; the common EXE, DLL, OCX, SYS, and COM files are all PE files), ELF files (a file format for binaries, executables, object code, shared libraries, and core dumps, commonly found on Linux operating systems), and similar files all contain a large amount of structured data. When the original sample is a webpage, it can be a white webpage (a normal webpage that contains no malicious data) or a tampered webpage (also called a black webpage, one that contains malicious data), and the corres...
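As a rough illustration of telling these sample formats apart when collecting originals, the sketch below inspects the leading bytes of a file. The function name `guess_sample_format` and the returned labels are hypothetical and do not come from the application. The `MZ` check only confirms the DOS header that PE files begin with, and the HTML check is a simple heuristic.

```python
def guess_sample_format(data: bytes) -> str:
    """Guess the container format of a raw sample from its leading bytes.

    Hypothetical helper for illustration only; the labels are assumptions.
    """
    if data.startswith(b"MZ"):        # DOS header that PE files begin with
        return "pe"
    if data.startswith(b"\x7fELF"):   # ELF magic number
        return "elf"
    if data.startswith(b"%PDF-"):     # PDF header
        return "pdf"
    # Heuristic only: many webpages start with a doctype or an <html> tag
    head = data[:1024].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"


print(guess_sample_format(b"\x7fELF\x02\x01\x01"))
print(guess_sample_format(b"  <!DOCTYPE html><html></html>"))
```

A stricter PE check would also follow the header offset to the `PE\0\0` signature; that is left out here to keep the sketch short.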

Embodiment 2

[0084] See Figure 2, a flow chart of a structured data sample increment method, provided by an embodiment of this application, in which non-quantified structured data is converted into quantized feature vectors. Unlike Embodiment 1, this embodiment specifically provides an implementation for converting non-quantified structured data into quantized feature vectors, including the following steps:

[0085] S201: Obtain an original sample whose data type is structured data;

[0086] S202: Convert the non-quantified structured data in the original sample into a quantized feature vector;

[0087] In this step, vectors serve as the quantized parameter representation of the non-quantified structured data. Since structured data samples contain many items of structured data, the feature vectors used in this step can preferably be related to the number or quantity of structured data. A multidi...
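One way to picture a count-based feature vector is the sketch below, which counts how often each tag from a fixed vocabulary appears in an HTML sample. This is an assumption-laden illustration, not the encoding defined by the application: the vocabulary `TAG_VOCAB`, the class `TagCounter`, and the function `to_feature_vector` are all hypothetical names.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical fixed vocabulary; the source does not specify which
# structural elements make up the vector dimensions.
TAG_VOCAB = ["a", "div", "iframe", "img", "script"]


class TagCounter(HTMLParser):
    """Counts opening tags encountered while parsing an HTML string."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def to_feature_vector(html, vocab=TAG_VOCAB):
    """Return one count per vocabulary tag, in vocabulary order."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return [parser.counts[tag] for tag in vocab]


sample = '<div><a href="/x">x</a><script>1</script><script>2</script></div>'
print(to_feature_vector(sample))  # [1, 1, 0, 0, 2]
```

A fixed vocabulary keeps every sample's vector the same length, which a network with a fixed input size would require.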

Embodiment 3

[0100] See Figure 4, a flow chart of an increment method for tampered webpages provided by an embodiment of this application. Building on Embodiment 1 and Embodiment 2, this embodiment starts from the current need for more tampered webpages containing malicious data and provides a specific increment method for tampered webpages, including the following steps:

[0101] S401: Obtain the original tampered webpage containing the original malicious data;

[0102] The original tampered webpage is a tampered webpage produced when an intruder directly compromises a normal webpage and inserts various malicious data; it can be obtained directly through conventional technical means.

[0103] S402: Represent the non-quantified elements in the original tampered webpage as a DOM tree;

[0104] This step aims to represent the non-quantified elements in the original t...
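To make the DOM tree representation in S402 concrete, the following minimal sketch builds a tree of element nodes from an HTML string using Python's standard `html.parser`. The names `Node`, `DomBuilder`, `build_dom`, and `outline` are hypothetical, and the handling of malformed markup is deliberately simple; the application may construct its tree differently.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (a partial list, for illustration)
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Builds a tree of element nodes; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the newly opened element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag, if there is one
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def build_dom(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, level=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * level + node.tag]
    for child in node.children:
        lines.extend(outline(child, level + 1))
    return lines


page = '<html><body><div><img src="a.png"><script>x</script></div></body></html>'
print("\n".join(outline(build_dom(page))))
```

Once the page is a tree, structural properties such as node counts per tag or nesting depth can be read off it and turned into numeric features.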

Abstract

The invention discloses an increment method for structured data samples. Its objective is to overcome the technical defect that an existing generative adversarial network cannot generate a new sample from an original sample whose data type is structured data. In the method, before an original structured data sample is input into a generative adversarial network, a conversion operation is first performed that converts the non-quantized structured data into quantized parameters; that is, the non-quantized structural information is expressed using quantized parameters, so that the original sample expressed in this way meets the precondition for a generative adversarial network to generate a new sample from it. The technical scheme provided by the invention extends the application range of generative adversarial networks to the field of structured data, so that high-quality new samples can also be generated from structured data samples using a generative adversarial network. The invention further discloses a structured data sample increment device, electronic equipment, and a computer-readable storage medium, which have the same beneficial effects.

Description

Technical field

[0001] The present application relates to the field of new sample generation, and in particular to a structured data sample increment method, device, electronic device, and computer-readable storage medium.

Background

[0002] A generative adversarial network (GAN) is a deep learning model and one of the most promising methods in recent years for unsupervised learning on complex distributions. Unlike other deep learning models, a generative adversarial network contains a generative model and a discriminative model that play a game against each other, and this game can produce high-quality output. The game is also a process of adversarial learning. During the game, the generative model is responsible for generating new data based on the input data and strives to have that new data pass the discriminative model's discrimination. New data that me...
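The game between the generative model and the discriminative model is commonly written as a minimax objective. The form below is the standard textbook formulation of a GAN, not notation taken from this application. Here $G$ is the generative model, $D$ is the discriminative model, $p_{\text{data}}$ is the distribution of the original samples, and $p_z$ is the noise distribution fed to the generator; all four symbols are introduced only for this illustration.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminative model raises $V$ by scoring original samples near 1 and generated samples near 0, while the generative model lowers $V$ by producing samples that the discriminative model scores near 1. Both terms assume that $x$ and $G(z)$ are numeric, which is why structured data has to be expressed as quantized parameters before it can enter this game.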

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/30, G06N3/08
CPC: G06N3/08
Inventors: 王大伟, 杨荣海
Owner: SANGFOR TECH INC