Method and system for automatically extracting virus characteristics based on family samples

Sample-feature and automatic-extraction technology, applied in the field of network security, addresses the low efficiency of extracting family-sample feature codes, the small number of feature codes extracted, and their low accuracy, thereby improving the efficiency, quantity, and accuracy of extraction.

Active Publication Date: 2013-09-25
HARBIN ANTIY TECH

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a method and system for automatically extracting virus features based on family samples, which solves the problems of low...



Examples


Embodiment Construction

[0026] To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, and to make its purposes, features, and advantages clearer, the technical solutions are described in further detail below with reference to the accompanying drawings.

[0027] The invention provides a method and system for automatically extracting virus features based on family samples. It addresses three problems of extracting feature codes with the longest common subsequence algorithm: low efficiency, too few extracted feature codes, and low feature-code accuracy.

[0028] The method of the present invention improves the longest common subsequence algorithm and then uses the improved algorithm to extract the signatures of a given family's samples automatically, by program. The improvement of the sequence algorithm is give...


Abstract

The invention provides a method and a system for automatically extracting virus characteristics based on family samples. The method modifies the longest common subsequence algorithm. A sequence A and a sequence B are built from samples in the family, and, for a preset feature code length, the hash value of every subsequence of that length in sequence A and in sequence B is calculated. The hash values of the subsequences of sequence A and sequence B are matched by means of a red-black tree: if two hash values are the same, the corresponding subsequence is a common subsequence of sequence A and sequence B, and that common subsequence is a feature code of the family samples. The remaining samples are then each taken as sequence B and looked up in the red-black tree, so that the feature codes of all family samples are obtained and combined into a feature set of the family samples. The quality of the feature codes is judged according to an established feature-code quality evaluation weighting model, and the feature codes of the family samples are determined. The method reduces the time complexity of the algorithm and improves the extraction efficiency and the accuracy of the feature codes.
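The matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: a Python `dict` stands in for the red-black tree (Python has no built-in one), CRC32 is an assumed hash function, the first sample is assumed to serve as sequence A, and the per-pair results are assumed to be merged by set union. The names `window_hashes`, `common_subsequences`, and `family_feature_set` are hypothetical.

```python
import zlib
from collections import defaultdict


def window_hashes(data, k):
    """Index every length-k window of data: hash value -> list of offsets."""
    index = defaultdict(list)
    for i in range(len(data) - k + 1):
        index[zlib.crc32(data[i:i + k])].append(i)
    return index


def common_subsequences(seq_a, seq_b, k):
    """Return the length-k byte strings that occur in both sequences."""
    index_a = window_hashes(seq_a, k)  # dict used in place of a red-black tree
    found = set()
    for j in range(len(seq_b) - k + 1):
        window = seq_b[j:j + k]
        for i in index_a.get(zlib.crc32(window), ()):
            # Equal hashes only suggest a match; compare bytes to rule out
            # a hash collision (an extra check assumed here).
            if seq_a[i:i + k] == window:
                found.add(window)
                break
    return found


def family_feature_set(samples, k):
    """Take the first sample as A and each remaining sample as B in turn."""
    if len(samples) < 2:
        return set()
    seq_a = samples[0]
    features = set()
    for seq_b in samples[1:]:
        # Assumption: candidates from each pair are combined by union.
        features |= common_subsequences(seq_a, seq_b, k)
    return features
```

Under these assumptions, indexing sequence A costs about one hash per window, and each further sample costs one hash and one lookup per window, instead of filling an m-by-n table for every pair. The quality-weighting step is not sketched because the text available here does not specify the model.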

Description

Technical field

[0001] The invention relates to the field of network security, in particular to a method and system for automatically extracting virus features based on family samples.

Background technique

[0002] The time complexity of the currently known longest common subsequence algorithm is O(m*n), where m and n are the lengths of the two sequences. If it is applied to virus signature extraction based on family samples, then with a large number of samples the cost implied by this time complexity has a huge negative impact on the efficiency of signature extraction. At the same time, the existing longest common subsequence algorithm can only obtain the unique longest common subsequence. If it is used for extraction from family samples, the extracted signatures are too few to support manual analysis, the quality of the signatures is difficult to guarantee, and the selection of signatures depends on...
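For reference, the O(m*n) baseline mentioned in the background can be illustrated with the textbook dynamic-programming algorithm for the longest common subsequence. This is the standard formulation, shown only to make the cost concrete; the function name `lcs_length` is hypothetical and the code is not taken from the patent.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    The nested loops visit every (i, j) pair once, which is the
    O(m*n) running time referred to in the background.
    """
    m, n = len(a), len(b)
    prev = [0] * (n + 1)  # row i-1 of the DP table
    for i in range(1, m + 1):
        curr = [0] * (n + 1)  # row i of the DP table
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]
```

Running this for every pair of samples in a large family repeats the full m-by-n table each time, which is the cost the background describes.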


Application Information

IPC(8): G06F21/56
Inventor 童志明董雷田彻张栗伟
Owner HARBIN ANTIY TECH