Method for judging affiliation of Internet website through clustering algorithm

The method applies a clustering algorithm to Internet website data, in the fields of computing, computer components, and network data retrieval. It addresses cases where the attribution of a website cannot be determined or is determined incorrectly, and it improves the accuracy of attribution judgment.

Active Publication Date: 2020-07-24
国家计算机网络与信息安全管理中心黑龙江分中心

AI Technical Summary

Problems solved by technology

[0010] The purpose of the present invention is to provide a method for determining the attribution of an Internet website through a clustering algorithm, so as to solve the problem that the trad...



Examples


Specific Embodiment 1

[0042] Specific Embodiment 1: as shown in Figure 1, the method of the present invention for determining the attribution of an Internet website through a clustering algorithm comprises the following steps:

[0043] Step a, inputting the set of websites whose owning unit is to be determined, the basic data being the website URLs;

[0044] Step b, extracting the basic information of the website;

[0045] Step c, quantifying all the information extracted in step b;

[0046] Step d, mapping the various feature values to the [0, 1] interval on a common scale, using the normalize function of the sklearn module to obtain the normalized feature vector FN_website;

[0047] FN_website = [FN_ip, FN_domain, FN_title, FN_keywords, FN_copyright, FN_recordID, FN_recordENTITY];

[0048] Step e, using the unsupervised clustering algorithm DBSCAN to cluster the data set, so that the websites belonging to the same unit are clustered under...
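Steps d and e can be illustrated with a minimal sketch. It is not the patent's implementation: the source names sklearn's normalize function and DBSCAN, while the code below re-implements both in plain Python so that each step is visible. The seven feature values per website are invented for illustration, and eps and min_samples are assumed values, since the source text shown here does not give them. Note that sklearn's normalize rescales each row to unit length by default; for non-negative feature values this keeps every component within [0, 1].

```python
import math

def l2_normalize(rows):
    """Scale each row to unit Euclidean length (the default behaviour of sklearn's normalize)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm if norm else 0.0 for v in row])
    return out

def dbscan(points, eps, min_samples):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)          # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1                 # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)    # j is a core point, keep expanding
    return labels

# Hypothetical quantified features per website, in the order
# [ip, domain, title, keywords, copyright, recordID, recordENTITY].
features = [
    [5, 5, 1, 1, 1, 1, 1],
    [5, 5, 1, 1, 1, 1, 2],
    [1, 1, 5, 5, 1, 1, 1],
    [1, 1, 5, 5, 1, 2, 1],
    [1, 1, 1, 1, 1, 9, 9],
]

fn = l2_normalize(features)                      # step d: normalized vectors FN_website
labels = dbscan(fn, eps=0.3, min_samples=2)      # step e: assumed parameters
print(labels)
```

Websites that receive the same non-negative label would be treated as belonging to the same unit, and a label of -1 marks a website that does not fit any cluster. With scikit-learn installed, the corresponding calls would be `sklearn.preprocessing.normalize(features)` followed by `sklearn.cluster.DBSCAN(eps=0.3, min_samples=2).fit_predict(fn)`.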


Abstract

The invention discloses a method for judging the affiliation of an Internet website through a clustering algorithm. It belongs to the technical field of network space security and aims to solve the problem that the traditional methods, which judge by website record information or by webpage information, either cannot determine the affiliation of a website or determine it incorrectly. The method comprises the following steps: a, inputting the set of websites whose affiliated unit is to be judged, the basic data being the website URLs; b, extracting the basic information of each website; c, quantifying all the information extracted in step b; d, mapping the various feature values to the [0, 1] interval on a common scale and normalizing the feature vector FN_website; and e, clustering the data set with the unsupervised clustering algorithm DBSCAN. By using a cluster analysis algorithm, the method judges website affiliation automatically and effectively improves the accuracy of the judgment.

Description

Technical field

[0001] The invention relates to a method for judging the attribution of an Internet website, in particular to a method for judging the attribution of an Internet website through a clustering algorithm, and belongs to the technical field of network space security.

Background

[0002] From ARPANET in the United States in the 1960s to today's Internet, network technology has developed rapidly, and more and more organizations and individuals are connected to the Internet. Network assets, including network terminals, network equipment, and network services, are widely used in the daily work of governments, enterprises, and institutions. This greatly improves work efficiency and supports business operations, but it also brings many problems and hidden dangers. As an organization's network grows, its network assets and the types of vulnerabilities they contain continue to increase, whic...


Application Information

IPC(8): G06K9/62, G06F16/957
CPC: G06F16/9577, G06F18/2321, Y02D10/00
Inventor: 于佳华, 韩钢, 常远, 张光耀, 康海东, 孙巍
Owner: 国家计算机网络与信息安全管理中心黑龙江分中心