Cloud-based website homepage structure monitoring method

A structure-monitoring technology for website homepages, applied in the fields of website content management, network data retrieval, and special data processing applications. It addresses the problem that existing website monitoring systems cannot sense or detect when a homepage has been deformed or tampered with, and it improves monitoring accuracy, timeliness, and user experience.

Active Publication Date: 2021-05-07
西安博达软件股份有限公司

AI Technical Summary

Problems solved by technology

Existing systems cannot monitor whether the homepage has been deformed or tampered with.
When such problems occur, they can only be discovered manually, which lacks timeliness. Existing monitoring systems therefore do not fully meet the real needs of website monitoring.
[0003] When a page is deformed or tampered with, the existing website monitoring system cannot detect it; the problem is discovered only when someone visits the website manually.



Examples


Embodiment 1

[0074] The system of the present invention was used in a service mini-program product. The specific application method is as follows:

[0075] S1. Add domain name: add the website domain name to be monitored, www.xjtu.edu.cn, to the domain list in the system;

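[0076] S2. Collect: every 5 minutes, access the website homepage corresponding to the domain name, http://www.xjtu.edu.cn/index.htm, and download the homepage source code. Programmatically filter out the text in the homepage source code, the src attribute of IMG tags, the href attribute of A tags, and the src attribute of <SCRIPT> tags, keeping only the tags, to generate the homepage tag code;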
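
The filtering in S2 can be sketched roughly as follows, assuming Python's standard html.parser module. The names DROPPED, TagSkeleton and tag_skeleton are hypothetical, and the sketch assumes that attributes other than the three named above are retained, which the text does not state.

```python
from html.parser import HTMLParser

# Attributes named in S2 as filtered out, keyed by (lower-cased) tag name.
DROPPED = {"img": {"src"}, "a": {"href"}, "script": {"src"}}


class TagSkeleton(HTMLParser):
    """Collects tags only; text nodes are ignored (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Assumption: attributes not listed in DROPPED are kept unchanged.
        kept = [(k, v) for k, v in attrs if k not in DROPPED.get(tag, set())]
        rendered = "".join(
            f' {k}="{v}"' if v is not None else f" {k}" for k, v in kept
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    # handle_data is deliberately not overridden, so page text is dropped.


def tag_skeleton(html):
    """Return the list of tags in document order, without text content."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    return parser.parts
```

For example, tag_skeleton('<a href="/x">Home</a>') yields the two entries '<a>' and '</a>'; both the link text and the href value are gone, so routine content updates do not change the result.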

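[0077] S3. Save: check in the database whether a collection record exists for the domain name www.xjtu.edu.cn. If this is the first collection, save the homepage source code in the pageCode directory as 2020-08-20-11-30_index_pageCode.txt, and save the homepage tag code in the labelCode directory as 2020-08-20-11-30_index_labelCode.txt;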
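
The file naming used in S3 appears to follow a timestamp pattern of year-month-day-hour-minute. A minimal sketch that reproduces the two example names, assuming that pattern holds in general (the function snapshot_paths and its page parameter are hypothetical):

```python
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_paths(collected_at, page="index"):
    """Build the pageCode and labelCode file paths for one collection run."""
    stamp = collected_at.strftime("%Y-%m-%d-%H-%M")
    page_file = PurePosixPath("pageCode") / f"{stamp}_{page}_pageCode.txt"
    label_file = PurePosixPath("labelCode") / f"{stamp}_{page}_labelCode.txt"
    return str(page_file), str(label_file)
```

Because the timestamp sorts lexicographically, the most recent historical file in each directory can be found by sorting the file names.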

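[0078] S4. Calculate: if this is not the first collection, calculate the similarity between the downloaded homepage page code and the most recent historical file in the pageCode directory, 2020-08-20-11-25_index_pageCode.txt. The calculation method is as follows: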

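[0079] (1) Generate a two-dimensional matrix with the tag elements of the two collections as rows and columns respectively. Each matrix element records whether the corresponding tags from the two collections are equal: 1 if they are equal, 0 if they are not. The two-dimensional matrix is shown in Table 1 below:

[0080] Table 1: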

[0081]
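
The matrix construction in step (1) can be sketched as follows. The function name equality_matrix and the four-tag sample data are hypothetical and are not the 13-tag data of this embodiment; which collection forms the rows is also an assumption.

```python
def equality_matrix(old_tags, new_tags):
    """Rows follow the earlier collection, columns the current one.

    Entry [i][j] is 1 when old_tags[i] equals new_tags[j], otherwise 0.
    """
    return [[1 if o == n else 0 for n in new_tags] for o in old_tags]


# Illustrative data only: the inner <a> element was replaced by <p>.
old_tags = ["<div>", "<a>", "</a>", "</div>"]
new_tags = ["<div>", "<p>", "</p>", "</div>"]
matrix = equality_matrix(old_tags, new_tags)
```

Here the diagonal of the matrix reads 1, 0, 0, 1: the outer tags still line up position by position, while the two inner positions no longer match.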

[0082] (2) Calculate the number of tag changes as the absolute value of the difference between the two tag counts m and n:

[0083] k = |m - n| = |13 - 13| = 0;

[0084] (3) Calculate the sum of the upper and lower triangular elements of the matrix:

[0085]

[0086] (4) Calculate the sum of the diagonal elements of the matrix:

[0087]

[0088] (5) Calculate the number of elements on the matrix diagonal that are 0:

[0089]
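
The quantities named in steps (2) to (5) can be computed from a 0/1 equality matrix as in the sketch below. It assumes that "upper and lower triangular elements" means every off-diagonal element, and the function name and sample matrix are hypothetical. How these quantities are combined into the final similarity value is not shown here.

```python
def matrix_statistics(matrix):
    """Return k, the off-diagonal sum, the diagonal sum and the zero-diagonal count."""
    m = len(matrix)                      # tag count of one collection (rows)
    n = len(matrix[0]) if matrix else 0  # tag count of the other (columns)
    d = min(m, n)                        # length of the main diagonal
    return {
        "k": abs(m - n),
        "off_diagonal_sum": sum(
            matrix[i][j] for i in range(m) for j in range(n) if i != j
        ),
        "diagonal_sum": sum(matrix[i][i] for i in range(d)),
        "zero_diagonal_count": sum(1 for i in range(d) if matrix[i][i] == 0),
    }


# Illustrative 4 x 4 matrix, not the 13 x 13 matrix of this embodiment.
sample = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
stats = matrix_statistics(sample)
```

For the sample matrix, k is 0, the diagonal sum is 2, and two diagonal positions are 0, which corresponds to two tags that changed in place.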

[0090] (6) Calculate the homepage tag similarity:

[0091]

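[0092] S5. Calculate the similarity between the downloaded homepage tag code and the most recent historical file in the labelCode directory, 2020-08-20-11-25index_labelCode.txt. The calculation method is as follows: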

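[0093] (1) Following the structure of the homepage tag code from the current collection, replace the tags in the homepage source code of the current collection and of the most recent collection with empty strings, then replace spaces and line breaks with empty strings, keeping only the text content. Denote the homepage text content of the current collection as NC and that of the most recent collection as OC, as shown in Table 2 below: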

[00...
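
The text extraction described in [0093] can be approximated as in the sketch below. It uses Python's standard html.parser rather than replacing tags by string substitution as the text describes, so it is a stand-in technique; TextOnly and text_content are hypothetical names.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes and ignores all tags (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Note: the contents of <script> and <style> also arrive here.
        self.chunks.append(data)


def text_content(html):
    """Drop tags, then drop spaces and line breaks, keeping only the text."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", "", "".join(parser.chunks))
```

Applying text_content to the current and the previous homepage source would give the two strings referred to as NC and OC.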


Abstract

The invention discloses a cloud-based website homepage structure monitoring method. The method comprises the following steps: S1, adding a domain name: determining the website to be monitored and its domain name; S2, collecting: accessing the website homepage from S1 once per preset time interval; S3, saving: filtering out the text in the homepage source code, the src attribute of the IMG tag, the href attribute of the A tag, and the src attribute of the <SCRIPT> tag, keeping only the tags; S4, calculating: checking whether a record of the current website homepage exists in the sample collection records from S3; and S5, judging: calculating the similarity. The method completes the detection indicators available for website monitoring and improves the timeliness of monitoring. When the website homepage is deformed or tampered with, the website administrator is notified quickly so that problems are found and resolved promptly. It also improves the experience of website users and the authority of the website, and saves the cost of manual monitoring.
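
The overall collect, compare and notify cycle can be sketched as below. This is an illustration under stated assumptions, not the method's actual similarity calculation: difflib.SequenceMatcher stands in for the similarity step, the 0.9 threshold is invented for the example, and check_homepage, fetch, history and notify are hypothetical names.

```python
from difflib import SequenceMatcher


def check_homepage(domain, fetch, history, notify, threshold=0.9):
    """Run one monitoring cycle for a domain.

    fetch(domain) returns the current tag code as a string, history maps
    each domain to its last stored tag code, and notify(domain, score)
    alerts the administrator. Returns None on the first collection.
    """
    current = fetch(domain)
    previous = history.get(domain)
    history[domain] = current  # keep the newest sample for the next run
    if previous is None:
        return None
    # Stand-in similarity measure; threshold is an assumed value.
    score = SequenceMatcher(None, previous, current).ratio()
    if score < threshold:
        notify(domain, score)
    return score
```

Passing fetch and notify in as callables keeps the sketch free of network and messaging dependencies, so the cycle can be exercised with canned page samples.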

Description

Technical field

[0001] The invention relates to the technical field of website monitoring, in particular to a cloud-based method for monitoring the homepage structure of a website.

Background

[0002] Website monitoring systems generally use crawler technology to crawl website information in order to determine whether the homepage is accessible, whether the homepage content is updated in a timely manner, whether the links on the homepage are available, and whether the homepage content contains sensitive information. If a problem with any of these items is detected, a warning message is sent to the website administrator. Existing website monitoring systems can only monitor whether the homepage is reachable, whether its content is updated in time, whether its links are available, whether it contains sensitive information, and so on. They cannot monitor whether the homepage is deformed or tampered with. Wh...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F21/55, G06F16/958
CPC: G06F21/552, G06F16/958, G06F2221/033, G06F2221/2119, Y02D10/00
Inventor: 李传咏, 卢颖, 赵莉, 陈宁, 张亮
Owner: 西安博达软件股份有限公司