Method, system and equipment for inspecting website through IP (Internet Protocol) and judging website category, and medium

A website categorization technology, applied in the field of computer image processing, that addresses the problem of low classification accuracy and improves both efficiency and accuracy.

Status: Inactive. Publication Date: 2021-07-23
江苏匠算天诚信息科技有限公司
Cites: 5. Cited by: 1.

AI Technical Summary

Problems solved by technology

[0010] To solve the low classification accuracy of the above methods, and considering that images and text are the most direct manifestation of a website's content category, the present invention proposes a method, system, device and medium for patrolling websites through IP and judging website categories, which can improve the classification accuracy to more than 85%.


Examples

Embodiment

[0091] A total of 1,089,793 IPs within a certain city were scanned to detect whether a website had been set up on each. For every IP hosting a website, the first two pages of the site were visited to check whether an ICP record number was present, and keywords were extracted to classify the attributes of the website. The focus was on games, audio-visual content (video and music), and novels. The overall scan results are shown in the table below.

[0092] to [0093] [Table of overall scan results; not reproduced in this text]

[0094] As the above table shows, the present invention can help law enforcement departments patrol the legal websites in their jurisdiction and classify them accurately, with a classification accuracy of 99.79%.
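The ICP record check described in paragraph [0091] could be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the regular expression, the function names `find_icp_numbers` and `has_icp_record`, and the sample page text are all assumptions. The pattern assumes the common display form of a record number (one Chinese character for the province, "ICP", "备" or "证", a run of digits, "号", and an optional "-N" suffix); real pages may format it differently.

```python
import re
from typing import List

# Assumed pattern for an ICP record number as commonly shown in page footers:
# province character + "ICP" + "备" or "证" + 5-10 digits + "号" + optional "-N".
ICP_PATTERN = re.compile(r"[\u4e00-\u9fff]ICP[备证]\d{5,10}号(?:-\d+)?")


def find_icp_numbers(page_text: str) -> List[str]:
    """Return every substring of the page that looks like an ICP record number."""
    return ICP_PATTERN.findall(page_text)


def has_icp_record(pages: List[str]) -> bool:
    """Return True if any of the visited pages contains an ICP-like record number."""
    return any(find_icp_numbers(page) for page in pages)


if __name__ == "__main__":
    # Hypothetical text of the first two pages of a site.
    first_two_pages = [
        "<footer>Copyright 2021 苏ICP备12345678号-1</footer>",
        "<p>About us</p>",
    ]
    print(find_icp_numbers(first_two_pages[0]))
    print(has_icp_record(first_two_pages))
```

A pattern match only shows that something resembling a record number appears on the page; it does not verify that the number is registered.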

[0095] In the embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. Each functional unit may be integrated into one processing unit, or each unit may physically exist separately, or two or more units may be integra...


Abstract

The invention relates to a method, a system, equipment and a medium for inspecting a website through IP (Internet Protocol) and judging the category of the website. The method comprises the following steps: capturing the webpage content of a target website; extracting the effective text and pictures in the webpage; labeling the extracted text and pictures by category; constructing and training a network model for the text data and for the picture data; feeding the pictures and text crawled from the website's webpages into their respective models to obtain classification predictions for the pictures and text, and setting weights for the image classification result and the text classification result; counting the predictions for all pictures and text under the website to generate a distribution of picture classifications and a distribution of text classifications; and obtaining the final classification result by calculating scores. The method simulates real-world web browsing and uses artificial intelligence to directly analyze website information such as the specific content, videos, pictures and text of a website, forming a comprehensive judgment of the website's content.
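The final scoring step can be illustrated with a small sketch. The abstract does not give the weights or the exact scoring formula, so the following assumes a simple weighted sum of the two per-category distributions; the function names, the default weights and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def distribution(labels: List[str]) -> Dict[str, float]:
    """Convert per-item predicted labels into a label -> fraction mapping."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def classify_site(
    image_preds: List[str],
    text_preds: List[str],
    image_weight: float = 0.5,  # assumed value; the source does not state the weights
    text_weight: float = 0.5,   # assumed value
) -> Tuple[str, Dict[str, float]]:
    """Fuse picture and text prediction distributions into one score per category."""
    image_dist = distribution(image_preds)
    text_dist = distribution(text_preds)
    scores: Dict[str, float] = {}
    # Sort labels so that ties are broken deterministically (alphabetically).
    for label in sorted(set(image_dist) | set(text_dist)):
        scores[label] = (
            image_weight * image_dist.get(label, 0.0)
            + text_weight * text_dist.get(label, 0.0)
        )
    if not scores:
        raise ValueError("no predictions available for this website")
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical model outputs for one website.
    pictures = ["game", "game", "video", "game"]
    texts = ["game", "novel"]
    category, scores = classify_site(pictures, texts, image_weight=0.6, text_weight=0.4)
    print(category, scores)
```

Normalizing each modality to fractions before weighting keeps a site with many pictures and little text from being dominated by the picture count alone; whether the patent normalizes this way is not stated.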

Description

Technical field

[0001] The invention relates to the field of computer image processing, in particular to a method, system, device and medium for inspecting websites through IP and judging the category of the websites.

Background

[0002] At present, the market mainly addresses website classification in the following ways:

[0003] 1) Classify based on webpage text:

[0004] A. Establish a website classification dictionary and analyze the effective words of the webpage under review to determine the type of website;

[0005] B. Measure the similarity between texts purely through algorithms such as deep learning CNNs;

[0006] C. Classify the text through machine learning methods such as logistic regression and Bayesian classifiers.

[0007] 2) Classify based on the structural characteristics of the website.

[0008] 3) Classify based on website log data.

[0009] However, these methods only extract some features of the website, such as the text infor...
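For orientation, the dictionary-based approach listed in the background (item A under text-based classification) might look like the sketch below. The category names follow the embodiment's focus areas, but the keyword sets, the English tokenization and the function name are illustrative assumptions; a production system for Chinese webpages would need word segmentation and a much larger dictionary.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical classification dictionary: category -> indicative keywords.
CATEGORY_DICTIONARY: Dict[str, Set[str]] = {
    "game": {"game", "player", "level", "guild"},
    "video": {"video", "movie", "music", "watch"},
    "novel": {"novel", "chapter", "author", "read"},
}


def classify_by_dictionary(
    text: str, dictionary: Dict[str, Set[str]] = CATEGORY_DICTIONARY
) -> Optional[str]:
    """Pick the category whose keywords occur most often; None if nothing matches."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = {
        category: sum(1 for word in words if word in keywords)
        for category, keywords in dictionary.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


if __name__ == "__main__":
    print(classify_by_dictionary("Watch the latest movie and music video online"))
    print(classify_by_dictionary("Contact our sales office"))
```

Such a classifier sees only the page text, which is the kind of single-feature limitation that paragraph [0009] begins to describe.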


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F16/951, G06F16/958, G06F16/35, G06F16/55, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/951, G06F16/958, G06F16/35, G06F16/55, G06N3/08, G06N3/045, G06F18/241
Inventors: 张乐平, 顾明娟, 吴一超, 卞豪
Owner: 江苏匠算天诚信息科技有限公司