Network information acquisition method and device and electronic equipment

A network information acquisition method, applied in the fields of network information acquisition and computer-readable storage media. It addresses the time and effort consumed by crawler program development and its low efficiency, saving manpower and time costs and improving program development efficiency.

Pending Publication Date: 2020-12-25
亿存(北京)信息科技有限公司

AI Technical Summary

Problems solved by technology

Most of the related technologies use crawler technology to collect information on the network. Crawler technology is a program or script that automatically grabs information on the World Wide Web according to certain rules, which improves the efficiency of obtaining net




Example Embodiment

[0089] In the embodiments of the present disclosure, a recognition failure of the target machine learning model may be determined in either of the following two ways:

[0090] Method 1: The target machine learning model cannot extract image features from the webpage picture.

[0091] Understandably, because webpages vary widely in type, structure, and content, the target machine learning model may be unable to extract image features from the webpage picture, and therefore cannot obtain the page elements or the content corresponding to them. In this case, recognition by the target machine learning model can be determined to have failed.

[0092] Method 2: None of the image features extracted during recognition match the target feature library preset in the model.

[0093] Understandably, after the target machine learning model extracts image features from the webpage picture, on the ...
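
The two failure conditions above can be sketched as a single check. This is a minimal illustration only, not the patent's implementation: the source does not say how features are represented or how a match against the target feature library is decided, so the vector representation, the cosine-similarity criterion, the `threshold` value, and the names `recognition_failed` and `cosine_similarity` are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognition_failed(
    features: Optional[Sequence[Vector]],
    target_library: Sequence[Vector],
    threshold: float = 0.9,  # assumed matching threshold, not from the source
) -> bool:
    """Return True if recognition should be treated as failed."""
    # Method 1: no image features could be extracted at all.
    if not features:
        return True
    # Method 2: features were extracted, but none of them matches any
    # entry in the preset target feature library.
    for feature in features:
        for target in target_library:
            if cosine_similarity(feature, target) >= threshold:
                return False
    return True
```

A caller could use the result to decide whether to fall back to another acquisition path; what that fallback would be is not stated in the visible part of the source.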


Abstract

The invention discloses a network information acquisition method and device and electronic equipment. The method comprises the steps of: acquiring the uniform resource locator (URL) of a target webpage, wherein the target webpage is the webpage from which network information currently needs to be collected; downloading and caching the target webpage according to the URL; generating a webpage picture corresponding to the target webpage; and performing image recognition on the webpage picture to obtain the page elements carried by the target webpage and the contents corresponding to the page elements. According to the acquisition method provided by the embodiment of the invention, the target webpage can be converted into an image, and image recognition is performed on that image to acquire the page elements carried by the target webpage and their corresponding contents, so that the webpage information is acquired. Compared with the prior art, in which developers write different crawler code for different webpages, the method is suitable for obtaining information from all webpages on the World Wide Web, saves a large amount of manpower and time, and offers high program development efficiency.
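
The four steps in the abstract (acquire the URL, download and cache the page, generate a webpage picture, recognize page elements) can be outlined as below. This is a hedged sketch under stated assumptions rather than the disclosed implementation: the source does not specify how the page is rendered to a picture or which model performs recognition, so `render` and `recognize` are hypothetical callables supplied by the caller, and `acquire_network_information`, `download_page`, and the in-memory `cache` dictionary are illustrative names. Only the download step uses a real library call (`urllib.request.urlopen`).

```python
import urllib.request
from typing import Callable, Dict, Optional


def download_page(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw bytes of the page at `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def acquire_network_information(
    url: str,
    render: Callable[[bytes], bytes],             # hypothetical: page bytes -> picture bytes
    recognize: Callable[[bytes], Dict[str, str]],  # hypothetical: picture -> {element: content}
    download: Callable[[str], bytes] = download_page,
    cache: Optional[Dict[str, bytes]] = None,
) -> Dict[str, str]:
    """Run the URL -> cached page -> picture -> page elements pipeline."""
    if cache is None:
        cache = {}
    # Steps 1 and 2: take the target URL, download the page once, cache it.
    if url not in cache:
        cache[url] = download(url)
    # Step 3: generate a webpage picture from the cached page.
    picture = render(cache[url])
    # Step 4: image recognition yields page elements and their contents.
    return recognize(picture)
```

Passing the renderer and recognizer in as parameters keeps the sketch independent of any particular headless browser or model, which the visible text does not name.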

Description

Technical field

[0001] The present invention relates to the field of computer application technology, in particular to a network information acquisition method, device, electronic equipment and computer-readable storage medium.

Background

[0002] At present, with the vigorous development of Internet technology, there is a large amount of information on the Internet. Most related technologies use crawler technology to collect information on the network. A crawler is a program or script that automatically grabs information on the World Wide Web according to certain rules, which improves the efficiency of obtaining network information. However, because webpages differ in type, structure, and content, developers must write different crawler code for different webpages, which consumes a lot of time and energy and makes crawler program development inefficient.

Summary of the invention

[0003] The present invention aims to ...


Application Information

IPC(8): G06F16/951, G06F16/9535, G06F16/955, G06N3/04, G06N3/08, G06K9/46, G06K9/62, G06N20/10
CPC: G06F16/951, G06F16/9535, G06F16/9566, G06N3/08, G06N20/10, G06V10/40, G06N3/045, G06F18/2411
Inventor: 杨硕, 官延斌, 王庚
Owner 亿存(北京)信息科技有限公司