Webpage content acquisition method and device, storage medium and equipment

A webpage content acquisition method, applied in the Internet field, that addresses the problem of incomplete content acquisition and achieves more comprehensive acquisition of webpage content.

Pending Publication Date: 2020-05-19
GUANGZHOU BAIGUOYUAN NETWORK TECH
Cites: 3; Cited by: 5

AI Technical Summary

Problems solved by technology

At present, related technologies can only obtain content from static web pages. In view of the popularity of dynamic pages and their advantages over static pages, many we...



Embodiment Construction

[0023] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention. In addition, it should be noted that, for the convenience of description, only some structures related to the present invention are shown in the drawings but not all structures. In addition, the embodiments and the features in the embodiments of the present invention can be combined with each other under the condition of no conflict.

[0024] Figure 1 is a schematic flowchart of a method for acquiring webpage content provided by an embodiment of the present invention. The method can be executed by an apparatus for acquiring webpage content, which can be implemented in software and/or hardware and can generally be integrated into a computer device. As shown in Figure 1, ...

Abstract

An embodiment of the invention discloses a webpage content acquisition method and device, a storage medium, and equipment. The method comprises: obtaining a first network resource address; using a preset headless browser to obtain the corresponding webpage source file according to the first network resource address; sending, through the preset headless browser, a target network request for a target item in the webpage source file, where the target item is generated by dynamic loading; and obtaining the target content corresponding to the target item according to the target network request. Because the headless browser can access dynamically generated items in the page, the webpage content can be acquired more comprehensively.
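
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `HeadlessBrowser` interface, the `FakeBrowser` stand-in, the `data-dynamic-src` marker used to recognise dynamically loaded items, and the `example.test` URLs are all assumptions made for illustration. A real deployment would presumably back the interface with an actual headless browser.

```python
import re
from typing import Dict, List, Protocol


class HeadlessBrowser(Protocol):
    """Hypothetical interface for a preset headless browser.

    This is an assumption for the sketch, not the API of any real library.
    """

    def fetch_source(self, url: str) -> str:
        """Return the webpage source file for the given address."""

    def send_request(self, url: str) -> str:
        """Send a network request through the browser and return the body."""


def find_dynamic_items(source: str) -> List[str]:
    # Assumption: dynamically loaded items are marked in the source with a
    # hypothetical data-dynamic-src attribute holding their request address.
    return re.findall(r'data-dynamic-src="([^"]+)"', source)


def acquire_content(browser: HeadlessBrowser, first_url: str) -> Dict[str, str]:
    # Step 1: obtain the webpage source file for the first resource address.
    source = browser.fetch_source(first_url)
    # Step 2: send a target request for each dynamically loaded item.
    # Step 3: collect the target content returned for each request.
    content: Dict[str, str] = {}
    for item_url in find_dynamic_items(source):
        content[item_url] = browser.send_request(item_url)
    return content


class FakeBrowser:
    """In-memory stand-in so the sketch runs without a real browser."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages

    def fetch_source(self, url: str) -> str:
        return self.pages[url]

    def send_request(self, url: str) -> str:
        return self.pages[url]


# Hypothetical data: one page whose source refers to one dynamic item.
PAGES = {
    "https://example.test/": '<div data-dynamic-src="https://example.test/feed"></div>',
    "https://example.test/feed": "dynamically loaded text",
}

result = acquire_content(FakeBrowser(PAGES), "https://example.test/")
print(result)
```

Keeping the browser behind a small interface means the acquisition flow can be exercised with canned pages, while the choice of actual headless browser stays open.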

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of the Internet, and in particular to a method, device, storage medium, and equipment for acquiring webpage content.

Background

[0002] As an important carrier for disseminating information, the network is developing and growing at an astonishing speed. Besides rapid growth and a huge volume of information, the Internet is also dynamic, open, interactive, and anonymous, and as a result many web pages on the Internet contain sensitive or illegal content. Researching and developing automatic identification and filtering technology suited to the network, so as to effectively detect and filter the growing flood of sensitive information, has therefore become an important research topic in network information security.

[0003] The premise of the web page recognition method is to obtain conten...


Application Information

IPC(8): G06F16/9532, G06F16/9536
CPC: G06F16/9532, G06F16/9536
Inventor: 尹海锋
Owner: GUANGZHOU BAIGUOYUAN NETWORK TECH