Knowledge graph data extraction method and device based on web crawler

A knowledge graph and web crawler technology, applied to web crawler-based knowledge graph data extraction, readable storage media, and computing equipment. It addresses problems such as heavy workload, inconsistent web page formats, and hard-to-maintain code, and improves efficiency.

Pending Publication Date: 2021-05-14

厦门渊亭信息科技有限公司

5 Cites, 2 Cited by


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0002] With the rapid development of the network, the World Wide Web has become the carrier of a large amount of information. When building a knowledge graph, the data provided by an enterprise may not meet existing business needs, for two reasons: the data is not comprehensive enough, and the data is time-sensitive.

Crawling data from openly available websites is a good way to enrich the database. However, web page formats are not uniform, and even a single web page may contain different types of entities and relationships. Writing dedicated crawler code for each page to extract every kind of data has the following disadvantages. First, the workload is heavy: parsing logic must be written for each entity and relationship on each page. Second, the code is hard to maintain: when the structure of a web page changes, the code must be adjusted accordingly.



Embodiment Construction

[0038] Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present invention and to fully convey the scope of the present invention to those skilled in the art.

[0039] Figure 1 is a block diagram of an example computing device 100 arranged to implement a web crawler-based knowledge graph data extraction method according to the present invention. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.

[0040] Depending on the desired co...


Abstract

The embodiment of the invention provides a knowledge graph data extraction method and device based on a web crawler, a readable storage medium and computing equipment, which are used for realizing crawler code reuse, deeply and automatically crawling webpage data in batches and avoiding the situation that a large number of webpage analysis codes need to be modified due to page changes. The method comprises the following steps: acquiring a target webpage for crawling data; configuring a crawling rule and an analysis rule of the target webpage; crawling the target webpage and a webpage linked with the target webpage according to the crawling rule; obtaining entity information and relation information contained in the target webpage and a webpage linked with the target webpage according to the analysis rule; and generating a knowledge graph according to the entity information and the relationship information.
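The five steps in the abstract can be sketched as a small rule-driven pipeline. This is only an illustration under stated assumptions, not the patented implementation: the excerpt does not specify how crawling rules or analysis rules are expressed, so the regex-based `CrawlRule` and `ParseRule` classes, the in-memory `PAGES` dictionary standing in for real HTTP requests, and all page content are hypothetical.

```python
import re
from collections import deque
from dataclasses import dataclass

# Hypothetical in-memory "site" standing in for real HTTP fetches.
PAGES = {
    "/company/acme": '<h1 class="company">Acme</h1>'
                     '<a class="founder" href="/person/alice">Alice</a>',
    "/person/alice": '<h1 class="person">Alice</h1>',
}

@dataclass
class CrawlRule:              # assumed rule format, not from the patent
    link_pattern: str         # regex with one group capturing a link to follow
    max_depth: int            # how many link hops from the target page

@dataclass
class ParseRule:              # assumed rule format, not from the patent
    entity_patterns: dict     # entity type -> regex with one group (name)
    relation_patterns: dict   # relation type -> regex with one group (target)

def crawl(start, rule, fetch=PAGES.get):
    """Breadth-first crawl of the target page and the pages it links to."""
    seen, queue, pages = {start}, deque([(start, 0)]), {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:      # unknown page: skip it
            continue
        pages[url] = html
        if depth < rule.max_depth:
            for link in re.findall(rule.link_pattern, html):
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return pages

def parse(pages, rule):
    """Apply the configured patterns to every crawled page."""
    entities, relations = set(), set()
    for html in pages.values():
        found = []            # entity names found on this page
        for etype, pattern in rule.entity_patterns.items():
            for name in re.findall(pattern, html):
                entities.add((etype, name))
                found.append(name)
        for rtype, pattern in rule.relation_patterns.items():
            for target in re.findall(pattern, html):
                for source in found:
                    relations.add((source, rtype, target))
    return entities, relations

def build_graph(entities, relations):
    """Represent the knowledge graph as sorted node and edge lists."""
    return {"nodes": sorted(entities), "edges": sorted(relations)}

crawl_rule = CrawlRule(link_pattern=r'href="([^"]+)"', max_depth=1)
parse_rule = ParseRule(
    entity_patterns={
        "company": r'<h1 class="company">([^<]+)</h1>',
        "person": r'<h1 class="person">([^<]+)</h1>',
    },
    relation_patterns={"founded_by": r'<a class="founder"[^>]*>([^<]+)</a>'},
)
graph = build_graph(*parse(crawl("/company/acme", crawl_rule), parse_rule))
print(graph)
```

Because the page-specific knowledge lives in `crawl_rule` and `parse_rule` rather than in the functions, a change in page structure would, in this sketch, require editing only the configured patterns, which is the kind of code reuse the abstract describes.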

Description

Technical field

[0001] The present invention relates to the technical field of artificial intelligence and automatic machine learning, in particular to a method, device, readable storage medium and computing device for extracting knowledge graph data based on web crawlers.

Background

[0002] With the rapid development of the network, the World Wide Web has become the carrier of a large amount of information. When building a knowledge graph, the data provided by an enterprise may not meet existing business needs, for two reasons: the data is not comprehensive enough, and the data is time-sensitive. Crawling data from openly available websites is a good way to enrich the database. However, web page formats are not uniform, and even a single web page may contain different types of entities and relationships. Writing dedicated crawler code for each page to extract every kind of data has the following disadvantages: ...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F16/951, G06F16/36, G06F40/205

CPC: G06F16/951, G06F16/367, G06F40/205

Inventors: 洪万福, 钱智毅, 吴文杰

Owner: 厦门渊亭信息科技有限公司
