
Full data translation method, device, server and storage medium

A full data translation technology, applied in the field of data processing (full data translation methods, devices, servers and storage media), that addresses problems such as high maintenance costs and improves the comprehensiveness and stability of the knowledge graph database.

Active Publication Date: 2021-05-07
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The embodiments of the present invention provide a full data translation method, device, server, and storage medium. They address the excessive maintenance cost that arises when an existing knowledge graph data processing system relies on the product side to find outdated data in the full data and on manual deletion of that expired data, and they improve the comprehensiveness and stability of the knowledge graph database.



Examples


Embodiment 1

[0032] Figure 1 is a flow chart of a full data translation method provided by Embodiment 1 of the present invention. This embodiment is applicable to an incremental data processing system, for example a knowledge graph data processing system, that automatically translates full data into incremental data during data processing. The method can be executed by the full data translation device or server provided by the embodiments of the present invention, and the device can be implemented in hardware and/or software. As shown in Figure 1, the full data translation method includes:

[0033] S101. Scan the data of each site stored in a preset storage unit according to a preset time interval.

[0034] Here, the preset storage unit stores the full data of each site, including: the site identification, the sub-chain information of each version of the site's data, the resource address of each sub-chain of the site, the resource content of each resource address of the site and the ve...
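
As an illustration only, the scan in S101 and the version comparison summarized in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the class names, the field names, the integer versions, the "upsert"/"delete" operation labels, and the rule that a resource whose version is lower than the site's maximum version counts as expired are all hypothetical and do not come from the source text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Resource:
    address: str   # resource address (hypothetical field name)
    content: str   # resource content stored for that address
    version: int   # version in which this content was last submitted (assumed integer)


@dataclass
class Site:
    site_id: str       # site identification
    max_version: int   # highest version submitted for the site (assumed to be stored)
    resources: List[Resource] = field(default_factory=list)


def translate_full_to_incremental(site: Site) -> List[Tuple[str, str]]:
    """Compare each resource version with the site's maximum version.

    Assumption: a resource that did not appear in the latest full submission
    keeps an older version number and is therefore treated as expired.
    """
    operations = []
    for resource in site.resources:
        if resource.version < site.max_version:
            operations.append(("delete", resource.address))
        else:
            operations.append(("upsert", resource.address))
    return operations


def scan(store: Dict[str, Site]) -> Dict[str, List[Tuple[str, str]]]:
    """One scan pass over every site held in the (in-memory, hypothetical) store."""
    return {site_id: translate_full_to_incremental(site)
            for site_id, site in store.items()}
```

In a deployment, a scheduler would presumably call `scan` once per preset time interval and forward the resulting operations to the incremental data processing system; a real system would probably also track which operations were already delivered, which this sketch does not attempt.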

Embodiment 2

[0045] On the basis of the above embodiments, this embodiment provides a full data translation method. Figure 2 is a flow chart of the data delivery and storage process in a full data translation method provided by Embodiment 2 of the present invention. As shown in Figure 2, the method includes:

[0046] S201. Receive byte stream data.

[0047] Here, the byte stream data is the data delivered by the data platform to the full data translation device; it may include data that needs to be processed by the knowledge graph data processing system. The byte stream data can be captured by the data platform.

[0048] Specifically, the data platform can be an open knowledge graph platform (such as Baidu's KGopen platform), which realizes data sharing by receiving high-quality byte stream data submitted by webmasters (generally, people or groups who run their own websites), so as to maximize the value of the data and provide users with bette...

Embodiment 3

[0061] On the basis of the above-mentioned embodiments, this embodiment provides a preferred storage format in the preset storage unit, and a full data translation process based on the storage format.

[0062] In this embodiment, the preset storage unit is HBase, and the preset storage format is a table structure designed based on the Sitemap format. Each site has a unique identifier (siteid) and an index file. The index file does not contain actual content; it only lists all resource links under the site, that is, the sub-chains. Each sub-chain contains multiple loc entries. A loc is the actual web page address, that is, the actual location of the resource (the resource address), and it is also the smallest unit for adding and deleting resources.
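
To make the index / sub-chain / loc hierarchy concrete, the following minimal sketch extracts sub-chain links from an index file and loc entries from one sub-chain. It assumes the submissions follow the public sitemaps.org XML schema (`sitemapindex` and `urlset` elements); the source only says the format is "based on the Sitemap format", so the exact layout, the function names and the example URLs are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace of the public sitemaps.org schema (assumed to apply here).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sub_chains_from_index(index_xml: str) -> List[str]:
    """Return the sub-chain links listed in a site's index file."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS) if loc.text]


def locs_from_sub_chain(sub_chain_xml: str) -> List[str]:
    """Return the loc entries (resource addresses) of one sub-chain."""
    root = ET.fromstring(sub_chain_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


# Hypothetical example documents.
INDEX_XML = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.com/sitemap1.xml</loc></sitemap>
  <sitemap><loc>http://example.com/sitemap2.xml</loc></sitemap>
</sitemapindex>"""

SUB_CHAIN_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/page1</loc></url>
  <url><loc>http://example.com/page2</loc></url>
</urlset>"""

sub_chains = sub_chains_from_index(INDEX_XML)
locs = locs_from_sub_chain(SUB_CHAIN_XML)
```

Because a loc is the smallest unit of addition and deletion, a storage layout keyed by site identifier, sub-chain and loc would let a later scan address each resource individually.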

[0063] Based on the above Sitemap format, three data tables are designed to store the full amount of data, as shown in Table 1-3, including: a resource table for storing index and sub-chain information, a link t...


Abstract

The embodiments of the present invention disclose a full data translation method, device, server and storage medium. The method includes: scanning the data of each site stored in a preset storage unit at a preset time interval; comparing the maximum version of the site with the version of each resource content; and, according to the comparison result, translating the full data of the site into incremental data. The embodiments can automatically translate full data into incremental data. This addresses the excessive maintenance cost that arises when an existing knowledge graph data processing system relies on the product side to find outdated data in the full data and on manual deletion of the expired data, and it improves the comprehensiveness and stability of the knowledge graph database.

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data processing, and in particular to a full data translation method, device, server and storage medium.

Background technique

[0002] With the development of Internet technology, building a knowledge graph database from the massive data on the Internet can provide users with an "instant search" experience.

[0003] At present, knowledge graph data processing systems are mainly based on incremental data processing, while some webmasters, owing to limited capabilities, can only submit full data. A system built for incremental data cannot automatically identify the data that changed between different versions of full data; expired data can only be deleted by manual intervention after the product side discovers it.

[0004] However, the manual intervention method can only be deleted after the expired data is ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/36, G06F16/21
CPC: G06F16/367
Inventor: 熊灏, 黎江, 王军委
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD