Full data translation method, device, server and storage medium
A full data translation technology, applied to full data translation methods, devices, servers and storage media, that addresses problems such as high maintenance costs and improves comprehensiveness and stability.
Examples
Embodiment 1
[0032] Figure 1 is a flow chart of a full data translation method provided by Embodiment 1 of the present invention. This embodiment is applicable to an incremental data processing system that automatically translates full data into incremental data during data processing, for example a knowledge graph data processing system. The method can be executed by the full data translation device or server provided by the embodiments of the present invention, and the device can be implemented in hardware and/or software. As shown in Figure 1, the full data translation method includes:
[0033] S101. Scan the data of each site stored in a preset storage unit according to a preset time interval.
[0034] Here, the preset storage unit stores the full data of each site, including: the site identifier, the sub-chain information of each version of the site's data, the resource address of each sub-chain of the site, the resource content of each resource address of the site and the ve...
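The source text is cut off before it explains how scanned versions are compared, so the following is only an illustrative sketch of how two full versions of a site's data could be turned into increments. The `SiteSnapshot` class, its field names and the `diff_snapshots` function are hypothetical and do not come from the patent; the comparison by resource address is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SiteSnapshot:
    """Hypothetical in-memory view of one version of a site's full data."""
    site_id: str
    # sub-chain identifier -> {resource address -> resource content}
    sub_chains: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def resources(self) -> Dict[str, str]:
        """Flatten all sub-chains into one {resource address: content} map."""
        flat: Dict[str, str] = {}
        for locs in self.sub_chains.values():
            flat.update(locs)
        return flat


def diff_snapshots(
    old: SiteSnapshot, new: SiteSnapshot
) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, deleted, updated) resource addresses between two versions.

    Assumption: an increment is derived by comparing resource addresses and
    their contents; the patent text shown here does not confirm this.
    """
    old_res, new_res = old.resources(), new.resources()
    added = sorted(new_res.keys() - old_res.keys())
    deleted = sorted(old_res.keys() - new_res.keys())
    updated = sorted(
        addr for addr in new_res.keys() & old_res.keys()
        if new_res[addr] != old_res[addr]
    )
    return added, deleted, updated
```

In a real system a scheduler would call such a comparison once per preset time interval for every site found in the storage unit; that scheduling loop is left out here.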
Embodiment 2
[0045] On the basis of the above embodiments, this embodiment provides a full data translation method. Figure 2 is a flow chart of the data delivery and storage process in a full data translation method provided by Embodiment 2 of the present invention. As shown in Figure 2, the method includes:
[0046] S201. Receive byte stream data.
[0047] Here, byte stream data refers to the data delivered by the data platform to the full data translation device; it may include data that needs to be processed by the knowledge graph data processing system. The byte stream data can be captured by the data platform.
[0048] Specifically, the data platform can be an open knowledge graph platform (such as Baidu's KGopen platform), which shares data by receiving high-quality byte stream data submitted by webmasters (generally, people or groups who run their own websites), so that the value of the data is maximized and users are provided with bette...
Embodiment 3
[0061] On the basis of the above-mentioned embodiments, this embodiment provides a preferred storage format in the preset storage unit, and a full data translation process based on the storage format.
[0062] In this embodiment, the preset storage unit is HBase, and the preset storage format is a table structure designed on the basis of the Sitemap format. Each site has a unique identifier (siteid) and an index file. The index file contains no actual content; it only lists all resource links under the site, that is, the sub-chains. Each sub-chain contains multiple loc entries. A loc is the actual web page address, that is, the actual location of the resource (the resource address), and is also the smallest unit for adding and deleting resources.
[0063] Based on the above Sitemap format, three data tables are designed to store the full data, as shown in Tables 1-3, including: a resource table for storing index and sub-chain information, a link t...
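The index, sub-chain and loc hierarchy described above can be illustrated with a minimal sketch. It assumes the data follows the public sitemaps.org XML layout (`sitemapindex`/`sitemap`/`loc` for the index, `urlset`/`url`/`loc` for a sub-chain); the patent only says the table design is based on the Sitemap format, so the exact layout, the function names and the example URLs are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_index(index_xml: str) -> List[str]:
    """Return the sub-chain links listed in a site's index file."""
    root = ET.fromstring(index_xml)
    return [
        el.text.strip()
        for el in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if el.text
    ]


def parse_sub_chain(sub_chain_xml: str) -> List[str]:
    """Return the loc entries (resource addresses) of one sub-chain."""
    root = ET.fromstring(sub_chain_xml)
    return [
        el.text.strip()
        for el in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if el.text
    ]


# Hypothetical example documents for one site.
INDEX_XML = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>"""

SUB_CHAIN_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page-a</loc></url>
  <url><loc>https://example.com/page-b</loc></url>
</urlset>"""

sub_chains = parse_index(INDEX_XML)
resource_addresses = parse_sub_chain(SUB_CHAIN_XML)
```

Here `sub_chains` holds the links from the index file and `resource_addresses` holds the loc entries of one sub-chain, which matches the description of loc as the smallest unit that can be added or deleted.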