Data cleaning method, graph database device and computer readable storage medium

A data cleaning technique and graph database device, applied in the field of data processing. It addresses problems such as dirty data, poor cleaning performance, and rows whose data cannot all expire at the same time, and offers simple implementation and good cleaning performance.

Pending Publication Date: 2021-11-19
ZHEJIANG DAHUA TECH CO LTD

AI Technical Summary

Problems solved by technology

[0002] At present, data cleaning can be implemented with the Time To Live (TTL) feature of the open-source database HBase. Because this is a cleaning scheme at the column family level, however, it cannot handle the case where multiple types of data have different life cycles, and the life cycle must be set in advance when the table is created. When data is updated, the timestamps of some columns become inconsistent, so a row of data cannot expire all at once and dirty data is generated.
In addition, there is a scheme that deletes data according to query results (Delete By Query). An index is created on the time attribute field of the objects to be cleaned in the graph data, and a data expiration time is set. A scheduled query runs every day to check whether any data has reached its expiration time; if so, the identifiers (IDs) of the expired data are obtained from the index and the data is deleted by ID. The performance of this cleaning method is very poor.
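
The partial-expiry problem described in [0002] can be illustrated with a small toy model. This is only a sketch and does not use the HBase API: `Cell`, `live_columns`, the column names, and the second-based time unit are all hypothetical, and expiry is modeled simply as "cell age has reached the family-wide TTL".

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    timestamp: int  # write time in seconds (hypothetical unit)


def live_columns(row: dict[str, Cell], family_ttl: int, now: int) -> dict[str, str]:
    """Return the columns of one row that are still alive under a single,
    family-wide TTL. Each cell is judged by its own timestamp."""
    return {
        name: cell.value
        for name, cell in row.items()
        if now - cell.timestamp < family_ttl
    }


# A row written at t=0 with two columns.
row = {"name": Cell("camera-1", 0), "status": Cell("online", 0)}

# Only one column is updated later, at t=50, so timestamps now differ.
row["status"] = Cell("offline", 50)

# At t=120 with a TTL of 100, "name" (age 120) has expired but
# "status" (age 70) has not: the row is left half-expired.
partial = live_columns(row, family_ttl=100, now=120)
```

In this model the row neither survives intact nor disappears as a whole, which is the dirty-data situation the background section refers to.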

Embodiment Construction

[0019] The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of this application.

[0020] The solution provided by this application relates to the fields of knowledge graphs and graph databases, and mainly concerns how to perform data cleaning on a fusion graph in a graph database. The fusion graph contains multiple types of data, each with a different life cycle, but all of them are stored in the same column family of an HBase table, and vertex data and edge data are stored in the same row; edge data write The essence o...

Abstract

The invention discloses a data cleaning method, a graph database device, and a computer-readable storage medium. The method is applied to a graph database device that comprises a plurality of storage units. The storage units store row data, and the row data comprises information on a plurality of vertices and information on the edges connected to those vertices. The method comprises the following steps: acquiring the life cycle of a vertex; designating the storage unit that stores the vertex information as a first storage unit, and setting a life cycle and a timestamp for the first storage unit; designating the storage unit that stores the edge information as a second storage unit, acquiring the information of the vertices connected by the edge as connection point information, and setting a life cycle and a timestamp for the second storage unit based on the connection point information; and, after the life cycle of a storage unit has ended, cleaning the data in that storage unit. In this way, expired data can be cleaned and the implementation is simple.
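
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `StorageUnit`, `vertex_unit`, `edge_unit`, and `clean` are hypothetical, and the rule used to derive an edge's life cycle from its connection point information (taking the shortest life cycle among the connected vertices) is an assumption, since the abstract only states that the edge's life cycle is set "based on" that information.

```python
from dataclasses import dataclass


@dataclass
class StorageUnit:
    data: str
    ttl: int        # life cycle in seconds (hypothetical unit)
    timestamp: int  # time at which the unit was written

    def expired(self, now: int) -> bool:
        return now - self.timestamp >= self.ttl


def vertex_unit(info: str, ttl: int, now: int) -> StorageUnit:
    """First storage unit: holds vertex information with its own life cycle."""
    return StorageUnit(info, ttl, now)


def edge_unit(info: str, endpoints: list[StorageUnit], now: int) -> StorageUnit:
    """Second storage unit: holds edge information.
    ASSUMPTION: the edge takes the shortest life cycle of its connected
    vertices, so it never outlives either endpoint written at the same time."""
    return StorageUnit(info, min(v.ttl for v in endpoints), now)


def clean(row: dict[str, StorageUnit], now: int) -> dict[str, StorageUnit]:
    """Drop every storage unit in the row whose life cycle has ended."""
    return {key: unit for key, unit in row.items() if not unit.expired(now)}


# One row holding two vertices and the edge between them, all written at t=0.
person = vertex_unit("person:42", ttl=300, now=0)
car = vertex_unit("car:7", ttl=100, now=0)
drives = edge_unit("person:42->car:7", [person, car], now=0)
row = {"v:person": person, "v:car": car, "e:drives": drives}

# At t=150 the car vertex and the edge have both expired; the person remains.
remaining = clean(row, now=150)
```

Because each storage unit carries its own life cycle and timestamp, data types with different life cycles can share a row and still be cleaned independently.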

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a data cleaning method, a graph database device, and a computer-readable storage medium.

Background

[0002] At present, data cleaning can be implemented with the Time To Live (TTL) feature of the open-source database HBase. Because this is a cleaning scheme at the column family level, however, it cannot handle the case where multiple types of data have different life cycles, and the life cycle must be set in advance when the table is created. After data is updated, the timestamps of some columns become inconsistent, so a row of data cannot expire all at once, resulting in dirty data. In addition, there is a scheme that deletes data according to query results (Delete By Query): an index is created on the time attribute field of the objects to be cleaned in the graph data, and set the data expiration tim...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F16/27
CPC: G06F16/219, G06F16/27
Inventor: 俞毅, 沈秋军, 周明伟, 李丛
Owner: ZHEJIANG DAHUA TECH CO LTD