Functional dependency-based diversity data restoration method

A data repair technology based on functional dependencies, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as inefficiency and disregard for repair cost, with the effect of avoiding redundancy and improving efficiency and scalability.

Active Publication Date: 2018-02-02
SHANGHAI MUNICIPAL ELECTRIC POWER CO +2

AI Technical Summary

Problems solved by technology

Methods that generate repairs at random do not consider repair cost at all; that is, they completely ignore the fact that different repairs differ in how likely they are to actually occur.
In addition, because the repair space is exponential in size, methods that generate repairs at random are generally inefficient.



Examples


Embodiment

[0048] To address the deficiencies of traditional methods and the needs of practical problems, this paper proposes a novel data repair problem: diversity repair. The goal of this problem is to find k repairs that are both low-cost and diverse. These representative repairs can serve as reliable references for users in interactive repair methods, and they can also effectively summarize the entire space of possible repairs, improving the performance of methods that query for consistent results. Starting from this goal, this paper introduces a bi-criteria optimization problem that balances cost against diversity (expressed as distance), and proves that the problem is NP-complete.
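The trade-off between cost and diversity can be illustrated with a small sketch. The excerpt does not give the actual objective, so everything here is an assumption made for illustration: cost is taken as the number of changed cells, distance as the number of cells on which two repairs disagree, and the weighting parameter `alpha` and the function names are hypothetical.

```python
from itertools import combinations

def repair_cost(original, repair):
    """Assumed cost: number of cells whose value differs from the original table."""
    return sum(o != r
               for o_row, r_row in zip(original, repair)
               for o, r in zip(o_row, r_row))

def repair_distance(a, b):
    """Assumed distance: number of cells on which two repairs disagree."""
    return sum(x != y
               for a_row, b_row in zip(a, b)
               for x, y in zip(a_row, b_row))

def score(original, repairs, alpha=0.5):
    """Hypothetical combined objective (lower is better).

    Total cost is penalized and total pairwise distance is rewarded;
    alpha is an illustrative weight, not a parameter from the source.
    """
    total_cost = sum(repair_cost(original, r) for r in repairs)
    diversity = sum(repair_distance(a, b) for a, b in combinations(repairs, 2))
    return alpha * total_cost - (1 - alpha) * diversity
```

Under these assumptions, a set of k repairs scores well when each repair changes few cells and the repairs differ from one another in many cells.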

[0049] Despite the high complexity of the diversity repair problem, this paper proposes a heuristic approach. We embed repair computation into the diversity model, and directly generate k repairs satisfying the condition through a model-guided algor...

Example 1

[0112] As shown in Table 1, when the input is ordered by tuple (t1, t2, t3) and, within each tuple, by attribute (ID, Name, Zip, City), Genrepair generates repair 1 in Table 2. The individual steps of repair 1 are shown in Tables 5, 6, 7 and 8. At each step, we mark the cells currently being processed by Genrepair, and we mark equivalence classes (ECs) containing more than two elements by drawing dashed borders around cells in the same column. Specifically, in step 1 (Table 5), the cells t1[ID], t1[Name], t1[Zip], t1[City], t2[ID] and t2[Name] are processed in sequence without any attribute value being changed. Initially, each cell, once processed, takes the unique value of the EC that contains it. For example, t2[Zip] is set to "100000", which equals the value of its own EC, denoted e2Zip. Since the value of this EC is equal to the value of EC(e1City) containing t1[City], the two E...

Example 2

[0115] As shown in Table 3, when the input sequence is ordered by attribute column (ID, Name, City, Zip) and, within each column, by tuple (t1, t2, t3), Genrepair generates repair 2. In step 1 of repair 2 (Table 9), processing t1[ID] and t3[ID] does not change the value of either cell, but it merges the equivalence classes containing t1[ID] and t3[ID]; as a result, the equivalence classes of t1[Name], t1[Zip] and t1[City] are merged with those of t3[Name], t3[Zip] and t3[City], respectively. In step 2 (Table 10), the value of t3[Name] is changed to "Michael", the value of t1[Name], because the two cells are in the same equivalence class and the value of that class was already fixed as "Michael" when t1[Name] was processed. In step 3 (Table 11), t3[City] is modified in the same way as in step 2. In step 4 (Table 12), when t2[Zip] is processed, the value of its equivalence class e2Zip is set to "100000", which will cause the equivalence class of t1[Zip] to b...
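The equivalence-class behaviour described in steps 1 and 2 can be sketched with a small union-find structure. This is not the Genrepair algorithm itself, only a minimal illustration under stated assumptions: a functional dependency ID → Name is assumed (the merge in step 1 suggests it, but the excerpt does not state it), the original value "Mike" for t3[Name] is hypothetical, and the class and function names are invented for this sketch.

```python
class EquivalenceClasses:
    """Union-find over cells; each class carries at most one fixed value."""

    def __init__(self):
        self.parent = {}
        self.value = {}  # class root -> value fixed for the whole class

    def find(self, cell):
        self.parent.setdefault(cell, cell)
        root = cell
        while self.parent[root] != root:
            root = self.parent[root]
        self.parent[cell] = root  # shortcut this cell to its root
        return root

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
            # Keep a value that either class had already fixed.
            if ra not in self.value and rb in self.value:
                self.value[ra] = self.value[rb]
            self.value.pop(rb, None)
        return ra

    def resolve(self, cell, current):
        """Fix the class value on first visit; later cells adopt it."""
        return self.value.setdefault(self.find(cell), current)


# Hypothetical data: t1 and t3 share an ID but disagree on Name.
TABLE = {
    ("t1", "ID"): "1", ("t1", "Name"): "Michael",
    ("t3", "ID"): "1", ("t3", "Name"): "Mike",
}

def repair_names(order):
    """Repair the Name cells, visiting them in the given order."""
    ec = EquivalenceClasses()
    # Assumed FD ID -> Name: equal IDs force the Name cells into one class.
    if TABLE[("t1", "ID")] == TABLE[("t3", "ID")]:
        ec.merge(("t1", "Name"), ("t3", "Name"))
    repaired = dict(TABLE)
    for cell in order:
        repaired[cell] = ec.resolve(cell, TABLE[cell])
    return repaired
```

Visiting t1[Name] first fixes the class value to "Michael", so t3[Name] is rewritten to match, as in step 2. Visiting the cells in the opposite order fixes the class to the other value instead, which is why different input orders can yield different repairs.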


Abstract

The invention relates to a diversity data repair method based on functional dependencies. The method comprises the following steps: initializing a repair set; judging whether the number of repairs in the repair set is smaller than or equal to a preset number of repairs and, if so, initializing an input queue and executing the next step, otherwise executing the last step; selecting the repair elements of each repair using a preference function w'(C) to generate the input queue; carrying out data repair using the Genrepair algorithm; judging whether the repair set already contains a repair identical to the new repair and, if so, returning directly to the second step, otherwise adding the repair to the repair set and returning to the second step; and judging whether a termination condition is satisfied and, if so, completing the repair, otherwise checking the repair set and selecting corresponding repairs for replacement. Compared with the prior art, the method considers both diversity and cost when repairing data, improves repair efficiency, and is suitable for effective dynamic sampling of exponentially large repair spaces.
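The collection loop in the abstract can be sketched roughly as follows. This is a simplified illustration, not the patented procedure: the final termination and replacement phase is omitted, the preference function w'(C) is replaced by a hypothetical `weight` callable used for a weighted random ordering, `genrepair` is a caller-supplied stand-in for the Genrepair algorithm, and `max_attempts` is an added safeguard that does not appear in the source.

```python
import random

def sample_repairs(cells, genrepair, k, weight=None, max_attempts=1000, seed=0):
    """Collect up to k distinct repairs by re-running genrepair on new input orders.

    cells:     the cells to be ordered into an input queue
    genrepair: stand-in callable mapping an input queue to a repair
    weight:    hypothetical preference function; must return a positive number
    """
    rng = random.Random(seed)
    if weight is None:
        weight = lambda cell: 1.0  # uniform preference by default
    repairs = []  # the repair set
    attempts = 0
    while len(repairs) < k and attempts < max_attempts:
        attempts += 1
        # Build the input queue: cells with larger weight tend to come first.
        queue = sorted(cells,
                       key=lambda c: rng.random() ** (1.0 / weight(c)),
                       reverse=True)
        repair = genrepair(queue)
        # Discard a repair that duplicates one already collected.
        if repair not in repairs:
            repairs.append(repair)
    return repairs
```

The loop stops early when the repair function cannot produce k distinct results, so the returned list may be shorter than k.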

Description

Technical field

[0001] The invention relates to the field of power data repair, and in particular to a diversity data repair method based on functional dependencies.

Background art

[0002] As society becomes increasingly information-driven, the volume of data being processed grows day by day and data processing itself becomes more complicated, so dirty data and inconsistencies inevitably appear. In addition to the well-known 3V characteristics of big data, namely multiple data types and multiple data sources (variety), massive scale (volume) and dynamic data characteristics (velocity), academia and industry are paying more and more attention to the quality of big data, and further characteristics such as data value and data veracity have been proposed. Clearly, if the data itself contains errors, the system cannot provide users with correct information, no matter how quickly it can process large amounts of data. ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/215
Inventor: 谈子敬, 周向东, 庞悦, 陈海波, 苏运, 郭乃网, 田英杰
Owner SHANGHAI MUNICIPAL ELECTRIC POWER CO