R-TBF-based RFID redundant data cleaning strategy

A redundant data cleaning strategy in the field of data cleaning. It addresses incomplete consideration of constraints, reduced data quality, and poor cleaning results, with the aim of improving both the cleaning effect and the data quality.

Active Publication Date: 2017-06-13
重庆大学溧阳智慧城市研究院

AI Technical Summary

Problems solved by technology

In actual RFID applications, because reading is non-contact and does not require line of sight, a large amount of target tag data is generated before the reader is close to the target tag, and these data are partly redundant. In addition, several readers often work at the same time in practice, so a large amount of redundant data is generated for the same target tag within a short period. This redundancy is unavoidable throughout the RFID application process, and its existence limits the wider adoption of RFID applications.
[0003] In addition, most RFID data in these applications is mobile, which makes processing more challenging. The main problem in cleaning RFID redundant data is therefore how to clean a large RFID data stream in real time, in a short time and in a small space, which places higher demands on the execution time and memory footprint of the cleaning algorithm.
[0004] There are currently many cleaning methods for RFID redundant data. Alonso proposed ESP, an extensible data stream cleaning model based on declarative queries, but it must save all the data to be processed, which does not suit the dynamic nature of RFID data streams and occupies a large amount of memory. Bloom proposed the Bloom Filter (hereinafter BF) in 1970; BF is widely used in data cleaning because of its low memory usage and efficient queries. Metwally used BF to detect redundant data. Because BF has no delete function, it fills up and becomes ineffective once the amount of data is large enough.
In addition, a Bloom Filter judges redundancy only by whether a data item is present. With the large data volumes of practical applications, useful information about the same tag needs to be retained rather than a single data item, since a single data item is one-sided ...
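The saturation problem of a plain Bloom Filter can be illustrated with a minimal Python sketch. The class name `BloomFilter`, the salted SHA-256 hashing, and the sizes used here are illustrative assumptions; they are not taken from Metwally's method or from the patent.

```python
import hashlib


class BloomFilter:
    """Classic Bloom filter: m bits, k hash functions, insert and query only."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [False] * m

    def _indexes(self, item: str) -> list:
        # Derive k indexes by salting SHA-256 with the hash number (illustrative choice).
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.bits[idx] = True

    def __contains__(self, item: str) -> bool:
        # True means "possibly seen"; there is no way to clear a bit again.
        return all(self.bits[idx] for idx in self._indexes(item))

    def fill_ratio(self) -> float:
        return sum(self.bits) / self.m


# A deliberately tiny filter fed far more tags than it was sized for.
bf = BloomFilter(m=64, k=3)
for number in range(2000):
    bf.add(f"tag-{number}")

saturated = bf.fill_ratio() == 1.0
false_positive = "never-seen-tag" in bf
```

Once every bit is set, every query answers "present", so a tag that was never read would be judged redundant. This is the failure mode that motivates filters whose cells can expire.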



Examples


Embodiment 1

[0045] As shown in the figure, the R-TBF-based RFID redundant data cleaning strategy provided by this embodiment solves the problem that the traditional TBF-based RFID data cleaning strategy cleans poorly and deletes useful data by mistake because its constraints are insufficient. It further improves data quality, restores the authenticity of the data, and provides a strong guarantee for the effective use of subsequent data. Building on the TBF-based RFID data cleaning strategy, the factors considered are extended from time alone to both time and intensity. In real application scenarios the reader covers a large area, so the mobile RFID inspection vehicle can already read a tag's data before it reaches the position facing the tag. Such readings typically have a small time value and a low intensity. If only relying on time to judge whether the data is redundan...
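The weakness of a time-only rule can be shown with a small Python sketch. It is a simplified stand-in for a TBF-style check, not the patent's algorithm: the function name `time_only_clean`, the dictionary used instead of a hashed array, the threshold value, and the sample readings are all assumptions made for illustration.

```python
def time_only_clean(readings, tau):
    """Keep a reading only if the same tag was not kept within the last tau time units."""
    last_kept = {}  # tag ID -> time of the last kept reading
    kept = []
    for tag_id, time, rssi in readings:
        if tag_id in last_kept and time - last_kept[tag_id] <= tau:
            continue  # judged redundant on time alone, whatever its intensity
        last_kept[tag_id] = time
        kept.append((tag_id, time, rssi))
    return kept


# Hypothetical readings of one tag as the inspection vehicle approaches it:
# the signal gets stronger (RSSI closer to zero) as the vehicle gets nearer.
readings = [("A", 0, -75), ("A", 1, -60), ("A", 2, -45)]
kept = time_only_clean(readings, tau=5)
```

With these inputs only the earliest and weakest reading survives, while the later, stronger readings taken near the tag are discarded as redundant. Adding an intensity constraint is meant to avoid exactly this outcome.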

Embodiment 2

[0074] The cleaning process shown in Figure 2 is described in detail below. The cleaning process provided in this embodiment mainly includes the following steps:

[0075] Step 1: Initialize the filter. The initialization covers:

[0076] 11) an integer array M of size m, used to store the time attribute of the data;

[0077] The size of m is calculated according to the following formula:

[0078]

[0079] where n is the size of the input data, P is the probability that a cell of the integer array is still empty after k·n mappings, and k is the number of hash functions;

[0080] 12) k hash functions h_1, …, h_k for mapping the tag information of the data to the integer array;

[0081] The size of k should satisfy the following inequality:

[0082] k·n < m

[0083] where n is the size of the input data and m is the size of the integer array;

[0084] The formula for calculating k is:

[0085]

[0086] Among them, n is the...
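The initialization in Step 1 can be sketched in Python. The sizing formulas are not reproduced in this text, so the sketch falls back on the standard Bloom filter relation P = (1 − 1/m)^(k·n) ≈ e^(−k·n/m), solved for m. That is an assumption and may differ from the patent's own formula. The helper names (`empty_probability`, `array_size`, `make_hash_functions`), the salted SHA-256 hashing, and the example values of n, k and P are likewise hypothetical.

```python
import hashlib
import math


def empty_probability(m: int, n: int, k: int) -> float:
    """Probability that one cell is still empty after k*n mappings into m cells."""
    return (1.0 - 1.0 / m) ** (k * n)


def array_size(n: int, k: int, p_empty: float) -> int:
    """Array size m derived from the approximation P ~= exp(-k*n/m) (assumed sizing rule)."""
    return math.ceil(-k * n / math.log(p_empty))


def make_hash_functions(k: int, m: int):
    """Return k functions h_1..h_k, each mapping a tag ID to a cell index in [0, m)."""
    def make(i: int):
        def h(tag_id: str) -> int:
            digest = hashlib.sha256(f"{i}:{tag_id}".encode("utf-8")).digest()
            return int.from_bytes(digest[:8], "big") % m
        return h
    return [make(i) for i in range(1, k + 1)]


n, k = 1000, 3                      # hypothetical input size and number of hash functions
m = array_size(n, k, p_empty=0.5)   # hypothetical target value for P
M = [0] * m                         # integer array M for the time attribute
hashes = make_hash_functions(k, m)
cells = [h("TAG-0001") for h in hashes]  # the k cells a given tag ID maps to
```

Salting one cryptographic hash with the function index is only one convenient way to obtain k independent-looking hash functions; a production filter would more likely use a faster non-cryptographic hash.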


Abstract

The invention discloses an R-TBF-based RFID redundant data cleaning strategy. First, a filter is initialized, comprising an integer array M for storing the time attribute of the data, hash functions, a mapping function, a Map set P, a time threshold τ, and an intensity threshold α. Then the redundancy of the current data X is judged, and redundant data are cleaned according to the {ID, TIME, RSSI} transmission format and the cleaning rule. Finally, processing of the current data X is completed. The strategy takes two constraints into account, the time factor and the intensity factor. Through primary timestamp cleaning and secondary intensity-value cleaning, it improves the cleaning effect and the data quality, restores data authenticity to the greatest extent, and provides a strong guarantee for the effective use of subsequent data.
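A minimal Python sketch of a two-stage check in this spirit follows. The exact cleaning rule is not spelled out in this text, so the decision logic is an assumption: a reading passes the primary check if any of its cells in M is older than the time threshold `tau`; otherwise a secondary check lets it replace the stored reading when its RSSI exceeds the stored RSSI by more than the intensity threshold `alpha`. The names `Reading`, `RTBF`, `process`, the verdict strings, and the use of P as a map from tag ID to the retained reading are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    tag_id: str   # ID
    time: int     # TIME
    rssi: float   # RSSI in dBm, higher means stronger


class RTBF:
    def __init__(self, m: int, k: int, tau: int, alpha: float) -> None:
        self.m, self.k, self.tau, self.alpha = m, k, tau, alpha
        self.M = [None] * m   # last recorded time per cell; None means never written
        self.P = {}           # tag ID -> reading currently retained (assumed role of P)

    def _cells(self, tag_id: str) -> list:
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{tag_id}".encode("utf-8")).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def process(self, x: Reading) -> str:
        cells = self._cells(x.tag_id)
        # Primary cleaning: was this tag (apparently) seen within tau?
        recent = all(self.M[c] is not None and x.time - self.M[c] <= self.tau for c in cells)
        held = self.P.get(x.tag_id)
        if not recent or held is None:
            verdict = "keep"
        # Secondary cleaning: a clearly stronger reading supersedes the stored one.
        elif x.rssi - held.rssi > self.alpha:
            verdict = "replace"
        else:
            return "redundant"
        for c in cells:
            self.M[c] = x.time
        self.P[x.tag_id] = x
        return verdict


rtbf = RTBF(m=1024, k=3, tau=5, alpha=10)
stream = [Reading("A", 0, -75), Reading("A", 1, -60), Reading("A", 2, -58), Reading("A", 20, -70)]
verdicts = [rtbf.process(x) for x in stream]
```

In this hypothetical stream the weak first reading is kept, the much stronger second reading replaces it, the third is dropped as redundant, and the fourth is kept because it arrives after the time window has passed.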

Description

Technical field

[0001] The invention relates to the technical field of data cleaning, in particular to an R-TBF-based RFID redundant data cleaning strategy.

Background

[0002] RFID technology is widely used in logistics, supply chains and other fields because it is non-contact and does not require line of sight. With the development of modern computing and the construction of intelligent warehouses, RFID technology has become even more common. RFID data is an important part of RFID applications, and its quality has an important impact on the application of RFID technology. In actual RFID applications, because reading is non-contact and does not require line of sight, a large amount of target tag data is generated before the reader is close to the target tag, and these data are partly redundant. In addition, since there are often multiple readers working at the same time in practical applications, a large amount of redundant data will b...


Application Information

IPC(8): G06F17/30
CPC: G06F16/215
Inventor: 孙棣华, 郑林江, 赵敏, 刘卫宁, 朱文霖
Owner: 重庆大学溧阳智慧城市研究院