R-TBF-based RFID redundant data cleaning strategy

A redundant data cleaning strategy, applied in the field of data cleaning, that addresses incomplete consideration of data attributes, reduced data quality, and poor cleaning results, thereby improving both the cleaning effect and the data quality

Active Publication Date: 2017-06-13
重庆大学溧阳智慧城市研究院

AI Technical Summary

Problems solved by technology

In actual RFID applications, because the technology is non-contact and non-line-of-sight, a large amount of data for a target tag is already generated before the reader gets close to the tag, and this data is partly redundant. In addition, because multiple readers often work at the same time in practice, a large amount of redundant data is generated for the same target tag within a short time span. This redundant data is unavoidable throughout an RFID application, and its existence limits the wider adoption of RFID.
[0003] In addition, most RFID data is mobile in nature, which makes it harder to process. The main problem in cleaning redundant RFID data is therefore how to clean a large RFID data stream in real time, quickly and in little space, which places higher demands on the execution time and memory footprint of the cleaning algorithm.
[0004] At present there are many cleaning methods for redundant RFID data. Alonso proposed ESP, an extensible data stream cleaning model based on declarative queries, but it must store all the data to be processed, which does not suit the dynamic nature of RFID data streams and occupies a large amount of memory. Bloom proposed the Bloom Filter (hereinafter BF) in 1970; BF is widely used in data cleaning because of its low memory footprint and efficient queries. Metwally used BF to detect redundant data, but because BF has no delete operation, it fills up and becomes ineffective once the data volume is large enough.
In addition, a Bloom Filter judges redundancy only by whether an item is present or absent. With the large data volumes of practical applications, useful information for a given tag needs to be retained across several readings rather than from a single one, because a single reading is one-sided and uncertain. The traditional Bloom Filter therefore does not meet practical requirements.
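To make the fill-up problem concrete, here is a minimal sketch of a plain Bloom filter. It is an illustration only, not code from the patent or from Metwally's work: the class name, the salted SHA-256 hashing, and the sizes used in the usage note are all assumptions.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: m bits, k hash functions, no delete operation."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [False] * m

    def _positions(self, item: str) -> list[int]:
        # Derive k positions by salting SHA-256 with the hash index.
        # This hashing scheme is an illustrative choice, not the source's.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item: str) -> bool:
        # True means "possibly seen"; False means "definitely not seen".
        return all(self.bits[pos] for pos in self._positions(item))

    def fill_ratio(self) -> float:
        # Fraction of bits that are set; bits are never cleared.
        return sum(self.bits) / self.m
```

With a deliberately small filter, for example `BloomFilter(m=64, k=3)` fed several hundred distinct tag IDs, `fill_ratio()` approaches 1 and membership tests start returning `True` for tags that were never added. Nothing can be removed, so the filter stays saturated.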
[0005] Chun-Hee Lee et al. first proposed the TBF (Time Bloom Filter), which uses time information to eliminate redundant data. It solves the redundancy of RFID data in the time attribute and can clean the data while retaining valid information to some extent. However, RFID data has a strength attribute in addition to the time attribute, and strength also plays an important role in many RFID applications. Data that is redundant in the time attribute is not necessarily redundant in the strength attribute, so TBF-based cleaning easily loses much useful strength information. For RFID applications, TBF-based cleaning therefore performs poorly: its considerations are incomplete and data quality is reduced to some extent, which affects the effective use of RFID data in subsequent applications.
Similarly, the DTBF-based RFID redundant data cleaning method and system proposed in patent application CN201610212717.1 can handle data cleaning when the size of the data stream is uncertain, but it does not consider the effect of the strength attribute on the cleaning result. The data can be cleaned, but the cleaning effect is poor because of this omission.
In a real RFID data cleaning scenario, because of the reader's large coverage, a mobile RFID inspection vehicle can already read a tag's data before it reaches the position facing the tag. Such readings typically have a small time value and a low intensity. If redundancy is judged by time alone, the valid intensity information for the same tag at nearby times is lost, so the data no longer reflects the real situation and the tag's true location cannot be recovered.
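The time-based idea can be sketched as follows. This is a simplified illustration and not Lee et al.'s exact algorithm: each cell stores a timestamp instead of a bit, and a reading is treated as redundant when all k cells for its tag hold a time within a threshold `tau` of the current reading. The class name, the hashing scheme, and the choice to refresh cells only on non-redundant readings are assumptions.

```python
import hashlib
from typing import Optional


class TimeBloomFilter:
    """Bloom-filter-like array whose cells hold timestamps instead of bits."""

    def __init__(self, m: int, k: int, tau: float) -> None:
        self.m = m
        self.k = k
        self.tau = tau  # time threshold for redundancy
        self.cells: list[Optional[float]] = [None] * m

    def _positions(self, tag_id: str) -> list[int]:
        # Salted SHA-256 hashing is an illustrative choice only.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{tag_id}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def is_redundant(self, tag_id: str, t: float) -> bool:
        positions = self._positions(tag_id)
        redundant = all(
            self.cells[p] is not None and t - self.cells[p] <= self.tau
            for p in positions
        )
        if not redundant:
            # Assumed policy: record the time only when the reading is kept.
            for p in positions:
                self.cells[p] = t
        return redundant
```

Because old timestamps simply stop matching once they fall outside `tau`, stale entries expire implicitly, which sidesteps the missing delete operation of a plain Bloom filter. The signal strength of a reading plays no part in the decision, which is the gap the text describes.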



Examples


Embodiment 1

[0045] As shown in the figure, the R-TBF-based RFID redundant data cleaning strategy provided by this embodiment addresses the problem that the traditional TBF-based RFID data cleaning strategy cleans poorly and deletes useful data by mistake because its constraints are insufficient. It further improves data quality, restores the authenticity of the data, and provides a strong guarantee for the effective use of subsequent data. Building on the TBF-based RFID data cleaning strategy, the factors considered are extended from time alone to the two factors of time and intensity. In the actual application scenario, because of the reader's large coverage, the mobile RFID inspection vehicle can already read a tag's data before it reaches the position facing the tag; such readings typically have a small time value and a low intensity. If only relying on time to judge whether the data is redundan...

Embodiment 2

[0074] The cleaning process shown in Figure 2 is described in detail below. The cleaning process provided in this embodiment mainly includes the following steps:

[0075] Step 1: Initialize the filter, the initialization content includes:

[0076] 11) an integer array M used to save the data time attribute, with a size of m;

[0077] The size of m is calculated according to the following formula:

[0078]

[0079] Here, n is the size of the input data, P is the probability that a cell of the integer array is still empty after k·n mappings, and k is the number of hash functions;

[0080] 12) k hash functions h_1 … h_k for mapping data tag information to the integer array;

[0081] The size of k should satisfy the following inequality:

[0082] k·n < m

[0083] Among them, n is the size of the input data volume, and m is the size of the integer array;

[0084] The formula for calculating k is:

[0085]

[0086] Among them, n is the...
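The formulas for m and k did not survive in the text above. As a rough sketch under the textbook Bloom filter model (uniform, independent hash functions), the probability that a cell is still empty after k·n mappings is P = (1 - 1/m)^(k·n), which can be solved for m, and the usual choice for k is (m/n)·ln 2. Whether these match the patent's own formulas is an assumption. The function names are hypothetical, and the check k·n < m is an assumed reading of the truncated inequality.

```python
import math


def array_size_for_empty_prob(n: int, k: int, p_empty: float) -> int:
    """Smallest m with (1 - 1/m)**(k*n) >= p_empty (textbook model)."""
    if n <= 0 or k <= 0 or not 0.0 < p_empty < 1.0:
        raise ValueError("need n > 0, k > 0 and 0 < p_empty < 1")
    # Solve P = (1 - 1/m)^(k n) for m, then round up to an integer.
    return math.ceil(1.0 / (1.0 - p_empty ** (1.0 / (k * n))))


def textbook_k(m: int, n: int) -> int:
    """Common Bloom filter choice k = (m / n) * ln 2, at least 1."""
    if m <= 0 or n <= 0:
        raise ValueError("need m > 0 and n > 0")
    return max(1, round((m / n) * math.log(2)))


def satisfies_size_constraint(m: int, n: int, k: int) -> bool:
    # Assumed reading of the truncated inequality: k * n < m.
    return k * n < m


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Standard approximation (1 - e^(-k n / m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

For example, `textbook_k(1000, 100)` gives 7, and `satisfies_size_constraint(1000, 100, 7)` holds because 700 < 1000.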


Abstract

The invention discloses an R-TBF-based RFID redundant data cleaning strategy. First, a filter is initialized, comprising an integer array M for storing data time attributes, a hash function, a mapping function, a Map set P, a time threshold τ and an intensity threshold α. Next, the redundancy of the current data X is judged, and redundant data is cleaned according to the {ID, TIME, RSSI} transmission format and the cleaning rules. Finally, processing of the current data X is completed. The strategy takes two constraints into account, the time factor and the strength factor, and cleans the data in two passes: a primary timestamp cleaning and a secondary intensity value cleaning. This improves the cleaning effect and the data quality, restores data authenticity as far as possible, and provides a strong guarantee for the effective use of subsequent data.
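The two-pass idea in the abstract might be sketched as below. The actual cleaning rules are not given in this excerpt, so the rule used here is an assumption: a reading in {ID, TIME, RSSI} form is dropped only if it is redundant in time (all cells within τ) and its RSSI differs from the last kept RSSI for that tag by at most α. The role given to the Map set P (tag ID to last kept RSSI), the class and field names, and the hashing scheme are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reading:
    tag_id: str   # ID
    time: float   # TIME
    rssi: float   # RSSI


class RTBFSketch:
    """Illustrative two-pass filter: time check first, then intensity."""

    def __init__(self, m: int, k: int, tau: float, alpha: float) -> None:
        self.m, self.k = m, k
        self.tau = tau      # time threshold
        self.alpha = alpha  # intensity threshold
        self.M: list[Optional[float]] = [None] * m  # time attributes
        self.P: dict[str, float] = {}  # assumed: tag ID -> last kept RSSI

    def _positions(self, tag_id: str) -> list[int]:
        # Salted SHA-256 hashing is an illustrative choice only.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{tag_id}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def keep(self, x: Reading) -> bool:
        positions = self._positions(x.tag_id)
        # Pass 1: timestamp check, as in a time Bloom filter.
        time_redundant = all(
            self.M[p] is not None and x.time - self.M[p] <= self.tau
            for p in positions
        )
        if time_redundant:
            # Pass 2 (assumed rule): drop only if the intensity is also close.
            last = self.P.get(x.tag_id)
            if last is not None and abs(x.rssi - last) <= self.alpha:
                return False
        for p in positions:
            self.M[p] = x.time
        self.P[x.tag_id] = x.rssi
        return True
```

Keeping a per-tag dictionary costs more memory than the bit array alone, so a real design would need to bound or compact it. The sketch only shows how a second, intensity-based test can rescue readings that a time-only filter would discard.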

Description

Technical field

[0001] The invention relates to the technical field of data cleaning, in particular to an R-TBF-based RFID redundant data cleaning strategy.

Background

[0002] RFID technology is widely used in logistics, supply chains and other fields because it is non-contact and non-line-of-sight. With the development of modern computing and intelligent warehouses in particular, RFID applications have become more common. RFID data is an important part of RFID applications, and its quality has an important impact on the application of RFID technology. In actual RFID applications, because the technology is non-contact and non-line-of-sight, a large amount of data for a target tag is already generated before the reader gets close to the tag, and this data is partly redundant. In addition, because multiple readers often work at the same time in practice, a large amount of redundant data will b...


Application Information

IPC(8): G06F17/30
CPC: G06F16/215
Inventor: 孙棣华, 郑林江, 赵敏, 刘卫宁, 朱文霖
Owner: 重庆大学溧阳智慧城市研究院