Rapid similar data detection method based on uniform sampling
A fast similar data detection technique, applied in database retrieval, database indexing, and database querying. It addresses slow calculation speed by reducing the number of fingerprints and simplifying the calculation, so that similar data can be detected quickly and efficiently.
Embodiment Construction
[0040] The present invention is further described below with reference to the drawings and specific embodiments.
[0041] As shown in Figures 1 to 3, a fast similar data detection method based on uniform sampling includes the following steps:
[0042] A. Quickly calculate a hash set using a sliding window algorithm, so that as much duplicate or similar content as possible is covered; that is, if two data blocks are similar, their corresponding hash sets also share many values;
[0043] B. Quickly and uniformly sample the calculated hash set. If two hash sets are very similar, the subsets obtained by uniformly sampling them are also very similar;
[0044] C. Perform M linear transformations on the sampled hash set to obtain M new sets and, based on the extreme-value principle, extract one feature value (the maximum or the minimum) from each set, and calculate the feature value. The formu...