Large-Scale Image Data Similarity Search Method Based on EMD Distance

An EMD-distance-based similarity search technology, applied in fields such as electrical digital data processing, special data processing applications, and instruments. It addresses problems such as text-based retrieval failing to meet demand, subjective annotation undermining the reliability of image retrieval results, and the heavy workload of manual annotation.

Active Publication Date: 2018-04-13
GUANGXI UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] To find valuable image information in large-scale image datasets, traditional text-based image retrieval (Text-Based Image Retrieval, TBIR for short) clearly cannot meet the demand.
Because TBIR relies on manual annotation of image content, a sharp increase in the number of images brings two serious problems: first, the manual annotation workload becomes too large and its cost too high; second, manual annotation is highly subjective, which directly affects the reliability of image retrieval results.
Given the high computational complexity of the EMD distance, estimating a computing node's cost by its data volume (rather than by the actual number of EMD distance calculations) is clearly biased. This makes it hard to balance the computing load across nodes and directly reduces the overall query processing performance of the distributed system.
In addition, when the size of the image dataset surges, the distributed index in Melody-Join still does not filter out enough irrelevant calculations.
Together, these two issues mean that the scalability of Melody-Join on large-scale image datasets cannot meet the needs of practical applications.
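To make the cost argument concrete, the following is a minimal sketch of an EMD computation in the simplest special case. It assumes one-dimensional histograms with equal total mass and a ground distance of |i - j| between bins i and j; none of these assumptions come from the patent text, and the function name `emd_1d` is hypothetical. In this special case EMD reduces to a sum over cumulative histograms; the general case requires solving a transportation problem per image pair, which is considerably more expensive and is why the number of EMD calculations, not the data volume, dominates a node's cost.

```python
from itertools import accumulate


def emd_1d(p, q):
    """EMD between two equal-mass histograms on unit-spaced 1-D bins.

    Assumption (not from the source): the ground distance between
    bins i and j is |i - j|. Under that assumption EMD equals the sum
    of absolute differences between the two cumulative histograms.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    cdf_p = accumulate(p)  # running sums of p
    cdf_q = accumulate(q)  # running sums of q
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))
```

For example, moving one unit of mass from bin 0 to bin 2 costs two units of work, so `emd_1d([1, 0, 0], [0, 0, 1])` evaluates to 2.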




Embodiment Construction

[0062] As shown in Figure 3, the large-scale image data similarity search method based on EMD distance in this embodiment includes the following steps:

[0063] 1) Design an image data mapping function f that maps image data to the one-dimensional real-number key-value space Ω(Φ); the function f defines the mapping relationship between image data and key values in that space;

[0064] 2) Start a MapReduce job MR1 and use it to estimate, from the query image set Q and the image set I to be retrieved, the query processing load corresponding to each key value in the one-dimensional real-number key-value space Ω(Φ);

[0065] 3) Start a MapReduce job MR2. Based on the query processing load estimated in step 2), the Map tasks of MR2 cut the one-dimensional real-number key-value space Ω(Φ), and respectively divide the one-dimensional real-number key-value space Ω(Φ...
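Steps 1) to 3) can be illustrated with a small single-process sketch. Everything in it is an assumption made for illustration: the source does not specify the form of f, so `key_of` uses the histogram centroid (for equal-mass 1-D histograms with ground distance |i - j|, the difference of two centroids never exceeds their EMD); the query is treated as a range search with a hypothetical threshold `radius`; load is approximated as the number of candidate EMD calculations per key bucket; and `estimate_load` and `cut_by_load` are hypothetical names standing in for what MR1 and the Map tasks of MR2 would do in a real MapReduce deployment.

```python
from bisect import bisect_left, bisect_right


def key_of(hist):
    """Hypothetical mapping f: histogram -> real key (its centroid)."""
    return sum(i * w for i, w in enumerate(hist))


def estimate_load(query_keys, data_keys, radius, n_buckets, lo, hi):
    """Step 2 sketch: count candidate EMD calculations per key bucket.

    A data image is a candidate for a query when their keys differ by
    at most `radius`; each candidate would cost one EMD calculation.
    """
    data_sorted = sorted(data_keys)
    width = (hi - lo) / n_buckets
    load = [0] * n_buckets
    for qk in query_keys:
        left = bisect_left(data_sorted, qk - radius)
        right = bisect_right(data_sorted, qk + radius)
        # Clamp so keys on or outside the boundaries land in a valid bucket.
        bucket = max(0, min(int((qk - lo) / width), n_buckets - 1))
        load[bucket] += right - left
    return load


def cut_by_load(load, n_parts):
    """Step 3 sketch: cut buckets into contiguous ranges of similar load.

    Returns (start, end) bucket-index pairs, end exclusive. A trailing
    range may be empty when the load is concentrated in the last bucket.
    """
    target = sum(load) / n_parts
    cuts, start, acc = [], 0, 0
    for i, w in enumerate(load):
        acc += w
        if acc >= target and len(cuts) < n_parts - 1:
            cuts.append((start, i + 1))
            start, acc = i + 1, 0
    cuts.append((start, len(load)))
    return cuts
```

With a load vector of `[10, 0, 0, 10]` and two partitions, `cut_by_load` returns `[(0, 1), (1, 4)]`: the ranges cover very different widths of the key space but carry the same estimated number of EMD calculations, which is the balancing goal described above.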



Abstract

The invention discloses a large-scale image data similarity search method based on EMD distance. The steps include: designing an image data mapping function f that maps image data to a one-dimensional real-number key-value space Ω(Φ); starting job MR1 to estimate the load of each key value in Ω(Φ); starting job MR2, cutting Ω(Φ) in the Map tasks based on the estimated key-value loads, and sending the data fragments corresponding to each cut region to a Reduce task; mapping, based on f, the image data received by each Reduce task to key values in Ω(Φ) and constructing an EMD-distance-oriented index structure from those key values; performing an EMD-distance-based similarity search using the index structure; and outputting the union of the EMD-distance-based similarity search results of all Reduce tasks in MR2. The invention has the advantages of a lower network transmission data volume, a more balanced computing load distribution, higher similarity search efficiency, and better scalability for the analysis and processing of large datasets.
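The Reduce-side part of the abstract (build a key-based index, search it, take the union) can be sketched as a filter-and-refine range search. This is an illustrative sketch, not the patented index: a sorted list stands in for the EMD-oriented index structure, the mapping f is assumed to be the histogram centroid, EMD is restricted to equal-mass 1-D histograms with ground distance |i - j|, the query is assumed to be a range search with threshold `radius`, and all function names are hypothetical. Under these assumptions the key difference is a lower bound on EMD, so filtering by key never discards a true result.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def emd_1d(p, q):
    """EMD for equal-mass 1-D histograms, ground distance |i - j|."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def key_of(hist):
    """Hypothetical mapping f: the histogram centroid."""
    return sum(i * w for i, w in enumerate(hist))


def build_index(images):
    """Sort (key, id, histogram) so a key range becomes a list slice."""
    return sorted((key_of(h), img_id, h) for img_id, h in images.items())


def range_search(index, query, radius):
    """Filter by key range, then refine survivors with exact EMD."""
    keys = [entry[0] for entry in index]
    qk = key_of(query)
    lo = bisect_left(keys, qk - radius)
    hi = bisect_right(keys, qk + radius)
    return {img_id for _, img_id, h in index[lo:hi]
            if emd_1d(h, query) <= radius}


def search_all(partitions, query, radius):
    """Union of per-partition results, one partition per Reduce task."""
    result = set()
    for images in partitions:
        result |= range_search(build_index(images), query, radius)
    return result
```

For instance, with partitions `[{"a": [1, 0, 0], "b": [0, 1, 0]}, {"c": [0, 0, 1]}]`, query `[1, 0, 0]`, and `radius` 1, image `c` is discarded by the key filter alone (its key differs from the query key by 2), so only `a` and `b` require an EMD calculation and both are returned.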

Description

Technical field

[0001] The invention relates to computer image data similarity search technology, and in particular to a large-scale image data similarity search method based on EMD distance.

Background art

[0002] With the spread of digital devices such as portable computers, smartphones, and digital cameras, multimedia data, with images as its main representative, is growing explosively. This indicates that the era of image big data has arrived. Academia, industry, and government agencies have all begun to pay close attention to the analysis and processing of image big data.

[0003] To find valuable image information in large-scale image datasets, traditional text-based image retrieval (Text-Based Image Retrieval, TBIR for short) clearly cannot meet the demand. Because TBIR relies on manual annotation of image content, a sharp increase in the number of images brings two serious problems: first, the workload of manual annotation is too large, and ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 许嘉, 吕品, 李陶深, 陈宁江, 许华杰, 文珺, 张佳振
Owner: GUANGXI UNIV