Evaluation of similar content-based images
By combining SSIM, SIFT, and histogram values, this method solves the problem that existing image search algorithms struggle to identify image content similarity, enabling accurate content-based image search and ranking.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SONY GROUP CORP
- Filing Date
- 2022-03-18
- Publication Date
- 2026-06-30
AI Technical Summary
Existing image search algorithms are based on pixel analysis, which makes it difficult to accurately identify the similarity of image content, leading to false detections, such as red jalapeno peppers being identified as red tomatoes.
A combination of Structural Similarity Index (SSIM), Scale Invariant Feature Transform (SIFT), and histogram values is used to calculate the similarity score of images. By analyzing the brightness, contrast, structure, and features of images, the accuracy of image search is improved.
Content-based image search was implemented, which can accurately identify the structure and features of objects in images, improving the accuracy of search results and subjective similarity assessment.
Description
[0001] Cross-references to related applications
[0002] This application claims priority to U.S. Patent Application No. 17/490,406, filed September 30, 2021, entitled “EVALUATION OF SIMILAR CONTENT-BASED IMAGES,” which claims priority to U.S. Provisional Patent Application No. 63/184,274 (Client Reference No. SYP340001US01), filed May 5, 2021, entitled “OBJECTIVE METHOD TO EVALUATE SIMILARITY FOR CONTENTS-BASED IMAGES,” which are incorporated herein by reference as if fully set forth herein for all purposes.
Background Technology
[0003] Image search algorithms are used to find similar images across the internet. Search techniques can involve the visual similarity of images' pixels (e.g., similar colors). Searches can involve mean squared error (MSE) techniques, peak signal-to-noise ratio (PSNR) techniques, and other subjective evaluations used to measure image similarity. Because MSE and PSNR techniques estimate absolute error based on pixel-by-pixel analysis, a search for a query image of a red jalapeno might produce search results for images of red tomatoes due to the amount of similar red pixels. These techniques are often used to compare the quality of the original image with the restored image, or can be used to match the original image with distorted images (e.g., due to blur, rotation, scaling, lighting, etc.).
Summary of the Invention
[0004] Implementation schemes typically involve the evaluation of similar content-based images. In some implementation schemes, a system includes one or more processors and includes logic encoded in one or more non-transitory computer-readable storage media for execution by the one or more processors. When executed, the logic is operable to cause the one or more processors to perform operations including: receiving a first image, wherein the first image includes at least one first object; receiving a second image, wherein the second image includes at least one second object; calculating a structural similarity index measure (SSIM) value based on at least one first object and at least one second object; calculating a scale-invariant feature transform (SIFT) value based on at least one first object and at least one second object; calculating a histogram value based on at least one first object and at least one second object; and calculating a similarity score based on the SSIM value, SIFT value, and histogram value.
[0005] Further regarding the system, in some implementations, the SSIM value is based on one or more of the brightness, contrast, and structure of the first and second images. In some implementations, the SIFT value is based on one or more predetermined features of the first and second images. In some implementations, the histogram value is based on the frequency of predetermined histogram values of the first and second images. In some implementations, the logic, when executed, is further operable to cause the one or more processors to perform operations including calculating an adjusted SIFT value based on the SSIM value. In some implementations, the logic, when executed, is further operable to cause the one or more processors to perform operations including calculating an adjusted histogram value based on the SSIM value and the SIFT value. In some implementations, the logic, when executed, is further operable to cause the one or more processors to perform operations including calculating a sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value.
[0006] In some implementations, a non-transitory computer-readable storage medium having program instructions thereon is provided. When executed by one or more processors, the instructions are operable to cause the one or more processors to perform operations including: receiving a first image, wherein the first image includes at least one first object; receiving a second image, wherein the second image includes at least one second object; calculating a structural similarity index metric (SSIM) value based on at least one first object and at least one second object; calculating a scale-invariant feature transform (SIFT) value based on at least one first object and at least one second object; calculating a histogram value based on at least one first object and at least one second object; and calculating a similarity score based on the SSIM value, SIFT value, and histogram value.
[0007] Further regarding the computer-readable storage medium, in some implementations, the SSIM value is based on one or more of the brightness, contrast, and structure of the first and second images. In some implementations, the SIFT value is based on one or more predetermined features of the first and second images. In some implementations, the histogram value is based on the frequency of predetermined histogram values of the first and second images. In some implementations, when executed, the instructions are further operable to cause the one or more processors to perform operations including calculating an adjusted SIFT value based on the SSIM value. In some implementations, when executed, the instructions are further operable to cause the one or more processors to perform operations including calculating an adjusted histogram value based on the SSIM value and the SIFT value. In some implementations, when executed, the instructions are further operable to cause the one or more processors to perform operations including calculating a sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value.
[0008] In some implementations, a method includes: receiving a first image, wherein the first image includes at least one first object; receiving a second image, wherein the second image includes at least one second object; calculating a structural similarity index metric (SSIM) value based on at least one first object and at least one second object; calculating a scale-invariant feature transform (SIFT) value based on at least one first object and at least one second object; calculating a histogram value based on at least one first object and at least one second object; and calculating a similarity score based on the SSIM value, SIFT value, and histogram value.
[0009] Further regarding the method, in some implementations, the SSIM value is based on one or more of the brightness, contrast, and structure of the first and second images. In some implementations, the SIFT value is based on one or more predetermined features of the first and second images. In some implementations, the histogram value is based on the frequency of predetermined histogram values of the first and second images. In some implementations, the method further includes calculating an adjusted SIFT value based on the SSIM value. In some implementations, the method further includes calculating an adjusted histogram value based on both the SSIM and SIFT values.
[0010] A further understanding of the nature and advantages of the particular implementations disclosed herein can be achieved by referring to the remainder of the specification and the accompanying drawings.
Description of the Drawings
[0011] Figure 1 is a block diagram of an example environment for evaluating similar content-based images, which can be used in the implementations described herein.
[0012] Figure 2 is an example flowchart for evaluating similar content-based images, according to some implementations.
[0013] Figure 3 is an example query image of red jalapeno peppers, according to some implementations.
[0014] Figure 4 is an example image of green jalapeno peppers found in image search results, according to some implementations.
[0015] Figure 5 is an example query image of a superhero, according to some implementations.
[0016] Figure 6 is an example image of a superhero found in image search results, according to some implementations.
[0017] Figure 7 is an example image of a person found in image search results, according to some implementations.
[0018] Figure 8 is a block diagram of an example network environment, which can be used for some of the implementations described herein.
[0019] Figure 9 is a block diagram of an example computer system, which can be used for some of the implementations described herein.
Detailed Implementation
[0020] The implementations described herein generally relate to the evaluation of similar content-based images. As described in more detail herein, the system searches for and ranks images similar to the query image. The system arranges the search results in order of similarity. Whereas such similarity is otherwise determined subjectively (e.g., using the human visual system, etc.), the system calculates it objectively, based on content and/or objects, when comparing the found images with the query image. For example, the system analyzes features and structures in the images to provide quantitative search results.
[0021] As described in more detail herein, in various implementations, the system receives a first image including at least one first object and a second image including at least one second object. The system then calculates a Structural Similarity Index (SSIM) value based on the at least one first object and the at least one second object. The system also calculates a Scale Invariant Feature Transform (SIFT) value based on the at least one first object and the at least one second object. The system further calculates a histogram value based on the at least one first object and the at least one second object. The system then calculates a similarity score based on the SSIM value, SIFT value, and histogram value. As described in more detail herein, the system adjusts the SIFT and histogram values to calculate a more accurate similarity score. Although the implementations disclosed herein are described in the context of still images, these implementations can also be applied to image frames of video.
[0022] Figure 1 is a block diagram of an example network environment 100 for evaluating similar content-based images, which can be used in some implementations described herein. In some implementations, network environment 100 includes system 102, which includes server device 104 and database 106. Network environment 100 also includes client device 110, which communicates with system 102 via network 150 of network environment 100. Network 150 can be any suitable communication network or combination of networks, which can include network types such as Bluetooth networks, Wi-Fi networks, the Internet, etc.
[0023] As described in more detail herein, system 102 receives a first image that includes at least one first object. This first image, also referred to as the query image, will be used in the image search. System 102 performs a search to find other images with similar content. In various implementations, system 102 also ranks the images based on similarity. During the search, system 102 receives at least a second image that includes at least one second object. For ease of illustration, the second image is described. In various implementations, system 102 processes many other images (e.g., hundreds of images, thousands of images, etc.) in a similar manner to the second image.
[0024] As described in more detail herein, in various implementations, system 102 calculates a Structural Similarity Index (SSIM) value based on a first object in a first image and a second object in a second image. System 102 also calculates a Scale Invariant Feature Transform (SIFT) value based on the first object and at least one second object. System 102 further calculates a histogram value based on the first and second objects. Then, system 102 calculates a similarity score based on the SSIM value, SIFT value, and histogram value. System 102 performs these steps and others on multiple images found in the search. Further implementations for the generation and execution of the survey are described in more detail herein.
[0025] For ease of explanation, Figure 1 shows one box for each of system 102, server device 104, database 106, and client device 110. Boxes 102, 104, 106, and 110 may represent multiple systems, server devices, databases, and client devices. In other implementations, environment 100 may not have all the components shown and/or may have other elements, including other types of elements, instead of or in addition to those shown herein.
[0026] While system 102 performs the implementation described herein, in other implementations, any suitable component or combination of components associated with system 102 or any suitable one or more processors associated with system 102 may facilitate the performance of the implementation described herein.
[0027] Figure 2 is an example flowchart for evaluating similar content-based images, according to some implementations. Referring to both Figure 1 and Figure 2, the method begins at box 202, where a system such as system 102 receives a first image, which includes a first object. This first image may be referred to as the query image, where system 102 performs a search for images similar to the query image. As described in more detail herein, system 102 also ranks the images found in the search based on their similarity to the query image.
[0028] Figure 3 is an example query image 300 of red jalapeno peppers, according to some implementations. Three jalapeno peppers are shown, which are red in this example. For ease of illustration, the first image may be described herein in the context of a single object. In various implementations, the first image may contain one or more objects, as shown in query image 300. The number of objects in a given image can vary depending on the specific implementation. For clarity, the terms first image and query image are used interchangeably.
[0029] At box 204, system 102 receives a second image, wherein the second image includes a second object. The second image is one of many images found by the system in an image search of images similar to the query image.
[0030] The system searches the internet and/or other suitable databases for search result images. The terms "second image" and "search result image" are used interchangeably. The system then performs the other steps described herein to compare the second image (the search result image) with the first image (the query image).
[0031] Figure 4 is an example image 400 of green jalapeno peppers found in image search results, according to some implementations. Three jalapeno peppers are shown, which are green in this example. For ease of illustration, the second image may be described herein in the context of a single object. In various implementations, one or more objects may be present in the second image, as shown in image 400. The number of objects in a given image can vary depending on the specific implementation. For clarity, the terms second image and search result image are used interchangeably.
[0032] The implementations described herein produce search result images containing objects with the same content as the objects in the query image, even if these objects have different colors. This is because such searches compare the content of the images, not just the pixels. Aspects of this type of search are described in more detail herein.
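To make the contrast with pixel-only comparison concrete, the following minimal sketch computes MSE and PSNR on tiny uniform colour patches. The patch values and the names `red_jalapeno`, `red_tomato`, and `green_jalapeno` are hypothetical stand-ins for real photographs and do not come from the source; the sketch only illustrates why a pixel-by-pixel error can favour a red tomato over a green jalapeno.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized lists of RGB pixels."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    total = 0.0
    for (r1, g1, b1), (r2, g2, b2) in zip(a, b):
        total += (r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2
    return total / (3 * len(a))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 4-pixel "images": uniform colour patches, not real photos.
red_jalapeno = [(200, 30, 30)] * 4
red_tomato = [(210, 40, 35)] * 4
green_jalapeno = [(40, 160, 50)] * 4

mse_tomato = mse(red_jalapeno, red_tomato)      # small error: similar colour
mse_green = mse(red_jalapeno, green_jalapeno)   # large error: different colour
print(mse_tomato, mse_green)
```

With these assumed values the red tomato patch has a far lower MSE (and a higher PSNR) against the red jalapeno query than the green jalapeno patch does, even though the green jalapeno has the same content as the query.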
[0033] At box 206, system 102 calculates a structural similarity index metric (SSIM) value based on one or more first objects in the query image and one or more second objects in the search result image. In various implementations, the system calculates the SSIM value based on a predetermined comparison metric between two given samples (such as the first image and the second image).
[0034] The predefined comparison metrics for SSIM values may include color, brightness, contrast, and structure. Structure may include shape. Therefore, in various implementations, SSIM values are based on one or more of the color, brightness, contrast, and structure of the first and second images. In various implementations, the system analyzes objects in the images against these metrics. Depending on the specific implementation, the predefined comparison metrics may vary.
[0035] The system calculates the SSIM value based on a first image and a second image. In some implementations, the SSIM value is a value between 0 and 1. A value of +1 indicates that the first image and the second image are very similar or identical. A value of 0 indicates that the first image and the second image are very different.
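As an illustration of how brightness, contrast, and structure can be combined into one value, here is a minimal sketch of the widely used SSIM formula evaluated over a single global window. Several points are assumptions rather than statements from the source: the images are flat lists of grayscale values, the constants use the conventional 0.01 and 0.03 factors, and no local sliding window is applied (practical SSIM implementations typically average over local windows). This form can also return slightly negative values for anti-correlated images, so an implementation that needs the 0 to 1 range described above might clamp the result.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equally sized grayscale pixel lists."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance of the two pixel lists.
    var_x = sum((p - mu_x) * (p - mu_x) for p in x) / n
    var_y = sum((q - mu_y) * (q - mu_y) for q in y) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n
    # Conventional stabilising constants (assumed, not from the source).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical 8-pixel images: same layout, dimmer copy, and a rearranged layout.
shape = [10, 10, 200, 200, 10, 10, 200, 200]
dimmer = [20, 20, 150, 150, 20, 20, 150, 150]
scrambled = [200, 10, 10, 200, 10, 200, 200, 10]

print(global_ssim(shape, dimmer))     # high: same structure, different brightness
print(global_ssim(shape, scrambled))  # near zero: same pixel values, different structure
```

The scrambled image has exactly the same pixel values as the first one, only rearranged, which is the situation where a structure-aware measure and a pixel-statistics measure diverge.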
[0036] Images such as image 300 of Figure 3 (red jalapeno peppers) and image 400 of Figure 4 (green jalapeno peppers) are visually similar because they are structurally similar. They have similar shapes (e.g., the shape of a jalapeno pepper) and similar features (e.g., the stem of a jalapeno pepper). In this example, the system calculates an SSIM value close to 1. Therefore, even though image 400 contains objects whose colors differ from those in image 300, the system still includes image 400 in the set of images similar to image 300. This is because the system analyzes the content of the images (not just the color). Based at least on structural similarity, search result image 400 is similar to query image 300. In other words, images 300 and 400 are similar based on content. Therefore, the implementations described herein provide blind evaluation not only of visually similar images but also of images that are similar in content or features.
[0037] At box 208, system 102 computes Scale Invariant Feature Transform (SIFT) values based on a first object and a second object. In various implementations, the SIFT values are based on one or more predetermined features of the first and second images. In various implementations, SIFT is a feature detection algorithm in computer vision. The SIFT algorithm finds correspondences between the first and second images. These correspondences of local features in each image can be referred to as keypoints of the image. These keypoints are scaling and rotation invariants that can be used to evaluate the degree of similarity between the first and second images. In addition to being based on keypoints, SIFT values are also based on the background of the search result image compared to the background of the query image. In various implementations, the system analyzes the keypoints of the objects and the background in the images for these metrics.
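The source does not state how keypoint correspondences are turned into a single SIFT value. One plausible reading, sketched below purely as an assumption, is to match keypoint descriptors with the nearest-neighbour ratio test commonly used with SIFT and to report the fraction of query keypoints that find a distinctive match. The function name `match_fraction`, the 0.75 ratio, and the toy 4-dimensional descriptors are all hypothetical (real SIFT descriptors have 128 dimensions), and the background comparison mentioned above is not modelled.

```python
import math

def match_fraction(query_desc, result_desc, ratio=0.75):
    """Fraction of query descriptors whose nearest neighbour in the result
    image is clearly closer than the second-nearest (ratio test)."""
    if not query_desc or len(result_desc) < 2:
        return 0.0
    matched = 0
    for q in query_desc:
        dists = sorted(math.dist(q, r) for r in result_desc)
        best, second = dists[0], dists[1]
        if best < ratio * second:
            matched += 1
    return matched / len(query_desc)

# Hypothetical toy descriptors standing in for detected keypoints.
query = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
similar = [(0.9, 0.1, 0, 0), (0.1, 0.9, 0, 0), (0, 0, 0.95, 0.05), (0, 0, 0, 1)]
unrelated = [(0.5, 0.5, 0.5, 0.5), (0.5, 0.5, 0.5, 0.4)]

print(match_fraction(query, similar))    # every query keypoint finds a clear match
print(match_fraction(query, unrelated))  # no keypoint passes the ratio test
```

A value produced this way falls between 0 and 1, so it could be fed into the similarity expression described later without rescaling.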
[0038] The example above involves finding a green jalapeno pepper in a search result image that resembles a red jalapeno pepper in a query image. In this example, there is no background in the image to consider. In various implementations, the system includes additional SIFT values to provide greater accuracy in determining the similarity between a given search result image and a query image, especially when the image content is more complex, such as when there is a background.
[0039] The example query image of Figure 5 and the associated search result images of Figure 6 and Figure 7 are more complex in their content. As described in more detail herein in connection with the remaining steps of Figure 2 and the images of Figure 5, Figure 6, and Figure 7, the system applies additional analysis and calculates additional values. These additional analyses and values enable the system to find similar search result images more accurately and to rank them more accurately based on similarity. The following describes the images of Figure 5, Figure 6, and Figure 7, followed by a description of the additional analysis and calculations that the system performs when searching for and ranking search result images.
[0040] Figure 5 is an example query image 500 of a superhero, according to some implementations. The content of a given query image can vary depending on the specific implementation. For example, in various implementations, the superhero in image 500 could wear a suit with a specific color combination (e.g., black and yellow, black and red, etc.), containing any number of colors in the combination. The suit may include a mask that also has a specific color combination. Both the suit and the associated mask can have other distinguishing markings or patterns. As shown, the superhero covers most of the image area. Furthermore, the background in the image is mostly a darker color, with nondescript lighter areas.
[0041] Figure 6 is an example image 600 of a superhero found in image search results, according to some implementations. In this example image, the content includes the same superhero as the one shown in query image 500. In this example, the superhero in image 600 is wearing the same suit, including the same mask, as the superhero in image 500 of Figure 5. Furthermore, the superhero covers less than half of the image area, and the background is mostly a lighter color, with an aerial perspective of what appear to be city buildings.
[0042] Figure 7 is an example image 700 of a person found in image search results, according to some implementations. In this example image, the content includes a person. The person is wearing a suit that differs from the suit of the superhero in images 500 and 600 of Figure 5 and Figure 6 but has some similar colors. The person is also wearing a mask that differs from the superhero mask in images 500 and 600 of Figure 5 and Figure 6.
[0043] In various implementations, the system calculates the adjusted SIFT value based on the SSIM value. For example, the system can calculate the adjusted SIFT value by multiplying the SIFT value (associated with the given search result image and query image) by an SSIM factor (associated with the given search result image and query image), which is the factor (1-SSIM) in the expression of paragraph [0048]. Regarding images 500, 600, and 700, the system can include images 600 and 700 in the set of search result images similar to query image 500.
[0044] At box 210 of Figure 2, system 102 calculates histogram values based on the first and second objects. The histogram shows how frequently or infrequently certain values appear in the first and second images. In various implementations, the histogram of an image is the distribution of its discrete intensity levels, which range from 0 to L-1, where L is the number of intensity levels (e.g., 0 to 255 for an 8-bit image). In various implementations, histogram values are also based on various factors such as atmosphere and brightness. For example, a given image may have certain levels of brightness in specific areas of the image. Such brightness levels can indicate the time of day (e.g., morning, afternoon, evening, etc.). Such brightness levels can also indicate whether objects in a given image are indoors or outdoors, etc.
[0045] In various implementations, the histogram values are based on the frequencies of predetermined histogram values from the first and second images. In various implementations, the histogram is a normalized histogram. Furthermore, histogram normalization involves transforming a discrete distribution of intensity into a discrete distribution of probability. To perform this transformation, the system divides each histogram value by the number of pixels.
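The normalization step described above can be sketched as follows. Dividing each bin count by the number of pixels comes from the text; comparing two normalized histograms by histogram intersection is an assumption added here so that the sketch yields a single value between 0 and 1, and the function names and the 4-level toy images are hypothetical.

```python
def normalized_histogram(pixels, levels=256):
    """Count each intensity level, then divide by the number of pixels so
    the bins form a discrete probability distribution."""
    if not pixels:
        raise ValueError("image must contain at least one pixel")
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    n = len(pixels)
    return [c / n for c in counts]

def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms: 1.0 when identical, 0.0 when
    they share no intensity levels (an assumed comparison measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical 4-pixel images with only 4 intensity levels (0 to L-1, L = 4).
hist_a = normalized_histogram([0, 0, 1, 3], levels=4)
hist_b = normalized_histogram([0, 1, 1, 3], levels=4)

print(hist_a)                                  # [0.5, 0.25, 0.0, 0.25]
print(histogram_intersection(hist_a, hist_b))  # 0.75
```

Because each normalized histogram sums to 1, the intersection cannot exceed 1, which keeps the value in the same range as the SSIM and SIFT values.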
[0046] In various implementations, the system calculates the adjusted histogram value based on the SSIM and SIFT values. For example, the system can calculate the adjusted histogram value by multiplying the histogram value by both an SSIM factor and a SIFT factor, which are the factors (1-SSIM) and (1-SIFT) in the expression of paragraph [0048].
[0047] At box 212, system 102 calculates a similarity score based on SSIM, SIFT, and histogram values. In various implementations, the system applies the following expression to the first and second images:
[0048] SSIM+(1-SSIM)*SIFT+(1-SSIM)*(1-SIFT)*histogram.
[0049] The system calculates a similarity score ranging from 0 to 1. Furthermore, as described herein, the system calculates a similarity score for each search result image found during the search.
[0050] In various implementations, to calculate the similarity score, the system calculates the sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value. The similarity score approaches 0 as at least one first object and at least one second object become less similar, and approaches 1 as at least one first object and at least one second object become more similar.
[0051] In the example calculation of SSIM + (1-SSIM)*SIFT + (1-SSIM)*(1-SIFT)*histogram, the system can calculate an SSIM value of 0.8, which indicates relatively high similarity (e.g., similar objects). The system can calculate a SIFT value of 0.1, which indicates relatively low similarity (e.g., dissimilar backgrounds). The system can calculate the adjusted SIFT value as (1-0.8)*0.1 or (0.2)*0.1 or 0.02. The system can calculate a histogram value of 0.1, which indicates relatively low similarity (e.g., dissimilar atmospheres, lighting, etc.). The system can calculate the adjusted histogram value as (1-0.8)*(1-0.1)*0.1 or (0.2)*(0.9)*(0.1) or 0.018. The resulting similarity score will be 0.8 + 0.02 + 0.018, or 0.838.
[0052] In various implementations, the system ranks search result images 600 and 700 (among other search result images) based on their similarity to the query image. Referring to query image 500 of Figure 5 and the corresponding search result images 600 and 700 of Figure 6 and Figure 7, for example, the system ranks image 600 higher than image 700 based on similarity and the resulting similarity scores. For instance, based on its SSIM value, image 600 includes an object (the same superhero) that is similar to the superhero in query image 500. Although image 600 includes a background that differs from the background of query image 500 based on its SIFT value, the SSIM value carries a greater weight than the SIFT value. Therefore, the system ranks image 600 higher than image 700 based on the SSIM and SIFT values alone.
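The expression SSIM + (1-SSIM)*SIFT + (1-SSIM)*(1-SIFT)*histogram can be written as a small function. This is a sketch: the function name, the argument names, and the range check are assumptions, while the expression and the input values 0.8, 0.1, and 0.1 are taken from the text.

```python
def similarity_score(ssim, sift, hist):
    """Sum of the SSIM value, the adjusted SIFT value, and the adjusted
    histogram value, with all three inputs expected in the range 0 to 1."""
    for name, value in (("ssim", ssim), ("sift", sift), ("hist", hist)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    adjusted_sift = (1.0 - ssim) * sift                 # scaled by (1-SSIM)
    adjusted_hist = (1.0 - ssim) * (1.0 - sift) * hist  # scaled by (1-SSIM)*(1-SIFT)
    return ssim + adjusted_sift + adjusted_hist

score = similarity_score(0.8, 0.1, 0.1)
print(round(score, 4))  # 0.838
```

The same expression is algebraically equal to 1 - (1-SSIM)*(1-SIFT)*(1-histogram), which shows why the score stays between 0 and 1 whenever the three inputs do, and why a high SSIM value dominates the result.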
[0053] Furthermore, although image 600 includes an atmosphere different from the query image 500 based on histogram values, the SSIM value carries a greater weight than the histogram value. Therefore, the system will rank image 600 higher than image 700 solely based on SSIM and histogram values.
[0054] Traditional ranking techniques might rank image 700 higher than image 600 based solely on the number of pixels viewed. Because system 102 analyzes and compares images based on SSIM, SIFT, and histogram values, it ranks the search result images more accurately.
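The ranking step can be sketched by scoring each search result image and sorting in descending order. The SSIM, SIFT, and histogram values assigned to images 600 and 700 below are hypothetical and chosen only for illustration; the source gives no measured values for these images. The scoring function repeats the expression of paragraph [0048].

```python
def similarity_score(ssim, sift, hist):
    """SSIM + (1-SSIM)*SIFT + (1-SSIM)*(1-SIFT)*histogram."""
    return ssim + (1.0 - ssim) * sift + (1.0 - ssim) * (1.0 - sift) * hist

# Hypothetical (SSIM, SIFT, histogram) values for two search result images.
results = {
    "image_600": (0.8, 0.1, 0.1),  # same object, different background and lighting
    "image_700": (0.3, 0.4, 0.5),  # different object, more similar colours
}

# Rank from most similar to least similar to the query image.
ranked = sorted(results, key=lambda name: similarity_score(*results[name]),
                reverse=True)
for name in ranked:
    print(name, round(similarity_score(*results[name]), 3))
```

With these assumed inputs, image 600 scores 0.838 and image 700 scores 0.79, so image 600 ranks first even though its SIFT and histogram values are lower, reflecting the greater weight carried by the SSIM value.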
[0055] Although steps, operations, or calculations may be presented in a particular order, this order can be changed in a particular implementation. Depending on the specific implementation, other orders of steps are possible. In some specific implementations, multiple steps shown in sequence in this specification may be performed simultaneously. Furthermore, some implementations may not have all the steps shown and / or may have alternative steps to those shown herein or other steps besides those shown herein.
[0056] The implementations described herein offer various benefits. For example, they can also be applied to reverse image search, label matching, image tracking, and image recognition. Furthermore, they can be used for keyframe search, similar image search, and reverse search of images and/or videos.
[0057] Figure 8 is a block diagram of an example network environment 800, which can be used in some implementations described herein. In some implementations, network environment 800 includes system 802, which includes server device 804 and database 806. For example, system 802 can be used to implement system 102 of Figure 1 and to perform the implementations described herein. Network environment 800 also includes client devices 810, 820, 830, and 840, which can communicate with system 802 and/or with each other, directly or via system 802. Network environment 800 also includes network 850, through which system 802 and client devices 810, 820, 830, and 840 communicate. Network 850 can be any suitable communication network, such as a Wi-Fi network, a Bluetooth network, the Internet, etc.
[0058] For ease of explanation, Figure 8 shows one box for each of system 802, server device 804, and network database 806, and four boxes for client devices 810, 820, 830, and 840. Boxes 802, 804, and 806 can represent multiple systems, server devices, and network databases. Furthermore, any number of client devices can be present. In other implementations, environment 800 may not have all the components shown and/or may have other elements, including other types of elements, instead of or in addition to those shown herein.
[0059] While the server device 804 of system 802 performs the implementation described herein, in other implementations, any suitable component or combination of components associated with system 802 or any suitable one or more processors associated with system 802 may facilitate the performance of the implementation described herein.
[0060] In the various implementations described herein, the processor of system 802 and / or the processor of any client device 810, 820, 830 and 840 enable the elements described herein (e.g., information, etc.) to be displayed in a user interface on one or more displays.
[0061] Figure 9 is a block diagram of an example computer system 900, which can be used in some of the implementations described herein. For example, computer system 900 can be used to implement server device 804 of Figure 8 and/or system 102 of Figure 1, and to perform the implementations described herein. In some implementations, computer system 900 may include processor 902, operating system 904, memory 906, and input/output (I/O) interface 908. In various implementations, processor 902 may be used to implement the various functions and features described herein and to perform the implementations described herein. Although processor 902 is described as performing the implementations described herein, any suitable component or combination of components of computer system 900, or any suitable one or more processors associated with computer system 900 or any suitable system, may perform the described steps. The implementations described herein may be performed on user equipment, servers, or a combination of both.
[0062] Computer system 900 also includes software application 910, which may be stored in memory 906 or any other suitable storage location or computer-readable medium. Software application 910 provides instructions that enable processor 902 to perform the implementation schemes and other functions described herein. The software application may also include engines, such as network engines for performing various functions associated with one or more networks and network communications. Components of computer system 900 may be implemented by any combination of one or more processors or hardware devices, and any combination of hardware, software, firmware, etc.
[0063] For ease of explanation, Figure 9A box is shown for each of the processor 902, operating system 904, memory 906, I / O interface 908, and software application 910. These boxes 902, 904, 906, 908, and 910 may represent multiple processors, operating systems, memory, I / O interfaces, and software applications. In various implementations, the computer system 900 may not have all the components shown and / or may have other elements including alternatives to those shown herein or other types of components besides those shown herein.
[0064] Although specific implementations have been described, these are illustrative rather than restrictive. The concepts illustrated in the examples can be applied to other examples and implementations.
[0065] In various implementations, the software is encoded in one or more non-transitory computer-readable media for execution by one or more processors. When executed by one or more processors, the software is operable to perform the implementations and other functions described herein.
[0066] Routines for a specific implementation can be implemented using any suitable programming language, including C, C++, C#, Java, JavaScript, assembly language, etc. Different programming techniques can be employed, such as procedural or object-oriented. Routines can execute on a single processing device or multiple processors. Although the steps, operations, or calculations may be presented in a specific order, this order can be changed in different specific implementations. In some specific implementations, multiple steps shown in this specification as being in sequence can be executed simultaneously.
[0067] Specific implementations may be implemented in a non-transitory computer-readable storage medium (also known as a machine-readable storage medium) for use by or in conjunction with an instruction execution system, apparatus, or device. Specific implementations may be implemented as control logic in software, hardware, or a combination of both. This control logic, when executed by one or more processors, is operable to perform the implementations described herein and other functions. For example, tangible media such as hardware storage devices may be used to store the control logic, which may include executable instructions.
[0068] A particular implementation can be achieved using a programmable general-purpose digital computer and/or by using application-specific integrated circuits (ASICs), programmable logic devices, field-programmable gate arrays (FPGAs), or optical, chemical, biological, quantum, or nanoengineered systems, components, and mechanisms. In general, the functionality of a particular implementation can be achieved by any means known in the art. Distributed, networked systems, components, and/or circuits can be used. Data communication or transmission can be wired, wireless, or by any other means.
[0069] A “processor” can include any suitable hardware and/or software system, mechanism, or component that processes data, signals, or other information. A processor can include a system having a general-purpose central processing unit, multiple processing units, dedicated circuitry for implementing functionality, or other systems. Processing is not geographically or temporally limited. For example, a processor can perform its functions in a “real-time,” “offline,” “batch mode,” or similar manner. Parts of the processing can be performed at different times and locations by different (or the same) processing systems. A computer can be any processor that communicates with memory. Memory can be any suitable data storage device, storage medium, and/or non-transitory computer-readable storage medium, including electronic storage devices such as random access memory (RAM), read-only memory (ROM), magnetic storage devices (hard disk drives, etc.), flash memory, optical storage devices (CDs, DVDs, etc.), magnetic disks or optical discs, or other tangible media suitable for storing instructions (e.g., program or software instructions) for processor execution. For example, tangible media such as hardware storage devices can be used to store control logic, which may include executable instructions. Instructions can also be included in and provided as electronic signals, for example in the form of Software as a Service (SaaS) delivered from a server (e.g., a distributed system and/or a cloud computing system).
[0070] It should also be understood that one or more elements depicted in the accompanying drawings/figures may also be implemented in a more separate or integrated manner, or even removed or rendered inoperable in some cases, depending on their usefulness for a particular application. Implementations may also be stored in machine-readable media as programs or code that allow a computer to execute any of the methods described above.
[0071] As used in the description herein and throughout the following claims, “a,” “an,” and “the” include plural references unless the context clearly specifies otherwise. Furthermore, as used in the description herein and throughout the following claims, “in” means both “in” and “on” unless the context clearly specifies otherwise.
[0072] Therefore, while specific implementations have been described herein, the foregoing disclosure is intended to allow for modifications, various changes, and substitutions. It should be understood that in some cases certain features of a specific implementation will be used without a corresponding use of other features, without departing from the scope and spirit set forth. Thus, many modifications can be made to adapt a particular situation or material to the essential scope and spirit.
Claims
1. A system comprising:
one or more processors; and
logic encoded in one or more non-transitory computer-readable storage media for execution by the one or more processors, the logic, when executed, being operable to cause the one or more processors to perform operations including:
receiving a first image, wherein the first image includes at least one first object;
receiving a second image, wherein the second image includes at least one second object;
calculating a structural similarity index (SSIM) value based on the at least one first object and the at least one second object;
calculating a scale-invariant feature transform (SIFT) value based on the at least one first object and the at least one second object;
calculating an adjusted SIFT value by multiplying the SIFT value by an SSIM factor that decreases as the SSIM value increases;
calculating a histogram value based on the at least one first object and the at least one second object;
calculating an adjusted histogram value by multiplying the histogram value by the SSIM factor and then by a SIFT factor that decreases as the SIFT value increases;
calculating a sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value; and
calculating a similarity score based on the sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value.
2. The system of claim 1, wherein the SSIM value is based on one or more of the brightness, contrast, and structure of the first image and the second image.
3. The system of claim 1, wherein the SIFT value is based on one or more predetermined features of the first image and the second image.
4. The system of claim 1, wherein the histogram value is based on the frequency of predetermined histogram values of the first image and the second image.
5. A non-transitory computer-readable storage medium having program instructions stored thereon, the program instructions being operable, when executed by one or more processors, to cause the one or more processors to perform operations including:
receiving a first image, wherein the first image includes at least one first object;
receiving a second image, wherein the second image includes at least one second object;
calculating a structural similarity index (SSIM) value based on the at least one first object and the at least one second object;
calculating a scale-invariant feature transform (SIFT) value based on the at least one first object and the at least one second object;
calculating an adjusted SIFT value by multiplying the SIFT value by an SSIM factor that decreases as the SSIM value increases;
calculating a histogram value based on the at least one first object and the at least one second object;
calculating an adjusted histogram value by multiplying the histogram value by the SSIM factor and then by a SIFT factor that decreases as the SIFT value increases;
calculating a sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value; and
calculating a similarity score based on the sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value.
6. The computer-readable storage medium of claim 5, wherein the SSIM value is based on one or more of the brightness, contrast, and structure of the first image and the second image.
7. The computer-readable storage medium of claim 5, wherein the SIFT value is based on one or more predetermined features of the first image and the second image.
8. The computer-readable storage medium of claim 5, wherein the histogram value is based on the frequency of predetermined histogram values of the first image and the second image.
9. A computer-implemented method for evaluating similar content-based images, the method comprising:
receiving a first image, wherein the first image includes at least one first object;
receiving a second image, wherein the second image includes at least one second object;
calculating a structural similarity index (SSIM) value based on the at least one first object and the at least one second object;
calculating a scale-invariant feature transform (SIFT) value based on the at least one first object and the at least one second object;
calculating an adjusted SIFT value by multiplying the SIFT value by an SSIM factor that decreases as the SSIM value increases;
calculating a histogram value based on the at least one first object and the at least one second object;
calculating an adjusted histogram value by multiplying the histogram value by the SSIM factor and then by a SIFT factor that decreases as the SIFT value increases;
calculating a sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value; and
calculating a similarity score based on the sum of the SSIM value, the adjusted SIFT value, and the adjusted histogram value.
10. The method of claim 9, wherein the SSIM value is based on one or more of the brightness, contrast, and structure of the first and second images.
11. The method of claim 9, wherein the SIFT value is based on one or more predetermined features of the first image and the second image.
12. The method of claim 9, wherein the histogram value is based on the frequency of predetermined histogram values of the first image and the second image.
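The score combination recited in claims 1, 5, and 9 can be illustrated with a minimal Python sketch. The claims only require that each factor decreases as its value increases; the choice `factor = 1 - value`, the assumption that all three input values are already normalized to the range [0, 1], the use of the plain sum as the similarity score, and the names `decreasing_factor` and `similarity_score` are illustrative assumptions, not taken from the claims. The sketch takes precomputed SSIM, SIFT, and histogram values as inputs and does not show how those values are obtained from the images.

```python
def decreasing_factor(value: float) -> float:
    """Hypothetical factor that decreases as `value` increases.

    Assumes `value` is normalized to [0, 1]; out-of-range inputs are clamped.
    The linear form 1 - value is an assumption made for this sketch.
    """
    clamped = min(max(value, 0.0), 1.0)
    return 1.0 - clamped


def similarity_score(ssim: float, sift: float, hist: float) -> float:
    """Combine precomputed SSIM, SIFT, and histogram values into one score."""
    ssim_factor = decreasing_factor(ssim)  # shrinks as the SSIM value grows
    sift_factor = decreasing_factor(sift)  # shrinks as the SIFT value grows

    # Adjusted SIFT value: SIFT value multiplied by the SSIM factor.
    adjusted_sift = sift * ssim_factor
    # Adjusted histogram value: histogram value multiplied by the SSIM factor
    # and then by the SIFT factor.
    adjusted_hist = hist * ssim_factor * sift_factor

    # Assumption: the similarity score is taken to be the sum itself.
    return ssim + adjusted_sift + adjusted_hist


if __name__ == "__main__":
    print(similarity_score(0.5, 0.5, 0.5))
```

With this particular factor choice, inputs in [0, 1] always yield a sum in [0, 1], and the histogram value contributes only to the extent that the SSIM and SIFT values leave room for it. Other decreasing factors would satisfy the claim language equally well.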