[0057] Referring to Figure 1, Embodiment 1 of an image stitching method of the present invention is shown, which may specifically include:
[0058] Step 101: For the first image and the second image to be spliced, respectively extract the key points of each image and the characteristic parameters of the key points; the first image and the second image may be selected by the user from an image library;
[0059] Step 102: Acquire corresponding key point pairs between the first image and the second image;
[0060] Step 103: Obtain the transformation relationship between the image point position coordinates of the two images according to the key point pairs;
[0061] For example, polynomial fitting regression may be used to obtain the transformation relationship:
[0062] $u = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 xy + a_5 y^2$
$v = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2$
[0063] Substituting the obtained position data (x, y) of the key point pairs into the above equations, a homogeneous least squares equation system can be established and solved to obtain the parameters $a_i$, $b_i$, where i = 0–5.
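The following is a minimal sketch, not the claimed implementation, of how the parameters $a_i$, $b_i$ could be estimated by least squares from matched key point pairs; the function name, use of NumPy, and array layout are illustrative assumptions.

```python
import numpy as np

def fit_polynomial_transform(src_pts, dst_pts):
    """Fit u = a0 + a1*x + a2*y + a3*x^2 + a4*x*y + a5*y^2 (and v likewise)
    from matched key point pairs by least squares.

    src_pts: (N, 2) array of (x, y) coordinates in the second image.
    dst_pts: (N, 2) array of (u, v) coordinates in the first image.
    Returns (a, b): two length-6 parameter vectors.
    """
    x, y = src_pts[:, 0], src_pts[:, 1]
    # Design matrix containing the six polynomial terms for every point.
    A = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    a, *_ = np.linalg.lstsq(A, dst_pts[:, 0], rcond=None)  # parameters a0..a5
    b, *_ = np.linalg.lstsq(A, dst_pts[:, 1], rcond=None)  # parameters b0..b5
    return a, b
```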
[0064] Step 104: Transform each image point on the second image according to the transformation relationship; for example, the transformation relation equations determined by the above parameters $a_i$, $b_i$ are used to perform coordinate transformation on each image point of the second image;
[0065] Step 105: Splice the transformed second image with the first image to obtain a spliced image. In fact, the transformation process in step 104 obtains the new coordinates of each image point of the second image when it is spliced onto the first image; once both images are in the same coordinate system, splicing the two is easily realized.
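Continuing the same illustrative assumptions, a minimal sketch of steps 104 and 105 could forward-map each pixel of the second image with the fitted polynomial and write it onto a shared canvas that already contains the first image. A real implementation would normally use inverse mapping with interpolation; single-channel images and forward mapping are used here only for brevity.

```python
import numpy as np

def splice(img1, img2, a, b, canvas_shape):
    """Forward-map every pixel of img2 with the fitted polynomial transform
    and paste it onto a canvas holding img1 (gray-scale, illustrative only)."""
    canvas = np.zeros(canvas_shape, dtype=img1.dtype)
    canvas[:img1.shape[0], :img1.shape[1]] = img1          # place the first image
    h, w = img2.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    xs, ys = xs.ravel(), ys.ravel()
    # Apply the fitted second-order polynomial transformation.
    u = a[0] + a[1]*xs + a[2]*ys + a[3]*xs**2 + a[4]*xs*ys + a[5]*ys**2
    v = b[0] + b[1]*xs + b[2]*ys + b[3]*xs**2 + b[4]*xs*ys + b[5]*ys**2
    u, v = np.round(u).astype(int), np.round(v).astype(int)
    # Keep only pixels that land inside the canvas.
    ok = (u >= 0) & (u < canvas_shape[1]) & (v >= 0) & (v < canvas_shape[0])
    canvas[v[ok], u[ok]] = img2[ys[ok], xs[ok]]
    return canvas
```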
[0066] Since the image captured by the user through the camera device (generally dynamically captured) may suffer from blurring, noise interference and similar problems, in order to ensure image quality, another preferred embodiment of the present invention may further include step 100 before step 101: the image to be spliced is pre-processed, where the pre-processing may include noise reduction or gray-scale transformation and, if necessary, pre-processing such as coordinate transformation may also be performed.
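As one possible illustration of step 100 (not the specific pre-processing claimed), noise reduction and gray-scale transformation could be performed with standard OpenCV routines; the choice of a Gaussian filter and its kernel size is an assumption made only for this sketch.

```python
import cv2

def preprocess(img):
    """Illustrative pre-processing: gray-scale transformation followed by
    Gaussian noise reduction (kernel size chosen arbitrarily for the sketch)."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # gray-scale transformation
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)   # simple noise reduction
    return denoised
```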
[0067] In addition, since the images to be spliced selected by the user may differ in the original camera angle, brightness and other acquisition conditions, the two spliced images may have different image characteristics. Therefore, another preferred embodiment of the present invention may further include step 106 after step 105: adjusting the image characteristics of the second image according to the first image, where the image characteristics include illumination characteristics, resolution, and so on. For example, the illumination characteristics of the second image may be adjusted according to the illumination characteristics of the first image so as to adapt to the first image; or the resolution of the second image may be adjusted by up-sampling or down-sampling according to the resolution of the first image so as to adapt to the first image; or the stitching boundary of the two images may be smoothed, and so on.
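A minimal sketch of the adjustments described for step 106, under the assumption that OpenCV and NumPy are available; the simple mean-intensity scaling and bilinear resize stand in for whatever illumination and resolution adaptation the embodiment actually employs.

```python
import cv2
import numpy as np

def adapt_second_to_first(img1, img2):
    """Illustrative step 106: match the second image's resolution and rough
    illumination level to those of the first image."""
    # Resolution: up- or down-sample img2 to img1's pixel size.
    img2 = cv2.resize(img2, (img1.shape[1], img1.shape[0]),
                      interpolation=cv2.INTER_LINEAR)
    # Illumination: scale img2 so its mean intensity matches img1's mean.
    gain = float(img1.mean()) / max(float(img2.mean()), 1e-6)
    img2 = np.clip(img2.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return img2
```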
[0068] The following briefly introduces a feasible way of acquiring image key points and their characteristic parameters: using the scale-invariant feature transform (SIFT) algorithm to extract the key points and characteristic parameters of each image.
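For orientation only, the extraction can be sketched with OpenCV's built-in SIFT (cv2.SIFT_create, available in opencv-python 4.4 and later); this is one possible realization, not the patented procedure itself.

```python
import cv2

def extract_sift(gray_img):
    """Extract SIFT key points (location, scale, direction) and their
    128-dimensional descriptors from a gray-scale image."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray_img, None)
    return keypoints, descriptors
```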
[0069] The SIFT algorithm extracts a large number of features represented by high-dimensional descriptors distributed over different scales. A brief description of the SIFT algorithm process is as follows:
[0070] (1) Detect extreme points in scale space
[0071] In order to effectively detect stable key points in the scale space, the present invention can adopt the difference-of-Gaussian (DoG) scale space, in which Gaussian difference kernels at different scales are convolved with the image to construct an image pyramid and generate the required scale space.
[0072] $D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$
[0073] Among them, (x, y) are the spatial coordinates, σ is the scale coordinate, and G(x, y, σ) is the scale-variable Gaussian function,
[0074] $G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/2\sigma^2}$
[0075] Specifically, it is assumed that the constructed image pyramid has a total of P groups, and each group has S layers, where the images of the next group are obtained by down-sampling the images of the previous group.
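A simplified sketch of building one group (octave) of the DoG scale space; the values of sigma, k, and the number of layers S are illustrative, and cv2.GaussianBlur stands in for the Gaussian convolution G(x, y, σ) * I(x, y).

```python
import cv2
import numpy as np

def dog_octave(img, S=5, sigma=1.6):
    """Build one group of the DoG pyramid: blur the image with increasing
    scales k^i * sigma and subtract adjacent blurred layers."""
    k = 2 ** (1.0 / (S - 1))                      # scale ratio between layers
    blurred = [cv2.GaussianBlur(img, (0, 0), sigma * (k ** i)) for i in range(S)]
    dog = [blurred[i + 1].astype(np.float32) - blurred[i].astype(np.float32)
           for i in range(S - 1)]
    return dog  # the next group would start from a down-sampled image
```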
[0076] (2) Precise positioning of extreme points
[0077] In order to find the extreme points of the scale space, each sampling point must be compared with all of its neighboring points to determine whether it is larger or smaller than the neighboring points in its image domain and scale domain. As shown in Figure 2, the detection point "X" in the middle is compared with its 8 neighboring points at the same scale and the 9×2 points at the corresponding positions in the upper and lower neighboring scales, a total of 26 points "O", to ensure that extreme points are detected in both scale space and two-dimensional image space.
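A sketch of the 26-neighbor comparison described above, assuming the DoG layers of one group have been stacked into a 3-D NumPy array; border handling is omitted, so (s, y, x) must lie at least one sample away from every edge of the stack.

```python
import numpy as np

def is_extremum(dog_stack, s, y, x):
    """Return True if dog_stack[s, y, x] is larger or smaller than all of its
    26 neighbours (8 in the same layer, 9 in the layer above and 9 below)."""
    val = dog_stack[s, y, x]
    cube = dog_stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]  # 3x3x3 neighbourhood
    neighbours = np.delete(cube.flatten(), 13)               # drop the centre point
    return val > neighbours.max() or val < neighbours.min()
```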
[0078] (3) Specify direction parameters for each key point
[0079] In this step, the gradient direction distribution characteristics of the pixels in the neighborhood of the key point can be used to specify a direction parameter for each key point, so that the DoG operator has rotation invariance.
[0080] $m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$
[0081] $\theta(x, y) = \operatorname{atan2}\big(L(x, y+1) - L(x, y-1),\ L(x+1, y) - L(x-1, y)\big)$
[0082] The above formulas give the gradient magnitude and direction at coordinates (x, y), where the scale used for L is the scale at which each key point is located.
[0083] In actual calculations, the present invention can sample in a neighborhood window centered on the key point and use a histogram to count the gradient directions of the neighborhood pixels. The gradient histogram ranges from 0 to 360 degrees, with one bin every 10 degrees, for a total of 36 bins. The peak of the histogram represents the main direction of the neighborhood gradient at the key point, that is, the direction of the key point. Figure 3 is an example of determining the main direction of a key point using a gradient histogram with 7 bins.
[0084] In the gradient direction histogram, when there is another peak whose energy is equivalent to 80% of that of the main peak, this direction is regarded as an auxiliary direction of the key point. A key point may thus be assigned multiple directions (one main direction and several auxiliary directions), which enhances the robustness of matching.
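A sketch, under illustrative assumptions, of the 36-bin orientation histogram and the 80% rule for auxiliary directions; L is a gray-scale layer at the key point's scale, the window radius is arbitrary, Gaussian weighting is omitted, and the window is assumed to lie fully inside the image.

```python
import numpy as np

def keypoint_orientations(L, x, y, radius=8):
    """Build a 36-bin gradient-direction histogram (10 degrees per bin) around
    (x, y) and return the main direction together with any auxiliary
    directions whose peaks reach 80% of the main peak."""
    hist = np.zeros(36)
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            dx = float(L[j, i + 1]) - float(L[j, i - 1])
            dy = float(L[j + 1, i]) - float(L[j - 1, i])
            m = np.hypot(dx, dy)                          # gradient magnitude
            theta = np.degrees(np.arctan2(dy, dx)) % 360  # gradient direction
            hist[int(theta // 10) % 36] += m
    main = hist.max()
    # Directions (in degrees) of all peaks with at least 80% of the main energy.
    return [b * 10 for b, v in enumerate(hist) if v >= 0.8 * main]
```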
[0085] At this point, the key points of the image have basically been detected. Each key point includes three pieces of information: location, scale, and direction; from this, a SIFT feature area can be determined.
[0086] (4) Generate key point descriptors
[0087] This step is used to generate descriptors (characteristic parameters) from the above three pieces of key point information to facilitate subsequent calculations; of course, the present invention does not need to limit the specific form of the characteristic parameters.
[0088] First, the coordinate axes are rotated to the direction of the key point to ensure rotation invariance. Next, an 8×8 window centered on the key point is taken. Referring to Figure 4, the left part shows the neighborhood gradient directions: the central black point is the position of the current key point, each small grid represents a pixel in the scale space where the key point neighborhood is located, the arrow direction represents the gradient direction of the pixel, the arrow length represents the gradient magnitude, and the circle on the periphery of the figure represents the range of Gaussian weighting (the closer a pixel is to the key point, the greater the contribution of its gradient direction information).
[0089] Then, a gradient direction histogram in 8 directions is calculated on every 4×4 small block, and the cumulative value of each gradient direction is drawn to form a seed point, as shown in the right part of Figure 4 (key point feature vector illustration). In this figure, a key point is composed of 2×2 = 4 seed points, and each seed point carries 8 direction vector components. This idea of combining neighborhood direction information enhances the algorithm's resistance to noise and at the same time provides better fault tolerance for feature matching in the presence of positioning errors.
[0090] Preferably, in order to enhance the robustness of matching, the present invention can use 4×4 = 16 seed points to describe each key point, so that 128 values are generated for one key point, finally forming a 128-dimensional SIFT feature vector. At this point, the SIFT feature vector has removed the influence of geometric deformation factors such as scale change and rotation; if the feature vector is then normalized to unit length, the influence of illumination changes can be further removed.
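A condensed sketch of assembling the 4×4×8 = 128-dimensional descriptor from per-cell 8-bin gradient histograms and normalizing its length to reduce illumination effects; the 16×16 window, the cell layout, and the omission of rotation to the key point direction and of Gaussian weighting are simplifications made only for illustration.

```python
import numpy as np

def sift_descriptor(L, x, y):
    """Simplified 128-dimensional descriptor: a 16x16 window around (x, y)
    split into 4x4 cells, an 8-bin direction histogram per cell, followed
    by length normalization."""
    desc = np.zeros((4, 4, 8))
    for dy_off in range(-8, 8):
        for dx_off in range(-8, 8):
            j, i = y + dy_off, x + dx_off
            gx = float(L[j, i + 1]) - float(L[j, i - 1])
            gy = float(L[j + 1, i]) - float(L[j - 1, i])
            m = np.hypot(gx, gy)                             # gradient magnitude
            theta = np.degrees(np.arctan2(gy, gx)) % 360     # gradient direction
            cell_r, cell_c = (dy_off + 8) // 4, (dx_off + 8) // 4  # 4x4 cell index
            desc[cell_r, cell_c, int(theta // 45) % 8] += m        # 8 direction bins
    vec = desc.ravel()                                       # 128 values in total
    return vec / (np.linalg.norm(vec) + 1e-7)                # normalize the length
```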
[0091] It should be noted that, in addition to the foregoing implementation algorithm, the present invention may also adopt a corner detection algorithm or a matching algorithm based on edge detection. Among them, the corner detection algorithm is an image processing algorithm that directly uses the image gray level to effectively detect edges and corners.
[0092] The following briefly introduces how to obtain the required key point pairs from the key points of the two images. For example, refer to the following table:
[0093]