Method for estimating orientation change characteristics
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GENIUS SPORTS SS LLC
- Filing Date
- 2024-09-27
- Publication Date
- 2026-06-16
Figure CN122228523A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to a method, a processor, and a non-transitory computer-readable medium for estimating orientation change characteristics of an object from multiple images depicting the object captured at different times. In the examples disclosed herein, the method and processor are applied to captured video of sporting activities, such as sporting events or training sessions.
Background Technology
[0002] During sporting events, knowing the ball's spin rate is often desirable. Several methods have been developed for estimating the ball's angular velocity from images depicting the ball. These methods typically rely on a model of the ball's surface appearance. By comparing the ball's appearance in an image with its known appearance, the ball's orientation at the time the image was taken can be determined. The ball's angular velocity can then be determined using the ball's orientation calculated at two different times.
[0003] As mentioned, these methods rely on knowledge of the ball's appearance. This means that custom models are needed for balls that look different from each other. Each custom model must include a value indicating the color of each point on the surface of the ball. Furthermore, the camera must be carefully color-calibrated to ensure that it reproduces an image of the ball that matches the model.
Summary of the Invention
[0004] According to a first aspect of the invention, a method is provided for estimating orientation change characteristics of an object from multiple images depicting the object captured at different times, the method comprising: for each of a plurality of candidate values for the orientation change characteristics of the object: identifying a first region in a first image and a second region in a second image of the plurality of images, wherein, if the orientation change characteristics were equal to the candidate value, the first region and the second region would represent the same part of the surface of the object; and determining a value of a cost function based on pixel intensity values of one or more pixels in the first region, pixel intensity values of one or more pixels in the second region, and a set of illumination parameters, the set of illumination parameters including a value for each of one or more illumination parameters and representing the illumination conditions of the object in the plurality of images; and estimating the orientation change characteristics based on the values of the cost function.
[0005] Optionally, identifying the second region includes identifying the second region based on the first region and the shape of the object.
[0006] Optionally, determining the value of the cost function includes: determining, based on the illumination parameter set, at least one illumination-adjusted difference between the pixel intensity values of one or more pixels in the first region and the pixel intensity values of one or more pixels in the second region; and determining the value of the cost function based on the illumination-adjusted difference.
[0007] Optionally, the illumination parameter set is a first candidate illumination parameter set of a plurality of candidate illumination parameter sets, and the method includes: for each of the plurality of candidate values for the orientation change characteristics of the object and for each candidate illumination parameter set of the plurality of candidate illumination parameter sets, determining an illumination-specific value of the cost function based on the pixel intensity values of one or more pixels in the first region, the pixel intensity values of one or more pixels in the second region, and the candidate illumination parameter set; and estimating the orientation change characteristics based on the illumination-specific values of the cost function.
[0008] Optionally, the method includes: determining an illumination parameter set representing the illumination conditions of an object in multiple images by selecting a candidate illumination parameter set from multiple candidate illumination parameter sets based on the value of a cost function.
[0009] Optionally, estimating orientation change characteristics includes selecting, from multiple candidate values, the candidate value with the lowest value of the cost function among the values of the cost function.
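The joint search over candidate orientation changes and candidate illumination parameter sets described in the preceding paragraphs can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the ball is reduced to a one-dimensional ring of surface points, the shading model (ambient plus Lambertian diffuse term), the single illumination parameter `light_deg`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

VIEW = np.arange(-60, 61)  # viewing angles (degrees) covered by each toy image

def shading(view_deg, light_deg, ambient=0.3, diffuse=0.7):
    """Assumed ambient-plus-Lambertian shading at a given viewing angle."""
    return ambient + diffuse * np.maximum(0.0, np.cos(np.radians(view_deg - light_deg)))

def render(albedo, rotation_deg, light_deg):
    """Observed intensity: surface albedo times the shading at that image position."""
    return albedo[(VIEW - rotation_deg) % 360] * shading(VIEW, light_deg)

def cost(image0, image1, candidate_deg, light_deg, half_width=30):
    """Mean absolute illumination-adjusted difference for one candidate pair."""
    v0 = np.arange(-half_width, half_width + 1)  # first region (viewing angles)
    v1 = v0 + candidate_deg                      # where that surface would appear
    # Divide out the shading predicted by the candidate illumination set.
    adjusted0 = image0[v0 + 60] / shading(v0, light_deg)
    adjusted1 = image1[v1 + 60] / shading(v1, light_deg)
    return float(np.mean(np.abs(adjusted0 - adjusted1)))

def estimate(image0, image1, rotations, lights):
    """Return the (rotation, light) candidate pair with the lowest cost."""
    return min(((r, l) for r in rotations for l in lights),
               key=lambda rl: cost(image0, image1, rl[0], rl[1]))

# Synthetic data: unique albedo per degree, true rotation 3 degrees, light at 40 degrees.
rng = np.random.default_rng(0)
albedo = 0.2 + 0.8 * rng.permutation(360) / 359.0
image0 = render(albedo, 0, 40)
image1 = render(albedo, 3, 40)
result = estimate(image0, image1, range(-10, 11), range(0, 91, 10))
print(result)
```

Selecting the minimum over both loops yields an estimate of the rotation and, as a by-product, the candidate illumination set that best explains the two images.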
[0010] Optionally, the illumination parameters include at least one of the following: the position of the light source relative to the object, the illumination direction of the object, the intensity of the illumination, the color of the illumination, the specular reflection characteristics of the object, the diffuse reflection characteristics of the object, or the environmental reflection characteristics of the object.
[0011] Optionally, the orientation change characteristic is a first angular velocity at a first time, and estimating the first angular velocity includes: for each of a plurality of candidate values of the first angular velocity, further determining the value of the cost function based on the difference between the candidate value of the first angular velocity and a second angular velocity of the object at a different, second time, so as to reduce the difference between the estimated first angular velocity and the second angular velocity.
[0012] Optionally, determining the value of the cost function includes: weighting the difference between the candidate value of the first angular velocity and the second angular velocity based on the difference between the first time and the second time.
[0013] Optionally, identifying the second region includes: determining a first position in 3D space of a portion of the object's surface from the first image; determining a second position in 3D space of the portion of the object's surface based on the first position and candidate values of orientation change characteristics, the second position representing the position of the portion of the object's surface in 3D space at the time the second image is captured; and determining a second region in the second image based on the second position of the portion of the object's surface in 3D space.
[0014] Optionally, determining the first position includes determining the first position based on the known size of the object and the measured size of the object in the first image, and/or determining the second position includes determining the second position based on the known size of the object and the measured size of the object in the second image.
[0015] Optionally, the orientation change characteristics include the object's angular velocity, at least one component of the angular velocity, or the direction of the angular velocity.
[0016] Optionally, the object is a sports ball.
[0017] Optionally, the method includes: determining that the maximum expected angular displacement of the object between the first image and the second image is below a threshold based on the time difference between the first image and the second image and the maximum expected angular velocity of the object, wherein identifying a first region in the first image and a second region in the second image is performed in response to determining that the maximum expected angular displacement is below the threshold.
[0018] According to a second aspect of the invention, a processing system configured to perform the method of the first aspect is provided.
[0019] According to a third aspect of the invention, a system is provided, comprising: a processing system of the second aspect, and one or more cameras configured to provide multiple images to the processing system.
[0020] According to a fourth aspect of the invention, a non-transitory computer-readable medium is provided, comprising instructions that, when executed by a processor, cause the processor to perform the method of the first aspect.
Attached Figure Description
[0021] Figure 1 shows a schematic diagram of the method according to the present invention;
[0022] Figure 2 shows a schematic diagram of a system used to implement the method;
[0023] Figure 3 shows a perspective view of the system of Figure 2 when deployed during a basketball game;
[0024] Figure 4 shows a schematic geometry indicating the relative arrangement of the system's components when used to implement the method;
[0025] Figure 5 schematically illustrates a technique for determining the position of an object using two cameras;
[0026] Figure 6 illustrates a technique for determining corresponding regions in different images, as used in the method; and
[0027] Figure 7 shows an example technique for estimating orientation change characteristics based on the method.
Detailed Implementation
[0028] Referring to Figure 1, a method 100 is shown for estimating orientation change characteristics of an object 50 from multiple images 40, 41 depicting the object 50 captured at different times. In some embodiments, the orientation change characteristic is the angular velocity of a sports ball. The sports ball is in play during a sports competition and rotates as it travels through the air. Images of the ball are captured at different times by an arrangement of cameras 110(a) to 110(g) positioned around the playing space (e.g., the court or field where the sports competition is held).
[0029] In general, Method 100 includes:
[0030] - In step 102, for each of multiple candidate values of the orientation change characteristics of object 50, identifying a first region $x_0$ in the first image 40 of the multiple images 40, 41 and a second region $x_1$ in the second image 41 of the multiple images 40, 41, where, if the orientation change characteristic were equal to the candidate value, the first region $x_0$ and the second region $x_1$ would represent the same part of the surface of object 50;
[0031] - In step 104, for each of the multiple candidate values of the orientation change characteristics of object 50, determining the value of the cost function based on the pixel intensity values of one or more pixels in the first region $x_0$, the pixel intensity values of one or more pixels in the second region $x_1$, and the illumination parameter set, which includes a value for each of the illumination parameters $\lambda_1, \lambda_2, \ldots, \lambda_k$ and represents the illumination conditions of object 50 in the multiple images 40, 41; and
[0032] - In step 106, orientation change characteristics are estimated based on the value of the cost function.
[0033] Therefore, a method 100 is provided for estimating the orientation change characteristics of object 50. In a general sense, method 100 works by determining candidate values for orientation change characteristics (e.g., the angular velocity of object 50) that are consistent with the changing appearance of object 50 observed across multiple images 40, 41.
[0034] As an introductory illustrative example, assume that the surface of object 50 is painted grayscale according to a pattern, where no surface portion has the same grayscale hue as any other portion of the surface, object 50 is uniformly illuminated by planar radiation from all angles at all times, and object 50 is a perfect diffuse reflector of light. We "guess" that object 50 has rotated a specific angle between images, say 3 degrees. We select pixel intensity values in a first image 40 representing a portion of the surface of object 50. We "rotate" object 50 by 3 degrees to obtain the theoretical (3D) position of that portion of the surface in a second image 41. If the pixel intensity values in the second image 41 representing this theoretical position are similar to those in the first image 40, then it is possible that object 50 has indeed rotated approximately 3 degrees between images. Conversely, if the pixel intensity values are significantly different, it is possible that the pixels in the first image 40 represent a different portion of the surface of object 50 than those in the second image 41, thus the assumption that object 50 has rotated by 3 degrees is incorrect. This information regarding the accuracy of the assumed angular displacement can be reflected in the cost function by setting the cost function to be equal to the difference between the pixel intensity values. A low value for the cost function may indicate that the assumed angular displacement is correct, while a high value may indicate that the assumed angular displacement is incorrect. The cost function can be calculated for various "guessed" rotation angles. To estimate the angular displacement, the rotation angle corresponding to the lowest value of the photogrammetric cost function can be identified.
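The introductory example above can be sketched in a few lines. This is a simplified illustration, not the disclosed implementation: the ball is reduced to a one-dimensional ring with one unique gray value per degree of longitude, illumination is uniform, and the names `render`, `cost`, and `estimate_rotation` are hypothetical.

```python
import numpy as np

VIEW = np.arange(-60, 61)  # viewing angles (degrees) covered by each toy image

def render(pattern, rotation_deg):
    """Intensities seen at each viewing angle after the surface has rotated."""
    return pattern[(VIEW - rotation_deg) % 360]

def cost(image0, image1, candidate_deg, half_width=30):
    """Mean absolute intensity difference between corresponding regions."""
    v = np.arange(-half_width, half_width + 1)  # first region (viewing angles)
    first = image0[v + 60]                      # pixels of the first region
    second = image1[v + candidate_deg + 60]     # where that surface would be
    return float(np.mean(np.abs(first - second)))

def estimate_rotation(image0, image1, candidates):
    """Pick the guessed rotation angle with the lowest cost."""
    costs = [cost(image0, image1, c) for c in candidates]
    return candidates[int(np.argmin(costs))]

rng = np.random.default_rng(0)
pattern = rng.permutation(360).astype(float)  # unique gray value per degree
image0 = render(pattern, 0)                   # first image
image1 = render(pattern, 3)                   # ball has rotated by 3 degrees
candidates = list(range(-20, 21))             # "guessed" rotation angles
estimated = estimate_rotation(image0, image1, candidates)
print(estimated)
```

Because every surface point has a distinct gray value in this toy setup, the cost is zero only for the guess that matches the true rotation and positive for every other guess.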
[0035] By determining the cost function for each of multiple candidate values of the orientation change characteristics (e.g., angular velocity or angular displacement), it is possible to estimate the orientation change characteristics without knowledge of the appearance of the surface of the ball. That is, instead of identifying the ball's orientation at each time step by searching for known markings on the ball in each image, the orientation change characteristics are estimated by determining the angular velocity that best matches the observed pixel intensity values, regardless of the appearance of the ball's surface. In other words, the appearance of the portion of the ball's surface facing camera 110 (or a part of that portion) can be obtained from the first image 40, and the same can be done for the second image 41; by comparing the appearance of the portion of the ball's surface that would appear in both images at a hypothesized angular velocity, it can be determined whether the hypothesized angular velocity is correct.
[0036] The example above assumes that object 50 is uniformly illuminated from all angles at all times. Of course, in reality, objects are typically illuminated non-uniformly and reflect light non-uniformly. Therefore, in the example above, even if the assumed angular velocity matches the actual angular velocity, the pixel intensity values in different images representing the same part of the surface of object 50 will generally not match each other because object 50 is rotating and thus that part of the surface of object 50 may be illuminated more or less in the second image 41 than in the first image 40.
[0037] To address this issue, method 100 includes using not only multiple candidate values for the orientation change characteristics, but also a set of illumination parameters representing the illumination conditions of object 50 in multiple images 40, 41 to determine the value of the cost function. This allows the orientation change characteristics to be reliably and accurately estimated across different illumination conditions and for a range of objects with different reflectivity characteristics.
System Overview
[0038] Referring to Figure 2, a system 200 according to an example is shown. System 200 includes a server 210 (also referred to as a processing system) and multiple cameras 110(a) to 110(g). Method 100 includes receiving multiple images 40, 41 from the multiple cameras 110(a) to 110(g) at the server 210. The cameras 110(a) to 110(g) are located at or near a sporting event.
[0039] Server 210 includes an input interface 213, an output interface 214, a processor 211, and a memory 212. The processor 211 and memory 212 are configured to execute the method. Memory 212 stores instructions that, when executed by the processor 211, cause the processor 211 to perform the method. The instructions can be stored on any computer-readable medium, such as any non-transitory computer-readable medium.
[0040] Figure 3 shows an example of the system 200 of Figure 2 deployed to analyze a basketball game. Each of the cameras 110(a) to 110(g) shown in Figure 3 has a corresponding, different viewpoint 114 from which video is captured. As shown, most of the cameras 110(a) to 110(g) are arranged in various fixed positions and orientations around the basketball court. In some examples, some of the cameras 110(a) to 110(g) may be held by coaches or fans (e.g., a fan's own personal device can be used as one of the cameras 110(a) to 110(g) of system 200). Furthermore, although Figure 3 depicts a basketball game, it should be understood that this is merely illustrative. The system 200 of Figure 2 is suitable for deployment in many other types of sporting activities, and also in non-sporting environments, such as non-sporting live events (e.g., concerts, comedy performances, or plays) or non-sporting practice sessions (e.g., music practice or drama rehearsals).
[0041] Returning to Figure 2 and system 200, it should be noted that in some examples, server 210 may be located in the same physical location as a portable electronic device. For example, at a sporting event (e.g., where system 200 is deployed as illustrated in Figure 3), server 210 can be located in a server room at the venue where the sporting activity is being conducted. Alternatively, server 210 can be located in a truck parked at the venue. However, in other examples, server 210 can be a remote/cloud server.
[0042] According to method 100, cameras 110(a) to 110(g) have been synchronized to capture frames simultaneously, for example, according to the technology described in patent application publication number PCT/US2023/036480. Alternatively, server 210 may instruct cameras 110(a) to 110(g) to capture images at a predefined future timestamp. Cameras 110(a) to 110(g) continuously capture frames depicting a sports ball in play during a sporting event. The sports ball may be, for example, a basketball, volleyball, tennis ball, football, or cricket ball.
[0043] Each image is an RGB image with a resolution of H × W and is associated with a timestamp $t$ stored by the respective camera 110(a) to 110(g); the timestamp is sent to server 210 along with the image.
Ball detection
[0044] Method 100 includes detecting, for each of a plurality of images 40, 41, a portion of the image depicting a sports ball. Various techniques for detecting balls in images are known in the art, but for completeness, we describe a ball detection technique that can be used to implement the methods disclosed herein.
[0045] A neural network trained to detect objects with specific shapes in an image (e.g., a circular object representing a basketball, or a non-circular object such as a rugby ball) is used to obtain an initial estimate of the portion of the image depicting the ball, as well as the diameter of the circle (if the ball is round). Subsequently, if the ball is round, a square region of the image with sides larger than the circle's diameter (e.g., 1.2 times the circle's diameter) is cropped, and a Hough transform is applied to determine a refined estimate of the portion of the image depicting the ball. The portion of the image depicting the ball is called a mask and includes the pixels representing the ball.
[0046] Alternatively, instead of using a neural network to obtain an initial estimate, the Hough transform can be applied directly to the entire image to obtain the position of the ball.
3D ball position calculation
[0047] Once the ball is identified in a given image, its 3D position in space can be determined, as shown in Figure 4. Again, various techniques for this purpose exist in the art, but for completeness, we describe a technique for determining the 3D position of the ball that can be used to implement the method disclosed herein.
[0048] First, a camera calibration process can be performed to determine the extrinsic and intrinsic parameters of camera 110. The extrinsic parameters include the position $(X_c, Y_c, Z_c)$ of camera 110 in 3D space (for example, the position of camera 110 relative to the pitch of a football field where the sporting activity is taking place) and the orientation of camera 110. The orientation can be represented using a quaternion. Alternatively, the orientation can be represented using a vector whose direction is parallel to the ray direction corresponding to the center of the field of view of camera 110. Alternatively, the orientation can be represented using a rotation matrix, where the first column of the matrix is a vector parallel to the vertical direction $y$ in the plane of the images 40, 41 captured by camera 110 (see, for example, Figure 4), the second column is a vector parallel to the horizontal direction $x$ in the plane of the images 40, 41 captured by camera 110, and the third column is a vector parallel to the ray direction corresponding to the center of the field of view of camera 110. Here, the vertical direction $y$ and the horizontal direction $x$ in images 40 and 41 are directions that are perpendicular to each other and to the ray direction corresponding to the center of the camera's field of view. A camera calibration process for determining the extrinsic parameters is described in patent publication US 10600210 B1. Alternatively, a GPS sensor (not shown) in camera 110 can be used to determine the position $(X_c, Y_c, Z_c)$ of camera 110.
[0049] The intrinsic parameters include the focal length $f$ of camera 110, the field of view of camera 110, and parameters describing the optical distortion associated with camera 110. These intrinsic parameters can also be determined according to the camera calibration process described in US Patent Publication 10600210 B1, or they can be obtained from the memory of camera 110 or the memory 212 of server 210.
[0050] The intrinsic parameters, the diameter $D_m$ of the ball measured in the image, and the known diameter $D_r$ of the ball can be used to determine the distance $L$ between the ball's center of mass and camera 110 by inverting an equation relating these quantities (the equation is not reproduced here). That equation includes a distortion function giving the distortion rate at a 2D pixel position $x$, where the pixel position $x = (0, 0)$ represents the image center. In the absence of distortion, the distortion rate is equal to 1.
[0051] Then, the intrinsic parameters, the extrinsic parameters, the position $(x_b, y_b)$ of the ball's center measured in the image (i.e., the pixel representing the center of the hemispherical surface of the ball facing camera 110), and the distance $L$ between the ball and camera 110 can be used to determine the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass. Specifically, the position $(x_b, y_b)$, together with the known intrinsic and extrinsic parameters, is used to determine the unit vector $\hat{u}$ of the line connecting camera 110 and the ball's center of mass, pointing from the camera toward the ball. The 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass can then be calculated according to equation (1):
$$(X_b, Y_b, Z_b) = (X_c, Y_c, Z_c) + L\,\hat{u} \qquad (1)$$
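The single-camera position calculation described above can be sketched as follows. This is an approximation under stated assumptions: an ideal pinhole camera with no optical distortion (distortion rate equal to 1), for which the distance is taken as $L \approx f D_r / D_m$, with the focal length expressed in pixels and pixel coordinates measured from the image center. The function names and example values are hypothetical, and the distortion-aware equation of the source is not reproduced.

```python
import numpy as np

def distance_from_diameter(focal_px, known_diameter_m, measured_diameter_px):
    """Distortion-free pinhole approximation: L = f * D_r / D_m."""
    return focal_px * known_diameter_m / measured_diameter_px

def ball_center_3d(camera_pos, rotation, focal_px, pixel_xy, distance):
    """Camera position plus distance L along the unit ray through the pixel.

    The columns of `rotation` follow the convention in the text: image-vertical
    direction (y), image-horizontal direction (x), viewing direction.
    """
    x, y = pixel_xy  # pixel offsets from the image center
    # Order the components to match the column convention (y, x, view).
    ray_world = np.asarray(rotation, dtype=float) @ np.array([y, x, focal_px], dtype=float)
    unit = ray_world / np.linalg.norm(ray_world)
    return np.asarray(camera_pos, dtype=float) + distance * unit

# Example with made-up values: a 0.24 m ball spanning 24 pixels, f = 1000 px.
L = distance_from_diameter(1000.0, 0.24, 24.0)
center = ball_center_3d([0.0, 0.0, 0.0], np.eye(3), 1000.0, (0.0, 0.0), L)
print(L, center)
```

With real lenses, the measured diameter would first need to be corrected by the distortion rate at the ball's pixel position before applying this relation.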
[0052] To improve accuracy, the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass can be calculated from multiple images taken simultaneously by different cameras. For example, for each of a plurality of candidate 3D locations, a reprojection error can be calculated for each image. For a given candidate 3D location and image, the reprojection error is, for example, the angle between the line connecting the corresponding camera to the candidate 3D location and the line connecting the corresponding camera to the ball's center of mass (calculated for this image using the method described above). Alternatively, for a given candidate 3D location and image, the reprojection error can be the 2D distance (e.g., measured in pixels) between the point in the image representing the ball's center of mass (calculated using the method described above) and the point in the image representing the candidate 3D location. Alternatively, for a given candidate 3D location and image, the reprojection error can be the 3D distance (e.g., measured in meters) between the 3D location of the ball's center of mass (calculated using the method described above) and the candidate 3D location.
[0053] For a given candidate 3D location, the reprojection error is calculated for each image and the errors are summed to obtain the total reprojection error. The candidate 3D location with the lowest total reprojection error is selected as the 3D location of the ball's center of mass.
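The candidate selection described above can be sketched as follows, using the angular variant of the reprojection error. This is an illustrative sketch only; the function names, the way candidates are supplied, and the example coordinates are assumptions.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two 3D direction vectors."""
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def select_location(candidates, camera_positions, per_image_estimates):
    """Return the candidate with the lowest total angular reprojection error.

    per_image_estimates[i] is the ball center computed from camera i alone.
    """
    best, best_error = None, float("inf")
    for candidate in candidates:
        total = sum(angle_between(candidate - cam, estimate - cam)
                    for cam, estimate in zip(camera_positions, per_image_estimates))
        if total < best_error:
            best, best_error = candidate, total
    return best

# Made-up example: two cameras whose single-image estimates disagree slightly.
cameras = [np.array([0.0, 0.0, 0.0]), np.array([10.0, 0.0, 0.0])]
estimates = [np.array([5.0, 0.1, 5.0]), np.array([5.0, -0.1, 5.0])]
candidates = [np.array([5.0, 0.0, 5.0]),
              np.array([6.0, 0.0, 5.0]),
              np.array([4.0, 1.0, 5.0])]
best = select_location(candidates, cameras, estimates)
print(best)
```

In practice the candidate set could be a grid of points around the per-image estimates; how the candidates are generated is not specified in the text.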
[0054] Points on the surface of the ball satisfy the equation of a sphere of diameter $D_r$ centered at $(X_b, Y_b, Z_b)$. (For non-spherical objects, this equation can be replaced with a suitable equation describing the known shape of the object; therefore, more generally, identifying the second region $x_1$ includes identifying the second region $x_1$ based on the first region $x_0$ and the shape (e.g., known shape) of object 50. Further details about these equations are provided below.) For a given point on the surface of the ball, the line connecting that point to camera 110 can be determined. This line can be used in conjunction with the known intrinsic parameters to determine the pixel $(x, y)$ representing that point on the surface of the ball in the image. Therefore, for a specific point on the surface of the ball, the intrinsic and extrinsic parameters of a camera can be used to determine the pixel $(x, y)$ representing that point in any image captured by that camera.
[0055] The transformation from a point $X = (X, Y, Z)$ on the surface of the ball to a pixel $x = (x, y)$ is denoted $x = \varphi_c(X)$, where $c$ indicates the camera used to capture the image containing pixel $x$.
[0056] The reverse process, that is, finding the point $X = (X, Y, Z)$ on the surface of the ball represented by a pixel $x = (x, y)$, involves identifying the point of intersection between the line associated with pixel $x$ and the surface of the ball; this line is denoted $\varphi_c^{-1}(x)$. Typically, there will be two intersection points; the intersection point closer to camera 110 is retained as the result.
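The reverse mapping from a pixel's line to a point on a spherical ball can be sketched with a standard ray-sphere intersection. This sketch assumes the camera lies outside the ball and that the line direction for the pixel has already been obtained from the camera model; the function name is hypothetical.

```python
import numpy as np

def nearest_sphere_intersection(camera_pos, ray_dir, center, radius):
    """Intersect a ray from the camera with a sphere; keep the nearer point.

    Returns None when the ray misses the sphere or the sphere is behind the camera.
    """
    origin = np.asarray(camera_pos, dtype=float)
    direction = np.asarray(ray_dir, dtype=float)
    direction = direction / np.linalg.norm(direction)
    oc = origin - np.asarray(center, dtype=float)
    # Solve |origin + t * direction - center|^2 = radius^2 for t.
    b = np.dot(direction, oc)
    discriminant = b * b - (np.dot(oc, oc) - radius ** 2)
    if discriminant < 0.0:
        return None                       # the line misses the ball
    t = -b - np.sqrt(discriminant)        # smaller root: closer to the camera
    if t < 0.0:
        return None                       # intersection lies behind the camera
    return origin + t * direction

hit = nearest_sphere_intersection([0, 0, 0], [0, 0, 1], [0, 0, 10], 1.0)
miss = nearest_sphere_intersection([0, 0, 0], [1, 0, 0], [0, 0, 10], 1.0)
print(hit, miss)
```

The larger root of the same quadratic gives the far-side intersection, which is hidden from the camera and is therefore discarded.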
[0057] These conversion methods can be applied using various camera models, such as pinhole models or RPC models.
[0058] The above describes a method for calculating the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass from a single image. However, if two cameras with known intrinsic and extrinsic parameters capture images of the ball at the same time, the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass can be determined using simple triangulation, without knowing the ball's dimensions. In other words, the intersection of a first line, which connects the first camera to the ball's center of mass, and a second line, which connects the second camera to the ball's center of mass, can be found and used as the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass, as shown in Figure 5.
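Two-camera triangulation can be sketched as below. Because measured lines rarely intersect exactly, this sketch returns the midpoint of the shortest segment between the two lines, which coincides with the intersection when one exists. That midpoint choice and the function name are assumptions of this example, not something the text specifies.

```python
import numpy as np

def triangulate(cam1, dir1, cam2, dir2):
    """Midpoint of the closest points between two camera-to-ball lines."""
    p1 = np.asarray(cam1, dtype=float)
    p2 = np.asarray(cam2, dtype=float)
    d1 = np.asarray(dir1, dtype=float)
    d2 = np.asarray(dir2, dtype=float)
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("lines are parallel; triangulation is undefined")
    s = (b * e - c * d) / denom   # parameter of the closest point on line 1
    t = (a * e - b * d) / denom   # parameter of the closest point on line 2
    return 0.5 * ((p1 + s * d1) + (p2 + t * d2))

# Made-up example: both lines pass exactly through the same ball center.
target = np.array([1.0, 2.0, 10.0])
cam_a = np.array([0.0, 0.0, 0.0])
cam_b = np.array([10.0, 0.0, 0.0])
position = triangulate(cam_a, target - cam_a, cam_b, target - cam_b)
print(position)
```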
[0059] As mentioned above, object 50 does not need to be spherical; the ball could be, for example, a rugby ball. More generally, the geometry $G$ of object 50 can be represented by a solid geometric object modeled by any of the following:
[0060] 1. an implicit equation;
[0061] 2. an explicit equation; or
[0062] 3. a mesh, that is, a set of triangles.
[0063] An intersection function can be used to determine the intersection points between a 3D line (such as a line connecting a camera to a point on the object) and the geometry $G$. Such geometric models are well known in the field and are not discussed further here.
[0064] In addition, determination of the 3D position $(X_b, Y_b, Z_b)$ of the ball's center of mass is not limited to the methods described above; for example, the position can be received from a GPS sensor embedded in the ball.
Image selection
[0065] Once the 3D position $(X_b, Y_b, Z_b)$ of the ball (or more generally, object 50) has been calculated, method 100 includes, for at least one time at which images are captured, selecting a subset of the images captured at that time to be used to estimate the orientation change characteristics (e.g., angular velocity). The selection is based on the distance between each camera 110(a) to 110(g) and the ball (or the size of the ball as displayed in the image) and/or on the resolution of cameras 110(a) to 110(g). There can be many cameras 110(a) to 110(g) (e.g., up to 84 cameras), and for a given timestamp, a given portion of the surface of the ball can be depicted by images from several different cameras. To reduce processing requirements, a subset of cameras is selected.
[0066] First, for each point on the surface of the ball (e.g., a small surface area) and for each camera 110 whose image includes that area, the number of pixels depicting that area is calculated based on the distance $L$ between the ball and the camera 110. If images from more than one camera depict that point, the camera whose image depicts that point with the highest density of pixels is selected.
[0067] This reduces the number of representations (i.e., pixels representing each point on the surface of the ball) to one. Typically, because some cameras are closer to the ball than others at a given capture time, the number of cameras whose images are used to represent the ball for a given capture time will be significantly less than the total number of cameras. In any case, four images depicting the ball are usually sufficient for a given capture time; therefore, in method 100, a maximum of four images are used.
[0068] It should be noted that the image selection technique described above is optional; in principle, estimating the orientation change characteristics requires only two images depicting object 50 captured at different times.
Image pairing
[0069] Once images have been selected for each capture time, images captured at different times are paired in order to determine the cost function.
[0070] In a broader sense, method 100 includes: determining, based on the time difference between the first image 40 and the second image 41 and the maximum expected angular velocity of object 50, that the maximum expected angular displacement of object 50 between the first image 40 and the second image 41 is below a threshold, wherein identifying the first region $x_0$ in the first image 40 and the second region $x_1$ in the second image 41 is performed in response to determining that the maximum expected angular displacement is below the threshold.
[0071] Consider a pair of images depicting object 50, captured at times $t_0$ and $t_0 + \delta t$ by the same camera 110. Typically, object 50 will rotate during the interval $\delta t$ between the two images. Using the maximum expected angular velocity $w_{max}$, the maximum expected angular displacement can be calculated as $w_{max}\,\delta t$. The maximum expected angular velocity can be obtained empirically. For example, when object 50 is a sports ball (such as a basketball), the maximum recorded angular velocity of a basketball can be used as the maximum expected angular velocity. The maximum recorded angular velocity can be estimated using other methods, or using a high-frame-rate camera with the method described herein.
[0072] By ensuring that $\delta t$ is not too large, it can be ensured that the ball rotates by no more than half a revolution during the interval $\delta t$. Specifically, the maximum allowed time interval between the two images to be compared is $\delta t_{max} = \pi / w_{max}$ (with $w_{max}$ measured in radians per second). (However, in embodiments where the orientation change characteristic is not angular velocity but rather the difference in orientation of the ball between the times of capturing the two images, this requirement does not need to be imposed.)
[0073] In practice, it is desirable to ensure that a sufficiently large portion of the ball's surface (i.e., a threshold percentage, such as 15%, or alternatively any value in the range of 1% to 40%, or 10% to 20%) appears in both images, to improve the quality of the comparison between the images (the comparison between images is described under "Cost Function" below). Therefore, it is desirable to ensure that the ball has rotated by at most a predetermined angle $\sigma$ that is less than $\pi$ radians. This threshold reduces the allowed time interval between two images to a value $\varepsilon$.
[0074] For a given camera, any two images captured by that camera within a time interval of less than $\varepsilon$ can form a comparable image pair.
[0075] In a multi-camera setup, it is also possible to compare images captured by different cameras (e.g., cameras with different orientations). In this case, the allowed time interval between the images is less than ε, because it must be ensured that the ball has rotated by a smaller angle (so that a sufficient portion of the ball's surface appears in both images). In the worst case, the ball rotates with angular velocity w_max in the direction opposite to θ_c2 - θ_c1, where θ_c1 is the orientation of camera C1, which captured the earlier image of the pair, and θ_c2 is the orientation of camera C2, which captured the later image of the pair. In this case, the maximum allowed time δt_max satisfies w_max δt_max + |θ_c2 - θ_c1| = σ. Since the axis of rotation of the ball is usually unknown, the maximum permissible time between the images of a pair captured by cameras C1 and C2 is (σ - |θ_c2 - θ_c1|) / w_max.
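The timing limits above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `max_pair_interval` and `is_comparable` and their parameters are hypothetical, camera orientations are reduced to a single angular offset in radians, and the single-camera case is treated as an offset of zero.

```python
import math


def max_pair_interval(w_max, sigma=math.pi, camera_angle_gap=0.0):
    """Longest time gap (seconds) allowed between the two images of a pair.

    w_max            -- maximum expected angular speed, in rad/s
    sigma            -- largest rotation allowed between the images, in rad
    camera_angle_gap -- |theta_c2 - theta_c1| in rad (0 for a single camera)
    """
    if w_max <= 0.0:
        raise ValueError("w_max must be positive")
    if not 0.0 < sigma <= math.pi:
        raise ValueError("sigma must lie in (0, pi]")
    # Worst case: the ball's rotation adds to the offset between the cameras,
    # so only the remaining angular budget is available for rotation.
    budget = sigma - abs(camera_angle_gap)
    return max(budget, 0.0) / w_max


def is_comparable(t_first, t_second, w_max, sigma=math.pi, camera_angle_gap=0.0):
    """True if two capture times are close enough to form a comparable pair."""
    gap = abs(t_second - t_first)
    return 0.0 < gap < max_pair_interval(w_max, sigma, camera_angle_gap)
```

With `sigma` left at π and no camera offset, `max_pair_interval` reduces to π / w_max, the half-revolution limit for a single camera.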
[0076] Image pairs can therefore be selected appropriately. As mentioned, four images depicting the ball are selected for each time t. Therefore, for times t_0 and t_1, a maximum of four image pairs can be selected. In the case of synchronized cameras, image pairs can include pairs of adjacent frames. In principle, however, only one pair of images captured at different times is required.
[0077] Method 100 does not rely on the techniques described above for 3D sphere position calculation, image selection, and image pairing; for example, when the same camera 110 is used to capture the first image 40 and the second image 41, method 100 can start with region pairing, which will now be described.
Region pairing
[0078] Once a suitable image pair is selected, the first image 40 and the second image 41 within the given pair can be compared.
[0079] Referring again to Figure 1, method 100 includes, in step 102, for each of a plurality of candidate values of the orientation change characteristic (e.g., angular velocity) of an object 50 (e.g., a sports ball), identifying a first region x_0 in a first image 40 of the plurality of images 40, 41 and a second region x_1 in a second image 41 of the plurality of images 40, 41, wherein if the orientation change characteristic (e.g., angular velocity) were equal to the candidate value, the first region x_0 and the second region x_1 would represent the same portion of the surface of object 50 (e.g., a sports ball). As will now be described, identifying the second region x_1 includes: determining a first position in 3D space of a portion of the surface of object 50 from the first image 40; determining a second position in 3D space of the portion of the surface of object 50 based on the first position and the candidate value of the orientation change characteristic, the second position representing the position of the portion of the surface of object 50 in 3D space at the time the second image 41 is captured; and determining the second region x_1 in the second image 41 based on the second position of the portion of the surface of object 50 in 3D space.
[0080] An example region pairing method 102a, which uses an assumed candidate value w_t0 of the orientation change characteristic to identify the first region x_0 and the second region x_1, is shown in Figure 6. Figure 6 shows an example in which the same camera 110 captures both the first image 40 (at time t_0) and the second image 41 (at time t_1), but, as mentioned, in other examples different cameras are used for the images of the pair. The angular velocity w_t0 is expressed in the same 3D basis (i.e., relative to 3D space) as the orientations of cameras 110(a) to 110(g).
[0081] First, a first region x_0 is selected from the first image 40. (As will be mentioned later, the following procedure is repeated for all regions x_0 of the first image 40 that depict object 50.) The first region x_0 includes a single pixel. The examples discussed herein use regions that include a single pixel, but it should be understood that corresponding region pairs (i.e., a region in one image and a region in another image that would represent the same part of the surface of object 50 if the orientation change characteristic were equal to the candidate value) can consist of multiple pixels.
[0082] Region pairing method 102a includes identifying a first surface region (first position) X_0 in 3D space, the first surface region X_0 being the portion of the ball's surface represented by the first region x_0 in the first image 40. The first surface region X_0 can be identified using the technique described above with reference to Figure 4.
[0083] Region pairing method 102a includes determining a second surface region (second position) X_1 in 3D space, where the second surface region X_1 is the same portion of the ball's surface as X_0, but at the time t_1 of capturing the second image 41. This uses the candidate value w_t0 of the orientation change characteristic; in other words, the angular velocity is assumed to take the specific value w_t0. Here, the angular velocity is assumed to have the constant value w_t0 during the interval from t_0 to t_1. As mentioned above, this is a small time interval (during which the object rotates by at most half a revolution), so it is assumed that the angular velocity cannot change significantly during this interval.
[0084] The second surface region X_1 in 3D space is determined based on the first surface region X_0 in 3D space and the candidate value of the orientation change characteristic. The matrix that rotates a 3D vector about the axis w_t0 by the angle |w_t0| (t_1 - t_0) is denoted in Figure 6 as R(w_t0 (t_1 - t_0)). Suitable matrices for performing such a rotational transformation are well known in the art and will not be discussed in detail here.
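One common way to build such a rotation matrix is Rodrigues' rotation formula. The sketch below is an illustration rather than the matrix of Figure 6 itself; the helper names `rotation_matrix` and `rotate` are hypothetical, and the angular velocity is assumed to be given as a 3-vector whose direction is the rotation axis and whose length is the angular speed in rad/s.

```python
import math


def rotation_matrix(w, dt):
    """3x3 matrix rotating by |w| * dt radians about the axis w / |w|."""
    speed = math.sqrt(sum(component * component for component in w))
    angle = speed * dt
    if speed == 0.0 or angle == 0.0:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (component / speed for component in w)  # unit rotation axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    # Rodrigues' formula: R = c*I + s*[k]x + (1 - c)*k*k^T
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def rotate(matrix, vector):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]
```

For example, `rotate(rotation_matrix((0.0, 0.0, math.pi), 0.5), (1.0, 0.0, 0.0))` turns the x axis a quarter revolution about z, giving approximately (0, 1, 0).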
[0085] It should be noted that although the method described herein involves estimating the angular velocity w_t0, other orientation change characteristics can be estimated, such as the angular displacement of the object between t_0 and t_1 (equal to w_t0 (t_1 - t_0)). When the orientation change characteristic is the angular displacement, there is no need to know the times t_0 or t_1; instead, the rotation matrix can simply be written directly in terms of the angular displacement.
[0086] Determining the second surface region X_1 in 3D space additionally includes (if the ball's center of mass is moving), as shown in Figure 6, adding the vector (T_1 - T_0), where T_1 denotes the 3D position of the ball's center of mass at time t_1 and T_0 denotes the 3D position of the ball's center of mass at time t_0. T_1 and T_0 can be determined from images captured by the multiple cameras 110(a) to 110(g) at times t_1 and t_0, as described above with reference to Figure 4.
[0087] Region pairing method 102a includes determining the second region x_1 based on the second surface region X_1 in 3D space. This is performed using the intrinsic and extrinsic parameters of camera 110, as described above with reference to Figure 4. In an example where the camera 110 used to capture the first image 40 is different from the camera used to capture the second image 41, these are the intrinsic and extrinsic parameters of the camera used to capture the second image 41. In Figure 6, the conversion from the 3D surface region X_1 to the pixel position x_1 is denoted x_1 = φ_c(X_1).
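As an illustration of a mapping of the kind written φ_c, the sketch below assumes an ideal pinhole camera without lens distortion. That model, the function name `project`, and the parameter layout (focal lengths and principal point in pixels, a world-to-camera rotation and translation) are assumptions made for this example; the technique described with reference to Figure 4 may differ.

```python
def project(point_world, intrinsics, rotation, translation):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    point_world -- (X, Y, Z) in world coordinates
    intrinsics  -- (fx, fy, cx, cy): focal lengths and principal point, pixels
    rotation    -- 3x3 world-to-camera rotation matrix (extrinsic)
    translation -- 3-vector world-to-camera translation (extrinsic)
    Returns (u, v), or None if the point is not in front of the camera.
    """
    # Camera-frame coordinates: X_cam = R * X_world + t
    x_cam, y_cam, z_cam = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if z_cam <= 0.0:
        return None  # behind the camera, so it cannot appear in the image
    fx, fy, cx, cy = intrinsics
    # Perspective division followed by the intrinsic mapping to pixels.
    return (fx * x_cam / z_cam + cx, fy * y_cam / z_cam + cy)
```

When the two images of a pair come from different cameras, the intrinsic and extrinsic values passed in would be those of the camera that captured the second image.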
[0088] Therefore, the second region x_1 can be determined from the first region x_0 and the candidate value of the orientation change characteristic.
[0089] For non-spherical objects, the geometry G_t of the ball at time t can be written more generally in terms of the base geometry G (see the geometry provided under "3D Sphere Position Calculation"):
[0090] 1. for implicit equations, [equation not reproduced]; and
[0091] 2. for explicit equations, [equation not reproduced],
[0092] where X_t^B denotes the position of the ball's center of mass at time t, and [symbol not reproduced] denotes the orientation of the ball at time t.
[0093] In some cases (e.g., for certain candidate values of the angular velocity), the second surface region X_1 lies on the side of object 50 that faces away from the camera that captures the second image 41. In this case, the region x_0 is ignored for the purpose of calculating the cost function described below.
Cost function
[0094] As mentioned, method 100 includes, in step 104, for each of a plurality of candidate values of the orientation change characteristic, determining a value of a cost function based on pixel intensity values of one or more pixels in the first region x_0, pixel intensity values of one or more pixels in the second region x_1, and a set of illumination parameters, the set of illumination parameters including a value for each of one or more illumination parameters λ_1, λ_2, ..., λ_k, and the set of illumination parameters representing the illumination conditions of object 50 in the plurality of images 40, 41.
[0095] The cost function P includes a photogrammetric term, shown in equation (2) below. The photogrammetric term is a sum over the regions x_0 in the first image 40, each term of the sum being a difference function of the intensity value I(x_0) of the one or more pixels in the first region x_0 and the intensity value I(x_1) of the one or more pixels in the second region x_1. (As described below with reference to equation (4), this can be generalized to include separate sums from multiple image pairs.) The difference function (described in further detail below), written d in equation (2), is parameterized by the values of the one or more illumination parameters λ_1, λ_2, ..., λ_k. The illumination parameters are described in further detail below, but may include: the position of a light source relative to object 50, the intensity of the illumination, the color of the illumination, the specular reflection characteristics of object 50, the diffuse reflection characteristics of object 50, or the ambient reflection characteristics of object 50.
$$P(w_{t0}, \lambda_1, \ldots, \lambda_k) = \sum_{x_0} d_{\lambda_1, \ldots, \lambda_k}\big(I(x_0),\, I(x_1)\big) \tag{2}$$
[0096] As described above, w_t0 is the candidate value for the angular velocity of object 50 (e.g., a sports ball) at time t_0. As mentioned above, the second region x_1 is determined using the angular velocity w_t0, so the photogrammetric term depends on the angular velocity w_t0. Also as mentioned above, a particular first region x_0 is ignored (i.e., it does not contribute to the sum) if the corresponding second surface region X_1 lies on the side of the ball not depicted in the second image 41.
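The sum with its visibility rule can be sketched as follows. The names `faces_camera` and `photogrammetric_term` are hypothetical, the visibility test assumes a spherical ball and ignores occlusion by other objects and the image borders, and the difference function is passed in as a plain callable because its exact form depends on the illumination model.

```python
def faces_camera(surface_point, ball_center, camera_position):
    """True if a point on a sphere's surface is turned toward the camera."""
    # Outward surface normal of a sphere: from the center to the point.
    normal = [p - c for p, c in zip(surface_point, ball_center)]
    to_camera = [c - p for c, p in zip(camera_position, surface_point)]
    return sum(n * v for n, v in zip(normal, to_camera)) > 0.0


def photogrammetric_term(pairs, difference):
    """Sum the difference function over region pairs, skipping hidden ones.

    pairs      -- iterable of (intensity_0, intensity_1, visible) tuples, where
                  visible says whether X_1 faces the second image's camera
    difference -- callable taking (intensity_0, intensity_1)
    """
    total = 0.0
    for intensity_0, intensity_1, visible in pairs:
        if not visible:
            continue  # x_0 is ignored: its surface part is hidden in image 2
        total += difference(intensity_0, intensity_1)
    return total
```

A caller would compute the `visible` flag for each region with `faces_camera`, using the second surface position X_1, the ball center at the second capture time, and the position of the camera that captured the second image.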
[0097] An "illumination parameter set" as referred to herein includes an individual value for each of the one or more illumination parameters λ_1, λ_2, ..., λ_k. Different "candidate illumination parameter sets" have different sets of values for the one or more illumination parameters. For example, a first candidate illumination parameter set may have a first set of values for the position of the light source relative to object 50, the illumination intensity, etc., and a second candidate illumination parameter set may have a second set of values for the position of the light source relative to object 50, the illumination intensity, etc., the second set differing from the first set in at least one of the values.
[0098] The cost function is evaluated for each of multiple candidate illumination parameter sets (and for each of multiple candidate values of the angular velocity w_t0). That is, the position of the light source relative to object 50, the illumination direction of object 50, the intensity of the illumination, etc. (also written as λ_1, λ_2, ..., λ_k) and the candidate angular velocity w_t0 are each varied, and the photogrammetric term is calculated for each combination of candidate values.
[0099] Figure 7 illustrates an example in which only the x component (w_t0)_x of the angular velocity and the x position λ_1 of the light source (relative to a fixed point in 3D space) are varied, and all other variables are assumed to be known; for each combination of a candidate value for the x component (w_t0)_x of the angular velocity and a candidate value for the x position λ_1 of the light source, the value of the cost function is calculated and shown in the corresponding cell of the table in Figure 7. More generally, however, the angular velocity w_t0 has three components (all of which are independent variables to be determined in equation (2)) and there are multiple illumination parameters λ_1, λ_2, ..., λ_k.
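A table of this kind can be sketched as a dictionary keyed by candidate combination. Everything named here is hypothetical: `toy_cost` is a made-up stand-in for the real cost function, with its minimum placed at (20 rad/s, 2 m) only so that the output resembles a two-variable table like the one in Figure 7, and the candidate grids are arbitrary.

```python
from itertools import product


def cost_table(w_x_candidates, light_x_candidates, cost):
    """Evaluate cost(w_x, light_x) for every combination of candidates."""
    return {
        (w_x, light_x): cost(w_x, light_x)
        for w_x, light_x in product(w_x_candidates, light_x_candidates)
    }


def lowest_cost_cell(table):
    """Return the (w_x, light_x) combination with the smallest cost."""
    return min(table, key=table.get)


def toy_cost(w_x, light_x):
    # Hypothetical stand-in for the real cost function of equation (2).
    return (w_x - 20.0) ** 2 + (light_x - 2.0) ** 2


table = cost_table([0.0, 10.0, 20.0, 30.0], [1.0, 2.0, 3.0], toy_cost)
best = lowest_cost_cell(table)
```

With three angular velocity components and k illumination parameters the same idea applies, but the number of cells grows multiplicatively with each added variable, which is one reason an iterative solver can be preferable to an exhaustive grid.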
[0100] By taking into account various possible illumination conditions, the orientation change characteristic (e.g., angular velocity) can be accurately estimated even when the actual illumination conditions and the surface appearance of object 50 are unknown.
[0101] However, in other examples, the values of the illumination parameters λ_1, λ_2, ..., λ_k are known (e.g., from previous estimates of the angular velocity and illumination parameters), and only the angular velocity w_t0 is varied.
Difference function
[0102] The difference function is now described, first using an example in which each region x_0 includes only one pixel of the first image 40. In such an example, simply put, the difference function is the illumination-adjusted difference between the intensity value I(x_1) of the pixel in the second region x_1 and the intensity value I(x_0) of the pixel in the first region x_0.
[0103] As a first example, the difference function can be the square or modulus of the difference between an illumination-independent intensity value of the pixel in the second region x_1 and an illumination-independent intensity value of the pixel in the first region x_0. The illumination-independent intensity values are determined by using the illumination parameter values to map the intensity values I(x_0) and I(x_1) measured by the camera to respective illumination-independent intensity values (as described in further detail below). For example, an illumination-independent intensity value can represent an intrinsic value (such as the sum of RGB values) that represents the color of the ball.
[0104] As a second example, the difference function can be the square or modulus of the difference between an illumination-adjusted intensity value of a first of the two pixels and the intensity value (e.g., I(x_1)) of the other pixel. In this case, the illumination-adjusted intensity value is determined by mapping the measured intensity value of the first pixel (e.g., I(x_0)) to the intensity value that would be measured under the same illumination conditions as the other pixel.
[0105] The above can be generalized to the case where each region x_0 in the first image 40 includes more than one pixel by taking the difference between the sum of the (appropriately illumination-adjusted) intensity values in the first region x_0 and the sum of the (appropriately illumination-adjusted) intensity values in the second region x_1.
[0106] In any case, "pixel intensity value" can refer to the sum of the intensity values of the red, green, and blue components of a pixel, or to another function of these values, such as a grayscale intensity value obtained from these components, or the ratio between two of the three components.
Lighting model
[0107] A method for calculating the illumination-independent and illumination-adjusted intensity values using a known illumination model is now described. In one example, the Phong reflection model is used to model the illumination of object 50. It is assumed that the parameters of the Phong reflection model are constant throughout the capture of the plurality of images 40, 41. The illumination parameters λ_1, λ_2, ..., λ_k are: the position of each light source m relative to object 50 (and therefore the direction in which each light source m illuminates object 50), the intensity i_s of the specular component of each light source, the intensity i_d of the diffuse component of each light source, the ambient illumination i_a, the specular reflection constant k_s of object 50, the diffuse reflection constant k_d of object 50, the ambient reflection constant k_a of object 50, and the shininess constant α of object 50. The Phong reflection model provides equation (3) for calculating the illumination I_p of a point on the surface of object 50:
$$I_p = k_a i_a + \sum_{m} \left( k_d (L_m \cdot N)\, i_{m,d} + k_s (R_m \cdot V)^{\alpha}\, i_{m,s} \right) \tag{3}$$
[0108] Here, L_m denotes the direction vector from the point on the surface toward light source m (i.e., the illumination direction for that light source), N denotes the direction vector of the surface normal, V denotes the direction vector from the point on the surface toward camera 110, and R_m equals 2 (L_m · N) N - L_m.
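The Phong reflection model in its standard textbook form can be sketched as below. The function name `phong_illumination` and the layout of the `lights` argument are hypothetical. The sketch also clamps negative dot products to zero, a common convention for lights or reflections pointing away from the surface or viewer that the text above does not mention.

```python
import math


def _normalize(vector):
    length = math.sqrt(sum(c * c for c in vector))
    return [c / length for c in vector]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_illumination(normal, to_viewer, lights, k_a, k_d, k_s, alpha, i_a):
    """Illumination I_p of a surface point under the Phong reflection model.

    normal    -- surface normal N at the point
    to_viewer -- direction V from the point toward the camera
    lights    -- list of (to_light, i_d, i_s): direction L_m toward light m
                 and its diffuse and specular intensities
    """
    n = _normalize(normal)
    v = _normalize(to_viewer)
    total = k_a * i_a  # ambient contribution
    for to_light, i_d, i_s in lights:
        l = _normalize(to_light)
        l_dot_n = _dot(l, n)
        if l_dot_n <= 0.0:
            continue  # assumed convention: light behind the surface adds nothing
        total += k_d * l_dot_n * i_d  # diffuse contribution
        # Mirror reflection of L_m about the normal: R_m = 2 (L_m . N) N - L_m
        r = [2.0 * l_dot_n * nc - lc for nc, lc in zip(n, l)]
        r_dot_v = _dot(r, v)
        if r_dot_v > 0.0:
            total += k_s * (r_dot_v ** alpha) * i_s  # specular contribution
    return total
```

In the setting described here, `to_light` would be derived from the candidate light source position, and `normal` and `to_viewer` at the second surface position would depend on the candidate angular velocity.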
[0109] For the first region x_0 (i.e., the first surface region X_0), N and V are known, and L_m is parameterized by the position of light source m. For the second region x_1 (i.e., the second surface region X_1), N and V are parameterized by the angular velocity w_t0, and L_m is parameterized by both the position of light source m and the angular velocity w_t0.
[0110] Given a set of illumination parameters (the position of each light source, i_s, i_d, etc.) and a given angular velocity w_t0, the illumination I_p of regions X_0 and/or X_1 can be calculated, as I_p(x_0) and I_p(x_1) respectively. The color of the ball is then modeled as the sum of its intrinsic color (represented by the illumination-independent intensity value) and the contributions of the diffuse and specular components of the illumination (represented by I_p). The illumination-independent intensity values can therefore be calculated as I(x_0) - I_p(x_0) and I(x_1) - I_p(x_1), such that the difference function is ((I(x_1) - I_p(x_1)) - (I(x_0) - I_p(x_0)))^2 or |(I(x_1) - I_p(x_1)) - (I(x_0) - I_p(x_0))|.
[0111] The difference function is unchanged when illumination-adjusted intensity values are used instead; for example, the illumination-adjusted intensity value for pixel x_0 can be calculated as I(x_0) - I_p(x_0) + I_p(x_1). In the difference function, this value is subtracted from the (unadjusted) intensity value I(x_1) for pixel x_1.
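The equivalence of the two formulations can be checked with a short sketch. The function names are hypothetical, and the inputs are single scalar intensities (one pixel per region), with the modeled illumination values I_p supplied by the caller.

```python
def illumination_independent_difference(i0, i1, ip0, ip1, squared=True):
    """Compare the two pixels after removing the modeled illumination from each.

    i0, i1   -- measured intensities I(x_0) and I(x_1)
    ip0, ip1 -- modeled illumination I_p(x_0) and I_p(x_1)
    """
    diff = (i1 - ip1) - (i0 - ip0)
    return diff * diff if squared else abs(diff)


def illumination_adjusted_difference(i0, i1, ip0, ip1, squared=True):
    """Re-light pixel x_0 as if it were lit like x_1, then compare with I(x_1)."""
    adjusted_i0 = i0 - ip0 + ip1
    diff = i1 - adjusted_i0
    return diff * diff if squared else abs(diff)
```

Both functions reduce to the same expression algebraically, so they agree up to floating-point rounding. Under a correct candidate, two pixels showing the same patch of the ball with different lighting give a value near zero.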
[0112] Other lighting models, such as the Gouraud model, can be used.
[0113] Although in the Phong model above the illumination direction L_m is calculated from the position of the light source relative to object 50 and is therefore not an independent parameter, it can be an independent parameter in other models.
Estimation of orientation change characteristics (e.g., angular velocity)
[0114] As mentioned, method 100 includes, in step 106, estimating the orientation change characteristic (e.g., angular velocity) based on the values of the cost function. Estimating the orientation change characteristic includes selecting, from the plurality of candidate values for the orientation change characteristic, the candidate value that has the lowest value of the cost function. Additionally, the illumination parameter set that correctly represents the illumination of object 50 is determined from the plurality of candidate illumination parameter sets based on the values of the cost function. Specifically, the set (candidate value for the x component of the angular velocity, candidate value for the y component of the angular velocity, candidate value for the z component of the angular velocity, candidate value for λ_1, candidate value for λ_2, ..., candidate value for λ_k) that has the lowest value of the cost function is selected.
[0115] Whether or not the selected candidate value for the orientation change characteristic and/or the selected candidate illumination parameter set has the lowest value of the cost function among the values of the cost function, the orientation change characteristic can be estimated, and the illumination conditions optionally determined, using an iterative solution method (such as the Levenberg-Marquardt algorithm, or alternatively gradient descent). For an iterative solution method, an initial candidate angular velocity (such as (0,0,0)) and an initial illumination parameter set (e.g., in which each component is a random number) can be used as a starting point. The iterative method constrains the value of each component of the angular velocity to lie between -w_max and +w_max; values outside this range are not considered. Various iterative methods for finding a local or global minimum of a function of multiple variables are well known in the art and will not be discussed further here.
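As one possible reading of the bounded iterative search, the sketch below runs plain gradient descent with a central-difference gradient and clamps each angular velocity component to [-w_max, +w_max] after every step. It is not the Levenberg-Marquardt algorithm, it varies only the angular velocity (illumination parameters are treated as fixed), and the function name, step size, iteration count, and finite-difference width are all hypothetical choices.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def minimise_angular_velocity(cost, w_max, start=(0.0, 0.0, 0.0),
                              step=0.1, iterations=500, eps=1e-4):
    """Bounded gradient descent over the three angular velocity components.

    cost  -- callable taking a list [w_x, w_y, w_z] and returning a float
    w_max -- bound applied to every component after each update
    """
    w = list(start)
    for _ in range(iterations):
        gradient = []
        for axis in range(3):
            up, down = list(w), list(w)
            up[axis] += eps
            down[axis] -= eps
            # Central finite difference along this axis.
            gradient.append((cost(up) - cost(down)) / (2.0 * eps))
        # Gradient step, then project back into the allowed range.
        w = [clamp(c - step * g, -w_max, w_max) for c, g in zip(w, gradient)]
    return w
```

A fixed step size is used for brevity; a practical solver would typically adapt the step or use a library routine with bound constraints, and may need several starting points because the real cost function can have multiple local minima.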
[0116] In the example shown in Figure 7, it can be determined that the combination of the x component (w_t0)_x of the angular velocity and the x position λ_1 of the light source that minimizes the cost function is (20 rad s^-1, 2 m).
Smoothing Term
[0117] In a simple example, the cost function includes only the photogrammetric term described above, and the angular velocity is calculated for only one time.
[0118] However, in some cases, estimating the angular velocity from images relating to only one time can be difficult (e.g., due to poor illumination of object 50, or object 50 being too far from the camera). The inventors have recognized that determining the angular velocity at multiple times (from three or more images captured at different times) can improve the reliability and/or accuracy of the estimate made for a given time, as described below.
[0119] The angular velocity of a ball can be affected by air resistance and bounce. However, in general, the angular velocity of a ball is unlikely to change significantly in a short period of time.
[0120] To incorporate this observation into the present angular velocity estimation method, method 100 uses a modified cost function that takes into account a second angular velocity w_t2 of object 50 at a time t_2 different from t_0. When the value of the second angular velocity w_t2 is known, the modified cost function P' is
$$P'(w_{t0}, \lambda_1, \ldots, \lambda_k) = P(w_{t0}, \lambda_1, \ldots, \lambda_k) + f(|t_2 - t_0|)\, |w_{t0} - w_{t2}|,$$
[0121] where the term f(|t_2 - t_0|) |w_t0 - w_t2| is called the smoothing term and f is a weighting function. For example, the unmodified cost function can be used to estimate the angular velocity w_ta at time t_a from images captured at times t_a and t_b. Then, in order to estimate the angular velocity w_tb at time t_b from the image captured at t_b and a third image captured at a later time t_c, the previously estimated angular velocity w_ta can be treated as the known value w_t2 and used in the modified cost function above, in which the photogrammetric term uses the pair of images captured at times t_b and t_c. Because the modified cost function includes the added term |w_ta - w_tb|, the value determined for w_tb is more likely to be close to w_ta than it would be without the smoothing term.
[0122] In the case where the angular velocities w_ta and w_tb are represented by matrices R(θ) (representing the rotational transformation by the rotation angle θ about the candidate rotation axis of object 50, where θ is, for example, equal to the angular speed of object 50 multiplied by 1 second), the term |w_ta - w_tb| can be calculated as [expression not reproduced].
[0123] The weighting function in the smoothing term is a positive, monotonically decreasing function of |t_2 - t_0|. This function accounts for the lack of knowledge of the torques acting on object 50 between t_0 and t_2: the greater the separation between t_0 and t_2, the more likely it is that the true angular velocity has changed significantly, and therefore the less weight the "smoothing" should be given. The function could be, for example, an exponentially decaying function, i.e., exp(-|t_2 - t_0|), a Gaussian function exp(-(t_2 - t_0)^2), or a function defined as [expression not reproduced].
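The two named weighting functions, and one way they might enter a modified cost, can be sketched as follows. The additive form used in `smoothed_cost` (the base cost plus the weight times the Euclidean norm of the angular velocity difference) is an assumption made for this illustration, as are all function names; the text above fixes only that the weight is positive and decreases as the two times move apart.

```python
import math


def exponential_weight(t0, t2):
    """Exponentially decaying weight exp(-|t2 - t0|)."""
    return math.exp(-abs(t2 - t0))


def gaussian_weight(t0, t2):
    """Gaussian weight exp(-(t2 - t0)^2)."""
    return math.exp(-((t2 - t0) ** 2))


def smoothed_cost(base_cost, w_candidate, w_known, t0, t2,
                  weight=exponential_weight):
    """Base cost plus a penalty for deviating from a known angular velocity.

    base_cost   -- value of the unmodified cost function for w_candidate
    w_candidate -- candidate angular velocity (3-vector) for time t0
    w_known     -- previously estimated angular velocity (3-vector) for time t2
    """
    # Assumed penalty: Euclidean norm of the angular velocity difference.
    gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_candidate, w_known)))
    return base_cost + weight(t0, t2) * gap
```

Because the weight shrinks as |t_2 - t_0| grows, a known angular velocity from a distant time pulls the estimate less than one from a neighbouring frame.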
[0124] The modified cost function described above assumes that the second angular velocity w_t2 of object 50 is known. However, even if the angular velocity of object 50 is not known at any time, the cost function can be similarly modified to calculate the angular velocities at two or more times jointly. The generalized form of the cost function is: [equation not reproduced] (4)
[0125] where w_t0, ..., w_tn are the angular velocities of object 50 at the respective times t_0, t_1, ..., t_n, I(x_0{i_0}) is the intensity of the one or more pixels in the region x_0 of the first image i_0 (e.g., the first image 40) of image pair i, and I(x_1{i_1}) is the intensity of the one or more pixels in the corresponding region x_1 of the second image i_1 (e.g., the second image 41) of image pair i. Note that the images of the image pairs i can be captured at any of the times t_0, t_1, ..., t_n (or even in between), and some of these times can be covered by multiple (e.g., two or more) images.
[0126] By jointly calculating the angular velocities for the respective times using the cost function with a smoothing term as described above, a stable and accurate estimate of the angular velocity for each time can be obtained.
[0127] The above embodiments should be understood as illustrative examples of the present invention. It should be understood that any feature described with respect to any embodiment can be used alone or in combination with other described features, and can also be used in combination with one or more features from any other embodiment, or any combination of any other embodiment. Furthermore, equivalents and modifications not described above may be employed without departing from the scope of the invention as defined by the appended claims.
Claims
1. A method for estimating an orientation change characteristic of an object from a plurality of images depicting the object captured at different times, the method comprising: for each of a plurality of candidate values for the orientation change characteristic of the object: identifying a first region in a first image and a second region in a second image of the plurality of images, wherein if the orientation change characteristic were equal to the candidate value, the first region and the second region would represent the same part of the surface of the object; and determining a value of a cost function based on pixel intensity values of one or more pixels in the first region, pixel intensity values of one or more pixels in the second region, and a set of illumination parameters, the set of illumination parameters including a value for each of one or more illumination parameters, and the set of illumination parameters representing the illumination conditions of the object in the plurality of images; and estimating the orientation change characteristic based on the values of the cost function.
2. The method of claim 1, wherein determining the value of the cost function comprises: determining, based on the set of illumination parameters, at least one illumination-adjusted difference between the pixel intensity values of the one or more pixels in the first region and the pixel intensity values of the one or more pixels in the second region; and determining the value of the cost function based on the illumination-adjusted difference.
3. The method according to claim 1 or claim 2, wherein the set of illumination parameters is a first candidate set of illumination parameters among a plurality of candidate sets of illumination parameters, and the method comprises: for each of the plurality of candidate values for the orientation change characteristic of the object: for each of the plurality of candidate sets of illumination parameters, determining an illumination-specific value of the cost function based on the pixel intensity values of the one or more pixels in the first region, the pixel intensity values of the one or more pixels in the second region, and the candidate set of illumination parameters; and estimating the orientation change characteristic based on the illumination-specific values of the cost function.
4. The method according to claim 3, comprising: determining the set of illumination parameters representing the illumination conditions of the object in the plurality of images by selecting a candidate set of illumination parameters from the plurality of candidate sets of illumination parameters based on the values of the cost function.
5. The method according to any one of claims 1 to 4, wherein estimating the orientation change characteristic comprises: selecting, from the plurality of candidate values for the orientation change characteristic, the candidate value that has the lowest value of the cost function.
6. The method according to any one of claims 1 to 5, wherein the illumination parameters include at least one of the following: the position of a light source relative to the object, the illumination direction of the object, the intensity of the illumination, the color of the illumination, the specular reflection characteristics of the object, the diffuse reflection characteristics of the object, or the ambient reflection characteristics of the object.
7. The method according to any one of claims 1 to 6, wherein the orientation change characteristic is a first angular velocity for a first time, and estimating the first angular velocity comprises: for each of the plurality of candidate values of the first angular velocity: determining the value of the cost function further based on a difference between the candidate value of the first angular velocity and a second angular velocity of the object for a different, second time, so as to reduce the difference between the estimated first angular velocity and the second angular velocity.
8. The method of claim 7, wherein determining the value of the cost function comprises: weighting the difference between the candidate value of the first angular velocity and the second angular velocity based on the difference between the first time and the second time.
9. The method according to any one of claims 1 to 8, wherein identifying the second region comprises: determining, from the first image, a first position in 3D space of a portion of the surface of the object; determining, based on the first position and the candidate value of the orientation change characteristic, a second position in the 3D space of the portion of the surface of the object, the second position representing the position of the portion of the surface of the object in the 3D space at the time of capture of the second image; and determining the second region in the second image based on the second position of the portion of the surface of the object in the 3D space.
10. The method of claim 9, wherein determining the first position comprises: determining the first position based on a known size of the object and on a measured size of the object in the first image, and/or determining the second position comprises: determining the second position based on the known size of the object and on a measured size of the object in the second image.
11. The method according to any one of claims 1 to 10, wherein the orientation change characteristic includes the angular velocity of the object, at least one component of the angular velocity, or the direction of the angular velocity.
12. The method according to any one of claims 1 to 11, wherein the object is a sports ball.
13. The method according to any one of claims 1 to 12, comprising: determining, based on the time difference between the first image and the second image and the maximum expected angular velocity of the object, that the maximum expected angular displacement of the object between the first image and the second image is below a threshold, wherein identifying the first region in the first image and the second region in the second image is performed in response to determining that the maximum expected angular displacement is below the threshold.
14. A processing system configured to perform the method according to any one of claims 1 to 13.
15. A system comprising: the processing system according to claim 14; and one or more cameras configured to provide the processing system with the plurality of images.
16. A non-transitory computer-readable medium comprising instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 13.