Self-organizing learning of three-dimensional motion data

The system addresses the limitations of existing radar and camera systems by using synchronized radar camera units and machine learning to generate accurate three-dimensional motion representations, enhancing tracking capabilities and reducing calibration needs.

JP7872623B2 · Active · Publication Date: 2026-06-10


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Patents
Filing Date: 2022-08-26
Publication Date: 2026-06-10

Application Information


IPC: G01S13/72; G01S13/86; G01S13/87; A63B69/00; A63B71/06; G06T7/00; H04N7/18

CPC: G01S13/867; G01S13/72; G01S7/417; G06T2207/30196; G06T7/246; G06T2207/30232; G06T2207/10028; G06T2207/30241

AI Tagging

Application Domain

Image enhancement; Image analysis


AI Technical Summary

Technical Problem

Existing radar and camera systems face challenges in accurately tracking the motion of moving objects due to sensor range limitations, environmental factors, and calibration requirements, which affect the quality and flexibility of data acquisition.

Method Used

A system comprising synchronized radar camera units that capture and process image and radar data to estimate intrinsic and extrinsic camera parameters, allowing for the generation of three-dimensional motion representations using machine learning models, which reduces the need for precise camera calibration and enhances data capture flexibility.

Benefits of Technology

The system provides accurate and flexible tracking of moving objects by combining radar and image data, overcoming sensor limitations and environmental constraints, and enabling precise three-dimensional motion representation.




Abstract

The method may include capturing image data associated with an object in the defined environment at one or more time points. The method may include capturing radar data associated with the object in the defined environment at the same time points. The method may include acquiring image data and radar data associated with the object in the defined environment by a machine learning model. The method may include pairing each image data with corresponding radar data based on a time-series occurrence of the image data and the radar data. The method may include generating, by the machine learning model, a three-dimensional motion representation associated with the object associated with the image data and the radar data.


Description

【Technical Field】【0001】 The present disclosure generally relates to three-dimensional motion representation of moving objects using self-organizing learning that combines radar and image data. 【Background Art】【0002】 A moving object can be represented by characteristics of the moving object such as its position, moving direction, speed, velocity, etc. The moving object may include an object used in sports such as a ball. By evaluating the characteristics of the moving object, information regarding performance and / or events occurring in sports can be provided. 【0003】 The subject matter claimed in the present disclosure is not limited to embodiments that solve any particular disadvantages or that operate only in environments such as those described above. Rather, this background is provided only to explain an example technical field in which some embodiments described in the present disclosure may be implemented. 【Summary of the Invention】【Means for Solving the Problems】【0004】 According to one aspect of an embodiment, a method may include capturing image data related to an object in a defined environment at one or more time points by a plurality of synchronized radar camera units included in an image capture system. The radar data and / or image data captured by the plurality of radar camera units may be processed, corrected, and / or constructed so as to be viewed from one or more positions of the defined environment associated with a virtual camera. The method may include estimating intrinsic and extrinsic camera parameters of the virtual camera at one or more locations within and / or around the defined environment to facilitate construction of the image data related to the virtual camera. 【0005】 The method may include capturing radar data related to the object in the defined environment at the same time points. The method may also include acquiring image data and radar data related to the object in the defined environment using a machine learning model. 
The method may also include pairing each image data with its corresponding radar data based on the time-series occurrence of the image data and radar data. The method may also include generating, using the machine learning model, a three-dimensional motion representation of the object associated with the image data and radar data. 【0006】 The objectives and advantages of the embodiments are realized and achieved at least by the elements, features, and combinations specifically indicated in the claims. It should be understood that the above general description and the following detailed description are for illustrative purposes only and do not limit the invention as described in the claims. 【Brief Description of the Drawings】【0007】 Exemplary embodiments will be described more specifically and in more detail with reference to the attached drawings. [Figure 1] Figure 1 shows an example system for deploying cameras and radar in a defined environment according to this disclosure. [Figure 2A] Figure 2A shows an example of a system related to the generation of three-dimensional motion representations according to this disclosure. [Figure 2B] Figure 2B is a perspective view of the sensor device according to this disclosure. [Figure 3A] Figure 3A is a flowchart of an example of a method for generating a three-dimensional motion representation of a moving object according to this disclosure. [Figure 3B] Figure 3B is a flowchart illustrating an exemplary method for pairing each image data with corresponding radar data in order to generate a three-dimensional motion representation of a moving object according to this disclosure. [Figure 4] Figure 4 shows an example of a computing system. 【Modes for Carrying Out the Invention】【0008】 Radar technology can be used to detect and track the movement of objects used in certain sports. Radar technology can also be used to measure various parameters of an object, such as its position, direction of movement, speed, and / or velocity. 
Furthermore, camera systems can be used to capture images of objects so that their movement can be observed and / or measured. 【0009】 Existing radar sensors and cameras may have various drawbacks, including difficulty in being installed in a given environment to track the motion of moving objects. Such radar systems may exhibit limitations due to insufficient sensor range, vibrations affecting the radar sensor, and / or adverse weather conditions. Various camera systems may exhibit limitations due to camera resolution, dependence on ambient light conditions, and insufficient camera frame rates. Furthermore, the calibration and placement of radar sensors and / or cameras can affect the quality of information obtained regarding moving objects, requiring precise calibration and specific placement of the radar sensors and / or cameras for accurate data acquisition. 【0010】 Calibration of existing stereoscopic image capture systems involves estimating the intrinsic and extrinsic parameters of the camera included in the image capture system. For example, intrinsic parameters of the camera include specifications for the sensor, lens, and / or other components of the camera, while extrinsic parameters include the camera's geographical location, environmental conditions, etc. Thus, existing image capture systems may require highly sensitive and accurate calibration of the camera. Furthermore, such image capture systems may be limited in terms of where the camera can be placed to capture image data from a given environment. 【0011】 This disclosure may, in particular, relate to a method and / or system comprising one or more radar camera units configured to capture radar data and image data in a defined environment. The combination of radar data and image data as described in this disclosure can reduce the sensitivity of camera calibration to existing image capture systems and / or completely eliminate the need to calibrate the camera. 
Additionally or alternatively, the radar camera units can be positioned in the defined environment with more flexibility than existing image capture systems so that more image data relating to the defined environment can be captured. 【0012】 In some embodiments, image data and radar data captured by multiple camera radar units may be used to estimate the camera parameters of a virtual camera at a specified position and angle. The image associated with the specified position and angle of the virtual camera may be projected based on the estimated camera parameters, image data, and / or radar data associated with the virtual camera. 【0013】 Embodiments of this disclosure will be described with reference to the attached figures. 【0014】 Figure 1 shows an exemplary system 100 for arranging multiple cameras and multiple radars in a defined environment 120 according to this disclosure. The environment 120 may include one or more camera radar units 110, one or more objects 130, and / or one or more sports users 132-136. Each camera radar unit 110 may include a camera 112 and a radar unit 114 configured to transmit and / or receive electromagnetic pulses 116. The camera 112 and the radar unit 114 may cooperate to analyze the characteristics of the objects 130 and / or sports users 132-136 (collectively, "moving object" or "moving objects"). Image data and / or radar data captured by the multiple camera radar units 110 can be used to simulate one or more virtual cameras 140. 【0015】 Each camera radar unit 110 may include a processor, memory, and communication equipment, such as a processor 410, memory 420, and communication unit 440, as further described in relation to Figure 4. The operation of the camera radar unit 110 may be controlled by the processor, which may communicate with each of the other components of the camera radar unit 110. 
The components of the camera radar unit 110 can work together using either or both of the radar data acquired by the radar unit 114 and the image data acquired by the camera 112 to analyze the characteristics of moving objects. Any of the components of the camera radar unit 110 may communicate with each other; for example, the radar unit 114 may communicate with the camera 112, and the camera 112 may communicate with the memory and communication equipment, etc. Furthermore, although the camera radar unit 110 is shown as an integrated device, one or more of its components may be distributed or spread across multiple devices. 【0016】 In some embodiments, the system 100 may include at least two camera radar units 110 configured to acquire image data and / or radar data relating to the movement of objects 130 and / or sports users 132-136 in a defined environment 120. In some embodiments, the camera radar units 110 may be located outside or on the periphery of the defined environment 120 such that each camera radar unit 110 faces the defined environment 120. Placing the camera radar units 110 outside or on the periphery of the defined environment 120 can facilitate the capture of more moving objects within the field of view of the camera radar units 110, so that fewer camera radar units 110 are required to capture motion at any given point within the defined environment 120. Additionally or alternatively, the camera radar units 110 may be located at any location within the defined environment 120. 【0017】 The camera radar units 110 may be arranged such that each camera radar unit 110 included in the system 100 has a line of sight to the same moving object. For example, the first camera radar unit 110a and the second camera radar unit 110b may each have a line of sight to the same object 130 and / or the same sports users 132-136. 
In some embodiments, each camera radar unit 110 having a line of sight to the same moving object may capture image data and / or radar data from different angles so that the same motion of a given moving object is captured from multiple viewpoints. 【0018】 Some defined environments 120 may include obstacles that obstruct one or more lines of view of the camera radar unit 110 such that captured image data and / or radar data related to a predetermined object moving through the obstacles appear discontinuous. In some embodiments, the obstacles may include other moving objects. 【0019】 The camera radar unit 110 may be configured to acquire and / or infer motion information about a moving object in situations where the camera radar unit 110's line of sight to the moving object is partially or completely obstructed. In some embodiments, additional camera radar units 110 may be configured to cover blind spots within the field of view of an existing camera radar unit 110. Additionally or alternatively, the motion of a partially or completely obscured moving object can be inferred based on kinematic, dynamical, and / or ballistic modeling, using motion data obtained before and after the object's occlusion. For example, one or more camera radar units 110 may acquire image data and / or radar data related to the movement of a thrown ball. The line of sight between the camera radar unit and the ball may be obstructed by a wall for a period of time. Image data and / or radar data of the ball acquired before and after the ball is obstructed by the wall, as well as the timing of the acquisition of the image data and / or radar data, may be compared to predict the ball's trajectory during the duration the ball was obstructed by the wall. 
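The ballistic inference described in paragraph 【0019】 can be illustrated with a minimal sketch. It is not the patented method: it assumes a drag-free, spin-free projectile under constant gravity, positions already expressed in a common metric frame with z pointing up, and hypothetical names (`fit_ballistic`, `Sample`). Samples taken before and after the occlusion are fitted jointly, and the fitted model is then evaluated at times inside the gap.

```python
from typing import Callable, List, Tuple

G = 9.81  # m/s^2; assumed constant gravity, drag and spin are ignored

Sample = Tuple[float, float, float, float]  # (t, x, y, z), z pointing up


def _fit_line(ts: List[float], vs: List[float]) -> Tuple[float, float]:
    """Least-squares fit of v = a + b * t; returns (a, b)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:
        raise ValueError("need samples at two or more distinct times")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return mean_v - slope * mean_t, slope


def fit_ballistic(samples: List[Sample]) -> Callable[[float], Tuple[float, float, float]]:
    """Fit a ballistic path to samples observed before and after an occlusion.

    x and y are modelled as linear in time. For z, adding 0.5 * G * t^2 to
    each observation removes the gravity term, so the remainder is linear too.
    """
    ts = [s[0] for s in samples]
    x0, vx = _fit_line(ts, [s[1] for s in samples])
    y0, vy = _fit_line(ts, [s[2] for s in samples])
    z0, vz = _fit_line(ts, [s[3] + 0.5 * G * s[0] ** 2 for s in samples])

    def position(t: float) -> Tuple[float, float, float]:
        return (x0 + vx * t, y0 + vy * t, z0 + vz * t - 0.5 * G * t * t)

    return position
```

Calling `fit_ballistic(samples)` returns a function of time, so the occluded interval can be filled by evaluating it at the missing frame times. A real system would likely add drag and measurement weighting.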
【0020】 In some embodiments including one or more camera radar units 110, the camera radar units 110 may be positioned such that when a moving object is detected at an intermediate point between the camera radar units 110, the obtained image data and / or radar data associated with the same moving object includes a positional shift. Thus, the positioning of the camera radar units 110 may be asymmetric with respect to the boundaries of a defined environment 120 in order to facilitate positional shifts between the camera radar units 110. Asymmetry in the positioning of the camera radar units can facilitate obtaining additional dimensionality in motion data associated with a given object and / or user (e.g., along the x, y, and / or z axes of the defined environment 120). 【0021】 In some embodiments, the camera radar unit 110 may include a camera 112 and a radar unit 114 that are co-located within the same module. By co-locating the camera 112 and the radar unit 114 within the same module, it is made easier to co-locate the image data captured by the camera 112 and the radar data captured by the radar unit 114 for a given object, and accurate pairing of the image data and the radar data can be facilitated. Additionally or alternatively, the camera radar unit 110 may include separate modules for the camera 112 and the radar unit 114. In some embodiments, by positioning the camera 112 and the radar unit 114 separately, the number of camera radar units 110 required to fully cover the environment 120 with defined fields of view of the camera radar unit 110 can be reduced. In the case of embodiments having separate cameras and radar units, each camera may include a processor, a memory, and a communication unit, and each radar unit may similarly include a processor, a memory, and a communication unit. 【0022】 In some embodiments, the fields of view of the camera 112 and the radar unit 114 may be the same or different. 
If the fields of view of the camera 112 and the radar unit 114 are different, a trigger mechanism can be activated to keep the object 130 and / or the users 132 - 136 within the field of view of the camera 112 as long as an image is being captured. 【0023】 In some embodiments, the camera radar unit 110 may be configured to acquire image data and / or radar data at a specified frame rate. For example, the camera radar unit 110 may be configured to capture images, and / or sample radar data, once per second, once every 10 seconds, once every 30 seconds, once every minute, etc. Increasing the frame rate of the camera radar unit 110 may improve the accuracy of modeling the motion of a moving object and / or facilitate a more detailed capture of the motion of the moving object, while decreasing the frame rate of the camera radar unit 110 may reduce the power consumption of the camera radar unit 110. In these and other embodiments, the frame rate of the camera radar unit 110 may be specified based on user input. Additionally or alternatively, the frame rate of the camera radar unit 110 may be controlled by a processor based on the operation of the camera radar unit 110. For example, a particular processor may be configured to increase the frame rate of a particular camera radar unit in response to determining that the amount of image data and / or radar data acquired by the particular camera radar unit is insufficient. In this example, the particular processor may be configured to decrease the frame rate of a particular camera radar unit in situations where the processor determines that energy should be conserved (e.g., when the remaining amount of battery supplying energy to the particular camera radar unit is low). 【0024】 Camera 112 may include any device, system, component, or assembly of components configured to capture an image. 
While one camera 112 is illustrated in relation to each camera radar unit 110 with reference to Figure 1, any number of cameras can be envisioned. Camera 112 may include, for example, optical elements such as lenses, filters, holograms, and splitters, and an image sensor on which the image is recorded. Such an image sensor may include any device that converts an image represented by incident light into an electronic signal. The image sensor may include a plurality of pixel elements that can be arranged in a pixel array (e.g., a grid of pixel elements). For example, the image sensor may include a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) image sensor. The pixel array may include a two-dimensional array having aspect ratios such as 1:1, 4:3, 5:4, 3:2, 16:9, 10:7, 6:5, 9:4, 17:6, or other ratios. The image sensor can be optically aligned with various optical elements, such as lenses, that focus light onto the pixel array. The pixel array can contain any number of pixels, such as 1 million pixels, 2 million pixels, 6 million pixels, 8 million pixels, 10 million pixels, 15 million pixels, 20 million pixels, 50 million pixels, etc. 【0025】 Camera 112 can operate at a specific frame rate or capture a specific number of images within a given time. Camera 112 may operate at a frame rate of approximately 30 frames per second or more. In specific examples, camera 112 may operate at a frame rate between approximately 100 and 300 frames per second. In some embodiments, a smaller subset of available pixels in the pixel array may be used to enable camera 112 to operate at a higher frame rate. For example, if a moving object is known or estimated to be located in a specific quadrant, region, or space of the pixel array, only that quadrant, region, or space may be used when capturing an image, allowing for a faster refresh rate to capture another image. 
This can enable the use of a less expensive camera while enjoying a higher effective frame rate with less usage than the entire pixel array. 【0026】 The camera 112 may also include various other components. Such components may include one or more illumination functions, such as a flash or other light source, a light diffuser, or other components for illuminating an object. In some embodiments, the illumination function may be configured to illuminate a moving object when the moving object is close to the image sensor, for example, when the moving object is within 3 meters of the image sensor. 【0027】 Any number of different triggers can be used to cause camera 112 to capture one or more images of a moving object. As a non-limiting example, camera 112 may be triggered when it is known or estimated that a moving object is within the camera's field of view, when the moving object first begins or changes its motion (e.g., when a baseball is thrown, when a baseball is hit, when a golf ball is struck, when a tennis ball is served, when a cricket ball is bowled, etc.), or when the moving object is detected in the first row of pixels in the pixel array. Another example of a trigger is a persistent peak in the spectrum of reflected microwaves. For example, if there is a consistent peak at a given frequency known to be the expected moving object frequency for a given time, this may act as a trigger event. 【0028】 In some embodiments, camera 112 may have a field of view from which images can be captured. The field of view may correspond to a pixel array. In some embodiments, the field of view may be limited so that a moving object spends only a limited amount of time within the field of view. In such embodiments, camera 112 may be triggered to capture an image while the moving object is within the field of view. The amount of time the moving object is within the field of view of camera 112 may be called the optimal capture time frame. 
In some embodiments, the optimal capture time frame may include only the time when the entire moving object is within the field of view, or may also include the time when only a portion of the moving object is within the field of view. Other factors, such as the distance between the image sensor and the moving object, and the amount of illumination that may be provided by the illumination function, may also contribute to the optimal capture time frame. For example, the optimal capture time frame may occur when the moving object is moving between a distance of 3 meters and 1 meter from camera 112, because this may be where the flash of camera 112 provides illumination to the moving object. 【0029】 The radar unit 114 may include any system, component, or set of components configured to transmit one or more microwaves or other electromagnetic waves toward a moving object and to receive reflections of the transmitted microwaves reflected back from the moving object. The radar unit 114 may include a transmitter and a receiver. The transmitter may transmit microwaves toward the moving object via an antenna. The receiver may receive microwaves reflected back from the moving object. The radar unit 114 can operate based on pulsed Doppler, continuous wave Doppler, frequency shift keying radar, frequency modulated continuous wave radar, or other radar techniques known in the art. The frequency shift of the reflected microwaves may be measured to derive the radial velocity of the moving object, in other words, the speed at which the moving object is moving toward or away from the radar unit 114. The radial velocity can be used to estimate the velocity of the moving object, the distance between the moving object and the radar unit 114, the frequency spectrum of the moving object, and so on. 
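The relation between frequency shift and radial velocity in paragraph 【0029】 can be sketched with the standard two-way Doppler formula, f_d = 2 · v_r · f_0 / c. The function names and the sign convention (a positive shift means the object is approaching) are assumptions made for illustration; the disclosure does not specify a carrier frequency or a convention.

```python
C = 299_792_458.0  # speed of light in m/s


def radial_velocity(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial velocity in m/s from a two-way (monostatic) Doppler shift.

    Assumed convention: a positive shift means the object approaches the radar.
    Valid for speeds far below the speed of light.
    """
    if carrier_hz <= 0:
        raise ValueError("carrier frequency must be positive")
    return doppler_shift_hz * C / (2.0 * carrier_hz)


def doppler_shift(radial_velocity_mps: float, carrier_hz: float) -> float:
    """Inverse relation: expected Doppler shift in Hz for a given radial velocity."""
    if carrier_hz <= 0:
        raise ValueError("carrier frequency must be positive")
    return 2.0 * radial_velocity_mps * carrier_hz / C
```

As an illustration only, a hypothetical 24 GHz radar observing a ball approaching at 40 m/s would see a shift of roughly 6.4 kHz. Note that the radar measures only the velocity component along the line of sight, which is one reason the disclosure pairs it with image data.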
【0030】 The radar unit 114 may also include any of various signal processing or tuning components, for example, the radar unit 114 may include an analog front-end amplifier and / or filter to increase the signal-to-noise ratio (SNR) by amplifying and / or filtering high frequencies or low frequencies, depending on the moving object and the situation in which the radar unit 114 is being used. In some embodiments, the signal processing or tuning components may separate low frequencies from high frequencies, and high frequencies may be amplified and / or filtered separately from low frequencies. In some embodiments, the range of motion of an object may be several meters to tens of meters, and therefore the radar bandwidth may be narrow. 【0031】 The radar unit 114 can initially detect an object when it is within the radar's field of view, or when it first enters the radar's field of view. In some embodiments, the radar signal is tracked for a predetermined period of time. At a trigger point within the predetermined duration, the camera 112 is triggered and image capture begins. 【0032】 In some embodiments, one or more of the virtual cameras 140 may be simulated at a target position and angle based on image data captured by camera 112 and / or radar data captured by radar unit 114. The image data and / or radar data may be captured from two or more camera radar units 110, including one or more overlapping fields of view. The motion of a moving object may be captured by two or more camera radar units 110 such that the motion of the moving object is captured from the position and angle corresponding to each of the camera radar units 110. 【0033】 Image data and / or radar data captured from various positions and angles can facilitate the estimation of external parameters associated with a virtual camera 140 relative to multiple camera radar units 110. 
For example, landmarks, boundaries, field markers, and / or other identifiable features captured in overlapping regions of image data can be used to estimate the external parameters of the virtual camera 140 relative to each other and / or relative to the camera radar units 110. In some embodiments, the estimated external parameters of the virtual camera 140 can facilitate the projection of virtual image data from the position and angle of the virtual camera 140. 【0034】 Figure 2A shows an exemplary system 200 related to the generation of a three-dimensional motion representation 230 according to the present disclosure. The system 200 may include one or more sensor devices, such as a first sensor device 210a, a second sensor device 210b, and up to an Nth sensor device 210c. The sensor device 210 may include the same or similar device as the camera radar unit 110 described in relation to Figure 1. Sensor data 215 may be collected by the sensor device 210 and transmitted to a machine learning model 220. The machine learning model 220 may be configured and trained to output one or more three-dimensional motion representations 230 related to moving objects in an environment (e.g., a defined environment 120) based on the acquired sensor data 215. 【0035】 The machine learning model 220 may be trained using training sensor data to output a three-dimensional motion representation 230. In some embodiments, the training sensor data may include image data and / or radar data collected from the training environment, which includes more accurate data collection than the environment in which the three-dimensional motion representation 230 is output ("analysis environment"). In these and other embodiments, the number of cameras and / or radar units configured to collect data in the training environment may be greater than the number of cameras and / or radar units included in the analysis environment. 
For example, the training environment may include six cameras and six radar units arranged to collect motion data about moving objects in the training environment, while the analysis environment may include three cameras and three radar units. Increasing the number of cameras and / or radar units included in the training environment facilitates the collection of more accurate motion data about moving objects in the training environment, which may result in improved accuracy for the machine learning model 220 trained on such data. 【0036】 In these embodiments and other embodiments, the training environment and the analysis environment may include the same defined environment. The camera and / or radar unit corresponding to the training environment and the camera and / or radar unit corresponding to the analysis environment may be arranged to capture motion data from the same defined environment such that each camera and / or radar unit captures motion data about the same moving object at the same time. Capturing motion data about the same moving object at the same time may improve the training efficiency of the machine learning model 220 by providing a stronger correlation between the training motion data and the analysis motion data. 【0037】 Additionally or alternatively, the image data recognition aspects and radar data recognition aspects of the machine learning model 220 may be trained separately. In some embodiments, the image data recognition aspects of the machine learning model 220 may be trained to identify and track one or more moving objects, while the radar data recognition aspects of the machine learning model 220 may be trained to identify the motion signatures (e.g., spectral data) of moving objects. The machine learning model 220 can then correlate the image data recognition aspects and radar data recognition aspects based on the time the image data and radar data were collected to output a three-dimensional motion representation 230. 
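The time-based correlation of image data and radar data can be sketched as a nearest-timestamp pairing. This is only one plausible reading of "pairing based on time-series occurrence": the function name `pair_by_time`, the tolerance parameter `max_gap`, and the assumption that both streams share a synchronized clock are illustrative and are not taken from the disclosure.

```python
from bisect import bisect_left
from typing import List, Tuple


def pair_by_time(
    image_times: List[float],
    radar_times: List[float],
    max_gap: float,
) -> List[Tuple[int, int]]:
    """Pair each image frame with the radar sample closest to it in time.

    radar_times must be sorted in ascending order. Frames whose nearest radar
    sample is more than max_gap seconds away are left unpaired. Returns a list
    of (image_index, radar_index) tuples.
    """
    pairs: List[Tuple[int, int]] = []
    for i, t in enumerate(image_times):
        j = bisect_left(radar_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(radar_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(radar_times[k] - t))
        if abs(radar_times[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs
```

In this sketch one radar sample may be paired with more than one frame when the radar samples more slowly than the camera; a stricter one-to-one pairing would need an extra bookkeeping step.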
【0038】 In some embodiments, a machine learning model 220 trained in accordance with this disclosure may be configured to determine a three-dimensional motion representation 230 for new objects not included in the training sensor data. For example, a particular machine learning model trained to model the three-dimensional motion of a tennis racket and tennis ball may be able to model the three-dimensional motion of a ping-pong paddle and ping-pong ball. As another example, a particular machine learning model trained to model the physical motion of an athlete may be able to model the physical motion of an athlete corresponding to a variety of heights, weights, builds, ethnicities, genders, etc., regardless of user characteristics included in the image and radar data used to train the particular machine learning model. 【0039】 After training, the machine learning model 220 may acquire sensor data 215, including image data and / or radar sensor data, and output one or more three-dimensional motion representations 230. In some embodiments, the machine learning model 220 may arrange image data of a moving object over a period of time in a time series and determine a motion representation of the moving object (e.g., optical flow) to model the two-dimensional motion of the moving object over the time period. Radar data corresponding to the same moving object collected over the same time period can be applied to the motion representation to determine and model the three-dimensional motion of the moving object. In some embodiments, the three-dimensional motion representation 230 may be used for velocity analysis of a moving object, body motion analysis (e.g., of a human user, human appendages, etc.), or motion simulation of an object. 
【0040】 The machine learning model 220 can recognize and distinguish two or more different moving objects based on sensor data 215 acquired from a defined environment. In some embodiments, the machine learning model 220 may be configured to identify two or more moving objects based on image data contained in the sensor data 215. The machine learning model 220 may be configured to match each identified moving object with radar data contained in the sensor data 215, captured at the same time as the image data, in a physically realistic manner. For example, the first moving object may include a first baseball, and the second moving object may include a second baseball thrown at a steeper upward angle than the first baseball. The machine learning model 220 may determine that the radar data (e.g., frequency signature) corresponding to the second moving object should include characteristics indicating upward movement, while the radar data corresponding to the first moving object should not include such characteristics. As another example, the first and second moving objects may traverse the same trajectory, but the first moving object may be moving at a faster speed than the second moving object. The machine learning model 220 determines that the radar data corresponding to the first moving object should include characteristics indicating a faster speed, and accordingly, it can pair the radar data with the image data associated with each moving object. 【0041】 In some embodiments, multiple moving objects may be tracked, either intentionally or because the radar may detect multiple other moving objects (e.g., other balls, birds, airplanes, people) within the field of view. When multiple moving objects are present, tracking the correct object can be difficult; therefore, for example, the radar used may be a narrow-beam radar with a predetermined beamwidth. When a moving object is within the radar beam, it generates a Doppler frequency equivalent to its radial velocity. 
Simultaneously tracked moving objects include, but are not limited to, a pitcher's hand, a ball, a pitcher and / or batter's arm, a bat or golf club swing, and the like. 【0042】 Detection may also be based on the calculation of the signal-to-noise ratio (SNR). The identified frequency may be associated with an existing predetermined radar track stored in the radar track pool based on proximity. The system evaluates whether the identified frequency can be associated with a predetermined radar track, and if it is not possible to associate it, a new radar track may be created and placed in the radar track pool. 【0043】 If it is determined that a radar track is associated with an existing track, the pre-associated radar track may be determined to exist in the radar track pool. In each iteration, radar track data may be used to predict the frequency that is expected to be detected next. If detection for a radar track fails in multiple iterations (e.g., failure to detect one object out of several objects, or failure to distinguish between several objects), the radar track may be removed from the radar track pool. On the other hand, if the radar track does not fail (e.g., detection of an object from a group of objects, or successful distinguishing between several objects), the radar track may be updated and entered into the radar track pool for later association. 【0044】 In some embodiments, the machine learning model 220 may be configured to selectively track and analyze the motion of specific moving objects in a defined environment. For example, the machine learning model 220 may receive user input indicating that it should track only the motion of a tennis ball in a tennis match, while ignoring the motion of the players and / or tennis rackets. In these embodiments and other embodiments, the machine learning model 220 may be configured to recognize specific moving objects based on image recognition training during the training process of the machine learning model 220. 
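The radar track pool bookkeeping described in paragraphs 0042 and 0043 (associate a detected frequency with a nearby track, create a new track otherwise, predict the next frequency, and remove tracks that repeatedly fail detection) can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the names RadarTrack and update_pool, the linear-extrapolation predictor, the 50 Hz association gate, and the limit of three consecutive misses are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RadarTrack:
    freqs: list       # detected Doppler frequencies in Hz, oldest first
    misses: int = 0   # consecutive iterations without an associated detection

    def predict(self):
        """Predict the next frequency by linear extrapolation (assumed predictor)."""
        if len(self.freqs) < 2:
            return self.freqs[-1]
        return 2 * self.freqs[-1] - self.freqs[-2]


def update_pool(pool, detections, gate_hz=50.0, max_misses=3):
    """Run one iteration over the track pool (hypothetical helper).

    gate_hz and max_misses are illustrative thresholds, not values from the source.
    """
    unmatched = list(detections)
    for track in pool:
        expected = track.predict()
        # Pick the unmatched detection closest to the predicted frequency.
        best = min(unmatched, key=lambda f: abs(f - expected), default=None)
        if best is not None and abs(best - expected) <= gate_hz:
            track.freqs.append(best)   # successful detection: update the track
            track.misses = 0
            unmatched.remove(best)
        else:
            track.misses += 1          # detection failed in this iteration
    # Detections that could not be associated start new tracks.
    pool.extend(RadarTrack(freqs=[f]) for f in unmatched)
    # Remove tracks whose detection failed in too many consecutive iterations.
    pool[:] = [t for t in pool if t.misses < max_misses]
    return pool


pool = []
update_pool(pool, [1000.0, 2000.0])   # two new tracks are created
for f in (1010.0, 1020.0, 1030.0):    # only the first object keeps being detected
    update_pool(pool, [f])
print(len(pool), pool[0].freqs)
```

In this usage, the second track is never re-detected and is dropped after three failed iterations, while the first track keeps being updated for later association. A greedy nearest-frequency match is used only to keep the sketch short.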
【0045】 In some embodiments, the machine learning model 220 may be configured to track the position of a moving object based on image data associated with the moving object. In these embodiments and other embodiments, the defined environment from which motion data corresponding to the moving object is collected may include one or more markings that can be referenced by the machine learning model 220 to determine the position of the moving object relative to the defined environment. For example, a particular defined environment may include a basketball court with well-defined floor markings that the machine learning model 220 can reference to track the position of a basketball. In this example, tracking the two-dimensional position of the basketball can be facilitated by the relative position of the basketball's image data and the floor markings. The height of the basketball may be required to model the three-dimensional motion of the basketball, and such height data may be determined based on collected radar data associated with the basketball. 【0046】 In some embodiments, the machine learning model 220 may be configured to track and analyze motion data associated with a moving object, including partial and / or total obstructions to the moving object at any point in the object's motion. The machine learning model 220 may be trained to identify gaps in image data associated with a given moving object and determine whether the given moving object is partially or completely obstructed at any point in its trajectory. In some embodiments, the machine learning model 220 may be configured to predict the trajectory of a partially or completely obstructed moving object based on kinematic, dynamic, and / or ballistic modeling of the moving object, and based on image data and / or radar data collected before and after the obstruction of the moving object. 【0047】 Figure 2B is a diagram of a sensor device 240 according to the present disclosure. 
Sensor device 240 may represent any of the sensor devices shown in Figure 2A, such as a first sensor device 210a, a second sensor device 210b, and / or an Nth sensor device 210c. Sensor device 240 may include a camera input 242 (e.g., image data) and / or a radar sensor input 246 (e.g., radar data). In some embodiments, the camera input 242 may be preprocessed to generate image preprocessing data 244, and / or the radar sensor input 246 may be preprocessed to generate radar signal preprocessing data 248. Although shown as a single sensor device, the camera input 242 and the radar sensor input 246 may be acquired by separate sensor devices such that each sensor device includes only the camera input 242 or only the radar sensor input 246. 【0048】 Preprocessing of camera input 242 and / or radar sensor input 246 may include analyzing and correcting acquired image data and / or radar data before providing data to a machine learning model. In some embodiments, preprocessing of camera input 242 and / or radar sensor input 246 may include identifying and removing erroneous data. Image data and / or radar data acquired by the sensor device 240, including impossible data values (e.g., negative velocity detected by the radar unit), unlikely data values, noisy data, etc., may be removed during image preprocessing 244 and / or radar signal preprocessing 248 so that the removed data is not acquired by the machine learning model. Additionally or alternatively, image data and / or radar data may include missing data pairs where an image captured at a particular point in time does not have corresponding radar data, or vice versa, and such missing data pairs may be removed during data preprocessing. In these embodiments and other embodiments, the image preprocessing 244 and / or radar signal preprocessing 248 may include formatting the data acquired by the sensor device 240 so that a machine learning model can acquire and analyze the preprocessed image data and / or radar data. 
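The preprocessing steps of paragraph 0048 (removing impossible or unlikely values and dropping image or radar samples that lack a partner captured at the same time) can be illustrated with a short sketch. It is only an example under stated assumptions: the function name preprocess, the use of integer millisecond timestamps as dictionary keys, and the 100 m/s plausibility limit are hypothetical and do not come from the disclosure.

```python
def preprocess(images, radar, max_speed=100.0):
    """Filter and pair samples before they reach the model (hypothetical helper).

    images: dict mapping timestamp (ms) -> image frame
    radar:  dict mapping timestamp (ms) -> measured speed in m/s
    max_speed: assumed plausibility limit; larger readings count as unlikely
    """
    # Drop impossible (negative) and unlikely (too large) radar readings.
    clean_radar = {
        t: v for t, v in radar.items()
        if v is not None and 0.0 <= v <= max_speed
    }
    # Keep only timestamps that have both an image and a valid radar reading.
    paired_times = sorted(set(images) & set(clean_radar))
    return [(t, images[t], clean_radar[t]) for t in paired_times]


images = {0: "frame0", 100: "frame1", 200: "frame2", 300: "frame3"}
radar = {0: 30.0, 100: -5.0, 200: 31.0, 300: 500.0, 400: 29.0}
print(preprocess(images, radar))
```

Here the negative reading at 100 ms and the implausible reading at 300 ms are removed, and the radar sample at 400 ms is dropped because no image was captured at that time.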
【0049】 Figure 3A is a flowchart of an exemplary method 300 for generating a three-dimensional motion representation of an object according to the present disclosure. Method 300 may be performed by any suitable system, apparatus, or device. For example, sensor devices 210a-c and / or machine learning models 220 may perform one or more of the operations related to Method 300. Although illustrated in discrete blocks, the steps and operations related to one or more blocks of Method 300 may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the particular implementation. 【0050】 Method 300 may begin with block 310, which can capture image data related to a moving object. In some embodiments, the image data may be captured by a camera, such as camera 112 of a camera radar unit 110 as described above in relation to Figure 1. In these embodiments and other embodiments, a moving object may be detected within the field of view of the radar unit, which may trigger the camera to begin photographing the moving object. When predetermined tracking conditions are met, the camera is triggered to begin photographing. In one example, the entry of a thrown ball into the radar's field of view is the trigger for starting photographing. Simultaneously, the movement of the hand throwing the ball is detected, and therefore the movement of the ball may be tracked as both an incident and an outgoing object. In an additional or alternative example, the trigger for starting photographing may be identifying the movement of a ball when it is hit from a tee. In this example, a radar signal from the swing of the hand is detected, and as a result, the camera is triggered. 【0051】 In an additional or alternative example, the trigger for starting the capture may be the detection of a hand swing. For example, a camera radar unit can be mounted on a tripod so that it can see the hand of a user swinging a bat or golf club. 
The hand swing can be tracked, and the camera can be triggered to start taking pictures. In an additional or alternative example, the hand swing can be tracked, and the swing data can be correlated with a predefined mask until a threshold parameter (amplitude, velocity, etc.) is met. The correlation signal may be a time-domain reference signal. 【0052】 In some embodiments, when the camera is triggered and shooting begins, the system can take N photographs, where N may be predetermined by the user, calculated by the system based on previous shooting rounds, set by the manufacturer, and / or determined by any number of conditions that can change the number of photographs taken. 【0053】 In some cases, including the use of a single camera, prior knowledge of the shape and size of the object in question is useful. For example, it may be useful to know that the object is a baseball, as opposed to a golf ball or a football, with an approximate diameter of 2.5 inches. In some cases, including the use of multiple cameras (e.g., a stereo camera system), prior knowledge of the shape and size of the object may not be necessary, but providing the data beforehand can speed up processing. 【0054】 In some embodiments, prior knowledge may be available because the trigger mechanism is based on motion: the object is moving, or will move, relative to a static background (e.g., the movement of a batted ball across a static baseball field environment). Thus, in one example, the object in question can be detected by using the first photograph in a series of photographs as the background image and subtracting the image of each subsequent photograph from the first background photograph. After the subtraction operation, the object can be detected against the background by applying thresholding and / or filtering to remove noise. 【0055】 In another embodiment, it may be possible to detect objects in a photograph by first selecting a photograph as the "start" image from a series of photographs. 
Then, the "start" image is subtracted from a photograph taken before the "start" photograph and from a photograph taken after it. The resulting difference images are multiplied together, which highlights the parts common to the preceding and succeeding difference images. As a result of this multiplication, the target area in each photograph where a moving object can be found is further highlighted. If the moving object has clearly defined features (e.g., circular, elliptical, etc.), pattern matching using a known pattern may facilitate the identification of the moving object in the image. 【0056】 In another embodiment, object detection may further include using a Hough transform for objects that can be parameterized using known parameters. For example, a circle has three parameters (i.e., radius, and horizontal and vertical position of the center of the circle), and an ellipse has four parameters (i.e., major axis, minor axis, and horizontal and vertical position of the center of the ellipse). When a moving object is detected in multiple images, the relevant parameters are stored in an array, and each entry may have a timestamp (e.g., using the internal clock of system 100 or another timing device) that helps track the path of the moving object. 【0057】 In block 320, radar data related to the same moving object from which image data was captured may be acquired. In some embodiments, the radar data may be captured by a radar unit such as radar unit 114 of the camera radar unit 110 described above in relation to Figure 1. In some embodiments, the radar data may include two analog components: an in-phase component (or I channel) and a quadrature component (or Q channel). When the in-phase and quadrature components are combined, a complex signal s(t) is formed as given by Equation 1: 【0058】
$$s(t) = I(t) + i\,Q(t) \tag{1}$$
Here, I(t) and Q(t) denote the in-phase and quadrature components, and i is the imaginary unit, $i = \sqrt{-1}$. 
The in-phase and quadrature components are sampled using an analog-to-digital converter at a sampling frequency $F_S$. The components may be pre-filtered and amplified as needed before sampling. After sampling, a high-order finite impulse response (FIR) digital filter is applied to each channel. In some embodiments, an infinite impulse response (IIR) filter may be applied to the samples instead of an FIR filter. In some cases, the filter removes low-frequency motion generated by, for example, the movement of an individual (e.g., a pitcher or batter in this example), or the movement of limbs other than the limb of interest. At this point, the data may be in the time domain, and a moving-window N-point fast Fourier transform (FFT) is used to convert the time-domain data to time-frequency domain data. To generate a smooth spectrum with fewer artifacts from finite-time windowing and to reduce spectral leakage, a window function such as Hamming, Blackman, or Kaiser can be applied to the time-domain data before performing the FFT. 【0059】 Raw data may be captured in camera coordinates, but it needs to be converted to world coordinates. World coordinates are, for example, spherical or Cartesian coordinates. To convert data to world coordinates, a camera-to-world transformation matrix $R_C^W$ is constructed based on the camera's position and orientation. The camera-to-world transformation matrix may be a 4x4 matrix containing the associated rotations and translations used to transform any vector from the camera into the selected world coordinate system. The world coordinate vector can be obtained using Equation 2: 【0060】
$$S_W = R_C^W \, S_C \tag{2}$$
Here, $S_W$ is the vector converted to world coordinates, and $S_C$ is the vector in camera coordinates. Although the vector is three-dimensional, a fourth component "1" can be appended to accommodate translation. 
This is because the camera-to-world transformation matrix is a 4x4 matrix. 【0061】 For a regularly shaped object (e.g., a baseball or golf ball) with a known radius, the spherical coordinates of the object in the camera reference frame are given by Equations 3 to 5: 【0062】 [Equation 3: equation image not reproduced in the source text] 【0063】 [Equation 4: equation image not reproduced in the source text] 【0064】 [Equation 5: equation image not reproduced in the source text] Here, the angle subtended by the object (the angle made by the vectors) is expressed in terms of r, the radius of the object in pixels, L, the total length of the image, and the field of view of the lens. The remaining two symbols represent the x and y pixel values of the raw image at the center position of the nth object. 【0065】 The trajectory of the moving object may be estimated, and relevant parameters may be calculated from the estimated trajectory. These relevant parameters may include velocity, speed, rotation, axis of rotation, rotational speed, vertical elevation angle, azimuth angle, trajectory, and release angle. 【0066】 In block 330, each item of image data may be paired with corresponding radar data related to the same moving object, as will be described in more detail below in relation to Figure 3B. In block 340, one or more three-dimensional motion representations of the moving object may be generated based on the paired image data and radar data, as will be described in more detail below in relation to Figure 3B. 【0067】 Modifications, additions, or omissions may be made to Method 300 without departing from the scope of this disclosure. For example, the designation of different elements in the described embodiments is for illustrative purposes only and is not limiting to the concepts described herein. Furthermore, Method 300 may include any number of other elements or may be implemented in a system or context other than those described. 
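The radar signal chain described for block 320 (combining the I and Q channels into a complex signal, applying a window function, and running a moving-window N-point FFT) can be sketched as follows. This is a hedged, standard-library-only illustration rather than the disclosed implementation: the function names, the recursive radix-2 FFT (which assumes N is a power of two and stands in for whatever FFT routine a real system would use), and the 1 kHz sampling rate and 125 Hz test tone in the usage lines are assumptions chosen for the example. The FIR or IIR pre-filtering step is omitted.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def spectrogram(i_samples, q_samples, n_fft, hop):
    """Moving-window FFT magnitudes of the complex signal s = I + iQ."""
    s = [complex(i, q) for i, q in zip(i_samples, q_samples)]
    w = hamming(n_fft)
    frames = []
    for start in range(0, len(s) - n_fft + 1, hop):
        # Multiply the time-domain segment by the window before the FFT.
        seg = [s[start + k] * w[k] for k in range(n_fft)]
        frames.append([abs(v) for v in fft(seg)])
    return frames


def peak_frequency(frame, fs):
    """Frequency in Hz of the strongest bin; upper bins map to negative frequencies."""
    n = len(frame)
    k = max(range(n), key=frame.__getitem__)
    if k >= n // 2:
        k -= n
    return k * fs / n


fs, f0 = 1000.0, 125.0   # assumed sampling rate and test tone
i_ch = [math.cos(2 * math.pi * f0 * k / fs) for k in range(256)]
q_ch = [math.sin(2 * math.pi * f0 * k / fs) for k in range(256)]
frames = spectrogram(i_ch, q_ch, n_fft=64, hop=32)
print(len(frames), peak_frequency(frames[0], fs))
```

Because the signal is complex, the sign of the peak frequency is preserved, which is what allows approaching and receding motion to be distinguished in the time-frequency data.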
【0068】 Figure 3B is a flowchart illustrating an exemplary method of performing the operations in blocks 330 and 340 of Method 300 for pairing each image data with corresponding radar data, as disclosed herein. The exemplary method may be performed by any suitable system, apparatus, or device. For example, sensor devices 210a-c and / or machine learning models 220 may perform one or more of the operations related to the operations in blocks 330 and 340. Although illustrated in discrete blocks, the operations related to one or more of blocks 330 and 340 may be divided into additional blocks, combined into fewer blocks, or eliminated, depending on the particular implementation. 【0069】 In block 332, image data can be organized chronologically. Image data related to moving objects collected by the camera may include a timestamp indicating when the image was captured. In some embodiments, the camera may include an internal clock on which the timestamp can be determined. Additionally or alternatively, the camera may determine relative timestamps, where the first image is assigned a timestamp of zero ("t0"), and subsequent images are assigned timestamps based on the camera's frame rate and the timestamp of the previous image. For example, a particular camera with a frame rate of 1 Hz may capture one image per second such that the timestamp corresponding to the second image indicates a time of 1 second, the timestamp corresponding to the third image indicates a time of 2 seconds, and the timestamp corresponding to the tenth image indicates a time of 9 seconds. 【0070】 In block 334, motion representations of one or more moving objects may be generated based on image data. The motion representation may include image data preprocessing to determine the optical flow, optical tracking, image segmentation, and tracking of the moving objects. For example, the optical flow may be generated based on image data arranged in a time series. 
Images of moving objects arranged in a time series may be matched as a single image representing the two-dimensional trajectory of the moving object over a time period, based on the timestamp associated with the first image included in the optical flow and the timestamp associated with the last image included in the optical flow. 【0071】 Block 336 may identify radar data collected during the same period as the image data. Radar data related to a moving object collected by the radar unit may include a timestamp indicating when the radar data was captured. In some embodiments, the radar unit may include an internal clock that can determine the timestamp. Additionally or alternatively, the radar unit may determine relative timestamps, where the first radar data is assigned a timestamp of time zero ("t0"), and subsequent radar data is assigned timestamps based on the radar unit's frame rate and the timestamp of the previous radar data. 【0072】 In block 338, identified radar data can be applied to a two-dimensional motion representation of a moving object to generate a three-dimensional motion representation of the moving object. By applying radar data to the two-dimensional motion representation, information describing the moving object in three dimensions, which was previously excluded by the two-dimensional motion representation, can be provided. In some embodiments, each radar data can be paired with a corresponding image having a matching timestamp. Additionally or alternatively, in situations where a single radar data does not have a matching corresponding image, the single radar data may be paired with two or more corresponding adjacent images in the motion representation of the moving object, such as a first adjacent image containing a timestamp before the capture of the single radar data and a second adjacent image containing a timestamp after the capture of the single radar data. 
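The timestamp handling in blocks 332 to 338 (relative timestamps derived from the frame rate, pairing each radar sample with the image that has a matching timestamp, and falling back to the two adjacent images when there is no exact match) can be sketched as follows. The function names and the representation of a pairing as a tuple of image indices are assumptions made for this illustration; the disclosure does not specify a data layout.

```python
from bisect import bisect_left


def relative_timestamps(n_frames, frame_rate_hz):
    """Relative timestamps in seconds, with the first frame at t0 = 0."""
    return [k / frame_rate_hz for k in range(n_frames)]


def pair_radar_with_images(image_times, radar_times):
    """Pair each radar timestamp with image indices (hypothetical helper).

    image_times must be sorted in ascending order. Each result entry is
    (radar_time, indices), where indices holds one index for an exact
    timestamp match, two indices for the adjacent images before and after
    the radar sample, or no index if the sample lies outside the sequence.
    """
    pairs = []
    for t in radar_times:
        i = bisect_left(image_times, t)
        if i < len(image_times) and image_times[i] == t:
            pairs.append((t, (i,)))          # matching timestamp
        elif 0 < i < len(image_times):
            pairs.append((t, (i - 1, i)))    # adjacent images before and after
        else:
            pairs.append((t, ()))            # no image brackets this sample
    return pairs


image_times = relative_timestamps(10, 1.0)   # 1 Hz camera: tenth image at 9 s
print(image_times[9])
print(pair_radar_with_images(image_times, [1.0, 1.5, 9.5]))
```

Exact floating-point equality is used here only because the example timestamps are generated from the same clock; a real system would more likely compare timestamps within a tolerance.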
【0073】 The operation in block 330 can be modified, added to, or omitted without departing from the scope of this disclosure. For example, the designation of different elements in the described embodiments is for illustrative purposes only and not limiting to the concepts described herein. Furthermore, the operation in block 330 may include any number of other elements or may be performed in other systems or contexts other than those described herein. 【0074】 Figure 4 shows an exemplary computing system 400 according to at least one embodiment described herein. The computing system 400 may include a processor 410, a memory 420, a data storage device 430, and / or a communication unit 440, all of which may be communicatively coupled. Any or all of the sensor devices 210a to 210c in Figure 2A may be implemented as a computing system corresponding to the computing system 400. 【0075】 Generally, the processor 410 may include any suitable computer, computing entity, or processing unit, including various computer hardware or software modules, and may be configured to execute instructions stored in any applicable computer-readable storage medium. For example, the processor 410 may include a microprocessor, microcontroller, digital signal processor (DSP), application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), or any other digital or analog circuit configured to interpret and / or execute program instructions and / or process data. 【0076】 Although illustrated as a single processor in Figure 4, it is understood that the processor 410 may include any number of processors distributed across any number of networks or physical locations, configured to individually or collectively perform any number of operations described in this disclosure. 
In some embodiments, the processor 410 may interpret and / or execute program instructions and / or process data stored in the memory 420, the data storage device 430, or both the memory 420 and the data storage device 430. In some embodiments, the processor 410 may fetch program instructions from the data storage device 430 and load the program instructions into the memory 420. 【0077】 After the program instructions are loaded into the memory 420, the processor 410 may execute the program instructions, such as instructions to perform Method 300 in Figure 3A. For example, the processor 410 may capture image data associated with a moving object, capture radar data associated with the same moving object, pair each item of image data with the corresponding radar data, and / or generate one or more three-dimensional motion representations of the moving object. 【0078】 The memory 420 and the data storage device 430 may include one or more computer-readable storage media for carrying or storing computer-executable instructions or data structures. Such a computer-readable storage medium may be any available medium that can be accessed by a computer, such as the processor 410. For example, the memory 420 and / or the data storage device 430 may store acquired image data and / or radar data. In some embodiments, the computing system 400 may or may not include either the memory 420 or the data storage device 430. 
【0079】 Such computer-readable storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), compact disk read-only memory (CD-ROM) or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory devices (e.g., solid-state memory devices), or any other storage media accessible by a computer that can be used to carry or store desired program code in the form of computer-executable instructions or data structures. Combinations of the above may also be included in the scope of computer-readable storage media. Computer-executable instructions may include, for example, instructions and data configured to cause the processor 410 to perform a particular operation or group of operations. 【0080】 The communication unit 440 may include any component, device, system, or combination thereof configured to send and receive information over a network. In some embodiments, the communication unit 440 can communicate with other devices at other locations, with devices at the same location, or with other components within the same system. For example, the communication unit 440 may include a modem, a network card (wireless or wired), an optical communication device, an infrared communication device, a wireless communication device (such as an antenna), and / or a chipset (such as a Bluetooth® device, an 802.6 device (e.g., a Metropolitan Area Network (MAN)), a WiFi® device, a WiMAX® device, a cellular communication device, etc.), and / or the like. The communication unit 440 may enable the exchange of data with the network and / or any other devices or systems described herein. For example, the communication unit 440 may enable the system 400 to communicate with other systems, such as a computing device and / or other networks. 
【0081】 A person skilled in the art will, after reviewing this disclosure, recognize that modifications, additions, or omissions may be made to the system 400 without departing from the scope of this disclosure. For example, the system 400 may include more or fewer components than those expressly illustrated and described. 【0082】 The embodiments described herein may involve the use of a computer including various computer hardware or software modules. Furthermore, the embodiments described herein may be implemented using a computer-readable medium for carrying or having computer-executable instructions or data structures stored thereon. 【0083】 The terms used in this disclosure and, in particular, in the appended claims (e.g., the bodies of the appended claims) are generally intended as “open” terms (for example, the term “including” should be interpreted as “including, but not limited to”). 【0084】 Furthermore, if a specific number of an introduced claim recitation is intended, such intent will be explicitly recited in the claim; in the absence of such a recitation, no such intent is present. For example, as an aid to understanding, the appended claims may use the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be interpreted as implying that the introduction of a claim recitation by the indefinite article “a” or “an” limits any particular claim containing such an introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and an indefinite article such as “a” or “an” (for example, “a” and / or “an” should be interpreted as meaning “at least one” or “one or more”). The same applies to the use of definite articles used to introduce claim recitations. 
【0085】 In addition, even if a specific number of an introduced claim recitation is explicitly recited, a person skilled in the art will recognize that such a recitation should be interpreted as meaning at least the recited number (for example, the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, when a convention analogous to “at least one of A, B, and C, etc.” or “one or more of A, B, and C, etc.” is used, such a construction is generally intended to include A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together, etc. 【0086】 Furthermore, in this specification, the claims, or the drawings, any disjunctive word or phrase presenting two or more alternative terms should be understood to contemplate the possibility of including one of the terms, either of the terms, or both of the terms. For example, the phrase "A or B" should be understood as including the possibility of "A" or "B," or "A and B." 【0087】 All examples and conditional language recited herein are intended for pedagogical purposes to help the reader understand the disclosure and the concepts contributed by the inventors to furthering the art, and are to be construed as not being limited to such specifically recited examples and conditions. While embodiments of this disclosure have been described in detail, various changes, substitutions, and modifications can be made without departing from the spirit and scope of this disclosure.

Claims

[Claim 1] Capturing image data related to a moving object in a defined environment at multiple points in time, Capturing radar data related to the moving object in the defined environment at the multiple points in time, The machine learning model acquires the image data and radar data related to the moving object in the defined environment, Based on the time-series generation of the aforementioned image data and radar data, each image data is paired with the corresponding radar data. The machine learning model generates a three-dimensional motion representation of the moving object related to the image data and the radar data, The machine learning model is configured to track and analyze motion data of the moving object, including partial or total occlusion of the moving object. The machine learning model is configured to predict the trajectory of the moving object based on kinematic, dynamic, and / or ballistic modeling, using the image data and radar data collected before and after the moving object is obscured. To generate the three-dimensional motion representation of the moving object relating to the image data and radar data, The aforementioned image data is organized in chronological order, To generate an optical flow based on the image data organized in the aforementioned time series, Identifying radar data collected during the same period as the aforementioned image data, A method comprising applying the identified radar data to the optical flow based on the time-series occurrence of the radar data. [Claim 2] The method according to claim 1, wherein the three-dimensional motion representation of the moving body includes at least one of velocity analysis, motion analysis, or motion simulation of the moving body. [Claim 3] The method according to claim 1, wherein the image data includes a plurality of two-dimensional consecutive frames of the moving object. 
[Claim 4] The method according to claim 1, wherein the radar data includes at least one of distance data, velocity data, and frequency data related to the moving object.
[Claim 5] The method according to claim 1, wherein the moving object includes at least one of a ball, a sports device, a human accessory, or a human being.
[Claim 6] The method according to claim 1, wherein generating the three-dimensional motion representation of the moving object includes simulating virtual image data, and simulating the virtual image data comprises: identifying a position and an angle of a virtual camera; estimating one or more extrinsic parameters of the virtual camera based on the captured image data and the captured radar data; and generating the virtual image data at the identified position and angle of the virtual camera.
[Claim 7] A system for capturing and analyzing motion, comprising a computing system communicatively coupled to one or more cameras and one or more radar sensors, wherein the computing system is configured to: acquire image data related to a moving object in a defined environment from the one or more cameras at multiple points in time; acquire radar data related to the moving object in the defined environment from the one or more radar sensors at the multiple points in time; pair each item of image data with the corresponding radar data based on the chronological occurrence of the image data and the radar data; and generate, using a machine learning model, a three-dimensional motion representation of the moving object related to the image data and the radar data, wherein the machine learning model is configured to track and analyze motion data of the moving object, including during partial or total occlusion of the moving object, wherein the machine learning model is configured to predict the trajectory of the moving object based on kinematic, dynamic, and/or ballistic modeling, using the image data and the radar data collected before and after the moving object is occluded, and wherein generating the three-dimensional motion representation of the moving object related to the image data and the radar data comprises: organizing the image data in chronological order; generating an optical flow based on the chronologically organized image data; identifying radar data collected during the same period as the image data; and applying the identified radar data to the optical flow based on the chronological occurrence of the radar data.
[Claim 8] The system according to claim 7, wherein the three-dimensional motion representation of the moving object includes at least one of velocity analysis, body motion analysis, or motion simulation of the moving object.
[Claim 9] The system according to claim 7, wherein the image data includes a plurality of consecutive two-dimensional frames of the moving object.
[Claim 10] The system according to claim 7, wherein the radar data includes at least one of distance data, velocity data, and frequency data related to the moving object.
[Claim 11] The system according to claim 7, further comprising: one or more cameras configured to collect the image data in the defined environment; and one or more radar sensors configured to collect the radar data in the defined environment.
[Claim 12] The system according to claim 7, wherein generating the three-dimensional motion representation of the moving object includes simulating virtual image data, and simulating the virtual image data comprises: identifying a position and an angle of a virtual camera; estimating one or more extrinsic parameters of the virtual camera based on the captured image data and the captured radar data; and generating the virtual image data at the identified position and angle of the virtual camera.
[Claim 13] One or more non-transitory computer-readable storage media configured to store instructions that, in response to being executed, cause a system to perform operations, the operations comprising: capturing image data related to a moving object in a defined environment at multiple points in time; capturing radar data related to the moving object in the defined environment at the multiple points in time; acquiring, by a machine learning model, the image data and the radar data related to the moving object in the defined environment; pairing each item of image data with the corresponding radar data based on the chronological occurrence of the image data and the radar data; and generating, by the machine learning model, a three-dimensional motion representation of the moving object related to the image data and the radar data, wherein the machine learning model is configured to track and analyze motion data of the moving object, including during partial or total occlusion of the moving object, wherein the machine learning model is configured to predict the trajectory of the moving object based on kinematic, dynamic, and/or ballistic modeling, using the image data and the radar data collected before and after the moving object is occluded, and wherein generating the three-dimensional motion representation of the moving object related to the image data and the radar data comprises: organizing the image data in chronological order; generating an optical flow based on the chronologically organized image data; identifying radar data collected during the same period as the image data; and applying the identified radar data to the optical flow based on the chronological occurrence of the radar data.
[Claim 14] The one or more non-transitory computer-readable storage media according to claim 13, wherein the three-dimensional motion representation of the moving object includes at least one of velocity analysis, motion analysis, or motion simulation of the moving object.
[Claim 15] The one or more non-transitory computer-readable storage media according to claim 13, wherein the image data includes a plurality of consecutive two-dimensional frames of the moving object.
[Claim 16] The one or more non-transitory computer-readable storage media according to claim 13, wherein the radar data includes at least one of distance data, velocity data, and frequency data related to the moving object.
[Claim 17] The one or more non-transitory computer-readable storage media according to claim 13, wherein generating the three-dimensional motion representation of the moving object includes simulating virtual image data, and simulating the virtual image data comprises: identifying a position and an angle of a virtual camera; estimating one or more extrinsic parameters of the virtual camera based on the captured image data and the captured radar data; and generating the virtual image data at the identified position and angle of the virtual camera.
[Claim 18] The system according to claim 7, wherein the system includes at least two camera radar units.
[Claim 19] The system according to claim 18, further comprising an additional camera radar unit for covering blind spots within the field of view of the at least two camera radar units.
[Claim 20] The system according to claim 18 or 19, wherein the defined environment includes an obstacle that obstructs the line of sight of the at least two camera radar units, and as a result, image data and/or radar data relating to a predetermined object passing through the obstacle becomes discontinuous.
[Claim 21] The system according to claim 20, wherein the obstacle includes other moving objects.
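
Illustrative, non-limiting sketch. Two steps recited in the independent claims, pairing image data with radar data by chronological occurrence and predicting the trajectory across an occlusion from data collected before and after it, can be pictured with the short Python example below. It is not the claimed machine learning model: it substitutes a plain least-squares fit of a drag-free ballistic path. All names (`pair_by_time`, `fit_ballistic`, `predict`, `max_gap`), the 20 ms pairing tolerance, the z-up coordinate frame, and the assumption that each paired sample has already been converted to a three-dimensional point are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

G = 9.81  # m/s^2; assumed gravity, with the z axis pointing up


def pair_by_time(frame_times, radar_times, max_gap=0.02):
    """Pair each frame timestamp with the index of the nearest radar sample.

    radar_times must be sorted ascending. A frame with no radar sample
    within max_gap seconds is paired with None.
    """
    pairs = []
    for ft in frame_times:
        i = bisect_left(radar_times, ft)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(radar_times)]
        best = min(candidates, key=lambda j: abs(radar_times[j] - ft), default=None)
        if best is not None and abs(radar_times[best] - ft) > max_gap:
            best = None
        pairs.append(best)
    return pairs


def fit_line(ts, vs):
    """Least-squares fit v = a + b * t (needs at least two distinct times)."""
    n = len(ts)
    t_mean = sum(ts) / n
    v_mean = sum(vs) / n
    denom = sum((t - t_mean) ** 2 for t in ts)
    b = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs)) / denom
    return v_mean - b * t_mean, b


def fit_ballistic(ts, points):
    """Fit a drag-free ballistic path to 3D points observed at times ts."""
    xs, ys, zs = zip(*points)
    # Adding 0.5 * G * t^2 to z cancels gravity, leaving a straight line in t.
    z_lin = [z + 0.5 * G * t * t for t, z in zip(ts, zs)]
    return fit_line(ts, xs), fit_line(ts, ys), fit_line(ts, z_lin)


def predict(model, t):
    """Evaluate the fitted path at time t, e.g. inside an occlusion gap."""
    (x0, vx), (y0, vy), (z0, vz) = model
    return (x0 + vx * t, y0 + vy * t, z0 + vz * t - 0.5 * G * t * t)


if __name__ == "__main__":
    # Samples visible before (t <= 0.10 s) and after (t >= 0.40 s) an occlusion.
    visible = [0.0, 0.05, 0.10, 0.40, 0.45, 0.50]
    truth = ((0.0, 30.0), (0.0, 1.0), (1.0, 5.0))  # synthetic, for the demo only
    observed = [predict(truth, t) for t in visible]
    model = fit_ballistic(visible, observed)
    print(predict(model, 0.25))  # estimated position while the object is hidden
```

Fitting on samples from both sides of the gap, rather than extrapolating from the earlier samples alone, mirrors the claim language about using data collected before and after the occlusion. A practical system would likely also model drag, spin, and measurement noise.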