Image processing device, image processing method, and computer-readable recording medium
The image processing apparatus and method improve realism in 3D models by applying image textures to 3D point cloud data using corresponding feature points, addressing processing burdens and model deviation issues.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- NEC SOLUTION INNOVATORS LTD
- Filing Date
- 2025-12-11
- Publication Date
- 2026-06-18
AI Technical Summary
Existing methods for adding color to 3D point cloud data using laser scanners result in high processing burdens and often lead to 3D models that deviate significantly from the actual object, lacking realism.
The image processing apparatus and method acquire 3D point cloud data from a depth sensor and image data from a camera, extract corresponding feature points, and apply the image data as textures to the 3D point cloud data using triangles whose vertices are the corresponding feature points. This removes the need to associate image data with each individual point and enhances realism.
This approach reduces processing load while creating a highly realistic 3D model that faithfully represents the actual object by accurately applying image textures based on corresponding feature points.
Figure JP2025043198_18062026_PF_FP_ABST
Abstract
Description
Image processing apparatus, image processing method, and computer-readable recording medium 【0001】 This disclosure relates to image processing techniques for constructing three-dimensional models. 【0002】 In recent years, technologies for constructing three-dimensional models of real-world objects have attracted considerable attention. The generated 3D models are used in a variety of fields, including entertainment, architecture, product development, and medicine. 【0003】 One technique for constructing 3D models is the use of laser scanners. A laser scanner is a device that measures the distance to an object by receiving the reflected light of a laser beam directed at the object and measuring the time from irradiation to reception. The laser scanner measures the distance to numerous points on the object. By using the distance measured at each point, 3D point cloud data of the object is constructed. 【0004】 While 3D point cloud data constructed using laser scanners is highly accurate, it lacks color, making it difficult to recognize objects. For this reason, Patent Document 1 discloses a system for adding color to 3D point cloud data. 【0005】 The system disclosed in Patent Document 1 first acquires three-dimensional point cloud data constructed using a laser scanner and a color image of the object generated by taking a picture with a camera. Next, the system determines the relative positional relationship between the laser scanner and the camera at the time of shooting, and based on the determined positional relationship, associates each point in the three-dimensional point cloud data with the image data of the color image. Then, the system adds the corresponding image data to each point constituting the three-dimensional point cloud data, thereby adding color to the three-dimensional point cloud data. 
【0006】 Patent Document 1: Japanese Patent Publication No. 2005-77385 【0007】 However, the system disclosed in Patent Document 1 must associate image data with each point in the 3D point cloud data, which imposes a heavy processing load. 【0008】 Furthermore, the system disclosed in Patent Document 1 only assigns color to each point of the 3D point cloud data acquired by a laser scanner, and the 3D model ultimately constructed deviates significantly from the actual object. For this reason, there is a need to construct a highly realistic 3D model that faithfully represents the actual object. 【0009】 One example of the purpose of this disclosure is to improve realism while suppressing the processing load when constructing three-dimensional models with color. 【0010】 To achieve the above objective, an image processing apparatus in one aspect of this disclosure is characterized by comprising: a data acquisition unit that acquires three-dimensional point cloud data of an object acquired by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction unit that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, compares the converted three-dimensional point cloud data and the image data, and extracts corresponding feature points from each; and a texture image application unit that sets a plurality of triangles with the extracted feature points as vertices so as to correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applies the corresponding triangles from the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion. 
【0011】Furthermore, in order to achieve the above objective, an image processing method in one aspect of this disclosure is characterized by comprising: a data acquisition step of acquiring three-dimensional point cloud data of an object acquired by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction step of converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, comparing the converted three-dimensional point cloud data and the image data, and extracting corresponding feature points from each; and a texture image application step of setting a plurality of triangles whose vertices are the extracted feature points so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion. 
【0012】 Furthermore, in order to achieve the above objective, a computer-readable recording medium in one aspect of this disclosure is characterized in that it records a program that includes instructions to cause a computer to execute: a data acquisition step of acquiring three-dimensional point cloud data of an object acquired by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction step of converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, comparing the converted three-dimensional point cloud data and the image data, and extracting corresponding feature points from each; and a texture image application step of setting a plurality of triangles whose vertices are the extracted feature points so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion. 【0013】 As described above, this disclosure makes it possible to improve realism while suppressing the processing load when constructing a three-dimensional model with color. 【0014】Figure 1 is a schematic diagram showing the general configuration of an example of an image processing device. Figure 2 is a schematic diagram showing the configuration of an example of an image processing device in detail. Figure 3 is a diagram illustrating self-localization performed by estimating the external parameters of a depth sensor. Figure 4 is a diagram showing the relationships between each parameter estimated by the image processing device. Figure 5 is a diagram illustrating the timing of image data acquisition, the timing of 3D point cloud data measurement, and the timing of each parameter estimation. 
Figure 6 is a diagram illustrating the projection process of 3D point cloud data and the extraction process of feature points from image data. Figure 7 is a diagram illustrating an example of the triangle setting process required for texture application. Figure 8 is a diagram conceptually illustrating the texture application process. Figure 9 is a flowchart showing an example of the operation of an image processing device. Figure 10 is a block diagram showing an example of a computer that implements an image processing device. 【0015】 (Embodiment) In the following embodiment, the image processing apparatus, image processing method, and program will be described with reference to Figures 1 to 10. 【0016】 [Device Configuration] First, the schematic configuration of an example of an image processing device will be explained using Figure 1. Figure 1 is a configuration diagram showing the schematic configuration of an example of an image processing device. 【0017】 The image processing device 10 shown in Figure 1 is a device for constructing a three-dimensional model. As shown in Figure 1, the image processing device 10 includes a data acquisition unit 11, a feature point extraction unit 12, and a texture image application unit 13. 【0018】 The data acquisition unit 11 acquires 3D point cloud data of the object acquired by the depth sensor and image data of the object obtained by taking pictures with the image camera. The feature point extraction unit 12 converts the coordinate system of the 3D point cloud data to the coordinate system of the image data, compares the converted 3D point cloud data and image data, and extracts corresponding feature points from each. 【0019】The texture image application unit 13 first sets up multiple triangles in the 3D point cloud data and image data before conversion, with the extracted feature points as vertices, so that they correspond to each other. 
Next, the texture image application unit 13 applies the corresponding triangles from the image data as textures to each of the triangles set in the 3D point cloud data before conversion. 【0020】 In this way, the image processing device 10 uses corresponding feature points between the 3D point cloud data and the image data to determine the correspondence between them, and based on the determined correspondence, it applies the texture obtained from the image data to the 3D point cloud data. In other words, with the image processing device 10, there is no need to associate image data with each point in the 3D point cloud data, thus reducing the processing burden. Furthermore, the texture applied to the 3D point cloud data is not merely a color assigned to each point, but an image that contains information about the relationships between points. Therefore, the resulting 3D model faithfully represents real-world objects and possesses a high degree of realism. 【0021】 Next, the configuration and functions of the image processing device 10 will be specifically explained using Figures 2 to 8. Figure 2 is a configuration diagram that specifically shows the configuration of an example of an image processing device. 【0022】 As shown in Figure 2, the image processing device 10 includes, in addition to the data acquisition unit 11, feature point extraction unit 12, and texture image application unit 13 shown in Figure 1, a calibration execution unit 14, a parameter estimation unit 15, and a data storage unit 16. 【0023】 In the example shown in Figure 2, the image processing device 10 is connected to the depth sensor 20 and the image camera 30 for data communication. The data acquisition unit 11 acquires 3D point cloud data from the depth sensor 20 and image data from the image camera 30. The data acquisition unit 11 also stores the acquired 3D point cloud data and image data in the data storage unit 16. 
【0024】 The image processing apparatus 10 does not necessarily need to be connected to the depth sensor 20 and the image camera 30 for data communication. In that case, the data acquisition unit 11 acquires the data from an external device that stores the three-dimensional point cloud data and image data. Examples of the external device include a file server and a storage device. 【0025】 The depth sensor 20 measures the distance to each point of the subject and outputs three-dimensional point cloud data using the measurement results. In the example of Figure 2, the depth sensor 20 is a LiDAR (Light Detection And Ranging) sensor. Besides LiDAR, a laser scanner or a TOF (Time Of Flight) camera can also be used as the depth sensor 20. 【0026】 The image camera 30 outputs an image of the subject as image data. The image camera 30 is fixed relative to the depth sensor 20 and integrated with it. That is, the depth sensor 20 and the image camera 30 are fixed so that the depth sensor 20 can measure the same subject that the image camera 30 is photographing. Also, in the example of Figure 2, the image data of the subject is obtained by photographing with one image camera 30. 【0027】 In addition, since the measurement (scan) interval of the depth sensor 20 (a rate of about 10 Hz) is shorter than the photographing interval of the image camera 30 (a rate of 1 Hz), the data acquisition unit 11 controls the measurement timing of the depth sensor 20 and the photographing timing of the image camera 30. Specifically, each time the LiDAR has performed a set number of measurements, the data acquisition unit 11 causes the image camera 30 to capture an image simultaneously with the next measurement (see Figure 5, described later). The data acquisition unit 11 acquires the three-dimensional point cloud data and the image data obtained by measurement and photographing at the same timing. 
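The timing control described in paragraph 【0027】 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `capture_schedule` and the parameter `scans_between_shots` are hypothetical, and the value 9 is an assumption chosen so that a depth sensor scanning at about 10 Hz triggers the camera at about 1 Hz.

```python
def capture_schedule(num_scans, scans_between_shots):
    """Return (scan_index, shoot) pairs for a run of depth-sensor scans.

    After `scans_between_shots` scans without an image, the camera is
    triggered together with the next scan (hypothetical scheduling rule).
    """
    schedule = []
    since_last_shot = 0
    for scan_index in range(num_scans):
        shoot = since_last_shot == scans_between_shots
        schedule.append((scan_index, shoot))
        # Restart the count after a shot, otherwise keep counting scans.
        since_last_shot = 0 if shoot else since_last_shot + 1
    return schedule


# Assumed rates: about 10 scans per second, one image per second.
schedule = capture_schedule(num_scans=30, scans_between_shots=9)
paired_scans = [index for index, shoot in schedule if shoot]
print(paired_scans)  # scan indices whose point cloud has a simultaneous image
```

Only the scans listed in `paired_scans` would be stored together with image data; the remaining scans would still be available for self-position estimation.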
【0028】The reason for using an image camera 30 (shooting interval: 1 Hz) instead of a video camera as a means of acquiring images is as follows. First, the shooting interval of a video camera is about 30 Hz, which is shorter than the measurement (scanning) interval of the depth sensor 20, which is 10 Hz. On the other hand, the measurement of the depth sensor and the image acquisition need to overlap at an appropriate rate (about 70% to 90%). For this reason, an image acquisition interval of about 1 Hz is generally suitable, and therefore the image camera 30 is suitable as a means of acquiring images. Note that the means of acquiring images is not limited to an image camera, nor is the shooting interval limited to 1 Hz. 【0029】 Furthermore, the depth sensor 20 measurement and the image camera 30 capture are performed by the photographer while changing their position relative to the entire object. Therefore, the data acquisition unit 11 acquires multiple sets of 3D point cloud data and image data, each with different measurement and capture positions. 【0030】 The data acquisition unit 11 also acquires 3D point cloud data and calibration image data for use by the calibration execution unit 14, which will be described later. The 3D point cloud data and calibration image data are also stored in the data storage unit 16. There are several examples of calibration methods, and they are not particularly limited. For example, the 3D point cloud data and image data for calibration are generated by taking images with the image camera 30 and measuring with the depth sensor 20 on a pre-prepared calibration checkerboard. An example of a calibration board is a board with a black and white checkerboard pattern. 【0031】 The calibration execution unit 14 first estimates the internal parameters of the image camera 30, namely the matrix K that converts camera coordinates to image coordinates. 
The estimation of matrix K is performed by a known method, using the specifications of the image camera 30, such as focal length, lens distortion, and shear. 【0032】 The matrix K, which is an internal parameter of the image camera 30, is represented by the following Equation (1) and Equation (2). In Equation (1), (u, v, 1) represents image coordinates, and (X_c, Y_c, Z_c) represents camera coordinates. In Equation (2), f_x and f_y are focal lengths, and c_x and c_y are the center coordinates of the camera. 【0033】 【0034】 【0035】 Further, the calibration execution unit 14 acquires calibration three-dimensional point cloud data and calibration image data from the data storage unit 16. Then, the calibration execution unit 14 estimates a transformation matrix between the image camera 30 and the depth sensor 20 using the calibration three-dimensional point cloud data and the calibration image data. Hereinafter, this transformation matrix is referred to as the "camera-depth sensor transformation matrix". 【0036】 Specifically, the calibration execution unit 14 first extracts the world coordinates of the corners of the checkerboard from the calibration image data using an existing function. Next, the calibration execution unit 14 extracts the main plane of the checkerboard from the calibration three-dimensional point cloud data using an existing function. Further, the calibration execution unit 14 calculates a rigid body transformation matrix composed of rotation R and translation t using the extracted world coordinates of the corners and the extracted main plane. The calculated rigid body transformation matrix becomes the camera-depth sensor transformation matrix. The camera-depth sensor transformation matrix is denoted as "[R_cl | t_cl]". 【0037】 The parameter estimation unit 15 estimates the external parameters of the depth sensor 20. 
Specifically, the parameter estimation unit 15 retrieves the 3D point cloud data stored in the data storage unit 16 in time-series order, and uses the retrieved time-series 3D point cloud data to perform self-position estimation for the depth sensor 20 in time-series order. 【0038】 Figure 3 illustrates the self-localization performed during the estimation of the external parameters of the depth sensor. As shown in Figure 3, the parameter estimation unit 15 first sets, for example, the measurement position of the first 3D point cloud data as the starting position. Then, the parameter estimation unit 15 rotates and translates the latest 3D point cloud data and superimposes it onto the previous 3D point cloud data to determine the latest position of the depth sensor 20. In this way, the parameter estimation unit 15 determines the position of the depth sensor 20 over time using existing SLAM (Simultaneous Localization And Mapping). The parameter estimation unit 15 then identifies the rotation and translation values applied during self-localization and sets them as the external parameters [R_l | t_l]. In addition, external sensors such as IMUs (Inertial Measurement Units) may be used to improve the accuracy of SLAM, in which case the parameter estimation unit 15 can use data output from the IMU. 【0039】 Furthermore, the parameter estimation unit 15 uses the camera-depth sensor transformation matrix [R_cl | t_cl] and the external parameters [R_l | t_l] to estimate the external parameters [R_c | t_c] of the image camera 30 for each shooting position. The external parameters [R_c | t_c] can be calculated using Equation (3) below. Figure 4 shows the relationship between each parameter estimated by the image processing device. 【0040】 【0041】 Figure 5 is a diagram illustrating the timing of image data acquisition, the timing of 3D point cloud data measurement, and the timing of each parameter estimation. 
As shown in Figure 5, the measurement interval of the depth sensor 20 is shorter than the acquisition interval (1 Hz) of the image camera 30. The external parameters [R_c | t_c] are estimated at the timing of image capture by the image camera 30. Note that the camera-depth sensor transformation matrix [R_cl | t_cl] is estimated in advance. On the other hand, the self-position estimation of the depth sensor 20 is performed at the measurement timing of the depth sensor 20, and the estimation of the external parameters of the depth sensor 20 is also performed at the measurement timing of the depth sensor 20. 【0042】 As shown in Figure 6, the feature point extraction unit 12 first transforms the coordinate system of the 3D point cloud data into camera coordinates using the camera-depth sensor transformation matrix [R_cl | t_cl]. The coordinate transformation is performed based on Equation (4) below. Figure 6 illustrates the projection process of the 3D point cloud data and the extraction process of feature points from the image data. 【0043】 【0044】 Next, as shown in Figure 6, the feature point extraction unit 12 projects the 3D point cloud data, whose coordinate system has been transformed to the coordinate system of the image data, onto the image data using the intrinsic parameters (matrix K) of the image camera. The projection is performed based on Equation (6) below. 【0045】 【0046】 The feature point extraction unit 12 then compares the image data with the projected 3D point cloud data and extracts corresponding feature points from both. Specifically, the feature point extraction unit 12 identifies feature points on the image data that correspond to each point in the projected 3D point cloud data, and extracts the points in the projected 3D point cloud data and the corresponding feature points on the image data as corresponding feature points. 
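The coordinate transformation, projection, and matching described in paragraphs 【0042】 to 【0046】 can be sketched as follows. The sketch rests on stated assumptions rather than on the disclosed equations: [R_cl | t_cl] is taken to map depth-sensor coordinates to camera coordinates as p_c = R_cl p + t_cl, the projection is the ordinary pinhole form u = f_x X_c / Z_c + c_x, v = f_y Y_c / Z_c + c_y with zero skew and no lens distortion, and correspondence is decided by a nearest-neighbour search within a pixel radius. The function names, the `max_px` threshold, and all numeric values are hypothetical.

```python
import math


def to_camera(point, R_cl, t_cl):
    """Rigidly transform one depth-sensor point into camera coordinates."""
    return tuple(
        sum(R_cl[r][c] * point[c] for c in range(3)) + t_cl[r]
        for r in range(3)
    )


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection; returns None for points not in front of the camera."""
    X, Y, Z = point_cam
    if Z <= 0:
        return None
    return (fx * X / Z + cx, fy * Y / Z + cy)


def match_features(points, image_features, R_cl, t_cl, intrinsics, max_px=2.0):
    """Pair each projected point with the nearest image feature within max_px."""
    pairs = []
    for index, point in enumerate(points):
        uv = project(to_camera(point, R_cl, t_cl), *intrinsics)
        if uv is None or not image_features:
            continue
        nearest = min(image_features, key=lambda f: math.dist(f, uv))
        if math.dist(nearest, uv) <= max_px:
            pairs.append((index, nearest))
    return pairs


# Hypothetical calibration: sensor and camera frames coincide.
R_cl = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t_cl = [0, 0, 0]
intrinsics = (500.0, 500.0, 320.0, 240.0)  # fx, fy, cx, cy (assumed values)

points = [(0, 0, 2), (1, 0, 2), (0, 0, -1)]
image_features = [(321.0, 240.5), (100.0, 100.0)]
print(match_features(points, image_features, R_cl, t_cl, intrinsics))
```

Each returned pair holds the index of a point in the point cloud and the image coordinates of its matched feature, which is the information needed to set corresponding triangle vertices in both data sets.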
【0047】Furthermore, in areas where there are extremely few feature points on the image data that correspond to each point in the projected 3D point cloud data, or in areas where there are extremely few feature points on the image data itself, the feature point extraction unit 12 may select only the points projected from the point cloud data as feature points. Also, in areas where there are an extremely large concentration of feature points, the feature point extraction unit 12 may appropriately adjust the number of feature points by applying smoothing processing or the like as appropriate. These adjustments to the number of feature points may also be performed simultaneously when setting up the triangles in the subsequent stage. 【0048】 The texture image application unit 13 first acquires 3D point cloud data and image data from the data storage unit 16, and sets up a plurality of first triangles with extracted feature points as vertices in the acquired 3D point cloud data and image data, respectively, so that they correspond to each other. 【0049】 Furthermore, the texture image application unit 13 can also divide part or all of the multiple triangles whose vertices are extracted feature points in the acquired 3D point cloud data and image data, and set up multiple second triangles. In addition, the texture image application unit 13 can combine multiple triangles to set up a single second triangle. 【0050】 The texture image application unit 13 then applies the corresponding first triangle from the image data as a texture to each of the first triangles set in the 3D point cloud data. If a second triangle is set, the texture image application unit 13 also applies the corresponding second triangle from the image data as a texture to each of the second triangles set in the 3D point cloud data. 【0051】 Here, we will specifically explain the processing in the texture image application unit 13 using Figures 7 and 8. 
Figure 7 is a diagram showing an example of the triangle setting process required for texture application. Figure 8 is a diagram conceptually illustrating the texture application process. 【0052】As shown in Figure 7, the texture image application unit 13 sets up a first triangle whose vertices are the feature points of the 3D point cloud data. Furthermore, the texture image application unit 13 divides the set up first triangle, for example, those with a number of pixels equal to or greater than a set value, by connecting the center points of each side, and sets up a second triangle. Then, as shown in Figure 8, the texture image application unit 13 applies the texture extracted from the image data to the corresponding positions in the 3D point cloud data. This generates a 3D model with the image data applied as a texture. 【0053】 [Device Operation] Next, an example of the operation of the image processing device will be explained using Figure 9. Figure 9 is a flowchart showing an example of the operation of the image processing device. In the following explanation, Figures 1 to 8 will be referred to as appropriate. In this embodiment, the image processing method is carried out by operating the image processing device 10. Therefore, in this embodiment, the explanation of the image processing method will be replaced by the following explanation of the operation of the image processing device 10. 【0054】 First, as a prerequisite, the data acquisition unit 11 acquires 3D point cloud data and calibration image data, and stores them in the data storage unit 16. The data acquisition unit 11 also controls the measurement timing of the depth sensor 20 and the image capture timing of the image camera 30. Therefore, when the LiDAR measurement is performed a set number of times, the image camera 30 will capture an image simultaneously during the next measurement. 
Furthermore, the measurement by the depth sensor 20 and the image capture by the image camera 30 are performed by the photographer while changing position relative to the entire object. 【0055】 As shown in Figure 9, first, the calibration execution unit 14 estimates the internal parameters of the image camera 30, that is, the matrix K that converts camera coordinates to image coordinates (step A1). 【0056】 Next, the calibration execution unit 14 acquires the calibration 3D point cloud data and calibration image data from the data storage unit 16, and uses these to estimate the camera-depth sensor transformation matrix (step A2). 【0057】 Next, the data acquisition unit 11 acquires 3D point cloud data from the depth sensor 20 and image data from the image camera 30, and stores the acquired 3D point cloud data and image data in the data storage unit 16 (step A3). The 3D point cloud data and image data acquired in step A3 are all the data obtained from the measurement and photography of the object, and are stored in chronological order. 【0058】 Next, the parameter estimation unit 15 estimates the external parameters [R_l | t_l] of the depth sensor 20 (step A4). 【0059】 Specifically, in step A4, the parameter estimation unit 15 retrieves the 3D point cloud data stored in the data storage unit 16 in chronological order, and uses the retrieved 3D point cloud data in chronological order to perform self-position estimation for the depth sensor 20. The parameter estimation unit 15 uses the rotation and translation of the 3D point cloud data performed in self-position estimation to estimate the external parameters [R_l | t_l]. 【0060】 Next, the parameter estimation unit 15 uses the camera-depth sensor transformation matrix [R_cl | t_cl] and the external parameters [R_l | t_l] of the depth sensor 20 estimated in step A4 to estimate the external parameters [R_c | t_c] of the image camera 30 for each shooting position (step A5). 
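The estimation in step A5 can be illustrated by composing two rigid transformations. This sketch assumes conventions that the text above does not state explicitly: [R_l | t_l] is taken to map world coordinates to depth-sensor coordinates, and [R_cl | t_cl] to map depth-sensor coordinates to camera coordinates, so that R_c = R_cl R_l and t_c = R_cl t_l + t_cl. If the opposite direction is intended for either matrix, the corresponding inverse would be needed instead. The helper names and numeric values are hypothetical.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def matvec3(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[r][k] * v[k] for k in range(3)) for r in range(3)]


def compose_extrinsics(R_cl, t_cl, R_l, t_l):
    """Camera extrinsics from sensor extrinsics (assumed world-to-sensor)
    and the camera-depth sensor transform (assumed sensor-to-camera)."""
    R_c = matmul3(R_cl, R_l)
    t_c = [a + b for a, b in zip(matvec3(R_cl, t_l), t_cl)]
    return R_c, t_c


# Hypothetical values: 90-degree rotation about z between sensor and camera.
R_cl = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t_cl = [1, 0, 0]
R_l = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t_l = [1, 2, 3]

R_c, t_c = compose_extrinsics(R_cl, t_cl, R_l, t_l)

# Check: transforming a world point in two stages matches the composed form.
p_world = [1, 0, 0]
p_sensor = [a + b for a, b in zip(matvec3(R_l, p_world), t_l)]
two_stage = [a + b for a, b in zip(matvec3(R_cl, p_sensor), t_cl)]
composed = [a + b for a, b in zip(matvec3(R_c, p_world), t_c)]
print(two_stage == composed)
```

Because [R_cl | t_cl] is estimated once in advance, only this composition would need to be repeated for each shooting position.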
【0061】 Next, the feature point extraction unit 12 projects the three-dimensional point cloud data onto the image data using the external parameters [R_l | t_l] of the depth sensor 20, the external parameters [R_c | t_c] of the image camera 30, and the intrinsic parameters (matrix K) of the image camera 30 (step A6). 【0062】 Specifically, in step A6, as shown in Figure 6, the feature point extraction unit 12 converts the coordinate system of the 3D point cloud data to camera coordinates using the camera-depth sensor transformation matrix [R_cl | t_cl]. Furthermore, as shown in Figure 6, the feature point extraction unit 12 projects the 3D point cloud data, whose coordinate system has been converted to the coordinate system of the image data, onto the image data using the intrinsic parameters (matrix K) of the image camera. 【0063】 Next, the feature point extraction unit 12 compares the image data with the projected 3D point cloud data and extracts corresponding feature points from both (step A7). 【0064】 Next, the texture image application unit 13 sets up multiple triangles with the extracted feature points as vertices in the 3D point cloud data and the image data respectively, so that they correspond to each other (step A8). In step A8, it is also possible to divide part or all of the set triangles and set up second triangles. 【0065】 Next, the texture image application unit 13 applies the corresponding triangle of the image data as a texture to each triangle of the 3D point cloud data set in step A8 (step A9). Execution of step A9 generates a 3D model with the image data applied as a texture. 【0066】 [Effects of the Embodiment] As described above, according to the embodiment, an image of the object can be applied as a texture to the 3D point cloud data of the object. 
In addition, in the embodiment, corresponding feature points between the 3D point cloud data and image data can be automatically and accurately identified, and the position where the texture is applied is determined based on these corresponding feature points. Therefore, according to the embodiment, the processing burden in generating the 3D model is reduced. Furthermore, since the texture, which is an image containing information between points, is accurately applied, the 3D model faithfully represents the real object and possesses a high degree of realism. 【0067】[Program] In this embodiment, the program is one that causes the computer to execute steps A1 to A9 shown in Figure 9. By installing and executing this program on the computer, the image processing device 10 and the image processing method can be realized. In this case, the computer's processor functions as a data acquisition unit 11, a feature point extraction unit 12, a texture image application unit 13, a calibration execution unit 14, and a parameter estimation unit 15, and performs the processing. 【0068】 The data storage unit 16 may be implemented by a storage device such as a hard disk installed in the computer, or by a storage device in another computer. Examples of computers include general-purpose PCs, server computers, smartphones, and tablet devices. 【0069】 Furthermore, in this embodiment, the program may be executed by a computer system constructed by multiple computers. In this case, for example, each computer may function as one of the following: a data acquisition unit 11, a feature point extraction unit 12, a texture image application unit 13, a calibration execution unit 14, and a parameter estimation unit 15. 【0070】 [Physical Configuration] Here, a computer that realizes the image processing device 10 by executing the program in the embodiment will be described with reference to Figure 10. 
Figure 10 is a block diagram showing an example of a computer that realizes the image processing device. 【0071】 As shown in Figure 10, the computer 110 comprises a processor 111, main memory 112, storage device 113, input interface 114, display controller 115, data reader / writer 116, and communication interface 117. Each of these components is connected to the others via a bus 121, enabling data communication. 【0072】The processor 111 may be a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or an MPU (Micro Processing Unit). Furthermore, the computer 110 may have multiple processors 111. 【0073】 The processor 111 loads the program in the embodiment, which consists of a group of codes stored in the storage device 113, into the main memory 112, and performs various calculations by executing each code in a predetermined order. The main memory 112 is typically a volatile storage device such as DRAM (Dynamic Random Access Memory). 【0074】 Furthermore, the program in this embodiment is provided stored on a computer-readable recording medium 120. The program in this embodiment may also be distributed over the internet via a communication interface 117. 【0075】 Furthermore, specific examples of the storage device 113 include hard disk drives and semiconductor storage devices such as flash memory. The input interface 114 mediates data transmission between the processor 111 and input devices 118 such as a keyboard and mouse. The display controller 115 is connected to the display device 119 and controls the display on the display device 119. 【0076】 The data reader / writer 116 mediates data transmission between the processor 111 and the recording medium 120, reads programs from the recording medium 120, and writes processing results from the computer 110 to the recording medium 120. 
The communication interface 117 mediates data transmission between the processor 111 and other computers. 【0077】 Furthermore, specific examples of the recording medium 120 include general-purpose semiconductor memory devices such as CF (CompactFlash®) and SD (Secure Digital), magnetic recording media such as flexible disks, and optical recording media such as CD-ROMs (Compact Disc Read-Only Memory). 【0078】 Furthermore, the image processing device 10 can be implemented not only by a computer with a program installed, but also by using hardware corresponding to each part, such as electronic circuits. Moreover, part of the image processing device 10 may be implemented by a program and the remainder by hardware. In this embodiment, the computer is not limited to the computer shown in Figure 10. 【0079】 Some or all of the embodiments described above can be expressed by (Note 1) to (Note 15) described below, but are not limited to the following descriptions. 【0080】 (Note 1) An image processing apparatus comprising: a data acquisition unit that acquires three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction unit that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, compares the converted three-dimensional point cloud data and the image data, and extracts corresponding feature points from each; and a texture image application unit that sets a plurality of triangles with the extracted feature points as vertices so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applies the corresponding triangles from the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion.
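The pairing of triangles described in Note 1 can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the names `TexturedMesh` and `build_textured_mesh` are hypothetical, the triangles are assumed to be supplied as index triples into the list of corresponding feature points, and pixel positions are normalised to the range 0 to 1, which is a common texture convention rather than something stated in this document.

```python
from dataclasses import dataclass

Point3D = tuple[float, float, float]
Pixel = tuple[float, float]
Triangle = tuple[int, int, int]


@dataclass
class TexturedMesh:
    vertices: list[Point3D]  # feature points in the pre-conversion point cloud
    uvs: list[Pixel]         # matching image feature points, normalised to [0, 1]
    faces: list[Triangle]    # index triples shared by both sets of triangles


def build_textured_mesh(points_3d, pixels, faces, width, height):
    """Pair each 3D triangle with the image triangle that uses the same indices.

    Hypothetical helper: points_3d[i] and pixels[i] are assumed to be one pair
    of corresponding feature points, so one index triple names both triangles.
    """
    if len(points_3d) != len(pixels):
        raise ValueError("each 3D feature point needs one image feature point")
    for face in faces:
        if len(set(face)) != 3 or not all(0 <= i < len(points_3d) for i in face):
            raise ValueError(f"invalid triangle {face}")
    # Normalise pixel coordinates so the lookup does not depend on resolution.
    uvs = [(u / width, v / height) for u, v in pixels]
    return TexturedMesh(list(points_3d), uvs, list(faces))
```

Because the vertices stay in the coordinate system of the point cloud before conversion, only the feature points carry an explicit link to the image, and every other position inside a triangle takes its colour from the corresponding image triangle.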
【0081】(Note 2) The image processing apparatus according to Note 1, wherein the texture image application unit divides part or all of the plurality of triangles whose vertices are the extracted feature points in the three-dimensional point cloud data and the image data before conversion, respectively, to set up a plurality of second triangles, and applies the corresponding second triangle of the image data as a texture to each of the second triangles set up in the three-dimensional point cloud data before conversion. 【0082】 (Note 3) The image processing apparatus according to Note 1, wherein the feature point extraction unit converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data using a camera-depth sensor conversion matrix, and then projects the three-dimensional point cloud data, whose coordinate system has been converted to the coordinate system of the image data, onto the image data using the internal parameters of the image camera, thereby comparing the converted three-dimensional point cloud data with the image data. 【0083】 (Note 4) The image processing apparatus according to Note 1, wherein the data acquisition unit acquires the three-dimensional point cloud data and the image data which are generated at the same time. 【0084】 (Note 5) The image processing apparatus described in Note 1, wherein the image data of the object is obtained by taking a picture with a single image camera. 
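The conversion and projection described in Note 3 can be sketched as follows. This is an illustration under assumptions, not the implementation of the embodiment: the camera-depth sensor conversion matrix is assumed to be a 4x4 homogeneous matrix given as nested lists, the internal parameters are assumed to be those of an undistorted pinhole model (`fx`, `fy`, `cx`, `cy`), and all function names are hypothetical.

```python
def transform_point(matrix, point):
    """Apply a 4x4 homogeneous transform (row-major nested lists) to a 3D point."""
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix[:3]
    )


def project_to_image(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point.

    Returns None when the point is not in front of the camera.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def project_point_cloud(points, sensor_to_camera, fx, fy, cx, cy):
    """Return (index, pixel) pairs for every point that projects onto the image plane.

    The index refers to the original, pre-conversion point, so a feature found
    in the projection can be traced back to the unconverted point cloud.
    """
    projected = []
    for index, point in enumerate(points):
        point_cam = transform_point(sensor_to_camera, point)
        pixel = project_to_image(point_cam, fx, fy, cx, cy)
        if pixel is not None:
            projected.append((index, pixel))
    return projected
```

Keeping the original index alongside each projected pixel is a design choice of this sketch; it reflects that the textures are ultimately applied to triangles set in the point cloud data before conversion.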
【0085】 (Note 6) An image processing method characterized by comprising: a data acquisition step of acquiring three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction step of converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, comparing the converted three-dimensional point cloud data and the image data, and extracting corresponding feature points from each; and a texture image application step of setting a plurality of triangles whose vertices are the extracted feature points so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion. 【0086】 (Note 7) The image processing method according to Note 6, wherein in the texture image application step, in each of the three-dimensional point cloud data before conversion and the image data, a part or all of the plurality of triangles whose vertices are the extracted feature points are divided to set up a plurality of second triangles, and the corresponding second triangle of the image data is applied as a texture to each of the second triangles set up in the three-dimensional point cloud data before conversion.
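The division into second triangles described in Note 7 can be illustrated with a simple scheme. The text above does not specify how the triangles are divided, so the sketch below assumes a midpoint split of every edge, which turns each triangle into four; the function names are hypothetical. The same split is applied to the 3D feature points and to the image feature points so that the second triangles remain in correspondence.

```python
def midpoint(a, b):
    """Component-wise midpoint of two points of equal dimension."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def subdivide(points_3d, pixels, faces):
    """Split every triangle into four second triangles.

    A matched midpoint is added to both data sets for each edge, so index i
    still names one 3D point and its corresponding image point.
    Assumption: midpoint splitting is one possible division scheme only.
    """
    points_3d, pixels = list(points_3d), list(pixels)
    cache = {}  # edge (i, j) -> midpoint index, so a shared edge is split once

    def mid_index(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            points_3d.append(midpoint(points_3d[i], points_3d[j]))
            pixels.append(midpoint(pixels[i], pixels[j]))
            cache[key] = len(points_3d) - 1
        return cache[key]

    second = []
    for a, b, c in faces:
        ab, bc, ca = mid_index(a, b), mid_index(b, c), mid_index(c, a)
        # Three corner triangles plus the centre triangle, same winding order.
        second += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return points_3d, pixels, second
```

In this sketch the new 3D vertices lie on the straight edges of the first triangles. Whether new vertices should instead be taken from measured points of the point cloud is not settled by the text here and is left open.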
【0087】 (Note 8) The image processing method according to Note 6, wherein in the feature point extraction step, the coordinate system of the three-dimensional point cloud data is converted to camera coordinates using a camera-depth sensor conversion matrix that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, and the three-dimensional point cloud data whose coordinate system has been converted to the coordinate system of the image data is projected onto the image data using the intrinsic parameters of the image camera, thereby comparing the converted three-dimensional point cloud data with the image data. 【0088】 (Note 9) The image processing method described in Note 6, wherein in the data acquisition step, the three-dimensional point cloud data and the image data are acquired at the same time. 【0089】 (Note 10) The image processing method described in Note 6, wherein the image data of the object is obtained by taking a picture with a single image camera. 
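Note 9 requires point cloud data and image data acquired at the same time. One possible way to select such pairs from two recorded streams is to match timestamps within a tolerance; this is an assumption made purely for illustration, and the function name `pair_by_timestamp` and the `tolerance` parameter are hypothetical.

```python
from bisect import bisect_left


def pair_by_timestamp(cloud_times, image_times, tolerance):
    """Pair each point cloud frame with the nearest image frame in time.

    Both lists hold timestamps in seconds, and image_times must be sorted in
    ascending order. Returns (cloud_index, image_index) pairs whose
    timestamps differ by at most `tolerance`; unmatched frames are skipped.
    """
    pairs = []
    for cloud_index, t in enumerate(cloud_times):
        pos = bisect_left(image_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(image_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda i: abs(image_times[i] - t))
        if abs(image_times[best] - t) <= tolerance:
            pairs.append((cloud_index, best))
    return pairs
```

A hardware trigger shared by the depth sensor and the image camera would make this matching unnecessary; the sketch only covers the case where the two devices record independently.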
【0090】(Note 11) A computer-readable recording medium that records a program that includes instructions to cause a computer to execute: a data acquisition step of acquiring three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking a picture with an image camera; a feature point extraction step of converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, comparing the converted three-dimensional point cloud data and the image data, and extracting corresponding feature points from each; and a texture image application step of setting a plurality of triangles whose vertices are the extracted feature points so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion. 【0091】 (Note 12) The computer-readable recording medium according to Note 11, wherein in the texture image application step, in the three-dimensional point cloud data and the image data before conversion, a part or all of the plurality of triangles whose vertices are the extracted feature points are divided to set up a plurality of second triangles, and the corresponding second triangle of the image data is applied as a texture to each of the second triangles set up in the three-dimensional point cloud data before conversion. 
【0092】 (Note 13) The computer-readable recording medium according to Note 11, wherein in the feature point extraction step, the coordinate system of the three-dimensional point cloud data is converted to camera coordinates using a camera-depth sensor conversion matrix that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, and the three-dimensional point cloud data whose coordinate system has been converted to the coordinate system of the image data is projected onto the image data using the intrinsic parameters of the image camera, thereby comparing the converted three-dimensional point cloud data with the image data. 【0093】 (Note 14) The computer-readable recording medium according to Note 11, wherein in the data acquisition step, the three-dimensional point cloud data and the image data that are generated at the same time are acquired. 【0094】 (Note 15) The computer-readable recording medium described in Note 11, wherein the image data of the object is obtained by taking a picture with a single image camera. 【0095】 Although the present invention has been described above with reference to embodiments, the present invention is not limited to the above embodiments. Various modifications that can be understood by those skilled in the art may be made to the structure and details of the present invention within its scope. 【0096】 This application claims priority based on Japanese Patent Application No. 2024-217143, filed on 12 December 2024, and incorporates all of its disclosures herein. 【0097】 As described above, this disclosure makes it possible to improve realism while suppressing the processing load when constructing a three-dimensional model with color. This disclosure is useful in systems that require the generation of three-dimensional models.
【0098】
10 Image processing device
11 Data acquisition unit
12 Feature point extraction unit
13 Texture image application unit
14 Calibration execution unit
15 Parameter estimation unit
16 Data storage unit
20 Depth sensor
30 Image camera
110 Computer
111 CPU
112 Main memory
113 Storage device
114 Input interface
115 Display controller
116 Data reader / writer
117 Communication interface
118 Input device
119 Display device
120 Recording medium
121 Bus
Claims
1. An image processing apparatus comprising: data acquisition means for acquiring three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking photographs with an image camera; feature point extraction means for converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, comparing the converted three-dimensional point cloud data and the image data, and extracting corresponding feature points from each; and texture image application means for setting a plurality of triangles whose vertices are the extracted feature points so that they correspond to each other in the three-dimensional point cloud data and the image data, respectively, and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion.
2. The image processing apparatus according to claim 1, wherein the texture image application means divides part or all of the plurality of triangles whose vertices are the extracted feature points in the three-dimensional point cloud data and the image data before conversion, respectively, to set up a plurality of second triangles, and applies the corresponding second triangle of the image data as a texture to each of the second triangles set up in the three-dimensional point cloud data before conversion.
3. The image processing apparatus according to claim 1, wherein the feature point extraction means converts the coordinate system of the three-dimensional point cloud data to camera coordinates using a camera-depth sensor conversion matrix that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, and compares the converted three-dimensional point cloud data with the image data by projecting the three-dimensional point cloud data, whose coordinate system has been converted to the coordinate system of the image data, onto the image data using the intrinsic parameters of the image camera.
4. The image processing apparatus according to claim 1, wherein the data acquisition means acquires the three-dimensional point cloud data and the image data which are generated at the same time.
5. The image processing apparatus according to claim 1, wherein image data of the object is obtained by taking a picture with a single image camera.
6. An image processing method characterized by: acquiring three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking pictures with an image camera; converting the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data; comparing the converted three-dimensional point cloud data and the image data and extracting corresponding feature points from each; setting up a plurality of triangles in the three-dimensional point cloud data and the image data before conversion, with the extracted feature points as vertices so as to correspond to each other; and applying the corresponding triangles in the image data as textures to each of the triangles set in the three-dimensional point cloud data before conversion.
7. The image processing method according to claim 6, wherein, in applying the textures, a part or all of the plurality of triangles whose vertices are the extracted feature points are divided, in each of the three-dimensional point cloud data before conversion and the image data, to set up a plurality of second triangles, and the corresponding second triangle of the image data is applied as a texture to each of the second triangles set in the three-dimensional point cloud data before conversion.
8. The image processing method according to claim 6, wherein, in extracting feature points, the coordinate system of the three-dimensional point cloud data is converted to camera coordinates using a camera-depth sensor conversion matrix that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, and the three-dimensional point cloud data, whose coordinate system has been converted to the coordinate system of the image data, is projected onto the image data using the intrinsic parameters of the image camera, thereby comparing the converted three-dimensional point cloud data with the image data.
9. The image processing method according to claim 6, wherein the acquiring includes acquiring the three-dimensional point cloud data and the image data that are generated at the same time.
10. The image processing method according to claim 6, wherein the image data of the object is obtained by taking a picture with a single image camera.
11. A computer-readable recording medium that records a program that causes a computer to acquire three-dimensional point cloud data of an object obtained by a depth sensor and image data of the object obtained by taking pictures with an image camera, convert the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, compare the converted three-dimensional point cloud data and the image data, extract corresponding feature points from each, set up a plurality of triangles in the pre-conversion three-dimensional point cloud data and the image data, with the extracted feature points as vertices so that they correspond to each other, and apply the corresponding triangles in the image data as textures to each of the triangles set in the pre-conversion three-dimensional point cloud data.
12. The computer-readable recording medium according to claim 11, wherein, in applying the textures, the program causes the computer to divide part or all of the plurality of triangles whose vertices are the extracted feature points, in each of the three-dimensional point cloud data before conversion and the image data, to set up a plurality of second triangles, and to apply the corresponding second triangle of the image data as a texture to each of the second triangles set in the three-dimensional point cloud data before conversion.
13. The computer-readable recording medium according to claim 11, wherein, in extracting the feature points, the program causes the computer to convert the coordinate system of the three-dimensional point cloud data to camera coordinates using a camera-depth sensor conversion matrix that converts the coordinate system of the three-dimensional point cloud data to the coordinate system of the image data, and to project the three-dimensional point cloud data, whose coordinate system has been converted to the coordinate system of the image data, onto the image data using the intrinsic parameters of the image camera, thereby comparing the converted three-dimensional point cloud data with the image data.
14. The computer-readable recording medium according to claim 11, wherein, in the acquiring, the program causes the computer to acquire the three-dimensional point cloud data and the image data that are generated at the same time.
15. The computer-readable recording medium according to claim 11, wherein the image data of the object is obtained by taking a picture with a single image camera.
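As an illustration of how a triangle of image data can supply colour to positions between the feature points, as recited in claims 1, 6, and 11, the following sketch maps a point lying on a triangle of the point cloud to a position in the corresponding image triangle. It assumes a linear (barycentric) mapping between the two triangles, which the claims do not prescribe, and the function names are hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to the 3D triangle (a, b, c).

    p is assumed to lie in the plane of the triangle.
    """
    v0, v1, v2 = _sub(b, a), _sub(c, a), _sub(p, a)
    d00, d01, d11 = _dot(v0, v0), _dot(v0, v1), _dot(v1, v1)
    d20, d21 = _dot(v2, v0), _dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    if denom == 0:
        raise ValueError("degenerate triangle")
    w_b = (d11 * d20 - d01 * d21) / denom
    w_c = (d00 * d21 - d01 * d20) / denom
    return (1.0 - w_b - w_c, w_b, w_c)


def texture_position(p, triangle_3d, triangle_image):
    """Image position whose colour belongs at point p on the 3D triangle.

    The weights computed in 3D are reused on the corresponding image
    triangle, whose vertices are the matching feature points.
    """
    weights = barycentric(p, *triangle_3d)
    return tuple(
        sum(w * vertex[k] for w, vertex in zip(weights, triangle_image))
        for k in range(2)
    )
```

With this mapping, a colour is available for every position on the triangle even though only its three vertices were matched explicitly, which is the sense in which the texture carries information between points.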