Control device, information processing system, and control method

By estimating imaging timing using sound data and controlling the movement of a moving object to align with this timing, the solution addresses image blur issues, improving the accuracy of 3D image reproduction from multiple viewpoints.

WO2026140054A1 · PCT designated stage · Publication Date: 2026-07-02 · NTT DOCOMO INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
NTT DOCOMO INC
Filing Date
2024-12-24
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

When an imaging device mounted on a moving object captures images while the object is in motion, the captured images become blurred and unclear, significantly reducing the accuracy of reproducing images viewed from arbitrary positions in a 3D space.

Method used

An estimation unit determines future imaging timing based on sound data generated by the imaging device, and a movement control unit stops the moving object during this timing to improve image clarity, using a combination of vertical and horizontal positioning of the imaging device to capture images from multiple viewpoints.

Benefits of technology

Enhances the accuracy of reproducing images viewed from any position in a 3D space by minimizing image blur during capture, allowing for precise three-dimensional image modeling.

✦ Generated by Eureka AI based on patent content.


Abstract

A control device 100 that is built into a mobile body 10 is provided with: an estimation unit 14 that, if sound data based on sound occurring during imaging by an imaging device 130 mounted on the mobile body 10 is acquired multiple times, estimates, on the basis of the multiple pieces of sound data, an imaging time during which imaging is to be carried out in the future; and a movement control unit 15 that controls the movement of the mobile body 10, and stops the mobile body 10 for a period including the estimated imaging time.

Description

Control Device, Information Processing System, and Control Method

[0001] The present invention relates to a technique for capturing images with an imaging device mounted on a moving object.

[0002] Gaussian Splatting is known as one technique for generating a 3D image model that reproduces an image viewed from an arbitrary position in a 3D space using 2D image data captured from multiple viewpoints. For example, Patent Document 1 discloses a mechanism that, when a self-propelled inspection robot stops at a location for imaging an inspection target and points its camera at the target, evaluates the imaging difficulty based on the amount of the 3D point cloud located in front of the inspection target, among the 3D point clouds that fit within the imaging angle of view.

[0003] Patent Document 1: Japanese Patent Application Laid-Open No. 2023-72514

[0004] For example, a use case is being considered where an imaging device is mounted on a moving object, and the moving object autonomously moves while the imaging device captures images of the surroundings from multiple viewpoints.

[0005] However, if the moving object is moving at the timing when the imaging device performs imaging, the captured image becomes blurred and unclear, and the accuracy when reproducing an image viewed from an arbitrary position in the 3D space is significantly reduced.

[0006] Therefore, an object of the present invention is to improve the accuracy when reproducing an image viewed from an arbitrary position in a 3D space in a 3D image model generated using image data indicating an image captured by an imaging device mounted on a moving object.

[0007] To solve the above problem, the present invention provides a control device comprising: an estimation unit that, when sound data corresponding to a sound generated in response to imaging by an imaging device mounted on a moving object has been acquired a plurality of times, estimates, based on the plurality of pieces of sound data, an imaging timing at which imaging will be performed in the future; and a movement control unit that controls the movement of the moving object and stops the moving object during a period including the estimated imaging timing.

[0008] According to the present invention, it is possible to improve the accuracy of reproducing an image viewed from any position in three-dimensional space in a three-dimensional image model generated using image data representing an image captured by an imaging device mounted on a moving object.

[0009] Figure 1 shows an example of the appearance of a mobile body 10 according to one embodiment of the present invention. Figure 2 is a block diagram showing an example of the electrical configuration of the mobile body 10 according to the same embodiment. Figure 3 is a block diagram showing an example of the functional configuration of the control device 100 provided in the mobile body 10 according to the same embodiment. Figure 4 is a plan view illustrating the imaging positions on the path of the mobile body 10 according to the same embodiment. Figures 5 to 7 illustrate the relationship between the mobile body 10 and an object O according to the same embodiment. Figure 8 is a flowchart illustrating the operation of the mobile body 10 according to the same embodiment.

[0010] [Configuration] A mobile body 10 according to one embodiment of the present invention moves along a horizontal road surface R and takes images of its surroundings. As a result, two-dimensional image data is obtained for each object around the mobile body 10 from multiple viewpoints (positions where imaging is performed), and a three-dimensional image model is generated using Gaussian Splatting to reproduce an image viewed from any position in three-dimensional space using this two-dimensional image data. In this embodiment, an example using Gaussian Splatting is described, but in the present invention, the technique for generating a three-dimensional image model to reproduce an image viewed from any position in three-dimensional space using two-dimensional image data taken from multiple viewpoints is not limited to Gaussian Splatting.

[0011] As shown in Figure 1, the mobile body 10 comprises a main body 110, a plurality of wheels 120, an imaging device 130, and a support member 140 that supports the imaging device 130. The support member 140 includes a slide control mechanism consisting of a member extending in the height direction (up-down direction), which allows the imaging device 130 to slide in the up-down direction (vertically in this embodiment) relative to the main body 110. In other words, as the mobile body 10 moves horizontally, the imaging device 130 can capture images of each object around the mobile body 10 from different horizontal viewpoints, and as the support member 140 moves the imaging device 130 vertically, the imaging device 130 can capture images of each object around the mobile body 10 from different vertical viewpoints.

[0012] Figure 2 shows an example of the electrical configuration of the mobile body 10. Physically, the mobile body 10 is configured as a computer device including a processor 1001, memory 1002, storage 1003, communication device 1004, user interface device 1005, imaging device 130, vertical sliding mechanism 1007, wheel control mechanism 1008, and a bus connecting these. In the following description, the word "device" can be read as a circuit, device, unit, etc. The electrical configuration of the mobile body 10 may be configured to include one or more of the devices shown in the figure, or it may be configured to omit some of the devices.

[0013] Each function in the mobile body 10 is realized by loading predetermined software (programs) onto hardware such as the processor 1001 and memory 1002, which allows the processor 1001 to perform calculations, control communication by the communication device 1004, and control at least one of the reading and writing of data in the memory 1002 and storage 1003.

[0014] The processor 1001 controls the entire computer, for example, by running an operating system. The processor 1001 may consist of a central processing unit (CPU) that includes interfaces with peripheral devices, control units, arithmetic units, registers, and so on.

[0015] The processor 1001 reads programs (program code), software modules, data, etc., from at least one of the storage 1003 and the communication device 1004 into the memory 1002 and executes various processes accordingly. The program used is one that causes a computer to execute at least a part of the operations described later. Functional blocks of the mobile body 10 may be stored in the memory 1002 and implemented by control programs that run on the processor 1001. Various processes may be executed by one processor 1001, but may also be executed simultaneously or sequentially by two or more processors 1001. The processor 1001 may be implemented by one or more chips. The program may also be transmitted to the mobile body 10 via a telecommunications line.

[0016] The memory 1002 is a computer-readable recording medium and may consist of at least one of the following: ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), RAM (Random Access Memory), etc. The memory 1002 may also be called a register, cache, main memory, etc. The memory 1002 can store executable programs (program code), software modules, etc., for carrying out the method according to this embodiment.

[0017] The storage 1003 is a computer-readable recording medium and may consist of at least one of the following: an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (e.g., a compact disc, a digital versatile disc, a Blu-ray® disc), a smart card, flash memory (e.g., a card, a stick, a key drive), a floppy® disk, a magnetic strip, etc. The storage 1003 may also be called an auxiliary storage device.

[0018] The communication device 1004 is hardware (a transmitting and receiving device) for communicating between computers via a communication network, and is also called a network device, network controller, network card, or communication module. The communication device 1004 exchanges data with the information processing device 200 via a communication network (not shown). The information processing device 200 is a computer that generates a three-dimensional image model using two-dimensional image data representing images captured from multiple different viewpoints by the imaging device 130. The information processing system according to this embodiment comprises the control device 100 provided in the mobile body 10 (described later) and the information processing device 200.

[0019] Each device, such as the processor 1001 and the memory 1002, is connected by a bus for communicating information. The bus may be configured using a single bus, or different buses may be configured for each device. The control device 100 according to this embodiment is realized by a computer consisting of the processor 1001, the memory 1002, and the storage 1003.

[0020] The user interface device 1005 includes an input device (e.g., keys, switches, buttons, etc.) that receives input from the user and an output device (e.g., a display, speaker, LED lamp, etc.) that provides output to the user. The input device and the output device may be configured as an integrated unit (e.g., a touchscreen).

[0021] The vertical sliding mechanism 1007 is a mechanism that moves the imaging device 130 in the vertical direction relative to the main body 110. The vertical sliding mechanism 1007 is a mechanism that allows the imaging device 130 to move along a support member that extends in the vertical direction.

[0022] The wheel control mechanism 1008 is a mechanism for controlling the wheels 120 and includes a motor mechanism for rotating the wheels 120 and a steering mechanism for changing the orientation of the axles of the wheels 120.

[0023] The sound-collecting device 1009 is a device for collecting sound, such as a microphone.

[0024] The imaging device 130 is a device that generates two-dimensional image data representing captured images, and is preferably a wide-angle camera such as a 180-degree camera or a 360-degree camera. The imaging device 130 is set, using its own setting function, to automatically capture images at predetermined time intervals. As a result, the imaging device 130 captures images at regular time intervals while the mobile body 10 travels, but the movement of the mobile body 10 is temporarily stopped at each imaging timing by a mechanism described later.

[0025] The mobile body 10 may also be composed of hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and some or all of the functional blocks may be realized by this hardware. For example, the processor 1001 may be implemented using at least one of these hardware components.

[0026] Figure 3 is a block diagram showing an example of the functional configuration of the control device 100 provided in the mobile body 10. In the control device 100, the hardware shown in Figure 2 works together to realize the functions of an acquisition unit 11, a storage unit 12, a registration unit 13, an estimation unit 14, a movement control unit 15, and a vertical movement control unit 16, as shown in Figure 3.

[0027] The acquisition unit 11 acquires image data representing images captured by the imaging device 130. This image data may represent a still image or a moving image. The acquired image data is stored in the storage unit 12. The acquisition unit 11 also acquires sound data representing sounds collected by the sound-collecting device 1009. The acquired sound data is stored in the storage unit 12.

[0028] The registration unit 13 registers (stores) reference data in the storage unit 12 to determine whether the sound data acquired by the acquisition unit 11 includes sound data indicating sounds generated in response to imaging by the imaging device 130. The registration unit 13 actually records sounds generated in response to imaging by the imaging device 130 and registers sound data indicating those sounds as reference data. The imaging device 130 generally emits a sound that imitates the shutter sound of an analog camera, etc., at the timing of imaging. The sound generated in response to imaging by the imaging device 130 is the sound emitted from the imaging device 130 at the time of imaging, and is, for example, an audible sound that imitates this shutter sound, etc. However, the sound generated in response to imaging by the imaging device 130 is not limited to audible sounds, and may be inaudible sounds such as ultrasound.

[0029] Furthermore, the registration unit 13 may register, as reference data, characteristic data representing the sound characteristics of each sound generated in response to imaging by the multiple types of imaging devices 130 that can be mounted on the mobile body 10. Since the sounds emitted from the imaging device 130 during imaging may differ depending on the model and settings of the imaging device 130, the reference data may represent characteristics common to these sounds rather than the sounds themselves.
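The description does not specify how acquired sound data is compared with the reference data. As a minimal sketch only, assuming the reference data is a short waveform template and that a normalized correlation above a threshold counts as a match, the comparison could look like the following. The function names, the threshold of 0.8, and the sample values are all hypothetical and are not taken from the disclosure.

```python
import math


def normalized_correlation(window, reference):
    """Return the normalized correlation (-1.0 to 1.0) of two equal-length sequences."""
    if len(window) != len(reference):
        raise ValueError("sequences must have the same length")
    mean_w = sum(window) / len(window)
    mean_r = sum(reference) / len(reference)
    dw = [x - mean_w for x in window]
    dr = [y - mean_r for y in reference]
    denom = math.sqrt(sum(x * x for x in dw) * sum(y * y for y in dr))
    if denom == 0.0:
        # A flat (silent) window carries no shape to compare against.
        return 0.0
    return sum(x * y for x, y in zip(dw, dr)) / denom


def find_imaging_sound(samples, reference, threshold=0.8):
    """Return sample offsets at which the reference template matches (assumed criterion)."""
    n = len(reference)
    return [
        i
        for i in range(len(samples) - n + 1)
        if normalized_correlation(samples[i:i + n], reference) >= threshold
    ]


# Hypothetical reference template and a recording that contains it once.
reference = [0.0, 1.0, -1.0, 0.5]
recording = [0.0] * 5 + reference + [0.0] * 5
hits = find_imaging_sound(recording, reference)
```

A real implementation would more likely compare spectral features, which is closer to the "common characteristics" mentioned above; the time-domain template is used here only to keep the sketch short.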

[0030] When sound data corresponding to sounds generated in response to imaging by the imaging device 130 has been acquired multiple times by the acquisition unit 11, the estimation unit 14 estimates the timing of future imaging based on those pieces of sound data. Specifically, the estimation unit 14 compares the sound data acquired by the acquisition unit 11 with the reference data stored in the storage unit 12 and determines whether the acquired sound data includes sound data indicating a sound generated in response to imaging by the imaging device 130. Because the imaging device 130 is set, as mentioned above, to perform imaging automatically at predetermined time intervals, when the estimation unit 14 makes this determination multiple times in succession, it calculates the time interval between the acquisition times of the corresponding pieces of sound data and uses the calculated interval as the imaging interval to estimate the timing of future imaging. For example, if the interval between acquisitions of sound data corresponding to the imaging sound is T (seconds), the time T (seconds) after the Nth acquisition of such sound data is estimated to be the (N+1)th imaging timing.
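The interval-based estimate in this paragraph can be sketched as follows. This is an illustration under stated assumptions: the detection times are taken to be timestamps in seconds at which the imaging sound was recognized, and the median of the gaps is used as T to tolerate a slightly late or early detection. The use of the median and the function names are assumptions; the description only says that the interval between acquisitions is used.

```python
from statistics import median


def estimate_imaging_interval(detection_times):
    """Estimate the imaging interval T (seconds) from consecutive detection times."""
    if len(detection_times) < 2:
        raise ValueError("at least two detections are needed to estimate an interval")
    gaps = [later - earlier for earlier, later in zip(detection_times, detection_times[1:])]
    return median(gaps)


def estimate_next_imaging_time(detection_times):
    """Estimate the (N+1)th imaging time as the Nth detection time plus T."""
    return detection_times[-1] + estimate_imaging_interval(detection_times)


# Hypothetical detections of the imaging sound at 0 s, 5 s, and 10 s.
detections = [0.0, 5.0, 10.0]
interval = estimate_imaging_interval(detections)
next_time = estimate_next_imaging_time(detections)
```

With the hypothetical detections above, the interval is 5 seconds and the next imaging is expected at 15 seconds.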

[0031] The movement control unit 15 controls the movement of the mobile body 10. In the space to be imaged by the mobile body 10, the path along which the mobile body 10 moves while imaging is predetermined, and path data indicating that path is stored in the storage unit 12. The movement control unit 15 refers to this path data and controls the wheels 120 so that the mobile body 10 moves along that path.

[0032] Figure 4 is a plan view illustrating the imaging positions along the path of the moving body 10. As mentioned above, the imaging device 130 is set to automatically perform imaging at predetermined time intervals. Therefore, taking into account the movement speed of the moving body 10 and the time required to change its direction of movement, the imaging positions along the path can be specified, for example, as positions s1, s2, s3, and so on, shown as circles in Figure 4. While controlling the movement of the moving body 10, the movement control unit 15 stops the moving body 10 for a period that includes the imaging timing estimated by the estimation unit 14 (for example, from a few seconds before to a few seconds after the imaging timing). This is because imaging an object while the moving body 10 is stopped results in less blurring of the captured image than imaging the object while the moving body 10 is moving.
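The stop period around an estimated imaging timing can be expressed as a simple time window. The sketch below assumes margins of 2 seconds before and after the imaging timing, which is only one reading of "a few seconds"; the function names and default values are hypothetical.

```python
def stop_window(imaging_time, margin_before=2.0, margin_after=2.0):
    """Return (start, end) of the period during which the moving body is kept stopped."""
    if margin_before < 0 or margin_after < 0:
        raise ValueError("margins must be non-negative")
    return (imaging_time - margin_before, imaging_time + margin_after)


def must_be_stopped(now, imaging_time, margin_before=2.0, margin_after=2.0):
    """Return True if the current time falls inside the stop window."""
    start, end = stop_window(imaging_time, margin_before, margin_after)
    return start <= now <= end


# Hypothetical estimated imaging timing at 15 s.
window = stop_window(15.0)
```

A controller would call `must_be_stopped` on each control cycle and hold the wheels still while it returns True.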

[0033] The vertical movement control unit 16 controls the vertical sliding mechanism 1007 to move the imaging device 130 up and down. The space to be imaged by the imaging device 130 often contains objects that differ in shape or size in the vertical direction, such as desks, chairs, and plants. If images of such objects are taken from only one viewpoint in the vertical direction, the shape or size of the objects above or below that viewpoint will not be captured with sufficient accuracy. Therefore, it is necessary to capture images of each object from different viewpoints in the vertical direction to accurately reproduce the shape and size of the objects in three dimensions.

[0034] The operating modes of the moving body 10 include a first operating mode in which the imaging device 130 takes images from the same reference imaging position in the vertical direction while the moving body 10 is stopped, and a second operating mode in which the imaging device 130 takes images from different imaging positions in the vertical direction while the moving body 10 is stopped. During the period when the moving body 10 is moving, the imaging device 130 is positioned at the lowest position by the vertical movement control unit 16.

[0035] Then, as shown in Figure 5, in the first operating mode for imaging, similar to the moving period described above, the imaging device 130 is positioned at its lowest position by the vertical movement control unit 16 to perform imaging. After imaging of the surrounding object O is first performed in the first operating mode at a certain position along the path, the device then transitions to the second operating mode while remaining at that position along the path, and the imaging device 130 is positioned at a higher position than in the first operating mode by the vertical movement control unit 16 to perform imaging of the object O.

[0036] In the second operating mode, the movement control unit 15 stops the moving body 10 from the start to the end of imaging from different imaging positions. For example, as shown in Figures 6 and 7, imaging is performed at a first position higher than in the first operating mode, and at a second position even higher than the first position. This ensures that the object O is sufficiently imaged from different viewpoints in the vertical direction. The movement control unit 15 stops the moving body from the start of imaging in the first operating mode until the end of imaging from different imaging positions in the second operating mode, which is continuous with the first operating mode.

[0037] For example, at position s1 illustrated in Figure 4, imaging is first performed in the first operating mode, then the system transitions to the second operating mode at position s1, and imaging is performed from multiple viewpoints at different heights than those in the first operating mode. Once imaging in the second operating mode at position s1 is completed, the movement control unit 15 resumes movement along the path, and imaging in the first operating mode is started at the next position s2.

[0038] The number of images taken in each operating mode is predetermined: for example, one image in the first operating mode at a certain position along the path, and two images in the second operating mode at the same position. For example, at position s1 illustrated in Figure 4, one image is taken in the first operating mode, then the system transitions to the second operating mode at position s1, and two images are taken from viewpoints at different heights than that of the first operating mode. Once the two images in the second operating mode at position s1 have been taken, the movement control unit 15 resumes movement along the path, and imaging in the first operating mode is then started at position s2. In this case, the movement control unit 15 and the vertical movement control unit 16 count the number of imaging timings estimated by the estimation unit 14 from the start of movement, and determine, based on the counted number, whether to control the moving body 10 in the first operating mode or the second operating mode.
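The count-based mode selection described in this paragraph can be sketched as a mapping from the running imaging count to an operating mode. The example assumes the one-plus-two pattern given above; the tuple layout, the 1-based position index, and the height level numbering (0 for the reference position, 1 and 2 for the higher positions) are illustrative assumptions.

```python
FIRST_MODE_SHOTS = 1   # images per position in the first operating mode (from the example)
SECOND_MODE_SHOTS = 2  # images per position in the second operating mode (from the example)


def mode_for_count(count):
    """Map the imaging count since the start of movement to (mode, position, height level)."""
    if count < 1:
        raise ValueError("count starts at 1")
    shots_per_position = FIRST_MODE_SHOTS + SECOND_MODE_SHOTS
    position_index, slot = divmod(count - 1, shots_per_position)
    if slot < FIRST_MODE_SHOTS:
        # Reference (lowest) imaging position.
        return ("first", position_index + 1, 0)
    # Higher imaging positions, numbered 1, 2, ... within the second operating mode.
    return ("second", position_index + 1, slot - FIRST_MODE_SHOTS + 1)


# The first four imaging timings after movement starts.
schedule = [mode_for_count(n) for n in range(1, 5)]
```

Here the fourth imaging timing falls on the first operating mode at the second position, which corresponds to s2 in Figure 4.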

[0039] [Operation] Next, the operation of the mobile body 10 will be described. The procedures for each process shown in Figure 8 are described in the program stored in the mobile body 10.

[0040] In Figure 8, before the movement control unit 15 starts movement, the imaging device 130 captures multiple images at predetermined time intervals while the moving body 10 is stopped. As a result, the acquisition unit 11 acquires multiple pieces of sound data representing the sounds generated in response to imaging by the imaging device 130, as collected by the sound-collecting device 1009 (step S11; YES). The acquired pieces of sound data are stored (registered) in the storage unit 12 via the registration unit 13.

[0041] The estimation unit 14 calculates the imaging interval of the imaging device 130 based on the timing of sound acquisition of multiple stored sound data (step S12).

[0042] Next, when the movement and imaging of the moving body 10 are started (step S13; YES), the estimation unit 14 estimates the next imaging timing (step S14). When the time reaches a predetermined period before that imaging timing (step S15; YES), the moving body 10 is stopped (step S16). In this stopped state, imaging in the first operation mode is performed, followed by imaging in the second operation mode. When imaging in the second operation mode ends, the movement control unit 15 determines that the moving body 10 is to move (step S17; YES) and resumes the movement of the moving body 10 (step S18).
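The stop and resume sequence of steps S14 to S18 can be sketched as a precomputed timetable. This is a simplified illustration, not the flowchart of Figure 8 itself: it assumes a constant imaging interval, three images per stop (one in the first operation mode and two in the second), and fixed margins before the first image and after the last image of each stop. All names and default values are hypothetical.

```python
def plan_stops(first_imaging_time, interval, positions,
               shots_per_stop=3, lead=1.0, lag=1.0):
    """Return a list of (stop_at, resume_at) times, one pair per imaging position."""
    if interval <= lead + lag:
        # Without this, the body would have no time left to travel between stops.
        raise ValueError("interval must exceed lead + lag")
    plan = []
    for p in range(positions):
        first_shot = first_imaging_time + interval * (p * shots_per_stop)
        last_shot = first_shot + interval * (shots_per_stop - 1)
        # Stop a predetermined period before the first image (S15, S16) and
        # resume after the last image in the second operation mode (S17, S18).
        plan.append((first_shot - lead, last_shot + lag))
    return plan


# Hypothetical run: first imaging at 10 s, one image every 5 s, two positions.
timetable = plan_stops(10.0, 5.0, 2)
```

In this hypothetical run the body is stopped from 9 s to 21 s, travels for 3 seconds, and is stopped again from 24 s to 36 s.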

[0043] The above processes of steps S12 to S18 are repeated until the moving body 10 has traveled the entire path. Each time the imaging device 130 performs imaging, the acquisition unit 11 acquires image data representing the image captured by the imaging device 130. The information processing device 200 generates a three-dimensional image model using two-dimensional image data representing images captured from a plurality of different viewpoints by the imaging device 130.

[0044] According to the embodiment described above, it is possible to improve the accuracy of reproducing an image viewed from an arbitrary position in three-dimensional space in a three-dimensional image model generated using image data representing images captured by the imaging device 130 mounted on the moving body 10.

[0045] [Modification Example] The present invention is not limited to the above-described embodiment. The above-described embodiment may be modified as follows. Also, two or more of the following modification examples may be combined and implemented.

[0046] [Modification Example 1] The moving body 10 may include a detection unit that detects the remaining amount of power for driving the moving body 10 itself, and the movement control unit 15 may control the movement speed of the moving body 10 according to the remaining amount detected by the detection unit. For example, the movement control unit 15 controls the moving body 10 to move at the maximum speed achievable with the remaining amount detected by the detection unit.

[0047] [Modification Example 2] The sound generated in response to imaging by the imaging device 130 may differ between the first operation mode and the second operation mode. In this case, the registration unit 13 registers reference data for the sound generated in response to imaging by the imaging device 130 in the first operation mode and reference data for the sound generated in response to imaging by the imaging device 130 in the second operation mode. The estimation unit 14 compares the sound data acquired by the acquisition unit 11 with the reference data stored in the storage unit 12, and determines whether the sound data acquired by the acquisition unit 11 includes sound data indicating the sound generated in response to imaging by the imaging device 130. When the level of ambient noise is high, the accuracy of the determination or estimation by the estimation unit 14 may decrease. With this modification example, it is possible at least to discriminate whether the operation mode of the moving body 10 is the first operation mode or the second operation mode. As a result, the accuracy of the determination or estimation by the estimation unit 14 is improved.

[0048] [Modification Example 3] The vertical movement control unit 16 may move the imaging device 130 in a direction including at least a vertical direction component. For example, the vertical movement control unit 16 may control a multi-joint robot arm provided with the imaging device 130 to move the imaging device 130 in a direction including at least a vertical direction component.

[0049] [Other Modification Examples] Note that the block diagrams used in the description of the above embodiment show functional unit blocks. These functional blocks (components) are realized by an arbitrary combination of at least one of hardware and software. Also, the method for realizing each functional block is not particularly limited. That is, each functional block may be realized using one physically or logically combined device, or two or more physically or logically separated devices may be directly or indirectly (for example, using wired, wireless, etc.) connected, and realized using these multiple devices. The functional block may be realized by combining software with the above one device or the above multiple devices.

[0050] The processing procedures, sequences, flowcharts, etc., of each aspect / embodiment described in this disclosure may be reordered, provided they do not contradict each other. For example, the methods described in this disclosure present various step elements using exemplary order and are not limited to the specific order presented.

[0051] Functions include, but are not limited to, judgment, decision, determination, calculation, computation, processing, derivation, investigation, exploration, confirmation, reception, transmission, output, access, resolution, selection, choice, establishment, comparison, assumption, expectation, consideration, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating (mapping), and assigning. For example, a functional block (configuration part) that enables transmission is called a transmitting unit or transmitter. In all cases, as mentioned above, the method of implementation is not particularly limited.

[0052] The notification of information is not limited to the embodiments described herein and may be carried out by other means. For example, the notification of information may be carried out by physical layer signaling (e.g., DCI (Downlink Control Information), UCI (Uplink Control Information)), upper layer signaling (e.g., RRC (Radio Resource Control) signaling, MAC (Medium Access Control) signaling, broadcast information (MIB (Master Information Block), SIB (System Information Block))), other signals, or combinations thereof. RRC signaling may also be called RRC messages, and may be, for example, RRC Connection Setup messages, RRC Connection Reconfiguration messages, etc.

[0053] Each aspect / embodiment described in this disclosure may be applied to at least one of the following: LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G (4th generation mobile communication system), 5G (5th generation mobile communication system), FRA (Future Radio Access), NR (New Radio), W-CDMA®, GSM®, CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi®), IEEE 802.16 (WiMAX®), IEEE 802.20, UWB (Ultra-Wide Band), Bluetooth®, and other appropriate systems, as well as next-generation systems extended based thereon. Furthermore, multiple systems may be applied in combination (for example, a combination of at least one of LTE and LTE-A with 5G).

[0054] Information can be output from a higher layer (or lower layer) to a lower layer (or higher layer). Input and output may also occur via multiple network nodes.

[0055] Input and output information may be stored in a specific location (e.g., memory) or managed using a management table. Input and output information may be overwritten, updated, or appended to. Output information may be deleted. Input information may be transmitted to other devices.

[0056] The determination may be made by a value represented by one bit (0 or 1), by a boolean value (true or false), or by a numerical comparison (for example, a comparison with a predetermined value).

[0057] Each aspect / embodiment described in this disclosure may be used individually, in combination, or switched between as needed during implementation. Furthermore, notification of specific information (e.g., notification that "X is") is not limited to explicit notification, but may also be implicit (e.g., by not providing such notification).

[0058] Although the present disclosure has been described in detail above, it will be clear to those skilled in the art that the present disclosure is not limited to the embodiments described herein. The present disclosure can be implemented in modified and altered forms without departing from the intent and scope of the present disclosure as defined by the claims. Therefore, the descriptions in the present disclosure are illustrative and not intended to be restrictive in any way.

[0059] Software should be broadly interpreted to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, procedures, functions, and so on, whether they are called software, firmware, middleware, microcode, hardware description languages, or by any other name.

[0060] Furthermore, software, instructions, information, etc., may be transmitted and received via a transmission medium. For example, if software is transmitted from a website, server, or other remote source using at least one of wired technology (such as coaxial cable, fiber optic cable, twisted pair, or digital subscriber line (DSL)) and wireless technology (such as infrared or microwave), then at least one of these wired and wireless technologies is included in the definition of a transmission medium.

[0061] The information, signals, etc. described in this disclosure may be represented using any of the various different techniques. For example, the data, instructions, commands, information, signals, bits, symbols, chips, etc. that may be referred to throughout the above description may be represented by voltage, current, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.

[0062] In addition, terms used in this disclosure and terms necessary for understanding this disclosure may be replaced with terms having the same or similar meanings. For example, at least one of the channel and symbol may be a signal (signaling). Also, a signal may be a message. Furthermore, a component carrier (CC) may be called a carrier frequency, cell, frequency carrier, etc.

[0063] The terms “system” and “network” as used in this disclosure are interchangeable.

[0064] Furthermore, the information, parameters, etc., described in this disclosure may be expressed using absolute values, relative values from a given value, or other corresponding information. For example, wireless resources may be indicated by an index.

[0065] The names used for the parameters described above are not restrictive in any way. Furthermore, the formulas and other expressions using these parameters may differ from those expressly disclosed in this disclosure. Various channels (e.g., PUCCH, PDCCH, etc.) and information elements can be identified by any suitable name, and therefore, the various names assigned to these various channels and information elements are not restrictive in any way.

[0066] In this disclosure, terms such as “Mobile Station (MS),” “user terminal,” “User Equipment (UE),” and “terminal” may be used interchangeably. A mobile station may also be referred to by those skilled in the art as a subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, pedestrian agent, mobile client, client, or several other appropriate terms.

[0067] The moving body 10 may also be called a transmitting device, a receiving device, a communication device, etc.

[0068] The terms "determining" and "deciding" can encompass a wide variety of actions. "Determining" and "deciding" can include, for example, judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., searching in tables, databases, or other data structures), and ascertaining. "Determining" and "deciding" can also include receiving (e.g., receiving information), transmitting (e.g., sending information), input, output, and accessing (e.g., accessing data in memory). Furthermore, "determining" and "deciding" can include considering something as having been "determined" or "decided" after resolving, selecting, choosing, establishing, comparing, etc. In other words, "determining" and "deciding" can include considering something as having been "determined" or "decided" after some action. Also, "determining (deciding)" can be reinterpreted as "assuming," "expecting," or "considering."

[0069] The terms “connected,” “coupled,” or any variation thereof, mean any direct or indirect connection or coupling between two or more elements, and may include the presence of one or more intermediate elements between two elements that are “connected” or “coupled” with each other. The coupling or connection between elements may be physical, logical, or a combination thereof. For example, “connection” may be reinterpreted as “access.” As used in this disclosure, two elements may be considered to be “connected” or “coupled” with each other using at least one of one or more wires, cables, and printed electrical connections, and, in some non-limiting and non-exclusive examples, electromagnetic energy having wavelengths in the radio frequency domain, microwave domain, and optical (both visible and invisible) domain.

[0070] In this disclosure, the phrase "based on" does not mean "based solely on" unless otherwise specified. In other words, the phrase "based on" means both "based solely on" and "based at least on."

[0071] In the configuration of each of the above devices, "means" may be replaced with "part," "circuit," "device," etc.

[0072] Where the terms “include,” “including,” and variations thereof are used in this disclosure, these terms are intended to be inclusive, as is the term “comprising.” Furthermore, the term “or” as used in this disclosure is not intended to mean exclusive OR.

[0073] In this disclosure, where articles such as "a," "an," and "the" in English are added through translation, the nouns following these articles may be plural.

[0074] In this disclosure, the term "A and B are different" may mean "A and B are different from each other." The term may also mean "A and B are each different from C." Terms such as "separate" and "combine" may be interpreted similarly to "different."

[0075] 10...Moving body, 1001...Processor, 1002...Memory, 1003...Storage, 1004...Communication device, 1005...User interface device, 1007...Up / down sliding mechanism, 1008...Wheel control mechanism, 1009...Sound pickup device, 11...Acquisition unit, 12...Storage unit, 13...Registration unit, 14...Estimation unit, 15...Movement control unit, 16...Up / down movement control unit, 100...Control device, 110...Main body, 120...Wheels, 130...Imaging device, 140...Support member, R...Road surface, O...Object, V...Up / down direction.

Claims

1. A control device comprising: an estimation unit that, when sound data corresponding to sounds generated in response to imaging by an imaging device mounted on a moving body is acquired multiple times, estimates, based on the multiple pieces of sound data, an imaging time at which imaging will be performed in the future; and a movement control unit that controls the movement of the moving body, wherein the movement control unit stops the moving body during a period including the estimated imaging time.

2. The control device according to claim 1, characterized in that the movement control unit starts moving the moving body after the imaging time has been estimated by the estimation unit based on the sound data acquired multiple times while the moving body is stopped.

3. The control device according to claim 1, comprising a detection unit for detecting the remaining amount of power for driving the moving body, wherein the movement control unit controls the movement speed of the moving body according to the detected remaining amount.

4. The control device according to claim 1, having a first operation mode in which the moving body is stopped and the imaging device captures images from the same reference imaging position in the vertical direction, and a second operation mode in which the moving body is stopped and the imaging device captures images from different imaging positions in the vertical direction, wherein the movement control unit keeps the moving body stopped from the start of imaging in the first operation mode until imaging from the different imaging positions in the second operation mode, which is continuous with the first operation mode, is completed.

5. The control device according to claim 4, characterized in that the sound produced in response to imaging by the imaging device differs between the first operation mode and the second operation mode.

6. The control device according to claim 1, further comprising a registration unit that collects sound generated in response to imaging by the imaging device mounted on the moving body and registers sound data indicating said sound.

7. The control device according to claim 1, further comprising a registration unit for registering characteristic data representing the characteristics of sounds generated in response to imaging by a plurality of types of imaging devices that can be mounted on the moving body.

8. An information processing system comprising: a control device according to any one of claims 1 to 7; and an information processing device that generates a three-dimensional image model for reconstructing an image viewed from an arbitrary position in three-dimensional space using two-dimensional image data showing images captured from multiple different positions by the imaging device.

9. The information processing system according to claim 8, characterized in that the information processing device generates the three-dimensional image model using Gaussian Splatting.

10. A control method characterized by comprising: an estimation step of estimating, based on multiple pieces of sound data acquired in response to imaging by an imaging device mounted on a moving body, an imaging time at which imaging will be performed in the future; and a movement control step in which a movement control unit controls the movement of the moving body and stops the moving body during a period including the estimated imaging time.
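As an illustration only, the estimation and stop control recited in claims 1 and 10 might be sketched as follows. The claims do not specify how the imaging time is estimated, so this sketch assumes the imaging device fires at a roughly constant interval and uses the median gap between detected sounds as that interval. The function names (estimate_next_imaging_time, stop_window, should_be_stopped) and the margin parameter are hypothetical and do not appear in the disclosure.

```python
from statistics import median


def estimate_next_imaging_time(sound_times, now):
    """Estimate the next imaging time from past shutter-sound timestamps.

    Assumption (not stated in the claims): imaging repeats at a roughly
    constant interval, so the median gap between detections is the period.
    """
    if len(sound_times) < 2:
        raise ValueError("need at least two sound detections")
    ordered = sorted(sound_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    period = median(gaps)
    if period <= 0:
        raise ValueError("sound timestamps must be distinct")
    next_time = ordered[-1] + period
    # Skip predicted imaging times that already lie in the past.
    while next_time < now:
        next_time += period
    return next_time


def stop_window(imaging_time, margin):
    """Return the (start, end) period during which the moving body stays stopped.

    The margin on each side of the estimated time is a hypothetical parameter.
    """
    return (imaging_time - margin, imaging_time + margin)


def should_be_stopped(now, window):
    """True while the current time falls inside the stop period."""
    start, end = window
    return start <= now <= end


if __name__ == "__main__":
    # Sounds detected every 5 s; the current time is 16 s.
    t = estimate_next_imaging_time([0.0, 5.0, 10.0, 15.0], now=16.0)
    w = stop_window(t, margin=0.5)
    print(t, w, should_be_stopped(20.0, w))
```

The median is used here rather than the mean so that a single missed or spurious sound detection shifts the estimated interval less. A movement controller would call should_be_stopped on each control cycle and hold the wheels while it returns True.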