Data augmentation method and device for performing same

The data augmentation method addresses training challenges by generating and adjusting copy data from point clouds and images to enhance diversity and consistency, improving autonomous driving path generation.

WO2026142347A1 · PCT designated stage · Publication Date: 2026-07-02 · 42DOT INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
42DOT INC
Filing Date
2025-12-24
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing autonomous driving models face challenges in training data diversity and overfitting, particularly when generating paths for vehicles, necessitating improved data augmentation techniques.

Method used

A data augmentation method and device that generates copy data for target objects based on bounding boxes from point clouds and images, adjusting placements to avoid occlusions and maintain spatial consistency across frames, enhancing training data variety.

Benefits of technology

Enhances the diversity and consistency of training data for autonomous driving models, reducing overfitting and improving the generation of accurate driving paths.



Abstract

A data augmentation method is disclosed. A method according to one embodiment may comprise the operations of: obtaining first data including an image and a point cloud for surroundings of a vehicle from a sensor mounted on the vehicle; generating data to copy corresponding to a target object to copy among a plurality of objects included in the first data; and synthesizing the data to copy with second data obtained by sensing the surroundings of the vehicle, on the basis of the location and size of the target object in the first data.

Description

Data augmentation method and device for performing the same

[0001] The following disclosure relates to a data augmentation method and an apparatus for performing the same.

[0002] An end-to-end model is a type of machine learning model that derives output data by performing operations directly on input data without intermediate processing steps or separate components. Based on information obtained from sensors of an autonomous driving vehicle (ADV), an end-to-end model can generate a path for the vehicle to drive (e.g., a target trajectory).

[0003] Meanwhile, to improve the performance of autonomous driving path generation using an end-to-end model, it may be important to train the model using various types of data.

[0004] The background technology described above is possessed or acquired by the inventor in the process of deriving the content of the disclosure of the present application, and cannot necessarily be considered as prior art disclosed to the general public prior to the filing of this application.

[0005] One embodiment may provide data augmentation technology for training an autonomous driving model.

[0006] One embodiment can augment continuous data such that the same object is represented consistently across the data.

[0007] However, technical challenges are not limited to the technical challenges described above, and other technical challenges may exist.

[0008] A method according to one embodiment may include the operation of acquiring first data including an image and a point cloud of the surroundings of the vehicle from a sensor mounted on the vehicle, the operation of generating copy data corresponding to a target object to copy among a plurality of objects included in the first data, and the operation of synthesizing the copy data to second data acquired by sensing the surroundings of the vehicle based on the location and size of the target object in the first data.

[0009] According to one embodiment, the operation of generating the copy data may include the operation of obtaining a bounding box corresponding to the target object from the point cloud, the operation of projecting the bounding box onto the image to extract an image patch corresponding to the target object from the image, and the operation of storing the image patch as the copy data.

[0010] According to one embodiment, the synthesizing operation may include placing the copy data at the location on the second data and adjusting the location where the copy data is placed.

[0011] According to one embodiment, the adjusting operation may include an operation of determining whether the target object is occluded by another object within the second data, and an operation of, in response to a determination that occlusion has occurred, changing the location where the copy data is placed within a predetermined distance until the occlusion disappears.

[0012] According to one embodiment, the second data may include a plurality of frames that sense the surroundings of the vehicle.

[0013] According to one embodiment, the synthesizing operation may include placing the copy data in the second data such that the target object maintains the same spatial location in the plurality of frames.

[0014] According to one embodiment, the placing operation may include the operation of placing the copy data at the location such that the target object is located on the ground in the second data.

[0015] According to one embodiment, a computer-readable recording medium storing one or more computer programs may include instructions for performing the method in a processor.

[0016] An apparatus according to one embodiment may include at least one processor and a memory including instructions. Based on the instructions being executed individually or collectively by the at least one processor, the apparatus may acquire first data including an image and a point cloud of the vehicle's surroundings from a sensor mounted on the vehicle, generate copy data corresponding to a target object to copy among a plurality of objects included in the first data, and synthesize the copy data with second data acquired by sensing the vehicle's surroundings based on the location and size of the target object in the first data.

[0017] According to one embodiment, based on the instructions being executed individually or collectively by at least one processor, the device may obtain a bounding box corresponding to the target object from the point cloud, project the bounding box onto the image to extract an image patch corresponding to the target object from the image, and store the image patch as the copy data.

[0018] According to one embodiment, based on the instructions being executed individually or collectively by the at least one processor, the device may place the copy data on the second data at the location and adjust the location where the copy data is placed.

[0019] According to one embodiment, based on the instructions being executed individually or collectively by the at least one processor, the device may determine whether the target object is occluded by another object within the second data and, in response to a determination that occlusion has occurred, change the location where the copy data is placed within a predetermined distance until the occlusion disappears.

[0020] According to one embodiment, the second data may include a plurality of frames that sense the surroundings of the vehicle.

[0021] According to one embodiment, based on the instructions being executed individually or collectively by the at least one processor, the device may place the copy data in the second data so that the target object maintains the same spatial location in the plurality of frames.

[0022] According to one embodiment, based on the instructions being executed individually or collectively by the at least one processor, the device may place the copy data at the location such that the target object is located on the ground in the second data.

[0023] In relation to the description of the drawings, the same or similar reference numerals may be used for identical or similar components.

[0024] FIG. 1 is a drawing for illustrating a data augmentation system according to one embodiment.

[0025] FIG. 2 is a diagram illustrating an end-to-end based path generation framework according to one embodiment.

[0026] FIG. 3 is a diagram illustrating a data augmentation process according to one embodiment.

[0027] FIGS. 4a and 4b are drawings illustrating examples of augmented data according to one embodiment.

[0028] FIG. 5 is a flowchart illustrating a method according to one embodiment.

[0029] FIG. 6 is a schematic block diagram of an electronic device according to one embodiment.

[0030] Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only and may be modified and implemented in various forms. Accordingly, actual implementations are not limited to the specific embodiments disclosed, and the scope of this specification includes modifications, equivalents, or substitutions included in the technical concept described by the embodiments.

[0031] Terms such as "first" or "second" may be used to describe various components, but these terms should be interpreted solely for the purpose of distinguishing one component from another. For example, the first component may be named the second component, and similarly, the second component may be named the first component.

[0032] When it is stated that a component is "connected" to another component, it should be understood that it may be directly connected to or coupled with that other component, or that there may be other components in between.

[0033] Singular expressions include plural expressions unless the context clearly indicates otherwise. In this document, phrases such as “A or B,” “at least one of A and B,” “at least one of A or B,” “A, B or C,” “at least one of A, B and C,” and “at least one of A, B, or C” may each include any one of the items listed together with the corresponding phrase, or all possible combinations thereof. In this specification, terms such as “comprising” or “having” are intended to designate the existence of the described feature, number, step, action, component, part, or combination thereof, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, actions, components, parts, or combinations thereof.

[0034] Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as generally understood by those skilled in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant technology, and should not be interpreted in an ideal or overly formal sense unless explicitly defined in this specification.

[0035] As used herein, the term "module" may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, component, or circuit. A module may be a component formed integrally, or a minimum unit of said component or a part thereof that performs one or more functions. For example, according to one embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

[0036] As used in this document, the term "part" refers to software or hardware components, such as FPGAs or ASICs, and the "part" performs certain roles. However, the meaning of "part" is not limited to software or hardware. The "part" may be configured to reside in an addressable storage medium or configured to operate one or more processors. For example, the "part" may include components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functions provided within the components and "parts" may be combined into a smaller number of components and "parts" or further separated into additional components and "parts." Furthermore, the components and "parts" may be implemented to operate one or more CPUs within a device or secure multimedia card. Additionally, the "part" may include one or more processors.

[0037] Hereinafter, embodiments will be described in detail with reference to the attached drawings. In the description with reference to the attached drawings, identical components are given the same reference numeral regardless of the drawing number, and redundant descriptions thereof will be omitted.

[0038]

[0039] FIG. 1 is a drawing for illustrating a data augmentation system according to one embodiment.

[0040] Referring to FIG. 1, according to one embodiment, a data augmentation system (e.g., a data augmentation system (10)) may be a system for augmenting data used to train a model (e.g., an end-to-end model) for autonomous driving of a vehicle (e.g., a vehicle (110)).

[0041] The data augmentation system (10) may include a data augmentation device (100), a server (130), and a vehicle (110). The data augmentation device (100) may be a device that augments data used to train a model that generates a driving path for the vehicle (110). The data augmentation device (100) may be mounted on the vehicle (110), but is not limited thereto. For example, the data augmentation device (100) may be a software module implemented in the processor of the vehicle (110), but may also be mounted on a device external to the vehicle (110) (e.g., a server (130)) to augment data. The data augmentation device (100) may acquire first data including an image and a point cloud of the surroundings of the vehicle (110) from a sensor mounted on the vehicle (110). The sensor mounted on the vehicle (110) may include a plurality of sensors. For example, the vehicle (110) may include various sensors such as an image sensor (e.g., camera), a LiDAR sensor, a radar sensor, an event sensor, an illuminance sensor, a GPS (global positioning system) device, and an accelerometer. The data augmentation device (100) may obtain an image of the surroundings of the vehicle (110) from an image sensor (e.g., camera) mounted on the vehicle (110). The data augmentation device (100) may obtain a point cloud of the surroundings of the vehicle (110) from a LiDAR sensor mounted on the vehicle (110). The point cloud may include a static point cloud and a dynamic point cloud.

[0042] The data augmentation device (100) can generate copy data corresponding to a target object to copy among a plurality of objects included in the first data. The target object may be an object to be used for data augmentation among a plurality of objects located around the vehicle (110). The data augmentation device (100) can augment data by synthesizing the copy data with second data obtained by sensing the surroundings of the vehicle (110). For example, the data augmentation device (100) can generate augmented data (e.g., augmented data (350) of FIG. 3) by obtaining an object from the first data that is not present in the second data and synthesizing it with the second data. The second data may include a plurality of frames that sense the surroundings of the vehicle (110). For example, the second data may include a plurality of frames that continuously sense the surroundings of the vehicle (110). The data augmentation device (100) can synthesize copy data into the second data based on the location and size of the target object in the first data. The data augmentation device (100) can place the target object in the second data so that it maintains the same spatial location in a plurality of frames included in the second data. For example, the data augmentation device (100) can synthesize copy data into the second data so that the distance between the vehicle (110) and the target object and the size of the target object change naturally as the vehicle (110) moves in a plurality of temporally consecutive frames.

[0043] The training data augmented by the data augmentation device (100) may be transmitted to a model (e.g., an end-to-end model) for autonomous driving of the vehicle (110) and used to train the model. The trained model can be used to generate an autonomous driving path for the vehicle (110). An autonomous driving device (not shown) may include a model (e.g., an end-to-end model). The model may be trained using the data augmented by the data augmentation device (100). The training of the model may be performed internally within the autonomous driving device (not shown), or the autonomous driving device (not shown) may use a model trained on a separate device (e.g., a training device (not shown), a server (130)).

[0044] An autonomous driving device (not shown) can generate a driving path for a vehicle (110) based on an end-to-end model. The end-to-end model may be a model that performs motion planning by directly processing input data without complex intermediate steps. The end-to-end model may be included in the vehicle (110) and / or server (130) to generate a driving path for the vehicle (110). The autonomous driving device (not shown) may be mounted inside the vehicle (110) to control the driving of the vehicle (110). For example, the autonomous driving device (not shown) may be a software module implemented on a processor (not shown) of the vehicle (110). The vehicle (110) refers to a vehicle that transports people and / or goods and may include a vehicle such as an automobile. The vehicle (110) may be an autonomous vehicle.

[0045] The data augmentation device (100), vehicle (110), and server (130) can communicate using a network (not shown). For example, the network may include a Local Area Network (LAN), a Wide Area Network (WAN), a Value Added Network (VAN), a mobile radio communication network, a satellite communication network, and combinations thereof. The network is a comprehensive data communication network that enables the data augmentation device (100) and the server (130) to communicate smoothly with each other, and may include wired internet, wireless internet, and mobile wireless communication networks. Additionally, the wireless communication network may include, for example, Wi-Fi, Bluetooth, Bluetooth Low Energy, Zigbee, Wi-Fi Direct (WFD), Ultra-Wideband (UWB), Infrared Data Association (IrDA), Near Field Communication (NFC), but is not limited thereto.

[0046]

[0047] FIG. 2 is a diagram illustrating an end-to-end based path generation framework according to one embodiment.

[0048] Referring to FIG. 2, according to one embodiment, an end-to-end based path generation framework may include a data augmentation process (210), a learning process (220), and an inference process (230). An autonomous driving system (not shown) can generate a driving path for a vehicle (e.g., vehicle (110) of FIG. 1) through the end-to-end based path generation framework.

[0049] The data augmentation process (210) may be a process of augmenting data to train a model (e.g., an end-to-end model). The data augmentation process (210) may be performed between the dataset creation process and the dataset validation process. The data augmentation process (210) may be performed by a data augmentation device (e.g., the data augmentation device (100) of FIG. 1).

[0050] The learning process (220) may be a process of training a model (e.g., an end-to-end model) to generate an optimal driving path. The learning process (220) may be performed in a learning device (not shown). The learning device (not shown) may perform the learning process (220) by receiving a dataset containing data augmented through the data augmentation process (210) from the data augmentation device (100). A model (e.g., a trained model) that generates a driving path of a vehicle (e.g., the vehicle (110) of FIG. 1) may perform an inference process (230) using image-formatted data as input, but point cloud data acquired through a LiDAR sensor may also be used in the learning process (220). The learning device (not shown) may be an autonomous driving device (not shown) or a separate device from the autonomous driving device (not shown). For example, the training of the model may be performed in a separate learning device (not shown), and the autonomous driving device (not shown) may generate a driving path using the trained model.

[0051] The inference process (230) may be a process for generating a driving path of a vehicle (e.g., vehicle (110)) using a learned model. An autonomous driving device (not shown) may generate a driving path in the inference process (230) based on data input from a plurality of sensors of the vehicle (110). In the inference process (230), a learned model (e.g., a learned end-to-end model) may generate a driving path of the vehicle (110) by taking data in image format as input.

[0052]

[0053] FIG. 3 is a diagram illustrating a data augmentation process according to one embodiment.

[0054] Referring to FIG. 3, according to one embodiment, a data augmentation device (e.g., the data augmentation device (100) of FIG. 1) can acquire first data (310) including an image (e.g., image (315)) and a point cloud (e.g., point cloud (311)) of the surroundings of a vehicle (e.g., the vehicle (110) of FIG. 1) from a sensor (e.g., a sensor such as an image sensor or a LiDAR sensor) mounted on the vehicle (e.g., the vehicle (110) of FIG. 1). The point cloud (e.g., point cloud (311)) is a set of points represented to indicate the shape of an object in three-dimensional space, and can be generated by detecting a signal (e.g., light) reflected from the object through a LiDAR sensor. The image (315) may be an image containing multiple objects located around the vehicle (110) (e.g., other vehicles, pedestrians, two-wheeled vehicles, streetlights, rubber cones, telegraph poles, etc.). The point cloud (311) may contain three-dimensional spatial information (e.g., three-dimensional coordinates) for multiple objects located around the vehicle (110) (e.g., other vehicles, pedestrians, streetlights, rubber cones, telegraph poles, etc.).

[0055] The data augmentation device (100) can generate copy data (e.g., copy data (330)) corresponding to a target object to be copied among a plurality of objects included in the first data (310). The target object may be an object that the data augmentation device (100) has decided to include in the augmented data (e.g., augmented data (350)). The target object may include an object that does not exist in the original data (e.g., second data (340)) and / or an object that exists in the original data (e.g., second data (340)) but whose number is to be increased. The data augmentation device (100) can obtain a bounding box corresponding to the target object from the point cloud (311). The bounding box may be a rectangular box surrounding the three-dimensional space corresponding to the object (e.g., target object). The data augmentation device (100) can project a bounding box corresponding to a target object onto an image (315) and extract an image patch corresponding to the target object from the image (315). For example, the data augmentation device (100) can project a bounding box corresponding to the target object onto the image (315) and extract an image patch containing image pixels corresponding to the projected area. The data augmentation device (100) can store the extracted image patch as copy data (e.g., copy data (330)).
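The projection and patch extraction described in this paragraph can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes a pinhole camera with intrinsic matrix `K`, a known LiDAR-to-camera transform `T_cam_lidar`, a box parameterized by center, size, and yaw, and that all box corners lie in front of the camera. The function names `box_corners`, `project_to_image`, and `extract_patch` are hypothetical.

```python
import numpy as np

def box_corners(center, size, yaw):
    """Return the 8 corners, shape (8, 3), of a yaw-rotated 3D box in the LiDAR frame."""
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    offsets = signs * (np.asarray(size, dtype=float) / 2.0)
    c, s = np.cos(yaw), np.sin(yaw)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return offsets @ rot_z.T + np.asarray(center, dtype=float)

def project_to_image(points_lidar, T_cam_lidar, K):
    """Project (N, 3) LiDAR-frame points to (N, 2) pixel coordinates (pinhole model)."""
    homo = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (homo @ T_cam_lidar.T)[:, :3]   # LiDAR frame -> camera frame
    uvw = pts_cam @ K.T                        # camera frame -> image plane
    return uvw[:, :2] / uvw[:, 2:3]            # assumes every point has depth > 0

def extract_patch(image, center, size, yaw, T_cam_lidar, K):
    """Crop the image region covered by the projected 3D bounding box."""
    uv = project_to_image(box_corners(center, size, yaw), T_cam_lidar, K)
    height, width = image.shape[:2]
    # Enclosing 2D rectangle of the projected corners, clipped to the image.
    u0 = int(np.clip(np.floor(uv[:, 0].min()), 0, width))
    u1 = int(np.clip(np.ceil(uv[:, 0].max()), 0, width))
    v0 = int(np.clip(np.floor(uv[:, 1].min()), 0, height))
    v1 = int(np.clip(np.ceil(uv[:, 1].max()), 0, height))
    return image[v0:v1, u0:u1].copy(), (u0, v0, u1, v1)
```

The returned rectangle is the axis-aligned envelope of the eight projected corners, so the patch may include some background pixels around the object.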

[0056] Copy data (330) may include image patches corresponding to multiple different objects. For example, copy data (330) may include multiple image patches obtained by projecting bounding boxes corresponding to multiple objects (e.g., other vehicles, pedestrians, motorcycles, streetlights, rubber cones, utility poles, etc.) onto an image (e.g., image (315)). Copy data (330) may be stored in the memory of the data augmentation device (100) (e.g., memory (610) of FIG. 6) and organized into a database (DB). Copy data (330) may include image patches corresponding to a target object and three-dimensional information about the target object. For example, information about the target object may include the type of the target object, three-dimensional size (e.g., length of the target object, width of the target object, height of the target object), three-dimensional position (e.g., three-dimensional coordinates expressed as x, y, z), and the direction the target object is facing (e.g., yaw angle). Note that the copy data (330) is illustrated as being generated based on an image (315) obtained through an image sensor and a point cloud (311) obtained through a LiDAR sensor, but is not limited thereto. For example, the copy data (330) may be generated based on data obtained through only one of an image sensor and a LiDAR sensor, or may be generated based on data obtained by a sensor other than an image sensor and a LiDAR sensor.
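One way to picture the stored copy data is as a record per target object, grouped by object type. The sketch below is only an assumption about how such a store could look: the `CopyData` fields mirror the attributes listed in this paragraph, while the `CopyDataBank` class, its methods, and the in-memory dictionary layout are hypothetical and are not taken from the disclosure.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

import numpy as np

@dataclass
class CopyData:
    """One stored target object: image patch plus its 3D information."""
    object_type: str                          # e.g. "rubber_cone"
    size_lwh: Tuple[float, float, float]      # length, width, height in meters
    position_xyz: Tuple[float, float, float]  # 3D position in the first data
    yaw: float                                # heading angle in radians
    image_patch: np.ndarray                   # pixels cropped from the image

class CopyDataBank:
    """Hypothetical in-memory database of copy data, keyed by object type."""

    def __init__(self) -> None:
        self._by_type: Dict[str, List[CopyData]] = {}

    def add(self, item: CopyData) -> None:
        self._by_type.setdefault(item.object_type, []).append(item)

    def sample(self, object_type: Optional[str] = None,
               rng: Optional[random.Random] = None) -> CopyData:
        """Randomly pick one entry, optionally restricted to a single type."""
        chooser = rng if rng is not None else random
        if object_type is None:
            pool = [it for items in self._by_type.values() for it in items]
        else:
            pool = self._by_type.get(object_type, [])
        if not pool:
            raise LookupError("no copy data available for the requested type")
        return chooser.choice(pool)
```

Keying by type makes it straightforward to oversample a specific class of object when a particular scenario is underrepresented in the training set.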

[0057] The data augmentation device (100) can synthesize copy data (330) into second data (340) based on the location and size of the target object in the first data (310). The second data (340) may be data obtained by sensing the surroundings of the vehicle (110). For example, the second data (340) may include a point cloud obtained using a LiDAR sensor and / or an image obtained using an image sensor. The second data (340) may be data obtained by sensing the surroundings of the vehicle (110) omnidirectionally with the vehicle (110) as the center. The second data (340) may include a plurality of frames that sense the surroundings of the vehicle (110). For example, the second data (340) may include a plurality of image frames and / or a plurality of point cloud frames that are continuously sensed in all directions around the vehicle (110).

[0058] The data augmentation device (100) can randomly load copy data (e.g., copy data (330)) to be synthesized into the second data (340). For example, the data augmentation device (100) can load copy data (e.g., copy data corresponding to a rubber cone) that corresponds to at least one target object among the copy data (330) stored in memory (e.g., memory (610) of FIG. 6). The data augmentation device (100) can synthesize the loaded copy data into the second data (340). The data augmentation device (100) can place the copy data (330) in the second data (340) at the location of the target object within the first data (310). For example, for a target object of size Y meters that is located X meters from the vehicle (110) and occupies Z pixels in the first data (310), the data augmentation device (100) may place the corresponding copy data in the second data (340) so that it also occupies Z pixels. The data augmentation device (100) may place copy data (330) corresponding to a target object so that the target object is located on the ground in the second data (340). For example, the data augmentation device (100) may place copy data in the second data (340) by taking into account the size (e.g., height) of the target object so that the target object that was located on the ground in the first data (310) can also be located on the ground in the second data (340).
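The relationship between metric size, distance, and pixel size, together with the ground constraint, can be illustrated with a short sketch. It assumes a pinhole camera with focal length `focal_px` in pixels and a locally flat ground at height `ground_z` in the sensor frame; neither assumption is stated in the disclosure, and the helper names are hypothetical.

```python
import numpy as np

def pixel_extent(focal_px, extent_m, distance_m):
    """Pinhole approximation: an object of extent Y meters at X meters spans f * Y / X pixels."""
    return focal_px * extent_m / distance_m

def center_z_on_ground(ground_z, object_height_m):
    """Height of the box center so that the bottom face of the box rests on the ground."""
    return ground_z + object_height_m / 2.0

def resize_nearest(patch, new_h, new_w):
    """Nearest-neighbor resize of an image patch using integer index lookup."""
    rows = np.arange(new_h) * patch.shape[0] // new_h
    cols = np.arange(new_w) * patch.shape[1] // new_w
    return patch[rows][:, cols]
```

When the copy data is placed at the same distance X as in the first data, the patch keeps its original Z pixels and no resizing is needed. The resize helper only matters if the object is rendered at a different distance.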

[0059] The data augmentation device (100) can adjust the position where the copy data (330) is placed within the second data (340). The data augmentation device (100) can determine whether the target object is occluded by another object within the second data (340). For example, it can determine whether interference occurs by checking whether the target object corresponding to the copy data (330) placed within the second data (340) overlaps with another object. In response to the determination that interference has occurred, the data augmentation device (100) can change the position where the copy data is placed within a predetermined distance until the interference disappears. For example, in response to the determination that the target object is occluded by another object, the data augmentation device (100) can change the position of the copy data (330) within a predetermined distance (e.g., n meters, where n may be 5).
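The adjustment loop in this paragraph can be sketched as a bounded search around the initial placement. The disclosure does not say how overlap is tested or in what order candidate positions are tried, so the following makes its own assumptions: objects are represented by axis-aligned bird's-eye-view rectangles, and candidates are tried on rings of increasing radius up to `max_dist`. All names are hypothetical.

```python
import math

def footprint(cx, cy, length, width):
    """Axis-aligned bird's-eye-view rectangle (xmin, ymin, xmax, ymax)."""
    return (cx - length / 2, cy - width / 2, cx + length / 2, cy + width / 2)

def overlaps(a, b):
    """True if two axis-aligned rectangles intersect with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def adjust_position(cx, cy, length, width, obstacles,
                    max_dist=5.0, step=0.5, n_angles=8):
    """Move the object within max_dist of (cx, cy) until it overlaps no obstacle.

    Returns the first free (x, y), or None if no free spot exists in range.
    """
    def is_free(x, y):
        box = footprint(x, y, length, width)
        return not any(overlaps(box, obs) for obs in obstacles)

    if is_free(cx, cy):
        return (cx, cy)  # no interference: keep the original placement
    for ring in range(1, int(max_dist / step) + 1):
        radius = ring * step
        for k in range(n_angles):
            angle = 2.0 * math.pi * k / n_angles
            x = cx + radius * math.cos(angle)
            y = cy + radius * math.sin(angle)
            if is_free(x, y):
                return (x, y)
    return None
```

Returning `None` when the search fails leaves the caller free to skip this object or to sample different copy data.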

[0060] The data augmentation device (100) may place copy data (330) in the second data (340) so that a target object maintains the same spatial location in a plurality of frames included in the second data (340). In data used to train an autonomous driving model (e.g., an end-to-end model) that generates a driving path using temporally continuous data, a specific object may need to maintain the same spatial location within the continuous data. For example, an object located in front of the vehicle (110) may need to be closer to the vehicle (110) as the vehicle (110) moves forward (e.g., as it moves to a subsequent frame).
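Keeping the target object at the same spatial location across frames can be read as fixing its position in a world frame and re-expressing that position relative to the vehicle in each frame. The sketch below assumes that a 2D ego pose (position and heading) is available for every frame, which the disclosure does not state explicitly; the function names are hypothetical.

```python
import numpy as np

def _rotation(yaw):
    """2D rotation matrix that maps ego-frame vectors into the world frame."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s], [s, c]])

def ego_to_world(p_ego, ego_xy, ego_yaw):
    """Convert a point from the vehicle frame of one frame into the world frame."""
    return _rotation(ego_yaw) @ np.asarray(p_ego, dtype=float) + np.asarray(ego_xy, dtype=float)

def world_to_ego(p_world, ego_xy, ego_yaw):
    """Convert a world-frame point into the vehicle frame of a given frame."""
    return _rotation(ego_yaw).T @ (np.asarray(p_world, dtype=float) - np.asarray(ego_xy, dtype=float))

def positions_per_frame(p_ego_first, ego_poses):
    """Anchor the object in the first frame, then express it in every frame.

    ego_poses is a list of ((x, y), yaw) tuples, one per frame.
    """
    first_xy, first_yaw = ego_poses[0]
    anchor = ego_to_world(p_ego_first, first_xy, first_yaw)
    return [world_to_ego(anchor, xy, yaw) for xy, yaw in ego_poses]
```

For a vehicle driving straight ahead, an object placed 20 m in front in the first frame appears closer in each subsequent frame, which matches the example in this paragraph.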

[0061] The data augmentation device (100) can generate augmented data (350) by placing copy data (330) in the second data (340) so that the target object maintains the same spatial position in multiple frames included in the second data (340). For example, the data augmentation device (100) can place copy data (330) in the second data (340) so that the target object is placed at the same spatial position in consecutively acquired previous and subsequent frames. The augmented data (350) can be classified into an augmented point cloud (351) and an augmented image (355) according to the format of the data. The augmented point cloud (351) may be a point cloud in which points corresponding to the target object included in the copy data (330) are synthesized into the portion of the second data (340) that is in point cloud format. For example, if the number of points in the second data (340) is N and the number of points included in the copy data (330) is M, the number of points in the augmented point cloud (351) may be N+M. The augmented image (355) may be an image in which the copy data (330) is overlaid on the portion of the second data (340) that is in image format. The augmented data (350) may be used in a learning process (e.g., the learning process (220) of FIG. 2) in which a model (e.g., an end-to-end model) learns to generate a driving path. In the inference process (e.g., the inference process (230) of FIG. 2) of the end-to-end model, only image data may be used as input, but point cloud data may be used together with image data in the learning process (220) of the end-to-end model.
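The two output formats can be illustrated with a minimal sketch: the augmented point cloud is a concatenation of the N scene points and the M object points, and the augmented image is the scene image with the patch pasted over it. The function names and the simple opaque paste (no blending or masking) are assumptions made for illustration.

```python
import numpy as np

def augment_point_cloud(scene_points, object_points):
    """Append M object points to N scene points, giving an (N + M, D) array."""
    return np.concatenate([scene_points, object_points], axis=0)

def overlay_patch(image, patch, top_left):
    """Paste a patch onto a copy of the image at (u, v), clipping at the borders."""
    out = image.copy()
    u, v = top_left
    patch_h, patch_w = patch.shape[:2]
    img_h, img_w = image.shape[:2]
    # Visible part of the patch in image coordinates.
    u0, v0 = max(u, 0), max(v, 0)
    u1, v1 = min(u + patch_w, img_w), min(v + patch_h, img_h)
    if u1 <= u0 or v1 <= v0:
        return out  # patch lies entirely outside the image
    out[v0:v1, u0:u1] = patch[v0 - v:v1 - v, u0 - u:u1 - u]
    return out
```

Both helpers return new arrays, so the original second data stays available for generating further augmented variants.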

[0062] The data augmentation device (100) augments data in units of objects, thereby preventing overfitting that may occur when augmenting data in units of frames (e.g., data frames such as image frames or point cloud frames) and ensuring diversity of data. As the data augmentation device (100) augments data in units of objects, it can efficiently generate data for a specific case (e.g., a case where an ambulance is located in front) to be learned.

[0063]

[0064] FIGS. 4a and 4b are drawings illustrating examples of augmented data according to one embodiment.

[0065] Referring to FIGS. 4a and 4b, according to one embodiment, a data augmentation device (e.g., the data augmentation device (100) of FIG. 1) can obtain first data (e.g., the first data (310) of FIG. 3) including an image (e.g., image (411), image (315) of FIG. 3) and a point cloud (e.g., point cloud (415), point cloud (311) of FIG. 3) of the surroundings of a vehicle (e.g., the vehicle (110) of FIG. 1) from a sensor mounted on the vehicle (110). The data augmentation device (100) can generate copy data (e.g., copy data (430), copy data (330) of FIG. 3) corresponding to a target object to be copied among a plurality of objects included in the first data (e.g., the first data (310)). The data augmentation device (100) can obtain a bounding box corresponding to a target object (e.g., a rubber cone) from the point cloud (415). The data augmentation device (100) can project a bounding box corresponding to a target object (e.g., a rubber cone) onto an image (411) to extract an image patch corresponding to the target object (e.g., a rubber cone) from the image (411). The data augmentation device (100) can store the image patch as copy data. For example, the data augmentation device (100) can store an image patch for a rubber cone as copy data (430). The copy data (430) may include three-dimensional information (e.g., three-dimensional position) for the target object (e.g., a rubber cone).

[0066] The data augmentation device (100) can synthesize copy data (430) into second data (440) based on the location and size of a target object (e.g., a rubber cone) in the first data (e.g., image (411), point cloud (415)). The second data (440) (e.g., second data (340) of FIG. 3) may be obtained by sensing the surroundings of the vehicle (110). The second data (440) may be data obtained by sensing the surroundings of the vehicle (110) in all directions, centered on the vehicle (110). The second data (440) may include a plurality of frames. For convenience of explanation, the second data (440) is illustrated as only one frame among the plurality of frames included in the second data (440).

[0067] The data augmentation device (100) can place copy data (430) in the second data (440) at the location of a target object (e.g., a rubber cone) within the first data (e.g., an image (411), a point cloud (415)). For example, the data augmentation device (100) can place copy data (430) in the second data (440) such that a rubber cone of size Y meters located at a distance of X meters from the vehicle (110) within the image (411) and the point cloud (415) has the same size of Z pixels within the second data (440). The data augmentation device (100) can place copy data (430) in the second data (440) based on the location of a target object (e.g., a rubber cone) within the first data (e.g., an image (411), a point cloud (415)). For example, for a rubber cone located X meters in front of the vehicle (110) in the image (411) and the point cloud (415), the copy data (430) can be placed so that the rubber cone is also located X meters in front of the vehicle (110) in the second data (440).
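The relationship among the distance X, the physical size Y, and the pixel size Z can be sketched as follows, purely as an assumption-laden illustration. The code assumes a pinhole camera with a focal length expressed in pixels, under which Z = f * Y / X; the focal length parameter and the function names are hypothetical and do not appear in the embodiment.

```python
def projected_height_px(height_m, distance_m, focal_px):
    """Pixel height Z of an object of height Y (m) at distance X (m),
    assuming a pinhole camera with focal length focal_px (pixels)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_px * height_m / distance_m


def resize_factor(patch_height_px, height_m, distance_m, focal_px):
    """Scale to apply to a stored patch so that it appears with the pixel
    height expected at the target distance in the second data."""
    if patch_height_px <= 0:
        raise ValueError("patch height must be positive")
    return projected_height_px(height_m, distance_m, focal_px) / patch_height_px
```

Under these assumptions, when the copy data is placed at the same distance X and the second data is captured with the same focal length, the factor is 1 and the patch keeps its original size of Z pixels.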

[0068] The data augmentation device (100) can place copy data (430) so that a target object (e.g., a rubber cone) is positioned on the ground in the second data (440). For example, the data augmentation device (100) can place copy data (430) so that the rubber cone is positioned on the ground in the second data (440) based on the size (e.g., height) of the rubber cone. The data augmentation device (100) can determine whether interference occurs between the target object (e.g., a rubber cone) and other objects within the second data (440). For example, the data augmentation device (100) can determine whether the target object overlaps with other objects included in the second data (440), such as other vehicles, benches, or trash cans. In response to determining that interference has occurred, the data augmentation device (100) can change the position where the copy data (430) is placed, within a predetermined distance, until the interference disappears. In response to determining that no interference has occurred, the data augmentation device (100) may leave the position where the copy data (430) is placed unchanged.
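A minimal sketch of the ground placement and interference handling described above is given below. It assumes that objects are represented by axis-aligned footprints (x_min, y_min, x_max, y_max) in a top-down view, that the ground height is known, and that candidate positions are searched in rings of increasing radius up to the predetermined distance. The footprint representation, the search pattern, the step size, and all function names are assumptions for illustration and are not taken from the embodiment.

```python
import math


def center_height_on_ground(ground_z, object_height):
    """Height of the object center so that its bottom rests on the ground."""
    return ground_z + object_height / 2


def footprints_overlap(a, b):
    """True if two axis-aligned footprints (x_min, y_min, x_max, y_max) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def resolve_interference(x, y, half_extent, obstacles, max_shift_m, step_m=0.5):
    """Return a position (x, y) with no interference, or None if none is
    found within max_shift_m of the original position.

    The original position is tried first, so it is returned unchanged
    when no interference occurs.
    """
    candidates = [(0.0, 0.0)]
    for ring in range(1, int(max_shift_m / step_m) + 1):
        radius = ring * step_m
        for k in range(8):  # 8 directions per ring (hypothetical choice)
            angle = k * math.pi / 4
            candidates.append((radius * math.cos(angle), radius * math.sin(angle)))
    for dx, dy in candidates:
        box = (x + dx - half_extent, y + dy - half_extent,
               x + dx + half_extent, y + dy + half_extent)
        if not any(footprints_overlap(box, other) for other in obstacles):
            return (x + dx, y + dy)
    return None
```

Returning None when no free position exists within the predetermined distance is one possible design choice; a caller could then skip the frame or select a different target object.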

[0069] The data augmentation device (100) may place copy data (430) in the second data (440) so that a target object (e.g., a rubber cone) maintains the same spatial location in multiple frames included in the second data (e.g., the second data (440)). For example, the data augmentation device (100) may place copy data (430) in the second data (440) so that the rubber cone maintains the same spatial location in the frames prior to and following a specific frame included in the second data (440). The data augmentation device (100) may generate augmented data (450) by synthesizing copy data (430) into the second data (440). The augmented data (450) may include multiple frames. The augmented data (450) may be used for training a model (e.g., an end-to-end model). For example, augmented data (450) can be used in the learning process (220) described with reference to FIG. 2.
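One way to keep the target object at the same spatial location across frames can be sketched as follows, under stated assumptions. The sketch assumes that a planar vehicle pose (x, y, yaw) in a common world frame is available for every frame of the second data; the placement chosen in one anchor frame is converted to the world frame and then re-expressed relative to the vehicle in every other frame. The availability of such poses and the function names are assumptions and are not specified in the embodiment.

```python
import math


def ego_to_world(point_xy, ego_pose):
    """Convert a point from the vehicle frame to the world frame.

    ego_pose is (x, y, yaw) of the vehicle in the world frame (assumed input).
    """
    x, y = point_xy
    ex, ey, yaw = ego_pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (ex + c * x - s * y, ey + s * x + c * y)


def world_to_ego(point_xy, ego_pose):
    """Convert a point from the world frame to the vehicle frame."""
    px, py = point_xy
    ex, ey, yaw = ego_pose
    dx, dy = px - ex, py - ey
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy)


def placements_per_frame(anchor_xy, anchor_index, ego_poses):
    """Vehicle-relative placement of the object in every frame such that
    its world position stays fixed at the anchor frame's placement."""
    world_xy = ego_to_world(anchor_xy, ego_poses[anchor_index])
    return [world_to_ego(world_xy, pose) for pose in ego_poses]
```

For instance, if the vehicle advances 2 meters per frame along a straight line and the rubber cone is placed 10 meters ahead in the first frame, this sketch places it 8 meters and then 6 meters ahead in the following two frames.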

[0070]

[0071] FIG. 5 is a flowchart illustrating a method according to one embodiment.

[0072] Referring to FIG. 5, according to one embodiment, operations 510 to 550 may be operations performed by the data augmentation device (100) of FIG. 1 described with reference to FIGS. 1 to 6.

[0073] According to one embodiment, operations 510 to 550 may be understood to be performed in a processor (e.g., processor (630) of FIG. 6) of a data augmentation device (100) (e.g., electronic device (600) of FIG. 6) described with reference to FIG. 1.

[0074] In operation 510, the data augmentation device (100) can obtain first data including an image and a point cloud of the surroundings of the vehicle (110) from a sensor mounted on the vehicle (e.g., the vehicle (110) of FIG. 1).

[0075] In operation 530, the data augmentation device (100) can generate copy data corresponding to a target object to be copied among a plurality of objects included in the first data. The data augmentation device (100) can obtain a bounding box corresponding to the target object from a point cloud. The data augmentation device (100) can project the bounding box onto an image to extract an image patch corresponding to the target object from the image. The data augmentation device (100) can store the image patch as copy data.

[0076] In operation 550, the data augmentation device (100) can synthesize copy data into second data obtained by sensing the surroundings of the vehicle based on the position and size of the target object in the first data. The data augmentation device (100) can place copy data in the second data, which includes a plurality of frames, so that the target object maintains the same spatial position.

[0077] Operations 510 to 550 may be substantially the same as the method used by the device described with reference to FIGS. 1 to 5 (e.g., the data augmentation device (100) of FIG. 1).

[0078] Operations 510 through 550 may be performed sequentially, but are not limited thereto. For example, two or more operations may be performed in parallel.

[0079]

[0080] FIG. 6 is a schematic block diagram of an electronic device according to one embodiment.

[0081] Referring to FIG. 6, according to one embodiment, an electronic device (600) (e.g., a data augmentation device (100) of FIG. 1) may include a memory (610) and a processor (630).

[0082] The memory (610) can store instructions (or programs) executable by the processor (630). For example, the instructions may include instructions for executing the operation of the processor (630) and / or the operation of each component of the processor (630).

[0083] The memory (610) may include one or more computer-readable storage media. The memory (610) may include non-volatile storage devices (e.g., a magnetic hard disc, an optical disc, a floppy disc, flash memory, electrically programmable memory (EPROM), or electrically erasable and programmable memory (EEPROM)).

[0084] The memory (610) may be a non-transitory medium. The term "non-transitory" may indicate that the storage medium is not implemented by a carrier wave or a propagated signal. However, the term "non-transitory" should not be interpreted as meaning that the memory (610) is immobile.

[0085] The processor (630) can process data stored in memory (610). The processor (630) can execute computer-readable code (e.g., software) stored in memory (610) and instructions triggered by the processor (630).

[0086] The processor (630) may be a data processing device implemented in hardware having a circuit having a physical structure for executing desired operations. For example, the desired operations may include code or instructions included in a program.

[0087] For example, a data processing device implemented in hardware may include a microprocessor, a central processing unit, a processor core, a multi-core processor, a multiprocessor, an Application-Specific Integrated Circuit (ASIC), and a Field Programmable Gate Array (FPGA).

[0088] The processor (630) can cause the electronic device (600) to perform one or more operations by executing code and / or instructions stored in memory (610). The operations performed by the electronic device (600) may be substantially the same as the operations performed by the data augmentation device (100) described with reference to FIGS. 1 through 6. Redundant descriptions are therefore omitted.

[0089]

[0090] The embodiments described above may be implemented as hardware components, software components, and / or combinations of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using a general-purpose computer or a special-purpose computer, such as, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing unit may execute an operating system (OS) and software applications executed on said operating system. Additionally, the processing unit may access, store, manipulate, process, and generate data in response to the execution of the software. For ease of understanding, the processing unit may be described as being used as a single unit, but those skilled in the art will understand that the processing unit may include multiple processing elements and / or multiple types of processing elements. For example, the processing unit may include multiple processors or one processor and one controller. In addition, other processing configurations, such as parallel processors, are also possible.

[0091] Software may include computer programs, code, instructions, or a combination of one or more of these, and may configure a processing unit to operate as desired or instruct the processing unit independently or collectively. Software and / or data may be stored on any type of machine, component, physical device, virtual equipment, computer storage medium, or device so as to be interpreted by the processing unit or to provide instructions or data to the processing unit. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on computer-readable recording media.

[0092] The method according to the embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may store program instructions, data files, data structures, etc., either individually or in combination, and the program instructions recorded on the medium may be those specifically designed and configured for the embodiment or those known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code, such as that generated by a compiler, as well as high-level language code that can be executed by a computer using an interpreter, etc.

[0093] The hardware device described above may be configured to operate as one or more software modules to perform the operation of the embodiment, and vice versa.

[0094] Although the embodiments have been described above with reference to the limited drawings, those skilled in the art can apply various technical modifications and variations based thereon. For example, suitable results may be achieved even if the described techniques are performed in a different order than described, and / or if the components of the described system, structure, device, circuit, etc. are combined or assembled in a form different from described, or replaced or substituted by other components or equivalents.

[0095] Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims set forth below.

Claims

1. A method comprising: an operation of acquiring first data including an image and a point cloud of the surroundings of a vehicle from a sensor mounted on the vehicle; an operation of generating copy data corresponding to a target object to be copied among a plurality of objects included in the first data; and an operation of synthesizing the copy data with second data obtained by sensing the surroundings of the vehicle, based on the location and size of the target object in the first data.

2. The method of claim 1, wherein the operation of generating the copy data comprises: an operation of obtaining a bounding box corresponding to the target object from the point cloud; an operation of projecting the bounding box onto the image to extract an image patch corresponding to the target object from the image; and an operation of storing the image patch as the copy data.

3. The method of claim 1, wherein the synthesizing operation comprises: an operation of placing the copy data at the location on the second data; and an operation of adjusting the position where the copy data is placed.

4. The method of claim 3, wherein the adjusting operation comprises: an operation of determining whether the target object interferes with another object within the second data (occlusion); and an operation of, in response to determining that the interference has occurred, changing the location where the copy data is placed within a predetermined distance until the interference disappears.

5. The method of claim 1, wherein the second data comprises a plurality of frames obtained by sensing the surroundings of the vehicle.

6. The method of claim 5, wherein the synthesizing operation comprises an operation of placing the copy data in the second data so that the target object maintains the same spatial location in the plurality of frames.

7. The method of claim 3, wherein the placing operation comprises an operation of placing the copy data at the location such that the target object is located on the ground in the second data.

8. A computer program stored on a computer-readable recording medium, in combination with hardware, to execute the method of any one of claims 1 through 7.

9. A device comprising: at least one processor; and memory that stores instructions, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device: obtains first data including an image and a point cloud of the surroundings of a vehicle from a sensor mounted on the vehicle; generates copy data corresponding to a target object to be copied among a plurality of objects included in the first data; and synthesizes the copy data with second data obtained by sensing the surroundings of the vehicle, based on the location and size of the target object in the first data.

10. The device of claim 9, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device: obtains a bounding box corresponding to the target object from the point cloud; projects the bounding box onto the image to extract an image patch corresponding to the target object from the image; and stores the image patch as the copy data.

11. The device of claim 9, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device: places the copy data at the location on the second data; and adjusts the position where the copy data is placed.

12. The device of claim 11, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device: determines whether the target object interferes with another object within the second data (occlusion); and, in response to determining that the interference has occurred, changes the location where the copy data is placed within a predetermined distance until the interference disappears.

13. The device of claim 9, wherein the second data comprises a plurality of frames obtained by sensing the surroundings of the vehicle.

14. The device of claim 13, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device places the copy data in the second data such that the target object maintains the same spatial location in the plurality of frames.

15. The device of claim 12, wherein, based on the instructions being executed individually or collectively by the at least one processor, the device places the copy data at the location such that the target object is located on the ground in the second data.