Learning device

The learning device addresses the inflexibility of fixed learning ranges by using a hash function to dynamically update feature data, enabling efficient learning in regions that change or in which no user is present.

JP2026109881A · Pending · Publication Date: 2026-07-02 · IVIS INC

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
IVIS INC
Filing Date
2024-12-20
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing learning technologies, such as NeRF and INGP, are limited by a fixed learning range and do not allow for flexible adjustment, leading to interference when changing learning areas.

Method used

A learning device that includes a feature data management system using a hash function to dynamically update feature data based on user-defined areas of interest and weight settings, allowing flexible adjustment of the learning range by changing feature data linked to new position coordinates.

Benefits of technology

Enables flexible and efficient learning of attribute information in regions that change or in which no user is present, reducing interference between learning areas and improving learning efficiency by prioritizing relevant data.


Figure 2026109881000001_ABST

Abstract

[Problem] To make it possible to flexibly change the learning range when learning attribute information of a subject. [Solution] An area of interest receiving unit 41 accepts a user's setting of an area of interest. A feature data management unit 43 holds feature data that is determined according to the position of the object being photographed, serves as input to the learning model, and is subject to updating through learning. The feature data management unit 43 includes a hash function calculation unit 430 that takes position coordinates as input and outputs a hash value, and a table 431 that holds feature data linked to the hash value output by the hash function calculation unit 430. When a new learning area is set, the feature data management unit 43 changes to different data the feature data that is held in the table 431 and linked to the hash values output when the position coordinates included in the newly set learning area are input to the hash function calculation unit 430.

Description

Technical Field

[0001] The present invention relates to a learning device.

Background Art

[0002] In recent years, a method has been proposed in which, using a plurality of images obtained by photographing a subject to be photographed from a plurality of directions as teacher data, attribute information of the subject (for example, the color or reflectance of the surface of the subject) is learned (see, for example, Non-Patent Documents 1 and 2).

Prior Art Documents

Non-Patent Documents

[0003]

Non-Patent Document 1

Non-Patent Document 2

Summary of the Invention

Problems to be Solved by the Invention

[0004] In the above technology, the learning range, which is the area in which the subject to be photographed exists, is predetermined, and learning about subjects outside of the learning range is not intended. Therefore, when learning the attribute information of subjects, it is necessary to be able to flexibly change the learning range.

[0005] This invention has been made in view of these points, and aims to provide a technology for flexibly changing the learning range when learning attribute information of a subject.

Means for Solving the Problem

[0006] One aspect of the present invention is a learning device. This learning device includes: a learning unit that learns parameters of a learning model based on teacher information, where the learning model outputs attribute information of a subject at a given position when that position is input, and the teacher information includes a plurality of two-dimensional images that reflect attribute information of the subject to be photographed and capture subjects included in a predetermined learning area, the three-dimensional coordinates of the camera that took the plurality of two-dimensional images, and the optical axis direction of the camera; an area of interest receiving unit that accepts a user's setting of an area of interest; a weight setting unit that sets the weight for the two-dimensional images included in the area of interest, among the plurality of two-dimensional images included in the teacher information, to be greater than the weight set for the two-dimensional images not included in the area of interest; and a feature data management unit that holds feature data, which is data that is determined according to the position of the subject to be photographed, serves as input to the learning model, and is subject to update by learning. The feature data management unit comprises a hash function calculation unit that takes position coordinates as input and outputs a hash value, and a table that holds the feature data linked to the hash value output by the hash function calculation unit. The learning unit uses two-dimensional images with larger set weights in preference to two-dimensional images with smaller set weights when updating the parameters of the learning model and the feature data. When a new learning area is set, the feature data management unit changes to different data the feature data that is held in the table and linked to the hash values output when the position coordinates included in the newly set learning area are input to the hash function calculation unit.

[0007] Furthermore, any combination of the above components, as well as conversions of the expression of the present invention between methods, apparatus, systems, computer programs, data structures, recording media, etc., are also valid embodiments of the present invention.

Effects of the Invention

[0008] According to the present invention, the learning range can be flexibly changed when learning the attribute information of a subject.

Brief Description of the Drawings

[0009]
[Figure 1] A diagram illustrating the outline of the processing performed by the learning device 1 according to the embodiment.
[Figure 2] A diagram schematically showing the functional configuration of the learning device 1 according to the embodiment.
[Figure 3] A flowchart illustrating the flow of information processing performed by the learning device 1 according to the embodiment.

Modes for Carrying Out the Invention

<Overview of the Embodiment>

[0010] Figures 1(a) and 1(b) illustrate an overview of the processing performed by the learning device 1 according to the embodiment. Specifically, Figure 1(a) is a diagram illustrating the technology that underlies the processing performed by the learning device 1, and Figure 1(b) is a diagram illustrating an overview of the processing performed by the learning device 1.

[0011] The learning device 1 performs learning based on the NeRF (Neural Radiance Fields) technique disclosed in the paper by Mildenhall et al. Because NeRF is a known technique, its details are omitted here. Instead, NeRF and the learning method described in the paper by Müller et al. (hereinafter referred to as "INGP," short for Instant Neural Graphics Primitives), which is a type of NeRF, are briefly explained with reference to Figure 1(a).

[0012] NeRF is a technique for generating a learning model that learns the three-dimensional attribute information of a target object S from image data obtained by photographing the target object S from multiple different viewpoints P. In the example shown in Figure 1(a), a house, which is the first target object S1 and an example of a target object S, is photographed from four different viewpoints P (first viewpoint P1, second viewpoint P2, third viewpoint P3, and fourth viewpoint P4). The image data, together with the position coordinates and shooting direction (i.e., the optical axis direction of the camera) of each viewpoint P, is used as training data, and the color and light reflectance at the three-dimensional coordinates of the first target object S1 are learned as attribute information. The number of viewpoints P is not limited to four; in general, NeRF learning uses a large amount of training data obtained by photographing the first target object S1 from four or more viewpoints P.

[0013] The example shown in Figure 1(a) illustrates a case where a visible light camera is located at viewpoint P and captures the visible light reflected from the first target S1. In other words, the attribute information of the first target S1 is the color and reflectance of the visible light reflected from the first target S1. However, the attribute information that the learning device 1 learns is not limited to visible light; it may also be infrared light, radar reflection, or reflected light from a LiDAR (Light Detection And Ranging) laser. In that case, a camera corresponding to the attribute information, such as an infrared camera, is located at viewpoint P and captures the first target S1. Because a camera is assumed to be located at viewpoint P, in this specification the 3D coordinates of viewpoint P and the 3D coordinates of the camera located at viewpoint P denote the same coordinates.

[0014] The learning device 1 performs NeRF learning based on training information that includes multiple 2D images reflecting the attribute information of the first target S1, the 3D coordinates of the camera that captured those 2D images, and the optical axis direction of the camera. As a result, the learning device 1 learns the parameters of a learning model that, when a position of the first target S1 is input, outputs the attribute information of the first target S1 at that position.

[0015] As shown in Figure 1(a), the learning model generated by NeRF learning is stored in the learning device 1. When the attribute information is visible light reflected from the surface of the first target S1, the learning device 1 can use the learned model to generate a 2D image taken from a position and optical axis direction not included in the training information. For example, in the example shown in Figure 1(a), if the user of the learning device 1 inputs a fifth viewpoint P5 that is different from the first viewpoint P1, second viewpoint P2, third viewpoint P3, and fourth viewpoint P4, the learning device 1 can generate a 2D image taken of the first target S1 from the fifth viewpoint P5.

[0016] INGP is a type of NeRF, but it has the advantage of requiring less training time than the method of Mildenhall et al. On the other hand, because INGP stores position-dependent features during training and inference, a region in which attribute information can be learned, i.e., a region in which features are set (hereinafter sometimes simply referred to as "learning area L"), must be defined. In the example shown in Figure 1(a), the interior of the spherical region indicated by the reference sign R1 is the first learning area L1, which is an example of a learning area L. For this reason, INGP is considered not to be intended for learning attribute information from regions outside the learning area L.

[0017] As is well known, INGP uses a table that stores feature data representing position-dependent features in association with the output of a hash function that takes position coordinates as input. The feature data stored in this table is both the input data for the learning model in INGP and the data that is updated during the learning process. Because the hash function outputs information of the same size (e.g., 1024 bits of information) regardless of the input, the structure of the table does not need to change even if the learning area L, and hence the position coordinates it contains, changes. Therefore, to learn attribute information of a region that extends beyond the learning area L set for INGP, INGP learning can, formally, be continued by setting a new learning area L that includes that region.

[0018] Based on the above, an overview of the learning device 1 according to the embodiment is described with reference to Figure 1(b). Figure 1(b) shows the first learning area L1, the second learning area L2, and the third learning area L3, together with the first imaging target S1, the second imaging target S2, and the third imaging target S3, which are the imaging targets S in the respective areas. The first imaging target S1 is the same house as the first imaging target S1 in Figure 1(a). The second imaging target S2 is a tower, and the third imaging target S3 is a castle. To avoid clutter, not every viewpoint is labeled; in Figure 1(b), each symbol identical to the one labeled with the reference sign P represents a viewpoint P at which a camera for imaging an imaging target S is located.

[0019] As shown in Figure 1(b), suppose that after the first learning area L1 is set and a learning model is generated by INGP, a second learning area L2 different from the first learning area L1 is set. The inventor of the present application found through experiments that if the learning model trained for the first learning area L1 is carried over as is to start learning for the second learning area L2, the learning result for the first learning area L1 may affect the learning of the tower, which is the second imaging target S2 in the second learning area L2.

[0020] Therefore, when a new learning area L is set, the learning device 1 according to the embodiment changes the feature data corresponding to the position coordinates included in the newly set learning area L to different data (for example, random numbers). As described above, in INGP the feature data is both the input data of the learning model and the data that is updated in the learning process. Changing the feature data corresponding to the position coordinates included in the newly set learning area L is therefore equivalent to changing the initial value of the input data of the learning model. The inventor of the present application found through experiments that changing the feature data when the learning area L changes reduces the influence of the learning performed for the original learning area L.

[0021] In the example shown in Figure 1(b), the user moves from the location of the first learning area L1 to the location of the second learning area L2, and finally to the location of the third learning area L3. Each time the learning area L changes due to the user's movement, the learning device 1 updates the feature data, which is the initial value of the input data for the learning model. In this way, the learning device 1 can learn attribute information around the location where the user is and generate a learning model. Thus, the learning device 1 according to this embodiment can flexibly change the learning range when learning with INGP.

<Functional configuration of the learning device 1 according to the embodiment>

[0022] Figure 2 is a schematic diagram showing the functional configuration of the learning device 1 according to the embodiment. The learning device 1 comprises a storage unit 2, a communication unit 3, and a control unit 4. In Figure 2, the arrows indicate the main data flow, and there may be data flows not shown in Figure 2. In Figure 2, each functional block shows a functional unit configuration, not a hardware (device) unit configuration. Therefore, the functional blocks shown in Figure 2 may be implemented in a single device, or they may be implemented separately in multiple devices. Data exchange between functional blocks may be performed via any means, such as a data bus, network, or portable storage medium.

[0023] The storage unit 2 includes a ROM (Read Only Memory) that stores the BIOS (Basic Input Output System) of the computer that realizes the learning device 1, a RAM (Random Access Memory) that serves as the working area of the learning device 1, and a large-capacity storage device such as an HDD (Hard Disk Drive) or SSD (Solid State Drive) that stores the OS (Operating System), application programs, and various information referenced when the application programs are executed, including the learning model and the table that holds feature data.

[0024] Communication unit 3 is a communication interface for learning device 1 to communicate with external devices, and is implemented using known communication modules such as LAN (Local Area Network) modules and Wi-Fi (registered trademark) modules. Hereafter in this specification, it is assumed that learning device 1 communicates with external devices via communication unit 3, and the description of communication unit 3 may be omitted.

[0025] The control unit 4 is a processor such as a CPU (Central Processing Unit) or GPU (Graphics Processing Unit) of the learning device 1, and functions as a learning unit 40, an area of interest receiving unit 41, a weight setting unit 42, and a feature data management unit 43 by executing the program stored in the storage unit 2.

[0026] Figure 2 shows an example where the learning device 1 is composed of a single device. However, the learning device 1 may be implemented using multiple computing resources such as processors and memory, for example, in a cloud computing system. In this case, each part constituting the control unit 4 is implemented by at least one of the multiple different processors executing a program.

[0027] The learning unit 40 learns the parameters of a learning model based on training information including a plurality of two-dimensional images that reflect the attribute information of the target object S and are taken of a subject included in a predetermined learning area, the three-dimensional coordinates of the camera that took the plurality of two-dimensional images, and the optical axis direction of the camera. This learning model is a learning model generated based on the INGP learning method so that when the position of the target object S is input, it outputs the attribute information of the target object S at that position. The learning area can be set at any time by the user, including during the execution of the learning process by the learning device 1. Specifically, the learning unit 40 can acquire position information indicating the user's position obtained by a position acquisition module such as a GPS (Global Positioning System) module installed in a device (not shown) such as a smartphone used by the user, and set the learning area to include that position.

[0028] The area of interest receiving unit 41 receives the setting of an area of interest from the user of the learning device 1. Here, the area of interest is the part of the learning area L, which is set as the target for learning attribute information by the learning device 1, that the user particularly wants to learn. For example, in Figure 1(b), the second learning area L2 may contain other subjects, such as the background, in addition to the second photographic target S2, which is the tower. If the user particularly wants to learn the second photographic target S2, the user sets an area that includes the second photographic target S2 as the area of interest of the learning device 1.

[0029] The area of interest can be set by the user at any time, including while the learning device 1 is executing the learning process. For example, as with the setting of the learning area L described above, the area of interest receiving unit 41 can acquire location information indicating the user's location, which is obtained by a location acquisition module such as a GPS module installed in a device (not shown) such as a smartphone used by the user, and set a predetermined range including that location as the area of interest.

[0030] The weight setting unit 42 sets, for each 2D image included in the area of interest among the multiple 2D images included in the training information, a weight greater than the weight set for each 2D image not included in the area of interest. As will be described in detail later, the weight set by the weight setting unit 42 for each 2D image is a parameter that determines how frequently that 2D image is referenced during the learning process: the larger the weight, the more frequently the image is referenced.
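As a rough illustration of the weight setting described in [0030], the following Python sketch assigns a larger weight to images that fall inside a spherical area of interest. The description does not specify how an image is judged to be "included in" the area of interest, so the `subject_position` field, the spherical shape, and the weight values 4.0 and 1.0 are all assumptions made for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingImage:
    name: str
    subject_position: tuple  # hypothetical: 3D position the image is taken to depict
    weight: float = 1.0


def set_weights(images, roi_center, roi_radius, inside_weight=4.0, outside_weight=1.0):
    """Give images inside the (assumed spherical) area of interest a larger weight."""
    for image in images:
        inside = math.dist(image.subject_position, roi_center) <= roi_radius
        image.weight = inside_weight if inside else outside_weight
    return images


images = [
    TrainingImage("tower_01", (10.0, 0.0, 5.0)),
    TrainingImage("background_01", (80.0, 40.0, 0.0)),
]
set_weights(images, roi_center=(10.0, 0.0, 0.0), roi_radius=20.0)
```

Only the ordering of the two weights matters for the behavior described in the text; the specific values would be tuning choices.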

[0031] The feature data management unit 43 holds feature data that is determined according to the position of the object being photographed S, serves as input to the learning model, and is subject to update through learning. As shown in Figure 2, the feature data management unit 43 includes a hash function calculation unit 430 and a table 431.

[0032] The hash function calculation unit 430 takes position coordinates as input and outputs a hash value. As an example, the hash function calculation unit 430 calculates a hash value H from three-dimensional position coordinates (x1, x2, x3) using the following equation (1).

[0033]

$$H = \left( x_1 \pi_1 \oplus x_2 \pi_2 \oplus x_3 \pi_3 \right) \bmod T \tag{1}$$

[0034] In equation (1), x1, x2, and x3 are the elements of the three-dimensional position coordinates, π_i (i = 1, 2, 3) are mutually distinct large prime numbers, and ⊕ denotes the exclusive OR. The right-hand side is the remainder (so-called modulo) when the exclusive OR of the products x_i π_i is divided by the size T of table 431. Equation (1) outputs a value from 0 to T-1 regardless of the elements of the three-dimensional position coordinates. Although not limiting, as an example, at least one of the π_i is a prime number on the order of billions.

[0035] Table 431 holds feature data associated with the hash value output by the hash function calculation unit 430. For example, if the three-dimensional position coordinates (x1, x2, x3) are determined, the hash value H (0 ≦ H < T) calculated by the hash function calculation unit 430 is determined, so the value stored in the H-th position of Table 431 becomes the feature data corresponding to the position coordinates (x1, x2, x3). In this sense, the hash value calculated by the hash function calculation unit 430 can also be said to be the address of Table 431.
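The relationship between the hash function calculation unit 430 and table 431 can be sketched in Python as follows. This is a minimal sketch, not the implementation of the embodiment: it assumes integer (quantized) position coordinates, and the table size, feature length, and the constants in `PRIMES` are placeholder values chosen for illustration.

```python
import random

TABLE_SIZE = 2 ** 14   # T: number of entries in the table (assumed value)
FEATURE_DIM = 2        # length of each feature vector (assumed value)
PRIMES = (73856093, 2654435761, 805459861)  # stand-ins for pi_1..pi_3 (assumed values)


def hash_position(x1, x2, x3, table_size=TABLE_SIZE):
    """XOR of the products x_i * pi_i, reduced modulo the table size T."""
    h = 0
    for x, p in zip((x1, x2, x3), PRIMES):
        h ^= x * p
    return h % table_size


# The table holds one feature vector per address; small random initial values are assumed.
rng = random.Random(0)
table = [[rng.uniform(-1e-4, 1e-4) for _ in range(FEATURE_DIM)] for _ in range(TABLE_SIZE)]


def lookup(x1, x2, x3):
    """Return the feature data stored at the address given by the hash value."""
    return table[hash_position(x1, x2, x3)]


features = lookup(12, -7, 3)
```

Because the hash value always falls in the range 0 to T-1, the same table can serve any learning area without changing its structure.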

[0036] The learning unit 40 uses a two-dimensional image with a large weight set by the weight setting unit 42 in preference to a two-dimensional image with a small set weight when updating the parameters of the learning model and the feature data. Specifically, the learning unit 40 increases the frequency of reference in the learning process for a two-dimensional image with a larger weight set by the weight setting unit 42. More specifically, compared with a two-dimensional image with a smaller weight, the learning unit 40 uses a two-dimensional image with a larger weight more often during learning, or increases the number of reference points (number of samples) taken from it during learning. Thereby, the learning unit 40 can focus on learning from two-dimensional images with a large weight set by the weight setting unit 42, that is, two-dimensional images capturing a subject included in the area of interest.
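One simple way to make the reference frequency follow the weights, as described in [0036], is weighted random sampling when a training batch is drawn. The sketch below uses Python's standard `random.choices`; the image names, weight values, and batch size are hypothetical, and the description does not state that the embodiment samples in exactly this way.

```python
import random
from collections import Counter


def sample_batch(image_names, weights, batch_size, rng):
    """Draw images with probability proportional to their weights."""
    return rng.choices(image_names, weights=weights, k=batch_size)


rng = random.Random(0)
names = ["tower_01", "background_01"]   # hypothetical image identifiers
weights = [4.0, 1.0]                    # larger weight for the area of interest
counts = Counter(sample_batch(names, weights, batch_size=10_000, rng=rng))
```

With these weights the image inside the area of interest is expected to be drawn roughly four times as often as the other image.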

[0037] As described above, the feature data stored in table 431 serves as input values for the learning model and is also data that is updated during the learning process. Therefore, once learning by the learning unit 40 has sufficiently progressed for a certain learning area L, the feature data stored in table 431 reflects the information of that learning area L. For this reason, when a new learning area L is set, the feature data management unit 43 changes to different data the feature data that is held in table 431 and linked to the hash values output when the position coordinates included in the newly set learning area L are input to the hash function calculation unit 430. As an example, when a new learning area L is set, the feature data management unit 43 replaces the data in the table corresponding to that area with random numbers. This allows the feature data management unit 43 to reset the information about the old learning area L reflected in the feature data and reduce the influence that the information of the old learning area L has on the newly set learning area L.
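The reset described in [0037] can be sketched as follows: the positions in the new learning area are hashed, and only the table entries at the resulting addresses are overwritten with random numbers. This is an illustrative sketch under assumptions: integer position coordinates, placeholder constants in `PRIMES`, and a small table whose entries are pre-filled with 1.0 to stand for already learned values.

```python
import random

TABLE_SIZE = 2 ** 10   # T (assumed value)
FEATURE_DIM = 2        # assumed feature length
PRIMES = (73856093, 2654435761, 805459861)  # stand-ins for pi_1..pi_3 (assumed values)


def hash_position(x1, x2, x3, table_size=TABLE_SIZE):
    """XOR of the products x_i * pi_i, reduced modulo the table size T."""
    h = 0
    for x, p in zip((x1, x2, x3), PRIMES):
        h ^= x * p
    return h % table_size


def reset_features_for_area(table, area_positions, rng):
    """Overwrite the entries addressed by positions in the new learning area."""
    addresses = {hash_position(*pos) for pos in area_positions}
    for address in addresses:
        table[address] = [rng.uniform(-1e-4, 1e-4) for _ in range(FEATURE_DIM)]
    return addresses


rng = random.Random(0)
table = [[1.0] * FEATURE_DIM for _ in range(TABLE_SIZE)]   # stands for learned values
new_area = [(x, y, 0) for x in range(100, 104) for y in range(200, 204)]
reset_addresses = reset_features_for_area(table, new_area, rng)
```

Entries whose addresses are not produced by any position in the new area keep their previous values.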

[0038] Here, when a new learning area L is established, if the distance between the newly established learning area L and the original learning area L is small, the attribute information of the target S included in the original learning area L and the attribute information of the target S included in the newly established learning area L can be expected to be more similar to each other compared to when the distance is large. As an extreme example, when a new learning area L is established, if the newly established learning area L and the original learning area L overlap, the attribute information of the target S included in the overlapping portion will be identical. In this case, by not resetting the feature data, the learning device 1 can be expected to improve the learning efficiency of the learning model. In this sense, the distance between the newly established learning area L and the original learning area L can be used as an indicator to estimate the similarity of the attribute information of the target S included in the two learning areas L, and the similarity of the attribute information can be used as an indicator of learning efficiency.

[0039] Therefore, when a new learning area L is set, the feature data management unit 43 may change to different data the feature data that is held in table 431 and linked to the hash values output when the position coordinates included in the newly set learning area L are input to the hash function calculation unit 430, on the condition that those position coordinates are at least a predetermined distance away from the original learning area L. The predetermined distance is a similarity determination distance for judging the similarity of the attribute information of the targets S included in the learning areas L. The specific value of the similarity determination distance can be set in advance in consideration of the efficiency of the learning process performed by the learning unit 40; one example is 30 kilometers. The value of the similarity determination distance may also be changed depending on the location of the set learning area L. For example, if the learning area L is a sparsely populated area such as a mountainous area, the similarity determination distance may be longer than for a densely populated area such as an urban area.

[0040] The feature data management unit 43 controls whether or not to change the feature data based on the distance between the newly set learning area L and the original learning area L, thereby further improving the learning efficiency of the learning device 1.
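The distance condition in [0039] and [0040] amounts to a simple gate placed in front of the reset. The sketch below assumes that the distance between two learning areas is measured between their centers in planar kilometer coordinates and that the boundary case counts as "far enough"; neither detail is specified in the description. The 30 km default is the example value given in the text.

```python
import math

SIMILARITY_DISTANCE_KM = 30.0  # example value from the description


def should_reset_by_distance(old_center_km, new_center_km,
                             threshold_km=SIMILARITY_DISTANCE_KM):
    """Return True when the new learning area is at least threshold_km from the old one."""
    return math.dist(old_center_km, new_center_km) >= threshold_km


nearby = should_reset_by_distance((0.0, 0.0), (3.0, 4.0))     # 5 km apart
far_away = should_reset_by_distance((0.0, 0.0), (30.0, 40.0))  # 50 km apart
```

A location-dependent threshold, such as a longer distance for mountainous areas, could be passed through the `threshold_km` parameter.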

[0041] Even if the learning unit 40 makes progress in learning a certain learning area L, it is not guaranteed that all of the feature data stored in table 431 will be updated by that learning. Specifically, because the hash value H output by a hash function, including the hash function shown in equation (1), can be regarded as a random number, the output values of the hash function calculation unit 430 do not necessarily cover all addresses in table 431. Therefore, when a new learning area L is set, the feature data management unit 43 may change to different data the feature data that is held in table 431 and linked to the hash value H output when the position coordinates included in the newly set learning area L are input to the hash function calculation unit 430, on the condition that the feature data has already been learned. This eliminates the need to change feature data that has not yet been learned.
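The condition in [0041] can be sketched by keeping a per-address flag that records whether an entry has been updated by learning. The `learned` flag list and the choice to clear the flag after a reset are assumptions introduced for this sketch; the description only states that entries not yet learned need not be changed.

```python
import random


def reset_learned_entries(table, learned, addresses, rng, feature_dim=2):
    """Reset only those addressed entries that learning has already updated."""
    changed = []
    for address in sorted(set(addresses)):
        if learned[address]:
            table[address] = [rng.uniform(-1e-4, 1e-4) for _ in range(feature_dim)]
            learned[address] = False  # assumption: a reset entry counts as unlearned again
            changed.append(address)
    return changed


rng = random.Random(0)
table = [[0.5, 0.5] for _ in range(8)]
learned = [False] * 8
learned[2] = learned[5] = True   # entries updated while learning the old area
changed = reset_learned_entries(table, learned, addresses=[1, 2, 5, 5], rng=rng)
```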

[0042] The above describes the case where feature data is changed when the distance between the newly set learning area L and the original learning area L is at least the similarity determination distance. Alternatively, or in addition, the change of feature data may be controlled based on the time elapsed between the setting of the original learning area L and the setting of the new learning area L.

[0043] For example, even within the same learning area L, the objects S to be photographed within that area can change over time. If the learning area L is set in an urban area, the structures within that area can change. Even if the learning area L is in a mountainous area, the scenery can change due to the growth of trees, etc.

[0044] Therefore, when a new learning area L is set, the feature data management unit 43 may change to different data the feature data that is held in table 431 and linked to the hash values output when the position coordinates included in the newly set learning area L are input to the hash function calculation unit 430, on the condition that the feature data was learned at least a predetermined amount of time in the past.

[0045] The predetermined time is the similarity determination time for determining the similarity between the attribute information of the target S included in the newly set learning area L and the attribute information of the target S included in past learning areas L. The specific value of the similarity determination time can be set in advance, taking into account the efficiency of the learning process performed by the learning unit 40, but for example, it may be one year. The value of the similarity determination time may also be changed depending on the location of the set learning area L. For example, if the learning area L is a sparsely populated area such as a mountainous area, the similarity determination time may be longer compared to a densely populated area such as an urban area.
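The time condition in [0044] and [0045] can be sketched in the same way as the distance condition. The sketch assumes that a timestamp of when the feature data was learned is available and approximates the "one year" example as 365 days; both are assumptions of this illustration.

```python
from datetime import datetime, timedelta

SIMILARITY_TIME = timedelta(days=365)  # "one year" example, approximated as 365 days


def should_reset_by_age(learned_at, now, threshold=SIMILARITY_TIME):
    """Return True when the feature data was learned at least `threshold` ago."""
    return now - learned_at >= threshold


now = datetime(2025, 6, 1)
stale = should_reset_by_age(datetime(2024, 1, 1), now)   # learned about 17 months ago
fresh = should_reset_by_age(datetime(2025, 3, 1), now)   # learned 3 months ago
```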

<Processing flow of the information processing method executed by learning device 1>

[0046] Figure 3 is a flowchart illustrating the flow of information processing performed by the learning device 1 according to this embodiment. The processing in this flowchart starts, for example, when the learning device 1 is started up.

[0047] The learning unit 40 receives the setting of the learning area L from the user of the learning device 1 (S2). The area of interest receiving unit 41 receives the setting of the area of interest from the user of the learning device 1 (S4). The weight setting unit 42 sets the weights for the 2D images included in the area of interest, among the multiple 2D images included in the teacher information, to be greater than the weights for the 2D images not included in the area of interest (S6).

[0048] The learning unit 40 sets up training data for learning the parameters of the learning model based on the weights set on the 2D image (S8). The feature data management unit 43 stores feature data, which is determined based on the position of the target object S, serves as input to the learning model, and is subject to update by learning, in table 431 (S10). The learning unit 40 performs learning based on INGP using the feature data and training data, and updates the learning model and feature data (S12).

[0049] While the learning area L is not reset by the user of the learning device 1 (No in S14), the learning unit 40 executes the learning process of step S12 until the learning converges. If the learning area L is reset by the user of the learning device 1 (Yes in S14), the feature data management unit 43 identifies the position coordinates included in the newly set learning area L (S16). The feature data management unit 43 then changes to different data the feature data that is held in table 431 and linked to the hash values output when the position coordinates included in the newly set learning area L are input to the hash function calculation unit 430 (S18). After that, the learning device 1 returns to step S4 and continues the above process.

[0050] The process in this flowchart ends when the INGP-based learning converges or when a stop command is received from the user of learning device 1.
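The control flow of the flowchart described above can be traced with a small sketch. The function `run_flow` and its arguments are hypothetical test scaffolding: instead of performing any learning, it records the step labels visited for a scripted sequence of user actions.

```python
def run_flow(area_reset_at, converge_at, stop_at=None, max_iters=100):
    """Trace the step labels of the flow for a scripted run.

    area_reset_at: iteration numbers at which the user resets learning area L.
    converge_at:   iteration number at which learning is treated as converged.
    stop_at:       iteration number at which a stop command arrives, if any.
    All three are hypothetical inputs used only to drive the trace.
    """
    trace = ["S2"]
    it = 0
    while True:
        trace += ["S4", "S6", "S8", "S10"]
        while True:
            trace.append("S12")
            it += 1
            if it >= converge_at or it == stop_at or it >= max_iters:
                return trace  # converged or stopped: the flow ends
            if it in area_reset_at:
                trace += ["S14:yes", "S16", "S18"]
                break  # return to S4
            trace.append("S14:no")


trace = run_flow(area_reset_at={2}, converge_at=4)
```

In this run the learning area is reset once, so steps S4 to S10 appear twice in the trace before learning converges.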

[0051] <Usage scenarios of the learning device 1 according to the embodiment> Usage scenarios for the learning device 1 described above are explained below. (First use case) The first use case for the learning device 1 is learning attribute information of a target object S in an area that changes as the user moves. For example, as the user of the learning device 1 moves between places such as Tokyo, Nagoya, and Osaka, the learning device 1 generates a learning model that has learned, as attribute information, the visible light in the area centered on the user. In this case, the area around the user of the learning device 1 becomes the learning area L. Even when the user's location changes over time due to travel, relocation, etc., the learning device 1 can learn attribute information of the area around the user.

[0052] (Second use case) A second use case for the learning device 1 is to learn attribute information of the target object S in an area where no user of the learning device 1 is present, using training data from that area. As mentioned above, the hash function used by the hash function calculation unit 430 has a fixed range (the set of values the hash function can output). Therefore, even while the learning device 1 is learning a specific area (for example, Osaka, where a user is present) as the learning area L, it can also learn using training data from other areas (for example, Tokyo, where no user is present). This allows the learning device 1 to reflect attribute information from areas where no user is present in its learning model.
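The fixed range of the hash function can be illustrated with a short sketch. The hash constants, the table size, and the grid coordinates standing in for Osaka and Tokyo are all assumptions made for illustration; the source specifies none of them.

```python
# Assumed hash constants and table size; not taken from the source.
PRIMES = (1, 2654435761, 805459861)
TABLE_SIZE = 2**10


def spatial_hash(ix, iy, iz, table_size=TABLE_SIZE):
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return (h & 0xFFFFFFFF) % table_size


# Hypothetical grid cells for two distant areas.
osaka_cells = [(x, y, 0) for x in range(1000, 1020) for y in range(500, 520)]
tokyo_cells = [(x, y, 0) for x in range(400000, 400020) for y in range(90000, 90020)]

osaka_idx = {spatial_hash(*c) for c in osaka_cells}
tokyo_idx = {spatial_hash(*c) for c in tokyo_cells}

# However far apart the inputs are, every hash value falls inside the same table.
in_range = all(0 <= i < TABLE_SIZE for i in osaka_idx | tokyo_idx)
```

Because both areas map into the same bounded set of hash values, a single table of fixed size can hold feature data for either area without being enlarged.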

[0053] (Third use case) A third use case for the learning device 1 is to simultaneously learn two or more pieces of attribute information about the target object S in the learning area L. A specific example is to have the learning device 1 simultaneously learn attribute information related to the visible light and the infrared light reflected from the target object S. In this case, three-dimensional visible light information (RGB information) is combined with one-dimensional infrared information (Ir information), and two-dimensional images in which each pixel holds four-dimensional data are provided as the training data.

<Effects of the learning device 1 according to the embodiment> As described above, the learning device 1 according to the embodiment allows the learning range to be changed flexibly when learning the attribute information of a subject.
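The four-dimensional pixel data of the third use case can be sketched as follows. The sketch assumes that the RGB image and the Ir image are already pixel-aligned and of equal size, which the source does not state; the function name `combine_rgb_ir` and the sample values are hypothetical.

```python
def combine_rgb_ir(rgb_image, ir_image):
    """Merge an H x W RGB image and an H x W Ir image into 4-channel pixels."""
    if len(rgb_image) != len(ir_image):
        raise ValueError("row count mismatch")
    combined = []
    for rgb_row, ir_row in zip(rgb_image, ir_image):
        if len(rgb_row) != len(ir_row):
            raise ValueError("column count mismatch")
        # Each output pixel is (R, G, B, Ir).
        combined.append([(r, g, b, ir) for (r, g, b), ir in zip(rgb_row, ir_row)])
    return combined


rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (10, 20, 30)]]
ir = [[0.1, 0.2],
      [0.3, 0.4]]
rgbi = combine_rgb_ir(rgb, ir)
```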

[0054] Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in the above embodiments, and various modifications and changes are possible within the scope of its gist. For example, all or part of the device can be functionally or physically distributed or integrated in arbitrary units. Furthermore, new embodiments resulting from any combination of multiple embodiments are also included in the embodiments of the present invention. A new embodiment resulting from such a combination has the combined effects of the original embodiments.

[0055] The present invention may be further specified by the items described below.

[Item 1] A learning device comprising:
a learning unit that learns parameters of a learning model which, when a position of a target to be photographed is input, outputs attribute information of the target at that position, based on training information including a plurality of two-dimensional images that show a target included in a predetermined learning area and reflect the attribute information of the target, the three-dimensional coordinates of the camera that captured the plurality of two-dimensional images, and the optical axis direction of the camera;
an area of interest receiving unit that receives a setting of an area of interest from a user;
a weight setting unit that sets, for the two-dimensional images included in the training information that are included in the area of interest, a weight greater than the weight set for the two-dimensional images not included in the area of interest; and
a feature data management unit that holds feature data which is determined according to the position of the target to be photographed, serves as input to the learning model, and is subject to update by the learning,
wherein the feature data management unit includes:
a hash function calculation unit that takes position coordinates as input and outputs a hash value; and
a table that holds the feature data in association with the hash value output by the hash function calculation unit,
the learning unit uses two-dimensional images with larger set weights in preference to two-dimensional images with smaller set weights for updating the parameters of the learning model and the feature data, and
when a new learning area is set, the feature data management unit changes, to different data, the feature data held in the table in association with the hash value output when the position coordinates included in the newly set learning area are input to the hash function calculation unit.

[Item 2] The learning device described in item 1, wherein, when a new area of interest is set, the feature data management unit changes, to different data, the feature data held in the table in association with the hash value output when the position coordinates included in the newly set area of interest are input to the hash function calculation unit, on condition that those position coordinates are separated from the original area of interest by a predetermined distance.

[Item 3] The learning device described in item 1 or item 2, wherein, when a new area of interest is set, the feature data management unit changes, to different data, the feature data held in the table in association with the hash value output when the position coordinates included in the newly set area of interest are input to the hash function calculation unit, on condition that the feature data has already been learned.

[Item 4] The learning device described in item 3, wherein, when a new area of interest is set and the feature data held in the table in association with the hash value output when the position coordinates included in the newly set area of interest are input to the hash function calculation unit has already been learned, the feature data management unit changes the feature data to different data on condition that the learning of the feature data took place a predetermined time earlier.

[Explanation of Symbols]

[0056]
1 ... Learning device
2 ... Storage unit
3 ... Communication unit
4 ... Control unit
40 ... Learning unit
41 ... Area of interest receiving unit
42 ... Weight setting unit
43 ... Feature data management unit
430 ... Hash function calculation unit
431 ... Table
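Items 2 to 4 above attach conditions to the replacement of feature data. A minimal sketch of such conditions is given below, under stated assumptions: the distance in item 2 is measured to a single center point of the original area of interest, "predetermined distance" and "predetermined time" are read as minimum thresholds, and each table row records a timestamp of its last update. The names `RowState`, `far_enough`, and `may_replace` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowState:
    # Hypothetical bookkeeping: time of the last learning update, None if never learned.
    learned_at: Optional[float] = None


def far_enough(new_xyz, old_center, min_distance):
    """Item 2 sketch: the new coordinates are at least min_distance from the original area."""
    return math.dist(new_xyz, old_center) >= min_distance


def may_replace(state, now, min_age):
    """Items 3 and 4 sketch: the row has been learned, and that learning is at least min_age old."""
    return state.learned_at is not None and (now - state.learned_at) >= min_age


never_learned = RowState()
old_row = RowState(learned_at=50.0)
fresh_row = RowState(learned_at=95.0)

decisions = [
    may_replace(never_learned, now=100.0, min_age=10.0),
    may_replace(old_row, now=100.0, min_age=10.0),
    may_replace(fresh_row, now=100.0, min_age=10.0),
]
```

Under this reading, only rows that were learned sufficiently long ago are eligible for replacement, which protects both unlearned rows and rows that are still being trained.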

Claims

[Claim 1] A learning device comprising:
a learning unit that learns parameters of a learning model which, when a position of a target to be photographed is input, outputs attribute information of the target at that position, based on training information including a plurality of two-dimensional images that show a target included in a predetermined learning area and reflect the attribute information of the target, the three-dimensional coordinates of the camera that captured the plurality of two-dimensional images, and the optical axis direction of the camera;
an area of interest receiving unit that receives a setting of an area of interest from a user;
a weight setting unit that sets weights for the two-dimensional images included in the training information such that the weight set for each two-dimensional image included in the area of interest is greater than the weight set for each two-dimensional image not included in the area of interest; and
a feature data management unit that holds feature data which is determined according to the position of the target to be photographed, serves as input to the learning model, and is subject to update by the learning,
wherein the feature data management unit includes:
a hash function calculation unit that takes position coordinates as input and outputs a hash value; and
a table that holds the feature data in association with the hash value output by the hash function calculation unit,
the learning unit uses two-dimensional images with larger set weights in preference to two-dimensional images with smaller set weights for updating the parameters of the learning model and the feature data, and
when a new learning area is set, the feature data management unit changes, to different data, the feature data held in the table in association with the hash value output when the position coordinates included in the newly set learning area are input to the hash function calculation unit.