Information processing device
The information processing device addresses the challenge of data collection for large language models by generating new graph structures and image data, automating data augmentation to enhance data quality and reduce human effort.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- TOYOTA JIDOSHA KK
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
Abstract
Description
Technical Field
[0001] The present invention relates to the technical field of information processing apparatuses.
Background Art
[0002] As this type of apparatus, for example, a system has been proposed in which a large language model (LLM) generates query data based on a document, and a pair of the document and the query data is used for learning a search model for a chatbot (see Patent Document 1).
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] A large language model is a language model constructed using a very large dataset and deep learning technology. In some cases, it is difficult to collect the large amount of data (i.e., training data) needed to train such a model. Therefore, a technique called data augmentation has been proposed, in which new data is artificially generated by modifying existing data. However, when a person specifies the modifications to be made to the existing data, that person's workload is relatively large.
[0005] The present invention has been made in view of the above circumstances, and an object thereof is to provide an information processing apparatus capable of generating new data.
Means for Solving the Problems
[0006] An information processing device according to one aspect of the present invention comprises: a first generation means for generating, based on graph information relating to a plurality of graph structures each generated based on a plurality of image data registered in a database, a new graph structure whose deviation from the plurality of graph structures is greater than or equal to a first predetermined value; and a second generation means for generating new image data based on the new graph structure.
Brief Description of the Drawings
[0007]
Figure 1 is a block diagram showing an example of the configuration of the information processing device according to the embodiment.
Figure 2 is a block diagram showing an example of the configuration of the arithmetic unit according to the embodiment.
Figure 3 is a conceptual diagram illustrating the operation of the information processing device according to the embodiment.
Modes for Carrying Out the Invention
[0008] Embodiments relating to the information processing device will be described with reference to Figures 1 to 3. In Figure 1, the information processing device 10 includes an arithmetic unit 11, a storage device 12, a communication device 13, an input device 14, and an output device 15. The arithmetic unit 11, the storage device 12, the communication device 13, the input device 14, and the output device 15 are connected via a data bus 16.
[0009] The arithmetic unit 11 may have a processor. The arithmetic unit 11 may have a single processor or multiple processors. In other words, the arithmetic unit 11 may have one or more processors. Furthermore, the processor may be a multi-core processor. If the arithmetic unit 11 has a single processor that is a multi-core processor, then logically, the arithmetic unit 11 can be said to have multiple processors.
[0010] The processor may be at least one of the following: CPU (Central Processing Unit), GPU (Graphics Processing Unit), FPGA (Field Programmable Gate Array), and TPU (Tensor Processing Unit).
[0011] The storage device 12 may be at least one of the following: RAM (Random Access Memory), ROM (Read Only Memory), hard disk drive, magneto-optical disk drive, SSD (Solid State Drive), and optical disk array. In other words, the storage device 12 may be implemented by a single device or by multiple devices.
[0012] The communication device 13 may be capable of communicating with devices outside the information processing device 10. The communication device 13 may use either wired or wireless communication.
[0013] The input device 14 is a device capable of receiving information input to the information processing device 10 from an external source. The input device 14 may include an operating device (e.g., a keyboard, mouse, touch panel, etc.) that can be operated by the user of the information processing device 10. The input device 14 may include a recording medium reader capable of reading information recorded on a recording medium that can be attached to and detached from the information processing device 10, such as a USB (Universal Serial Bus) memory. When information is input to the information processing device 10 via the communication device 13 (in other words, when the information processing device 10 acquires information via the communication device 13), the communication device 13 may function as an input device.
[0014] The output device 15 is a device capable of outputting information to the outside of the information processing device 10. The output device 15 has a display device 151 capable of outputting visual information such as characters and images as the above information. The output device 15 may also have a speaker capable of outputting auditory information such as sound as the above information. The output device 15 may also have a vibration motor capable of outputting tactile information such as vibration as the above information. The output device 15 may also have a printer. The output device 15 may be capable of outputting information to a recording medium that can be attached to and detached from the information processing device 10, such as a USB memory stick. When the information processing device 10 outputs information via the communication device 13, the communication device 13 may function as an output device.
[0015] The storage device 12 is capable of storing desired data. The storage device 12 may store the computer program CP that the arithmetic unit 11 will execute. The storage device 12 may temporarily store data that the arithmetic unit 11 will use temporarily when the arithmetic unit 11 is executing the computer program CP.
[0016] The computer program CP may be recorded on a non-transitory computer-readable recording medium. In this case, the computer program CP may be stored in the storage device 12 by reading the recording medium with a recording medium reading device (not shown) provided in the information processing device 10. At least one of the following may be used as the recording medium: an optical disc, a magnetic medium, a magneto-optical disc, a semiconductor memory, and any other medium capable of storing a program. The computer program CP may also be obtained from a device (not shown) external to the information processing device 10 via the communication device 13. In other words, the computer program CP may be downloaded from an external device to the storage device 12 of the information processing device 10.
[0017] The arithmetic unit 11 (for example, a processor) may execute the processing that the information processing device 10 should perform together with the storage device 12 in which the computer program CP is stored (in other words, together with the storage device 12 and the computer program CP stored in the storage device 12). For example, when the arithmetic unit 11 executes the computer program CP, a logical functional block for executing the processing that the information processing device 10 should perform may be realized within the arithmetic unit 11 (for example, within the processor).
[0018] As shown in Figure 2, the arithmetic unit 11 has a first generation unit 111 and a second generation unit 112. The first generation unit 111 and the second generation unit 112 may be implemented as the logical functional blocks described above. At least one of the first generation unit 111 and the second generation unit 112 may be implemented as a physical processing circuit. At least one of the first generation unit 111 and the second generation unit 112 may be implemented in a form in which logical functional blocks and physical processing circuits are mixed.
[0019] The operation of the information processing device 10 configured as described above will be explained with reference to Figure 3. In Figure 3, multiple image data Imgs are registered in the database DB. Graph information GSI exists that relates to the multiple graph structures generated based on the multiple image data Imgs. The graph information GSI may be information indicating the multiple graph structures themselves, or information indicating the feature vectors obtained by vectorizing each of the multiple graph structures. The graph information GSI may also be a distribution map showing how the features related to the multiple graph structures are distributed in a feature space.
[0020] The graph information GSI may be registered in the database DB or may be stored in a device other than the database DB. A graph structure may mean data composed of a group of nodes representing the relationships between parts of objects in the image related to one piece of image data, and a group of edges indicating the relationships between the nodes. Various existing techniques can be applied to generate a graph structure from image data, so a detailed description of the generation method is omitted. The feature related to a graph structure may be calculated using a learning model (for example, a Graph Neural Network: GNN). The "feature related to the graph structure" may be a feature of the entire graph structure or a feature of each component (that is, each node) included in the graph structure.
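As an illustration only, the node-and-edge data described above could be held in a structure like the following Python sketch. The class names, the attribute names (object_type, shape, color, relation), the assumption that each node stands for one object or object part, and the example scene are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One object or object part in the image (attribute names are assumptions)."""
    node_id: int
    object_type: str
    shape: str
    color: str


@dataclass(frozen=True)
class Edge:
    """A directed relationship between two nodes, e.g. a positional relationship."""
    source: int
    target: int
    relation: str


@dataclass
class GraphStructure:
    """Graph structure derived from one piece of image data."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def relations_of(self, node_id):
        """Return (relation, target id) pairs for edges starting at node_id."""
        return [(e.relation, e.target) for e in self.edges if e.source == node_id]


# Illustrative scene: a red cup on a brown table.
scene = GraphStructure(
    nodes=[Node(0, "cup", "cylinder", "red"), Node(1, "table", "box", "brown")],
    edges=[Edge(0, 1, "on_top_of")],
)
```

Here scene.relations_of(0) lists the relationships that start at the cup node. A feature vector for such a structure would be computed separately, for example by a GNN, which this sketch does not attempt.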
[0021] The information processing device 10 may perform the processing described below in order to generate new image data (for example, image data Img) using the plurality of image data Imgs registered in the database DB.
[0022] The first generation unit 111 of the information processing device 10 may generate, based on the graph information GSI, a new graph structure (for example, graph structure GS) whose amount of deviation from the plurality of graph structures related to the plurality of image data Imgs is equal to or greater than a first predetermined value. The first generation unit 111 may also generate, based on the graph information GSI, a new graph structure whose amount of deviation from the plurality of graph structures is equal to or greater than the first predetermined value and less than a second predetermined value.
[0023] For example, the first generation unit 111 may determine (or estimate) the amount of deviation by calculating the distance between each of the plurality of graph structures and a candidate for a new graph structure. For example, the first generation unit 111 may determine (or estimate) the amount of deviation based on the feature vector related to each of the plurality of graph structures and the feature vector related to the candidate for the new graph structure. For example, the first generation unit 111 may determine (or estimate) the amount of deviation based on at least one of the type, positional relationship, shape, and color of the objects in each of the plurality of graph structures and at least one of the type, positional relationship, shape, and color of the objects in the candidate for the new graph structure.
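One possible reading of the feature-vector approach is sketched below in Python. The specification does not say which distance is used or how the distances to the individual graph structures are combined, so the Euclidean distance and the choice of the smallest distance as the amount of deviation are assumptions made for this sketch, and the function name deviation is hypothetical.

```python
import math


def deviation(candidate, existing):
    """Amount of deviation of a candidate feature vector from existing ones.

    Assumption: the deviation is the smallest Euclidean distance between the
    candidate vector and any existing feature vector.
    """
    if not existing:
        raise ValueError("at least one existing feature vector is required")
    return min(math.dist(candidate, vector) for vector in existing)


# Feature vectors of graph structures already derived from registered images.
existing_vectors = [(0.0, 0.0), (3.0, 4.0)]
candidate_vector = (0.0, 3.0)
print(deviation(candidate_vector, existing_vectors))
```

Taking the minimum means a candidate counts as deviating only when it is far from every existing graph structure. Other aggregations, such as the mean distance, would fit the description equally well.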
[0024] For example, when the graph information GSI is the above-described distribution map, the first generation unit 111 may generate a new graph structure such that the new graph structure includes a component corresponding to a blank portion of the distribution map (that is, a region where no data points indicating feature amounts exist). In this case, the first generation unit 111 may determine (or estimate) the amount of deviation by any of the methods described above. Since the first generation unit 111 generates a new graph structure, the first generation unit 111 may be referred to as a graph structure generation means or a structure generation means.
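A minimal sketch of locating blank portions of a distribution map follows. It assumes a two-dimensional feature space divided into a square grid and treats a grid cell with no data point as blank. The grid, the function name blank_cells, and its parameters are hypothetical, and all points are assumed to lie within the range from lo to hi on both axes.

```python
def blank_cells(points, bins, lo, hi):
    """Return the centers of grid cells that contain no data point.

    Assumptions: a 2-D feature space, the same range [lo, hi] on both axes,
    and every point inside that range.
    """
    width = (hi - lo) / bins
    occupied = set()
    for x, y in points:
        # Clamp so that a point exactly at hi falls into the last cell.
        ix = min(int((x - lo) / width), bins - 1)
        iy = min(int((y - lo) / width), bins - 1)
        occupied.add((ix, iy))
    return [
        (lo + (ix + 0.5) * width, lo + (iy + 0.5) * width)
        for ix in range(bins)
        for iy in range(bins)
        if (ix, iy) not in occupied
    ]


# Three occupied quadrants leave the upper-right quadrant blank.
features = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9)]
print(blank_cells(features, bins=2, lo=0.0, hi=1.0))
```

Each returned center is a feature value that a component of the new graph structure could target. How a feature value is turned back into a graph component is not covered by this sketch.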
[0025] For example, if a large amount of mutually similar data is included in the data set used for model learning, causing a bias in the data distribution, the quality of the data set deteriorates. In the present embodiment, in order to suppress such a quality degradation, the first generation unit 111 may generate a new graph structure in which the amount of deviation from the plurality of graph structures related to the plurality of image data Imgs is equal to or greater than a first predetermined value. That is, it can be said that the first predetermined value is a value for suppressing the occurrence of a bias in the data distribution in the data set. The first predetermined value may be a fixed value or a variable value according to some parameter.
[0026] For example, if the dataset used to train a model contains a relatively large number of outliers or abnormal values, the accuracy of the model trained using that dataset may be relatively low. In this embodiment, in order to suppress the generation of outliers and abnormal values, the first generation unit 111 may generate a new graph structure in which the deviation amount from the multiple graph structures relating to the multiple image data Imgs is less than a second predetermined value. In other words, the second predetermined value can be said to be a value for suppressing the generation of outliers and abnormal values. The second predetermined value may be a fixed value or a variable value depending on some parameter.
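The two thresholds can be read as a band that a candidate's amount of deviation must fall into: at least the first predetermined value, and, when a second predetermined value is used, less than that value. The Python sketch below illustrates this check. The function name within_band, its parameter names, and the numeric values are hypothetical.

```python
def within_band(deviation, first_value, second_value=None):
    """Check a candidate's amount of deviation against the thresholds.

    first_value: lower bound (inclusive), guards against near-duplicate data.
    second_value: optional upper bound (exclusive), guards against outliers.
    """
    if deviation < first_value:
        return False  # too similar to existing graph structures
    if second_value is not None and deviation >= second_value:
        return False  # too far away, likely an outlier
    return True


# Keep only the candidate deviations inside the band [1.0, 5.0).
candidate_deviations = [0.2, 1.0, 3.5, 5.0, 9.0]
accepted = [d for d in candidate_deviations if within_band(d, 1.0, 5.0)]
print(accepted)
```

Omitting second_value corresponds to the case in which only the first predetermined value is applied.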
[0027] The second generation unit 112 of the information processing device 10 may generate new image data (e.g., image data Img) based on the new graph structure (e.g., graph structure GS) generated by the first generation unit 111. In other words, the second generation unit 112 may generate new image data such that the graph structure generated based on the new image data matches the new graph structure generated by the first generation unit 111. The second generation unit 112 may also generate the new image data using a learning model (e.g., an image generation AI (Artificial Intelligence)) that generates image data when given the new graph structure generated by the first generation unit 111 as input. Since the second generation unit 112 generates new image data, it may also be referred to as an image generation means.
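One way to obtain image data whose graph structure matches the target is a generate-and-verify loop, sketched below in Python. The retry loop itself is an assumption, since the specification does not state how the match is achieved. The callables generate and extract are hypothetical stand-ins for an image generation model and for a graph extraction method, and the stub functions at the end exist only to make the sketch runnable.

```python
def generate_matching_image(target_graph, generate, extract, max_attempts=5):
    """Return an image whose extracted graph equals target_graph, or None.

    generate(target_graph, attempt) -> image   (stand-in for an image generation model)
    extract(image) -> graph                    (stand-in for graph extraction)
    """
    for attempt in range(max_attempts):
        image = generate(target_graph, attempt)
        if extract(image) == target_graph:
            return image
    return None


# Stubs for illustration: the "image" is a string, and extraction only
# recovers the intended graph on the third attempt.
def stub_generate(graph, attempt):
    return f"{graph}-{attempt}"


def stub_extract(image):
    return image.split("-")[0] if image.endswith("2") else "other"


print(generate_matching_image("cup_on_table", stub_generate, stub_extract))
```

Returning None after max_attempts leaves it to the caller to decide whether to discard the candidate graph structure or to try again.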
[0028] The arithmetic unit 11 of the information processing device 10 may register the new image data (e.g., image data Img) generated by the second generation unit 112 in the database DB. The arithmetic unit 11 of the information processing device 10 may also control the display device 151 to display the image related to the new image data (e.g., image data Img) generated by the second generation unit 112. The user of the information processing device 10 may instruct, via the input device 14, whether or not to register the image data related to the image displayed on the display device 151 in the database DB.
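The registration step with user confirmation might look like the following Python sketch. The database is modeled as a plain list and the user's instruction as a callable named approve. Both, along with the function name register_if_approved, are hypothetical simplifications.

```python
def register_if_approved(database, image, approve):
    """Append image to database only when approve(image) is true.

    approve stands in for displaying the image and reading the user's
    instruction from an input device. Returns the user's decision.
    """
    approved = bool(approve(image))
    if approved:
        database.append(image)
    return approved


# Illustrative use: the user accepts the first image and rejects the second.
db = ["existing_image"]
register_if_approved(db, "new_image_a", lambda image: True)
register_if_approved(db, "new_image_b", lambda image: False)
print(db)
```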
[0029] (Technical effects) In this embodiment, the first generation unit 111 generates a new graph structure based on graph information GSI. The second generation unit 112 generates new image data based on the newly generated graph structure. In other words, the information processing device 10 according to this embodiment can generate new image data. Here, in this embodiment, the information processing device 10 automatically generates new image data based on graph information GSI. Therefore, the information processing device 10 can reduce the workload on humans.
[0030] Aspects of the invention derived from the embodiment described above are described below.
[0031] An information processing device according to one aspect of the invention includes: a first generation means that generates a new graph structure whose deviation from the plurality of graph structures is greater than or equal to a first predetermined value, based on graph information relating to a plurality of graph structures each generated based on a plurality of image data registered in a database; and a second generation means that generates new image data based on the new graph structure. In the above embodiment, the "first generation unit 111" corresponds to an example of the "first generation means," and the "second generation unit 112" corresponds to an example of the "second generation means."
[0032] In the information processing device according to the above aspect, the graph information may be a distribution map in which the distribution of the plurality of graph structures is mapped, and the first generation means may generate, as the new graph structure, a graph structure that includes a component corresponding to a blank portion of the distribution map. With this configuration, image data that differs from the plurality of image data already registered in the database can be generated relatively easily.
[0033] In the information processing device according to the above aspect, the first generation means may generate, based on the graph information, the new graph structure whose deviation is greater than or equal to the first predetermined value and less than a second predetermined value. With this configuration, the generation of image data corresponding to outliers or abnormal values can be suppressed.
[0034] In the information processing device according to the above aspect, the graph information may include information indicating at least one of the type, positional relationship, shape, and color of objects in the graph structures, and the first generation means may determine the deviation based on at least one of the type, positional relationship, shape, and color of the objects in the graph structures. With this configuration, the deviation can be determined relatively easily.
[0035] The present invention is not limited to the embodiments described above, and can be modified as appropriate within a scope that does not contradict the gist or idea of the invention as can be read from the claims and the specification as a whole. Information processing devices involving such modifications are also included within the technical scope of the present invention.
Explanation of Symbols
[0036] 10... Information processing device, 11... Arithmetic unit, 12... Storage device, 13... Communication device, 14... Input device, 15... Output device, 111... First generation unit, 112... Second generation unit
Claims
1. An information processing device comprising: a first generation means for generating, based on graph information relating to a plurality of graph structures each generated based on a plurality of image data registered in a database, a new graph structure whose deviation from the plurality of graph structures is greater than or equal to a first predetermined value; and a second generation means for generating new image data based on the new graph structure.
2. The information processing device according to claim 1, wherein the graph information is a distribution map in which the distribution of the plurality of graph structures is mapped, and the first generation means generates, as the new graph structure, a graph structure that includes a component corresponding to a blank portion of the distribution map.
3. The information processing device according to claim 1, wherein the first generation means generates, based on the graph information, the new graph structure such that the deviation is greater than or equal to the first predetermined value and less than a second predetermined value.
4. The information processing device according to claim 1, wherein the graph information includes information indicating at least one of the type, positional relationship, shape, and color of objects in the graph structures, and the first generation means determines the deviation based on at least one of the type, positional relationship, shape, and color of the objects in the graph structures.