Information processing system, information processing method, and program

The system addresses the challenge of evaluating driving safety by using an encoder to associate simulation and actual driving data, enabling effective analysis and evaluation of driving performance.

WO2026141048A1 · PCT designated stage · Publication Date: 2026-07-02 · NEC CORP


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: NEC CORP
Filing Date: 2025-12-16
Publication Date: 2026-07-02

Patent Timeline

16 Dec 2025: Application
02 Jul 2026: Publication (WO2026141048A1)

IPC: G08G1/00; G06F16/783

AI Tagging

Technology Topics

Information processing; Data acquisition


AI Technical Summary

Technical Problem

Existing systems face challenges in accurately evaluating driving safety due to the difficulty in preparing dangerous driving videos, leading to a lack of comparable data for analysis.

Method used

An information processing system that associates simulation data with detailed text and actual driving data using an encoder trained on both, allowing for the retrieval and analysis of actual driving data by referencing simulation data.

Benefits of technology

Enables accurate evaluation of driving safety by bridging the gap between simulated and actual driving data, providing a comprehensive analysis of driving performance.



Patent Text Reader

Abstract

Provided is an information processing system comprising: a storage means that stores, in a database, a first feature amount which is information relating to a vehicle driving simulation, the first feature amount being linked to a first text generated on the basis of simulation data; an actual data acquisition means that acquires a second feature amount which is information relating to actual driving of the vehicle; an identification means that identifies, from the database, the first feature amount in the simulation data relating to actual data, on the basis of the second feature amount; and an analysis means that analyzes the actual data on the basis of the first text linked to the identified first feature amount.


Description

Information Processing System, Information Processing Method, Program

[0001] The present disclosure relates to an information processing system, an information processing method, and a program.

[0002] Patent Document 1 describes a program capable of determining a driving state regarding whether traffic rules indicated by road signs are being followed.

[0003] Patent Document 1: Japanese Patent Application Laid-Open No. 2023-93386

[0004] However, dangerous driving videos are difficult to prepare. Consequently, when the input driving video shows a dangerous driving situation, similar data cannot be obtained, and the desired results cannot be obtained. An object of the present disclosure is therefore to provide an information processing system capable of evaluating actual driving data by searching for and referring to driving simulation data related to that actual data.

[0005] The information processing system of the present disclosure includes: storage means for storing, in a database, a first feature amount obtained by inputting simulation data, which is information related to a vehicle driving simulation, into an encoder, in association with a first text generated based on the simulation data; actual data acquisition means for acquiring a second feature amount by inputting actual data, which is information related to actual driving of a vehicle, into the encoder; specifying means for specifying, from the database and based on the second feature amount, the first feature amount of the simulation data related to the actual data; and analysis means for analyzing the actual data based on the first text associated with the specified first feature amount. The encoder is trained using a second text generated based on the simulation data and a third text generated based on the actual data.

[0006] The information processing method of the present disclosure comprises: storing, in a database, a first feature quantity obtained by inputting simulation data, which is information relating to a vehicle driving simulation, into an encoder, in association with a first text generated based on the simulation data; inputting actual data, which is information relating to the actual driving of a vehicle, into the encoder to obtain a second feature quantity; identifying, from the database and based on the second feature quantity, the first feature quantity of the simulation data relating to the actual data; and analyzing the actual data based on the first text associated with the identified first feature quantity, wherein the encoder has been trained using a second text generated based on the simulation data and a third text generated based on the actual data.

[0007] The program of this disclosure causes an information processing device to: store, in a database, a first feature obtained by inputting simulation data, which is information relating to the driving of a vehicle acquired from sensor data, into an encoder, in association with a first text generated based on the simulation data; obtain a second feature by inputting actual data, which is information relating to the actual driving of a vehicle, into the encoder; identify, from the database and based on the second feature, the first feature of the simulation data relating to the actual data; and analyze the actual data based on the first text associated with the identified first feature, wherein the encoder has been trained using a second text generated based on the simulation data and a third text generated based on the actual data.

[0008] This disclosure provides an information processing system that can evaluate actual driving data by searching for and referencing driving simulation data related to actual driving data.

[0009] Figure 1 is a diagram showing an overview of the related information processing system.
Figure 2 is a first block diagram showing the configuration of the information processing system according to this disclosure.
Figure 3 is a flowchart of the information processing method according to this disclosure.
Figure 4 is a diagram showing an overview of the information processing system according to this disclosure.
Figure 5 is a second block diagram showing the configuration of the information processing system according to this disclosure.
Figure 6 is a diagram explaining an encoder of the information processing system according to this disclosure that resolves the gap between simulated driving video and actual driving video.
Figure 7A is a first diagram explaining the video memory of the information processing system according to this disclosure.
Figure 7B is a second diagram explaining the video memory of the information processing system according to this disclosure.
Figure 8A is a flowchart of the learning method for the encoder of the information processing system according to this disclosure.
Figure 8B is a flowchart of the video memory construction method for the information processing system according to this disclosure.
Figure 8C is a flowchart of the search and analysis method for the information processing system according to this disclosure.
Figure 9 is a block diagram showing the configuration of the information processing device according to this disclosure.

[0010] (Explanation of the related information processing system) Figure 1 is a diagram showing an overview of the related information processing system. The related information processing system will be explained with reference to Figure 1.

[0011] With the widespread adoption of dashcams, the importance of driver assistance systems utilizing dashcam video data, GPS (Global Positioning System), or sensor data such as acceleration sensors is increasing. To provide a method for accurately evaluating whether a driver is driving safely based on dashcam footage, for the purpose of safety management by transportation companies and changing employee driving habits, related information processing systems are being considered. Furthermore, the purpose of such systems is not limited to the above. For example, the system could also be used to evaluate the driving performance of autonomous vehicles.

[0012] As shown in Figure 1, the related information processing system 100 includes a video memory 102, a search module 104, and a large language model (LLM) 105.

[0013] The video memory 102 stores multiple simulation data 101 related to driving. The video memory 102 serves as the retrieval database for RAG (Retrieval-Augmented Generation).

[0014] The search module 104 acquires the driving data 103. Here, the driving data 103 is information about actual driving, and will hereafter be referred to as actual driving data. The search module 104 then issues a query regarding the driving data 103. Subsequently, the search module 104 accesses the video memory 102. Using the query, the search module 104 acquires simulation data 101 related to the driving data 103 from the video memory 102. The search module 104 inputs the driving data 103 and the acquired simulation data 101 into the LLM 105.

[0015] When the LLM 105 receives driving data 103, which is actual driving data, and simulation data 101, which is simulated driving data, it evaluates the actual driving and outputs a judgment result.

[0016] Building such a system makes it possible to evaluate a driver's driving. As mentioned above, when the driving dataset is limited, an approach that searches for and analyzes driving data similar to the input driving data is more appropriate than direct learning. However, in the related RAG, the annotation information included in the simulation data 101 of the driving dataset is simple and lacks detail. As a result, there is insufficient information to determine whether or not the driving is safe.

[0017] (Description of the Information Processing System According to the Embodiment) Figure 2 is a first block diagram showing the configuration of the information processing system according to this disclosure. The information processing system according to the embodiment will be described with reference to Figure 2.

[0018] As shown in Figure 2, the information processing system 200 according to this embodiment includes a storage unit 201, a real data acquisition unit 202, an identification unit 203, and an analysis unit 204.

[0019] The memory unit 201 stores, in a database, a first feature obtained by inputting simulation data, which is information related to the vehicle driving simulation, into an encoder, in association with text information that includes a first text generated based on the simulation data. The database resides on a storage device such as a hard disk, memory, CD-ROM (Compact Disc Read Only Memory), or DVD-ROM (Digital Versatile Disc Read Only Memory).

[0020] Information regarding the driving simulation is obtained through sensor data. Sensor data includes at least one of the following: video, images, vehicle speed, vehicle acceleration, vehicle control signals, or vehicle position information. The sensor data includes video and image data acquired by RGB cameras, infrared cameras, RGBD cameras, etc. It also includes speed data acquired by a speed sensor, acceleration data acquired by an acceleration sensor, control signals derived for braking, and position information acquired by GPS. Preferably, the sensor data primarily consists of video, with speed, acceleration, and control signals used as secondary elements. However, it is not limited to this, and only one type of sensor data may be acquired.

[0021] The encoder is a machine learning model that takes simulation data, which is information related to driving simulations, as input and outputs a first feature. The encoder is trained using a second text generated based on the simulation data and a third text generated based on real data. The second and third texts will be described later.

[0022] Furthermore, the memory unit 201 is a database that stores text information having a first text. The first text is text that describes detailed driving conditions. The first text is, for example, a verbal representation of at least one of the following shown by the simulation data: the status of the traffic signals, the color and type of the lane lines, or the traffic signs. The first text consists of detailed text such as, for example, "The traffic light is red, so stop at the stop line," "The lane line is yellow, so drive without overtaking," or "There is a stop sign, so stop at the stop line." The first text is detailed enough to be compared with traffic rules. The first text is also a piece of annotation information, that is, a sentence that explains what the simulation data shows. The first text is acquired by a machine learning model that takes the simulation driving data as input and outputs the content of the simulated driving as text. Alternatively, the first text may be assigned by a human.

[0023] The second text is, for example, text describing the vehicle's movement as shown by the simulation data. The second text consists of simple phrases such as "drive straight," "turn a curve," and "stop at a stop line." The second text is also a form of annotation information that explains what the simulation data indicates. The second text is obtained by a machine learning model that outputs the details of the simulation's driving in text form based on the input of the simulation's driving data. Alternatively, the second text may be added by a human. Here, the first text contains more detailed information about the driving than the second text. The first text may also contain more information than the second text.

[0024] The memory unit 201 may store the first feature of the simulation data, the first text, and the second text in a database, linking them together.

[0025] The third text is text describing the vehicle's movement as shown by the real data. The third text is acquired by a machine learning model that outputs text based on video data input related to driving. Furthermore, the third text is a piece of annotation information that explains the content shown by the real data. Here, the third text has a similar level of detail to the second text. Therefore, the third text is structured to be comparable to the second text.

[0026] The real data acquisition unit 202 acquires a second feature by inputting real data, which is information related to the actual driving of the vehicle, into the encoder. The real data is acquired from sensor data. The sensor data includes at least one of the following: video, images, vehicle speed, vehicle acceleration, vehicle control signals, or vehicle position information. Specifically, the sensor data includes video and image data acquired by RGB cameras, infrared cameras, RGBD cameras, etc., speed data acquired by a speed sensor, acceleration data acquired by an acceleration sensor, control signals derived for braking, and position information acquired by GPS. Preferably, video is the main sensor data, with speed, acceleration, control signals, etc., as secondary data. However, the sensor data is not limited to this, and only one type of sensor data may be acquired. The real data is acquired using sensor data of the same kind as that used for the simulation data stored in the storage unit 201. Therefore, the simulation data and the real data can be compared.

[0027] The encoder is a machine learning model that takes real data as input and outputs a second feature. This encoder may be the same as the encoder that takes simulated data as input and outputs a first feature.

[0028] The identification unit 203 identifies the first feature of the simulation data relating to the actual data from the database based on the second feature. The identification unit 203 identifies simulation data from multiple simulation data present in the storage unit 201 that satisfy predetermined conditions regarding its relationship with the actual data. For example, the identification unit 203 searches for simulation data that is close to the actual data. The information processing system 200 then obtains the identified simulation data along with the first text associated with that simulation data. The identification unit 203 further refers to the rule memory, which stores traffic rules and the like indicating rules related to driving, and identifies the relevant traffic rules based on the obtained first text. The identification unit 203 obtains information regarding the identified traffic rules.
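The identification step can be illustrated with a minimal sketch. It assumes that "close to the actual data" means high cosine similarity between feature vectors and that the predetermined condition is a similarity threshold; the disclosure fixes neither choice, and the names `cosine_similarity`, `identify`, and `threshold` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(database, second_feature, threshold=0.8):
    """Return the (first_feature, first_text) entry closest to the query.

    `database` is a list of (first_feature, first_text) pairs.
    The threshold stands in for the "predetermined conditions" (assumption).
    Returns None when no entry is similar enough.
    """
    best, best_score = None, threshold
    for first_feature, first_text in database:
        score = cosine_similarity(first_feature, second_feature)
        if score >= best_score:
            best, best_score = (first_feature, first_text), score
    return best
```

Returning `None` when nothing clears the threshold keeps the analysis step from reasoning over an unrelated simulation scene; a deployment might instead return the top few matches.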

[0029] The analysis unit 204 analyzes real data based on the first text of the identified simulation data. The analysis unit 204 is composed of, for example, an LLM (Large Language Model). The analysis unit 204 is a machine learning model that takes the first text and the acquired traffic rules as input and outputs an evaluation result for the driving shown by the real data. The evaluation result may concern, for example, driving stability, but is not limited to this. Also, the output may be a numerical value such as a score, but is not limited to this. Since the first text is detailed enough to be compared with traffic rules, the analysis unit 204 compares the traffic rules with the first text to evaluate driving safety, etc.
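One way the inputs to such an analysis model might be assembled is sketched below. The prompt wording, the 0 to 100 score range, and the names `build_analysis_prompt`, `analyze`, and `llm` are all assumptions made for illustration; the disclosure does not specify a prompt format or a particular model interface.

```python
def build_analysis_prompt(real_data_summary, first_text, traffic_rules):
    """Assemble a plain-text prompt for a driving-evaluation model.

    The layout is hypothetical; only the three kinds of input
    (real data, first text, traffic rules) come from the description.
    """
    rule_lines = "\n".join(f"- {rule}" for rule in traffic_rules)
    return (
        "Evaluate whether the driving below is safe.\n"
        f"Observed driving: {real_data_summary}\n"
        f"Matched simulation description: {first_text}\n"
        f"Relevant traffic rules:\n{rule_lines}\n"
        "Answer with a score from 0 to 100 and a short justification."
    )


def analyze(real_data_summary, first_text, traffic_rules, llm):
    """Run the analysis with any callable `llm` mapping a prompt to text."""
    prompt = build_analysis_prompt(real_data_summary, first_text, traffic_rules)
    return llm(prompt)
```

Passing the model in as a plain callable keeps the sketch independent of any specific LLM service.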

[0030] The memory unit 201, the actual data acquisition unit 202, the identification unit 203, and the analysis unit 204 may be read as memory means, actual data acquisition means, identification means, and analysis means, respectively.

[0031] The above configuration provides an information processing system that can evaluate real-world driving data by searching for and referencing simulation data related to actual driving data.

[0032] (Description of the Information Processing Method According to the Embodiment) Figure 3 is a flowchart of the information processing method according to this disclosure. The information processing method according to the embodiment will be described with reference to Figure 3.

[0033] As shown in Figure 3, first, the memory unit 201 stores the first feature quantity and the first text in association (step S301). The memory unit 201 stores the vehicle driving simulation data acquired from the sensor data in association with the text information having the first text. Note that the memory unit 201 does not need to store the second text in association, and the driving simulation data and the first text may be stored in association.

[0034] Next, the real data acquisition unit 202 acquires a second feature (step S302). The real data acquisition unit 202 acquires a second feature from sensor data and real data related to vehicle operation.

[0035] Next, the identification unit 203 searches for simulation data from the storage unit 201 (step S303). The identification unit 203 identifies simulation data related to the actual data from the storage unit 201 based on the second feature and the first feature. At this time, it also acquires the first text associated with the simulation data. Furthermore, the identification unit 203 extracts and acquires traffic rules related to the actual data from the rule memory based on the acquired first text.

[0036] Finally, the analysis unit 204 analyzes the actual data (step S304) and terminates the process. The analysis unit 204 analyzes the actual data based on the actual data and acquired information such as traffic rules. The analysis unit 204 outputs the results regarding the evaluation of driving indicated by the analyzed actual data.

[0037] The above configuration provides an information processing method that enables the evaluation of actual driving data by searching for and referencing simulation data related to actual driving data.

[0038] The above method is carried out by executing and processing a program in an information processing device. As shown in Figure 9, the information processing device 900 includes a processor 901 that executes and processes a program, and a memory 902 that stores the program. The information processing device may consist of one device or multiple devices. The information processing device may also be a cloud server in which some or all of its functions are processed in a distributed manner.

[0039] (Description of the Information Processing System According to Embodiment 1) Figure 4 is a diagram showing an overview of the information processing system according to the present disclosure. Figure 5 is a second block diagram showing the configuration of the information processing system according to the present disclosure. Figure 6 is a diagram illustrating an encoder of the information processing system according to the present disclosure that eliminates the gap between simulated driving video and actual driving video. Figure 7A is a first diagram illustrating the video memory of the information processing system according to the present disclosure. Figure 7B is a second diagram illustrating the video memory of the information processing system according to the present disclosure. The information processing system according to Embodiment 1 will be described with reference to Figures 4 to 7B.

[0040] As shown in Figure 4, the information processing system 400 according to Embodiment 1 includes, as a memory structure, a simulation video encoder 401, a text encoder 402, an MLP (Multi Layer Perceptron) projection layer 403, a simulation video memory 405, and a rule memory 406.

[0041] Furthermore, the information processing system 400 according to Embodiment 1 includes, as a configuration for generating explanations and justifications, a video encoder 408, a text encoder 409, an MLP projection layer 410, an attention mechanism 416, a text encoder 417, and an LLM 418. Generating explanations and justifications means generating explanations and justifications for the evaluation of the operation.

[0042] As shown in Figure 5, the information processing system according to Embodiment 1 comprises, in terms of function, a drive simulator unit 501, an encoder 502, a simulation video memory 405, a rule memory 406, an encoder 503, a search and integration unit 504, and an LLM 418.

[0043] The drive simulator unit 501 is a simulator that acquires driving data such as sensor data including video, speed, and acceleration, and control data such as steering angle and braking. The drive simulator unit 501 generates text information related to driving. The text information has two texts, a simple text and a detailed text. Here, the simple text corresponds to the second text. Also, the detailed text corresponds to the first text. The generation of text information related to driving may be performed by a person.

[0044] The encoder 502 extracts a first feature amount from the simulation data. The encoder 502 includes both a simulation video encoder 401 and a text encoder 402.

[0045] The simulation video memory 405 is a video memory that stores a first feature amount, which is information related to the simulation data of driving. The simulation video memory 405 associates and stores the feature amount of the simulation data related to driving and the text information related to the simulation data. The text information may include a simple text and a detailed text, or may not include a simple text.

[0046] The rule memory 406 is a database that stores traffic rules related to driving.

[0047] The encoder 503 extracts a second feature amount from actual data, which is actual driving data. The actual data includes at least one of sensor data such as video, images, speed data, acceleration data, and control data. Note that by using a plurality of sensor data, the generation accuracy of the second feature amount is improved. The encoder 503 includes a video encoder 408 and a text encoder 409. The encoder 503 may use the same one as the encoder 502.

[0048] The search and integration unit 504 searches the simulation data in the simulation video memory 405 using the second feature amount obtained from the actual data as a query. The simulation video memory 405 transmits the detailed text associated with the simulation data similar to the actual data to the search and integration unit 504. Here, simple text may also be transmitted together. Further, the search and integration unit 504 searches the rule memory 406 based on the detailed text which is the first text retrieved. The rule memory 406 transmits the traffic rules related to the identified detailed text to the search and integration unit 504. The search and integration unit 504 here serves as the text encoder 417.

[0049] Further, the search and integration unit 504 inputs data to the LLM 418 using the attention mechanism 416. The data consists of the actual data, the feature amounts, the detailed text, and the traffic rules. The attention mechanism 416 is a technique that mimics cognitive attention in an artificial neural network. It lets the model focus on the important parts of the input even when the amount of data is small. Here, the attention mechanism 416 identifies the traffic rules to be noted. The search and integration unit 504 inputs data to the LLM 418 after identifying the traffic rules to be noted through the attention mechanism 416.
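As an illustration of how an attention mechanism can single out the rules to be noted, the sketch below computes standard scaled dot-product attention weights over rule embeddings. This is an assumption: the disclosure does not state which attention formulation the attention mechanism 416 uses, and `attention_weights` and its arguments are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys.

    query: feature vector of the actual data (assumed role).
    keys:  one embedding per retrieved traffic rule (assumed role).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)
```

The rule with the largest weight is the one the downstream model would attend to most strongly.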

[0050] The LLM 418 outputs a driving evaluation such as whether the driver's driving is a safe driving or not from the input actual data, feature amounts, detailed text, and traffic rules. The LLM 418 performs an analysis by comparing the identified traffic rules with the actual data and outputs the result of the analysis.

[0051] The learning of the encoder for searching simulation data from actual data will be described using Figure 6. As shown in Figure 6, the simulation video is input to the simulation video encoder 401. Also, a control signal is input to the text encoder 402. The data input to the encoders passes through the machine-learned MLP projection layer 403, and the feature amount 404 is output.

[0052] Similarly, video data of the actual data is input to the video encoder 408. Control signals are also input to the text encoder 409. The data input to the encoders passes through the machine-learned MLP projection layer 410, and feature quantities 411 are output. By comparing feature quantities 404 and 411, the simulation video encoder 401 and the video encoder 408 are trained.

[0053] The simulation video encoder 401 and video encoder 408 utilize existing encoders. They calculate the loss from two feature quantities 404 and 411 output by the MLP projection layers 403 and 410, and from the simple text. The MLP projection layers 403 and 410 learn to minimize the loss of similar annotations. In other words, the simulation video encoder 401 and video encoder 408 learn to satisfy predetermined conditions for the values calculated based on the second and third texts. In this way, the differences between the simulation data and the real data can be absorbed. That is, the differences between the simulation data and the real data can be reduced, and the gap can be minimized.

[0054] By inputting multimodal data such as text-based sensor data and control signals into the MLP projection layers 403 and 410, it is possible to improve the accuracy of feature generation.
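A minimal sketch of such a projection is given below. It assumes that the outputs of the video encoder and the text encoder are concatenated and passed through a two-layer perceptron with a ReLU activation; the fusion method, layer count, and activation are not stated in the disclosure, and all function and parameter names are hypothetical.

```python
def linear(x, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def mlp_project(video_feat, control_feat, w1, b1, w2, b2):
    """Fuse two modality features and project them to one feature vector.

    Concatenation is an assumed fusion method; w1/b1 and w2/b2 are the
    trainable parameters of the two layers.
    """
    fused = list(video_feat) + list(control_feat)
    hidden = relu(linear(fused, w1, b1))
    return linear(hidden, w2, b2)
```

Because both modalities enter the same projection, the control signals can influence the final feature even when the video frames of two scenes look alike.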

[0055] The construction of the video memory will be explained using Figures 7A and 7B. As shown in Figure 7A, the simulation video memory 405 is constructed by combining three items: the feature quantities of the simulation data output from encoder 502, and simple text and detailed text generated from the simulator or manually generated. The simple text is mainly used for training encoders 502 and 503, so it is optional to store it in the simulation video memory 405.

[0056] As shown in Figure 7B, when actual driving data consisting of video and sensor data is given as input, the feature quantities encoded from the actual driving data, including the driving text, are used as a query to retrieve the related simulation driving data's feature quantities, simple text, and detailed text from the simulation video memory 405, which is the database.

[0057] The detailed text further refers to the rule memory 406, which is a database where traffic rules are stored as text, and retrieves the traffic rules related to the detailed text.

[0058] Specifically, actual driving data is input to the video encoder 408, sensor data is input to the text encoder 409, and feature quantities 411 are output via the MLP projection layer 410. A query is issued with feature quantities 411, which include driving text, to retrieve feature quantities 412, simple text 413, and detailed text 414 from the simulation video memory 405. The retrieved detailed text 414 is input to the rule memory 406 to retrieve the relevant traffic rules 415. Feature quantities 411, 412, simple text 413, detailed text 414, and traffic rules 415 are input to the LLM 418 via the text encoder 417 and the machine learning-trained attention mechanism 416. The LLM 418 then produces the desired output.

[0059] (Description of the Information Processing Method According to Embodiment 1) Figure 8A is a flowchart of the learning method for the encoder of the information processing system according to the present disclosure. Figure 8B is a flowchart of the video memory construction method of the information processing system according to the present disclosure. Figure 8C is a flowchart of the search and analysis method of the information processing system according to the present disclosure. The information processing method according to Embodiment 1 will be described with reference to Figures 8A to 8C.

[0060] As shown in Figure 8A, the encoder learning method first acquires one data point each from the simulation data and the real data (step S601). Next, the similarity is calculated from the simple texts of the two data points, that is, the second text of the simulation data and the third text of the real data (step S602). Then, features are calculated from each data point (step S603).

[0061] Next, the loss is calculated from the similarity and features (step S604). Finally, the parameters are updated so that the calculated loss value satisfies predetermined conditions. For example, the parameters are updated in a direction that reduces the calculated loss value (step S605), and the process is terminated.
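Steps S601 to S605 can be sketched as follows under several loudly stated assumptions: text similarity is a Jaccard overlap of word sets, the trainable part is a single linear projection shared by both inputs, the loss is the squared difference between feature cosine similarity and text similarity, and the gradient is estimated by finite differences. None of these choices comes from the disclosure, which only says that a loss is computed from the similarity and the features and that parameters are updated to reduce it.

```python
import math


def text_similarity(text_a, text_b):
    """Jaccard overlap of word sets (assumed stand-in for step S602)."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def cosine(u, v):
    """Cosine similarity, 0.0 if either vector has zero norm."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def project(weights, x):
    """Linear projection standing in for the trainable layers (S603)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def loss_fn(weights, sim_x, real_x, target):
    """Squared gap between feature similarity and text similarity (S604)."""
    feat_sim = cosine(project(weights, sim_x), project(weights, real_x))
    return (feat_sim - target) ** 2


def train_step(weights, sim_x, real_x, target, lr=0.1, eps=1e-5):
    """One finite-difference gradient-descent update (S605).

    Returns the updated weights and the loss before the update.
    """
    base = loss_fn(weights, sim_x, real_x, target)
    updated = [row[:] for row in weights]
    for i, row in enumerate(weights):
        for j in range(len(row)):
            perturbed = [r[:] for r in weights]
            perturbed[i][j] += eps
            grad = (loss_fn(perturbed, sim_x, real_x, target) - base) / eps
            updated[i][j] = weights[i][j] - lr * grad
    return updated, base
```

With this loss, a simulation clip and a real clip that carry the same simple text are pulled toward the same point in feature space, which is the sense in which the gap between the two domains is reduced. A practical implementation would use automatic differentiation rather than finite differences.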

[0062] As shown in Figure 8B, the video memory construction method first acquires simulation data and the first and second texts (step S611). Next, the encoded result of the simulation data is saved to the video memory as the key and the first and second texts as the value (step S612), and the process ends. The key is also called a feature.
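The construction in steps S611 and S612 amounts to building a key-value store, which might look like the sketch below. The record layout, the `encode` callable, and the `keep_second_text` flag are hypothetical; the flag reflects the statement elsewhere in the description that storing the simple text is optional.

```python
def build_video_memory(simulation_items, encode, keep_second_text=True):
    """Build a list of (key, value) pairs from simulation items.

    Each item is a dict with "data", "first_text" and "second_text"
    (assumed layout). `encode` maps simulation data to a feature vector.
    """
    memory = []
    for item in simulation_items:
        key = encode(item["data"])  # encoded result is the key (feature)
        value = {"first_text": item["first_text"]}
        if keep_second_text:
            value["second_text"] = item["second_text"]
        memory.append((key, value))
    return memory
```

A flat list suffices for a sketch; a large memory would more plausibly sit behind an approximate nearest-neighbor index.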

[0063] As shown in Figure 8C, the search and analysis method encodes the actual data and creates a key (step S621). Next, it retrieves from the database the stored keys that are similar to the created key, together with their values (step S622). The key is a feature, and the value is the first and second text.

[0064] Using the first text (detailed text) associated with the acquired simulation data, the relevant rules are retrieved from the rule memory (step S623). Finally, the actual data is analyzed and evaluated using the actual data, the data acquired from the video memory, and the rules acquired from the rule memory (step S624), and the process is terminated.
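The flow of steps S621 to S624 can be sketched end to end as below. The sketch looks up rules from the detailed (first) text, following paragraphs [0048] and [0057]. It assumes dot-product similarity for the key search and a keyword-indexed rule memory; both are illustrative choices, and `search_and_analyze`, `encode`, and `analyze` are hypothetical names.

```python
def dot(a, b):
    """Dot product used here as the key similarity (assumption)."""
    return sum(x * y for x, y in zip(a, b))


def search_and_analyze(real_data, encode, video_memory, rule_memory, analyze):
    """Run the search and analysis flow on one piece of actual data.

    video_memory: list of (key, value) pairs, value has "first_text".
    rule_memory:  dict mapping a keyword to a rule text (assumed layout).
    analyze:      callable standing in for the LLM-based analysis.
    """
    query = encode(real_data)                                    # S621
    _key, value = max(video_memory,
                      key=lambda kv: dot(kv[0], query))          # S622
    detailed = value["first_text"].lower()
    rules = [rule for keyword, rule in rule_memory.items()
             if keyword in detailed]                             # S623
    return analyze(real_data, value, rules)                      # S624
```

Substring matching on keywords is a deliberately crude lookup; encoding the detailed text and searching rule embeddings would be a closer match to the text encoder 417 described earlier.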

[0065] Some or all of the processing in the information processing systems 200, 400, and 900 described above can be implemented as computer programs. Such programs can be stored and supplied to a computer using various types of non-transitory computer-readable media. Non-transitory computer-readable media include various types of tangible recording media. Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (e.g., magneto-optical disks), CD-ROMs (Read Only Memory), CD-Rs, CD-R/Ws, and semiconductor memory (e.g., mask ROMs, PROMs (Programmable ROMs), EPROMs (Erasable PROMs), flash ROMs, and RAMs (Random Access Memory)). Programs may also be supplied to a computer using various types of transitory computer-readable media. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves. Transitory computer-readable media can supply programs to a computer via wired communication channels such as electric wires and optical fibers, or via wireless communication channels.

[0066] Although the present disclosure has been described above with reference to embodiments, the present disclosure is not limited to the embodiments described above. Various modifications to the structure and details of the present disclosure can be made as can be understood by those skilled in the art within the scope of the present disclosure. Furthermore, each embodiment can be combined with other embodiments as appropriate.

[0067] Each drawing is merely an example that illustrates one or more embodiments. Each drawing may be associated with one or more other embodiments, rather than being associated with only one specific embodiment. As those skilled in the art will understand, various features or steps described with reference to any one drawing can be combined with features or steps shown in one or more other drawings, for example, to create embodiments not explicitly shown or described. Not all features or steps shown in any one drawing to illustrate an exemplary embodiment are necessarily required, and some features or steps may be omitted. The order of steps described in any of the drawings may be changed as appropriate.

[0068] Some or all of the above embodiments may also be described as follows, but are not limited to the following:
(Note 1) An information processing system comprising: a storage means for storing in a database a first feature quantity obtained by inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder, and a first text generated based on the simulation data; a real data acquisition means for obtaining a second feature quantity by inputting real data, which is information relating to the actual driving of a vehicle, into the encoder; an identification means for identifying the first feature quantity of the simulation data relating to the real data from the database based on the second feature quantity; and an analysis means for analyzing the real data based on the first text associated with the identified first feature quantity, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the real data.
(Note 2) The information processing system according to Note 1, wherein the encoder resolves the difference between the simulation data and the real data by being trained so that the values calculated based on the second text and the third text satisfy predetermined conditions.
(Note 3) The information processing system according to Note 1, wherein the simulation data and the actual data each include at least one type of sensor data from among video, images, vehicle speed, vehicle acceleration, vehicle control signals, and vehicle position information.
(Note 4) The information processing system according to any one of Notes 1 to 3, wherein the storage means stores the first feature quantity relating to the simulation data, the first text, and the second text in the database in association with each other.
(Note 5) The information processing system according to any one of Notes 1 to 3, wherein the analysis means identifies the traffic rules relating to the first text from a rule memory storing traffic rules relating to driving, performs the analysis by comparing the identified traffic rules with the actual data, and outputs the result of the analysis.
(Note 6) The information processing system according to any one of Notes 1 to 3, wherein the first text and the second text are sentences that describe the content indicated by the simulation data, the first text is a sentence that describes the content indicated by the simulation data in more detail than the second text, and the third text is a sentence that describes the content indicated by the actual data and includes information equivalent to that of the second text.
(Note 7) The information processing system according to Note 6, wherein the first text includes at least one of the following items of information indicated by the simulation data: the signal status, the lane color, the lane line type, and traffic signs.
(Note 8) The information processing system according to Note 7, wherein the second text includes information regarding the movement of the vehicle indicated by the simulation data, and the third text includes information regarding the movement of the vehicle indicated by the actual data.
(Note 9) An information processing method comprising: inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder to obtain a first feature quantity, and storing the first feature quantity in a database in association with a first text generated based on the simulation data; inputting actual data, which is information relating to the actual driving of a vehicle, into the encoder to obtain a second feature quantity; identifying the first feature quantity of the simulation data relating to the actual data from the database based on the second feature quantity; and analyzing the actual data based on the first text associated with the identified first feature quantity, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the actual data.
(Note 10) A program that causes an information processing device to: store in a database a first feature obtained by inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder, and a first text generated based on the simulation data; obtain a second feature by inputting real data, which is information relating to the actual driving of a vehicle, into the encoder; identify the first feature of the simulation data relating to the real data from the database based on the second feature; and analyze the real data based on the first text associated with the identified first feature, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the real data.

[0069] Some or all of the elements (e.g., configuration and function) described in Notes 2 to 8, which depend on Note 1 (e.g., the system), may also depend on Note 9 (e.g., the method) and Note 10 (e.g., the program) in the same way as in Notes 2 to 8. Some or all of the elements described in any note may be applied to various types of hardware, software, recording means for recording software, systems, and methods.

[0070] Although the present invention has been described above with reference to embodiments, the present invention is not limited thereto. Various modifications to the structure and details of the present invention can be made that are understandable to those skilled in the art within the scope of the invention.

[0071] This application claims priority based on Japanese Patent Application No. 2024-231348, filed on 26 December 2024, and incorporates all of its disclosures herein.

[0072] 100 Information processing system, 101 Simulation data, 102 Video memory, 103 Driving data, 104 Search module, 105 LLM, 200 Information processing system, 201 Memory unit, 202 Actual data acquisition unit, 203 Identification unit, 204 Analysis unit, 400 Information processing system, 401 Simulation video encoder, 402 Text encoder, 403 MLP projection layer, 404 Feature quantity, 405 Simulation video memory, 406 Rule memory, 408 Video encoder, 409 Text encoder, 410 MLP projection layer, 411 Feature quantity, 412 Feature quantity, 413 Simple text, 414 Detailed text, 415 Traffic rules, 416 Attention mechanism, 417 Text encoder, 418 LLM, 501 Drive simulator unit, 502 Encoder, 503 Encoder, 504 Search and integration unit, 900 Information processing device, 901 Processor, 902 Memory

Claims

1. An information processing system comprising: a storage means for storing in a database a first feature obtained by inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder, and a first text generated based on the simulation data; a real data acquisition means for obtaining a second feature by inputting real data, which is information relating to the actual driving of a vehicle, into the encoder; an identification means for identifying the first feature of the simulation data relating to the real data from the database based on the second feature; and an analysis means for analyzing the real data based on the first text associated with the identified first feature, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the real data.

2. The information processing system according to claim 1, wherein the encoder eliminates the difference between the simulation data and the actual data by being trained so that the values calculated based on the second text and the third text satisfy predetermined conditions.

3. The information processing system according to claim 1, wherein the simulation data and the actual data each include at least one type of sensor data from among video, images, vehicle speed, vehicle acceleration, vehicle control signals, and vehicle position information.

4. The information processing system according to any one of claims 1 to 3, wherein the storage means stores the first feature quantity relating to the simulation data, the first text, and the second text in the database in association with each other.

5. The information processing system according to any one of claims 1 to 3, wherein the analysis means identifies the traffic rules related to the first text from a rule memory storing traffic rules relating to driving, performs the analysis by comparing the identified traffic rules with the actual data, and outputs the results of the analysis.

6. The information processing system according to any one of claims 1 to 3, wherein the first text and the second text are sentences that describe the content indicated by the simulation data, the first text is a sentence that describes the content indicated by the simulation data in more detail than the second text, and the third text is a sentence that describes the content indicated by the actual data and includes information equivalent to that of the second text.

7. The information processing system according to claim 6, wherein the first text includes at least one of the following: the signal status indicated by the simulation data, the lane color, the lane line type, and information regarding traffic signs.

8. The information processing system according to claim 7, wherein the second text includes information relating to the movement of the vehicle as shown by the simulation data, and the third text includes information relating to the movement of the vehicle as shown by the actual data.

9. An information processing method comprising: inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder to obtain a first feature quantity, and storing the first feature quantity in a database in association with a first text generated based on the simulation data; inputting actual data, which is information relating to the actual driving of a vehicle, into the encoder to obtain a second feature quantity; identifying the first feature quantity of the simulation data relating to the actual data from the database based on the second feature quantity; and analyzing the actual data based on the first text associated with the identified first feature quantity, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the actual data.

10. The information processing method according to claim 9, wherein the encoder eliminates the difference between the simulation data and the actual data by being trained so that the values calculated based on the second text and the third text satisfy predetermined conditions.

11. The information processing method according to claim 9, wherein the simulation data and the actual data each include at least one type of sensor data from among video, images, vehicle speed, vehicle acceleration, vehicle control signals, and vehicle position information.

12. The information processing method according to any one of claims 9 to 11, wherein the storage involves associating the first feature quantity relating to the simulation data, the first text, and the second text and storing them in the database.

13. The information processing method according to any one of claims 9 to 11, wherein the analysis involves identifying the traffic rules related to the first text from a rule memory storing traffic rules relating to driving, performing the analysis by comparing the identified traffic rules with the actual data, and outputting the results of the analysis.

14. The information processing method according to any one of claims 9 to 11, wherein the first text and the second text are sentences that describe the content indicated by the simulation data, the first text is a sentence that describes the content indicated by the simulation data in more detail than the second text, and the third text is a sentence that describes the content indicated by the actual data and includes information equivalent to that of the second text.

15. The information processing method according to claim 14, wherein the first text includes at least one of the following: the signal status indicated by the simulation data, the lane color, the lane line type, and information regarding traffic signs.

16. The information processing method according to claim 15, wherein the second text includes information relating to the movement of the vehicle as shown by the simulation data, and the third text includes information relating to the movement of the vehicle as shown by the actual data.

17. A program that causes an information processing device to: store in a database a first feature obtained by inputting simulation data, which is information relating to the driving simulation of a vehicle, into an encoder, and a first text generated based on the simulation data; obtain a second feature by inputting real data, which is information relating to the actual driving of a vehicle, into the encoder; identify the first feature of the simulation data relating to the real data from the database based on the second feature; and analyze the real data based on the first text associated with the identified first feature, wherein the encoder is an encoder that has been trained using a second text generated based on the simulation data and a third text generated based on the real data.

18. The program according to claim 17, wherein the encoder resolves the difference between the simulation data and the actual data by being trained so that the values calculated based on the second text and the third text satisfy predetermined conditions.

19. The program according to claim 17, wherein the simulation data and the actual data each include at least one type of sensor data from among video, images, vehicle speed, vehicle acceleration, vehicle control signals, and vehicle position information.

20. The program according to any one of claims 17 to 19, wherein the storage involves associating the first feature quantity relating to the simulation data, the first text, and the second text and storing them in the database.