Determination program, determination method, and information processing device

The determination program addresses the issue of reduced reliability in machine learning models by evaluating output similarity to source information, enhancing accuracy and reliability through context preservation and alert mechanisms.

WO2026126320A1 · PCT designated stage · Publication Date: 2026-06-18 · FUJITSU LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: FUJITSU LTD
Filing Date: 2024-12-10
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Machine learning models, particularly large language models, often output incorrect information due to forgetting the source information on which the output is based, leading to reduced reliability and inaccuracies in responses.

Method used

A determination program evaluates the reliability of a machine learning model's response by comparing the original source information with the generated output using similarity metrics, determining the memory state of the model, and adjusting or flagging responses based on the similarity results.

Benefits of technology

The program effectively assesses the reliability of machine learning model outputs, ensuring accurate and contextually correct information is provided by identifying and mitigating the effects of information loss, thereby improving the model's performance and user trust.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure JP2024043540_18062026_PF_FP_ABST

Abstract

The present invention evaluates the reliability of the response of a machine learning model to a prompt. An information processing device (10) inputs, to a machine learning model (13), a prompt (14) including first instructions that dictate the output of information related to first information and second instructions that dictate the output of the first information, acquires second information which was outputted from the machine learning model (13) in response to the second instructions, calculates the similarity between the first information and the second information, and determines a storage condition of the first information in the machine learning model (13) on the basis of the similarity.

Description

Determination Program, Determination Method, and Information Processing Apparatus 【0001】 The present invention relates to a determination program, a determination method, and an information processing apparatus. 【0002】 A machine learning model may receive a prompt and output information corresponding to the prompt. For example, a natural language processing model such as a large language model (LLM) may receive a prompt text described in natural language and output a response text describing the information required by the prompt text. A machine learning model that responds to a prompt may be implemented using a neural network. 【0003】 There is a technology for extracting user input items from source code, searching a database for explanatory texts corresponding to the user input items, and generating an input operation manual. There is also a technology for searching for reusable source code from a query described in natural language using a natural language processing model. There is also a technology for generating an abstract syntax tree from source code, analyzing the abstract syntax tree, and generating an explanatory text showing an explanation about the source code. 【0004】 Japanese Patent Laid-Open No. 7-152548, Japanese Patent Laid-Open No. 2023-47336, Japanese Patent Laid-Open No. 2024-120169 【0005】 One of the problems regarding machine learning models is hallucination, in which a machine learning model may output incorrect information. A machine learning model that responds to a prompt may output information that does not conform to the instructions of the prompt. Therefore, the response of the machine learning model to the prompt is not always reliable. 【0006】One reason for the reduced reliability of a prompt response is that the machine learning model may forget the source information on which the output information is based when it is generated. 
When the size of the prompt or output information is large, some machine learning models may dilute or lose the context of the source information specified in the prompt midway through the process. As a result, the machine learning model may output inaccurate information based on an incomplete memory of the source information. Therefore, in one aspect, the present invention aims to evaluate the reliability of a machine learning model's response to a prompt. 【0007】 In one aspect, a determination program is provided that causes a computer to execute a process of: inputting, to a machine learning model, a prompt including a first instruction that instructs the model to output information related to first information and a second instruction that instructs the model to output the first information; acquiring second information output from the machine learning model in response to the second instruction; calculating the similarity between the first information and the second information; and determining the memory state of the first information in the machine learning model based on the similarity. 【0008】 In one aspect, the reliability of the machine learning model's response to a prompt can be evaluated. The above and other objects, features and advantages of the present invention will become apparent from the following description in conjunction with the accompanying drawings illustrating preferred embodiments as examples of the invention. 【0009】 This is a diagram illustrating the information processing device of the first embodiment. This is a diagram showing an example of the hardware of the information processing device of the second embodiment. This is a diagram showing an example of the structure of a machine learning model. This is a diagram showing a first example of output text describing the source code. This is a diagram showing a second example of output text describing the source code. 
This is a diagram showing an example of prompt text input to a large-scale language model. This is a diagram showing an example of source code division. This is a diagram showing an example of a design information file. This is a diagram showing an example of source code. This is a diagram showing an example of the similarity judgment result. This is a diagram showing an example of a descriptive text to which a "requires confirmation" flag is added. This is a block diagram showing an example of the functions of the information processing device of the second embodiment. This is a flowchart showing an example of the procedure for extracting design information. 【0010】 Hereinafter, this embodiment will be described with reference to the drawings. (a) First Embodiment Figure 1 is a diagram illustrating the information processing device of the first embodiment. The information processing device 10 of the first embodiment inputs a prompt 14 to the machine learning model 13 and obtains information from the machine learning model 13 as a response to the prompt 14. At this time, the information processing device 10 evaluates the reliability of the machine learning model 13's response to the prompt 14. The information processing device 10 may be a client device or a server device. The information processing device 10 may also be called a computer or a decision device. 【0011】 The information processing device 10 includes a storage unit 11 and a processing unit 12. The storage unit 11 may be a volatile memory such as RAM (Random Access Memory). Alternatively, the storage unit 11 may be a non-volatile storage such as an HDD (Hard Disk Drive) or SSD (Solid State Drive). 【0012】 The processing unit 12 is a processor, such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or a DSP (Digital Signal Processor). 
However, the processing unit 12 may also include electronic circuits such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The processor executes a program stored in memory, such as RAM. The processor is sometimes called a processor circuit. A collection of processors is sometimes called a multiprocessor or simply a "processor." Different processing steps among multiple processing steps may be executed by different processors. 【0013】The machine learning model 13 is pre-trained using machine learning with training data and has trained parameter values. The machine learning model 13 may be a neural network or a machine learning model with other internal structures. Furthermore, the machine learning model 13 may be a generative AI (Artificial Intelligence) or a machine learning model other than a generative AI. In addition, the machine learning model 13 may be a natural language processing model such as a large-scale language model, or another type of machine learning model such as an image processing model. 【0014】 A natural language processing model typically receives prompt text written in natural language and outputs response text written in natural language in response to the instructions in the prompt text. The information processing device 10 may store the machine learning model 13. The machine learning model 13 may be stored in the storage unit 11. The information processing device 10 may also send a prompt 14 to another information processing device that stores the machine learning model 13 and receive the output of the machine learning model 13 from the other information processing device. 【0015】 Here, the machine learning model 13 may output information that does not conform to the instructions of prompt 14. Therefore, the machine learning model 13's response to prompt 14 is not always reliable. 
One reason for the reduced reliability of the response is that the machine learning model 13 may have forgotten the original information on which the output information is based at the time the output information is generated. Therefore, the information processing device 10 checks the memory state of the original information at the time the output information is generated and evaluates the reliability of the response from the perspective of forgetting. 【0016】 The storage unit 11 stores information 15 as first information. Information 15 is source information related to the information that the machine learning model 13 is asked to output. For example, information 15 is target information whose meaning the machine learning model 13 is asked to explain. Information 15 may be text data such as source code or the whole or part of a document, or it may be other types of data such as image data, or it may be multiple types of data. 【0017】 The processing unit 12 generates a prompt 14 and inputs the prompt 14 to the machine learning model 13. The prompt 14 includes instruction 14a as a first instruction and instruction 14b as a second instruction. Instruction 14a instructs the output of information related to information 15. For example, instruction 14a instructs an explanation of the meaning of information 15. Instruction 14b instructs the output of information 15. The prompt 14 may be a prompt text written in natural language, and instructions 14a and 14b may be instruction sentences written in natural language. The prompt 14 may include information 15. The information related to information 15 may be instructions, commands, requests, etc., regarding information 15. In addition to an explanation of the meaning of information 15, the information related to information 15 may be any information that the machine learning model can output, such as a summary, modification, or transformation of information 15, or new ideas regarding information 15. 
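The prompt described in paragraph 0017 can be illustrated with a minimal Python sketch. The instruction wording, the `<<<BEGIN>>>`/`<<<END>>>` markers, and the function name `build_prompt` are assumptions made for illustration only; the description does not prescribe any particular wording or layout.

```python
# Hypothetical instruction sentences; the patent text does not fix their wording.
INSTRUCTION_A = "Explain the meaning of the text between the markers."        # first instruction (14a)
INSTRUCTION_B = "Then output the text between the markers exactly as given."  # second instruction (14b)


def build_prompt(first_information: str) -> str:
    """Build one prompt that carries both instructions and the source information."""
    return (
        f"{INSTRUCTION_A}\n"
        f"{INSTRUCTION_B}\n"
        "<<<BEGIN>>>\n"
        f"{first_information}\n"
        "<<<END>>>"
    )


# Illustrative source information (not taken from the patent figures).
prompt = build_prompt("TOTAL = PRICE * QTY;")
print(prompt)
```

Placing both instructions in the same prompt matches the typical case described in the text; a model that keeps context across turns could instead receive them in separate prompts.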
【0018】 The processing unit 12 obtains information 16 corresponding to instruction 14b as second information and information 17 corresponding to instruction 14a as third information from the machine learning model 13. Ideally, information 16 is information 15 itself. However, due to the nature of the machine learning model 13, information 16 may not be identical to information 15. Information 17 is information related to information 15, for example, an explanatory text describing the meaning of information 15 in natural language. 【0019】 Typically, instructions 14a and 14b are included in the same prompt, and information 16 and 17 are included in a single output. However, the machine learning model 13 may be able to maintain context across multiple prompts. In that case, instructions 14a and 14b may be included in different prompts, and information 16 and 17 may be separated into different outputs. 【0020】 The processing unit 12 calculates the similarity between information 15 and information 16. The processing unit 12 can use various similarity metrics. The processing unit 12 may also use a "distance," where a smaller value indicates higher similarity. Examples of such metrics include Levenshtein distance, cosine distance, Euclidean distance, Jaccard distance, Dice coefficient, and WMD (Word Mover's Distance). Alternatively, the processing unit 12 may calculate the similarity based on the inclusion relationship between information 15 and information 16, such as the inclusion relationship between strings. 【0021】 The processing unit 12 determines the memory state of the information 15 in the machine learning model 13 based on the similarity. The memory state may be a flag indicating whether or not the machine learning model 13 has forgotten the information 15. For example, the processing unit 12 determines that the machine learning model 13 has forgotten the information 15 if the similarity is below a certain level, such as when the distance exceeds a threshold. 
The processing unit 12 may also determine the reliability of the information 17 according to the memory state. For example, if the machine learning model 13 has forgotten the information 15, the processing unit 12 determines that the reliability of the information 17 is low. 【0022】 The processing unit 12 may output the determined memory state or reliability. For example, the processing unit 12 may store the determined memory state or reliability in non-volatile storage, display it on a display device, or transmit it to another information processing device. 【0023】 Furthermore, if the processing unit 12 determines that the machine learning model 13 has forgotten the information 15, it may instruct the machine learning model 13 to output information related to the information 15 again, or it may input the prompt 14 to the machine learning model 13 again. If the machine learning model 13 is using random numbers internally, the machine learning model 13 may output information different from the information 17 as related information. The processing unit 12 may also instruct the machine learning model 13 to output information different from the information 17. 【0024】 Furthermore, if the processing unit 12 determines that the machine learning model 13 has forgotten the information 15, it may add warning information to the information 17 indicating a reliability warning. The processing unit 12 may output the warning information itself or the information 17 with the warning information attached. For example, the processing unit 12 may store the warning information itself or the information 17 with the warning information attached in non-volatile storage, display it on a display device, or transmit it to another information processing device. 
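The determination in paragraphs 0020 to 0024 can be sketched in Python as follows. This is only one possible reading: it picks Levenshtein distance from the listed metrics, normalizes it by the longer string, and uses a threshold of 0.2. The normalization, the threshold value, the function names, and the "[requires confirmation]" prefix format are all assumptions for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def judge_memory_state(first_info: str, second_info: str, max_ratio: float = 0.2) -> dict:
    """Flag the first information as forgotten when the normalized distance exceeds max_ratio."""
    distance = levenshtein(first_info, second_info)
    ratio = distance / max(len(first_info), len(second_info), 1)
    return {"distance": distance, "forgotten": ratio > max_ratio}


def attach_warning(third_info: str, forgotten: bool) -> str:
    """Prefix the related information with a warning when reliability is judged low."""
    return f"[requires confirmation] {third_info}" if forgotten else third_info


state = judge_memory_state("A = B + C;", "X = Y;")
print(attach_warning("Adds B and C and stores the result in A.", state["forgotten"]))
```

A distance is used here, so a larger value means lower similarity. Swapping in another metric from the list only requires replacing `levenshtein` and choosing a threshold that suits its scale.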
【0025】 As described above, the information processing device 10 of the first embodiment inputs a prompt 14 to the machine learning model 13, which includes an instruction 14a that instructs the machine learning model 13 to output information related to information 15, and an instruction 14b that instructs the machine learning model 13 to output information 15. The information processing device 10 acquires the information 16 output from the machine learning model 13 in response to instruction 14b. The information processing device 10 calculates the similarity between information 15 and information 16. Based on the similarity, the information processing device 10 determines the memory state of information 15 in the machine learning model 13. 【0026】 As a result, the information processing device 10 can use the machine learning model 13 to obtain useful information related to the information 15. For example, the information processing device 10 can obtain an explanatory text that explains the meaning of the information 15 and provide useful information to the user. In addition, the information processing device 10 can check whether the machine learning model 13 has forgotten the information 15 and estimate the decrease in reliability caused by forgetting. Therefore, the information processing device 10 can take appropriate measures against the decrease in reliability of the machine learning model 13's response, such as alerting the user if the reliability is low. 【0027】 (b) Second Embodiment Figure 2 shows an example of the hardware of the information processing device of the second embodiment. The information processing device 100 of the second embodiment extracts design information from the source code of an information processing system using a large-scale language model. 
However, the method of using the machine learning model in the second embodiment can be applied to machine learning models other than the large-scale language model, and can also be applied to purposes other than extracting design information from source code. The information processing device 100 corresponds to the information processing device 10 of the first embodiment. 【0028】 The information processing device 100 includes a CPU 101, RAM 102, HDD 103, GPU 104, input interface 105, media reader 106, and communication interface 107. The CPU 101 corresponds to the processing unit 12 of the first embodiment. The RAM 102 or HDD 103 corresponds to the storage unit 11 of the first embodiment. 【0029】 The CPU 101 is a processor that executes program instructions. The CPU 101 loads the program and data from the HDD 103 into the RAM 102 and executes the program. The information processing device 100 may have multiple processors. 【0030】 RAM 102 is a volatile semiconductor memory that temporarily stores programs executed by the CPU 101 and data used for calculations by the CPU 101. The information processing device 100 may have a type of volatile memory other than RAM. 【0031】 The HDD 103 is a non-volatile storage device that stores software programs such as operating systems, middleware, and application software, as well as data. The information processing device 100 may also have other types of non-volatile storage, such as an SSD or flash memory. 【0032】The GPU 104 works in conjunction with the CPU 101 to perform image processing and outputs the image to the display device 111 connected to the information processing device 100. The display device 111 is, for example, a CRT (Cathode Ray Tube) display, a liquid crystal display, an organic EL (Electro Luminescence) display, or a projector. 【0033】 Furthermore, the GPU 104 may be used as a GPGPU (General Purpose Computing on Graphics Processing Unit). The GPU 104 can execute programs in response to instructions from the CPU 101. 
The information processing device 100 may have a volatile semiconductor memory other than RAM 102 as GPU memory. 【0034】 The input interface 105 receives input signals from an input device 112 connected to the information processing device 100. The input device 112 is, for example, a mouse, a touch panel, or a keyboard. Multiple input devices may be connected to the information processing device 100. 【0035】 The media reader 106 is a reading device that reads programs and data recorded on the recording medium 113. The recording medium 113 is, for example, a magnetic disk, an optical disk, or semiconductor memory. Magnetic disks include flexible disks (FDs) and HDDs. Optical disks include CDs (Compact Discs) and DVDs (Digital Versatile Discs). The media reader 106 copies the programs and data read from the recording medium 113 to other recording media such as RAM 102 or HDD 103. The read programs may be executed by the CPU 101. 【0036】 The recording medium 113 may be a portable recording medium. The recording medium 113 may be used for distributing programs and data. The recording medium 113 and HDD 103 may also be referred to as computer-readable recording media. 【0037】The communication interface 107 communicates with other information processing devices via the network 114. The communication interface 107 may be a wired communication interface connected to a wired communication device such as a switch or router, or it may be a wireless communication interface connected to a wireless communication device such as a base station or access point. 【0038】 Next, the structure of the large-scale language model will be described. The large-scale language model may be a neural network, or it may be implemented using a transformer with an attention mechanism. Transformers are also described in the following non-patent document: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, and Lukasz Kaiser, "Attention Is All You Need", Proc. 
of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), pages 6000-6010, December 2017. Note that the large-scale language model in the second embodiment is not limited to a transformer, but may also be other machine learning models such as LSTM (Long Short-Term Memory). 【0039】 Figure 3 shows an example of the structure of a machine learning model. The machine learning model 130 is an encoder-decoder type neural network, also known as a transformer. The machine learning model 130 has embedding layers 131, 132, position coding layers 133, 134, encoder 135, decoder 136, linear layer 137, and softmax layer 138. 【0040】The embedding layer 131 converts each of the multiple words contained in the input text into a word vector called an embedding representation or distributed representation. A word vector is a numerical vector with a fixed number of dimensions, such as 512 or 1024 dimensions. Similar word vectors are assigned to words used in similar contexts. The correspondence between words and word vectors is determined by a neural network. The embedding layer 131 may be trained together with the other layers of the machine learning model 130, or it may be pre-trained. 【0041】 The embedding layer 132 converts each of the one or more words that have been determined so far from the words that should be included in the output text into a word vector. In the machine learning model 130, the words that should be included in the output text are determined one by one from the beginning. The same correspondence between words and word vectors as in the embedding layer 131 is used. 【0042】 The position encoding layer 133 adds a position vector corresponding to the word's position to the word vector output by the embedding layer 131. This addition of position vectors is sometimes called position encoding. The position vector is a numerical vector with the same number of dimensions as the word vector. 
For each of the multiple words in the input text, the position encoding layer 133 calculates the numerical values for each dimension of the position vector using a sine function or cosine function, based on a non-negative integer giving the word's position counted from the beginning. 【0043】 The position encoding layer 134 adds a position vector corresponding to the word's position to the word vector output by the embedding layer 132. The method for calculating the position vector is the same as that of the position encoding layer 133. For each of the one or more words in the output text, the position encoding layer 134 calculates the numerical values for each dimension included in the position vector using a sine function or cosine function, based on a non-negative integer giving the word's position counted from the beginning. 【0044】 The encoder 135 converts a plurality of vectors corresponding to a plurality of words. The encoder 135 sequentially includes a self-attention layer 135a, a normalization layer 135b, a feed-forward layer 135c, and a normalization layer 135d. The machine learning model 130 may stack a plurality of encoders 135 in series. In that case, the first encoder receives a vector from the position encoding layer 133, and the last encoder outputs a vector to the decoder 136. 【0045】 The self-attention layer 135a converts a vector using an attention mechanism. The self-attention layer 135a has a query matrix, a key matrix, and a value matrix as trained parameter values. The self-attention layer 135a selects one word of interest from among a plurality of words included in the input text. 【0046】 The self-attention layer 135a converts the vector of the word of interest by the query matrix to calculate a vector called a query. Also, the self-attention layer 135a converts the vectors of each of the plurality of words by the key matrix to calculate a vector called a key. 
The self-attention layer 135a calculates the inner product of the query and the key as the attention score for each word. The attention score indicates the degree of relevance between the word of interest and each word. 【0047】 The self-attention layer 135a converts the vectors of each of the plurality of words by the value matrix to calculate a vector called a value. The self-attention layer 135a uses the attention score as a weight to calculate the weighted sum of the values among the plurality of words, and outputs the calculated weighted sum as the vector after conversion for the word of interest. The self-attention layer 135a repeats the above process while changing the word of interest. 【0048】 The normalization layer 135b normalizes the vector output by the self-attention layer 135a so that the numerical values for each dimension follow a certain distribution. The feed-forward layer 135c is a feed-forward neural network. The feed-forward layer 135c individually converts the vectors of multiple words using the trained parameter values. The normalization layer 135d normalizes the vector output by the feed-forward layer 135c in the same way as the normalization layer 135b. 【0049】 The decoder 136 converts the vectors of one or more words that have been determined so far among the words to be included in the output text. The decoder 136 sequentially includes a self-attention layer 136a, a normalization layer 136b, an attention layer 136c, a normalization layer 136d, a feed-forward layer 136e, and a normalization layer 136f. The machine learning model 130 may stack a plurality of decoders 136 in series. In that case, the first decoder receives the vector from the position encoding layer 134, and the last decoder outputs the vector to the linear layer 137. 【0050】 The self-attention layer 136a converts the vector using the same attention mechanism as the self-attention layer 135a. The query, key, and value are calculated from the vectors of the words in the output text. 
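The self-attention computation of paragraphs 0045 to 0047 can be sketched in plain Python as below. The description only says that the attention scores are used as weights; normalizing them with a softmax is an assumption borrowed from the cited Transformer paper, and the tiny identity matrices are placeholder parameter values, not trained ones.

```python
import math


def matvec(vector, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * row[c] for v, row in zip(vector, matrix))
            for c in range(len(matrix[0]))]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1 (assumed normalization)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(word_vectors, w_query, w_key, w_value):
    """Convert each word vector into a weighted sum of the values of all words."""
    keys = [matvec(x, w_key) for x in word_vectors]
    values = [matvec(x, w_value) for x in word_vectors]
    outputs = []
    for x in word_vectors:                       # each word takes a turn as the word of interest
        query = matvec(x, w_query)
        scores = [dot(query, k) for k in keys]   # attention score per word
        weights = softmax(scores)
        outputs.append([sum(w * v[d] for w, v in zip(weights, values))
                        for d in range(len(values[0]))])
    return outputs


identity = [[1.0, 0.0], [0.0, 1.0]]              # placeholder parameter matrices
result = self_attention([[1.0, 0.0], [0.0, 1.0]], identity, identity, identity)
print(result)
```

In this toy input each word attends most strongly to itself, because its query has the largest inner product with its own key.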
The normalization layer 136b normalizes the vector output by the self-attention layer 136a in the same way as the normalization layer 135b. 【0051】 The attention layer 136c converts the vectors of the words in the output text using the attention mechanism. However, the attention layer 136c calculates the query from the vectors of the words in the output text, and calculates the key and value from the vectors of the words in the input text. Thereby, the degree of relevance between the words in the output text and the words in the input text is determined. 【0052】 The attention layer 136c selects one word of interest from one or more words contained in the output text. The attention layer 136c calculates a query by transforming the vector of the word of interest using a query matrix. The attention layer 136c also receives vectors of multiple words contained in the input text from the encoder 135. The attention layer 136c calculates a key by transforming the vector of each word using a key matrix, and calculates a value by transforming the vector of each word using a value matrix. 【0053】 The attention layer 136c calculates the dot product of the query and the key as an attention score for each word in the input text. The attention score indicates the degree of relevance between each word in the input text and the word in the output text of interest. The attention layer 136c uses the attention scores as weights to calculate a weighted sum of values among multiple words in the input text. The attention layer 136c outputs the calculated weighted sum as a transformed vector for the word in the output text of interest. 【0054】 The normalization layer 136d normalizes the vector output by the attention layer 136c in the same way as the normalization layer 135b. The feed-forward layer 136e transforms the word vectors of the output text individually using the trained parameter values. 
The normalization layer 136f normalizes the vector output by the feed-forward layer 136e in the same way as the normalization layer 135b. 【0055】 The linear layer 137 uses the numerical values contained in the vector output by the decoder 136 to calculate scores for various words listed in the dictionary. The words listed in the dictionary are words to which word vectors are assigned by the embedding layers 131 and 132. For example, the word vectors of the embedding layers 131 and 132 are referenced in the calculation of the scores. 【0056】 The softmax layer 138 converts the scores of various words into probabilities between 0 and 1. The machine learning model 130 selects one word according to the probability and adds the selected word to the end of the output text. The machine learning model 130 generates the output text by repeating the process of the decoder 136 described above. 【0057】 The machine learning model 130 uses random numbers when selecting words for the output text to ensure diversity in the output text. The machine learning model 130 randomly selects one word from among several candidate words taken in descending order of probability. Therefore, the word with the highest probability is not necessarily selected. The extent to which lower-ranking words are included as selection candidates is adjusted by the hyperparameters of the machine learning model 130. 【0058】 Next, we will explain the extraction of design information using a large-scale language model. The information processing device 100 has the large-scale language model explain the meaning of the constituent units contained in the source code to be analyzed. This allows the information processing device 100 to support the understanding of legacy source code for which manually written design information is not available. As will be described later, constituent units of source code can include the entire source code, functions, lines, etc. 
In the second embodiment, we mainly assume the case where individual lines contained in the source code are used as constituent units. 【0059】 Figure 4 shows a first example of output text describing the source code. As an example, the information processing device 100 reads the source code 141 and generates prompt text 142 from the source code 141. The source code 141 is a program written in a procedural high-level language and contains multiple lines corresponding to instructions. For example, the source code 141 is a PL/I program and contains lines separated by semicolons (;). 【0060】 The prompt text 142 is the input text that is input to the large-scale language model. The prompt text 142 includes instructions that direct the large-scale language model to explain the meaning of each of the multiple lines contained in the specified source code according to the specified output format. The information processing device 100 inserts the entire source code 141 into the prompt text 142. The information processing device 100 also divides the source code 141 into multiple lines. The information processing device 100 inserts an output format into the prompt text 142 that specifies that an explanatory text should be added to each of the multiple lines. 【0061】 The large-scale language model generates output text 143 from prompt text 142. Output text 143 is the response text to prompt text 142. Output text 143 includes explanatory text that describes the meaning of each of the multiple lines, according to the output format specified by prompt text 142. 【0062】 In the example in Figure 4, the descriptions in the first and second lines are accurate. However, the code in the third line of the output text 143 differs from the original third line in the prompt text 142. Therefore, the large-scale language model may be outputting an incorrect description based on an incomplete memory of the source code 141. 
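The line division of paragraph 0060 and the kind of mismatch noted in paragraph 0062 can be sketched in Python as follows. The sample statements are invented PL/I-like text, not the contents of Figure 4, and the naive split on semicolons assumes that no semicolon appears inside a string literal or comment.

```python
def split_statements(source_code: str) -> list[str]:
    """Split source code into lines that each end with a semicolon.

    Text after the last semicolon is ignored because it is not a complete line.
    """
    parts = source_code.split(";")
    return [part.strip() + ";" for part in parts[:-1] if part.strip()]


def find_altered_lines(original: list[str], echoed: list[str]) -> list[int]:
    """Return 1-based positions where the echoed code differs from the original."""
    altered = []
    for position, source_line in enumerate(original, start=1):
        echoed_line = echoed[position - 1] if position <= len(echoed) else None
        if echoed_line != source_line:
            altered.append(position)
    return altered


# Invented example: the third echoed line no longer matches the source.
source = "DCL TOTAL FIXED; TOTAL = PRICE * QTY; PUT LIST(TOTAL);"
original_lines = split_statements(source)
echoed_lines = ["DCL TOTAL FIXED;", "TOTAL = PRICE * QTY;", "PUT LIST(PRICE);"]
print(find_altered_lines(original_lines, echoed_lines))
```

An exact comparison is the strictest possible check. A tolerant variant could compare each pair with a similarity metric instead.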
【0063】 Thus, large-scale language models face a challenge called hallucination, where they may output information that does not fit the prompt text. When the prompt text or output text is long, large-scale language models may generate incorrect explanations by forgetting at least part of the source code during the language processing process. For example, large-scale language models that internally use context vectors may not give importance to the context of words far removed from the word of focus, and contextual information may gradually become diluted. This can lead to a forgetting phenomenon where the memory of the initially given source code gradually fades. 【0064】Therefore, the information processing device 100 of the second embodiment improves the method of using the large-scale language model as described below. First, the information processing device 100 divides the source code into multiple lines. The information processing device 100 instructs the large-scale language model to explain the meaning of one line at a time. At this time, the information processing device 100 generates prompt text that inserts the entire source code and the target code indicating the line to be explained, and inputs it to the large-scale language model. The information processing device 100 repeats the input of prompt text according to the number of target codes. In this way, the information processing device 100 suppresses the risk of forgetting the target codes. 【0065】 Furthermore, the information processing device 100 inserts an instruction into the prompt text that instructs the system to output the target code itself in addition to the explanatory text for the target code. This allows the information processing device 100 to check whether the large-scale language model has forgotten the target code. 
If the target code has not been forgotten, the large-scale language model is highly likely to be able to reproduce the target code itself when generating the explanatory text. On the other hand, if the target code has been forgotten, the large-scale language model is highly likely to be unable to reproduce the target code when generating the explanatory text. Therefore, the information processing device 100 determines whether or not the code has been forgotten based on the degree of recall of the target code. 【0066】 Figure 5 shows a second example of output text explaining the source code. The information processing device 100 uses libraries such as Structured Outputs and Langchain Output Parser to efficiently generate multiple prompt texts corresponding to different target codes. 【0067】The information processing device 100 provides format specification text 144 and prompt text 145. The format specification text 144 and prompt text 145 may be input by the user. The format specification text 144 specifies the format of the output text of the large-scale language model. The format specification text 144 specifies a JSON (JavaScript Object Notation) text data format that includes a string indicating the target code and a string indicating a description of the target code. 【0068】 The format specification text 144 includes a regular expression that indicates the string pattern of the target code. The information processing device 100 can verify the output of the large-scale language model by checking whether the output string of the large-scale language model conforms to the regular expression. In the example in Figure 5, the target code contains one or more characters other than semicolons and ends with a semicolon. 【0069】 The prompt text 145 is a template used commonly for multiple target codes. The prompt text 145 includes a variable indicating the location where the entire source code will be inserted. 
It also includes a variable indicating the location where a single target code will be inserted. Furthermore, the prompt text 145 includes a variable indicating the location where information about the output format specified by the format specification text 144 will be inserted. 【0070】 Furthermore, prompt text 145 includes an instruction to explain the meaning of the target code within the entire source code. In addition, prompt text 145 includes an instruction to return the target code as is, separate from the explanation. 【0071】The information processing device 100 takes the entire source code, a single target code, and output format information as input to the prompt text 145 to generate a specific prompt text that queries the meaning of a single target code. The information processing device 100 inputs the generated prompt text into a large-scale language model. The information processing device 100 analyzes the output text obtained from the large-scale language model according to the format specification text 144 and extracts the target code and explanatory text from the output text. 【0072】 The information processing device 100 obtains multiple pairs of target codes and explanatory texts by repeating the above process for multiple target codes. As a result, the information processing device 100 generates output text 146, which is a sequence of target code and explanatory text pairs. In the example in Figure 5, for the first line, the target code output by the large-scale language model matches the original. Therefore, it is estimated that the possibility of forgetting the target code is low, and the reliability of the explanatory text is high. The same applies to the second and third lines. 【0073】 Figure 6 shows an example of prompt text input to a large-scale language model. Prompt text 147 is a specific prompt text generated from prompt text 145, and corresponds to an instance of prompt text 145. Prompt text 147 is input to the large-scale language model. 
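The template substitution and output check described in paragraphs 【0067】 to 【0071】 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the Structured Outputs or LangChain libraries named in the text. The template wording, the JSON field names `code` and `description`, and the function names are assumptions made for this example; only the pattern "one or more non-semicolon characters followed by a semicolon" comes from the description.

```python
import json
import re

# Hypothetical template; the placeholder names are illustrative only.
PROMPT_TEMPLATE = (
    "Source code:\n{source_code}\n\n"
    "Explain the meaning of the following target code within the source code, "
    "and return the target code as is, separate from the explanation.\n"
    "Target code:\n{target_code}\n\n"
    "Output format:\n{format_instructions}\n"
)

# Hypothetical output format description (field names are assumptions).
FORMAT_INSTRUCTIONS = 'A JSON object with string fields "code" and "description".'

# Pattern from the text: one or more characters other than ';', ending with ';'.
TARGET_CODE_PATTERN = re.compile(r"[^;]+;")


def build_prompt(source_code: str, target_code: str) -> str:
    """Create one prompt instance for a single target code."""
    return PROMPT_TEMPLATE.format(
        source_code=source_code,
        target_code=target_code,
        format_instructions=FORMAT_INSTRUCTIONS,
    )


def parse_output(output_text: str):
    """Return (code, description), or None if the output is malformed."""
    try:
        data = json.loads(output_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    code = data.get("code")
    description = data.get("description")
    if not isinstance(code, str) or not isinstance(description, str):
        return None
    code = code.strip()
    # Verify that the returned target code conforms to the regular expression.
    if TARGET_CODE_PATTERN.fullmatch(code) is None:
        return None
    return code, description
```

A caller would invoke `build_prompt` once per target code and pass each model response through `parse_output` before any similarity check; a `None` result can be treated like a rejected response.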
【0074】 The prompt text 147 includes the entire source code 141. The prompt text 147 also includes the first line of source code 141 as the target code. Furthermore, the prompt text 147 includes an instruction to output JSON-formatted text data containing the target code and a descriptive text explaining the target code. 【0075】 For each prompt, the information processing device 100 compares the original target code with the target code output by the large-scale language model to determine the degree of similarity. If the two target codes are sufficiently similar, the information processing device 100 accepts the explanatory text output by the large-scale language model. On the other hand, if the two target codes are not sufficiently similar, the information processing device 100 rejects the explanatory text output by the large-scale language model. Because the large-scale language model may slightly alter the string representation even if it has not forgotten the target code, the information processing device 100 may accept the explanatory text even if the two target codes are not an exact match. 【0076】 In similarity determination, the information processing device 100 may calculate the distance between the two target codes, and may accept the descriptive text if the distance is less than a threshold. A shorter distance indicates a higher degree of similarity. Examples of distances include the Levenshtein distance (edit distance), cosine distance, Euclidean distance, Jaccard distance, a distance derived from the Dice coefficient, and Word Mover's Distance (WMD). 【0077】 The Levenshtein distance used here is the number of editing operations required to convert one string into another, normalized by the number of characters in the longer string. Editing operations include adding, deleting, or replacing one character. For example, "chikara udon" (ちからうどん, six kana characters in the original Japanese) can be converted to "karagenki" (からげんき) by deleting the first and fourth characters, replacing the fifth character with "ge", and adding "ki" to the end. 
Therefore, the number of editing operations between "chikara udon" and "karagenki" is 4, and the Levenshtein distance is 4 / 6 ≈ 0.67, where 6 is the number of characters in the longer string as written in Japanese kana. 【0078】 The Jaccard distance is the ratio of the number of characters that appear in only one of the two strings to the number of characters that appear in at least one of the two strings. The cosine distance is the value obtained by subtracting the cosine similarity from 1. The information processing device 100 generates a character vector for each of the two strings, in which the dimension corresponding to each appearing character is 1 and the dimension corresponding to each non-appearing character is 0. The information processing device 100 calculates the cosine similarity by dividing the dot product of the two character vectors by the product of their L2 norms. The Euclidean distance corresponds to the L2 norm of the difference between the two character vectors. 【0079】 The Dice coefficient is calculated by dividing twice the number of characters common to the two strings by the sum of the numbers of characters in the two strings. To convert the Dice coefficient into a distance index, the information processing device 100 may use a value obtained by subtracting the Dice coefficient from 1. WMD is the sum of the distances between word vectors of corresponding words in the two strings. The information processing device 100 divides each of the two strings into multiple words and converts each word into a word vector called a distributed representation or embedding representation. The information processing device 100 determines the word correspondence between the two strings that minimizes cost, calculates the Euclidean distance between the word vectors of the corresponding words, and sums these distances. 
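The character-based distances in paragraphs 【0076】 to 【0079】 can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used in the embodiment: the function names are assumptions, the Dice-based distance is computed over sets of distinct characters (one possible reading of "number of characters"), and WMD is omitted because it needs word embeddings that the text does not specify.

```python
import math


def levenshtein_distance(a: str, b: str) -> float:
    """Edit count normalized by the length of the longer string."""
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # replace or keep
        prev = curr
    return prev[-1] / max(len(a), len(b))


def jaccard_distance(a: str, b: str) -> float:
    """Characters in only one string divided by characters in at least one."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa ^ sb) / len(union) if union else 0.0


def cosine_distance(a: str, b: str) -> float:
    """1 minus the cosine similarity of binary character-presence vectors."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0 if sa == sb else 1.0
    # For 0/1 vectors the dot product is |A & B| and each L2 norm is sqrt(|set|).
    return 1.0 - len(sa & sb) / math.sqrt(len(sa) * len(sb))


def euclidean_distance(a: str, b: str) -> float:
    """L2 norm of the difference between binary character-presence vectors."""
    return math.sqrt(len(set(a) ^ set(b)))


def dice_distance(a: str, b: str) -> float:
    """1 minus the Dice coefficient over distinct characters (assumption)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 1.0 - 2 * len(sa & sb) / (len(sa) + len(sb))
```

With the kana strings from the worked example, `levenshtein_distance("ちからうどん", "からげんき")` evaluates to 4 / 6. The set-based distances ignore character order and repetition, so they are more tolerant of reordering than the edit distance.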
【0080】 Furthermore, in the similarity determination, the information processing device 100 may also determine whether the output target code contains the original target code as a string, and may accept the explanatory text if such a containment relationship exists. This containment relationship means that the original target code is a substring of the output target code, and indicates that there is no loss of information in the output target code. However, when determining the containment relationship of strings, the information processing device 100 may ignore whitespace characters such as spaces and tabs. 【0081】 Furthermore, in the similarity determination, the information processing device 100 may accept the explanatory text if the distance is less than the threshold and the generated target code includes the original target code. On the other hand, the information processing device 100 may reject the explanatory text if the distance is greater than or equal to the threshold, or if the generated target code does not include the original target code. Note that the method of similarity determination is not limited to the above method, and the information processing device 100 may use other methods. 【0082】 If the explanatory text is rejected, the information processing device 100 inputs the same prompt text again into the large-scale language model. Because the large-scale language model uses random numbers, it may generate different output text from the same prompt text. The information processing device 100 continues to input the same prompt text until the explanatory text is accepted. 【0083】 However, if the number of trials reaches a certain number N, the information processing device 100 stops inputting the prompt text. For example, N is 10. In that case, the information processing device 100 selects, from the N explanatory texts, the one with the smallest distance. 
At this time, the information processing device 100 will add a "requires verification" flag to the selected explanatory text. The "requires verification" flag is warning information that draws the user's attention because the reliability of the explanatory text is predicted to be low. 【0084】 As mentioned above, source code includes various constituent units such as the entire source code, functions, and lines of code. When extracting design information from source code, the user specifies which constituent units' descriptions should be added to the design information. 【0085】 Figure 7 shows an example of source code division. Source code 148 is a PL / I program. Source code 148 accepts two integers, calculates the sum and product of the two integers, and outputs them. Source code 148 contains four global variables, a main function, and two subfunctions. When extracting design information from source code 148, the user specifies design information items 151. The information processing device 100 extracts the constituent units to be explained from source code 148 according to the specified design information items 151. 【0086】 The information processing device 100 may select the entire source code 148 as the target code. The information processing device 100 may also extract a function set 152 from the source code 148. In the example in Figure 7, the function set 152 includes three functions: MAIN, ADD_FUNC, and MULTIPLY_FUNC. Furthermore, the information processing device 100 may extract a line set 153 from the source code 148. The line set 153 includes the instructions for each of the multiple lines, such as the first line's instruction and the second line's instruction. 【0087】 The information processing device 100 may extract different types of constituent units, such as functions and lines, as target codes. The information processing device 100 generates a prompt instance by inserting each extracted target code into the prompt template. 
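As a rough illustration of extracting a line set such as line set 153, the sketch below splits source text into semicolon-terminated statements. It is a simplification under stated assumptions: the function name is hypothetical, and semicolons inside string literals or comments are not handled, which a real PL/I parser would need to do.

```python
import re


def split_statements(source_code: str) -> list[str]:
    """Split source text into statements that each end with a semicolon.

    Assumption: every ';' terminates a statement (no literals or comments
    containing ';'). Whitespace inside a statement is collapsed to one space.
    """
    statements = []
    for match in re.findall(r"[^;]+;", source_code):
        text = " ".join(match.split())
        if text != ";":  # skip empty statements such as a stray ';'
            statements.append(text)
    return statements
```

Each returned statement could then be inserted into the prompt template as one target code. Text after the last semicolon is ignored rather than treated as a statement.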
【0088】Figure 8 shows an example of a design information file. The information processing device 100 ultimately generates a design information file 155 from the source code 148. The design information file 155 contains design information extracted according to the design information items 151. In the example in Figure 8, the design information file 155 contains a processing summary of the entire source code 148. This processing summary may be an explanatory text generated from the entire source code 148 by a large-scale language model. 【0089】 Furthermore, the design information file 155 includes the original code, which is the source code 148 itself. The design information file 155 also includes a processing overview and a flowchart of the function MAIN. This processing overview may be an explanatory text generated from the code of the function MAIN by a large-scale language model. The explanation of each step included in the flowchart may be an explanatory text generated from the instruction statements of each line by a large-scale language model, or it may be generated using a source code analysis technique not described in the second embodiment. 【0090】 The information processing device 100 outputs the generated design information file 155. The information processing device 100 may save the design information file 155 to non-volatile storage, display it on the display device 111, or transmit it to another information processing device. Next, an example of evaluating the explanatory text generated by the large-scale language model will be described. 【0091】 Figure 9 shows an example of source code. Source code 149 is a PL / I program used for payroll calculation. Source code 149 contains 141 statements. Among the 141 statements are 3 include statements, 8 function declaration statements, and 46 variable declaration statements. 
In addition, among the 141 statements are 3 function call statements, 36 control statements, 28 assignment statements, and 17 file operation statements. 【0092】 Source code 149 specifies the process for generating a departmental salary list from an employee salary information file. Source code 149 includes a main function, a first sub-function, and a second sub-function. The main function is a function that generates a departmental salary list and has a variable indicating salary. The first sub-function is a function that reads the employee salary information file and has a variable indicating department. The second sub-function is a function that outputs the departmental salary list and has a variable indicating employee. The input data record includes department number, department name, employee number, employee name, and salary amount. The output data record includes department number, department name, number of employees, and average salary amount. 【0093】 The information processing device 100 causes the large-scale language model to output 10 explanatory sentences for each of the 141 statements. For each of the 10 outputs, the information processing device 100 determines whether the statement output together with the explanatory sentence meets the acceptance criteria. In addition, for each of the 141 statements, the user manually evaluates the explanatory sentence with the smallest distance among the 10 generated explanatory sentences. The manual evaluation is performed from the perspectives of accuracy and completeness. Accuracy indicates that the explanatory sentence does not contain errors. Completeness indicates that the explanatory sentence is neither excessive nor insufficient. 【0094】 Figure 10 shows an example of the similarity judgment results. Table 156 shows the results of automatic judgment and manual evaluation of the descriptions output by two large-scale language models. 
The automatic judgment results indicate acceptance or rejection based on the similarity of the target code. The manual evaluation results show the user's assessment of the accuracy and completeness of the description. 【0095】 In the first large-scale language model, all 10 generated explanations were accepted for 125 of the 141 statements (88.7%). For 13 statements (9.3%), some of the 10 generated explanations were accepted. For 3 statements (2.0%), all 10 generated explanations were rejected. 【0096】 Of the 125 statements for which all 10 explanations were accepted, 120 (96.0%) were found to be accurate, and 125 (100%) were found to be complete. Of the 13 statements for which some explanations were accepted, 13 (100%) were found to be accurate, and 13 (100%) were found to be complete. Of the 3 statements for which all 10 explanations were rejected, 2 (66.7%) were found to be accurate, and 3 (100%) were found to be complete. 【0097】 In the second large-scale language model, all 10 generated explanations were accepted for 117 of the 141 statements (83.0%). For 15 statements (10.7%), some of the 10 generated explanations were accepted. For 9 statements (6.3%), all 10 generated explanations were rejected. 【0098】 Of the 117 statements for which all 10 explanations were accepted, 106 (90.6%) were found to be accurate, and 102 (87.2%) were found to be complete. Of the 15 statements for which some explanations were accepted, all 15 (100%) were found to be accurate, and all 15 (100%) were found to be complete. Of the 9 statements for which all 10 explanations were rejected, 6 (66.7%) were found to be accurate, and 8 (88.9%) were found to be complete. 
【0099】 As shown in Table 156, many of the explanatory texts accepted by the information processing device 100 are also evaluated as accurate and complete in the manual evaluation. Furthermore, even if the information processing device 100 initially rejects an explanatory text, it is highly likely that it can obtain an explanatory text that ultimately satisfies accuracy and completeness by having the large-scale language model output the explanatory text again. 【0100】 Figure 11 shows an example of descriptions to which a "requires verification" flag has been added. Table 157 includes two cases in which a "requires verification" flag was added to a description. The first case is one from Table 156 above in which the description was rejected all 10 times and does not meet the accuracy requirement. The second case is one in which the description was rejected all 10 times and does not meet the completeness requirement. 【0101】 In Table 157, "Original Code" shows the original target code, and "Generated Code" shows the target code generated by the large-scale language model. Similarity is calculated by subtracting the distance from 1, with a higher value indicating greater similarity. 【0102】 In the first case, the picture format specified in the third variable item included in the generated code differs from that of the original code. As a result, the similarity between the original code and the generated code is calculated to be 0.973. The explanatory text generated by the large-scale language model contains an incorrect description of the picture format, matching the generated code. In the first case, forgetting of the target code has occurred in the large-scale language model, and the reliability of the explanatory text has decreased due to the effects of forgetting. The information processing device 100 adds a flag to the explanatory text of the first case according to the acceptance criteria. This alerts the user to the above lack of accuracy. 
【0103】 In the second example, the content of the THEN block included in the original code is missing from the generated code. As a result, the similarity between the original code and the generated code is calculated to be 0.667. The explanatory text generated by the large-scale language model does not explain the content of the THEN block, as shown in the generated code, and omits some of the content that should have been explained. The information processing device 100 adds a flag to the explanatory text in the second example according to the acceptance criteria. This alerts the user to the lack of sufficiency mentioned above. Next, the functions and processing procedures of the information processing device 100 will be described. 【0104】Figure 12 is a block diagram showing an example of the functions of an information processing device according to a second embodiment. The information processing device 100 includes a program storage unit 121, a model storage unit 122, a design information storage unit 123, a model access unit 124, a prompt generation unit 125, a design information extraction unit 126, and a similarity determination unit 127. The program storage unit 121, the model storage unit 122, and the design information storage unit 123 are implemented using, for example, a RAM 102 or an HDD 103. The model access unit 124, the prompt generation unit 125, the design information extraction unit 126, and the similarity determination unit 127 are implemented using, for example, a CPU 101, a GPU 104, and a program. 【0105】 The program storage unit 121 stores the source code to be analyzed. The model storage unit 122 stores the trained large-scale language model. However, instead of the information processing device 100 storing the large-scale language model, another information processing device may store the large-scale language model. The design information storage unit 123 stores the design information extracted from the source code. 
【0106】 The model access unit 124 receives prompt text from the prompt generation unit 125 and inputs the prompt text to the large-scale language model. In response to the prompt text, the model access unit 124 obtains output text from the large-scale language model and outputs it to the design information extraction unit 126. If the large-scale language model is stored in another information processing device, the model access unit 124 may send the prompt text to the other information processing device and receive the output text from the other information processing device. 【0107】 The prompt generation unit 125 receives prompt text templates and design information items from the user. The prompt generation unit 125 divides the source code according to the design information items and generates multiple target codes to be explained. For each of the multiple target codes, the prompt generation unit 125 generates, from the template, specific prompt text that instructs the large-scale language model to explain that target code. The prompt generation unit 125 outputs the generated prompt text to the model access unit 124. 【0108】 The design information extraction unit 126 obtains the output text output by the large-scale language model in response to the prompt text and extracts the target code and explanatory text from the output text. The design information extraction unit 126 has the similarity determination unit 127 determine the similarity of the target code. If the similarity meets the acceptance criteria, the design information extraction unit 126 saves the explanatory text in the design information storage unit 123. If the similarity does not meet the acceptance criteria, the design information extraction unit 126 instructs the prompt generation unit 125 to re-execute. If the number of attempts reaches the upper limit, the design information extraction unit 126 adds a "requires verification" flag to the explanatory text and saves it in the design information storage unit 123. 
【0109】 The similarity determination unit 127 determines the similarity between the original target code and the generated target code. For example, the similarity determination unit 127 calculates the distance between the two target codes and determines whether the distance is less than a threshold. The similarity determination unit 127 also determines whether the original target code corresponds to a substring of the generated target code. If both the above distance condition and inclusion relationship are met, the similarity determination unit 127 determines to accept the description; otherwise, it determines to reject the description. 【0110】 The information processing device 100 may store the determined similarity in non-volatile storage, display it on the display device 111, or transmit it to another information processing device. The information processing device 100 may also display the generated explanatory text on the display device 111 or transmit it to another information processing device. The information processing device 100 may also display a flag requiring confirmation on the display device 111 or transmit it to another information processing device. 【0111】 Figure 13 is a flowchart illustrating an example of the procedure for extracting design information. In step S10, the prompt generation unit 125 acquires source code, a template, and design information items. In step S11, the prompt generation unit 125 divides the source code according to the design information items and generates multiple target codes. For example, if a design information item includes a line of source code, the prompt generation unit 125 divides the source code into multiple lines. 【0112】In step S12, the prompt generation unit 125 selects one target code. The prompt generation unit 125 substitutes the entire source code and the selected target code into a template to generate prompt text. 
The prompt text includes an instruction to output an explanatory text that describes the meaning of the target code and an instruction to output the target code itself. 【0113】 In step S13, the model access unit 124 inputs the generated prompt text into the large-scale language model. In step S14, the design information extraction unit 126 obtains output text that includes the explanatory text and the target code itself. In step S15, the similarity determination unit 127 determines the similarity between the original target code and the target code included in the output text. For example, the similarity determination unit 127 determines whether the distance between the two target codes is less than a threshold and whether the output target code contains the original target code. 【0114】 In step S16, the design information extraction unit 126 determines whether the similarity satisfies the acceptance criteria. If the similarity satisfies the acceptance criteria, the process proceeds to step S19. If the similarity does not satisfy the acceptance criteria, the process proceeds to step S17. In step S17, the design information extraction unit 126 determines whether the number of iterations of steps S13 to S16 has reached the threshold N. If the number of iterations has reached the threshold N, the process proceeds to step S18. If the number of iterations has not reached the threshold N, the process returns to step S13. 【0115】 In step S18, the design information extraction unit 126 selects, from the N explanatory texts, the one with the shortest distance. The design information extraction unit 126 adds a flag to the selected explanatory text. In step S19, the design information extraction unit 126 determines whether it has obtained explanatory texts for all target codes. If it has obtained explanatory texts for all target codes, the process proceeds to step S20. 
If there are target codes for which an explanatory text has not been obtained, the process returns to step S12. In step S20, the design information extraction unit 126 outputs design information that includes the explanatory texts for the multiple target codes for the specified design information item. 【0116】 As described above, the information processing device 100 of the second embodiment extracts design information from source code using a large-scale language model. This automatically generates useful information for understanding older source code for which manually written design information is no longer available. 【0117】 Furthermore, the information processing device 100 divides the source code into constituent units for generating explanatory texts and requests an explanatory text for one target code at a time from the large-scale language model with a single prompt. This reduces the risk of the large-scale language model forgetting the target code and suppresses the deterioration of explanatory text quality caused by forgetting. In addition to the explanatory text, the information processing device 100 also requests the large-scale language model to return the target code itself. The information processing device 100 estimates whether or not forgetting has occurred based on the similarity between the original target code and the generated target code. This allows for the detection of quality deterioration due to forgetting. 【0118】 Furthermore, if forgetting of the target code occurs, the information processing device 100 requests the large-scale language model to regenerate the explanatory text. This improves the quality of the explanatory text. Also, if forgetting of the target code is not resolved even after regeneration, the information processing device 100 adds a "requires verification" flag to the explanatory text to alert the user. This reduces the risk that the user may misinterpret the meaning of the target code based on an incorrect explanatory text. 
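Steps S13 to S18 for a single target code can be sketched as a retry loop. This is a minimal sketch under stated assumptions, not the embodiment's implementation: `ask_model` stands in for the model access unit and is expected to return a pair of generated target code and explanatory text, `distance_fn` is any distance from paragraph 【0076】, and the default threshold of 0.1 is a placeholder, since the text does not give a threshold value. Only the limit of 10 trials comes from the description.

```python
from typing import Callable, NamedTuple, Tuple


class Explanation(NamedTuple):
    target_code: str
    description: str
    distance: float
    requires_verification: bool


def contains_original(original: str, generated: str) -> bool:
    """Substring check that ignores whitespace such as spaces and tabs."""
    def squeeze(text: str) -> str:
        return "".join(text.split())
    return squeeze(original) in squeeze(generated)


def explain_with_retries(
    original: str,
    ask_model: Callable[[str], Tuple[str, str]],   # hypothetical model call
    distance_fn: Callable[[str, str], float],
    threshold: float = 0.1,                        # assumed value
    max_trials: int = 10,                          # N = 10 in the text
) -> Explanation:
    if max_trials < 1:
        raise ValueError("max_trials must be at least 1")
    best = None
    for _ in range(max_trials):
        generated, description = ask_model(original)          # S13, S14
        distance = distance_fn(original, generated)           # S15
        accepted = distance < threshold and contains_original(original, generated)
        if accepted:                                          # S16
            return Explanation(original, description, distance, False)
        if best is None or distance < best.distance:
            best = Explanation(original, description, distance, True)
    # S18: all trials rejected, so return the closest one with the flag set.
    return best
```

Running this function once per target code and collecting the results corresponds to the outer loop of steps S12 and S19. Keeping the rejected candidate with the smallest distance means the flagged text is the one most likely to reflect the original code.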
【0119】 The above merely illustrates the principle of the present invention. Furthermore, numerous modifications and changes are possible for those skilled in the art, and the present invention is not limited to the exact configurations and applications shown and described above. All corresponding modifications and equivalents are considered to be within the scope of the present invention as defined by the appended claims and their equivalents. 【0120】 10: Information processing device; 11: Memory unit; 12: Processing unit; 13: Machine learning model; 14: Prompt; 14a, 14b: Instruction; 15, 16, 17: Information

Claims

1. A determination program that causes a computer to execute a process that includes inputting a prompt to a machine learning model, which includes a first instruction to output information related to a first piece of information and a second instruction to output the first piece of information; acquiring the second piece of information output from the machine learning model in response to the second instruction; calculating the similarity between the first piece of information and the second piece of information; and determining the memory state of the first piece of information in the machine learning model based on the similarity.

2. The determination program according to claim 1, wherein the machine learning model is a natural language processing model, and the prompt is a prompt text including a first instruction sentence describing the first instruction in natural language and a second instruction sentence describing the second instruction in natural language.

3. The determination program according to claim 1, wherein the first information is target code included in source code, and the information related to the first information is an explanatory text that explains the meaning of the target code.

4. The determination program according to claim 1, wherein, if the determined memory state indicates forgetting of the first information, the process further includes causing the machine learning model to output the information related to the first information again.

5. The determination program according to claim 1, wherein, if the determined memory state indicates forgetting of the first information, the process further includes adding warning information indicating a reliability warning to third information output from the machine learning model in response to the first instruction.

6. A determination method in which a computer performs a process comprising: inputting, to a machine learning model, a prompt that includes a first instruction to output information related to first information and a second instruction to output the first information; acquiring second information output from the machine learning model in response to the second instruction; calculating a similarity between the first information and the second information; and determining a memory state of the first information in the machine learning model based on the similarity.

7. An information processing device comprising: a storage unit that stores first information; and a processing unit that inputs, to a machine learning model, a prompt including a first instruction to output information related to the first information and a second instruction to output the first information, acquires second information output from the machine learning model in response to the second instruction, calculates a similarity between the first information and the second information, and determines a memory state of the first information in the machine learning model based on the similarity.