Report evaluation method, program, and report evaluation system
The report evaluation method and system address inefficiencies in existing technologies by dynamically generating check items aligned with the report's content, ensuring accurate and efficient evaluations with detailed feedback.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO LTD
- Filing Date
- 2025-10-28
- Publication Date
- 2026-06-18
AI Technical Summary
Existing document evaluation technologies, such as those using large language models (LLM), face inefficiencies due to manual filtering of generated check items, excessive resource consumption, and inflexible evaluation criteria, making it difficult to adapt to various perspectives and accurately evaluate documents based on their content.
A report evaluation method and system that dynamically generates flexible check items tailored to the content of the report by determining main check items, subdividing them, and incorporating user feedback to ensure accurate and detailed evaluations.
Enables flexible and efficient evaluation of reports by generating check items that align with the report's content, reducing resource consumption, and providing detailed feedback for improvement.
Abstract
Description
Report Evaluation Method, Program, and Report Evaluation System 【0001】 The present disclosure relates to a report evaluation method and the like. 【0002】 Conventionally, techniques for evaluating documents have been proposed (see, for example, Non-Patent Document 1). In Non-Patent Document 1, a device is disclosed that uses a large language model (LLM) to generate check items for evaluation criteria determined by a person and evaluates a document using the generated check items. In the device described in Non-Patent Document 1, filtering of check items by a person is performed before evaluating the document. 【0003】 Also, a technique has been proposed for confirming consistency from the difference between the global (overall) perspective and the local (partial) content with respect to evaluation criteria for a document (see, for example, Patent Document 1). 【0004】 U.S. Patent No. 11194964 【0005】 Yukyung Lee, et al., "CheckEval: Robust Evaluation Framework using Large Language Model via Checklist", [online], March 27, 2024, arXiv, Internet (https://arxiv.org/abs/2403.18771) 【0006】 However, in the technique of Non-Patent Document 1, although it aims to evaluate how well a document satisfies check items, check items for evaluation criteria adapted to the content of the text are not generated. Instead, the check items generated in large quantities by the LLM are filtered manually. Also, in the technique of Non-Patent Document 1, the manual filtering itself may be required each time evaluation is performed, and considering practical operation, it is not very effective. Furthermore, when using the LLM, including unnecessary or low-relevance information in the prompt increases the number of tokens and memory usage for inference, excessively consuming computing resources and reducing the efficiency of the evaluation process.
【0007】Furthermore, the technology described in Patent Document 1 performs evaluations based on evaluation criteria applicable only to confirming the consistency between the partial and overall content of the document being evaluated, making it difficult to set flexible evaluation criteria. For this reason, it may not be fully useful when it is necessary to check the content of a document from various perspectives. Moreover, in the technology described in Patent Document 1, if the definition of the set evaluation criteria is ambiguous, the content to be checked from the document being evaluated may not be appropriately selected, making accurate evaluation difficult. 【0008】 Therefore, this disclosure provides a report evaluation method that enables the generation of flexible check items that take into account the contents of the created report. 【0009】 To achieve the above objective, a report evaluation method according to one form of the present disclosure is a report evaluation method performed by a report evaluation system that evaluates the contents of a report, and includes an acquisition step of acquiring a report to be evaluated; a determination step of determining main check items, which are evaluation perspectives for evaluating the acquired report to be evaluated, according to the contents of the acquired report to be evaluated; an evaluation step of evaluating the report to be evaluated based on the determined main check items; and an output step of outputting an evaluation result obtained by evaluating the report to be evaluated. 【0010】 To achieve the above objective, a program relating to one form of this disclosure is a program that causes a computer to execute the report evaluation method described above. 
【0011】 To achieve the above objective, a report evaluation system according to one form of the present disclosure is a report evaluation system for evaluating the contents of a report, comprising: an acquisition unit for acquiring a report to be evaluated; a determination unit for determining main check items, which are evaluation perspectives for evaluating the acquired report to be evaluated, according to the contents of the acquired report to be evaluated; an evaluation unit for evaluating the report to be evaluated based on the determined main check items; and an output unit for outputting evaluation results obtained by evaluating the report to be evaluated. 【0012】This disclosure provides a report evaluation method that enables the generation of flexible check items that take into account the content of the created report. 【0013】Figure 1 is a flowchart showing the general processing flow executed by the report evaluation system according to the embodiment. Figure 2A is an image of the pre-configuration (S1) shown in Figure 1. Figure 2B is an image of the report review (S2) shown in Figure 1. Figure 2C is an image of the annotations on the difficulty and effect of implementing the check items shown in Figure 1, and the recommendation (S3). Figure 2D is an image of the execution of additional user confirmation (S4) shown in Figure 1. Figure 3 is a block diagram showing the configuration of the report evaluation system according to the embodiment. Figure 4 is a flowchart showing the operation of the report evaluation system according to the embodiment. Figure 5 is a flowchart showing an example of detailed operation in step S20 shown in Figure 4. Figure 6A is a diagram showing an example of a pre-created list of main check items. Figure 6B is an example of information related to past evaluation results included in past report information. Figure 7 is a flowchart showing an example of detailed operation in step S30 shown in Figure 4. 
Figure 8A is a diagram showing an example of a report to be evaluated. Figure 8B is a diagram for explaining the specific implementation method of step S302 in Figure 7. Figure 8C is a diagram illustrating the specific implementation method of step S303 in Figure 7. Figure 9A is a diagram illustrating an example of an evaluation target report. Figure 9B is a diagram illustrating the specific implementation method of step S302 in Figure 7. Figure 9C is a diagram illustrating the specific implementation method of step S303 in Figure 7. Figure 10 is a diagram illustrating the specific implementation method of step S305 in Figure 7. Figure 11 is a diagram illustrating an example of when the check item subdivision unit creates a prompt using external information. Figure 12 is a diagram illustrating the operation when the check item subdivision unit modifies the subdivided check items based on user feedback. Figure 13 is a flowchart illustrating another detailed operation in step S30 shown in Figure 4. Figure 14A is a diagram illustrating an example of a subdivided check item determined by the check item subdivision unit. Figure 14B is a diagram illustrating the specific implementation method of steps S311 to S313 in Figure 13. Figure 15 is a diagram illustrating an example of when each of the multiple subdivided check items is assigned to each survey item group according to the assigned label. Figure 16 is a flowchart showing the detailed operation in step S40 shown in Figure 4. Figure 17 is a diagram illustrating the specific implementation method of step S407 in Figure 16. Figure 18 is a flowchart showing another detailed operation in step S40 shown in Figure 4. Figure 19 is a diagram illustrating the specific implementation method when the check item completion judgment unit assigns labels to the entire group of subdivided check items or to check item units.
Figure 20 is a diagram showing an example of the satisfaction level of each survey item group calculated by the satisfaction level judgment unit. Figure 21 is a flowchart showing the detailed operation in step S50 shown in Figure 4. Figure 22 is a diagram illustrating the specific implementation method of steps S501 to S504 in Figure 21. Figure 23 is a diagram illustrating the processing flow when creating a baseline in advance. Figure 24 is a flowchart showing the detailed operation in step S70 shown in Figure 4. Figure 25 is a diagram showing an example of rule setting. Figure 26 is a diagram illustrating the specific implementation method of steps S701 to S706 in Figure 24. Figure 27 shows an example of the necessary information presented to the user. 【0014】 (Knowledge forming the basis of the present invention) Conventionally, techniques for evaluating documents have been proposed (see, for example, Non-Patent Document 1). Non-Patent Document 1 discloses a device that uses a large language model (LLM) to generate check items based on evaluation criteria determined by a person, and uses the generated check items to evaluate a document. In the device described in Non-Patent Document 1, the check items are filtered by a person before the document is evaluated. 【0015】 Furthermore, a technique has been proposed to verify consistency in a document by examining the differences between its global (overall) perspective and its local (partial) content in relation to evaluation criteria (see, for example, Patent Document 1). 【0016】 However, the technology described in Non-Patent Document 1 aims to evaluate the extent to which a document meets the check items, but the check items are not generated based on the content of the document. Instead, a large number of check items are generated by the LLM, and manual filtering is performed on them.
Furthermore, with the technology described in Non-Patent Document 1, manual filtering may be required each time an evaluation is performed, which is not very effective in practical terms. 【0017】 Furthermore, the technology described in Patent Document 1 performs evaluations based on evaluation criteria applicable only to confirming the consistency between the partial and overall content of the document being evaluated, making it difficult to set flexible evaluation criteria. For this reason, it may not be fully useful when it is necessary to check the content of a document from various perspectives. Moreover, in the technology described in Patent Document 1, if the definition of the set evaluation criteria is ambiguous, the content to be checked from the document being evaluated may not be appropriately selected, making accurate evaluation difficult. 【0018】 In other words, the technologies described in Non-Patent Document 1 and Patent Document 1 make it difficult to generate flexible checklist items that take into account the content of the generated report. 【0019】 Furthermore, in the technology described in Non-Patent Document 1, if the points to be evaluated in accordance with the context of the document are not detailed, the generated checklist items may become general in nature, making it impossible to perform a professional evaluation that is in line with the document. 【0020】 To that end, the inventors diligently considered whether there was a way to evaluate the reports that were being created appropriately. As a result, they came up with the idea that by determining the checklist items, which are the evaluation criteria for evaluating the reports, according to the content of the reports, it would be possible to achieve an evaluation that takes the content of the reports into consideration. 
【0021】More specifically, as a means of solving the above problem for generating check items that take into account the contents of the created report, the report evaluation method according to the first embodiment is a report evaluation method executed by a report evaluation system that evaluates the contents of a report, and includes an acquisition step of acquiring a report to be evaluated, a determination step of determining main check items, which are evaluation perspectives for evaluating the acquired report to be evaluated, according to the contents of the acquired report to be evaluated, an evaluation step of evaluating the report to be evaluated based on the determined main check items, and an output step of outputting the evaluation results obtained by evaluating the report to be evaluated. 【0022】 As a result, the report evaluation method determines the main check items, which are the evaluation criteria for evaluating the report being evaluated, according to the content of the report being evaluated. Therefore, it is possible to flexibly generate main check items that are appropriate to the content of the report being evaluated. Consequently, the report evaluation method can perform an evaluation that is suitable for the created report. 【0023】 Furthermore, the report evaluation method according to the second embodiment is a report evaluation method according to the first embodiment, in which, in the decision step, key points are extracted from the contents of the report to be evaluated, and from among a plurality of pre-prepared candidates for main check items, candidates for main check items suitable for evaluating the report to be evaluated obtained based on the extracted key points are extracted, and the extracted candidates for main check items are determined to be the main check items. 
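As a non-limiting illustration, the decision step of the second embodiment can be sketched in Python as follows. The sketch assumes that key points are plain keywords and that each pre-prepared candidate carries a set of associated keywords. The disclosure does not specify how key points are extracted or matched (an LLM could equally be used), so `Candidate`, `extract_key_points`, `select_main_check_items`, and the `min_overlap` parameter are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A pre-prepared candidate main check item (hypothetical structure)."""
    name: str
    keywords: frozenset


def extract_key_points(report_text: str, vocabulary: set) -> set:
    """Stand-in key-point extraction: keep vocabulary terms found in the report."""
    lowered = report_text.lower()
    return {term for term in vocabulary if term in lowered}


def select_main_check_items(key_points: set, candidates: list, min_overlap: int = 1) -> list:
    """Return the names of candidates that share at least min_overlap key points."""
    return [
        c.name
        for c in candidates
        if len(c.keywords & key_points) >= min_overlap
    ]


# Illustrative data only; not taken from the disclosure.
candidates = [
    Candidate("Malware analysis", frozenset({"malware", "hash"})),
    Candidate("Network intrusion", frozenset({"firewall", "port scan"})),
]
vocabulary = set().union(*(c.keywords for c in candidates))
report = "The alert was raised after a port scan against the firewall."

key_points = extract_key_points(report, vocabulary)
main_check_items = select_main_check_items(key_points, candidates)
```

Keeping extraction and selection as separate functions mirrors the two sub-operations named in the embodiment, so either one could be replaced by a model-based implementation without changing the other.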
【0024】 As a result, the report evaluation method determines the most suitable main check items from a selection of candidate main check items based on the key points of the report being evaluated, and then performs the evaluation of the report. This allows for the flexible generation of main check items that correspond to the content of the report being evaluated. Therefore, the report evaluation method can perform an evaluation that takes the content of the created report into greater consideration. 【0025】Furthermore, the report evaluation method according to the third embodiment is a report evaluation method according to the first embodiment, in which, in the decision step, the degree of similarity between the contents of past reports previously evaluated by the report evaluation system and the contents of the report to be evaluated is extracted, if the extracted similarity is equal to or greater than a predetermined threshold, the main check items used in the evaluation of the past reports are determined to be the main check items for evaluating the report to be evaluated, and if the extracted similarity is less than the predetermined threshold, one category is identified from among a plurality of categories based on the contents of the report to be evaluated, and the main check items used in the evaluation of other past reports classified into the identified one category are determined to be the main check items for evaluating the report to be evaluated. 【0026】 As a result, the report evaluation method identifies past reports that share commonalities with the report being evaluated, determines the main check items used in the evaluation of the identified past reports as the main check items for the report being evaluated, and then performs the evaluation of the report being evaluated. Therefore, it is possible to flexibly generate main check items that correspond to the content of the report being evaluated. 
Consequently, the report evaluation method can perform an evaluation that takes into account the content of the created report more carefully. 【0027】 Furthermore, as a means of solving the above-mentioned problems for conducting a professional evaluation in accordance with the document, the report evaluation method according to the fourth embodiment is a report evaluation method according to any of the first to third embodiments, wherein in the decision step, an approach is selected to subdivide the content of the decided main check item, and based on the acquired evaluation target report, the decided main check item, and the selected approach, subdivided check items are determined by subdividing the content of the decided main check item, and in the evaluation step, the evaluation target report is evaluated based on the determined subdivided check items, and the subdivided check items include the main check item and a plurality of individual check items that subdivide the content of the main check item. 【0028】This allows the report evaluation method to determine specific, detailed checklist items for the report being evaluated, enabling more detailed evaluation results and more effective feedback. In other words, the report evaluation method can provide a professional evaluation tailored to the content of the report being evaluated. 
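As a non-limiting illustration, the similarity-based decision of the third embodiment (paragraphs 【0025】 and 【0026】) can be sketched in Python as follows. The disclosure does not state how similarity is computed or how a category is identified, so this sketch assumes Jaccard similarity over word sets and a caller-supplied `classify` function. `PastReport`, `jaccard`, `decide_main_check_items`, and the default threshold of 0.5 are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PastReport:
    """A previously evaluated report (hypothetical structure)."""
    text: str
    category: str
    main_check_items: List[str]


def jaccard(a: str, b: str) -> float:
    """Assumed similarity measure: Jaccard index over lower-cased word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def decide_main_check_items(
    report: str,
    past_reports: List[PastReport],
    classify: Callable[[str], str],
    threshold: float = 0.5,
) -> List[str]:
    """Reuse items from a similar past report, else fall back to the category."""
    best = max(past_reports, key=lambda p: jaccard(report, p.text), default=None)
    if best is not None and jaccard(report, best.text) >= threshold:
        # Similarity is at or above the threshold: reuse that report's items.
        return list(best.main_check_items)
    # Otherwise identify one category and collect items used for that category.
    category = classify(report)
    items: List[str] = []
    for past in past_reports:
        if past.category == category:
            for item in past.main_check_items:
                if item not in items:
                    items.append(item)
    return items


# Illustrative data only; not taken from the disclosure.
past = [
    PastReport("phishing email with malicious link", "phishing", ["Sender verification"]),
    PastReport("ransomware encrypted file server", "malware", ["Infection path"]),
]
similar = decide_main_check_items(
    "phishing email with malicious attachment", past, classify=lambda _: "phishing"
)
dissimilar = decide_main_check_items(
    "unknown binary dropped on workstation", past, classify=lambda _: "malware"
)
```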
【0029】 Furthermore, the report evaluation method according to the fifth embodiment is a report evaluation method according to the fourth embodiment, in which, in the decision step: key points are extracted from the contents of the report to be evaluated; two or more candidate main check items suitable for evaluating the report to be evaluated are selected from a plurality of pre-prepared candidate main check items based on the extracted key points; each of the two or more selected candidates is determined as a main check item; first subdivided check items are determined by subdividing the content of a first main check item among the determined main check items; second subdivided check items are determined by subdividing the content of a second main check item among the determined main check items; and, if the first main check item or a first individual check item among the plurality of individual check items included in the first subdivided check items matches the second main check item or a second individual check item among the plurality of individual check items included in the second subdivided check items, the matching items are integrated. 【0030】 This allows the report evaluation method to consolidate related subdivided check items by merging the main or individual check items included in the first subdivided check items with the matching main or individual check items included in the second subdivided check items. Therefore, users can more easily review the determined subdivided check items, which reduces the burden on users.
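As a non-limiting illustration, the integration of matching check items in the fifth embodiment can be sketched in Python as follows. The disclosure does not define when two check items "match", so this sketch assumes a match means identical text after lower-casing and whitespace normalization. `SubdividedCheckItems`, `IntegratedItem`, `normalize`, and `integrate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubdividedCheckItems:
    """A main check item plus the individual check items that subdivide it."""
    main: str
    individual: List[str]


@dataclass
class IntegratedItem:
    """One check item after integration, with the main items it came from."""
    text: str
    sources: List[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Assumed matching rule: case-insensitive, whitespace-insensitive equality."""
    return " ".join(text.lower().split())


def integrate(first: SubdividedCheckItems, second: SubdividedCheckItems) -> List[IntegratedItem]:
    """Merge items that match across the two sets; keep first-seen order and wording."""
    merged: Dict[str, IntegratedItem] = {}
    for group in (first, second):
        for text in [group.main, *group.individual]:
            key = normalize(text)
            if key not in merged:
                merged[key] = IntegratedItem(text=text)
            merged[key].sources.append(group.main)
    return list(merged.values())


# Illustrative data only; not taken from the disclosure.
first = SubdividedCheckItems("Attack vector", ["Source IP identified", "Timeline documented"])
second = SubdividedCheckItems("Impact assessment", ["timeline  documented", "Affected hosts listed"])
integrated = integrate(first, second)
```

Recording the originating main check items in `sources` keeps the merged item traceable to both hierarchies, which is one way to let a user review a single consolidated list.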
【0031】 Furthermore, the report evaluation method according to the sixth embodiment is a report evaluation method according to the fourth or fifth embodiment, in which, in the evaluation step, the acquired report to be evaluated and the determined subdivided check items are input into an LLM (large language model) to evaluate the report to be evaluated. 【0032】 This allows the report evaluation method to evaluate the report by inputting the report to be evaluated and its subdivided check items into the LLM, so that the evaluation takes the content of the report to be evaluated into account. 【0033】 Furthermore, the report evaluation method according to the seventh embodiment is a report evaluation method according to the sixth embodiment, in which, in the evaluation step, external information stored in an external database is input into the LLM to evaluate the report to be evaluated. 【0034】 By inputting external information into the LLM, the report evaluation method can perform an evaluation that more accurately reflects on-site conditions. 【0035】 Furthermore, the report evaluation method according to the eighth embodiment is a report evaluation method according to any of the fourth to seventh embodiments, further including a check item output step of outputting the determined subdivided check items and an input reception step of receiving feedback that points out evaluation perspectives missing from the output subdivided check items, in which, in the determination step, the determined subdivided check items are modified based on the received feedback. 【0036】 As a result, the report evaluation method modifies the subdivided check items based on user feedback, allowing more appropriate subdivided check items to be determined for evaluating the report to be evaluated. Therefore, the report evaluation method can obtain more detailed evaluation results and provide more effective feedback.
【0037】 Furthermore, the report evaluation method according to the ninth embodiment is a report evaluation method according to any of the fourth to eighth embodiments, wherein in the evaluation step, the report to be evaluated is evaluated by calculating a degree of sufficiency that indicates to what extent the contents of the report to be evaluated satisfy the main check item and the multiple individual check items among the determined subdivided check items. 【0038】 This allows the report evaluation method to quantitatively evaluate the report to be evaluated based on the calculated degree of sufficiency. 【0039】 Furthermore, the report evaluation method according to the tenth embodiment is a report evaluation method according to the ninth embodiment, in which, in the evaluation step, the report to be evaluated is evaluated based on whether or not the evaluation value for each evaluation criterion obtained from the calculated degree of sufficiency exceeds a predetermined standard value. 【0040】 This allows the report evaluation method to identify areas where the current content of the report to be evaluated is insufficient, based on the evaluation values for each evaluation criterion. 【0041】 Furthermore, the report evaluation method according to the eleventh embodiment is the report evaluation method according to the tenth embodiment, wherein the predetermined standard value is a value based on the degree of sufficiency of past reports that have been evaluated in the past by the report evaluation system. 【0042】 This allows the report evaluation method to determine whether the content of the current report to be evaluated is sufficient by comparing it with past results. 【0043】 Further, in the report evaluation method according to the twelfth aspect, the report to be evaluated is a report in which information related to security is described.
In the determination step, an attack label is assigned when the content of a determined subdivided check item indicates an attack on the security, and an over-detection label is assigned when the content of the subdivided check item indicates over-detection in the security. In the evaluation step, the sufficiency of the subdivided check items to which the attack label is assigned and the sufficiency of the subdivided check items to which the over-detection label is assigned are calculated, and it is evaluated whether the security state concluded in the report to be evaluated is appropriate. This report evaluation method is a report evaluation method according to any one of the ninth to eleventh aspects. 【0044】 Thus, the report evaluation method calculates the sufficiency of the subdivided check items to which the attack label is assigned and the sufficiency of the subdivided check items to which the over-detection label is assigned. By checking the sufficiency calculated in the evaluation step, the report evaluation method can therefore determine whether there is a high possibility of an attack on the security system or a high possibility of over-detection by the security system. As a result, the report evaluation method can provide feedback to the user on whether the security state concluded by the user is appropriate. 【0045】 Further, the report evaluation method according to the thirteenth aspect is a report evaluation method according to any one of the first to twelfth aspects, in which, in the output step, feedback information for prompting correction of the report to be evaluated is created based on the evaluation result, and the feedback information is output. 【0046】 Thus, the report evaluation method can show the user the order of evaluation perspectives in which the description content of the report to be evaluated can be corrected efficiently.
Therefore, the report evaluation method can provide a user who has limited working time with more effective feedback for improving the report to be evaluated, and can more thoroughly support the improvement of the report to be evaluated. 【0047】 Also, the program according to the 14th aspect is a program for causing a computer to execute the report evaluation method according to any one of the 1st to 13th aspects. 【0048】 As a result, the program exhibits the same operational effects as the above-described report evaluation method. 【0049】 Also, the report evaluation system according to the 15th aspect is a report evaluation system that evaluates the description content of a report, and includes: an acquisition unit that acquires an evaluation target report to be evaluated; a determination unit that determines a main check item, which is an evaluation perspective for evaluating the acquired evaluation target report, according to the description content of the acquired evaluation target report; an evaluation unit that evaluates the evaluation target report based on the determined main check item; and an output unit that outputs an evaluation result obtained by evaluating the evaluation target report. 【0050】 As a result, since the report evaluation system determines the main check item, which is an evaluation perspective for evaluating the evaluation target report, according to the description content of the evaluation target report, it can flexibly generate the main check item according to the description content of the evaluation target report. Therefore, the report evaluation system can perform an evaluation suitable for the created report. 【0051】 These general or specific aspects may be realized by a system, a method, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or may be realized by any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.
Further, the recording medium may be a non-transitory recording medium. 【0052】 (Embodiments) Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. The embodiments described below are all specific examples of the present disclosure. The numerical values, components, arrangement and connection configurations of components, steps, order of steps, and display examples shown in the following embodiments are examples only and are not intended to limit the present disclosure. In addition, the figures are not necessarily exact illustrations. In each figure, substantially identical components are denoted by the same reference numerals, and redundant explanations are omitted or simplified. 【0053】 Furthermore, among the components in the following embodiments, those components not described in an independent claim will be described as optional components. 【0054】 First, the report evaluation system according to this embodiment will be described with reference to Figure 1. Figure 1 is a flowchart showing the general processing flow performed by the report evaluation system according to this embodiment. The report evaluation system according to this embodiment is a system that analyzes reports created by users and provides feedback to improve the content of the reports. For example, the report evaluation system according to this embodiment targets analysis reports (in other words, security reports) on incidents or alerts submitted by security analysts (an example of a user). 【0055】 As shown in Figure 1, the operation of the report evaluation system includes the following steps in this order: pre-configuration (S1), report review (S2), annotation of the difficulty and effect of implementing the check items together with recommendation (S3), and execution of additional user confirmation (S4). Each step will be explained below.
【0056】 First, in the pre-configuration (S1), the required level of security reports (hereinafter sometimes simply referred to as "reports") for each site is set, and evaluation criteria are created. Evaluation criteria are the points for evaluating the report, such as whether the technical content described in the security report is consistent and whether the analysis items that the security report should satisfy are met. Figure 2A is an illustrative diagram of the pre-configuration (S1) shown in Figure 1. 【0057】 As shown in Figure 2A, the required report level is set for each site. Specifically, the required level is set to "high" at Factory A and to "low" at Factory B. These required report levels for each site are managed, for example, as security level information. Note that in Figure 2A, the required report level is simply labeled as "level". 【0058】 Furthermore, the report evaluation system creates evaluation criteria based on various information, using a large language model (LLM) or the like. Specifically, the report evaluation system creates evaluation criteria by inputting information such as security level information, base criteria, past inquiries from the field, and target field information into the LLM. Base criteria are criteria identified from prior interactions with the other party (user) before the report evaluation system was introduced, and are criteria that the report evaluation system should consider important when evaluating reports, regardless of the content of the report. Past inquiries from the field refer to the history of past inquiries from the field, such as requests for additional information in response to feedback given in the past. Target field information includes detailed information about the field that is the subject of the report, such as information on the type and number of devices installed at the site.
【0059】 The report evaluation system then performs a report review (S2) based on the evaluation criteria created in the pre-configuration (S1). 【0060】 Returning to Figure 1, in the report review (S2), check items are determined in a user-friendly structure such as a hierarchical structure, based on the evaluation criteria created in the pre-configuration (S1) and the content of the report, and these check items are then refined. Then, based on the refined check items, any missing information in the report is extracted (i.e., the report is reviewed). Figure 2B is an illustrative diagram of the report review (S2) shown in Figure 1. 【0061】 As shown in Figure 2B, the report evaluation system creates multiple check items using a large language model (LLM) or the like. The check items created in the report review (S2) are check items based on the evaluation perspectives created in the pre-configuration (S1), and are determined according to the content of the report. Furthermore, the multiple check items are edited to have a hierarchical structure such as Tree A. Among the multiple check items included in such a hierarchical structure, the lower-level check items are check items derived from the higher-level check items. Specifically, the lower-level check items are check items created to evaluate the report in more detail than the higher-level check items. Creating lower-level check items that are derived from higher-level check items in this way is called subdivision of check items. 【0062】 The report evaluation system then reviews the report based on the created check items. Specifically, it determines whether the current report content meets the requirements of each check item. 【0063】 Furthermore, during the report review (S2), the report evaluation system may receive feedback from the user on the multiple check items it has created. Based on the user's feedback, the report evaluation system revises the multiple check items it has created.
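As a non-limiting illustration, the hierarchical check items and the review result described above can be sketched in Python as follows. The sketch assumes that the judgment of whether each check item is met has already been made elsewhere (for example, by an LLM) and is stored as a boolean, and it assumes one possible sufficiency measure: the fraction of check items in a tree that are met. `CheckItem`, `flatten`, `sufficiency`, and `missing_items` are hypothetical names, and the disclosure does not prescribe this formula.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class CheckItem:
    """A node in a check item tree; children are derived (subdivided) items."""
    text: str
    met: bool = False  # review result, assumed to be judged elsewhere
    children: List["CheckItem"] = field(default_factory=list)


def flatten(item: CheckItem) -> Iterator[CheckItem]:
    """Yield the item and all lower-level items, depth first."""
    yield item
    for child in item.children:
        yield from flatten(child)


def sufficiency(root: CheckItem) -> float:
    """Assumed measure: share of check items in the tree that the report meets."""
    items = list(flatten(root))
    return sum(1 for i in items if i.met) / len(items)


def missing_items(root: CheckItem) -> List[str]:
    """List the check items the current report does not yet meet."""
    return [i.text for i in flatten(root) if not i.met]


# Illustrative tree only; not taken from the disclosure.
tree = CheckItem(
    "Incident is fully analyzed",
    met=True,
    children=[
        CheckItem("Affected devices are listed", met=True),
        CheckItem(
            "Cause is identified",
            met=False,
            children=[CheckItem("Relevant logs are cited", met=False)],
        ),
    ],
)
score = sufficiency(tree)
gaps = missing_items(tree)
```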
【0064】Returning to the explanation of Figure 1, in the calculation of the difficulty and effect of implementing the check items and the recommendation (S3), the difficulty of implementing each check item is calculated from the type of additional information required to satisfy the check item and the effort that adding that information has required in the past, and the effect obtained by satisfying the check item is also calculated. Furthermore, in step S3, recommendations are made to the user based on the calculated difficulty and effect of the check items. Recommendations to the user include information such as how the report should be modified and in what order the modifications should be made. Figure 2C is an illustrative diagram of the calculation of the difficulty and effect of implementing the check items and the recommendation (S3) shown in Figure 1. 【0065】 As shown in Figure 2C, the report evaluation system uses a large language model (LLM) or the like to calculate the difficulty of implementing check items that the report does not currently meet, and calculates the effect that can be obtained by meeting those check items. Based on the calculated difficulty and effect, the report evaluation system then makes recommendations to the user (in the example in Figure 2C, the recommendations are in a tabular format). 【0066】 Returning to the explanation of Figure 1, in the execution of additional user confirmation (S4), the user revises the contents of the report based on the recommendations. In other words, the user performs tasks such as modifying the contents of the report or adding information to the report. The report evaluation system then records what the user has revised and displays the difference between the resulting level and the required level that should be met. 
Figure 2D is an illustrative diagram of the execution of additional user confirmation (S4) shown in Figure 1. 【0067】As shown in Figure 2D, the user performs tasks such as modifying the report content or adding information to the report based on the presented recommendations. The report evaluation system then re-evaluates whether the check items are met based on the modified and added content of the report. In the example in Figure 2D, the report evaluation system determines that the two lower-level check items included in Tree B are met as a result of the report modifications and additions. 【0068】 The report evaluation system then calculates and displays the difference between the current level and the required level based on the results obtained by re-evaluating the report's contents. Steps S2 to S4 described above are repeated until the report's contents reach the required level. For example, as shown in Figure 2D, if the revised report meets the check items shown in Tree B and thereby meets the required level of Factory B, no further revision of the report is necessary. However, if the report must meet the required level of Factory A, further revision of the report is necessary. 【0069】 In this way, the report evaluation system can flexibly generate main check items according to the content of the report being evaluated. Therefore, the report evaluation system can perform an evaluation appropriate to the created report and assist the user in revising the report. 【0070】 [Configuration] The configuration of the report evaluation system according to this embodiment will be described below. Figure 3 is a block diagram showing the configuration of the report evaluation system 1 according to this embodiment. 
【0071】As shown in Figure 3, the report evaluation system 1 provides feedback on the report based on external information including the evaluation target report information M1, past report information M2, context information M3, main check item list M4, subdivided knowledge M5, and evaluation level information M6. Note that the external information such as the evaluation target report information M1 may be stored in one device (or one server) or in multiple devices (or multiple servers). Furthermore, the report evaluation system 1 itself may store the external information such as the evaluation target report information M1. 【0072】 First, the evaluation target report information M1 includes information about the evaluation target report (specifically, the report whose contents are evaluated by the report evaluation system 1). The evaluation target report information M1 includes, for example, the evaluation target report itself and information related to the evaluation target report, such as who the report was intended for and at what stage it was created. 【0073】 Past report information M2 includes information about the evaluated reports whose content has been evaluated in the past by the report evaluation system 1. Past report information M2 includes, for example, the evaluated reports themselves and information related to past evaluation results, such as the evaluation result of each evaluated report and the main check items used in past evaluations. In this specification, the main check item refers to the top-level check item in each tree among the multiple check items shown in Figure 2B. 【0074】 Context information M3 includes, for example, inquiry information from the field, the communication history of terminals installed at the field, and information such as the required level of the evaluation target report (i.e., security level information) set in the pre-configuration (S1). 
【0075】 The main check item list M4 is a list of main check items, which are evaluation criteria used to evaluate the evaluation target report. The main check item list M4 corresponds to, for example, knowledge created in advance by a user. 【0076】The subdivided knowledge M5 is knowledge that organizes information for subdividing the main check items determined by the determination unit 20 of the report evaluation system 1 (described later) according to the content of the report. In this specification, subdivision means making the determined main check items more detailed and hierarchically structuring, under the main check items, the multiple check items obtained through this detailing. In this specification, hierarchical structuring may also be simply referred to as "structuring". 【0077】 The evaluation level information M6 includes information about the baseline, which is a set of criteria used to determine whether the content of the evaluation target report is sufficient. A detailed explanation of the baseline will be provided later. 【0078】 Furthermore, the report evaluation system 1 includes an acquisition unit 10, a determination unit 20, an evaluation unit 30, an output unit 40, and a processing report storage unit 50. 【0079】 The acquisition unit 10 is a processing unit that acquires external information such as the evaluation target report information M1, and is implemented, for example, by a communication module. 【0080】 The determination unit 20 is a processing unit that determines the main check items, which are evaluation criteria for evaluating the evaluation target report acquired by the acquisition unit 10, according to the contents of that evaluation target report. The determination unit 20 is implemented by, for example, a microcomputer, but may also be implemented by a processor. 
The function of the determination unit 20 is realized, for example, by the microcomputer or processor that constitutes the determination unit 20 executing a computer program stored in memory or the like. 【0081】 The determination unit 20 includes a check item determination unit 21, a check item subdivision unit 22, a check item output unit 23, and an input reception unit 24. 【0082】The check item determination unit 21 determines the main check items, which are evaluation criteria for evaluating the evaluation target report acquired by the acquisition unit 10, according to the contents of the evaluation target report acquired by the acquisition unit 10. Specifically, the check item determination unit 21 determines the main check items based on the evaluation target report information M1, context information M3, and the main check item list M4, etc. 【0083】 The check item subdivision unit 22 subdivides the main check items determined by the check item determination unit 21 according to the contents of the evaluation target report acquired by the acquisition unit 10, and determines subdivided check items. Specifically, the check item subdivision unit 22 subdivides the determined main check items based on past report information M2 and subdivided knowledge M5, etc., and determines subdivided check items. Multiple check items obtained by subdividing the contents of the main check items are referred to as multiple individual check items. Furthermore, a subdivided check item is a check item that includes the main check item and multiple individual check items. In addition, in this specification, if it is not necessary to distinguish between the main check item and multiple individual check items, they may simply be referred to as "check item". 【0084】 Furthermore, the check item subdivision unit 22 may modify the determined subdivided check items based on the feedback FB1 received by the input reception unit 24. 
【0085】 The check item output unit 23 outputs the determined subdivided check items as subdivided check item information T1 to an external terminal (for example, a terminal used by the user) via a communication interface (communication circuit) not shown. The external terminal, for example, displays the acquired subdivided check item information T1 on a display and presents it to the user. The user may create feedback FB1 about the content of the subdivided check items based on the presented subdivided check item information T1 and input the created feedback FB1 into the report evaluation system 1. Feedback FB1 is, for example, feedback that points out evaluation perspectives that are missing from the output subdivided check items. 【0086】 The input reception unit 24 receives feedback FB1 input by the user via a communication interface (communication circuit) not shown. 【0087】 The evaluation unit 30 evaluates the evaluation target report using the subdivided check items determined by the check item subdivision unit 22. The evaluation unit 30 is implemented by, for example, a microcomputer, but may also be implemented by a processor. The functions of the evaluation unit 30 are realized, for example, by the microcomputer or processor constituting the evaluation unit 30 executing a computer program stored in memory or the like. 【0088】 The evaluation unit 30 includes a check item completion determination unit 31, a sufficiency determination unit 32, and a revised report acquisition unit 33. 【0089】 The check item completion determination unit 31 determines whether the contents of the evaluation target report acquired by the acquisition unit 10 satisfy the subdivided check items determined by the determination unit 20, for each of the multiple check items included in the subdivided check items (specifically, for each main check item and each of the multiple individual check items). 
The check item completion determination unit 31 may also determine whether the contents of the revised report R1A acquired by the revised report acquisition unit 33 (described later) satisfy the subdivided check items determined by the determination unit 20, for each of the multiple check items included in the subdivided check items. 【0090】 The sufficiency determination unit 32 calculates a degree of sufficiency, based on the determination result of the check item completion determination unit 31, indicating the extent to which the contents of the evaluation target report satisfy the main check item and the multiple individual check items among the determined subdivided check items. Based on the calculated degree of sufficiency, the sufficiency determination unit 32 generates an evaluation result for the evaluation target report. 【0091】The revised report acquisition unit 33 acquires the revised report R1A, which is the evaluation target report revised by the user, via a communication interface (communication circuit) not shown. When acquiring the revised report R1A, the revised report acquisition unit 33 reads the evaluation target report before revision stored in the processing report storage unit 50. The revised report acquisition unit 33 outputs the acquired revised report R1A and the read evaluation target report before revision to the check item completion determination unit 31. 【0092】 The output unit 40 is a processing unit that outputs the evaluation result generated by the sufficiency determination unit 32 to an external terminal (for example, a terminal used by a user) via a communication interface (communication circuit) not shown. The output unit 40 is implemented by, for example, a microcomputer, but may also be implemented by a processor. The function of the output unit 40 is realized, for example, by the microcomputer or processor constituting the output unit 40 executing a computer program stored in memory or the like. 
【0093】 The output unit 40 includes a recommendation unit 41, a screen information generation unit 42, and a post-feedback creation unit 43. 【0094】 The recommendation unit 41 determines whether or not to create a recommendation for the user based on the evaluation results generated by the sufficiency determination unit 32 and the evaluation level information M6. Specifically, the recommendation unit 41 makes this determination based on whether or not the generated evaluation results exceed the baseline. For example, if the generated evaluation results exceed the baseline, the recommendation unit 41 does not create a recommendation for the user. If the generated evaluation results do not exceed the baseline, the recommendation unit 41 creates a recommendation for the user based on the generated evaluation results. In order to create a recommendation for the user, the recommendation unit 41 calculates the difficulty of implementing the check items that are determined not to be satisfied in the current evaluation target report and calculates the effect that can be obtained by satisfying those check items. 【0095】The screen information generation unit 42 generates screen information corresponding to the recommendations created by the recommendation unit 41. The screen information generation unit 42 outputs the generated screen information to an external terminal (for example, a terminal used by the user) via a communication interface (communication circuit) not shown. The external terminal presents the user with the necessary information T2 by, for example, displaying the acquired screen information on its display. Based on the presented necessary information T2, the user modifies the content of the evaluation target report, or adds content to it, to create a revised report R1A, and inputs the created revised report R1A into the report evaluation system 1. 
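The baseline check and the use of difficulty and effect by the recommendation unit 41 can be sketched as follows. This is a minimal sketch under stated assumptions: the disclosure does not specify how difficulty and effect are scored or combined, so the numeric scores and the effect-to-difficulty ratio used for ordering are hypothetical, as are the names `UnmetItem` and `recommend`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UnmetItem:
    text: str
    difficulty: float  # hypothetical score; higher means more effort to satisfy
    effect: float      # hypothetical score; higher means more benefit when satisfied


def recommend(score: float, baseline: float,
              unmet: List[UnmetItem]) -> Optional[List[UnmetItem]]:
    """Return None when the score exceeds the baseline (no recommendation),
    otherwise the unmet items in a suggested order of revision."""
    if score > baseline:
        return None
    # Assumed ordering rule: largest effect per unit of difficulty first.
    return sorted(
        unmet,
        key=lambda item: item.effect / max(item.difficulty, 1e-9),
        reverse=True,
    )


if __name__ == "__main__":
    items = [
        UnmetItem("item a", difficulty=2.0, effect=2.0),
        UnmetItem("item b", difficulty=1.0, effect=3.0),
        UnmetItem("item c", difficulty=4.0, effect=2.0),
    ]
    for item in recommend(0.5, 0.8, items) or []:
        print(item.text, item.difficulty, item.effect)
```

The ordered list corresponds to the information about what to modify and in what order; the screen information generation unit 42 would render it, for example, as a table.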
【0096】 Furthermore, when the screen information generation unit 42 obtains a recommendation created by the recommendation unit 41, it stores the evaluation target report, which is the subject of the created recommendation, in the processing report storage unit 50. 【0097】 The post-feedback creation unit 43 creates a feedback report FB2 based on the evaluation results generated by the sufficiency determination unit 32 when the recommendation unit 41 does not create a recommendation for the user. The post-feedback creation unit 43 outputs the created feedback report FB2 to an external terminal (for example, a terminal used by the user) via a communication interface (communication circuit) not shown. 【0098】 The processing report storage unit 50 is a storage device that temporarily stores the evaluation target reports that are the subject of the recommendations created by the recommendation unit 41. The processing report storage unit 50 is implemented, for example, by a semiconductor memory. 【0099】 [Operation] Next, the operation of the report evaluation system 1 according to this embodiment will be described. Figure 4 is a flowchart showing the operation of the report evaluation system 1 according to this embodiment. 【0100】 First, the acquisition unit 10 acquires the evaluation target report that is subject to evaluation (S10). 【0101】The check item determination unit 21 of the determination unit 20 determines the main check items, which are evaluation criteria for evaluating the evaluation target report obtained in step S10, according to the contents of that evaluation target report (S20). 【0102】 The check item subdivision unit 22 of the determination unit 20 performs detailing and structuring of the main check items determined in step S20 (i.e., subdivision of the main check items) (S30). 
【0103】 The check item completion determination unit 31 of the evaluation unit 30 confirms the completed check items among the subdivided check items determined in step S30, based on the contents of the evaluation target report obtained in step S10 (S40). Specifically, the check item completion determination unit 31 determines, for each of the multiple check items included in the subdivided check items determined in step S30, whether the contents of the evaluation target report obtained in step S10 satisfy that check item. Note that the confirmation of completed check items may include the calculation of the degree of sufficiency by the sufficiency determination unit 32 of the evaluation unit 30. 【0104】 The evaluation unit 30 calculates an evaluation value for the evaluation target report based on the determination result obtained in the confirmation in step S40 (S50). 【0105】 The recommendation unit 41 of the output unit 40 determines whether the content of the entire evaluation target report is sufficient based on the evaluation value calculated in step S50 (S60). 【0106】 If the recommendation unit 41 of the output unit 40 determines that the content of the entire evaluation target report is insufficient (No in step S60), the output unit 40 creates and outputs information for correction feedback (S70). Specifically, the recommendation unit 41 creates recommendations for the user, and the screen information generation unit 42 generates screen information corresponding to the created recommendations and outputs the generated screen information to an external terminal. 【0107】The user creates a revised report R1A based on the necessary information T2 provided, and the revised report acquisition unit 33 of the evaluation unit 30 accepts the input of the created revised report R1A (S80). Then, the process returns to step S40 and the same process is repeated. 
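The loop through steps S40 to S80 can be sketched as follows. This is a sketch under stated assumptions, not the method of the disclosure: the degree of sufficiency is assumed here to be the fraction of check items that are met, the `judge` and `revise` callbacks stand in for the LLM judgment and for the user's revision, and `max_rounds` is a hypothetical safety cap that the flowchart does not contain.

```python
from typing import Callable, List, Tuple

Judge = Callable[[str, str], bool]          # (report, check item) -> satisfied?
Reviser = Callable[[str, List[str]], str]   # (report, unmet items) -> revised report


def sufficiency(report: str, items: List[str], judge: Judge) -> Tuple[float, List[str]]:
    """S40 and S50: judge every check item and return (score, unmet items).

    The score is assumed to be the fraction of check items that are met.
    """
    if not items:
        raise ValueError("at least one check item is required")
    unmet = [item for item in items if not judge(report, item)]
    return (len(items) - len(unmet)) / len(items), unmet


def evaluate_until_sufficient(report: str, items: List[str], judge: Judge,
                              revise: Reviser, baseline: float,
                              max_rounds: int = 10) -> Tuple[str, float]:
    """Repeat S40 to S80 until the score exceeds the baseline (Yes in S60)."""
    score, unmet = sufficiency(report, items, judge)
    rounds = 0
    while score <= baseline and rounds < max_rounds:   # No in S60
        report = revise(report, unmet)                 # S70 and S80
        score, unmet = sufficiency(report, items, judge)
        rounds += 1
    return report, score


if __name__ == "__main__":
    # Toy stand-ins: an item is "met" when its text appears in the report.
    toy_judge = lambda report, item: item in report
    toy_revise = lambda report, unmet: report + " " + unmet[0]
    print(evaluate_until_sufficient("alpha", ["alpha", "beta", "gamma"],
                                    toy_judge, toy_revise, baseline=0.9))
```

In the actual system the revision is made by the user on the basis of the presented recommendations, so `revise` would correspond to waiting for the revised report R1A.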
【0108】 On the other hand, if the recommendation unit 41 of the output unit 40 determines that the content of the entire evaluation target report is sufficient (Yes in step S60), the output unit 40 creates and outputs information for final feedback (S90). Specifically, the post-feedback creation unit 43 creates a feedback report FB2 based on the evaluation results obtained in step S50, and outputs the created feedback report FB2 to an external terminal. 【0109】 Of the steps shown in Figure 4, step S10 corresponds to the acquisition step, steps S20 and S30 correspond to the decision step, steps S40 and S50 correspond to the evaluation step, steps S60, S70 and S90 correspond to the output step, and step S80 corresponds to the input reception step. 【0110】 The specific implementation methods for each step shown in Figure 4 will be explained below. 【0111】 [Implementation Method for Determining the Main Check Items] The following describes the specific implementation method for step S20 in Figure 4. First, we will explain the detailed operation in step S20 in Figure 4. Figure 5 is a flowchart showing an example of the detailed operation in step S20 shown in Figure 4. Figure 5 can also be considered a flowchart for determining the evaluation criteria shown in Figure 2A. 【0112】 First, the check item determination unit 21 of the determination unit 20 summarizes the evaluation target report obtained in step S10 of Figure 4 and extracts the key points of the evaluation target report (S201). The check item determination unit 21 summarizes the evaluation target report and extracts its key points by, for example, using an LLM or the like. In this specification, the key points of the evaluation target report mean the purpose or intent for which the evaluation target report was created. 
【0113】 The check item determination unit 21 extracts the similarity between past reports and the report to be evaluated using RAG (Retrieval-Augmented Generation) or the like (S202). In step S202, the check item determination unit 21 may use past reports and the report to be evaluated, or it may use the key points of past reports and the key points of the report to be evaluated. In this specification, similarity is an index that indicates the closeness between the content of past reports and the content of the report to be evaluated, or the closeness between the key points of past reports and the key points of the report to be evaluated. 【0114】 The check item determination unit 21 determines whether the similarity extracted in step S202 is equal to or above a predetermined threshold (S203). 【0115】 If the check item determination unit 21 determines that the similarity extracted in step S202 is equal to or above the predetermined threshold (Yes in step S203), it determines the main check items used in the evaluation of the similar past reports as the main check items for the report to be evaluated (S204). 【0116】 On the other hand, if the check item determination unit 21 determines that the similarity extracted in step S202 is less than the predetermined threshold (No in step S203), it uses a machine learning model or the like to identify, from among multiple categories, the one category to which the report to be evaluated belongs (S205). In this specification, a category is a pre-set classification. The check item determination unit 21 identifies the category to which the report to be evaluated belongs based on the contents of the report to be evaluated. For example, the check item determination unit 21 identifies the category based on whether the purpose of creating the report to be evaluated is to submit the first report on incident response to a person on-site or to a person in the security department. 
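The branch in steps S202 to S205 can be sketched as follows. This is a minimal sketch with assumptions stated plainly: a token-overlap (Jaccard) score is swapped in for the retrieval-based similarity, which the disclosure does not specify in detail, and the names `PastReport`, `jaccard`, `choose_route`, and the `classify` callback (standing in for the machine learning model of step S205) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class PastReport:
    key_points: str
    main_check_items: List[str]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a simple stand-in for S202."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def choose_route(key_points: str, past: List[PastReport], threshold: float,
                 classify: Callable[[str], str]) -> Tuple[str, Union[List[str], str]]:
    """Return ("reuse", main check items) for S204,
    or ("category", category name) for S205."""
    best = max(past, key=lambda p: jaccard(key_points, p.key_points), default=None)
    if best is not None and jaccard(key_points, best.key_points) >= threshold:  # S203
        return "reuse", best.main_check_items                                    # S204
    return "category", classify(key_points)                                      # S205


if __name__ == "__main__":
    history = [
        PastReport("first report on incident response for on-site staff", ["item 1"]),
        PastReport("first report on incident response for security department", ["item 2"]),
    ]
    print(choose_route("first report on incident response for on-site staff",
                       history, 0.8, lambda text: "hypothetical category"))
```

When the "category" route is taken, the identified category is then used to look up the main check items of other past reports in that category.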
【0117】 The check item determination unit 21 determines, as the main check items for the report to be evaluated, the main check items used in the evaluation of other past reports that are classified into the category identified in step S205 (S206). 【0118】 In step S202, the past reports that are compared may be limited to reports that were previously judged to be good reports. A good report may be a past report that was evaluated by the report evaluation system 1 and received an evaluation of a certain value (for example, the required level) or higher, or it may be a report that the user has previously judged to have appropriate content. 【0119】 Furthermore, in steps S204 and S206, when the check item determination unit 21 determines the main check items, it refers to the main check item list M4 and the past report information M2. Figure 6A shows an example of a pre-created main check item list M4. Figure 6B shows an example of information related to past evaluation results included in the past report information M2. Specifically, the information shown in Figure 6B summarizes the main check items used in past evaluations. 【0120】 As shown in Figure 6A, the main check item list M4 contains information such as the item number (labeled as Item # in Figure 6A), check item, major category, and minor category. The item number is the sequential number of the main check item registered in the main check item list M4. 【0121】 The check item column shows candidates for the main check items (hereinafter also referred to as "candidate main check items"), which are evaluation criteria for evaluating the report being evaluated. Note that the user must prepare the candidate main check items in advance, as shown in Figure 6A. However, the user does not need to create the candidate main check items alone; for example, the user may use a machine learning model. 
For instance, the user may have an LLM prepared by few-shot learning output a draft of candidate main check items, and then create the candidate main check items by reviewing and revising the output draft. 【0122】 The major category and minor category columns show the classifications assigned to each of the candidate main check items. In other words, these columns show the categories assigned to the candidate main check items. 【0123】 As shown in Figure 6B, the past report information M2 contains information such as the report number (indicated as Report # in Figure 6B), summary, relevant check items, and detailed content. The report number is the sequential number of the past report registered in the past report information M2. 【0124】 The summary column shows the key points of each past report. The relevant check items column shows the numbers of the candidate main check items used for each past report. These numbers correspond to the item numbers shown in the main check item list M4 in Figure 6A. The detailed content column shows the category into which each past report is classified. 【0125】 In this way, the check item determination unit 21 identifies past reports similar to the report to be evaluated or identifies a category to which the report to be evaluated belongs, and determines the main check items by referring to the main check item list M4 and past report information M2. Therefore, by identifying past reports that have commonalities with the report to be evaluated, the check item determination unit 21 can determine the main check items suitable for evaluating the report from among multiple candidate main check items. 【0126】 The check item determination unit 21 may determine the main check items by referring only to the main check item list M4 without referring to the past report information M2. 
For example, the check item determination unit 21 may extract main check item candidates with categories suitable for the purpose of creating the report to be evaluated, based on the key points extracted in step S201, and determine the extracted main check item candidates as the main check items of the report to be evaluated. In this way, the check item determination unit 21 can determine the main check items suitable for evaluating the report from among multiple main check item candidates, based on the key points of the report to be evaluated. 【0127】Furthermore, in step S205, the check item determination unit 21 may, for example, use a classification model that has been fine-tuned using pre-set categories and past reports to identify one category to which the report to be evaluated belongs from among multiple categories. 【0128】 Furthermore, in the pre-setting of categories, not only users (people) but also machine learning models such as LLM may be involved. For example, a machine learning model may cluster past reports based on the key points of past reports, assign categories to each cluster, and users (people) may modify the clustering and categories as needed. Alternatively, categories may be set using a human-in-the-loop mechanism involving keyword extraction and topic modeling using LLM, and modifications by users (people). 【0129】 Furthermore, the pre-configured categories may have a hierarchical structure. For example, categories may be set up so that attack phases and attack technique types are higher-level categories, while attacks and false positives are lower-level categories. 【0130】 Furthermore, there may be an upper limit on the number of main check items determined in steps S204 and S206. For example, the main check items with the highest usage frequency among multiple main check items used in multiple past reports classified in the same category may be determined as the main check items for the report being evaluated. 
Alternatively, the importance of each of the multiple main check items used in multiple past reports classified in the same category may be calculated using TF-IDF (Term Frequency-Inverse Document Frequency), and characteristic main check items (i.e., main check items with high importance) may be determined as the main check items for the report being evaluated. 【0131】Thus, the check item determination unit 21 does not necessarily determine all of the main check items used in past reports as the main check items for the report under evaluation. Instead, it may determine only the main check items that are appropriate for evaluating the report under evaluation (in other words, important main check items) as the main check items for the report under evaluation. This allows the report evaluation system 1 to provide more effective feedback to users who have limited working time for improving the report under evaluation, and to support the improvement of the report more effectively. 【0132】 [Implementation Method for Detailing and Structuring the Main Check Items] The following describes the specific implementation method for step S30 in Figure 4. However, before describing the specific implementation method, we will explain the advantages that can be obtained by detailing and structuring the main check items. 【0133】 First, the pre-created candidate main check items (and therefore the main check items determined from them in step S20 of Figure 4) may include general and ambiguous content so that they can be used to evaluate various types of reports. Therefore, even if the report evaluation system 1 inputs the report to be evaluated and the main check items determined in step S20 into the LLM or the like, it may not be able to obtain evaluation results equivalent to those obtained when an expert evaluates the report to be evaluated. 
【0134】 Therefore, the report evaluation system 1 determines subdivided check items by refining and structuring the main check items determined in step S20 based on the contents of the report to be evaluated. These subdivided check items are specific to the report to be evaluated and clearly indicate the contents that should be considered in the evaluation of the report to be evaluated. As a result, the report evaluation system 1 can give appropriate instructions to the LLM, etc., thereby obtaining more detailed evaluation results and providing more effective feedback. 【0135】Next, we will explain the detailed operation in step S30 of Figure 4. Figure 7 is a flowchart showing the detailed operation in step S30 shown in Figure 4. Figure 7 can also be described as a flowchart for generating the hierarchical structure shown in Figure 2B for each main check item. 【0136】 First, the check item subdivision unit 22 of the determination unit 20 subdivides each main check item determined in step S20 of Figure 4 (S301). Specifically, the check item subdivision unit 22 selects one main check item to be subdivided. 【0137】 The check item subdivision unit 22 selects an approach for subdividing the content of the main check item selected in step S301 (S302). Specifically, the check item subdivision unit 22 selects an approach suitable for subdividing the content of the main check item from among multiple approaches that constitute the subdivision knowledge M5. The approach is set with a subdivision policy that matches the category of the main check item. 【0138】 Furthermore, as a concrete method for realizing step S302, a correspondence between specific main check items and specific approaches may be established in advance, so that the check item subdivision unit 22 can select an approach suitable for subdividing the content of the main check items. 
Alternatively, the check item subdivision unit 22 may use a classification model or the like to extract the category of the main check item and select an approach suitable for the extracted category. In addition, if no category has been assigned to a main check item created by the user, an LLM prepared by few-shot learning may be used to classify the category to which the main check item belongs before an approach is selected. 【0139】 The check item subdivision unit 22 refines and structures the main check item to match the content of the evaluation target report, based on the approach selected in step S302 (S303). In other words, the check item subdivision unit 22 determines a subdivided check item by subdividing the content of the main check item selected in step S301. The check item subdivision unit 22 determines the subdivided check item by using, for example, an LLM. 【0140】 The check item subdivision unit 22 checks whether subdivision has been performed for all main check items determined in step S20 (S304). If any main check item has not yet been subdivided, the check item subdivision unit 22 returns to step S301 and subdivides the next main check item. On the other hand, if subdivision has been performed for all main check items determined in step S20, the check item subdivision unit 22 proceeds to step S305. 【0141】 If the check item subdivision unit 22 obtains multiple subdivided check items through the processing in steps S301 to S304 above, it performs overall editing of the subdivided check items (S305). Overall editing of subdivided check items means merging (integrating) any overlapping check items among the check items included in the multiple subdivided check items. 【0142】 In the following, we will explain the specific implementation methods for steps S301 to S304 in Figure 7, referring to two specific examples. 
First, we will explain the first specific example. Figures 8A, 8B, and 8C are diagrams illustrating the first specific example. Figure 8A is a diagram illustrating an example of the evaluation target report R1. Figure 8B is a diagram illustrating the specific implementation method for step S302 in Figure 7. Figure 8C is a diagram illustrating the specific implementation method for step S303 in Figure 7. 【0143】Figure 8A shows an example of an evaluation report R1 acquired by the acquisition unit 10 of the report evaluation system 1. The evaluation report R1 includes, for example, a document created by the user (e.g., report content and judgment results), basic information of the evaluation report R1, and detailed information of the evaluation report R1. The basic information of the evaluation report R1 includes the date and time the evaluation report R1 was created and information about the terminal that was targeted. The detailed information of the evaluation report R1 includes information about the communication performed by the terminal targeted in the evaluation report R1, and information about the on-site operation of the terminal targeted in the evaluation report R1. 【0144】 In the first specific example, we will explain the case where, as shown in Figure 8B, the main check item, "If this is considered an operational event, is there sufficient evidence?", is determined to be the main check item for evaluating the evaluation target report R1 shown in Figure 8A. 【0145】 As shown in Figure 8B, the check item subdivision unit 22 first extracts the categories of the main check items. In this specific example, the check item subdivision unit 22 extracts the category "basis for hypothesis". Alternatively, the check item subdivision unit 22 may extract categories by extracting information from the major and minor classification columns of the main check item list M4 shown in Figure 6A. 
【0146】 Next, the check item subdivision unit 22 selects an approach corresponding to the extracted category. Each approach defines a policy for subdividing main check items of the corresponding category. In this specific example, the check item subdivision unit 22 selects "Approach to the basis of the hypothesis". 【0147】 Next, as shown in Figure 8C, the check item subdivision unit 22 creates a prompt to be input to the LLM. The prompt is an instruction that includes, for example, the task, approach, report, check items, and output example. The task field contains an instruction for the LLM to subdivide the content of the main check items. The approach selected by the check item subdivision unit 22 is entered in the approach field. The report to be evaluated, R1, is entered in the report field. The main check items to be subdivided are entered in the check item field. The output example field contains an example of the output format to be specified to the LLM. 【0148】 Note that the task and output example fields in the prompt may be created in advance. In that case, the check item subdivision unit 22 may create the prompt simply by entering the necessary information in the approach, report, and check item fields. 【0149】 The check item subdivision unit 22 then inputs the created prompt to the LLM and determines the output from the LLM as the subdivided check items. Figure 8C shows the subdivided check items in this specific example (i.e., the basis for the hypothesis). 【0150】 Next, we will explain the second specific example. Figures 9A, 9B, and 9C are diagrams illustrating the second specific example. Figure 9A is a diagram showing an example of the evaluation target report R1. Figure 9B is a diagram illustrating the specific implementation method of step S302 in Figure 7. Figure 9C is a diagram illustrating the specific implementation method of step S303 in Figure 7. 
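For reference, the prompt assembly used in both specific examples (Figures 8C and 9C) can be sketched roughly as follows. This is only an illustrative sketch: the function name, the bracketed field headings, and the default task and output example texts are assumptions made for illustration and are not taken from the figures. The call to the LLM itself is omitted.

```python
# Hypothetical defaults: per paragraph 0148, the task and output example
# fields may be created in advance. Their wording here is an assumption.
DEFAULT_TASK = (
    "Subdivide the main check item below into individual check items "
    "that are specific to the report."
)
DEFAULT_OUTPUT_EXAMPLE = (
    "- main check item\n"
    "  - individual check item 1\n"
    "  - individual check item 2"
)


def build_subdivision_prompt(approach, report, main_check_item,
                             task=DEFAULT_TASK,
                             output_example=DEFAULT_OUTPUT_EXAMPLE):
    """Assemble the five prompt fields into one text prompt.

    Only the approach, report, and check item fields are filled per call;
    the task and output example fields fall back to the prepared defaults.
    """
    fields = [
        ("Task", task),
        ("Approach", approach),
        ("Report", report),
        ("Check item", main_check_item),
        ("Output example", output_example),
    ]
    return "\n\n".join(f"[{name}]\n{body.strip()}" for name, body in fields)


prompt = build_subdivision_prompt(
    approach="Approach to the basis of the hypothesis",
    report="(text of the evaluation target report R1)",
    main_check_item="If this is considered an operational event, "
                    "is there sufficient evidence?",
)
```

Keeping the fixed fields as defaults mirrors the idea that only the approach, report, and check item change from one main check item to the next.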
【0151】 Figure 9A shows an example of an evaluation target report R1 acquired by the acquisition unit 10 of the report evaluation system 1. Note that the evaluation target report R1 shown in Figure 9A is the same report as the evaluation target report R1 shown in Figure 8A. In the second specific example, as shown in Figure 9B, we will explain a case where the main check item, "Is the role and name of the terminal clearly described?", is determined as the main check item for evaluating the evaluation target report R1 shown in Figure 9A. 【0152】 As shown in Figure 9B, the check item subdivision unit 22 extracts the category "clear statement of basic information". Next, the check item subdivision unit 22 selects the "approach to clear statement of basic information" that corresponds to the extracted category. 【0153】 Next, as shown in Figure 9C, the check item subdivision unit 22 creates prompts to be input to the LLM and inputs the created prompts to the LLM. Then, the check item subdivision unit 22 determines the output results from the LLM as the subdivided check items. Figure 9C shows the subdivided check items in this specific example (i.e., specifying basic information). 【0154】 Furthermore, the specific implementation method of step S305 in Figure 7 will be explained with reference to a specific example. Figure 10 is a diagram illustrating the specific implementation method of step S305 in Figure 7. In the explanation using Figure 10, an example is given in which the check item subdivision unit 22 determines two subdivided check items. 【0155】 As shown in Figure 10, the subdivided check items are check items with a hierarchical structure in which multiple individual check items branch off from the main check item determined by the check item determination unit 21. 
Specifically, the content of the first level from the top of a subdivided check item is the main check item, and the contents of the second level and below are the multiple individual check items. 【0156】 In this specific example, the contents enclosed by dashed lines in the two subdivided check items match. Specifically, an individual check item included in the subdivided check item shown above (hereinafter referred to as the first subdivided check item) matches the main check item included in the subdivided check item shown below (hereinafter referred to as the second subdivided check item). In such a case, the check item subdivision unit 22 merges (integrates) the matching individual check item of the first subdivided check item and the main check item of the second subdivided check item to create a single subdivided check item. More generally, when a first individual check item among the multiple individual check items included in the first subdivided check item, or the main check item of the first subdivided check item (hereinafter referred to as the first main check item), matches a second individual check item among the multiple individual check items included in the second subdivided check item, or the main check item of the second subdivided check item (hereinafter referred to as the second main check item), the check item subdivision unit 22 merges the first individual check item or first main check item with the second individual check item or second main check item to create a single subdivided check item. 【0157】 The check item subdivision unit 22 selects one check item from the main check item and multiple individual check items included in the first subdivided check item and one check item from the main check item and multiple individual check items included in the second subdivided check item, and determines whether their contents match. 
Specifically, the check item subdivision unit 22 determines whether their contents match by using the closeness of the vector embedding representations of the two selected check items and keyword matching between them. For example, the check item subdivision unit 22 uses a natural language processing model that generates vectors from text, such as Sentence-BERT, and determines whether the two check items match by referring to the cosine similarity of the generated vectors. Here, "matching" does not mean that the two check items are identical, but that they are similar (specifically, that the cosine similarity is above a predetermined value). 【0158】 In this way, the check item subdivision unit 22 can group related subdivided check items by merging the main check item or individual check items included in the first subdivided check item with the main check item or individual check items included in the second subdivided check item. As a result, the user can more easily check the subdivided check items determined by the check item subdivision unit 22, and the report evaluation system 1 can reduce the burden on the user. 【0159】 Furthermore, the check item subdivision unit 22 may subdivide the main check items using a method that references external information, such as RAG (retrieval-augmented generation). Figure 11 shows an example in which the check item subdivision unit 22 creates a prompt using external information. 【0160】 As shown in Figure 11, if external information (in this specific example, an alert confirmation procedure document) is specified within the selected approach, the check item subdivision unit 22 reads the specified external information and creates a prompt. The check item subdivision unit 22, for example, inputs the read external information into the confirmation procedure field. 
In this way, the check item subdivision unit 22 can determine more specific subdivided check items by utilizing external information when subdividing the main check items. 【0161】 Furthermore, the check item subdivision unit 22 may modify the determined subdivided check items based on user feedback FB1. Figure 12 is a diagram illustrating the operation when the check item subdivision unit 22 modifies the subdivided check items based on user feedback FB1. In the explanation using Figure 12, an example is given in which the check item subdivision unit 22 determines a subdivided check item (the subdivided check item located on the left) consisting of a main check item and two individual check items by executing the processes S301 to S304 in Figure 7. 【0162】 As shown in Figure 12, the user confirms the contents (output content) of the subdivided check item information T1 output by the check item output unit 23. Based on the output content, the user creates feedback FB1 and inputs it into the report evaluation system 1. The check item subdivision unit 22 modifies the determined subdivided check items based on the feedback FB1 received by the input reception unit 24. Specifically, the check item subdivision unit 22 creates a prompt into which the evaluation target report R1, the determined subdivided check items, and feedback FB1 are input, and modifies the determined subdivided check items by inputting the created prompt into the LLM. In the example in Figure 12, the area enclosed by the dashed line corresponds to the modified area, and two individual check items have been newly added. In this way, the check item subdivision unit 22 can determine subdivided check items that are more suitable for evaluating the evaluation target report R1 by modifying the subdivided check items based on the feedback FB1 from the user. 
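As a supplementary note on the merging in step S305, the similarity-based match test described in paragraph 0157 can be sketched as follows. This is a minimal sketch under stated assumptions: the embedding vectors are taken as given (in practice they would come from a sentence embedding model), the threshold of 0.8 is an arbitrary placeholder for the "predetermined value", the greedy grouping strategy and all function names are illustrative, and the keyword matching mentioned in the text is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_matching_items(embedded_items, threshold=0.8):
    """Greedily group check items whose embeddings are similar.

    embedded_items: list of (text, vector) pairs.
    Each item joins the first group whose representative vector (the
    vector of the group's first member) has cosine similarity >= threshold;
    otherwise it starts a new group. Groups with more than one member are
    candidates for merging into a single check item.
    """
    groups = []  # list of (representative_vector, [texts])
    for text, vector in embedded_items:
        for representative, texts in groups:
            if cosine_similarity(representative, vector) >= threshold:
                texts.append(text)
                break
        else:
            groups.append((vector, [text]))
    return [texts for _, texts in groups]


# Toy two-dimensional vectors stand in for real sentence embeddings.
items = [
    ("Is the basis for the hypothesis stated?", [1.0, 0.0]),
    ("Is evidence for the hypothesis given?", [0.9, 0.1]),
    ("Is the terminal name stated?", [0.0, 1.0]),
]
groups = group_matching_items(items)
```

With these toy vectors, the first two items fall into one group and the third stays on its own, which corresponds to "matching" in the sense of similarity above a threshold rather than exact equality.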
【0163】 Furthermore, the feedback FB1 only needs to point out evaluation perspectives that are missing from the determined subdivided check items. Specifically, the feedback FB1 may be feedback that instructs the addition of individual check items, or feedback that indicates whether the content of the determined subdivided check items is necessary. 【0164】 [Modified Method for Refining and Structuring the Main Check Items] Below, we will describe a modified method for the specific implementation of step S30 in Figure 4 described above. Figure 13 is a flowchart showing another detailed operation in step S30 shown in Figure 4. The flowchart shown in Figure 13 differs from the flowchart shown in Figure 7 in that labels are assigned to the determined subdivided check items. Note that each of steps S301 to S305 shown in Figure 13 is the same process as each of steps S301 to S305 shown in Figure 7, so their explanation will be omitted. 【0165】 After step S303, the check item subdivision unit 22 determines whether the content of the subdivided check item determined in step S303 indicates an attack or a false positive (S311). Here, "attack" means an attack against security, and "false positive" means a false positive in security. "Security" here refers to the security system that is the subject of the user's report, that is, the security system covered by the evaluation target report. In other words, the check item subdivision unit 22 determines whether the subdivided check item determined in step S303 is a check item for confirming that an event is an attack against the security system installed on site, or a check item for confirming that an event is a false positive raised by the security system installed on site. 
【0166】 If the check item subdivision unit 22 determines that the content of the subdivided check item determined in step S303 indicates an attack or a false positive (Yes in step S311), it assigns an attack label or a false positive label to the subdivided check item (S312). 【0167】 On the other hand, if the check item subdivision unit 22 cannot determine that the content of the subdivided check item determined in step S303 indicates an attack or a false positive (No in step S311), it manages the subdivided check item as one to be labeled later in light of the contents of the evaluation target report R1 (S313). For example, the check item subdivision unit 22 manages the subdivided check item by assigning an unclassified label to it. 【0168】 Then, after performing the processing in step S312 or step S313 above, the check item subdivision unit 22 checks whether subdivision has been performed for all main check items (S304). 【0169】 The following explains the specific implementation method by which the check item subdivision unit 22 assigns labels to the subdivided check items, with reference to specific examples. Figures 14A and 14B are diagrams illustrating this implementation method. Figure 14A is a diagram showing an example of a subdivided check item determined by the check item subdivision unit 22. Figure 14B is a diagram illustrating the specific implementation method for steps S311 to S313 in Figure 13. 【0170】 As shown in Figure 14A, the check item subdivision unit 22 determines the subdivided check items based on the evaluation target report R1, the determined main check items, and the selected approach. This corresponds to step S303 in Figure 13. 
【0171】 As shown in Figure 14B, the check item subdivision unit 22 assigns one of the following labels to the subdivided check items shown in Figure 14A: an attack label, a false positive label, or an unclassified label. Specifically, the check item subdivision unit 22 assigns labels to the subdivided check items using a model that determines which label to assign. More specifically, the check item subdivision unit 22 uses a classification model, fine-tuned from an LLM, for labeling the subdivided check items. 【0172】 This specific example shows a case where false positive labels are assigned to both the main check item and the three individual check items included in a single subdivided check item, but this is not the only possibility. For instance, the types of labels assigned to the main check item and the multiple individual check items included in a single subdivided check item may be mixed. 【0173】 Furthermore, if the types of labels assigned to the main check item and multiple individual check items included in a single subdivided check item are mixed, and in particular if attack labels and false positive labels are mixed, the check item subdivision unit 22 may edit the subdivided check item. Specifically, the check item subdivision unit 22 may separate the subdivided check item into check items assigned attack labels and check items assigned false positive labels, and then merge (integrate) the separated check items with other subdivided check items that have similar content. Note that merging with check items separated from other subdivided check items may be achieved by performing the same process as in step S305 of Figure 7 above. 【0174】 Furthermore, while the above explanation described a case where the check item subdivision unit 22 assigns labels to the subdivided check items, the user may instead assign labels after reviewing the subdivided check items. 
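The separation of a subdivided check item with mixed labels (paragraph 0173) can be sketched as follows. This is a hedged illustration only: the label strings, the function name, and the flat list representation of a subdivided check item are assumptions, and the labels are taken as already assigned (by a classification model or by the user). How unclassified items are handled during separation is not specified in the text, so this sketch simply keeps them as a third part.

```python
from collections import defaultdict

# Hypothetical label values for illustration.
ATTACK = "attack"
FALSE_POSITIVE = "false_positive"
UNCLASSIFIED = "unclassified"


def separate_mixed(labeled_items):
    """Split one subdivided check item when attack and false positive mix.

    labeled_items: list of (check_item_text, label) pairs covering the
    main check item and its individual check items.
    Returns a list of parts. If attack and false positive labels do not
    both occur, the input is returned unchanged as a single part.
    """
    labels = {label for _, label in labeled_items}
    if not {ATTACK, FALSE_POSITIVE} <= labels:
        return [labeled_items]

    parts = defaultdict(list)
    for text, label in labeled_items:
        parts[label].append((text, label))
    # Fixed output order: attack part, false positive part, then the rest.
    return [parts[label]
            for label in (ATTACK, FALSE_POSITIVE, UNCLASSIFIED)
            if parts[label]]


mixed = [
    ("Is there communication to an unknown external host?", ATTACK),
    ("Was a scheduled update running at that time?", FALSE_POSITIVE),
    ("Is the role of the terminal stated?", UNCLASSIFIED),
]
separated = separate_mixed(mixed)
```

Each separated part could then be passed to the same similarity-based merging used in step S305 to attach it to other subdivided check items with similar content.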
【0175】 Furthermore, the check item subdivision unit 22 may assign the subdivided check items with the attack label to the group of investigation items related to attacks, and assign the subdivided check items with the false positive label to the group of investigation items related to false positives. Figure 15 shows an example in which each of the multiple subdivided check items is assigned to a group of investigation items according to its label. 【0176】 As shown in Figure 15, the check item subdivision unit 22 assigns subdivided check items with an attack label to the group of investigation items related to attacks, and assigns subdivided check items with a false positive label to the group of investigation items related to false positives. The check item subdivision unit 22 also manages subdivided check items with an unclassified label within the check item pool as items to which labels will be assigned later in light of the contents of the evaluation target report R1. 【0177】 Furthermore, the group of investigation items related to attacks may be constructed before the subdivided check items are determined. For example, a user may construct this group by predicting potential attacks against the security system and creating procedures to confirm whether the anticipated attacks have occurred on the security system. The group of investigation items related to false positives may be constructed using the same procedure. 【0178】 [Implementation Method for Confirming Completed Check Items] The following describes the specific implementation method for step S40 in Figure 4. Figure 16 is a flowchart showing the detailed operation in step S40 shown in Figure 4. Figure 16 can also be considered a flowchart for making decisions regarding the check items shown in Figure 2B. 
【0179】 First, the check item completion determination unit 31 of the evaluation unit 30 checks, for each group of subdivided check items determined in step S30 (specifically, for each tree), whether the contents of the evaluation target report satisfy the subdivided check items (S401). For example, if multiple subdivided check items have been determined in step S30, the check item completion determination unit 31 selects one subdivided check item. 【0180】 The check item completion determination unit 31 checks whether the contents of the evaluation target report satisfy each of the multiple check items included in the group of subdivided check items selected in step S401 (S402). Specifically, the check item completion determination unit 31 selects one check item from the main check item and multiple individual check items included in the group of subdivided check items selected in step S401. 【0181】 The check item completion determination unit 31 determines whether it is necessary to refer to external information in order to confirm whether the check item selected in step S402 has been completed (S403). For example, the information used to determine whether to refer to external information is set in advance according to the content of the check item (especially the main check item). 【0182】 If the check item completion determination unit 31 determines that it is necessary to refer to external information (Yes in step S403), it obtains external information from an external database or the like (S404). Then, the check item completion determination unit 31 determines whether the check item is satisfied based on the evaluation target report obtained in step S10, the check item selected in step S402, and the external information obtained in step S404 (S405). 
For example, the check item completion determination unit 31 determines whether the contents of the evaluation target report satisfy the check item selected in step S402 by inputting a confirmation prompt (specifically, a prompt including the evaluation target report, the check item, and the external information) to the LLM. 【0183】 On the other hand, if the check item completion determination unit 31 determines that it is not necessary to refer to external information (No in step S403), it determines whether the check item is satisfied based on the evaluation target report obtained in step S10 and the check item selected in step S402 (S405). For example, the check item completion determination unit 31 determines whether the contents of the evaluation target report satisfy the check item selected in step S402 by inputting a confirmation prompt (specifically, a prompt including the evaluation target report and the check item) to the LLM. 【0184】 The check item completion determination unit 31 checks whether the process in step S405 has been performed for all of the multiple check items included in the subdivided check item (S406). If the process in step S405 has not been performed for all of them, the check item completion determination unit 31 returns to step S402 and selects a check item that has not yet been checked. On the other hand, if the process in step S405 has been performed for all of the multiple check items included in the subdivided check item, the check item completion determination unit 31 proceeds to step S407. 【0185】 The sufficiency determination unit 32 calculates the sufficiency level for the entire group of subdivided check items (i.e., one tree) and stores the calculated sufficiency level in memory or the like (S407). 
The sufficiency determination unit 32 calculates the sufficiency level using, for example, an LLM. 【0186】 The check item completion determination unit 31 checks whether a sufficiency level has been calculated for each of the groups of subdivided check items determined in step S30 (S408). If there is a group for which a sufficiency level has not been calculated, the check item completion determination unit 31 returns to step S401 and selects such a group. On the other hand, if a sufficiency level has been calculated for every group of subdivided check items, the check item completion determination unit 31 terminates its operation related to confirming completed check items. 【0187】 In this way, the check item completion determination unit 31 can use an LLM to confirm whether each check item has been completed, taking into account the contents of the evaluation target report. 【0188】 Furthermore, the check item completion determination unit 31 can make decisions that are more in line with the actual situation on site by referring to external information when confirming whether each check item has been completed. 【0189】 Furthermore, the sufficiency determination unit 32 calculates a sufficiency level indicating the extent to which the contents of the report under evaluation satisfy the main check item and multiple individual check items of each subdivided check item. This allows the evaluation unit 30 to quantitatively evaluate the report under evaluation based on the calculated sufficiency level. 【0190】 The following will explain the specific implementation method of step S407 in Figure 16, referring to a specific example. Figure 17 is a diagram illustrating the specific implementation method of step S407 in Figure 16. 
In the explanation using Figure 17, we will illustrate the case in which the sufficiency determination unit 32 calculates the sufficiency level using an LLM. 【0191】 As shown in Figure 17, the sufficiency determination unit 32 creates a prompt (left arrow in Figure 17) for input to the LLM. Then, the sufficiency determination unit 32 determines the sufficiency level of the entire group of subdivided check items (the overall sufficiency level) from the output of the LLM (right arrow in Figure 17). 【0192】 The prompt is an instruction that includes a task, output format, report, and check items. The task field contains an instruction for the LLM to calculate the overall sufficiency level of the group of subdivided check items. The output format field contains an example of the output format to be specified to the LLM. The report to be evaluated is entered in the report field. The check items field contains the group of subdivided check items selected in step S401. The sufficiency determination unit 32 inputs the results regarding whether each check item is satisfied, as determined by the check item completion determination unit 31, into the LLM along with the created prompt. 【0193】 Furthermore, the sufficiency determination unit 32 may change the weight with which each check item contributes to the overall sufficiency level according to the hierarchy level of the check items included in the subdivided check item. For example, in calculating the overall sufficiency level, the sufficiency determination unit 32 may give a larger weight to higher-level check items and a smaller weight to lower-level check items. 【0194】 Furthermore, the above explanation described a case where the sufficiency determination unit 32 uses an LLM to calculate the sufficiency level of the entire group of subdivided check items, but the calculation is not limited to this. 
For example, the sufficiency determination unit 32 may use a predetermined rule base to calculate the sufficiency level of the entire group of subdivided check items. The rule base is, for example, a model that calculates the sufficiency level according to the ratio of the number of check items judged to be satisfied by the evaluation target report (i.e., completed) to the total number of check items. 【0195】 Furthermore, the sufficiency level calculated by the sufficiency determination unit 32 does not necessarily have to be expressed in five stages. For example, the sufficiency determination unit 32 may calculate a sufficiency level expressed in four or fewer stages, or in six or more stages. In other words, the sufficiency determination unit 32 only needs to calculate a sufficiency level expressed in n stages (where n is a natural number of 2 or more). 【0196】 [Modified Method for Confirming Completed Check Items] Below, we will describe a modified method for the specific implementation of step S40 in Figure 4 described above. Figure 18 is a flowchart showing another detailed operation in step S40 shown in Figure 4. The flowchart shown in Figure 18 differs from the flowchart shown in Figure 16 in that labels may be assigned to the entire group of subdivided check items or to each individual check item. Note that each of the steps S401 to S408 shown in Figure 18 is the same process as each of the steps S401 to S408 shown in Figure 16, so their explanations are omitted. 【0197】 After step S407, the check item completion determination unit 31 determines whether it is possible to assign an attack label or a false positive label to the entire group of subdivided check items or to each individual check item (S411). 
For example, the check item completion determination unit 31 determines whether it is possible to assign an attack label or a false positive label based on whether an attack label or a false positive label has already been assigned to the entire group of subdivided check items or to each individual check item. More specifically, the check item completion determination unit 31 determines that it is possible to assign an attack label or a false positive label to the entire group of subdivided check items or to each individual check item if an unclassified label has been assigned, and determines that it is not possible if an unclassified label has not been assigned. 【0198】 If the check item completion determination unit 31 determines that it is possible to assign an attack label or a false positive label to the entire group of subdivided check items or to each individual check item (Yes in step S411), it assigns the label to the entire group of subdivided check items or to each individual check item (S412). 【0199】 On the other hand, if the check item completion determination unit 31 determines that it is not possible to assign an attack label or a false positive label to the entire group of subdivided check items or to each individual check item (No in step S411), it proceeds to step S408. 【0200】 In the following, we will explain the specific implementation method of step S412 in Figure 18, referring to specific examples. Figure 19 is a diagram illustrating the specific implementation method by which the check item completion determination unit 31 assigns labels to the entire group of subdivided check items or to each individual check item. 
【0201】 As shown in Figure 19, the check item completion determination unit 31 takes into account the contents of the evaluation target report R1, assigns either an attack label or a false positive label to the subdivided check items managed in the check item pool (i.e., check items with the unclassified label), and distributes them to one of the investigation item groups. For example, the check item completion determination unit 31 uses an LLM to extract relevant sentences from the evaluation target report R1 (in other words, sentences related to the content of the subdivided check items). Then, using a classification model that determines whether a check item belongs to the attack category or the false positive category based on the extracted relevant sentences and the check item with the unclassified label, the check item completion determination unit 31 assigns either an attack label or a false positive label to the check item and distributes it to the corresponding investigation item group. Therefore, depending on the contents of the evaluation target report R1, the same subdivided check item may be distributed either to the group of investigation items related to attacks or to the group of investigation items related to false positives. 【0202】 The above explanation described a case where the check item completion determination unit 31 assigns an attack label or a false positive label to the subdivided check items. However, the user may instead assign an attack label or a false positive label after reviewing the subdivided check items. 【0203】 Furthermore, after the check item completion determination unit 31 has finished assigning labels to the entire group of subdivided check items or to individual check items, the sufficiency determination unit 32 may calculate the sufficiency level of each of the group of investigation items related to attacks and the group of investigation items related to false positives. 
Figure 20 shows an example of the sufficiency of each investigation item group calculated by the sufficiency determination unit 32. 【0204】 As shown in Figure 20, the sufficiency determination unit 32 calculates the sufficiency (in other words, the investigation rate) of each investigation item group. The investigation rate of an investigation item group is, for example, the ratio of the number of check items determined to be completed to the total number of check items assigned to that group. 【0205】 The evaluation unit 30 then determines, based on the calculated investigation rates, whether an attack on the security system or a false positive by the security system is highly probable. Specifically, the evaluation unit 30 determines whether each calculated investigation rate is above a preset threshold, and determines that the category (attack or false positive) of an investigation item group whose investigation rate is above the threshold is highly probable. 【0206】 The threshold may be set by the user, or it may be set by the evaluation unit 30 based on past evaluation target reports R1 for which it made judgments about attacks and false positives. In the latter case, the evaluation unit 30 may set the threshold by referring to, for example, the investigation rate of each investigation item group in those past evaluation target reports R1. 【0207】 In this way, the check item completion determination unit 31 assigns a label to the entire set of subdivided check items or to each individual check item, and the sufficiency determination unit 32 calculates the sufficiency of each investigation item group. 
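The rate calculation and threshold check of paragraphs 【0204】 and 【0205】 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `CheckItem` structure, the label strings, the sample check item texts, the threshold of 0.7, and the treatment of an empty group as a rate of 0.0 are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class CheckItem:
    text: str
    label: str        # "attack" or "false_positive" (label names are illustrative)
    completed: bool   # outcome of the completion determination for this item


def investigation_rate(items, label):
    """Completed check items divided by all check items in the labeled group."""
    group = [item for item in items if item.label == label]
    if not group:
        return 0.0  # assumption: an empty group counts as not investigated
    return sum(item.completed for item in group) / len(group)


def likely_categories(items, threshold):
    """Return each group's rate and the categories whose rate is above the threshold."""
    rates = {label: investigation_rate(items, label)
             for label in ("attack", "false_positive")}
    likely = [label for label, rate in rates.items() if rate > threshold]
    return rates, likely


# Hypothetical sample data: 3 of 4 attack items and 1 of 2 false positive items completed.
items = [
    CheckItem("Hypothetical attack item A", "attack", True),
    CheckItem("Hypothetical attack item B", "attack", True),
    CheckItem("Hypothetical attack item C", "attack", True),
    CheckItem("Hypothetical attack item D", "attack", False),
    CheckItem("Hypothetical false positive item E", "false_positive", True),
    CheckItem("Hypothetical false positive item F", "false_positive", False),
]
rates, likely = likely_categories(items, threshold=0.7)
print(rates, likely)
```

With this sample data only the attack group has a rate above the assumed threshold, so only the attack category would be reported as highly probable.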
As a result, the evaluation unit 30 can determine whether an attack on the security system or a false positive by the security system is highly probable by checking the calculated sufficiency of each investigation item group. Therefore, the report evaluation system 1 can provide feedback to the user on whether the security status concluded by the user is valid. 【0208】 [Method for Calculating Evaluation Values of the Report Under Evaluation] The following describes a specific implementation of step S50 in Figure 4 described above. Figure 21 is a flowchart showing the detailed operation in step S50 shown in Figure 4. 【0209】 First, the evaluation unit 30 extracts category information of the report to be evaluated (S501). In other words, the evaluation unit 30 identifies the category to which the report to be evaluated belongs. 【0210】 The evaluation unit 30 acquires a baseline corresponding to the category information extracted in step S501 (S502). Specifically, the evaluation unit 30 acquires evaluation level information M6 and extracts from it the baseline corresponding to the category information, thereby realizing the process in step S502. The baseline is a set of reference values defined for each piece of category information. A reference value is provided for each evaluation criterion set in step S1 (pre-setting) in Figure 2A. 【0211】 The evaluation unit 30 calculates an evaluation value for each evaluation criterion set in step S1 (pre-setting) in Figure 2A, based on the degree of satisfaction of each subdivided check item calculated in step S40 in Figure 4 (S503). The evaluation unit 30 then compares the baseline obtained in step S502 with the evaluation value for each evaluation criterion calculated in step S503, and performs an overall evaluation for each evaluation criterion (S504). 
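Steps S503 and S504 can be illustrated with a short sketch. The disclosure derives evaluation values from the degrees of satisfaction (using an LLM in the example of Figure 22); the version below substitutes a plain average mapped onto n levels purely for illustration. The function names, the dictionary layout, the sample criteria and values, and the decision to treat an evaluation value equal to the reference value as met are all assumptions, since the text only covers the "exceeds" and "below" cases.

```python
from statistics import mean


def evaluation_value(satisfactions, n_levels=5):
    """Map the mean degree of satisfaction in [0, 1] to a level in 1..n_levels.

    Stand-in for the LLM-based calculation in S503 (an assumption of this sketch).
    """
    return 1 + int(mean(satisfactions) * (n_levels - 1) + 0.5)


def overall_evaluation(satisfaction_by_criterion, baseline, n_levels=5):
    """Compare each criterion's evaluation value with its baseline reference value (S504)."""
    results = {}
    for criterion, reference in baseline.items():
        value = evaluation_value(satisfaction_by_criterion[criterion], n_levels)
        results[criterion] = {
            "value": value,
            "reference": reference,
            # assumption: equality counts as met; the text leaves this case open
            "met": value >= reference,
        }
    return results


# Hypothetical satisfaction list and baseline for one category.
satisfaction_by_criterion = {
    "consistency": [1.0, 0.5],     # e.g. check item 1 and check item 2
    "completeness": [0.25, 0.25],
}
baseline = {"consistency": 3, "completeness": 3}
print(overall_evaluation(satisfaction_by_criterion, baseline))
```

In this hypothetical case the consistency criterion would be marked as met and the completeness criterion as not met, mirroring the circle and cross notation used in Figure 22.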
【0212】 The following describes specific implementations of steps S501 to S504 in Figure 21 with reference to a concrete example. Figure 22 is a diagram illustrating the specific implementations of steps S501 to S504 in Figure 21. 【0213】 As shown in Figure 22, the evaluation unit 30 extracts category information from the report to be evaluated using a topic model or an LLM, and obtains the baseline whose category information matches the extracted category information. The evaluation unit 30 also uses an LLM to calculate an evaluation value for each evaluation criterion set in step S1 (pre-setting) in Figure 2A from the degree of satisfaction of each check item (specifically, each group of subdivided check items) shown in the satisfaction list. The evaluation unit 30 then compares the evaluation value for each evaluation criterion with the baseline and determines whether each evaluation value meets the corresponding reference value of the baseline. For example, the evaluation unit 30 determines that an evaluation criterion is met if its evaluation value exceeds the reference value, and that it is not met if its evaluation value is below the reference value. In the example shown in Figure 22, an evaluation criterion determined to be met is marked with a circle, and an evaluation criterion whose evaluation value is below the reference value is marked with a cross. 【0214】 As a prerequisite, each check item shown in the satisfaction list corresponds to one of the evaluation criteria. The main check item is a more specific expression of an evaluation criterion used to evaluate the report to be evaluated, and the multiple individual check items obtained by subdividing the main check item also conform to that evaluation criterion. 
Therefore, each check item (specifically, the subdivided check item) is determined in accordance with the evaluation criteria. Thus, the evaluation value for each evaluation criterion is a value that reflects the degree of satisfaction of each check item. Specifically, as shown in the example in Figure 22, when the evaluation unit 30 calculates the evaluation value for consistency, it does so based on the degree of satisfaction of check item 1 and check item 2, etc. 【0215】 Furthermore, the evaluation value calculated by the evaluation unit 30 does not necessarily have to be on a five-level scale. For example, the evaluation unit 30 may calculate an evaluation value expressed on four levels or less, or it may calculate an evaluation value expressed on six levels or more. In other words, the evaluation unit 30 only needs to calculate an evaluation value expressed on n levels (where n is a natural number greater than or equal to 2). 【0216】Furthermore, baselines must be created in advance for each category of information. Figure 23 is a diagram illustrating the process flow when creating baselines in advance. 【0217】 As shown in Figure 23, first, the evaluation unit 30 uses a topic model or LLM, etc., to extract past good reports that have been assigned the same category information from among multiple past good reports. 【0218】 Next, the user (person) sets the evaluation criteria for evaluating the report to be evaluated. The evaluation unit 30 then creates a baseline by inputting the list of satisfaction levels corresponding to the extracted past good reports and the evaluation criteria set by the user (person) into a topic model or LLM, etc. The evaluation unit 30 also assigns category information corresponding to the created baseline and stores it in memory, etc. The evaluation unit 30 repeats this process for each piece of category information. 【0219】 Furthermore, the user may provide feedback or make modifications to the created baseline as needed. 
For example, the user may provide feedback instructing the evaluation unit 30 to modify the evaluation values for some evaluation criteria, or they may create a baseline with modified evaluation values for some evaluation criteria. 【0220】 Furthermore, each reference value included in the baseline created by the evaluation unit 30 does not necessarily have to be on a five-level scale. For example, the evaluation unit 30 may create reference values expressed on four levels or less, or on six levels or more. In other words, the evaluation unit 30 only needs to create reference values expressed on n levels (where n is a natural number greater than or equal to 2). 【0221】 Furthermore, while the above explanation illustrates a case where the evaluation unit 30 creates a baseline, the user may also create a baseline for each category of information. 【0222】In this way, the evaluation unit 30 evaluates the report under evaluation based on whether the evaluation values for each evaluation criterion obtained from the sufficiency level calculated by the sufficiency determination unit 32 exceed predetermined standard values. This allows the evaluation unit 30 to identify any insufficient content in the current report under evaluation. 【0223】 Furthermore, the evaluation unit 30 creates a standard value based on the degree of sufficiency of past reports that the report evaluation system 1 has previously evaluated. This allows the evaluation unit 30 to determine whether the content of the current report being evaluated is sufficient by comparing it with past results. 【0224】 [Implementation Method for Creating and Outputting Information for Correction Feedback] The following describes the specific implementation method for step S70 in Figure 4 described above. Figure 24 is a flowchart showing the detailed operation in step S70 shown in Figure 4. Figure 24 can also be described as a flowchart for creating and outputting the recommendation shown in Figure 2C. 
【0225】 First, the recommendation unit 41 of the output unit 40 determines whether or not there are rule settings for recommendations (S701). In the rule settings, items (i.e., rules) that must be followed when the recommendation unit 41 selects check items to encourage the user to consider in order to improve evaluation perspectives that are not currently met in the content of the evaluation target report are set. Figure 25 shows an example of rule settings. 【0226】As shown in Figure 25, the rule settings include recommendation indicators such as effort and effectiveness. Effort is a parameter that represents the effort a user puts into checking the check items (specifically, modifying the report being evaluated). Effort includes, for example, the time required to modify the report being evaluated, and the number of documents or procedures consulted to modify the report being evaluated. Effectiveness is a parameter that represents the amount of information that is expected to be added to the report being evaluated as a result of checking the check items (specifically, modifying the report being evaluated). For example, the more detailed the check item, the more effective the check item will be if the higher-level check items are not met. 【0227】 Returning to the explanation in Figure 24, if the recommendation unit 41 determines that there are no rules set for the recommendation (No in step S701), it creates a recommendation using the LLM (S702). Specifically, the recommendation unit 41 inputs the evaluation target report obtained in step S10 in Figure 4, the confirmation results of the check items obtained in step S40, and the evaluation value calculated in step S50 into the LLM, and uses the output result from the LLM as the recommendation. After that, it proceeds to the process in step S705, which will be described later. 
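As a rough illustration of how the effort and effectiveness indicators from the rule settings could be turned into a recommendation order, the sketch below ranks check items by effectiveness per unit of effort. The ratio-based score, the `Recommendation` structure, and the sample values are assumptions made for this example; the disclosure does not prescribe a specific scoring formula.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    check_item: str
    effort: float         # e.g. estimated time to revise the report (hypothetical unit)
    effectiveness: float  # e.g. expected amount of information added (hypothetical unit)


def prioritize(candidates):
    """Order check items so that low effort and high effectiveness come first.

    The effectiveness/effort ratio is an illustrative scoring rule only.
    """
    for candidate in candidates:
        if candidate.effort <= 0:
            raise ValueError("effort must be positive")
    return sorted(candidates,
                  key=lambda c: c.effectiveness / c.effort,
                  reverse=True)


# Hypothetical unmet check items with assumed indicator values.
candidates = [
    Recommendation("check item A", effort=2.0, effectiveness=4.0),  # score 2.0
    Recommendation("check item B", effort=1.0, effectiveness=3.0),  # score 3.0
    Recommendation("check item C", effort=4.0, effectiveness=2.0),  # score 0.5
]
for rank, rec in enumerate(prioritize(candidates), start=1):
    print(rank, rec.check_item)
```

A weighted sum or a rule that boosts check items carrying a particular label would fit the same structure; only the `key` function would change.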
【0228】 On the other hand, if the recommendation unit 41 determines that there are rule settings for recommendations (Yes in step S701), it calculates the effort and effectiveness of each check item according to the rule settings (S703). 【0229】 Based on the calculation results obtained in step S703, the recommendation unit 41 determines the check items to be recommended and the recommendation order (i.e., priority) of those check items (S704). 【0230】 The recommendation unit 41 creates a diagram showing which check items are currently completed and which are not, together with a recommendation table (S705). 【0231】 The recommendation unit 41 generates comments on the overall degree of satisfaction and on the recommendation content, for example, using an LLM (S706). The generated comments include, for example, the reasons for the recommendation order determined in step S704. 【0232】 The recommendation unit 41 outputs each piece of information it has created to the screen information generation unit 42 (S707). Specifically, the recommendation unit 41 outputs, for example, the diagram and recommendation table created in step S705 and the comments generated in step S706 to the screen information generation unit 42. 【0233】 Then, the screen information generation unit 42 of the output unit 40 generates screen information based on the information acquired in step S707 and outputs the generated screen information to an external terminal (S708). 【0234】 In this way, the output unit 40 creates and outputs feedback information that prompts revision of the evaluation target report based on the obtained evaluation results. This allows the report evaluation system 1 to show the user which evaluation perspectives to focus on, and in what order, so that the contents of the evaluation target report can be revised efficiently. 
Therefore, the report evaluation system 1 can provide more effective feedback to users with limited working time in improving the evaluation report, and can provide more comprehensive support for improving the evaluation report. 【0235】 In the following section, we will explain the specific implementation methods for steps S701 to S706 in Figure 24, referring to specific examples. Figure 26 is a diagram illustrating the specific implementation methods for steps S701 to S706 in Figure 24. 【0236】As shown in Figure 26, the recommendation unit 41 extracts evaluation criteria that are subject to modification by comparison with the baseline (specifically, evaluation criteria whose evaluation values are below the standard value). The recommendation unit 41 then identifies subdivided check items corresponding to the extracted evaluation criteria and determines the priority using LLM or according to rule settings. In determining the priority, the recommendation unit 41 calculates the effort and effect for each check item included in the subdivided check items and makes a comprehensive judgment. For example, the recommendation unit 41 may set a higher priority for the corresponding check item if the calculated effort is small, as this means that the effort required to modify the evaluation target report is small. Also, the recommendation unit 41 may set a higher priority for the corresponding check item if the calculated effect is large, as this means that the effect obtained by modifying a single evaluation criterion is large. This allows the report evaluation system 1 to provide more effective feedback to users with limited working time in improving the evaluation target report. 【0237】 Furthermore, when the recommendation unit 41 determines priorities according to the rule settings, it may set a higher priority for check items that contribute to the determination of an attack or false positive. 
For example, if the sufficiency calculated by the sufficiency determination unit 32 indicates a high possibility of a false positive (specifically, if the investigation rate of the investigation item group related to false positives is at or above a certain value), the priority of check items with the false positive label may be set higher. In such a case, recommending check items with the false positive label rather than check items with the attack label reduces the effort required to revise the evaluation target report, ultimately reducing the burden on the user. 【0238】 The following describes a specific example of the feedback (specifically, the necessary information T2) presented to the user in step S708 of Figure 24. Figure 27 is an example of the necessary information T2 presented to the user. 【0239】 As shown in Figure 27, the necessary information T2 presented to the user includes first information 101, second information 102, third information 103, fourth information 104, and fifth information 105. 【0240】 The first information 101 includes the contents (summary) of the report being evaluated. 【0241】 The second information 102 includes a diagram showing which check items have been completed and which have not, as well as the degree of satisfaction of each subdivided check item. 【0242】 The third information 103 includes a recommendation table. The recommendation table shows, for example, the priority of the check items, the check items selected for recommendation, and the calculated effectiveness and effort. 【0243】 The fourth information 104 includes a table of evaluation results for the report being evaluated. 
The table of evaluation results shows, for example, the evaluation criteria used to evaluate the report being evaluated, the calculated evaluation values, the baseline reference values compared with the evaluation values, and the items to be corrected as determined by the comparison with the baseline. 【0244】 The fifth information 105 includes an icon indicating the report's completeness. The report's completeness is an indicator of the extent to which the current content of the report being evaluated meets the preset report requirements. 【0245】 Because the report evaluation system 1 generates the second information 102, which includes a diagram showing completed and incomplete check items, it can show the user visually the extent to which the check items are met in the current report being evaluated. 【0246】 Furthermore, because the report evaluation system 1 generates the third information 103, which includes a recommendation table, it can show the user which evaluation perspectives to address in order to revise the contents of the report being evaluated efficiently. 【0247】 Furthermore, because the report evaluation system 1 generates the fourth information 104, which includes a table of evaluation results, it can present the evaluation results obtained for the report currently being evaluated to the user in an easy-to-understand manner. 【0248】 Furthermore, because the report evaluation system 1 generates the fifth information 105, which includes an icon indicating the report's completeness, it can give the user an estimate of how much more revision the report being evaluated needs. 【0249】 Therefore, the report evaluation system 1 can support users in revising the report being evaluated by presenting them with the necessary information T2. 
【0250】 As described above, the report evaluation system 1 of this disclosure determines the main check items according to the content of the report to be evaluated (S20), and configures the input prompts to the LLM selectively and concisely based on the determination results. This reduces the number of tokens and provides the technical effect of reducing the computational load (memory usage, inference time, inference cost). In addition, it is possible to suppress the inclusion of unnecessary information (prompt contamination), which contributes to improving the consistency and reproducibility of the output, as well as avoiding exceeding the maximum context length. 【0251】 (Other Embodiments) The report evaluation system relating to this disclosure has been described above based on embodiments, but this disclosure is not limited to these embodiments. Without departing from the spirit of this disclosure, various modifications that a person skilled in the art could conceive of these embodiments and their variations, as well as other forms constructed by combining some of the components of the embodiments, are also included within the scope of this disclosure. 【0252】 In the above embodiment, each component may be implemented by dedicated hardware or by executing a software program suitable for each component. Each component may also be implemented by a program execution unit such as a CPU or processor reading and executing a software program recorded on a recording medium such as a hard disk or semiconductor memory. 【0253】 Furthermore, the communication method between devices in the above embodiment is not particularly limited. In addition, a relay device (not shown) may be interposed in the communication between devices. 【0254】For example, in the above embodiment, a process executed by one processing unit may be executed by another processing unit. Also, the order of multiple processes may be changed, or multiple processes may be executed in parallel. 
【0255】 For example, the order of processing described in the flowcharts of the above embodiment is just one example. The order of multiple processing steps may be changed, and multiple processing steps may be executed in parallel. 【0256】 Furthermore, some or all of the functions of the report evaluation system according to the above embodiment may be realized by a processor such as a CPU executing a program. 【0257】 Furthermore, each functional configuration of the report evaluation system according to the above embodiment may realize its functions using, for example, a machine learning model or generative AI (Artificial Intelligence) instead of an LLM. 【0258】 Furthermore, each component may be implemented by hardware. For example, each component may be a circuit (or integrated circuit). These circuits may form a single circuit as a whole, or they may be separate circuits. Also, each of these circuits may be a general-purpose circuit or a dedicated circuit. 【0259】 Some or all of the components constituting each of the above devices may consist of an IC card or a standalone module that is attachable to and detachable from each device. The IC card or module is a computer system composed of a microprocessor, ROM, RAM, etc. The IC card or module may also include a highly functional LSI. The microprocessor operates according to a computer program, thereby enabling the IC card or module to achieve its function. The IC card or module may also be tamper-resistant. 【0260】 Furthermore, general or specific aspects of this disclosure may be implemented as systems, apparatus, methods, integrated circuits, computer programs, or recording media such as computer-readable CD-ROMs. They may also be implemented as any combination of systems, apparatus, methods, integrated circuits, computer programs, and recording media. The recording media may also be non-transitory recording media. 
【0261】 Furthermore, this disclosure may be implemented as a method executed by a computer, or as a program causing a computer to execute the method. Alternatively, this disclosure may be implemented as a computer-readable, non-transitory recording medium on which such a program is recorded. 【0262】 The report evaluation system described in this disclosure can be used as a system for improving security reports. 【0263】
1 Report Evaluation System
10 Acquisition Unit
20 Determination Unit
21 Check Item Determination Unit
22 Check Item Subdivision Unit
23 Check Item Output Unit
24 Input Reception Unit
30 Evaluation Unit
31 Check Item Completion Determination Unit
32 Sufficiency Determination Unit
33 Correction Report Acquisition Unit
40 Output Unit
41 Recommendation Unit
42 Screen Information Generation Unit
43 Post-Factor Feedback Creation Unit
50 Processing Report Storage Unit
101 First Information
102 Second Information
103 Third Information
104 Fourth Information
105 Fifth Information
M1 Evaluation Target Report Information
M2 Past Report Information
M3 Context Information
M4 Main Check Item List
M5 Subdivision Knowledge
M6 Evaluation Level Information
R1 Evaluation Target Report
R1A Correction Report
T1 Subdivision Check Item Information
T2 Necessary Information
FB1 Feedback
FB2 Feedback Report
Claims
1. A report evaluation method performed by a report evaluation system that evaluates the contents of a report, comprising: an acquisition step of acquiring a report to be evaluated; a determination step of determining main check items, which are evaluation perspectives for evaluating the acquired report to be evaluated, according to the contents of the acquired report to be evaluated; an evaluation step of evaluating the report to be evaluated based on the determined main check items; and an output step of outputting evaluation results obtained by evaluating the report to be evaluated.
2. The report evaluation method according to claim 1, wherein in the determination step, key points are extracted from the contents of the report to be evaluated, candidates for main check items suitable for evaluating the report to be evaluated are extracted, based on the extracted key points, from among a plurality of pre-prepared candidates for main check items, and the extracted candidates for main check items are determined to be the main check items.
3. The report evaluation method according to claim 1, wherein in the determination step, the similarity between the contents of past reports previously evaluated by the report evaluation system and the contents of the report to be evaluated is extracted; if the extracted similarity is equal to or greater than a predetermined threshold, the main check items used in the evaluation of the past reports are determined to be the main check items for evaluating the report to be evaluated; and if the extracted similarity is less than the predetermined threshold, one category is identified from among a plurality of categories based on the contents of the report to be evaluated, and the main check items used in the evaluation of other past reports classified into the identified one category are determined to be the main check items for evaluating the report to be evaluated.
4. The report evaluation method according to claim 1, wherein in the determination step, an approach for subdividing the content of the determined main check item is selected, and subdivided check items are determined by subdividing the content of the determined main check item based on the acquired report to be evaluated, the determined main check item, and the selected approach; in the evaluation step, the report to be evaluated is evaluated based on the determined subdivided check items; and the subdivided check items include the main check item and a plurality of individual check items obtained by subdividing the content of the main check item.
5. The report evaluation method according to claim 4, wherein in the determination step, key points are extracted from the contents of the report to be evaluated; two or more candidate main check items suitable for evaluating the report to be evaluated are selected from a plurality of pre-prepared candidate main check items based on the extracted key points; each of the two or more selected candidate main check items is determined as a main check item; a first subdivided check item is determined as the subdivided check item obtained by subdividing the content of a first main check item determined as the main check item; a second subdivided check item is determined as the subdivided check item obtained by subdividing the content of a second main check item determined as the main check item; and if a first individual check item among the plurality of individual check items included in the first subdivided check item, or the first main check item, matches a second individual check item among the plurality of individual check items included in the second subdivided check item, or the second main check item, the first individual check item or the first main check item and the second individual check item or the second main check item are integrated.
6. The report evaluation method according to claim 4, wherein in the evaluation step, the report to be evaluated is evaluated by inputting the acquired report to be evaluated and the determined subdivided check items into an LLM (large language model).
7. The report evaluation method according to claim 6, wherein in the evaluation step, the report to be evaluated is evaluated by further inputting external information stored in an external database into the LLM.
8. The report evaluation method according to claim 4, further comprising: a check item output step of outputting the determined subdivided check items; and an input reception step of receiving feedback that points out evaluation perspectives that are missing from the output subdivided check items, wherein the determination step modifies the determined subdivided check items based on the received feedback.
9. The report evaluation method according to claim 4, wherein in the evaluation step, the report to be evaluated is evaluated by calculating a degree of satisfaction indicating the extent to which the contents of the report to be evaluated satisfy the main check item and the plurality of individual check items among the determined subdivided check items.
10. The report evaluation method according to claim 9, wherein in the evaluation step, the report to be evaluated is evaluated based on whether or not the evaluation value for each evaluation criterion obtained from the calculated degree of satisfaction exceeds a predetermined standard value.
11. The report evaluation method according to claim 10, wherein the predetermined standard value is a value based on the degree of satisfaction of past reports previously evaluated by the report evaluation system.
12. The report evaluation method according to claim 9, wherein the report to be evaluated is a report containing information on security; in the determination step, an attack label is assigned to the determined subdivided check items if the content of the subdivided check items indicates an attack on the security, and a false positive label is assigned to the determined subdivided check items if the content of the subdivided check items indicates a false positive in the security; and in the evaluation step, the degree of satisfaction of the subdivided check items to which the attack label is assigned and the degree of satisfaction of the subdivided check items to which the false positive label is assigned are calculated to evaluate whether the security state concluded in the report to be evaluated is appropriate.
13. The report evaluation method according to claim 1, wherein in the output step, feedback information for prompting revision of the report to be evaluated is created based on the evaluation results, and the feedback information is output.
14. A program that causes a computer to execute the report evaluation method described in any one of claims 1 to 13.
15. A report evaluation system for evaluating the contents of a report, comprising: an acquisition unit for acquiring a report to be evaluated; a determination unit for determining main check items, which are evaluation perspectives for evaluating the acquired report to be evaluated, according to the contents of the acquired report to be evaluated; an evaluation unit for evaluating the report to be evaluated based on the determined main check items; and an output unit for outputting evaluation results obtained by evaluating the report to be evaluated.