A method for converting subjective question scoring rules based on LLM and scoring point extraction

By using an LLM-based scoring point extraction method, the teacher's scoring details are concatenated with the question stem, question type, and total score at the character level to identify and generate structured scoring rules. This solves the problems of difficult scoring point configuration and logical inconsistency in existing technologies, and achieves efficient and interpretable automatic scoring.

CN121480486B · Active · Publication Date: 2026-06-30 · UNICLOUD TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
UNICLOUD TECH CO LTD
Filing Date
2026-01-08
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In existing technologies, there is a lack of a systematic conversion mechanism between the teacher's natural language scoring rules and the structured scoring rules inside the automatic scoring system. This results in high configuration costs and errors in the scoring points, making it difficult to adjust with the teaching syllabus or scoring standards. Furthermore, the automatic scoring system cannot strictly follow the teacher's scoring logic, affecting the interpretability and traceability of the results.

Method used

By using an LLM-based scoring point extraction method, the teacher's scoring details text is concatenated with the question stem, question type, and total score at the character level. A large language model is used to identify the scoring point names, emphasis words, and scores to generate structured scoring rules. The scores are then allocated in conjunction with the question type scoring template to ensure that the scoring points are uniformly scaled and rounded under the total score constraint.

Benefits of technology

It achieves accurate identification of scoring points and reasonable allocation of scores, reduces missed detections and false detections, reduces the workload of manual rule configuration, improves the consistency of scoring logic and the interpretability of rules, and supports seamless embedding of automatic scoring systems with teacher scoring logic.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the fields of educational assessment and artificial intelligence technology, specifically to a method for converting subjective question scoring rules based on LLM and scoring point extraction. This method obtains teacher-written scoring rules and the stems, question types, and total scores of target test questions. It then uses a large language model to categorize the scoring rules according to question identifiers and question types. A character-level scoring point extraction model identifies the names of scoring points, corresponding original sentences, and score weighting information, completing the merging, splitting, and marking of score or weight levels for the scoring points. Next, the scoring points are matched with question type scoring templates, and the scores for each point are allocated according to the recommended ratio and the total score of the question. Under the premise of satisfying the minimum scoring unit and total score constraints, structured automatic scoring rules are generated for use by the automatic subjective question scoring system, thereby reducing manual configuration workload and improving scoring consistency and interpretability.

Description

Technical Field

[0001] This invention belongs to the field of educational assessment and artificial intelligence technology, and in particular relates to a method for converting subjective question scoring details based on LLM and scoring point extraction.

Background Technology

[0002] In recent years, with the increasing informatization of assessments in basic and higher education, and the widespread use of online assignments and computer-based exams, the proportion of subjective questions in exams and assignments has been continuously rising. For subjective question types such as essay questions, calculation problems, and proof questions, teachers typically write detailed scoring guidelines in natural language, explaining the key knowledge points, logical steps, and writing or expression requirements to be tested in the answer process. In a human-marking scenario, examiners can combine these scoring guidelines with their experience to understand and subjectively judge students' answers, thus giving relatively consistent scores.

[0003] With the development of artificial intelligence technology, large-scale pre-trained models based on machine learning, deep learning, and large language models are gradually being introduced into the field of educational assessment to achieve automatic understanding and scoring of subjective question answers. Existing automatic scoring technologies for subjective questions mainly fall into two categories: one is a template- and rule-based scoring method, where teachers manually configure each scoring point, corresponding score, and deduction conditions in the system. The scoring engine compares student answers with the scoring points and calculates scores through keyword matching, similarity calculation, and other methods. The other is an end-to-end scoring method based on machine learning models or large language models. Through training on a large number of labeled samples or through prompt engineering, the model directly outputs the overall score for a question after inputting the question stem, reference answer, and simple scoring instructions. Some systems also pre-set scoring templates for different question types. Teachers select the question type and fill in the scoring points when recording questions, thus guiding the internal processing logic of the automatic scoring module. These solutions, to a certain extent, support the objectification and batch scoring process for subjective questions.

[0004] However, in existing technologies, there is often a lack of systematic conversion mechanisms between teachers' natural language scoring guidelines and the structured scoring rules within automated scoring systems. For template- and rule-based solutions, the names, scores, and weights of scoring points largely rely on manual input and maintenance by teachers or curriculum researchers, resulting in high configuration costs, a high risk of errors, and difficulty in large-scale updates as teaching syllabi or scoring standards change. While some systems support simple rule extraction, they often remain at the level of coarse matching at the keyword or sentence level, failing to distinguish between sentence components with different functions, such as "scoring point names," "score words," and "words of varying importance," thus limiting the accuracy in replicating teachers' original scoring logic. For end-to-end automated scoring methods, although large language models possess strong semantic understanding capabilities, their output is highly dependent on prompt design and training data distribution. They typically only combine the question stem, reference answer, and a small amount of scoring instructions, making it difficult to strictly adhere to teachers' complete scoring guidelines at a fine-grained level. Furthermore, they struggle to explicitly reflect the source, weight, and scoring basis of each scoring point, affecting the interpretability and traceability of the results.

[0005] Furthermore, existing solutions often treat question stem semantics, question type information, total score constraints, and scoring rubric text in a fragmented manner during modeling. They lack a unified modeling perspective to simultaneously consider "what type of question is this," "how teachers describe the scoring criteria in natural language," "what is the total score for this question," and "how should each scoring point be reasonably allocated under the total score." Without joint modeling and a systematic transformation mechanism, on the one hand, the large language model is difficult to precisely constrain within the teacher's existing scoring logic framework in educational assessment scenarios, easily leading to inconsistencies with teaching and testing syllabi or human scoring habits; on the other hand, there is a significant "semantic gap" between the structured rules within the automated scoring system and the scoring rubrics used by teachers daily.

[0006] Therefore, how to model and process the scoring rules text in a unified manner with the question type, question stem, and total score information of the target subjective questions without changing teachers' habit of writing scoring rules in natural language, automatically identify and extract fine-grained scoring points and their score importance information, and then generate structured scoring rules that can be directly called by the automatic scoring system for subjective questions, has become a technical problem that urgently needs to be solved in the field of intelligent scoring of subjective questions.

Summary of the Invention

[0007] In view of this, the present invention aims to propose a subjective question scoring rule conversion method based on LLM and scoring point extraction, so as to at least solve one of the problems in the background art.

[0008] To achieve the above objectives, the technical solution of the present invention is implemented as follows:

[0009] Firstly, this solution discloses a method for converting subjective question scoring rules based on LLM and scoring point extraction, including:

[0010] S1. Obtain the teacher's scoring details text, as well as the question stem, question type, and total score of the target subjective question. Use a large language model to classify the scoring details according to the question identifier and question type to obtain the scoring details text corresponding to the target subjective question.

[0011] S2. Input the scoring rules text, together with the question stem, question type and total score of the target subjective question, into the scoring point extraction model; identify the scoring point name, corresponding original sentence and score-weight-related words for each of multiple scoring points; and generate an initial list of scoring points.

[0012] S3. Merge and split the initial list of scoring points, and combine the numerical information or emphasis words in the scoring rules text to mark the specific score or emphasis level from the limited set of levels for each scoring point, so as to obtain the scoring point confirmation result.

[0013] S4. Based on the scoring point confirmation result and the question type, select the question type scoring template that matches the target subjective question from the question type scoring template library;

[0014] S5. Based on the scoring point confirmation result, the question type scoring template and the total score, retain the original score for the scoring points with specific scores, allocate the remaining score to the scoring points that have only an emphasis level according to the recommended ratio and quantity, and perform uniform scaling and rounding on all scoring point scores under the constraint of the total score to obtain the scoring point score allocation result.

[0015] S6. Organize the scoring point names, their corresponding original sentences, and target scores into structured automatic scoring rules for use by the automatic scoring system for subjective questions.

[0016] Furthermore, step S1 includes: concatenating the scoring details text and the question stem text at the character level, and attaching the question type marker text obtained by mapping the question type information and the score marker text obtained by converting the total score, to form a text input object for the input embedding layer of the scoring point extraction model to receive.
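
The input construction described in this paragraph can be illustrated with a minimal sketch. The marker formats (`[TYPE=...]`, `[TOTAL=...]`), the concatenation order, and the function name `build_text_input` are assumptions for illustration only; the patent text does not specify them.

```python
# Hypothetical marker strings; the patent does not define their format.
QUESTION_TYPE_MARKERS = {
    "essay": "[TYPE=ESSAY]",
    "calculation": "[TYPE=CALC]",
    "proof": "[TYPE=PROOF]",
}


def build_text_input(rubric_text, stem_text, question_type, total_score):
    """Concatenate rubric and stem at the character level and attach markers."""
    type_marker = QUESTION_TYPE_MARKERS.get(question_type, "[TYPE=OTHER]")
    score_marker = f"[TOTAL={total_score:g}]"
    # Assumed order: rubric, stem, question type marker, score marker.
    text = rubric_text + stem_text + type_marker + score_marker
    return {
        "chars": list(text),                # character sequence for the embedding layer
        "rubric_length": len(rubric_text),  # lets later steps locate rubric characters
    }
```

Keeping the rubric length alongside the character list is one way to let later steps compute positions relative to the scoring rules text.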

[0017] Furthermore, the scoring point extraction model divides the text input object into sentences and expands it into characters in the input embedding layer, generating a 256-dimensional feature vector for each character that includes its position number in the scoring rules text, whether it is a punctuation mark, whether it is a number, and whether it is a common scoring term.
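
As a rough illustration of the four hand-crafted attributes named above, the following sketch computes them per character. The term lexicon and the function name `char_flags` are assumptions, and the sketch does not show how these attributes are combined into the 256-dimensional vector, which the text does not detail.

```python
import unicodedata

# Assumed lexicon of common scoring terms; a real system would supply its own.
COMMON_SCORING_TERMS = ("points", "point", "marks", "score", "deduct")


def char_flags(rubric_text):
    """Return one dict of attributes per character of the rubric text."""
    # Lowercase character by character so the string length never changes.
    lowered = "".join(c.lower() if len(c.lower()) == 1 else c for c in rubric_text)
    in_term = [False] * len(rubric_text)
    for term in COMMON_SCORING_TERMS:
        start = lowered.find(term)
        while start != -1:
            for i in range(start, start + len(term)):
                in_term[i] = True
            start = lowered.find(term, start + 1)
    return [
        {
            "position": i,
            "is_punct": unicodedata.category(ch).startswith("P"),
            "is_digit": ch.isdigit(),
            "in_scoring_term": in_term[i],
        }
        for i, ch in enumerate(rubric_text)
    ]
```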

[0018] Furthermore, the scoring point extraction model includes a sequentially connected context encoding layer, intermediate feature layer, and label output layer:

[0019] The context coding layer consists of two one-dimensional convolutional neural networks with a convolutional window length of 3, used to generate a sequence of context feature vectors containing the semantics of the left and right neighbors of characters.

[0020] The intermediate feature layer is a fully connected network containing 128 neurons, used to generate a sequence of scoring feature vectors;

[0021] The label output layer is a fully connected network containing ten output neurons, used to output a ten-dimensional label vector for each character.
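
The layer stack described above (two one-dimensional convolutions with window length 3, a 128-neuron fully connected layer, and a ten-output label layer) can be sketched as a plain-Python forward pass. The convolution channel count, zero padding, ReLU activations, sigmoid outputs, and random initialization are all assumptions; the text specifies only the window length and the 128 and 10 layer sizes.

```python
import math
import random


def dense(x, w, b):
    """Fully connected layer applied to one vector: y = W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def conv1d_same(seq, w, b, window=3):
    """1-D convolution over a sequence of vectors, zero-padded to keep length."""
    dim = len(seq[0])
    pad = [[0.0] * dim for _ in range(window // 2)]
    padded = pad + seq + pad
    out = []
    for t in range(len(seq)):
        # Flatten the window so each character sees its left and right neighbours.
        flat = [v for vec in padded[t:t + window] for v in vec]
        out.append([max(0.0, y) for y in dense(flat, w, b)])  # ReLU (assumed)
    return out


def init(rows, cols, rng):
    """Random weights and zero biases; stands in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)], [0.0] * rows


def build_model(in_dim=256, conv_dim=64, mid_dim=128, n_labels=10, seed=0):
    rng = random.Random(seed)
    return {
        "conv1": init(conv_dim, 3 * in_dim, rng),
        "conv2": init(conv_dim, 3 * conv_dim, rng),
        "mid": init(mid_dim, conv_dim, rng),
        "out": init(n_labels, mid_dim, rng),
    }


def forward(model, features):
    """Map per-character feature vectors to per-character label probabilities."""
    h = conv1d_same(features, *model["conv1"])
    h = conv1d_same(h, *model["conv2"])
    labels = []
    for vec in h:
        mid = [max(0.0, y) for y in dense(vec, *model["mid"])]
        logits = dense(mid, *model["out"])
        labels.append([1.0 / (1.0 + math.exp(-z)) for z in logits])  # sigmoid (assumed)
    return labels
```

Independent sigmoid outputs are used here because each label dimension is described as a separate yes/no indication; a softmax over categories would be a different design.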

[0022] Furthermore, each dimension of the ten-dimensional label vector is used to indicate whether the character belongs to a preset category among the scoring point name part, the emphasis word part, the score word part, the starting position of the scoring sentence, the internal position of the scoring sentence, and the inter-sentence connecting words. The text is linearly traversed according to the probability threshold corresponding to the starting position of the scoring sentence to construct candidate segments of the scoring sentence.

[0023] For each candidate scoring sentence segment, the scoring point name is generated from the characters whose main label is the scoring point name part category, the score-weight-related words are generated from the characters whose main label is the emphasis word part category, and the entire segment is used as the corresponding original sentence to construct the scoring point record in the initial list of scoring points.
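
The traversal and record construction described in the two paragraphs above might look roughly like the following sketch. The label index layout, the 0.5 thresholds, the choice to take the main label only among content categories, and the field names are assumptions. Only the six categories named in the text are modelled.

```python
# Hypothetical index layout for the six label categories named in the text.
NAME, EMPHASIS, SCORE, START, INSIDE, CONNECTOR = range(6)
CONTENT_LABELS = (NAME, EMPHASIS, SCORE, CONNECTOR)


def candidate_segments(label_probs, start_threshold=0.5):
    """Linear pass: open a new segment wherever the START probability exceeds the threshold."""
    starts = [i for i, p in enumerate(label_probs) if p[START] > start_threshold]
    ends = starts[1:] + [len(label_probs)]
    return list(zip(starts, ends))


def build_record(text, label_probs, span, min_prob=0.5):
    """Collect name and emphasis characters of one segment into a scoring point record."""
    begin, end = span
    name_chars, emphasis_chars = [], []
    for i in range(begin, end):
        probs = label_probs[i]
        main = max(CONTENT_LABELS, key=lambda k: probs[k])  # main content label
        if probs[main] < min_prob:
            continue
        if main == NAME:
            name_chars.append(text[i])
        elif main == EMPHASIS:
            emphasis_chars.append(text[i])
    return {
        "name": "".join(name_chars),
        "emphasis_words": "".join(emphasis_chars),
        "sentence": text[begin:end],
        "span": span,
    }
```

In this simplification, characters before the first detected start position are not assigned to any segment.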

[0024] Furthermore, in step S3, when merging the scoring points describing the same scoring requirements, the same character-level embedding and intermediate feature layer as the scoring point extraction model are used to vectorize the scoring point names, and the merging determination value is calculated by combining the position intersection-union ratio of the scoring point names in the ten-dimensional label vector subsequence. When the merging determination value is greater than the preset merging threshold, the corresponding scoring points are merged into a new scoring point record.
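
A minimal sketch of such a merging determination value follows. The text says only that the vector representation and the position intersection-over-union are combined; the weighted sum, the weight `alpha`, the default threshold, and the use of cosine similarity are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def span_iou(a, b):
    """Intersection over union of two half-open character spans (begin, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


def merge_value(vec_a, span_a, vec_b, span_b, alpha=0.7):
    """Assumed combination: weighted sum of name similarity and span overlap."""
    return alpha * cosine(vec_a, vec_b) + (1 - alpha) * span_iou(span_a, span_b)


def should_merge(vec_a, span_a, vec_b, span_b, threshold=0.6):
    return merge_value(vec_a, span_a, vec_b, span_b) > threshold
```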

[0025] Furthermore, in step S3, when splitting the scoring points containing multiple scoring requirements in the same sentence, the segmentation position is determined based on the combination of the inter-sentence connector labels and the starting position labels of the scoring sentences in the ten-dimensional label vector subsequence of the corresponding original sentence. The original sentence is then divided into multiple candidate segments of scoring sentences, and an independent scoring point record is generated for each segment.
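
One possible reading of this splitting rule is sketched below: cut the sentence wherever a character labelled as a scoring sentence start directly follows a character labelled as an inter-sentence connector. This adjacency rule, the label names, and the function name are assumptions; the text does not state exactly how the two labels are combined.

```python
def split_sentence(sentence, char_labels):
    """Split one original sentence into candidate segments.

    char_labels holds one set of label names per character, for example
    {"start"}, {"inside"} or {"connector"} (hypothetical names).
    """
    cuts = [
        i
        for i in range(1, len(sentence))
        if "start" in char_labels[i] and "connector" in char_labels[i - 1]
    ]
    bounds = [0] + cuts + [len(sentence)]
    return [sentence[a:b] for a, b in zip(bounds, bounds[1:])]
```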

[0026] Furthermore, in step S3: when valid numerical information is parsed in the corresponding original sentence, the numerical information is written into the specific score field of the scoring point;

[0027] When no valid numerical information is parsed or the parsing result is lower than the preset valid score threshold, the emphasis words are extracted from the corresponding original sentence and mapped to the level names in the preset limited level set, which serve as the emphasis level of the scoring point.
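
The score-or-level decision in the two paragraphs above can be sketched with a regular expression and a small lookup table. The score pattern, the emphasis word lexicon, the three-level set (`major`, `normal`, `minor`), and the 0.5 validity threshold are illustrative assumptions.

```python
import re

# Assumed pattern: a number followed by a scoring unit word.
SCORE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(?:points?|marks?|分)")

# Assumed mapping from emphasis words to a limited level set.
EMPHASIS_LEVELS = {
    "key": "major",
    "must": "major",
    "emphasis": "major",
    "briefly": "minor",
    "brief mention": "minor",
}
DEFAULT_LEVEL = "normal"


def confirm_score_or_level(sentence, min_valid_score=0.5):
    """Return a specific score if one is parsed, otherwise an emphasis level."""
    match = SCORE_PATTERN.search(sentence)
    if match:
        value = float(match.group(1))
        if value >= min_valid_score:
            return {"score": value, "level": None}
    lowered = sentence.lower()
    for word, level in EMPHASIS_LEVELS.items():
        # Whole-word match so that "key" does not fire inside "monkey".
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            return {"score": None, "level": level}
    return {"score": None, "level": DEFAULT_LEVEL}
```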

[0028] Furthermore, in step S4: the scoring point name in the scoring point confirmation result and the preset scoring point name field in the candidate question type scoring template are vectorized and encoded in the same way as the intermediate feature layer of the scoring point extraction model, the similarity between the two is calculated, and the maximum similarity between each scoring point and the preset scoring point name field in each candidate question type scoring template is accumulated to obtain the template matching score.

[0029] Among the candidate templates whose template matching score is greater than the template selection threshold, the one with the highest matching score is selected as the question type scoring template that matches the target subjective question. When all template matching scores are less than the template selection threshold, the default question type scoring template corresponding to the question type is selected.
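
The template selection rule above (sum of per-point maximum similarities, threshold check, default fallback) can be sketched as follows. Cosine similarity and the data layout (a dict from template name to a list of preset name vectors) are assumptions; the vectors themselves would come from the shared encoding described in the text.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def template_match_score(point_vecs, template_vecs):
    """Accumulate, over scoring points, the best similarity to any preset name field."""
    return sum(max(cosine(p, t) for t in template_vecs) for p in point_vecs)


def select_template(point_vecs, candidates, default_name, threshold):
    """Pick the best-scoring template above the threshold, else the default."""
    scores = {
        name: template_match_score(point_vecs, vecs)
        for name, vecs in candidates.items()
        if vecs
    }
    eligible = {name: s for name, s in scores.items() if s > threshold}
    if not eligible:
        return default_name
    return max(eligible, key=eligible.get)
```

Because the score is a sum over scoring points, it grows with the number of points, so a threshold chosen for one rubric length may not transfer to another; normalizing by the number of points would be one possible variation.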

[0030] Furthermore, in step S5: the scoring points are divided into a first set and a second set according to whether they have specific scores. The specific scores of the scoring points in the first set are directly adopted and summed to obtain the total allocated scores. The remaining scores to be allocated to the second set are calculated based on the total scores and the total allocated scores.

[0031] The scoring points in the second set are grouped according to their emphasis level. Based on the recommended proportions for each emphasis level in the question type scoring template, the remaining score is distributed among the groups and then divided evenly within each group according to the number of scoring points, thus obtaining the initial score for each scoring point.

[0032] Sum all initial scores and compare them with the total score. If the two are not equal, scale the scores of each scoring point according to their ratio and round and make necessary fine adjustments according to the smallest scoring unit so that the sum of the final scores equals the total score.
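
The allocation procedure in the three paragraphs above can be sketched as follows. Several details are assumptions: ratios are renormalized over the levels actually present, rounding uses Python's built-in `round`, and the fine adjustment adds or removes one minimum unit at a time starting from the largest scoring point. The function and field names are hypothetical.

```python
def allocate_scores(points, total_score, level_ratios, min_unit=0.5):
    """points: dicts with 'name' and either 'score' (number) or 'level' (str)."""
    fixed = [p for p in points if p.get("score") is not None]   # first set
    graded = [p for p in points if p.get("score") is None]      # second set
    allocated = sum(p["score"] for p in fixed)
    remaining = max(total_score - allocated, 0.0)

    initial = {p["name"]: float(p["score"]) for p in fixed}
    groups = {}
    for p in graded:
        groups.setdefault(p["level"], []).append(p)
    ratio_sum = sum(level_ratios[level] for level in groups)
    for level, members in groups.items():
        group_share = remaining * level_ratios[level] / ratio_sum
        for p in members:
            initial[p["name"]] = group_share / len(members)  # even split in group

    # Uniform scaling to the total, then rounding to whole minimum units.
    raw_sum = sum(initial.values())
    scale = total_score / raw_sum if raw_sum else 0.0
    units = {n: round(v * scale / min_unit) for n, v in initial.items()}

    # Fine adjustment: move one unit at a time until the sum matches the total.
    diff = round(total_score / min_unit) - sum(units.values())
    order = sorted(units, key=lambda n: initial[n], reverse=True)
    i = 0
    while diff != 0 and order:
        name = order[i % len(order)]
        step = 1 if diff > 0 else -1
        if units[name] + step >= 0:
            units[name] += step
            diff -= step
        i += 1
    return {n: u * min_unit for n, u in units.items()}
```

For example, with a total of 10, one point fixed at 4, one `major` point and two `minor` points under ratios `{"major": 2, "minor": 1}`, the remaining 6 is split 4 to the major group and 2 to the minor group, giving 4, 4, 1 and 1.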

[0033] Furthermore, in step S6, the target score in the score allocation result of the scoring points is matched with the corresponding original text sentence of the same scoring point in the score confirmation result. Rule numbers are configured for each scoring point according to the order of the corresponding original text sentences in the scoring rules text. Based on the field structure preset by the subjective question automatic scoring system, the scoring point name is used as the rule identifier, the corresponding original text sentence is used as the scoring basis, the target score is used as the score field, the rule number is used as the execution order field, and the question number is used as the question field. An automatic scoring rule set is generated and provided to the subjective question automatic scoring system for calling one rule at a time according to the rule number.
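
A minimal sketch of the rule set assembly follows. The field names (`question`, `rule_id`, `basis`, `score`, `order`) are placeholders for the field structure preset by the target scoring system, which the text does not enumerate, and locating sentences with `str.find` is a simplification.

```python
import json


def build_rule_set(question_id, rubric_text, records, scores):
    """records: dicts with 'name' and 'sentence'; scores: name -> target score."""
    # Order rules by where each original sentence appears in the rubric text.
    # Sentences that cannot be found (find returns -1) sort first in this sketch.
    ordered = sorted(records, key=lambda r: rubric_text.find(r["sentence"]))
    return [
        {
            "question": question_id,
            "rule_id": r["name"],
            "basis": r["sentence"],
            "score": scores[r["name"]],
            "order": n,
        }
        for n, r in enumerate(ordered, start=1)
    ]


def to_json(rule_set):
    """Serialize the rule set for hand-off to a scoring system."""
    return json.dumps(rule_set, ensure_ascii=False, indent=2)
```

Keeping the original sentence in each rule is what allows a reviewer to trace a score back to the teacher's wording.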

[0034] Secondly, this solution discloses an automatic scoring device for subjective questions, including a memory and a processor. The memory stores a program that can run on the processor. When the program is executed, it causes the processor to perform the method for converting subjective question scoring rules described above.

[0035] Thirdly, this solution discloses a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements a method for converting subjective question scoring details.

[0036] Compared with existing technologies, the subjective question scoring rule conversion method based on LLM and scoring point extraction described in this invention has the following advantages:

[0037] (1) This invention concatenates the scoring rules text and the question stem text at the character level, and adds question type marker text obtained by mapping question type information and score marker text obtained by converting total score. In the input embedding layer, a 256-dimensional feature vector containing position number, punctuation mark, number mark and common scoring term mark is constructed for each character, so that the scoring point extraction model can perceive question type, question stem semantics and total score constraints in a unified character sequence. Compared with the scheme that only processes the original text of the scoring rules in isolation, this invention can more accurately distinguish the scoring point name and weight-related terms when extracting scoring points, significantly reducing missed detections and false detections, and improving the structural integrity and consistency with the teacher's original scoring logic of the initial list of scoring points;

[0038] (2) In the scoring point extraction model, this invention uses a two-layer one-dimensional convolutional neural network with a window length of 3 as the context encoding layer, combined with a 128-dimensional intermediate feature layer and a label output layer with ten output neurons. This outputs a ten-dimensional label vector for each character, indicating whether the character belongs to the scoring point name, emphasis words, score words, the starting position of the scoring sentence, the internal position of the scoring sentence, or inter-sentence connectors. Based on the probability threshold of the starting position of the scoring sentence, the text is linearly traversed to construct candidate segments for the scoring sentence. Compared with existing extraction methods based on fixed sentence segmentation rules or simple line breaks, this invention can automatically identify the start and end boundaries and internal structure of the scoring sentence, stably extracting scoring sentences with complete semantics from teachers' long natural language rules, reducing the workload of manual sentence segmentation and rule configuration.

[0039] (3) For scoring points describing the same scoring requirements, this invention uses character-level embedding and intermediate feature layers consistent with the scoring point extraction model to vectorize the scoring point names. It also calculates the merging judgment value by combining the position intersection-union ratio of the scoring point names in the ten-dimensional label vector subsequence. When the merging judgment value is greater than a preset threshold, the same scoring requirement expressed in multiple sentences is automatically merged. At the same time, the segmentation position is determined based on the combination of the inter-sentence connector label and the starting position label of the scoring sentence in the label subsequence of the corresponding original sentence. The case of multiple scoring requirements in the same sentence is automatically split, and an independent scoring point record is generated for each segment. Compared with the existing technology that only scores according to the whole sentence or completely relies on manual judgment to determine whether to split or merge, this invention realizes deduplication and fine-grained division at the scoring point level. It avoids the same scoring requirement being scored repeatedly and avoids multiple scoring requirements being mixed in one rule, which would lead to an overly coarse scoring category. It better fits the teacher's detailed scoring logic.

[0040] (4) In the process of confirming the scoring points, when valid numerical information is parsed in the corresponding original sentence, the present invention writes the numerical information into the specific score field of the scoring point; when no valid numerical information is parsed or the parsing result is lower than the preset valid score threshold, the emphasis words are extracted from the original sentence and mapped to the level names in the preset limited level set as the emphasis level of the scoring point. Thus, even if the teacher only writes relative descriptions such as "emphasis" or "brief mention", the present invention can standardize them into a calculable level field. Compared with the prior art, which usually either ignores such descriptions or relies on manual experience to assign values to them, the present invention achieves a unified structured expression of score words and emphasis words without changing the teacher's language habits, and improves the robustness of rule generation in scenarios with incomplete information.

[0041] (5) In this invention, the names of the scoring points in the scoring point confirmation results and the preset scoring point name fields in the candidate question type scoring templates are vectorized and encoded in the same way as the intermediate feature layer of the scoring point extraction model. The template matching score is obtained by calculating the similarity between the two and accumulating the "maximum similarity between each scoring point and its internal preset name field" for each candidate template. The highest score among the candidate templates whose matching score is not lower than the template selection threshold is selected as the final question type scoring template. If no candidate meets the threshold, the method falls back to the default template corresponding to the question type. Compared with the method of hard matching only by question type field or relying on teachers to manually select templates, this invention constructs a semantic automatic alignment path between the teacher's natural language scoring rules and the standard question type scoring template library, which significantly reduces the workload of template configuration and maintenance, and ensures that the generated rules are structurally consistent with the existing scoring system.

[0042] (6) This invention first divides the scoring points into a first set and a second set according to whether they have specific scores. For the scoring points in the first set, the specific scores given by the teacher are directly used and summed to obtain the total allocated scores. Then, the remaining scores for the second set are calculated based on the total scores. Subsequently, the second set is grouped according to the importance level. According to the recommended proportions for each importance level in the question type scoring template, the remaining scores are distributed among the level groups, and evenly distributed within each group according to the number of scoring points to obtain the initial scores for each scoring point. Then, all the initial scores are summed and compared with the total score. Through uniform scaling and rounding according to the minimum scoring unit and necessary fine-tuning, the final sum of scores is equal to the total score. Compared with simple averaging or purely manual adjustment, this invention not only retains the specific scores given by the teacher, but also uses the template recommended proportions and importance levels to reflect the relative weight relationship in the allocation of remaining scores. Under the condition of satisfying the total score constraints and minimum scoring unit restrictions, it achieves automatic balance and avoids the overall score imbalance caused by rounding or adjustment.

[0043] (7) This invention matches the target score in the score allocation result with the corresponding original sentence of the same score point in the score confirmation result. It assigns a rule number to each score point according to the order of the original sentence in the score details text. Based on the preset field structure of the subjective question automatic scoring system, it uses the score point name as the rule identifier, the corresponding original sentence as the scoring basis, the target score as the score field, the rule number as the execution order field, and the question number as the question field, generating an automatic scoring rule set that can be directly called by the automatic scoring system in sequence. Compared with existing schemes that only store black-box weights internally or are difficult to trace back to the original scoring instructions, this invention allows each automatic scoring rule to be traced back to the teacher's original statement, facilitating review, interpretation, and manual intervention, while ensuring that the rule set can be seamlessly embedded into the existing subjective question automatic scoring system.

Attached Figure Description

[0044] The accompanying drawings, which form part of this invention, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:

[0045] Figure 1 is a schematic diagram of a subjective question scoring rule conversion method based on LLM and scoring point extraction, as described in an embodiment of the present invention;

[0046] Figure 2 is a schematic diagram illustrating the scoring rules and question information input construction process described in an embodiment of the present invention;

[0047] Figure 3 is a schematic diagram of the scoring point extraction process described in an embodiment of the present invention;

[0048] Figure 4 is a schematic diagram of the scoring point inspection and confirmation process described in an embodiment of the present invention;

[0049] Figure 5 is a schematic diagram of the question type scoring template selection process according to an embodiment of the present invention;

[0050] Figure 6 is a schematic diagram illustrating the automatic conversion of the scoring details text into structured scoring rules according to an embodiment of the present invention;

[0051] Figure 7 is a schematic diagram showing the breakdown of the single-question scoring rules into scoring points according to an embodiment of the present invention;

[0052] Figure 8 is a schematic diagram of the network structure of the scoring point extraction model described in an embodiment of the present invention.

Detailed Implementation

[0053] It should be noted that, unless otherwise specified, the embodiments and features described in the present invention can be combined with each other.

[0054] The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

[0055] This solution discloses a method for converting subjective question scoring criteria based on LLM and scoring point extraction, which mainly includes the following steps:

[0056] S1. Obtain the teacher's scoring details text, as well as the question stem, question type, and total score of the target subjective question. Use a large language model to classify the scoring details according to the question identifier and question type to obtain the scoring details text corresponding to the target subjective question.

[0057] S2. Input the scoring rules text, together with the question stem, question type and total score of the target subjective question, into the scoring point extraction model; identify the scoring point name, corresponding original sentence and score-weight-related words for each of multiple scoring points; and generate an initial list of scoring points.

[0058] S3. Merge and split the initial list of scoring points, and combine the numerical information or emphasis words in the scoring rules text to mark the specific score or emphasis level from the limited set of levels for each scoring point, so as to obtain the scoring point confirmation result.

[0059] S4. Based on the scoring point confirmation result and the question type, select the question type scoring template that matches the target subjective question from the question type scoring template library;

[0060] S5. Based on the scoring point confirmation result, the question type scoring template and the total score, retain the original score for the scoring points with specific scores, allocate the remaining score to the scoring points that have only an emphasis level according to the recommended ratio and quantity, and perform uniform scaling and rounding on all scoring point scores under the constraint of the total score to obtain the scoring point score allocation result.

[0061] S6. Organize the key points of the scoring criteria, their corresponding original sentences, and target scores into structured automatic scoring rules for use by the automatic scoring system for subjective questions.

[0062] In step S1, the specific steps include: classifying the teacher-written scoring rubric text according to the target subjective questions based on LLM, obtaining the original content of the scoring rubric text corresponding to the target subjective questions, and retaining the original order and literal expression of each paragraph and sentence in the scoring rubric text.

[0063] Obtain the question information for the target subjective questions. The question information should include at least the question stem, question type information, and total score information.

[0064] Based on the original content of the scoring rules text and the question type information in the question information, the scoring rules text is marked as the scoring rules text corresponding to the target subjective question type, providing question type constraints for the input of the subsequent scoring point extraction model;

[0065] The original content of the scoring rules text is concatenated with the question stem text at the character level, and the total score and question type information of the target subjective questions are added to form a text input object that can be received by the input embedding layer of the hierarchical neural network structure.

[0066] In step S2, the specific steps include: inputting the scoring details text into the input embedding layer of the scoring point extraction model; splitting the text into sentences and then into a character sequence; and generating, for each character, a 256-dimensional feature vector that encodes its position number in the scoring details text, whether it is a punctuation mark, whether it is a digit, and whether it belongs to a common scoring term, thereby obtaining an input feature sequence arranged in the original order of the scoring details text.
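The raw per-character indicators named above can be sketched as follows. This is a minimal illustration only: the term list `SCORING_TERMS`, the punctuation set, and the function name `char_features` are assumptions, and the projection of these indicators into a 256-dimensional vector by the embedding layer is not shown.

```python
import string

# Hypothetical list of common scoring terms; the source does not enumerate one.
SCORING_TERMS = ("points", "score", "deduct", "award", "full marks")
# ASCII punctuation plus a few full-width marks (illustrative, not exhaustive).
PUNCTUATION = set(string.punctuation) | set(",。;:、!?()")

def char_features(text):
    """Return one raw feature record per character, in original order."""
    # Flag every character that lies inside an occurrence of a scoring term
    # (case-sensitive match, to keep indices aligned with the original text).
    in_term = [False] * len(text)
    for term in SCORING_TERMS:
        start = text.find(term)
        while start != -1:
            for k in range(start, start + len(term)):
                in_term[k] = True
            start = text.find(term, start + 1)
    return [
        {
            "char": ch,
            "position": i,
            "is_punct": ch in PUNCTUATION,
            "is_digit": ch.isdigit(),
            "in_scoring_term": in_term[i],
        }
        for i, ch in enumerate(text)
    ]
```

For example, `char_features("Award 2 points.")` yields one record per character, with the digit flag set for `2` and the scoring-term flag set for each character of `points`.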

[0067] The input feature sequence is input into the context encoding layer of the scoring point extraction model. The structure consists of two layers of one-dimensional convolutional neural networks. In each layer of one-dimensional convolutional neural network, convolutional neurons with a convolution window length of three are used to convolve and nonlinearly transform the feature vectors of adjacent characters to generate a 256-dimensional context feature vector sequence containing semantic information of the current position and its left and right neighboring positions.

[0068] The context feature vector sequence is input into the intermediate feature layer of the scoring point extraction model. A fully connected neural network containing 128 fully connected neurons is used to map the 256-dimensional context feature vector corresponding to each character into a 128-dimensional scoring feature vector sequence, so that the scoring-related role of each character in the scoring details text is centrally represented through the scoring feature vector.

[0069] The scoring feature vector sequence is input into the label output layer of the scoring point extraction model. A fully connected neural network with ten output neurons generates a ten-dimensional label vector for each character. Different dimensions of the label vector indicate whether the character belongs to the scoring point name part, whether it belongs to the emphasis word part, whether it belongs to the score word part, whether it is the starting position of a scoring sentence, whether it is an internal position of a scoring sentence, and whether it belongs to one of the preset inter-sentence connector categories.
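The four layers described so far can be traced with a small dependency-free forward pass. This is a sketch under stated assumptions, not the trained model: the weights are random placeholders, ReLU is assumed as the nonlinear transformation, zero padding is assumed at the sequence ends, and a per-dimension sigmoid is assumed at the output because each label dimension answers a yes/no question. All function names are hypothetical.

```python
import math
import random

def init(rng, n_out, n_in):
    """Random weights and zero biases; a stand-in for trained parameters."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(vec, weights, bias):
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]

def relu(vec):
    return [max(0.0, v) for v in vec]

def conv1d_window3(seq, weights, bias):
    """Window-3 convolution along the character axis, zero-padded at both ends."""
    pad = [0.0] * len(seq[0])
    padded = [pad] + seq + [pad]
    return [
        relu(linear(padded[t] + padded[t + 1] + padded[t + 2], weights, bias))
        for t in range(len(seq))
    ]

def forward(features, d_ctx=256, d_mid=128, d_out=10, seed=0):
    """Map per-character feature vectors to per-character 10-dimensional label vectors."""
    rng = random.Random(seed)
    d_in = len(features[0])
    w1, b1 = init(rng, d_ctx, 3 * d_in)
    w2, b2 = init(rng, d_ctx, 3 * d_ctx)
    w3, b3 = init(rng, d_mid, d_ctx)
    w4, b4 = init(rng, d_out, d_mid)
    h = conv1d_window3(features, w1, b1)      # context encoding, layer 1
    h = conv1d_window3(h, w2, b2)             # context encoding, layer 2
    h = [relu(linear(v, w3, b3)) for v in h]  # intermediate feature layer (128-d)
    logits = [linear(v, w4, b4) for v in h]   # label output layer (10-d)
    # Assumed: independent sigmoid per label dimension.
    return [[1.0 / (1.0 + math.exp(-z)) for z in row] for row in logits]
```

The output keeps one row per input character, so character positions stay aligned with the original text for the traversal that follows. In practice a deep learning framework would replace these loops.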

[0070] Based on the ten-dimensional label vector sequence, the character-level positions in the scoring rules text are linearly traversed. Characters with the starting position label of the scoring sentence are taken as the starting point of the scoring sentence. Characters with the internal position label of the scoring sentence are merged in sequence to form several candidate segments of scoring sentences corresponding to the scoring conditions, while maintaining the original order of the candidate segments of scoring sentences in the scoring rules text.
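The linear traversal can be sketched as below, assuming the start-position and internal-position dimensions of the label vectors have already been thresholded into two boolean lists. The function name and the `(start, end, text)` tuple layout are illustrative choices.

```python
def decode_segments(text, is_start, is_inside):
    """Group characters into candidate scoring-sentence segments, in text order.

    is_start[i] / is_inside[i] are booleans derived from the label vector of
    character i. Returns (start, end, substring) tuples with end exclusive.
    """
    spans = []
    start = None
    for i in range(len(text)):
        if is_start[i]:
            # A new start label closes any segment that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif is_inside[i] and start is not None:
            continue  # extend the open segment
        elif start is not None:
            # Neither start nor inside: the open segment ends here.
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(text)))
    return [(s, e, text[s:e]) for s, e in spans]
```

Because the traversal is a single left-to-right pass, the segments come out in their original order without an extra sort.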

[0071] For each candidate scoring sentence, a scoring point name is generated from the characters in the candidate scoring sentence that carry the scoring point name part label. Emphasis words related to score weight are identified from the characters that carry the emphasis word part label and the score word part label. The complete candidate scoring sentence is used as the corresponding original sentence, yielding a scoring point record that contains the scoring point name, the corresponding original sentence, and the emphasis words.

[0072] The scoring point records are sorted by the order in which their corresponding original sentences appear in the scoring details text, and the scoring point names, corresponding original sentences, and emphasis words are organized in the preset field order to form an initial list of scoring points for subsequent processing.

[0073] In step S3, the specific steps include: aligning the start and end positions of the corresponding original sentences of each scoring point in the initial list of scoring points with the ten-dimensional label vector sequence output by the scoring point extraction model on the scoring details text, mapping each scoring point into a continuous ten-dimensional label vector subsequence, and providing a unified feature representation for subsequent checks and adjustments to the scoring requirements.

[0074] Based on the scoring point name, corresponding original sentence, and ten-dimensional label vector subsequence for each scoring point, the scoring point name is vectorized and encoded, and the similarity value between any two scoring point names is calculated. When the similarity value is greater than the preset merging threshold and the label distribution of the scoring point name in the corresponding original sentence meets the semantic pattern of the same scoring requirement, the two scoring points are merged to generate a new scoring point, and the corresponding original sentences are concatenated and merged.
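A minimal sketch of the merge step follows. The source does not specify the vector encoding of names, so character-bigram counts with cosine similarity are used here as a stand-in; the additional check on the label distribution is omitted, and the default threshold of 0.8 is an arbitrary illustrative value.

```python
import math
from collections import Counter

def bigram_vector(name):
    """Stand-in name encoding: counts of character bigrams (an assumption)."""
    s = name.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def merge_points(points, threshold=0.8):
    """Greedily merge records whose names are more similar than the threshold.

    Each record is a dict with 'name' and 'sentence'. Merged records keep the
    first name and concatenate the original sentences in order.
    """
    merged = []
    for point in points:
        for kept in merged:
            if cosine(bigram_vector(point["name"]), bigram_vector(kept["name"])) > threshold:
                kept["sentence"] = kept["sentence"] + " " + point["sentence"]
                break
        else:
            merged.append(dict(point))
    return merged
```

The greedy pass compares each record only with records already kept, which preserves the original order of the list.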

[0075] For each original sentence corresponding to a scoring point in the initial list of scoring points, based on the combination of inter-sentence connector labels and the starting position labels of the scoring sentences in the ten-dimensional label vector subsequence obtained by mapping, the connection positions indicating different scoring requirements within the same corresponding original sentence are identified. When the number of connection positions is greater than the preset splitting threshold, the corresponding original sentence is divided into multiple candidate segments of scoring sentences, and an independent scoring point name and scoring point record are generated for each candidate segment of scoring sentences.
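The split rule can be illustrated with a short helper. It assumes the connection positions have already been read from the label subsequence as character offsets within the original sentence; the function name and the default threshold of zero are illustrative.

```python
def split_sentence(sentence, connection_positions, split_threshold=0):
    """Split a sentence at connection positions when there are enough of them.

    connection_positions are character offsets where a new scoring requirement
    begins. The sentence is returned unsplit when the number of usable
    positions does not exceed split_threshold.
    """
    positions = sorted(p for p in set(connection_positions) if 0 < p < len(sentence))
    if len(positions) <= split_threshold:
        return [sentence]
    bounds = [0] + positions + [len(sentence)]
    parts = [sentence[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [part for part in parts if part]
```

Each returned part would then receive its own scoring point name and record, as the paragraph above describes.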

[0076] The scoring point records obtained through merging and splitting are matched with the scoring rules text again. For the original sentence corresponding to each scoring point record, characters with score word part tags are extracted from the ten-dimensional label vector subsequence. The corresponding specific score is parsed by combining the numerical information in the scoring rules text and written into the score field of the scoring point record.

[0077] For scoring point records from which no characters with the score word part label are extracted, or whose parsed specific score is lower than the preset effective score threshold, characters with the emphasis word part label are extracted from the corresponding original sentence. The emphasis words are matched against the level names in the preset finite level set to obtain the emphasis level of the scoring point record, which is then written into the emphasis level field of the record.
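The score-or-level decision can be sketched with plain pattern matching in place of the label-driven extraction. The emphasis vocabulary, the two-level set, the regular expression, and the threshold value are all assumptions made for illustration.

```python
import re

# Hypothetical emphasis vocabulary mapped to a finite level set.
EMPHASIS_LEVELS = {
    "key": "major", "must": "major", "essential": "major",
    "briefly": "minor", "optional": "minor",
}
# Matches a number followed by "point(s)" or "mark(s)" (illustrative pattern).
SCORE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(?:points?|marks?)", re.IGNORECASE)

def annotate(sentence, min_valid_score=0.5):
    """Write either a specific score or an emphasis level into a record."""
    record = {"sentence": sentence, "score": None, "level": None}
    match = SCORE_PATTERN.search(sentence)
    if match and float(match.group(1)) >= min_valid_score:
        record["score"] = float(match.group(1))
        return record
    # No valid score: fall back to the first recognized emphasis word.
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word in EMPHASIS_LEVELS:
            record["level"] = EMPHASIS_LEVELS[word]
            break
    return record
```

A record with neither field set would need manual review or a default level; the source does not say which.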

[0078] For a scoring point record that has both a specific score field and an emphasis level field, the specific score serves as the record's primary score information, and the emphasis level serves as the record's priority mark in subsequent score ratio adjustments. This constrains the relative weight relationships in the scoring point confirmation result without conflicting with the specific scores.

[0079] All scoring point records that have been merged and split and have a specific score field or an emphasis level field written are arranged in the order of their corresponding original sentences in the scoring details text, forming a scoring point confirmation result that includes the scoring point name, the corresponding original sentence, and the specific score or emphasis level.

[0080] In step S4, the specific steps include: taking all the names of the scoring points in the scoring point confirmation results and the question types in the question information as input, filtering out multiple candidate question type scoring templates that match the question type field in the question type scoring template library, and obtaining a candidate set of question type scoring templates that match the target subjective question type.

[0081] Each scoring point name in the scoring point confirmation result and the preset scoring point name field in each candidate question type scoring template are vectorized and encoded in the same way as the intermediate feature layer of the scoring point extraction model. The similarity value between the scoring point name and the preset scoring point name field is calculated, and the template matching score for each candidate question type scoring template is accumulated.

[0082] The template matching score of each candidate question type scoring template is compared with the preset template selection threshold. The candidate question type scoring template with the highest template matching score and a template matching score not lower than the template selection threshold is selected as the question type scoring template that matches the target subjective question. When all template matching scores are lower than the template selection threshold, the default question type scoring template corresponding to the question type is selected as the question type scoring template that matches the target subjective question.
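Template selection with a threshold and a default fallback can be sketched as follows. The template layout (`preset_names`) and the function names are hypothetical, each template's matching score is accumulated as the sum of the best similarity found for each scoring point name, and the default `string_similarity` uses `difflib` as a stand-in for the vector-based similarity described in the text.

```python
from difflib import SequenceMatcher

def string_similarity(a, b):
    """Stand-in similarity in [0, 1]; the source uses vectorized encodings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def select_template(point_names, candidates, default_template, threshold,
                    similarity=string_similarity):
    """Pick the best-matching candidate, or the default if none reaches the threshold.

    Each candidate is a dict with a 'preset_names' list. The matching score of a
    candidate accumulates, over all scoring point names, the highest similarity
    to any of its preset names.
    """
    best, best_score = None, float("-inf")
    for template in candidates:
        score = sum(
            max((similarity(name, preset) for preset in template["preset_names"]),
                default=0.0)
            for name in point_names
        )
        if score > best_score:
            best, best_score = template, score
    if best is not None and best_score >= threshold:
        return best
    return default_template
```

Passing the similarity function as a parameter keeps the selection logic independent of how names are encoded.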

[0083] In step S5, the specific steps include: dividing the scoring points in the scoring point confirmation result according to whether they have a specific score; recording the scoring points that have a specific score directly with that score; summing the scores of all scoring points that have a specific score to obtain the total allocated score; and then calculating, from the total score in the question information and the total allocated score, the remaining score to be allocated to the scoring points that have only an emphasis level.

[0084] The scoring points that have only an emphasis level are grouped by emphasis level. The remaining score is distributed among the emphasis levels according to the recommended proportions for each level in the question type scoring template. Within each emphasis level, that level's share is divided according to the number of scoring points at that level to obtain the initial score of each scoring point that has only an emphasis level.

[0085] The initial scores of all scoring points are summed, and the sum is compared with the total score in the question information. When the two are not equal, the initial scores of all scoring points are scaled proportionally according to the ratio of the total score to the sum of the initial scores, and the scaled scores are rounded according to the preset minimum scoring unit to obtain the target score that meets the total score constraint.
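The allocation arithmetic of step S5 can be sketched with exact fractions. Several choices here are assumptions rather than details from the source: the recommended proportions are normalized over the levels actually present, each level's share is split equally among its points, a negative remainder is clamped to zero, and any residual left after rounding is not redistributed. Scoring point names are assumed unique.

```python
from fractions import Fraction

def allocate(points, total, ratios, unit="0.5"):
    """Allocate target scores under a total-score constraint.

    points: dicts with 'name' and either 'score' (specific) or 'level'.
    ratios: recommended proportion per emphasis level, e.g. {"major": 2, "minor": 1}.
    unit:   minimum scoring unit used for rounding.
    """
    total_f, unit_f = Fraction(str(total)), Fraction(unit)
    fixed_sum = sum(Fraction(str(p["score"])) for p in points
                    if p.get("score") is not None)
    remaining = max(total_f - fixed_sum, Fraction(0))

    # Group level-only points by emphasis level.
    by_level = {}
    for p in points:
        if p.get("score") is None:
            by_level.setdefault(p["level"], []).append(p["name"])
    ratio_sum = sum(Fraction(str(ratios[level])) for level in by_level)

    initial = {}
    for p in points:
        if p.get("score") is not None:
            initial[p["name"]] = Fraction(str(p["score"]))
        else:
            share = remaining * Fraction(str(ratios[p["level"]])) / ratio_sum
            initial[p["name"]] = share / len(by_level[p["level"]])

    # Scale uniformly when the initial scores do not add up to the total.
    current = sum(initial.values())
    if current != total_f and current > 0:
        initial = {k: v * total_f / current for k, v in initial.items()}

    # Round each score to the nearest multiple of the minimum scoring unit.
    return {k: float(round(v / unit_f) * unit_f) for k, v in initial.items()}
```

As a worked case: with a total of 10, one point fixed at 4, and proportions major:minor = 2:1 over one major and two minor points, the remaining 6 splits into 4 for the major point and 1 for each minor point, so no scaling is needed. With two points fixed at 6 each and a total of 10, the sum of 12 is scaled by 10/12, giving 5 each.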

[0086] The target score for each scoring point is associated with the corresponding scoring point name, and the scoring points are arranged in the order of their confirmation in the scoring point confirmation results to form the scoring point score allocation result.

[0087] In step S6, the specific steps include: matching the target score of each scoring point in the scoring point score allocation result with the scoring point name and corresponding original sentence of the same scoring point in the scoring point confirmation result, and combining the matched scoring point name, corresponding original sentence and target score to form an automatic scoring rule record.

[0088] All automatic scoring rule records are sorted according to the order of the corresponding original sentences in the scoring rules text, and each automatic scoring rule record is assigned a rule number within the target subjective question, which is used to constrain the order in which the automatic scoring system calls the automatic scoring rule records.

[0089] Based on the input field conventions of the automatic scoring system for subjective questions, the scoring point name, corresponding original sentence, and target score in each automatic scoring rule record are mapped to a scoring rule field structure that can be recognized by the automatic scoring system for subjective questions, forming an automatic scoring rule set that corresponds one-to-one with the target subjective question.

[0090] The automatic scoring rule set, along with the question information of the target subjective question, is provided to the subjective question automatic scoring system. When scoring students' answers, the subjective question automatic scoring system calls the automatic scoring rule set one by one according to the rule number.
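The assembly of the rule set in step S6 can be sketched as a small function. The field names (`rule_no`, `point_name`, `original_sentence`, `score`) and the `offset` key used for ordering are hypothetical; the actual field structure depends on the input conventions of the automatic scoring system.

```python
import json

def build_rule_set(question_id, confirmed, allocation):
    """Combine confirmed scoring points and target scores into numbered rules.

    confirmed:  records with 'name', 'sentence', and 'offset' (position of the
                original sentence in the scoring details text).
    allocation: mapping from scoring point name to target score.
    """
    ordered = sorted(confirmed, key=lambda record: record["offset"])
    rules = [
        {
            "rule_no": number,  # call order within the question
            "point_name": record["name"],
            "original_sentence": record["sentence"],
            "score": allocation[record["name"]],
        }
        for number, record in enumerate(ordered, start=1)
    ]
    return {"question_id": question_id, "rules": rules}

def to_json(rule_set):
    """Serialize the rule set for hand-off to a downstream scoring system."""
    return json.dumps(rule_set, ensure_ascii=False, indent=2)
```

Keeping the original sentence next to each score lets every awarded point be traced back to the teacher's wording.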

[0091] As shown in Figure 1, the overall process of the subjective question scoring rule conversion method based on a large language model includes: obtaining the teacher's scoring details text, along with the question stem, question type, and total score of the target subjective question; classifying the scoring details by question identifier and question type using the large language model to obtain the scoring details text corresponding to the target subjective question; inputting the scoring details text, along with the question stem, question type, and total score, into the scoring point extraction model to identify the scoring point names, corresponding original sentences, and words related to score weight for multiple scoring points, generating an initial list of scoring points; and merging and splitting the initial list of scoring points and labeling each scoring point, based on numerical information or emphasis words in the scoring details text, with a specific score or an emphasis level from a finite set of levels, to obtain the scoring point confirmation result. Based on the scoring point confirmation result and the question type, a question type scoring template matching the target subjective question is selected from the question type scoring template library. Then, based on the scoring point confirmation result, the question type scoring template, and the total score, the original scores of scoring points with specific scores are retained, while the remaining score is allocated among scoring points with only an emphasis level according to the recommended proportion and quantity. Under the total score constraint, all scoring point scores are uniformly scaled and rounded to obtain the scoring point score allocation result. Finally, the scoring point names, corresponding original sentences, and target scores are organized into structured automatic scoring rules for use by the automatic scoring system for subjective questions.

[0092] As shown in Figure 2, the process of constructing the input object for the scoring point extraction model includes: reading the scoring details text associated with the target subjective questions from the teaching management system and aggregating it by target subjective question; classifying the scoring details by question identifier, preserving the paragraph order and original wording, and correcting the classification results based on question information constraints and question type information; obtaining the question stem, question type, and total score from the question bank to form the question information corresponding to the scoring details; mapping the question type to question type marker text and logically binding it to the scoring details text; and concatenating the question type marker, original scoring details text, question stem text, and score marker in character order within a unified text input object. Once input preparation is complete, the constructed input object is sent to the input embedding layer of the scoring point extraction model for subsequent processing.

[0093] As shown in Figure 3, the processing flow for extracting scoring points is as follows. After the scoring details text is input into the extraction model, it is split by sentence and by character to generate a character sequence. An input feature vector is constructed for each character; the features may include positional markers and indicators for punctuation, digits, and scoring terms, and together they form the input to context encoding. The feature sequence is fed into the context encoding layer, where convolution extracts neighborhood semantic information and outputs a context feature sequence. The intermediate feature layer maps the context features to a scoring feature vector for each character. The label output layer outputs label vectors that distinguish categories such as scoring point name, emphasis word, score word, and connector, yielding character labels and initial probabilities. Using the labels and initial probabilities, the characters are traversed to obtain candidate segments of scoring sentences, and segmentation continues until the whole text has been processed. The scoring point name is then extracted from each segment, emphasis words are identified, and the corresponding original sentence is retained to construct the list of scoring point records. Finally, the scoring point records are arranged in original text order to form the initial list of scoring points.

[0094] As shown in Figure 4, the process of confirming scoring points and writing scores or emphasis levels is as follows. The ten-dimensional label sequence of the scoring details text and the initial list of scoring points are read as input, and the scoring points are aligned with the label sequence. Based on the start and end positions of the corresponding original sentences, a label subsequence is extracted for each scoring point to form a unified feature representation. The scoring point names are vectorized and encoded, and the name similarity together with the intersection-over-union (IoU) ratio of label positions is used to determine whether merging is necessary, yielding merged scoring points. Inter-sentence connectors are identified in the label subsequence, and sentences containing multiple scoring requirements are segmented at the scoring sentence start labels, yielding split scoring points. The new segments can be re-aligned with the labels for iterative processing. During linear traversal, specific scores are parsed and written based on the score word labels and numerical information. If no valid score is obtained, or the score is below the threshold, emphasis words are extracted and mapped to an emphasis level in the finite level set, and only the emphasis level is written. For records that have both a score and an emphasis level, a priority mark is set. The scoring point confirmation result is output in original text order.

[0095] As shown in Figure 5, the selection process for the question type scoring template is as follows. The scoring point confirmation result and the question type information are read, and all templates are loaded from the question type scoring template library. Templates are filtered by the question type field to form a candidate template set consistent with the target question type. The scoring point names and the preset point names in the candidate templates are vectorized and encoded in the same way, and similarity is calculated in a unified feature space. A template matching score is calculated for each candidate template; it can be obtained by accumulating the maximum similarity found for each scoring point. Each template matching score is compared with the selection threshold to determine whether any template scores at or above the threshold. When at least one template meets the threshold condition, the one with the highest score is selected as the question type scoring template. When all template scores are below the threshold, the default scoring template for that question type is selected. The question type scoring template matching the target subjective question is output for use in the subsequent score allocation steps.

[0096] As shown in Figure 6, this invention automatically converts scoring details written by teachers in natural language into structured scoring rules. The left side of the figure shows an unstructured, lengthy scoring description; the middle section represents the intelligent extraction and conversion module based on the LLM and neural networks; and the right side displays a list of clearly defined scoring points and their corresponding scores. By automatically identifying scoring points, splitting and merging rules, and allocating scores reasonably, this invention reduces the manual processing workload, standardizes scoring criteria across different teachers, provides directly applicable standardized rules for automatic scoring systems for subjective questions, and improves scoring efficiency and objectivity.

[0097] As shown in Figure 7, this invention processes the scoring rules for a single subjective question as follows: the upper part of the figure consists of the question stem and the complete scoring details written by the teacher, while the lower part comprises multiple structured scoring points automatically broken down by the system, each assigned a specific score. By transforming lengthy and ambiguous natural language rules into operable, fine-grained scoring units, this invention makes the scoring criteria more refined, visible, and consistent, reduces the workload of manual breakdown, and provides a clear and stable rule foundation for subsequent automatic scoring and process analysis.

[0098] As shown in Figure 8, the scoring point extraction model is structured as follows. The model takes a text sequence as input, including the scoring details, question stem, question type marker, and total score marker. The input embedding layer converts each character into a 256-dimensional feature vector, which may include positional encoding, punctuation and digit markers, and common scoring term markers. The context encoding layer consists of two one-dimensional convolutional layers with a kernel length of 3 and 256 channels, used to extract contextual semantic features from each character's neighborhood. The intermediate feature layer is a character-by-character fully connected layer with an output dimension of 128, used to form a 128-dimensional scoring feature for each character. The label output layer is a character-by-character fully connected layer containing 10 output neurons, used to output label information related to scoring point names, emphasis words, score words, the start or internal position of a scoring sentence, inter-sentence connectors, and other categories. From the label output, a 10-dimensional label vector is obtained for each character, which is then used to identify the start and boundary of each scoring sentence, extract scoring point names, and identify emphasis words and score words, producing the labels and scoring segments used for subsequent scoring point construction.

[0099] A specific implementation example of this solution is as follows:

[0100] The specific implementation method of step S1 is as follows:

[0101] Step S1 primarily provides the scoring point extraction model with text input objects closely corresponding to the target subjective question, enabling the hierarchical neural network structure to simultaneously perceive the original content of the scoring rubric text, the stem text of the target subjective question, question type information, and total score information during subsequent processing. In practice, the scoring rubric text written by the teacher and the question information of the target subjective question are first stored in the teaching management system, and then linked together using a unified question identifier field. Based on this, the sub-steps are executed sequentially.

[0102] First, the scoring rubrics written by teachers are categorized according to the target subjective questions.

[0103] Specifically, the system assigns a unique question identifier to each target subjective question and binds the sections of scoring details text entered or imported by the teacher to this question identifier, forming several scoring details text records. For each record corresponding to a target subjective question, its complete content is read and recorded as the original content of the scoring details text. During reading, the paragraph order and the wording within sentences are not rewritten: paragraphs are connected sequentially in the order the teacher entered them, and the original punctuation marks, sequence numbers, and prompts are preserved. The arrangement of the character sequence therefore stays consistent with the teacher's scoring logic, providing a reliable basis for subsequent label prediction in character order.

[0104] Obtain the question information for the target subjective question. Specifically, the question stem text corresponding to the question identifier is read from the question bank or test paper structure. The question type information for the question is read at the same time; it can be an essay question, calculation question, proof question, fill-in-the-blank question, or another of several predefined categories. The total score information for the question in the exam paper is then read; the total score is a non-negative number. By associating the question stem text, question type information, and total score information with the original content of the scoring details text described above, the subsequent hierarchical neural network structure can combine the semantics of the question stem with the question type and total score constraints when extracting scoring points and interpreting their weights.

[0105] Based on the original content of the scoring details text and the question type information in the question information, the scoring details text is marked as the scoring details text corresponding to the target subjective question type. Specifically, the system maintains a mapping between question type information and question type marker strings, and maps the question type information to a question type marker text segment that can be processed at the character level; for example, a fixed-format description including the question type name and question type code can be used. The question type marker text is logically bound to the original content of the scoring details text, so that when the scoring point extraction model subsequently performs sentence segmentation and character-level label prediction, it can identify from the question type marker whether the current scoring details text expresses, for example, argumentative or computational scoring requirements. This allows a semantic bias consistent with the question type to be applied when generating scoring point names and identifying emphasis words.

[0106] The original content of the scoring details text is concatenated with the question stem text at the character level, and the total score information and question type information of the target subjective question are appended, forming the text input object received by the input embedding layer of the hierarchical neural network structure.

[0107] Specifically, the question type marker text obtained by mapping the question type information is placed at the beginning of the text input object, followed by the original content of the scoring details text. A separator text indicating the start position of the question stem is then inserted, followed by the question stem text. Finally, the total score information is converted into a fixed-format score marker text and appended to the end. These operations yield a complete text input object which, at the character level, successively contains the question type marker text, the original content of the scoring details text, the separator text, the question stem text, and the score marker text.
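The concatenation order can be sketched in a few lines. The marker formats, the separator string, and the question type codes below are hypothetical; the source only states that fixed-format texts are used.

```python
# Hypothetical fixed-format markers; the source does not give their exact form.
QUESTION_TYPE_TAGS = {
    "essay": "[TYPE:essay|code=01]",
    "calculation": "[TYPE:calculation|code=02]",
    "proof": "[TYPE:proof|code=03]",
    "fill-in-the-blank": "[TYPE:fill-in-the-blank|code=04]",
}
STEM_SEPARATOR = "[STEM]"

def build_input(scoring_details, stem, question_type, total_score):
    """Concatenate the five parts of the text input object in character order."""
    if total_score < 0:
        raise ValueError("total score must be non-negative")
    type_tag = QUESTION_TYPE_TAGS[question_type]
    score_tag = f"[TOTAL:{total_score:g}]"
    # Order: type marker, scoring details (unchanged), separator, stem, score marker.
    return type_tag + scoring_details + STEM_SEPARATOR + stem + score_tag
```

The scoring details text is inserted verbatim, so character offsets within it can be recovered by subtracting the length of the question type marker.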

[0108] The text input object is then fed directly into the input embedding layer of the scoring point extraction model. The input embedding layer splits it by sentence and by character and generates, for each character, a 256-dimensional feature vector containing information such as its position number, whether it is a punctuation mark, whether it is a digit, and whether it belongs to a common scoring term, forming the input feature sequence. Because the text input object contains the original content of the scoring details, the question stem, the question type marker text, and the score marker text, the context encoding layer, when performing one-dimensional convolution on the input feature sequence, can model within a single character sequence both the semantic relationship between the scoring details and the question stem and the constraining effect of the question type and total score on scoring point extraction. This provides a unified, computable text foundation for generating the initial list of scoring points in step S2 and the scoring point confirmation result in step S3.

[0109] The specific implementation method of step S2 is as follows:

[0110] A scoring point extraction model is proposed to predict character-level labels in the scoring details text and construct an initial list of scoring points based on this.

[0111] The scoring details text is the original natural language text composed of multiple sentences. During the model inference phase, it is taken as the input to the scoring point extraction model, which passes it sequentially through an input embedding layer, a context encoding layer, an intermediate feature layer, and a label output layer and generates a ten-dimensional label vector for each character in the text. Candidate segments of scoring sentences are then generated from the label sequence of consecutive characters, and the scoring point names, corresponding original sentences, and emphasis words are extracted from these candidate segments to form the initial list of scoring points.

[0112] Specifically, the scoring details text is fed into the input embedding layer of the scoring point extraction model. The text is first divided into sentence segments, and each segment is expanded into characters, yielding a global character sequence with a known total number of characters. For each character in the sequence, a 256-dimensional feature vector is constructed that contains the character's positional identifier in the scoring details text, whether it is a punctuation mark, whether it is a digit, and whether it belongs to a common scoring term. These features are encoded by independent embedding neurons and then concatenated or linearly combined along the feature dimension into a real-valued vector of length 256. The feature vectors of all characters are arranged in the original character order to obtain the input feature sequence, which the input embedding layer outputs to the context encoding layer.

[0113] The input feature sequence $X$ is fed to the context encoding layer of the scoring point extraction model. This layer consists of two sequentially connected one-dimensional convolutional neural networks. The first has 256 convolutional neurons in the feature dimension, and each neuron uses a convolution kernel with a window length of three along the character sequence dimension. It applies convolution and a nonlinear transformation to $x_i$ and its adjacent vectors $x_{i-1}$ and $x_{i+1}$ to obtain the first-layer context feature vector sequence $H^{(1)} = (h^{(1)}_1, \dots, h^{(1)}_n)$, where the $i$-th context feature vector $h^{(1)}_i$ is a 256-dimensional vector containing local semantic information of the current position and its left and right neighboring positions.

[0114] $H^{(1)}$ is fed to the second one-dimensional convolutional neural network, which also contains 256 convolutional neurons and uses a convolution window length of three. It applies convolution and a nonlinear transformation to each $h^{(1)}_i$ and its adjacent positions to obtain the second-layer context feature vector sequence $H^{(2)}$, whose $i$-th context feature vector is denoted $h^{(2)}_i$. In this embodiment, the second-layer output sequence is denoted $H = (h_1, \dots, h_n)$, and the $i$-th context feature vector $h_i$ comprehensively expresses the semantic role of character $c_i$ within its left and right neighboring context.

[0115] The context feature vector sequence $H$ is fed to the intermediate feature layer of the scoring point extraction model. This layer is a fully connected neural network with 128 fully connected neurons. It applies a linear transformation and a nonlinear activation to each context feature vector $h_i$ to obtain a 128-dimensional scoring feature vector $f_i$, forming the scoring feature vector sequence $F = (f_1, \dots, f_n)$, where the $i$-th scoring feature vector is $f_i$. The scoring feature vector $f_i$ gives a concentrated representation of the scoring-related function of character $c_i$ in the scoring details text $T$, such as belonging to a scoring point name, an emphasis word, a score word, or an inter-sentence connector, and compresses semantic information from different context positions into a unified scoring feature space.
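The layer stack described above (two window-3 convolutions followed by a position-wise fully connected layer) can be sketched in plain Python. This is a minimal illustration only: the function names, the random weights, the zero padding at the sequence ends, and the use of ReLU as the nonlinearity are all assumptions, since the text does not specify them. The defaults follow the stated sizes (256 and 128), but smaller sizes are advisable when trying it out.

```python
import random


def conv1d_same(seq, filters, bias):
    """Window-3 1-D convolution over a sequence of vectors (zero padding assumed)."""
    d_in = len(seq[0])
    zero = [0.0] * d_in
    padded = [zero] + list(seq) + [zero]
    out = []
    for i in range(len(seq)):
        window = padded[i:i + 3]  # left neighbour, current position, right neighbour
        vec = []
        for filt, b in zip(filters, bias):
            s = b
            for w_row, x_row in zip(filt, window):
                s += sum(w * x for w, x in zip(w_row, x_row))
            vec.append(max(0.0, s))  # ReLU: assumed nonlinearity
        out.append(vec)
    return out


def dense(seq, weights, bias):
    """Position-wise fully connected layer followed by ReLU (assumed)."""
    return [[max(0.0, b + sum(w * x for w, x in zip(row, vec)))
             for row, b in zip(weights, bias)] for vec in seq]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def encode(x_seq, d_ctx=256, d_feat=128, seed=0):
    """Map input feature vectors to scoring feature vectors (untrained weights)."""
    rng = random.Random(seed)
    d_in = len(x_seq[0])
    k1 = [rand_matrix(rng, 3, d_in) for _ in range(d_ctx)]   # first conv layer
    k2 = [rand_matrix(rng, 3, d_ctx) for _ in range(d_ctx)]  # second conv layer
    w3 = rand_matrix(rng, d_feat, d_ctx)                     # intermediate layer
    h1 = conv1d_same(x_seq, k1, [0.0] * d_ctx)
    h2 = conv1d_same(h1, k2, [0.0] * d_ctx)
    return dense(h2, w3, [0.0] * d_feat)
```

The output has one vector per input character, so the character-level alignment needed by the label output layer is preserved.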

[0116] The scoring feature vector sequence $F$ is fed to the label output layer of the scoring point extraction model. The label output layer is a fully connected neural network with ten output neurons that maps each scoring feature vector $f_i$ to a ten-dimensional label vector $y_i$, giving the ten-dimensional label vector sequence $Y = (y_1, \dots, y_n)$, where the $i$-th label vector is $y_i$. Each dimension of $y_i$ corresponds to a predefined category, including the scoring point name category, the emphasis word category, the score word category, the scoring sentence start position category, the scoring sentence internal position category, and the inter-sentence connector category. To introduce an explicit probability measure when identifying the scoring sentence start position category, this embodiment calculates the probability for that category in the label output layer as follows:

[0117] $$p_i^{\mathrm{start}} = \frac{\exp\left(w_{\mathrm{start}}^{\top} f_i + b_{\mathrm{start}}\right)}{\sum_{k=1}^{K} \exp\left(w_k^{\top} f_i + b_k\right)};$$

[0118] where $p_i^{\mathrm{start}}$ is the probability that character $c_i$ is classified into the scoring sentence start position category; $f_i$ is the 128-dimensional scoring feature vector corresponding to character $c_i$; $w_{\mathrm{start}}$ is the weight vector in the label output layer corresponding to the scoring sentence start position category; $b_{\mathrm{start}}$ is the bias scalar corresponding to the scoring sentence start position category; $w_k$ is the weight vector corresponding to the $k$-th output neuron of the label output layer; $b_k$ is the bias scalar corresponding to the $k$-th output neuron; $k$ is the output neuron index; $K$ is the total number of output neurons in the label output layer; and $i$ is the position index of character $c_i$ in the character sequence $C$.

[0119] Through the above probability calculation, the label output layer outputs not only the main label category of each character but also a continuous probability value for the scoring sentence start position category, which provides a numerical basis for determining the start positions of the candidate scoring sentence segments.
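The start position probability described above is a softmax over the outputs of the label output layer. A minimal sketch follows, assuming the weights are given as a list of per-neuron weight vectors and that `start_index` (a hypothetical parameter) identifies the neuron for the scoring sentence start position category. Subtracting the maximum logit is a standard numerical-stability step and does not change the result.

```python
import math


def start_probability(f_i, weights, biases, start_index):
    """Softmax probability of the start category for one scoring feature vector."""
    logits = [sum(w * x for w, x in zip(w_k, f_i)) + b_k
              for w_k, b_k in zip(weights, biases)]
    m = max(logits)                        # stabilise the exponentials
    exps = [math.exp(z - m) for z in logits]
    return exps[start_index] / sum(exps)
```

With all weights and biases at zero and ten output neurons, every category receives probability 0.1, which is a quick sanity check on the normalisation.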

[0120] Based on the ten-dimensional label vector sequence $Y$ and the scoring sentence start probability $p_i^{\mathrm{start}}$, the scoring details text $T$ is traversed linearly at character-level positions to construct candidate scoring sentence segments.

[0121] Specifically, scanning starts from character $c_1$ and proceeds sequentially. When position $i$ is reached, the main label category of character $c_i$ is first determined from the output of the label output layer. When the main label category is the scoring sentence start position category and $p_i^{\mathrm{start}}$ is greater than or equal to the preset start threshold $\tau_{\mathrm{start}}$, character $c_i$ is taken as the starting character of a new candidate scoring sentence segment and a new scoring sentence buffer is opened. The traversal then continues forward, appending characters whose main label category is the scoring sentence internal position category to the current buffer, until the next character satisfying the start condition is encountered or the end of the character sequence is reached. At that point, the consecutive characters in the buffer are concatenated into a candidate scoring sentence segment, and the sequential index of the segment is recorded according to its start position $i$. By fusing the main label category with the start probability $p_i^{\mathrm{start}}$, the linear traversal yields a set of candidate scoring sentence segments corresponding to the scoring conditions. Each candidate segment is a piece of the original scoring details text that is continuous at the character level and has complete scoring semantics.
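The linear traversal above can be sketched as a single pass with one buffer. In this sketch the main label of each character is assumed to be available as a string (`"start"`, `"inside"`, or anything else), and `build_segments` and `tau` are hypothetical names. Characters that carry neither label are skipped, which is a literal reading of the paragraph.

```python
def build_segments(chars, main_labels, start_probs, tau=0.5):
    """Return (start_index, text) pairs for candidate scoring sentence segments."""
    segments = []
    buf, start = [], None
    for i, (ch, lab, p) in enumerate(zip(chars, main_labels, start_probs)):
        if lab == "start" and p >= tau:
            if buf:                          # close the previous segment
                segments.append((start, "".join(buf)))
            buf, start = [ch], i             # open a new buffer
        elif lab == "inside" and start is not None:
            buf.append(ch)
    if buf:                                  # flush at end of sequence
        segments.append((start, "".join(buf)))
    return segments
```

A character labeled `"start"` whose probability falls below `tau` does not open a new segment, which is how the probability acts as a second check on the label.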

[0122] Each candidate scoring sentence segment is then analyzed at a fine-grained level to generate a scoring point record.

[0123] Specifically, for a given candidate scoring sentence segment, its character subsequence $(c_s, \dots, c_e)$ is read together with the label subsequence $(y_s, \dots, y_e)$ over the same interval.

[0124] First, characters whose main label category is the scoring point name category are selected from the subsequence and concatenated in the order in which they appear in the segment to generate the scoring point name text, which summarizes the scoring content the teacher focuses on in that candidate segment. Next, characters whose main label category is the emphasis word category or the score word category are selected from the same subsequence and concatenated in their original order to obtain the emphasis word text, which expresses the relative importance of this scoring point compared with the other scoring points in the scoring details text.

[0125] At the same time, the original character subsequence of the entire candidate scoring sentence segment is recorded as the corresponding original sentence. For each candidate segment, a scoring point record is constructed from three items: the scoring point name, the corresponding original sentence, and the emphasis words, so that each record carries a complete description of the scoring semantics together with weight hints.

[0126] The scoring point records are sorted by the order in which their corresponding original sentences appear in the scoring details text $T$.

[0127] During sorting, the start character index of the original sentence corresponding to each scoring point record is used as the sorting key, and the records are arranged in ascending order so that their order matches the narrative order of the teacher's scoring details. The scoring point name field, corresponding original sentence field, and emphasis word field of each record are then organized in the preset field order, and the organized set of records is saved as the initial list of scoring points. Each record in the initial list originates from the unified label output sequence of the scoring point extraction model. It retains the original sentence in the teacher's natural language, describes the scoring requirement in structured form through the scoring point name and emphasis words, and relies on the scoring sentence start probability $p_i^{\mathrm{start}}$ to keep the initial boundaries of the candidate scoring sentences reliable. This provides a clear and computable basis for merging scoring requirements, splitting scoring requirements, and labeling scores and severity levels from the initial list of scoring points in subsequent steps.
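The record construction and ordering described above can be sketched as follows. The label strings (`"name"`, `"emphasis"`, `"score"`), the dictionary field names, and the function names are assumptions made for illustration; the text only fixes which three items a record carries and that records are sorted by the start character index of their original sentence.

```python
def build_record(segment_chars, segment_labels, start_index):
    """Build one scoring point record from a candidate segment and its main labels."""
    pairs = list(zip(segment_chars, segment_labels))
    name = "".join(c for c, lab in pairs if lab == "name")
    # Emphasis text collects both emphasis-word and score-word characters.
    emphasis = "".join(c for c, lab in pairs if lab in ("emphasis", "score"))
    return {
        "name": name,
        "sentence": "".join(segment_chars),  # corresponding original sentence
        "emphasis": emphasis,
        "start": start_index,                # sorting key
    }


def initial_list(records):
    """Order records by where their original sentence starts in the text."""
    return sorted(records, key=lambda r: r["start"])
```

Keeping the start index inside the record means the later alignment step can recover the label subsequence without searching the text again.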

[0128] The specific implementation method of step S3 is as follows:

[0129] Based on the initial list of scoring points and the scoring details text, the scoring points are checked and adjusted to obtain the confirmed scoring point result. The scoring details text is denoted $T$ and is the original natural language text input to the scoring point extraction model in step S2. The ten-dimensional label vector sequence output by the scoring point extraction model on $T$ is denoted $Y$, where the $i$-th label vector $y_i$ is the ten-dimensional label vector corresponding to the character at position index $i$. The initial list of scoring points is denoted $G_0$. Each record includes at least the scoring point name, the corresponding original sentence, and the emphasis words, and records the start and end positions of the corresponding original sentence in $T$.

[0130] Each scoring point in the initial list of scoring points $G_0$ is aligned with the ten-dimensional label vector sequence $Y$.

[0131] Specifically, for each scoring point record in $G_0$, the start and end character position indices of its corresponding original sentence in $T$ are read and denoted $s_j$ and $e_j$, where $s_j$ and $e_j$ are integers satisfying $s_j \le e_j$. According to $s_j$ and $e_j$, the label subsequence $(y_{s_j}, \dots, y_{e_j})$ is extracted from $Y$ and bound to the current scoring point record as the ten-dimensional label vector subsequence of that scoring point.

[0132] Based on this approach, each scoring point in the initial list of scoring points is mapped to a continuous ten-dimensional label vector subsequence, providing a unified feature representation for merging and splitting scoring requirements.

[0133] Based on the name of each scoring point, the corresponding original sentence, and the ten-dimensional label vector subsequence, scoring points describing the same scoring requirements are merged.

[0134] Specifically, each scoring point name in the initial list of scoring points $G_0$ is treated as a short text sequence and vectorized using the same character embedding method as the input embedding layer and context encoding layer of the scoring point extraction model, giving a vector representation of the name. The name vector of the scoring point with index $a$ is denoted $v_a$, and the name vector of the scoring point with index $b$ is denoted $v_b$; $v_a$ and $v_b$ are both 128-dimensional real vectors.

[0135] At the same time, in the ten-dimensional label vector subsequences corresponding to the original sentences of the two scoring points, the position indices whose main label category is the scoring point name category are collected. The index set for the scoring point with index $a$ is denoted $P_a$, and the index set for the scoring point with index $b$ is denoted $P_b$.

[0136] On this basis, the merge judgment value $M_{ab}$ of the two scoring points is calculated according to the following formula:

[0137] $$M_{ab} = \lambda \cdot \frac{v_a^{\top} v_b}{\lVert v_a \rVert \, \lVert v_b \rVert} + (1 - \lambda) \cdot \frac{\lvert P_a \cap P_b \rvert}{\lvert P_a \cup P_b \rvert};$$

[0138] where $M_{ab}$ is the merge judgment value of the two scoring points with indices $a$ and $b$; $\lambda$ is a weighting coefficient between 0 and 1; $v_a$ is the name vector of the scoring point with index $a$; $v_b$ is the name vector of the scoring point with index $b$; $v_a^{\top} v_b$ is the inner product of the transpose of $v_a$ with $v_b$; $\lVert v_a \rVert$ is the Euclidean norm of $v_a$; $\lVert v_b \rVert$ is the Euclidean norm of $v_b$; $P_a$ is the set of position indices, in the ten-dimensional label vector subsequence of the scoring point with index $a$, whose main label category is the scoring point name category; $P_b$ is the corresponding set for the scoring point with index $b$; $\lvert P_a \cap P_b \rvert$ is the number of positions in the intersection of $P_a$ and $P_b$; and $\lvert P_a \cup P_b \rvert$ is the number of positions in their union. $M_{ab}$ is compared with a preset merge threshold $\theta_{\mathrm{merge}}$. When $M_{ab}$ is greater than $\theta_{\mathrm{merge}}$, the scoring points with indices $a$ and $b$ are judged to describe the same scoring requirement. In this case, the two scoring points are merged to generate a new scoring point record. The new scoring point name is obtained by concatenating the two original scoring point names or by selecting the name text with the more complete semantic meaning. The corresponding original sentence is obtained by concatenating the two original sentences in the order of their start position indices, and the ten-dimensional label vector subsequence is obtained by connecting the two original subsequences according to character position to form a new continuous subsequence.
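The merge judgment described above combines a cosine similarity of the two name vectors with an overlap ratio of the two position index sets. The sketch below assumes the combination is a convex weighting by a coefficient `lam` in [0, 1]; the function names and the handling of zero vectors and empty sets are illustrative choices, not taken from the text.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def merge_value(v_a, v_b, pos_a, pos_b, lam=0.5):
    """Weighted sum of name-vector similarity and position-set overlap."""
    union = pos_a | pos_b
    overlap = len(pos_a & pos_b) / len(union) if union else 0.0
    return lam * cosine(v_a, v_b) + (1.0 - lam) * overlap


def should_merge(v_a, v_b, pos_a, pos_b, threshold, lam=0.5):
    """Two scoring points are merged when the value exceeds the threshold."""
    return merge_value(v_a, v_b, pos_a, pos_b, lam) > threshold
```

For identical name vectors and position sets `{1, 2, 3}` and `{2, 3, 4}`, the value with `lam=0.5` is 0.5 * 1 + 0.5 * (2 / 4) = 0.75.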

[0139] The original sentence corresponding to each scoring point in the initial list of scoring points is then examined for splitting, in order to identify multiple scoring requirements contained in the same original sentence.

[0140] Specifically, for a given scoring point record, its ten-dimensional label vector subsequence is read and scanned linearly. The set of position indices whose main label category is the inter-sentence connector category is identified, and for each such position it is checked whether a position whose main label category is the scoring sentence start position category occurs within the several characters that follow.

[0141] When the number of connector positions detected within the same corresponding original sentence exceeds the preset splitting threshold $\theta_{\mathrm{split}}$, the original sentence is divided into multiple candidate scoring sentence segments, with a segmentation boundary placed between each connector position and the start position of the scoring sentence that follows it.

[0142] For each candidate scoring sentence segment, an independent scoring point name is generated from the characters in the segment whose main label category is the scoring point name category, and the original character sequence of the segment is used as the new corresponding original sentence. At the same time, the corresponding interval is extracted from the original ten-dimensional label vector subsequence as the new ten-dimensional label vector subsequence, so that an independent scoring point record is generated for each candidate scoring sentence segment.
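The splitting step above can be sketched as a scan for connector positions that are followed shortly by a start position. The label strings, the `lookahead` window (standing in for "several characters"), and the choice to cut exactly at the following start position are assumptions for illustration.

```python
def split_sentence(chars, labels, split_threshold=0, lookahead=3):
    """Split one original sentence at connector-then-start boundaries."""
    boundaries = []
    for i, lab in enumerate(labels):
        if lab != "connector":
            continue
        # Look for a start label within the next few characters.
        for j in range(i + 1, min(i + 1 + lookahead, len(labels))):
            if labels[j] == "start":
                if not boundaries or boundaries[-1] != j:
                    boundaries.append(j)
                break
    if len(boundaries) <= split_threshold:
        return ["".join(chars)]              # not enough connectors: keep whole
    pieces, prev = [], 0
    for b in boundaries:
        pieces.append("".join(chars[prev:b]))
        prev = b
    pieces.append("".join(chars[prev:]))
    return pieces
```

Each returned piece would then be turned into its own scoring point record, using the matching slice of the label subsequence.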

[0143] The scoring points obtained after merging and splitting are compared again with the scoring details text $T$ in order to parse their specific scores.

[0144] Specifically, for each scoring point record, the start and end positions of its corresponding original sentence in $T$ are read, and the latest ten-dimensional label vector subsequence is extracted from $Y$ according to this interval. During the linear traversal of this label subsequence, the character position indices whose main label category is the score word category are collected into a set of candidate positions, and the characters at these positions are read from the corresponding original sentence. For numeric characters near the candidate positions, consecutive digits are merged from left to right and parsed into a specific score in integer or decimal form.

[0145] For composite score descriptions such as full marks and deductions, consecutively appearing numbers and units are combined into multiple score candidates. The candidate that best matches the context of the scoring point name and emphasis words is then selected as the specific score for that scoring point.

[0146] When the parsed specific score is valid and greater than the preset valid score threshold $\theta_{\mathrm{score}}$, it is written into the score field of the scoring point record.
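The score parsing above can be sketched for the simple (non-composite) case: take the span of characters labeled as score words, widen it over adjacent digits, and parse the first number found. The label string `"score"`, the function name, and the decision to return `None` when no valid score is found are assumptions; composite descriptions such as deductions are not handled here.

```python
import re


def parse_score(sentence, labels, valid_threshold=0.0):
    """Parse a specific score near the score-word positions, or return None."""
    positions = [i for i, lab in enumerate(labels) if lab == "score"]
    if not positions:
        return None
    lo, hi = min(positions), max(positions) + 1
    # Widen the window over digits or decimal points adjacent to the span.
    while lo > 0 and (sentence[lo - 1].isdigit() or sentence[lo - 1] == "."):
        lo -= 1
    while hi < len(sentence) and (sentence[hi].isdigit() or sentence[hi] == "."):
        hi += 1
    match = re.search(r"\d+(?:\.\d+)?", sentence[lo:hi])
    if match is None:
        return None
    value = float(match.group())
    return value if value > valid_threshold else None
```

A record for which this returns `None` would fall through to the severity-level path instead of receiving a score field.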

[0147] For scoring point records from which no characters labeled as score words were extracted, or whose parsed score is below the preset valid score threshold $\theta_{\mathrm{score}}$, the emphasis words are extracted from the corresponding original sentence and its ten-dimensional label vector subsequence and mapped to a finite set of levels.

[0148] Specifically, the ten-dimensional label vector subsequence is scanned, and the character position indices whose main label category is the emphasis word category are collected into a candidate set of emphasis words. The character fragments at these positions are read in order from the corresponding original sentence to form the emphasis word text, which includes words expressing relative importance such as important, general, minor, and appropriate.

[0149] The system predefines a finite set of levels, whose level names may include three or more levels such as high, medium, and low. A mapping is defined from each type of emphasis word to a specific level name. The identified emphasis word text is looked up in this mapping, and the level name found is written into the severity level field of the scoring point record. If multiple emphasis words are identified for the same scoring point record, the level name with the highest priority in the mapping is recorded.
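The lookup above amounts to a dictionary mapping plus a priority rule. In the sketch below, the particular word-to-level assignments and the numeric priorities are hypothetical examples; the text only states that such a mapping exists and that the highest-priority level wins when several emphasis words are found.

```python
# Hypothetical mapping for illustration; a real deployment would configure its own.
LEVEL_PRIORITY = {"high": 1, "medium": 2, "low": 3}
WORD_TO_LEVEL = {
    "important": "high",
    "general": "medium",
    "minor": "low",
    "appropriate": "low",
}


def severity_level(emphasis_words):
    """Return the highest-priority level among the recognised emphasis words."""
    levels = [WORD_TO_LEVEL[w] for w in emphasis_words if w in WORD_TO_LEVEL]
    if not levels:
        return None
    return min(levels, key=LEVEL_PRIORITY.get)
```

Unrecognised words are ignored rather than raising an error, so a record with no known emphasis word simply has no severity level assigned.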

[0150] For a scoring point record that already has both a specific score field and a severity level field, the specific score is used as the main score information for the scoring point record, and the severity level is used as the priority marker for the scoring point record in subsequent score ratio adjustments.

[0151] Specifically, a priority marker field is added to the scoring point record. Priority values are assigned according to the order of severity within the finite set of levels; for example, "high level" corresponds to first priority, "medium level" to second priority, and "low level" to third priority. When the score proportions are adjusted according to the question type scoring template, the specific score acts as a constraint, while the priority marker controls the direction and magnitude of the adjustment. This avoids excessively reducing the specific scores of high-level scoring points, so the relative weighting relationships in the confirmed scoring point result are constrained while the specific scores remain free of conflict.

[0152] All scoring points that have been merged and split and have had a specific score field or severity level field written are arranged in the order in which their corresponding original sentences appear in the scoring details text $T$, forming the confirmed scoring point result.

[0153] Specifically, the start character position index of each scoring point's corresponding original sentence in $T$ is used as the sorting key, and the records are sorted in ascending order. After sorting, each scoring point record retains four core fields: scoring point name, corresponding original sentence, score, and severity level. The sorted set of scoring point records is denoted $G$ and constitutes the confirmed scoring point result. In the subsequent steps, $G$ is combined with the question type scoring template and the total score in the question information to calculate the scoring point scores and automatically generate the scoring rules.

[0154] The specific implementation method of step S4 is as follows:

[0155] This step is mainly used to select a question type scoring template from the question type scoring template library that matches the confirmed scoring points for the target subjective question.

[0156] The confirmed scoring point result obtained in step S3 is denoted $G$. Each record contains at least the scoring point name, the corresponding original sentence, the specific score field, and the severity level field. The question information of the target subjective question is denoted $Q$ and includes a question type field and other information associated with the target subjective question. The question type scoring template library is denoted $\mathcal{B}$. Each question type scoring template contains at least a question type field and several preset scoring point name fields, and may also include a recommended score ratio field corresponding to each preset scoring point name field.

[0157] All scoring point names in the confirmed scoring point result are combined with the question type in the question information, and the question type scoring templates that match the question type field are selected from the template library $\mathcal{B}$ to form a candidate set. Specifically, the question type field of the target subjective question is read from $Q$ and compared one by one with the question type field of each question type scoring template in $\mathcal{B}$. A template is added to the candidate set only when the two fields are identical. The candidate set of question type scoring templates is denoted $\mathcal{B}_c$. After the traversal is complete, every template in $\mathcal{B}_c$ has a question type field consistent with that of the target subjective question, and template matching score calculation and template selection are carried out only within $\mathcal{B}_c$.

[0158] The name of each scoring point in the confirmed scoring point result $G$ and the preset scoring point name fields of each template in the candidate set $\mathcal{B}_c$ are vectorized using the same encoding as the intermediate feature layer of the scoring point extraction model. Similarity values between the scoring point names and the preset scoring point name fields are then calculated to obtain a template matching score for each candidate question type scoring template.

[0159] Specifically, all scoring points in $G$ are numbered in record order, the total number of scoring points is denoted $N$, and the text of the $n$-th scoring point name is denoted $a_n$, where $n$ is an integer from 1 to $N$. $a_n$ is split at the character level, a 256-dimensional feature vector is generated for each character by the input embedding layer, and the character sequence is then mapped to a 128-dimensional vector representation, denoted $u_n$, using the same context encoding layer and intermediate feature layer as the scoring point extraction model. The question type scoring templates in the candidate set $\mathcal{B}_c$ are numbered, and the $t$-th template is denoted $B_t$, where $t$ is a positive integer. The preset scoring point name fields in $B_t$ are numbered sequentially, their number is denoted $M_t$, and the text of the $m$-th preset scoring point name field is denoted $d_{t,m}$. $d_{t,m}$ is vectorized in the same way as $a_n$ to obtain a 128-dimensional vector representation, denoted $v_{t,m}$. After all $u_n$ and $v_{t,m}$ have been obtained, the template matching score $S_t$ of each question type scoring template $B_t$ is calculated using the following formula:

[0160] $$S_t = \sum_{n=1}^{N} \max_{1 \le m \le M_t} \frac{u_n^{\top} v_{t,m}}{\lVert u_n \rVert \, \lVert v_{t,m} \rVert};$$

[0161] where $S_t$ is the template matching score of the $t$-th question type scoring template $B_t$; $N$ is the number of scoring point names in the confirmed scoring point result $G$; $M_t$ is the number of preset scoring point name fields in the $t$-th question type scoring template $B_t$; $u_n$ is the 128-dimensional vector representation corresponding to the $n$-th scoring point name $a_n$; $v_{t,m}$ is the 128-dimensional vector representation corresponding to the $m$-th preset scoring point name field $d_{t,m}$ in $B_t$; $u_n^{\top} v_{t,m}$ is the inner product of the transpose of $u_n$ with $v_{t,m}$; $\lVert u_n \rVert$ is the Euclidean norm of $u_n$; $\lVert v_{t,m} \rVert$ is the Euclidean norm of $v_{t,m}$; $n$ is the scoring point index; $m$ is the preset scoring point name field index; and $t$ is the question type scoring template index.

[0162] Based on the above calculation, for each scoring point name $a_n$, the field with the highest similarity value among all preset scoring point name fields of the $t$-th question type scoring template $B_t$ is selected, and this value is taken as the matching contribution of $a_n$ to $B_t$. The matching contributions of all scoring point names are then summed to obtain the template matching score $S_t$, which measures, in a unified vector space, the degree of structural match of the question type scoring template $B_t$.

[0163] Based on the template matching score $S_t$ and a preset template selection threshold, the final question type scoring template matching the target subjective question is selected. The template selection threshold is denoted $\theta_{\mathrm{tpl}}$ and is a pre-configured scalar. After the template matching scores $S_t$ of all templates in the candidate set $\mathcal{B}_c$ have been calculated, $\mathcal{B}_c$ is traversed and the template matching score $S_t$ of each template is compared with $\theta_{\mathrm{tpl}}$. Templates whose $S_t$ is greater than or equal to $\theta_{\mathrm{tpl}}$ are added to an optional set, and the template with the largest $S_t$ in the optional set is taken as the question type scoring template matching the target subjective question. If the template matching scores $S_t$ of all templates are less than $\theta_{\mathrm{tpl}}$, the default question type scoring template corresponding to the question type field of the target subjective question is retrieved from the template library $\mathcal{B}$ and denoted $B_{\mathrm{def}}$, and $B_{\mathrm{def}}$ is used as the question type scoring template matching the target subjective question.
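The candidate filtering, matching score, and threshold-based selection described in step S4 can be sketched together. The dictionary layout of a template (`"type"`, `"field_vecs"`), the `default_by_type` lookup, and the function names are assumptions for illustration. The score is taken as a plain sum of best-match cosine similarities, following the description that the matching contributions are summed, and each template is assumed to have at least one preset field.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def template_score(point_vecs, field_vecs):
    """Sum, over scoring point names, of the best similarity to any preset field."""
    return sum(max(cosine(u, v) for v in field_vecs) for u in point_vecs)


def select_template(question_type, point_vecs, library, default_by_type, threshold):
    # Keep only templates whose question type field is identical.
    candidates = [t for t in library if t["type"] == question_type]
    scored = [(template_score(point_vecs, t["field_vecs"]), t) for t in candidates]
    eligible = [(s, t) for s, t in scored if s >= threshold]
    if not eligible:
        return default_by_type[question_type]   # fall back to the default template
    return max(eligible, key=lambda st: st[0])[1]
```

Because the score is an unnormalised sum, a sensible threshold depends on the number of scoring points; that dependency is a property of this sketch's reading of the formula.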

[0164] Through the above process, while ensuring the consistency of the question type fields, the name of the scoring point in the confirmation result of the scoring point is vectorized and matched with the preset scoring point name field in the question type scoring template library. A unified template matching score is then used for quantitative comparison, thereby determining a question type scoring template that can reflect the structural characteristics of the target subjective questions and the distribution of scoring points. This provides a reliable structural basis for calculating the scoring point scores based on the recommended score ratio in the question type scoring template.

[0165] The specific implementation method of step S5 is as follows:

[0166] Step S5 assigns a specific score to each scoring point based on the confirmed scoring point result and the question information, forming the scoring point score allocation result. The confirmed scoring point result obtained in step S3 is denoted $G$; each record contains at least the scoring point name, the corresponding original sentence, the specific score field, and the severity level field. The question information of the target subjective question is denoted $Q$; the total score of the question is read from $Q$ and denoted $S_{\mathrm{total}}$, a real number. The question type scoring template matching the target subjective question is denoted $B^{*}$. In $B^{*}$, the system presets a recommended ratio field for each severity level, and the recommended ratio is a real number that can be used directly in multiplication.

[0167] The scoring points in the confirmed scoring point result are divided according to whether they have a specific score field. Specifically, the scoring point records in $G$ are traversed. Scoring points that have a specific score field with a valid value are grouped into set $I_1$, and scoring points that have only a severity level field and no valid specific score field are grouped into set $I_2$, where $I_1$ and $I_2$ are index sets. For each scoring point record in $I_1$, its specific score field is read and the score is denoted $g_n$, where $n$ is the scoring point index and $g_n$ is a real number; $g_n$ is recorded directly as the current score of that scoring point.

[0168] After set $I_1$ has been traversed, all $g_n$ are summed to obtain the total allocated score, denoted $S_{\mathrm{fixed}}$, a real number. $S_{\mathrm{total}}$ is then compared with $S_{\mathrm{fixed}}$. When $S_{\mathrm{total}}$ is greater than $S_{\mathrm{fixed}}$, the difference between the two is the score that can be assigned to the scoring points in set $I_2$, and this remaining score is denoted $S_{\mathrm{rem}} = S_{\mathrm{total}} - S_{\mathrm{fixed}}$. When $S_{\mathrm{total}}$ is less than or equal to $S_{\mathrm{fixed}}$, $S_{\mathrm{rem}}$ is set to 0, and in the subsequent process only the specific scores are scaled uniformly so that the final scores do not exceed the total score of the question.
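The partition and remaining-score computation above can be sketched as follows. Records are assumed to be dictionaries whose `"score"` entry is `None` when no valid specific score exists; the field and function names are hypothetical.

```python
def remaining_score(records, total):
    """Split records by whether they carry a specific score and compute what is left."""
    fixed = {i: r["score"] for i, r in enumerate(records)
             if r.get("score") is not None}
    level_only = [i for i, r in enumerate(records) if r.get("score") is None]
    allocated = sum(fixed.values())
    # Nothing is left to distribute when the fixed scores already reach the total.
    remaining = total - allocated if total > allocated else 0.0
    return fixed, level_only, allocated, remaining
```

For example, with fixed scores of 4 and 3 on a 10-point question, 3 points remain for the scoring points that only have a severity level.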

[0169] The scoring points in set $I_2$, which have only a severity level field, are grouped by severity level. Specifically, the finite set of levels predefined by the system is read, and the set of level names is denoted $E$; for example, "high level", "medium level", and "low level" are all elements of $E$. For each scoring point record in $I_2$, its severity level field is read and the scoring point is assigned to the corresponding group according to the severity level name.

[0170] For a given severity level $e$ in $E$, the set of indices of all scoring points whose severity level is $e$ is denoted $I_{2,e}$, a subset of $I_2$. The number of scoring points in $I_{2,e}$ is counted and denoted $m_e$, a positive integer.

[0171] The recommended ratio corresponding to severity level $e$ is read from the question type scoring template $B^{*}$ and denoted $r_e$, a real number between 0 and 1. After $m_e$ and $r_e$ have been obtained in this way for every severity level name $e$, the remaining score $S_{\mathrm{rem}}$ is allocated among the severity levels according to the recommended ratios $r_e$. The remaining score within the group of severity level $e$ is calculated and denoted $S_e$, a real number, such that $S_e$ is proportional to $r_e$. Within severity level $e$, $S_e$ is then subdivided according to the number of scoring points $m_e$: it is divided equally by $m_e$ so that every scoring point in the group receives the same initial score, denoted $q_e = S_e / m_e$, a real number. After this process has been carried out for all severity levels in set $E$, every scoring point in set $I_2$ has been assigned an initial score $q_e$.
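The level-based allocation above can be sketched as a two-stage division: first across severity levels in proportion to their recommended ratios, then equally within each level. Normalising by the sum of the ratios of the levels actually present is an assumption made here so that the whole remaining score is distributed; the text only states that each group's share is proportional to its ratio. The function and parameter names are hypothetical.

```python
def allocate_by_level(levels, ratios, remaining):
    """levels: point id -> level name; ratios: level name -> recommended ratio."""
    groups = {}
    for pid, lvl in levels.items():
        groups.setdefault(lvl, []).append(pid)
    ratio_sum = sum(ratios[lvl] for lvl in groups)   # only levels that occur
    scores = {}
    for lvl, members in groups.items():
        group_share = remaining * ratios[lvl] / ratio_sum
        for pid in members:
            scores[pid] = group_share / len(members)  # equal split within the level
    return scores
```

With ratios 0.6 for "high" and 0.2 for "low", one high point and two low points, and 8 remaining points, the high point receives 6 and each low point receives 1.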

[0172] The initial scores of all scoring points are summed, and the sum is compared with the total score $S_{\mathrm{total}}$ in the question information.

[0173] Specifically, for the scoring points in the set with specific scores, the aforementioned specific score is taken as the initial score; for the scoring points in the set that only have a severity level, the initial score obtained above is taken as the initial score. The initial scores are summed over all scoring point indexes to obtain the sum of initial scores, which is a real number.

[0174] The sum of initial scores is compared with the total score. When the two are equal, all initial scores are used directly as the target scores. When the two are not equal, the ratio of the total score to the sum of initial scores is used as a uniform scaling factor, which is a real number, and the initial score of each scoring point is multiplied by this factor to obtain the scaled score. To satisfy the minimum scoring unit constraint, the minimum scoring unit is read from the system configuration; it is a positive real number.

[0175] Each scaled score is rounded to the nearest integer multiple of the minimum scoring unit. The rounded score is a real number and is an integer multiple of the minimum scoring unit.

[0176] Because rounding may cause the sum of all rounded scores to differ slightly from the total score, all rounded scores are summed after rounding. If the sum is not exactly equal to the total score, then, based on the magnitude and sign of the difference, several records that have a higher severity level and a non-zero score are selected from the scoring points, and their scores are finely adjusted in steps of the minimum scoring unit to gradually eliminate the difference, until the sum of all rounded scores is exactly equal to the total score. The target score set that satisfies the total score constraint is thereby obtained.
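The scaling, rounding, and fine adjustment of paragraphs [0174] to [0176] can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function name `finalize_scores` and the `priority` argument (indexes ordered from the most to the least important scoring point) are hypothetical, ties are rounded half up, and the sum of initial scores is assumed to be positive.

```python
import math
from fractions import Fraction


def finalize_scores(initial, total_score, unit, priority):
    """Scale, round, and adjust initial scores to match total_score.

    priority: indexes of scoring points, highest severity first
              (an assumed way to pick the records to adjust).
    """
    total = Fraction(str(total_score))
    unit = Fraction(str(unit))
    if total % unit != 0:
        raise ValueError("total score must be a multiple of the unit")
    init = [Fraction(str(s)) for s in initial]
    s = sum(init, Fraction(0))  # assumed to be positive

    # Uniform scaling factor: total score / sum of initial scores.
    factor = Fraction(1) if s == total else total / s
    # Round each scaled score to the nearest multiple of the unit.
    rounded = [math.floor(x * factor / unit + Fraction(1, 2)) * unit
               for x in init]

    # Remove the residual difference one unit at a time.
    diff = total - sum(rounded, Fraction(0))
    step = unit if diff > 0 else -unit
    while diff != 0:
        adjustable = [i for i in priority if rounded[i] > 0]
        if not adjustable:
            raise ValueError("no non-zero scoring point to adjust")
        for i in adjustable:
            if diff == 0:
                break
            rounded[i] += step
            diff -= step
    return rounded


print([float(x) for x in finalize_scores([3, 3, 3], 10, 1, [0, 1, 2])])
# -> [4.0, 3.0, 3.0]
```

In the usage line, each score of 3 is scaled by 10/9 and rounded back to 3, so the sum is 9; one unit is then added to the highest-priority scoring point to reach the total score of 10.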

[0177] The target score of each scoring point is associated with the corresponding scoring point name, and the scoring points are arranged in the order in which they appear in the scoring point confirmation result, to form the score allocation result for the scoring points. Specifically, the target score obtained after rounding and any necessary adjustment is written into the same record structure as the scoring point name field of each scoring point record, so as to maintain a one-to-one correspondence between the scoring point name, the corresponding original sentence, the target score, and the severity level.

[0178] All records are sorted according to the original sequential index of the scoring points, starting from the scoring point with the smallest index. The sorted set of records is the score allocation result for the scoring points. The score allocation result can be provided directly to the subsequent automatic scoring module, which scores student answers step by step according to the scoring point name and target score.

[0179] The specific implementation method of step S6 is as follows:

[0180] Step S6 is used to establish a correspondence between the score allocation results of the scoring points and the confirmation results of the scoring points, and to convert the correspondence into a set of automatic scoring rules that can be directly called by the automatic scoring system for subjective questions.

[0181] In the scoring point confirmation result, each record contains at least the following fields: scoring point name, corresponding original sentence, specific score, and severity level. Each record also contains the starting index of the corresponding original sentence within the scoring details text. In the scoring point score allocation result, each record contains at least a scoring point name field and a target score field for that scoring point. The scoring details text is the raw natural language text that was input to the scoring point extraction model in the preceding steps. The question information of the target subjective question includes a question number field and a question type field.

[0182] The automatic scoring system for subjective questions is a software system that can receive scoring rules and automatically score students' answers.

[0183] The target score of each scoring point in the score allocation result is matched with the same scoring point in the scoring point confirmation result.

[0184] Specifically, the records in the score allocation result are traversed sequentially. For the record reached at a given position, where the position is a positive integer index, the scoring point name of the record and its target score, a real number, are read. String matching on the "scoring point name" field is then performed in the scoring point confirmation result, to search for records whose scoring point name is identical to the given name. For a matching record, the matched scoring point name and the corresponding original sentence are recorded. When multiple records have the same name, the record appearing first in the scoring point confirmation result is used as the matching result. An automatic scoring rule record is then created from the three items, namely the matched scoring point name, the corresponding original sentence, and the target score, which are written into the scoring point name field, the corresponding original sentence field, and the target score field of the record respectively.

[0185] All automatic scoring rule records are collected into a set, which is an unsorted set of automatic scoring rule records. For a record in the score allocation result that has a name field but for which no record with exactly the same name is found in the scoring point confirmation result, the system marks the record as abnormal and outputs an alarm, without writing it into the set of automatic scoring rule records.
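The name-based matching of paragraphs [0183] to [0185] can be sketched in Python as follows. The dictionary keys (`name`, `sentence`, `target_score`) and the function name are hypothetical, and the alarm is represented simply by a returned list of unmatched names.

```python
def build_rule_records(allocation, confirmed):
    """Join score allocation records with confirmation records by name.

    Returns (rules, anomalies): rules is the unsorted list of automatic
    scoring rule records; anomalies lists names with no exact match.
    """
    # Keep only the first confirmation record for each name, so that
    # duplicates resolve to the record that appears first.
    first_by_name = {}
    for rec in confirmed:
        first_by_name.setdefault(rec["name"], rec)

    rules, anomalies = [], []
    for item in allocation:
        match = first_by_name.get(item["name"])  # exact string match
        if match is None:
            anomalies.append(item["name"])  # flagged, not written
            continue
        rules.append({
            "name": match["name"],
            "sentence": match["sentence"],
            "target_score": item["target_score"],
        })
    return rules, anomalies


confirmed = [
    {"name": "cause", "sentence": "State the cause (3 points)."},
    {"name": "cause", "sentence": "A later duplicate sentence."},
    {"name": "effect", "sentence": "Explain the effect."},
]
allocation = [
    {"name": "cause", "target_score": 3},
    {"name": "effect", "target_score": 2},
    {"name": "summary", "target_score": 1},
]
rules, anomalies = build_rule_records(allocation, confirmed)
print(len(rules), anomalies)  # -> 2 ['summary']
```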

[0186] All automatic scoring rule records are sorted according to the order in which their corresponding original sentences appear in the scoring details text, and each automatic scoring rule is assigned a rule number within the target subjective question. Specifically, for each automatic scoring rule record, the corresponding original sentence is read from the record, and the scoring point confirmation result is searched for the record whose corresponding original sentence field equals that sentence. The starting character position index of that sentence in the scoring details text is retrieved from the record; it is a non-negative integer. With the starting indexes as the sorting key, the automatic scoring rule records are stably sorted from smallest to largest to obtain a sorted sequence. After sorting, each automatic scoring rule record is assigned a rule number according to the sorted order; the rule numbers are integers that increment from the starting value, and each rule number is written into the rule number field of the automatic scoring rule record. After the above processing, the set of sorted automatic scoring rule records with the rule number field is obtained. This set is used to constrain the sequential calling order of the automatic scoring rule records in the automatic scoring system for subjective questions.
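The ordering and numbering of paragraph [0186] can be sketched as follows. The keys `sentence` and `start`, the field `rule_no`, and the function name are hypothetical; the starting value of the rule numbers is not legible in the source, so `start=1` below is only an assumed default. Python's `sorted` is a stable sort, which matches the stable sorting required by the text.

```python
def number_rules(rules, confirmed, start=1):
    """Sort rule records by sentence start index and assign rule numbers.

    rules: automatic scoring rule records with a 'sentence' field.
    confirmed: confirmation records with 'sentence' and 'start', the
               character offset of the sentence in the details text.
    start: first rule number (assumed default, not from the source).
    """
    # Look up the start index of each original sentence; the first
    # record wins if a sentence occurs more than once.
    start_index = {}
    for rec in confirmed:
        start_index.setdefault(rec["sentence"], rec["start"])

    # sorted() is stable: records with equal keys keep their order.
    ordered = sorted(rules, key=lambda r: start_index[r["sentence"]])
    return [dict(r, rule_no=n) for n, r in enumerate(ordered, start)]


confirmed = [
    {"name": "effect", "sentence": "Explain the effect.", "start": 40},
    {"name": "cause", "sentence": "State the cause.", "start": 0},
]
rules = [
    {"name": "effect", "sentence": "Explain the effect.", "target_score": 2},
    {"name": "cause", "sentence": "State the cause.", "target_score": 3},
]
for r in number_rules(rules, confirmed):
    print(r["rule_no"], r["name"])
# -> 1 cause
# -> 2 effect
```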

[0187] According to the input field convention of the automatic scoring system for subjective questions, the scoring point name, the corresponding original sentence, and the target score in each automatic scoring rule record are mapped to a scoring rule field structure that the system can recognize, forming the automatic scoring rule set corresponding to the target subjective question.

[0188] Specifically, the automatic scoring system for subjective questions predefines the field names and field types in the scoring rule data structure, including the rule identifier field, the scoring basis field, the score field, the execution order field, and the question number field. For each automatic scoring rule record, the rule number, the scoring point name, the corresponding original sentence, and the target score are read. The question number is then read from the question information; it can be an integer or a string.

[0189] According to the field convention of the automatic scoring system for subjective questions, the scoring point name is written into the rule identifier field, the corresponding original sentence is written into the scoring basis field, the target score is written into the score field, the rule number is written into the execution order field, and the question number is written into the question number field, forming a scoring rule record that conforms to the input format of the system. All scoring rule records are collected into an automatic scoring rule set, which is the final set of automatic scoring rules corresponding to the target subjective question.
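The field mapping of paragraphs [0187] to [0189] can be sketched with a small data class. The class name `ScoringRule`, its attribute names, and the input keys are hypothetical; the source names the five fields but does not define a concrete schema.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ScoringRule:
    """Hypothetical target structure with the five fields of [0188]."""
    rule_id: str                  # rule identifier: scoring point name
    basis: str                    # scoring basis: original sentence
    score: float                  # score: target score
    order: int                    # execution order: rule number
    question_no: Union[int, str]  # question number


def to_rule_set(numbered_rules, question_info) -> List[ScoringRule]:
    """Map numbered rule records to the scoring system's field layout."""
    question_no = question_info["question_no"]
    return [
        ScoringRule(
            rule_id=r["name"],
            basis=r["sentence"],
            score=float(r["target_score"]),
            order=r["rule_no"],
            question_no=question_no,
        )
        for r in numbered_rules
    ]


numbered = [
    {"name": "cause", "sentence": "State the cause.",
     "target_score": 3, "rule_no": 1},
    {"name": "effect", "sentence": "Explain the effect.",
     "target_score": 2, "rule_no": 2},
]
rule_set = to_rule_set(numbered, {"question_no": "Q7", "type": "short"})
print(rule_set[0].rule_id, rule_set[0].order, rule_set[0].question_no)
# -> cause 1 Q7
```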

[0190] The automatic scoring rule set and the question information of the target subjective question are provided together to the automatic scoring system for subjective questions, which calls the rules in the automatic scoring rule set one by one according to the rule number when scoring student answers. Specifically, in the subjective question scoring stage, the automatic scoring rule set and the question information are loaded into the rule management module of the system. The rule management module determines the execution order based on the execution order field, sorts the scoring rule records, and validates them to ensure that the rule numbers are consecutive and begin at the starting value.
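The check performed by the rule management module in paragraph [0190] can be sketched as follows. The function name, the dictionary key `order`, and the default starting value of 1 are assumptions made for illustration; the source only states that the rule numbers must be consecutive and begin at the starting value.

```python
def load_rules(rule_set, start=1):
    """Sort rules by execution order and validate the numbering.

    Raises ValueError when the execution order values are not the
    consecutive integers start, start + 1, ... (start is assumed).
    """
    ordered = sorted(rule_set, key=lambda r: r["order"])
    expected = list(range(start, start + len(ordered)))
    actual = [r["order"] for r in ordered]
    if actual != expected:
        raise ValueError(f"rule numbers {actual} are not {expected}")
    return ordered


rules = [
    {"rule_id": "effect", "order": 2},
    {"rule_id": "cause", "order": 1},
]
print([r["rule_id"] for r in load_rules(rules)])
# -> ['cause', 'effect']
```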

[0191] When the automatic scoring system for subjective questions receives a student's answer text for the target subjective question, the answer text, together with the question information, is passed to the scoring execution module. The scoring execution module reads the scoring rule records in the automatic scoring rule set in ascending order of rule number. For the scoring rule record with a given rule number, the scoring point name and the corresponding original sentence are used as the semantic matching condition, and the student's answer text is matched and judged against them. When the match is successful, the target score serves as the upper limit of the score for this scoring point, and the score awarded is calculated according to the degree of matching. After all scoring rule records have been processed, the scores of the scoring points are summed to obtain the student's automatic scoring result for the target subjective question.
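The execution loop of paragraph [0191] can be sketched as follows. The source does not specify how semantic matching is computed, so the sketch takes the matcher as a parameter; `keyword_match` is only a crude stand-in based on substring containment, and all names and keys are hypothetical.

```python
def score_answer(answer, rules, match_degree):
    """Apply rules in execution order and sum the awarded scores.

    match_degree(answer, rule) returns a value in [0, 1]; the target
    score of the rule acts as the upper limit for that scoring point.
    """
    total = 0.0
    detail = []
    for rule in sorted(rules, key=lambda r: r["order"]):
        # Clamp the degree so the award never exceeds the target score.
        degree = min(max(match_degree(answer, rule), 0.0), 1.0)
        awarded = rule["score"] * degree
        detail.append((rule["rule_id"], awarded))
        total += awarded
    return total, detail


def keyword_match(answer, rule):
    """Stand-in matcher: full credit if the rule name occurs verbatim."""
    return 1.0 if rule["rule_id"].lower() in answer.lower() else 0.0


rules = [
    {"rule_id": "friction", "basis": "Mention friction.", "score": 4.0,
     "order": 1},
    {"rule_id": "gravity", "basis": "Mention gravity.", "score": 6.0,
     "order": 2},
]
total, detail = score_answer("Friction slows the block.", rules,
                             keyword_match)
print(total, detail)
# -> 4.0 [('friction', 4.0), ('gravity', 0.0)]
```

Keeping the per-rule awards in `detail` reflects the traceability goal stated in the document: each awarded score can be traced back to a rule identifier and its scoring basis.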

[0192] Through the above steps, a structured set of automatic scoring rules is formed from the scoring point confirmation result and the scoring point score allocation result, and this set of automatic scoring rules is reliably connected to the operating pipeline of the automatic scoring system for subjective questions.

[0193] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for subjective question scoring rule conversion based on LLM and scoring point extraction, characterized by comprising:
S1. Obtain the teacher's scoring details text, as well as the question stem, question type, and total score of the target subjective question. Use the large language model to classify the scoring details according to the target subjective question identifier and question type to obtain the scoring details text corresponding to the target subjective question.
S2. Input the scoring details text and the question stem, question type, and total score of the target subjective question into the scoring point extraction model, identify the scoring point name, corresponding original sentence, and words related to the score weight of multiple scoring points, and generate an initial list of scoring points.
S3. Merge and split the initial list of scoring points, and, using the numerical information or emphasis words in the scoring details text, mark each scoring point with a specific score or with a severity level from the finite set of levels, so as to obtain the scoring point confirmation result.
S4. Based on the scoring point confirmation result and the question type, select a question type scoring template that matches the target subjective question from the question type scoring template library.
S5. Divide the scoring points in the scoring point confirmation result according to whether they have specific scores. For scoring points with specific scores, directly record the specific scores in the scoring point confirmation result. Sum the scores of all scoring points with specific scores to obtain the total allocated points. Then calculate, from the total score in the question information and the total allocated points, the remaining points to be allocated to the scoring points that only have a severity level. The scoring points that only have a severity level are grouped according to their severity level. Based on the recommended proportion for each severity level in the question type scoring template, the remaining points are distributed among the severity levels according to the recommended proportions. Within each severity level, the remaining points are further subdivided according to the number of scoring points to obtain the initial score for each scoring point that only has a severity level. The initial scores of all scoring points are summed, and the sum is compared with the total score in the question information. When the two are not equal, the initial scores of all scoring points are scaled proportionally according to the ratio of the total score to the sum of the initial scores, and the scaled scores are rounded according to the preset minimum scoring unit to obtain the target scores that meet the total score constraint. The target score of each scoring point is associated with the corresponding scoring point name, and the scoring points are arranged in the order in which they appear in the scoring point confirmation result to form the scoring point score allocation result.
S6. Organize the scoring point names, their corresponding original sentences, and their target scores into structured automatic scoring rules for use by the automatic scoring system for subjective questions.

2. The method of claim 1, wherein step S1 includes: The scoring details text and the question stem text are concatenated at the character level, and question type marker text obtained by mapping the question type information and score marker text obtained by converting the total score are added, to form a text input object that the scoring point extraction model receives at its input embedding layer.

3. The method of claim 2, wherein the scoring point extraction model includes a sequentially connected context encoding layer, intermediate feature layer, and label output layer: The context encoding layer consists of two one-dimensional convolutional neural networks with a convolutional window length of 3, used to generate a sequence of context feature vectors containing the semantics of the left and right neighbors of each character. The intermediate feature layer is a fully connected network containing 128 neurons, used to generate a sequence of scoring feature vectors. The label output layer is a fully connected network containing ten output neurons, used to output a ten-dimensional label vector for each character.

4. The method of claim 3, wherein each dimension of the ten-dimensional label vector is used to indicate whether the character belongs to a preset category among the scoring point name part, the emphasis word part, the score word part, the starting position of the scoring sentence, the internal position of the scoring sentence, and the inter-sentence connectors. The text is linearly traversed according to the probability threshold corresponding to the starting position of the scoring sentence to construct candidate scoring sentence segments. For each candidate scoring sentence segment, the scoring point name is generated from the characters whose main label is the scoring point name part category, the words related to the score weight are generated from the characters whose main label is the emphasis word part category, and the entire segment is used as the corresponding original sentence, to construct the scoring point record in the initial list of scoring points.

5. The method of claim 1, wherein, in step S3: When merging scoring points that describe the same scoring requirement, the scoring point names are vectorized and encoded using the same character-level embedding and intermediate feature layer as the scoring point extraction model. The merging determination value is calculated in combination with the positional intersection-over-union of the scoring point names in the ten-dimensional label vector subsequence. When the merging determination value is greater than the preset merging threshold, the corresponding scoring points are merged into a new scoring point record. When splitting a sentence containing multiple scoring requirements, the splitting position is determined based on the combination of inter-sentence connector labels and scoring sentence starting position labels in the ten-dimensional label vector subsequence of the corresponding original sentence. The original sentence is then split into multiple candidate scoring sentence segments, and an independent scoring point record is generated for each segment. When valid numerical information is parsed from the corresponding original sentence, the numerical information is written into the specific score field of the scoring point; when no valid numerical information is parsed, or the parsing result is lower than the preset valid score threshold, the emphasis words are extracted from the corresponding original sentence and mapped to a level name in the preset finite set of levels, which serves as the severity level of the scoring point.

6. The method of claim 1, wherein, in step S4: The scoring point names in the scoring point confirmation result and the preset scoring point name fields in the candidate question type scoring templates are vectorized and encoded in the same way as in the intermediate feature layer of the scoring point extraction model, and the similarity between the two is calculated. For each candidate question type scoring template, the maximum similarity between each scoring point and the preset scoring point name fields of the template is accumulated to obtain the template matching score. Among the candidate templates whose template matching score is greater than the template selection threshold, the one with the highest matching score is selected as the question type scoring template that matches the target subjective question. When all template matching scores are less than the template selection threshold, the default question type scoring template corresponding to the question type is selected.

7. The method of claim 1, wherein, in step S6: The target score in the scoring point score allocation result is matched with the corresponding original sentence of the same scoring point in the scoring point confirmation result. Rule numbers are assigned to the scoring points according to the order in which their corresponding original sentences appear in the scoring details text. According to the field structure preset by the automatic scoring system for subjective questions, the scoring point name is used as the rule identifier, the corresponding original sentence is used as the scoring basis, the target score is used as the score field, the rule number is used as the execution order field, and the number of the target subjective question is used as the question number field. An automatic scoring rule set is generated and provided to the automatic scoring system for subjective questions, which calls the rules one at a time according to the rule number.