Judicial text entity recognition method based on ternary instruction fine tuning and VCoT verification

The judicial text entity recognition method, which combines ternary instruction fine-tuning and VCoT verification, solves the problem of recognizing long expressions and nested entities in judicial texts, improves the accuracy and consistency of entity recognition in judicial texts, and enhances the efficiency of judicial text processing.

CN120337927B · Active · Publication Date: 2026-06-19 · Hunan University of Technology and Business (湖南工商大学)

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: Hunan University of Technology and Business (湖南工商大学)
Filing Date: 2025-04-01
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Judicial texts present challenges such as long expressions, nested entities, and fine-grained identification, making it difficult for existing technologies to efficiently and accurately identify key entities within them.

Method used

The method improves the instruction fine-tuning of large language models by combining it with ternary understanding enhancement, and applies a VCoT verification mechanism that reasons over and verifies the initial recognition result step by step to produce the final entity recognition result.

Benefits of technology

It significantly improves the accuracy and consistency of entity recognition in judicial texts, solves the problems of long expressions, nested entities, and fine-grained recognition, and improves the efficiency and accuracy of judicial text processing.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

This invention provides a method for judicial text entity recognition based on ternary instruction fine-tuning and VCoT verification, comprising the following steps: the user inputs the legal document to be recognized and the instructions; an initial response is obtained through a judicial entity recognition large model; the initial response is progressively reasoned over and verified through a VCoT verification mechanism, the verification results are corrected and optimized, and a final verified entity recognition result is generated. Building on instruction fine-tuning and chain-of-thought techniques, this invention improves the model's ability to distinguish between entity and non-entity information through instruction fine-tuning combined with ternary understanding enhancement, thereby reducing misidentification. Through VCoT-based verification chain reasoning, which generates a question list and applies a self-verification mechanism, the model's recognition performance is further improved.

Description

Technical Field

[0001] This invention relates to the field of smart judicial technology, and in particular to a method for recognizing entities in judicial texts based on ternary instruction fine-tuning and VCoT verification.

Background Technology

[0002] In recent years, the rapid development of artificial intelligence and big data technologies has spurred a series of research directions with broad application value, such as smart justice and smart healthcare. Among these, legal artificial intelligence has gradually become an important aspect of judicial practice. In actual judicial work, judicial personnel often need to quickly and accurately extract useful information from massive amounts of judicial texts. The basic carrier of this information is named entities. How to efficiently and accurately process massive amounts of judicial texts and precisely identify key entities within them has become a critical issue that urgently needs to be addressed in the judicial field.

[0003] In legal practice, named entity recognition plays a crucial role in various application scenarios such as legal information retrieval, judicial knowledge graph construction, and legal judgment prediction. It can not only greatly improve the work efficiency of judicial personnel, but also effectively alleviate the work pressure of judges and promote the further development of smart justice.

[0004] Furthermore, with the continuous advancement of large language models, named entity recognition technology based on large language models has become a current research hotspot. Through training on extensive corpora and knowledge bases, large models have acquired powerful natural language understanding capabilities, accurately identifying complex entities within rich contexts.

[0005] However, named entity recognition in the field of judicial texts still faces many challenges. First, judicial texts often contain long-expression entities composed of multiple nouns or phrases, which are difficult to segment accurately using traditional word segmentation methods and place higher demands on sequence modeling. Second, to describe the facts of a case more accurately, judicial texts frequently use nested named entities to express complex legal facts, leading to overlapping entity boundaries and making entity extraction extremely difficult. Furthermore, compared to named entities in general domains (such as personal names, place names, and organization names), the judicial domain focuses more on fine-grained entity differentiation, requiring individuals to be further distinguished into specific judicial roles such as suspects and victims, and requiring the precise identification of the time, place, and related items of the incident. This fine-grained entity recognition not only requires a deeper classification of entity categories but must also cope with data sparsity. These challenges collectively increase the complexity of named entity recognition in judicial texts and place higher demands on existing technologies.

Summary of the Invention

[0006] The purpose of this invention is to address the shortcomings of the aforementioned background technology by providing a solution based on an improved large language model instruction fine-tuning technique and the introduction of a VCoT verification mechanism. This aims to improve entity recognition performance in judicial texts and solve the challenges of long expressions, nested entities, and fine-grained recognition.

[0007] To achieve the above objectives, this invention provides a method for recognizing entities in judicial texts based on ternary instruction fine-tuning and VCoT verification, comprising the following steps:

[0008] S1, the user inputs the legal documents and instructions to be recognized;

[0009] S2, an initial response is obtained through a judicial entity recognition large model;

[0010] The judicial entity recognition large model is trained and optimized through instruction fine-tuning enhanced by ternary understanding, in order to improve its ability to understand judicial texts and accurately identify different types of entities.

[0011] Instruction fine-tuning guides the judicial entity recognition large model to identify various key entity information from judicial documents through explicit instruction design and the injection of domain knowledge;

[0012] Ternary understanding enhancement improves entity recognition performance in judicial documents through deep semantic understanding and context adaptation capabilities;

[0013] S3, the initial response is progressively reasoned and verified through the VCoT verification mechanism, the verification results are corrected and optimized, and the final verified entity recognition result is generated;

[0014] The VCoT verification mechanism checks each identified entity step by step to ensure that the annotations generated by the model are consistent with the context in the original text and that no omissions or inconsistencies are found.

[0015] The VCoT verification mechanism generates a series of inference chains through multiple rounds of inference verification, so that the entities generated by the model are not only correctly identified, but also conform to the actual context and logical relationships.

[0016] Furthermore, S1 receives the judicial documents and instructions to be identified provided by the user through the input module.

[0017] Furthermore, the instruction fine-tuning includes:

[0018] By using a context-aware data re-representation method, the context surrounding an entity is treated as a key non-entity sample, enabling the model to learn how to more accurately distinguish between entity and non-entity information through samples.

[0019] The dataset is re-labeled using a label prefixing method, adding a unique prefix label to each entity type so that the model can accurately distinguish various fine-grained entities when generating output.

[0020] Furthermore, the instruction fine-tuning also includes:

[0021] The task mode of generating and predicting simultaneously allows the model to predict in real time whether the currently generated words belong to a specific entity category while generating text, thereby deciding whether to add entity labels. For ordinary words that do not belong to any entity, the model will keep their original form without adding any labels.

[0022] Furthermore, the ternary understanding enhancement includes a standardization module, a knowledge guidance module, and a contrastive learning module;

[0023] The standardization module is used to standardize the generated content to ensure the accuracy, completeness, and consistency of the output;

[0024] The knowledge guidance module provides a heuristic list containing definitions for each entity type and related feature vocabularies to help the model understand and distinguish different entity types. The heuristic list contains high-level rules or strategies for inferring specific tasks, and the feature vocabulary contains feature words associated with each entity type.

[0025] The contrastive learning module is used to select examples that are highly similar in semantics, TF-IDF, and dependency relations from the modified training set. Combined with the corresponding original instances, a pair of highly contrastive learning samples is constructed. This contrastive learning helps the model to deeply understand the annotation rules that combine context-aware data re-representation and label prefixing. This enables the model to not only understand the semantic and contextual information of entity types during the learning process, but also to master how to output normalized results with label prefixes according to instructions.

[0026] Furthermore, when the judicial entity recognition large model is trained in S2, the model's input $X$ consists of the following three parts:

[0027] Legal text: $T$;

[0028] Instruction: $I$;

[0029] Ternary understanding enhancement: $E$, which comprises the standardization module text, the knowledge guidance module text, and the contrastive learning module text, namely

[0030] $$E = \{E_{norm}, E_{know}, E_{con}\}$$

[0031] where $E_{norm}$ is the standardization module text, $E_{know}$ is the knowledge guidance module text, and $E_{con}$ is the contrastive learning module text;

[0032]

[0033] During model training, the task is first defined and the loss function is set; the input is then vectorized and processed by the encoder, and the output is generated by the decoder.
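The three-part input described above (legal text, instruction, and the three enhancement texts) can be pictured as a single prompt string. The following is only a minimal sketch: the function name `build_model_input`, the field headings, and the ordering of the parts are assumptions made for illustration and are not specified in the patent text.

```python
def build_model_input(legal_text, instruction, norm_text, knowledge_text, contrast_text):
    """Assemble one prompt from the three input parts (layout is hypothetical)."""
    # The ternary understanding enhancement is itself made of three texts.
    enhancement = "\n".join([norm_text, knowledge_text, contrast_text])
    return "\n\n".join([
        f"Instruction: {instruction}",
        f"Enhancement:\n{enhancement}",
        f"Legal text: {legal_text}",
    ])


prompt = build_model_input(
    legal_text="the defendant Huang Moumou stole a mobile phone",
    instruction="Add a label prefix to each entity; leave other words unlabeled.",
    norm_text="Only output the predefined entity types.",
    knowledge_text="A suspect is often introduced by words such as 'defendant'.",
    contrast_text="Example pair: original sentence vs. label-prefixed sentence.",
)
print(prompt)
```

In a real pipeline this string would be tokenized and passed to the encoder-decoder model; that step is omitted here.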

[0034] Furthermore, S3 ensures the accuracy and consistency of entity recognition results through a self-verification module, with the VCoT verification mechanism nested within the self-verification module.

[0035] Furthermore, the content of the reasoning chain includes:

[0036] Entity type consistency check: whether it is a predefined entity type; Context consistency check: ensure that the generated content is consistent with the original content without omissions, additions, or deviations; Logical check: check whether the contextual matching of entities is logical.
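Two of the three checks listed above lend themselves to a simple rule-based illustration. The sketch below is a deterministic stand-in, not the prompt-based VCoT mechanism itself: it assumes the model output uses the `[LABEL: surface]` bracket format, and the names `ALLOWED`, `strip_labels`, and `verify` are hypothetical. The logical check is not covered because it requires model reasoning.

```python
import re

# The ten entity labels named in this document for the theft-case data.
ALLOWED = {"NHCS", "NASI", "NS", "NHVI", "NT", "NCGV", "NCSM", "NO", "NATS", "NCSP"}
TAG = re.compile(r"\[([A-Z]+): ([^\[\]]+)\]")


def strip_labels(annotated):
    """Remove the label prefixes, keeping only the entity surface text."""
    return TAG.sub(lambda m: m.group(2), annotated)


def verify(original, annotated):
    """Return a list of issues; an empty list means both checks passed."""
    issues = []
    # Entity type consistency: every label must be a predefined type.
    for label, _surface in TAG.findall(annotated):
        if label not in ALLOWED:
            issues.append(f"type: unknown label {label!r}")
    # Context consistency: removing labels must reproduce the original text.
    if strip_labels(annotated) != original:
        issues.append("context: annotated text does not match the original")
    return issues


original = "the defendant Huang Moumou stole RMB 285"
good = "the defendant [NHCS: Huang Moumou] stole [NCSM: RMB 285]"
bad = "the defendant [PER: Huang Moumou] stole RMB 300"
print(verify(original, good))
print(verify(original, bad))
```

Any issue returned here would be the kind of finding that the correction step is meant to resolve before the final result is emitted.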

[0037] Furthermore, the VCoT verification mechanism, through the design of the prompt words, ensures that the final output not only meets the requirements of the instruction but also handles potential ambiguities and complexities in judicial texts, providing secondary correction capabilities.

[0038] The above-described solution of the present invention has the following beneficial effects:

[0039] The present invention provides a judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification. Through the context-aware data re-representation method of instruction fine-tuning, non-entity samples are introduced as contextual information during the training process, which significantly enhances the model's ability to understand complex contexts, thereby improving the model's ability to distinguish non-entity text and reducing misidentification. On this basis, a unified label prefix annotation method is used to not only introduce exclusive label prefixes for each entity type, solving the problems of ambiguous entity boundaries and difficulty in distinguishing multiple entity types in judicial texts, but also to simultaneously annotate multiple entity types in one output, avoiding the complexity of multi-round prompts and interactions in traditional methods.

[0040] Building on instruction fine-tuning, this invention uses a ternary understanding enhancement consisting of a standardization module, a knowledge guidance module, and a contrastive learning module to address the insufficient understanding of complex contexts in judicial texts by traditional methods. The improvement is most notable in the accurate segmentation of long-expression entities, the boundary differentiation of nested entities, and the recognition of fine-grained entities. Through multi-dimensional semantic enhancement, the model can more accurately understand instruction requirements and improve its ability to identify and classify complex entities in judicial documents.

[0041] Building on traditional chain-of-thought reasoning, this invention develops a new verification-chain mechanism named VCoT. By generating a list of questions, checking contextual consistency, and verifying logical relationships, it achieves step-by-step self-correction and answer verification, significantly improving the reliability and consistency of the model's entity recognition results in long text contexts. It effectively avoids the hallucination problem in generative models and ensures that the recognition results conform to the semantic logic of judicial texts and to the actual situation.

[0042] Other beneficial effects of the present invention will be described in detail in the following detailed description section.

Attached Figure Description

[0043] Figure 1 is a flowchart of the steps of the present invention;

[0044] Figure 2 is a schematic diagram of the system of the present invention;

[0045] Figure 3 is a schematic diagram of the standardization module of the present invention.

Detailed Implementation

[0046] The following specific examples illustrate the implementation of this disclosure. Those skilled in the art can easily understand other advantages and effects of this disclosure from the content disclosed in this specification. Obviously, the described embodiments are only a part of the embodiments of this disclosure, and not all of them. This disclosure can also be implemented or applied through other different specific embodiments, and the details in this specification can also be modified or changed based on different viewpoints and applications without departing from the spirit of this disclosure. It should be noted that, in the absence of conflict, the following embodiments and features in the embodiments can be combined with each other. Based on the embodiments in this disclosure, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this disclosure.

[0047] It should be noted that various aspects of embodiments within the scope of the appended claims are described below. It will be apparent that the aspects described herein can be embodied in a wide variety of forms, and any particular structure and / or function described herein is merely illustrative. Based on this disclosure, those skilled in the art will understand that one aspect described herein can be implemented independently of any other aspect, and two or more of these aspects can be combined in various ways. For example, any number of aspects set forth herein can be used to implement the device and / or practice the method. Additionally, this device and / or method can be implemented using structures and / or functionalities other than one or more of the aspects set forth herein.

[0048] It should also be noted that the illustrations provided in the following embodiments are merely schematic representations of the basic concept of this disclosure. The illustrations only show components relevant to this disclosure and are not drawn according to the actual number, shape, and size of components in implementation. In actual implementation, the type, quantity, and proportion of each component can be arbitrarily changed, and the component layout may be more complex. Furthermore, specific details are provided in the following description to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the described aspects can be practiced without these specific details.

[0049] As shown in Figure 1, embodiments of the present invention provide a method for judicial text entity recognition based on ternary instruction fine-tuning and VCoT verification, comprising the following steps:

[0050] S1, the user inputs the legal documents and instructions to be recognized.

[0051] In this embodiment, the input module receives the judicial documents to be identified and the instructions provided by the user, and passes them to the subsequent entity information extraction module. As shown in Figure 2, the input module includes a user input submodule and a user instruction submodule, which perform these two functions respectively. Since the entity information extraction module, after training and fine-tuning, can infer the correct task requirements from concise instructions, users can use simple or colloquial instructions in the input module to express their requests for entity recognition tasks, thus guiding the entity recognition process.

[0052] S2, an initial response is obtained through the judicial entity recognition large model.

[0053] In this embodiment, the entity information extraction module, which has the judicial entity recognition large model built in, accurately identifies and extracts key information entities from judicial documents. It should be noted that the judicial entity recognition large model in this embodiment is based on the pre-trained language model Flan-T5 and is optimized through instruction fine-tuning enhanced by ternary understanding, aiming to address the challenges of recognizing long-expression entities, nested entities, and fine-grained entities in judicial documents. Flan-T5 is a pre-trained language model that has been fine-tuned on a variety of tasks, and its natural language understanding capabilities enable it to handle complex text structures. In judicial texts, the recognition of long-expression entities and nested entities is particularly difficult because these entities often span multiple words and overlap with other entities.

[0054] To address this challenge, the large-scale judicial entity recognition model enhances its understanding of judicial texts through instruction fine-tuning enhanced by ternary understanding, ensuring accurate identification of different types of entities. Specifically, instruction fine-tuning optimizes the model's training. During training, explicit instruction design and the infusion of rich domain knowledge guide the model to identify various key entity information from judicial documents, such as criminal suspects, victims, and the time of the incident, better adapting to named entity recognition tasks in the judicial domain. Simultaneously, to handle the complex semantic structure of judicial texts, instruction fine-tuning further enhances the model's understanding of judicial entities and instruction requirements by combining ternary understanding enhancement, thereby more accurately extracting complex entity information and significantly improving the model's task adaptability and recognition capabilities.

[0055] Fine-tuning the training model requires a dataset. Considering the current lack of a dataset specifically for named entity recognition in the legal field, this embodiment uses the information extraction track from the 2021 China Legal Intelligence Technology Evaluation (CAIL 2021), focusing on analyzing "theft" cases. The dataset contains 5247 data entries, covering ten different entity types: NHCS (suspect), NASI (stolen goods), NS (location), NHVI (victim), NT (time), NCGV (value of goods), NCSM (stolen currency), NO (organization), NATS (tools of the crime), and NCSP (profit from theft), totaling 343,640 characters and 25,466 entities. To improve the model's performance in the information extraction task, the CAIL 2021 dataset was deeply modified, and its annotation method was converted to a context-aware data re-representation method to enhance the model's ability to recognize and distinguish entities.

[0056] Context-aware data re-representation specifically involves treating the context surrounding an entity (i.e., parts that do not contain the entity) as key non-entity samples. The model can learn from these samples how to more accurately distinguish between entity and non-entity information. For example, parts of text that do not contain any entities are labeled as context samples, helping the model understand the surrounding context and thus more accurately determine the entity category of each word. Through this dataset construction method, the model can more correctly understand the entity type of each word based on the surrounding context, helping it learn how to effectively distinguish between entity and non-entity information. This makes it more accurate when processing complex text, not only correctly identifying entities in the text but also more accurately distinguishing ordinary text without entities, reducing mislabeling and overlabeling, thereby significantly improving performance and generalization ability when processing complex legal texts.

[0057] Building upon this, the dataset is re-labeled using a prefix labeling method. Specifically, a unique prefix label is added to each entity type, enabling the model to more accurately distinguish between various fine-grained entities when generating output. An example of the improved labeling is shown below:

[0058] {"id": "xxx",

[0059] "context": "At approximately 1:00 PM on August 10, 2017, the defendant Huang Moumou stole a Brand A mobile phone, valued at RMB 285, from the storage compartment of an electric bicycle parked by Guo Moumou on the first floor of the dormitory of Zhejiang ** Tie & Clothing Co., Ltd.",

[0060] "entities": "At [NT: approximately 1:00 PM on August 10, 2017], the defendant [NHCS: Huang Moumou] stole a [NASI: Brand A mobile phone], valued at [NCGV: RMB 285], from the storage compartment of an electric bicycle parked by [NHVI: Guo Moumou] on the first floor of the dormitory of [NS: Zhejiang ** Tie & Clothing Co., Ltd.]."}

[0061] Therefore, by using a unified label prefixing method, each entity type has its own unique prefix, which solves the problems of difficult entity boundary definition and fine-grained entities in judicial texts. Furthermore, multiple entity types can be labeled simultaneously in a single output, reducing the number of times the model needs to be called during generation and avoiding the complexity of multiple rounds of instruction prompts. Thus, the context-aware data re-representation method adopted in this embodiment allows the model to preserve the original sentence's contextual structure while labeling entities. The generated output not only contains entity information but also reflects the completeness of the sentence, especially excelling in distinguishing non-entity text. The model can more accurately determine the entity type based on the surrounding context.
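To make the label prefixing format concrete, the sketch below converts character-span annotations into a label-prefixed sentence and reads the entities back out. It is an illustration only: the helper names (`span_of`, `annotate`, `parse`) are hypothetical, the shortened sentence is adapted from the example above, and nested or overlapping entities are deliberately not handled.

```python
import re

# The ten entity labels listed for the modified CAIL 2021 data.
LABELS = {"NHCS", "NASI", "NS", "NHVI", "NT", "NCGV", "NCSM", "NO", "NATS", "NCSP"}
TAG = re.compile(r"\[([A-Z]+): ([^\[\]]+)\]")


def span_of(text, surface, label):
    """Locate the first occurrence of `surface` and return (start, end, label)."""
    start = text.index(surface)
    return (start, start + len(surface), label)


def annotate(text, spans):
    """Wrap each span as "[LABEL: surface]"; non-entity context is copied verbatim."""
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor:
            raise ValueError("overlapping spans are not handled by this sketch")
        out.append(text[cursor:start])
        out.append(f"[{label}: {text[start:end]}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def parse(annotated):
    """Return the (label, surface) pairs found in a label-prefixed string."""
    return [(m.group(1), m.group(2)) for m in TAG.finditer(annotated)]


text = "the defendant Huang Moumou stole a Brand A mobile phone valued at RMB 285"
spans = [
    span_of(text, "Huang Moumou", "NHCS"),
    span_of(text, "Brand A mobile phone", "NASI"),
    span_of(text, "RMB 285", "NCGV"),
]
annotated = annotate(text, spans)
print(annotated)
print(parse(annotated))
```

Because the surrounding words are copied unchanged, a single output string carries every entity type at once together with its non-entity context.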

[0062] It should be noted that, to ensure the model's generalization ability and performance in real-world scenarios, this embodiment scientifically partitions the modified dataset for more effective training and evaluation. Specifically, the 5247 data points are divided into a training set, a validation set, and a test set in an 8:1:1 ratio. Through in-depth modification of the CAIL 2021 dataset and a scientific dataset partitioning strategy, the model's information extraction capability from theft case texts is significantly improved, providing more accurate technical support for intelligent analysis of judicial texts.
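As a small worked example of the 8:1:1 partition, the sketch below shuffles 5247 items and slices them into three subsets. The rounding rule (floor for the training and validation sets, remainder to the test set), the fixed seed, and the function name `split_8_1_1` are assumptions; the patent text gives only the ratio.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle and split into train/validation/test at roughly 8:1:1."""
    items = list(items)
    random.Random(seed).shuffle(items)   # fixed seed for a reproducible split
    n = len(items)
    n_train = n * 8 // 10                # floor of 80%
    n_val = n // 10                      # floor of 10%
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]       # remainder goes to the test set
    return train, val, test


train, val, test = split_8_1_1(range(5247))
print(len(train), len(val), len(test))
```

Under these rounding assumptions the three subsets contain 4197, 524, and 526 entries.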

[0063] Because generative large language models produce output token by token, this embodiment further adopts a simultaneous generation and prediction task mode to maximize the model's performance on fine-grained entity recognition. Specifically, while generating text, the model predicts in real time whether each generated token belongs to a specific entity category and thereby decides whether to add an entity label. Ordinary words that do not belong to any entity (i.e., context samples) are kept in their original form without any label. This design preserves the naturalness and readability of the text while reducing mislabeling and overlabeling. An example instruction is: "Please analyze the provided sentence and identify the type of each entity one by one. In the output, please add a label prefix to each entity type. If a word does not belong to any entity category, please do not add a label to it." During token-by-token output, the model can adjust its predictions in real time based on contextual information, which helps it understand the context and make more accurate judgments when faced with polysemous words and complex sentence structures. By progressively understanding the text and generating output with fine-grained entity labels, the accuracy and efficiency of entity information extraction are improved, providing a more efficient and accurate solution for processing judicial texts.

[0064] Understandably, the purpose of instruction fine-tuning is to enable the model to more accurately understand user needs and perform tasks precisely. However, simple instruction fine-tuning often fails to address the ambiguity and contextual understanding issues in complex legal texts. To solve this problem, ternary understanding enhancement is incorporated into the instruction fine-tuning process, further improving entity recognition performance in legal documents through deeper semantic understanding and contextual adaptation capabilities.

[0065] The ternary understanding enhancement mechanism comprises three modules: a standardization module, a knowledge guidance module, and a contrastive learning module. These three modules work together to ensure that the model not only understands instructions when performing tasks but also performs deep semantic reasoning based on complex legal document texts, thereby improving recognition performance.

[0066] For the standardization module, because generative large language models have a certain degree of flexibility and uncertainty when producing output for judicial texts, inconsistencies or non-standard formatting may occur in entity annotations and entity types. Therefore, the standardization module specifically standardizes the generated content to ensure the accuracy, completeness, and consistency of the output. As shown in Figure 3, this module ensures that the model can only recognize and output predefined fixed entity types and uses only label prefixing for annotation, avoiding the generation of task-irrelevant information.

[0067] Therefore, the standardization module sets strict entity recognition rules and annotation formats to ensure that the model behaves as expected. For example, for entities such as criminal suspects and victims, the standardization module explicitly requires the model to only recognize and output these specific types of entities, while preventing other irrelevant entities or information from being extracted.

[0068] The knowledge guidance module consists of a heuristic list and a feature vocabulary. This module helps the model understand and distinguish different entity types by providing a heuristic list containing definitions of each entity type, together with related feature terms. In legal documents, the semantics and form of entities may vary depending on factors such as case type and description format. Guided by this prior knowledge, the model can better understand entity definitions and identify relevant information in the text, thereby correctly extracting entities from complex legal texts and improving the accuracy of entity identification. For example, a criminal suspect is usually the subject or object of a sentence and may be related to words such as "suspect" or "defendant."

[0069] The list of heuristics is shown in Table 1. Heuristics are defined as high-level rules or strategies for inferring specific tasks, playing a crucial role in human cognition and often resulting in more accurate judgments than complex methods. Therefore, by setting up the list of heuristics shown in Table 1, which includes detailed strategies and methods for finding various entity types, the model can be helped to understand the representation of various entity categories in legal documents.

[0070] Table 1 Examples of Heuristic Lists

[0071]

[0072] The feature vocabulary is shown in Table 2, which contains the feature words associated with each entity type. The model can identify these feature words during training to help the model identify specific entity types more accurately, thereby further improving the accuracy of entity recognition.

[0073] Table 2 Examples of Feature Vocabulary Representation

[0074]

[0075] The core idea of the contrastive learning module is to select examples with high semantic, TF-IDF, and dependency similarity from the modified training set, and combine them with their corresponding unmodified instances to construct a pair of highly contrasting learning samples. This contrastive learning approach helps the model deeply understand the annotation rules combining context-aware data re-representation and label prefixing. During the learning process, the model not only understands the semantic and contextual information of entity types but also learns how to output normalized results with label prefixes according to instructions. This helps the model better understand the requirements of the instructions and strengthens its accurate identification of entity categories. The specific process is as follows:

[0076] Calculate the semantic similarity between the current text and the training examples: let the input text be $T$ and the $i$-th example in the training set be $S_i$. First, each example $S_i$ in the training set is pre-encoded, and semantic embeddings are extracted using the pre-trained language model RoBERTa. The semantic similarity between the embedding $\mathbf{h}_T$ of $T$ and the embedding $\mathbf{h}_i$ of $S_i$ is calculated using the cosine similarity formula:

[0077] $$\mathrm{Sim}_{sem}(T, S_i) = \frac{\mathbf{h}_T \cdot \mathbf{h}_i}{\lVert \mathbf{h}_T \rVert \, \lVert \mathbf{h}_i \rVert}$$

[0078] This formula captures the directional similarity of the high-dimensional embedding vectors, thus reflecting the semantic similarity between the two texts.

[0079] Calculate the TF-IDF similarity between the current text and the training examples: using a TF-IDF weighted vector representation, Chinese word segmentation is performed on all examples in the training set with the jieba library, and a bag-of-words model is constructed based on word frequency. Specifically, let the vocabulary be $V$. The TF-IDF representation of the input text is $\mathbf{v}_T = (w_{T,1}, w_{T,2}, \dots, w_{T,|V|})$, where $w_{T,j}$ denotes the TF-IDF weight of the $j$-th word in the vocabulary. Similarly, the $i$-th example in the training set is represented as $\mathbf{v}_{E_i} = (w_{i,1}, w_{i,2}, \dots, w_{i,|V|})$. The similarity calculation formula is:

[0080]

[0081] This formula effectively measures the overlap of text in terms of feature words and is suitable for extracting the relationship between keywords in input text and training examples.
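The TF-IDF comparison can be sketched as follows. The similarity formula itself is not reproduced in this text, so cosine similarity over sparse TF-IDF vectors is assumed here, together with a smoothed IDF variant. The inputs are assumed to be already segmented into tokens (the description uses jieba for that step), and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def build_idf(corpus: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency per vocabulary word (assumed variant)."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def tfidf_vector(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse TF-IDF vector; out-of-vocabulary tokens are dropped."""
    counts = Counter(tokens)
    return {tok: c * idf[tok] for tok, c in counts.items() if tok in idf}


def sparse_cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(w * v[tok] for tok, w in u.items() if tok in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Pre-segmented toy training examples and input text (illustrative only).
training = [
    ["defendant", "stole", "phone"],
    ["defendant", "stole", "cash"],
    ["court", "held", "hearing"],
]
query = ["defendant", "stole", "wallet"]

idf = build_idf(training)
query_vec = tfidf_vector(query, idf)
tfidf_sims = [sparse_cosine(query_vec, tfidf_vector(doc, idf)) for doc in training]
```

The third training example shares no vocabulary with the input text, so its score is zero, while the first two share the same two feature words and score equally.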

[0082] Calculate the dependency similarity between the current text and the training examples: this is achieved by calculating the degree of matching between the dependency tree structures of the texts. Specifically, the dependency parsing tool spaCy is used to extract the dependency syntax tree structures $D_T$ and $D_{E_i}$ of the input text and the $i$-th training example, and the similarity between the two dependency trees is calculated using the graph edit distance (GED). The formula is as follows:

[0083]

[0084] where $\mathrm{GED}(D_T, D_{E_i})$ denotes the edit distance between the dependency trees, and $|D_T|$ and $|D_{E_i}|$ denote the numbers of nodes in the respective dependency trees. This metric reflects the syntactic similarity between the input text and the candidate examples.
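The exact GED-based formula is not reproduced in this text, and exact graph edit distance is expensive to compute. The sketch below therefore swaps in a much cruder stand-in: each dependency tree is reduced to a set of (head, relation, dependent) edges, the edit distance is approximated by the number of edges that would have to be deleted or inserted, and the result is normalized by the total node count. Both the proxy and the normalization are assumptions, not the formula of the described method.

```python
from typing import Set, Tuple

# (head word, dependency relation, dependent word); a real pipeline would use
# token indices from a parser such as spaCy so that repeated words stay distinct.
Edge = Tuple[str, str, str]


def edge_edit_distance(a: Set[Edge], b: Set[Edge]) -> int:
    """Edges to delete from a plus edges to insert to obtain b (a proxy, not true GED)."""
    return len(a ^ b)


def node_count(edges: Set[Edge]) -> int:
    return len({word for head, _, dep in edges for word in (head, dep)})


def dependency_similarity(a: Set[Edge], b: Set[Edge]) -> float:
    """1 minus the proxy distance, normalized by the combined node count (assumed)."""
    total_nodes = node_count(a) + node_count(b)
    if total_nodes == 0:
        return 1.0
    return 1.0 - edge_edit_distance(a, b) / total_nodes


tree_input = {("stole", "nsubj", "Zhang"), ("stole", "obj", "phone")}
tree_example = {("stole", "nsubj", "Li"), ("stole", "obj", "phone")}

dep_sim = dependency_similarity(tree_input, tree_example)
```

The two toy trees differ in one subject edge on each side, giving a distance of 2 over 6 nodes.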

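[0085] The comprehensive similarity score is obtained by weighting and summing the three indicators: semantic similarity, TF-IDF similarity, and dependency similarity.

[0086] $$\mathrm{Score}(T, E_i) = \alpha \cdot \mathrm{Sim}_{sem}(T, E_i) + \beta \cdot \mathrm{Sim}_{tfidf}(T, E_i) + \gamma \cdot \mathrm{Sim}_{dep}(T, E_i)$$

[0087]

[0088] where $\alpha$, $\beta$, and $\gamma$ are the weights of the semantic similarity $\mathrm{Sim}_{sem}$, the TF-IDF similarity $\mathrm{Sim}_{tfidf}$, and the dependency similarity $\mathrm{Sim}_{dep}$, respectively, and satisfy $\alpha + \beta + \gamma = 1$. The example with the highest overall similarity is selected as the contrastive learning example $E^{*}$:

[0089] $$E^{*} = \arg\max_{E_i} \mathrm{Score}(T, E_i)$$

[0090] Once the example with the highest overall similarity has been selected from the modified training set, the example with the same ID is looked up in the training set before modification. This pair of examples serves as the input to the contrastive learning module.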
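The weighting and selection step can be sketched as below. The weight values, the example IDs, and the dictionary layout are illustrative assumptions; the description only requires that the three weights sum to 1 and that the selected example is paired with its unmodified counterpart by ID.

```python
from typing import Dict, Tuple


def combined_score(sem: float, tfidf: float, dep: float,
                   alpha: float = 0.5, beta: float = 0.3, gamma: float = 0.2) -> float:
    """Weighted sum of the three similarity indicators."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return alpha * sem + beta * tfidf + gamma * dep


def select_contrastive_pair(scores: Dict[str, Tuple[float, float, float]],
                            modified: Dict[str, str],
                            original: Dict[str, str]) -> Tuple[str, str, str]:
    """Pick the highest-scoring modified example and pair it with the unmodified one."""
    best_id = max(scores, key=lambda ex_id: combined_score(*scores[ex_id]))
    return best_id, modified[best_id], original[best_id]


# (semantic, TF-IDF, dependency) similarities per training example ID (toy values).
scores = {"ex1": (0.9, 0.2, 0.5), "ex2": (0.6, 0.8, 0.7)}
modified = {"ex1": "[NHCS: Li] stole a [NASI: watch].", "ex2": "[NHCS: Wang] stole [NCSM: 200 yuan]."}
original = {"ex1": "Li stole a watch.", "ex2": "Wang stole 200 yuan."}

best_id, modified_example, unmodified_example = select_contrastive_pair(scores, modified, original)
```

With these toy numbers, `ex1` scores 0.61 and `ex2` scores 0.68, so `ex2` and its unmodified counterpart form the contrastive pair.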

[0091] Based on the above training dataset, when training the large-scale model for judicial entity recognition, the model input mainly consists of the following three parts:

[0092] Legal text: $X$;

[0093] Instruction: $I$;

[0094] Ternary understanding enhancement: $K$, which includes the normative module text, the knowledge guidance module text, and the contrastive learning module text, namely

[0095] $$K = \{K_{norm},\ K_{know},\ K_{cont}\}$$

[0096] where $K_{norm}$ is the normative module text, $K_{know}$ is the knowledge guidance module text, and $K_{cont}$ is the contrastive learning module text.

[0097]

[0098] As can be seen from the above, these three parts are concatenated into a semantic instruction template that serves as the input for the model to learn and understand. Instructions containing specific contexts within the legal field allow the model to extract richer contextual information, thereby enhancing its ability to understand complex language structures in legal documents. This integration approach effectively improves the model's accuracy in identifying and classifying entity categories, especially when dealing with long expressions and nested entities.
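One possible way to concatenate the three parts into a single instruction template is sketched below. The section headers, the separator, and the function name `build_model_input` are assumptions made for illustration; the description does not fix the template layout.

```python
from typing import List, Tuple


def build_model_input(legal_text: str, instruction: str, norm_text: str,
                      knowledge_text: str, contrast_text: str) -> str:
    """Concatenate legal text, instruction, and the three enhancement texts."""
    sections: List[Tuple[str, str]] = [
        ("Legal text", legal_text),
        ("Instruction", instruction),
        ("Normative module", norm_text),
        ("Knowledge guidance module", knowledge_text),
        ("Contrastive learning module", contrast_text),
    ]
    return "\n\n".join(f"### {title}\n{body.strip()}" for title, body in sections)


prompt = build_model_input(
    legal_text="Zhang Moumou stole a B brand mobile phone.",
    instruction="Identify the type of each entity and add a label prefix to it.",
    norm_text="Output the sentence unchanged except for the added label prefixes.",
    knowledge_text="NHCS: suspect. NASI: stolen item.",
    contrast_text="Unmodified: Li stole a watch. Modified: [NHCS: Li] stole a [NASI: watch].",
)
```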

[0099] During model training, the task is first defined and the loss function is set: given a judicial text $X = (x_1, x_2, \dots, x_n)$, where $x_i$ denotes the $i$-th input word, a user instruction $I$ (for example: "Please analyze the provided sentence and identify the type of each entity by label. Add a label prefix to each entity type in the output; if a word does not belong to any entity category, do not add a label to it."), and the ternary understanding enhancement $K$, the goal is to output a sequence with entity label prefixes $Y = (y_1, y_2, \dots, y_m)$, where each $y_t$ includes text content and the corresponding label prefix. The model learns the conditional probability distribution $P(Y \mid X, I, K)$, and the output is generated word by word by the generative large model:

[0100] $$P(Y \mid X, I, K) = \prod_{t=1}^{m} P(y_t \mid y_{<t}, X, I, K)$$

[0101] where $y_{<t}$ denotes all words generated before time step $t$, and $P(y_t \mid y_{<t}, X, I, K)$ is the conditional probability distribution.

[0102] To optimize the difference between the generated sequence and the target sequence, the negative log-likelihood loss function is used as the base generation loss:

[0103] $$\mathcal{L}_{gen} = -\sum_{t=1}^{m} \log P(y_t \mid y_{<t}, X, I, K)$$

[0104] where $-\log P(y_t \mid y_{<t}, X, I, K)$ is the prediction loss of the model at time step $t$, and the total loss of the entire output sequence is obtained by summing the prediction losses over all time steps.
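To make the negative log-likelihood concrete, the sketch below sums the per-step losses for a toy sequence. It assumes the probabilities the model assigned to each target token are already available as plain floats; the function names are hypothetical.

```python
import math
from typing import Sequence


def nll_loss(step_probs: Sequence[float]) -> float:
    """Sum over time steps of -log p, where p is the probability of the target token."""
    if any(p <= 0.0 or p > 1.0 for p in step_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    return -sum(math.log(p) for p in step_probs)


def sequence_probability(step_probs: Sequence[float]) -> float:
    """Product of the per-step conditional probabilities."""
    return math.prod(step_probs)


# Toy probabilities assigned to the two target tokens of a short sequence.
target_probs = [0.5, 0.25]
gen_loss = nll_loss(target_probs)
```

Minimizing this sum is equivalent to maximizing the probability of the whole target sequence, since `exp(-gen_loss)` equals the product of the per-step probabilities.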

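[0105] In addition, a constraint loss $\mathcal{L}_{con}$ is set to penalize predictions that violate the entity specifications. Assume the model outputs a sequence $\hat{Y} = (\hat{y}_1, \hat{y}_2, \dots, \hat{y}_m)$, which is the result generated by the decoder. The goal is to penalize the parts that do not meet the rules. The general formula for the constraint loss is:

[0106] $$\mathcal{L}_{con} = \sum_{t=1}^{m} \lambda_t \cdot \mathbb{1}\big[f(\hat{y}_t)\big]$$

[0107] where $\lambda_t$ is the penalty weight at time step $t$, which can be dynamically adjusted based on the importance of the rule, and $\mathbb{1}[\cdot]$ is an indicator function whose value is 1 when the output $\hat{y}_t$ violates the constraint and 0 otherwise. The condition function $f(\hat{y}_t)$ defines the conditions for rule violation. The constraint rules and violation judgments are defined as follows:

[0108] Entity type set $\mathcal{E}$

[0109] This set defines the entity categories that may appear in the task, and each prediction output $\hat{y}_t$ must belong to one of the classes in this set. If the output $\hat{y}_t \notin \mathcal{E}$, this is considered a violation of the rules. The formally defined condition function is as follows:

[0110] $$f(\hat{y}_t) = \begin{cases} \text{true}, & \hat{y}_t \notin \mathcal{E} \\ \text{false}, & \text{otherwise} \end{cases}$$

[0111] Combining the base generation loss $\mathcal{L}_{gen}$ and the constraint loss $\mathcal{L}_{con}$, we can obtain the complete loss function:

[0112] $$\mathcal{L} = \mathcal{L}_{gen} + \mathcal{L}_{con}$$

[0113] That is,

[0114] $$\mathcal{L} = -\sum_{t=1}^{m} \log P(y_t \mid y_{<t}, X, I, K) + \sum_{t=1}^{m} \lambda_t \cdot \mathbb{1}\big[f(\hat{y}_t)\big]$$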
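A minimal sketch of the constraint loss and the combined loss follows, assuming the complete loss is the plain sum of the generation loss and the constraint loss. The entity type codes are the ones listed in the verification prompt of this description; treating an unlabeled ordinary word (`None`) as non-violating is an assumption, as are the function names.

```python
from typing import Optional, Sequence, Set

ENTITY_TYPES: Set[str] = {
    "NHCS", "NASI", "NS", "NHVI", "NT", "NCGV", "NCSM", "NO", "NATS", "NCSP",
}


def violates(label: Optional[str], allowed: Set[str] = ENTITY_TYPES) -> bool:
    """Condition function: True when a predicted label is outside the allowed set."""
    # None marks an ordinary word without a label prefix (assumed not to violate).
    return label is not None and label not in allowed


def constraint_loss(pred_labels: Sequence[Optional[str]], weights: Sequence[float]) -> float:
    """Sum of the per-step penalty weights at the steps that violate the rule."""
    return sum(w for label, w in zip(pred_labels, weights) if violates(label))


def total_loss(gen_loss: float, pred_labels: Sequence[Optional[str]],
               weights: Sequence[float]) -> float:
    """Generation loss plus constraint loss (assumed unweighted sum)."""
    return gen_loss + constraint_loss(pred_labels, weights)


# "NXYZ" is a made-up label that is not in the predefined set.
predicted = ["NHCS", None, "NXYZ", "NS"]
step_weights = [1.0, 1.0, 2.0, 1.0]

penalty = constraint_loss(predicted, step_weights)
loss = total_loss(3.5, predicted, step_weights)
```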

[0115] Then, input vectorization is performed: the pre-trained model Flan-T5 is used as the basic language model, denoted as H, and the pre-trained language model BERT is used to vectorize the input text $X$. Mapping it to the embedding space yields the embedding representation:

[0116] $$E = \mathrm{BERT}(X)$$

[0117] where $E \in \mathbb{R}^{n \times d}$, $n$ is the length of the input sequence, $d$ is the embedding dimension, and $\mathrm{BERT}(\cdot)$ converts the input text $X$ into high-dimensional vectors.

[0118] Then, encoding is performed: the embedding $E$ is processed by the encoder to extract the contextual semantic representation.

[0119] $$Z = \mathrm{Encoder}(E)$$

[0120] where $Z$ is the output sequence representation of the encoder, which contains a deep representation of the input text in the semantic space.

[0121] Then, decoder generation is performed: based on the context of the current time step and the historical output, the probability distribution of the next word is generated.

[0122] $$P(y_t \mid y_{<t}, Z) = \mathrm{Decoder}(y_{<t}, Z)$$

[0123] where $\mathrm{Decoder}(y_{<t}, Z)$ denotes the decoder generating the current output $y_t$ given the historical outputs $y_{<t}$ and the input features $Z$, and $P(y_t \mid y_{<t}, Z)$ is the probability distribution of the output generated at the current time step. The goal of the decoder is to maximize the probability generated at each step, ultimately generating a complete output sequence that meets the input features and task requirements.
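The step-by-step generation can be illustrated with a toy loop. The decoding strategy is not specified in the description, so greedy decoding (taking the most probable token at every step) is assumed here, and `toy_next_logits` is a hypothetical stand-in for a real decoder that simply follows a fixed script.

```python
import math
from typing import Callable, Dict, List


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into a probability distribution over tokens."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def greedy_decode(next_logits: Callable[[List[str]], Dict[str, float]],
                  max_len: int = 10, eos: str = "</s>") -> List[str]:
    """Pick the most probable next token at each step until the end token appears."""
    output: List[str] = []
    for _ in range(max_len):
        probs = softmax(next_logits(output))
        token = max(probs, key=probs.get)
        if token == eos:
            break
        output.append(token)
    return output


def toy_next_logits(history: List[str]) -> Dict[str, float]:
    """Hypothetical decoder: favors the next token of a fixed script."""
    script = ["[NHCS:", "Zhang", "Moumou]", "</s>"]
    target = script[min(len(history), len(script) - 1)]
    return {tok: (5.0 if tok == target else 0.0) for tok in script}


generated = greedy_decode(toy_next_logits)
```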

[0124] After the model training is completed, the user inputs the text to be recognized into the fine-tuned judicial entity recognition model, which can then output an initial response.

[0125] S3. The initial response is progressively reasoned and verified through the VCoT verification mechanism. The verification results are corrected and optimized to generate the final verified entity recognition result.

[0126] In this embodiment, a self-verification module ensures the accuracy and consistency of entity recognition results. Although the entity information extraction module achieves high-precision entity recognition through instruction fine-tuning and ternary understanding enhancement, legal documents often contain complex grammatical structures, and generation by large language models is inherently uncertain. In addition, the model's prediction length increases once the context-aware data re-representation strategy and label prefixing are combined. Longer generated sequences may pose challenges to large language models, potentially leading to misidentification or labeling issues, including word omissions, additions, and substitutions. Building on the increasingly popular Chain of Thought (CoT) technique, this embodiment designs a VCoT (Verification with Chain of Thought) verification mechanism. This mechanism progressively verifies the entity recognition results to discover and correct potential errors.

[0127] The VCoT verification mechanism progressively checks whether each identified entity is consistent with the context in the original text, ensuring that the model-generated annotations are complete and accurate. Through this self-correcting mechanism, the model can better handle complex legal documents, avoid misidentification, and accurately capture all key entities in the text.

[0128] In this embodiment, the VCoT verification mechanism generates a series of reasoning chains through multiple rounds of inference verification, ensuring that the entities generated by the model are not only correctly identified, but also conform to the actual context and logical relationships. For example, after identifying "Zhang San" as a suspect, the model will use reasoning to determine whether the entity matches the specific details of the case, confirming whether there is any mislabeling.

[0129] Optionally, the inference chain includes entity type consistency checks: whether it is a predefined entity type; context consistency checks: ensuring that the generated content does not omit, add, or deviate from the original content; and logical checks: checking whether the contextual collocation of entities is logical (e.g., a location cannot be labeled "person"). Through step-by-step verification, the model can promptly detect errors such as omissions, additions, and substitutions, especially in complex sentences. Furthermore, step-by-step inference helps the model verify whether entity tags conform to sentence logic, thereby avoiding erroneous tagging.

[0130] The following detailed process further illustrates the VCoT verification mechanism. The user inputs the legal text to be recognized into the fine-tuned legal entity recognition model, and a series of verifications is performed on the model's initial response. First, the generated content is checked for completeness against the original sentence, ensuring there are no missing, redundant, or replaced words. Next, the entity type and entity in each tag prefix are listed based on the initial response. A series of contextual verification questions is then generated from the obtained tag list to help the model analyze whether the original response contains errors. Each verification question is answered in turn, and the answers are compared with the initial response to check for inconsistencies or errors, further improving the model's recognition performance. Finally, the final verification response is generated: if any inconsistencies are found, a revised response containing the verification results is produced so that the final output reflects the corrections. As can be seen, the VCoT verification mechanism performs self-correction based on reasoning, ensuring that the entity recognition results are both accurate and consistent with the actual case context.
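The rule-based parts of this process (completeness check, tag listing, and entity type check) can be sketched without a language model. The bracketed tag format `[TYPE: text]` follows the worked example in this description; the regular expression, the function names, and the restriction to non-nested tags are assumptions of this sketch.

```python
import re
from typing import List, Tuple

ENTITY_TYPES = {"NHCS", "NASI", "NS", "NHVI", "NT", "NCGV", "NCSM", "NO", "NATS", "NCSP"}

# Matches "[TYPE: entity text]"; nested tags are not handled in this sketch.
TAG_PATTERN = re.compile(r"\[([A-Z]+):\s*([^\]]+)\]")


def strip_tags(annotated: str) -> str:
    """Remove label prefixes and brackets, keeping only the entity text."""
    return TAG_PATTERN.sub(lambda m: m.group(2), annotated)


def integrity_ok(original: str, annotated: str) -> bool:
    """True when no word was omitted, added, or replaced during generation."""
    return strip_tags(annotated) == original


def list_tags(annotated: str) -> List[Tuple[str, str]]:
    """List (entity type, entity text) pairs in order of appearance."""
    return [(m.group(1), m.group(2).strip()) for m in TAG_PATTERN.finditer(annotated)]


def unknown_types(tags: List[Tuple[str, str]]) -> List[str]:
    """Entity types that are not in the predefined set."""
    return [etype for etype, _ in tags if etype not in ENTITY_TYPES]


original = "Zhang Moumou stole a B brand mobile phone and 600 yuan in cash."
annotated = "[NHCS: Zhang Moumou] stole a [NASI: B brand mobile phone] and [NCSP: 600 yuan] in cash."

tags = list_tags(annotated)
complete = integrity_ok(original, annotated)
```

The context and logic checks that follow in the process depend on the model's own reasoning and are not covered by these string-level checks.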

[0131] As a preferred implementation, this embodiment uses carefully designed prompts in the VCoT verification mechanism to ensure that the final output not only meets the requirements of the instruction but also handles potential ambiguities and complexities in judicial texts, providing a second round of correction and thus improving the reliability of the recognition results.

[0132] Specifically, the design can be as follows: Please verify the completeness of the generated sentences compared to the original sentences, ensuring there are no missing, redundant, or replaced words, and step by step check the accuracy of each tag and ensure consistency with the context:

[0133] First, list all entity tags and their corresponding entity types in the generated response. For example, NHCS: Zhang Moumou, NASI: Brand B mobile phone. For each entity type, check if it conforms to the predefined type (NHCS, NASI, NS, NHVI, NT, NCGV, NCSM, NO, NATS, NCSP).

[0134] Then, verify the following contents in sequence:

[0135] For the suspect (NHCS), confirm its reasonable use in the context describing the criminal act or investigation; for stolen items (NASI), ensure it appears in the context describing loss or theft; for location (NS), check its reasonable use in the context describing the location of the incident; for victim (NHVI), confirm that the role appears in the context related to the criminal act, investigation, or legal proceedings; for time (NT), check that it appears in the context describing the time of the incident; for value of goods (NCGV), verify its use in the context describing the value of the property; for stolen currency (NCSM), confirm its use in the context involving monetary loss; for organization (NO), check that it appears in the context describing an organization or institution; for tools of the crime (NATS), ensure it appears in the context describing tools of the crime; for illicit proceeds (NCSP), verify that it appears in the context describing criminal gains, illicit profits, or illicit funds; answer each verification question one by one and check for inconsistencies or errors; based on the verification questions performed, if any inconsistencies or errors are found, provide revision suggestions and generate the final verification response. If everything is verified correctly, the following response will be generated: "Entity annotation conforms to the context, verification passed, no revision required." If errors or inconsistencies are found, a revised version will be generated.

[0136] Sentence: <Original sentence>

[0137] The generated annotation results: <Annotations generated by the model>
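A prompt of this shape can be assembled programmatically. In the sketch below, the context descriptions are paraphrased from the prompt above, while the question wording, the numbering, and the function names are illustrative assumptions rather than the exact prompt of this embodiment.

```python
from typing import Dict, List, Tuple

# Expected context per entity type, paraphrased from the verification prompt.
CONTEXT_RULES: Dict[str, str] = {
    "NHCS": "the criminal act or investigation",
    "NASI": "loss or theft",
    "NS": "the location of the incident",
    "NHVI": "the criminal act, investigation, or legal proceedings",
    "NT": "the time of the incident",
    "NCGV": "the value of the property",
    "NCSM": "monetary loss",
    "NO": "an organization or institution",
    "NATS": "tools of the crime",
    "NCSP": "criminal gains, illicit profits, or illicit funds",
}


def verification_questions(tags: List[Tuple[str, str]]) -> List[str]:
    """One context verification question per listed (type, entity) pair."""
    questions = []
    for etype, text in tags:
        if etype in CONTEXT_RULES:
            questions.append(
                f'Does "{text}" ({etype}) appear in a context describing {CONTEXT_RULES[etype]}?'
            )
        else:
            questions.append(f'"{text}" is tagged {etype}, which is not a predefined type. Which type applies?')
    return questions


def build_verification_prompt(sentence: str, annotated: str,
                              tags: List[Tuple[str, str]]) -> str:
    lines = [
        "Verify that the generated sentence has no missing, redundant, or replaced words.",
        "Then answer each verification question in turn and revise the annotation if needed:",
    ]
    lines += [f"{i}. {q}" for i, q in enumerate(verification_questions(tags), start=1)]
    lines += [f"Sentence: {sentence}", f"The generated annotation results: {annotated}"]
    return "\n".join(lines)


example_tags = [("NHCS", "Zhang Moumou"), ("NCSP", "600 yuan")]
verification_prompt = build_verification_prompt(
    "Zhang Moumou stole 600 yuan.",
    "[NHCS: Zhang Moumou] stole [NCSP: 600 yuan].",
    example_tags,
)
```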

[0138] In summary, the judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification provided in this embodiment improves the model's ability to distinguish between entity and non-entity information by combining instruction fine-tuning with ternary understanding enhancement, thereby reducing misidentification. Furthermore, the recognition performance of the model is further improved by generating a question list and a self-verification mechanism through VCoT-based verification chain reasoning.

[0139] The following complete case further illustrates the function and effect of this method. Input command: Help me extract all entities in the following sentence. Sentence: Zhang Moumou smashed the glass with a hammer in unit XX, room XX, XX city, and stole a B brand mobile phone and 600 yuan in cash.

[0140] (a) Generate initial response:

[0141] [NHCS: Zhang Moumou] smashed the glass with a [NATS: hammer] in [NS: unit XX, room XX, XX city], and stole a [NASI: B brand mobile phone] and [NCSP: 600 yuan] in cash.

[0142] (b) Sentence integrity check:

[0143] The generated sentence's text content is identical to the original text.

[0144] (c) Generate a list of tags:

[0145] NHCS: Zhang Moumou; NS: XX City, XX Unit, XX Room; NATS: Hammer; NASI: Brand B mobile phone; NCSP: 600 yuan.

[0146] (d) Entity type consistency check:

[0147] The entity type conforms to the predefined type.

[0148] (e) Design context verification questions:

[0149] Confirm whether "Zhang Moumou" appears reasonably in a context related to the criminal act, investigation, or legal process of the case; confirm whether "XX City, XX Unit, XX Room" appears in a context describing the location of the incident; confirm whether "hammer" appears in a context describing the tool used in the crime; confirm whether "Brand B mobile phone" appears in a context describing loss or theft; confirm whether "600 yuan" appears in a context describing criminal proceeds, illegal gains, or illicit funds.

[0150] (f) Perform verification:

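[0151] The annotations of "Zhang Moumou", "XX City, XX Unit, XX Room", "hammer", and "Brand B mobile phone" conform to the context. "600 yuan" is cash stolen in the theft, so it fits the context of monetary loss (NCSM) rather than illicit proceeds (NCSP) and needs to be revised.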

[0152] (g) Generate the final verification response:

[0153] NCSP (illicit proceeds) has been revised to NCSM (stolen currency) because "600 yuan" should be labeled as stolen cash, not as illicit proceeds. No other inconsistencies were found, and no further revisions are required.

[0154] Final annotation results: NHCS: Zhang Moumou; NS: XX City, XX Unit, XX Room; NATS: Hammer; NASI: B Brand Mobile Phone; NCSM: 600 Yuan.

[0155] Referring again to Figure 2, and based on the same inventive concept, this embodiment also provides a judicial text entity recognition system based on ternary instruction fine-tuning and VCoT verification, including an input module, an entity information extraction module, and a self-verification module. The input module includes a user instruction submodule and a user input submodule; the entity information extraction module includes a large-scale judicial entity recognition model, which is optimized by integrating ternary understanding enhancement with instruction fine-tuning; the self-verification module has a VCoT verification mechanism. The functions of each module and submodule have been given above and will not be repeated here.

[0156] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0157] The above embodiments are merely illustrative of several implementation methods of this application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A method for recognizing entities in judicial texts based on ternary instruction fine-tuning and VCoT verification, characterized in that it includes the following steps: S1, the user inputs the legal documents and instructions to be recognized; S2, an initial response is obtained through a large-scale judicial entity recognition model; the large-scale judicial entity recognition model is trained and optimized through instruction fine-tuning enhanced by ternary understanding; instruction fine-tuning guides the judicial entity recognition model to identify various key entity information from judicial documents through explicit instruction design and the injection of domain knowledge; instruction fine-tuning includes: using a context-aware data re-representation method, in which the context surrounding an entity is treated as a key non-entity sample; and re-labeling the dataset using a label prefixing method, which adds a unique prefix label to each entity type; instruction fine-tuning also includes a task mode of generating and predicting simultaneously, in which the model predicts the entity category of the currently generated word in real time while generating text; when the model predicts that the currently generated word belongs to a specific entity category, it adds an entity label to it; for ordinary words that do not belong to any entity, the model keeps their original form without adding any label; ternary understanding enhancement improves entity recognition performance in legal documents through deep semantic understanding and context adaptation capabilities; ternary understanding enhancement includes a normative module, a knowledge guidance module, and a contrastive learning module; the normative module is used to standardize the generated content; the knowledge guidance module provides a heuristic list containing definitions for each entity type and related feature vocabularies to help the model understand and distinguish different entity types; the heuristic list contains high-level rules or strategies for inferring specific tasks, and the feature vocabulary contains feature words associated with each entity type; the contrastive learning module is used to select examples that are highly similar in semantics, TF-IDF, and dependency relations from the modified training set, and combine them with the corresponding unmodified instances to construct a pair of highly contrastive learning samples; S3, the initial response is progressively reasoned and verified through the VCoT verification mechanism, the verification results are corrected and optimized, and the final verified entity recognition result is generated; the VCoT verification mechanism checks each identified entity step by step to ensure that the annotations generated by the model are consistent with the context in the original text and that there are no omissions or inconsistencies; the VCoT verification mechanism generates a series of inference chains through multiple rounds of inference verification.

2. The judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification according to claim 1, characterized in that, in S1, the legal documents and instructions provided by the user for recognition are received through the input module.

3. The judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification according to claim 1, characterized in that, in S2, when training the large-scale judicial entity recognition model, the input of the model consists of the following three parts: judicial text: $X$; instruction: $I$; ternary understanding enhancement: $K$, which includes the normative module text, the knowledge guidance module text, and the contrastive learning module text, that is, $K = \{K_{norm},\ K_{know},\ K_{cont}\}$, wherein $K_{norm}$ is the normative module text, $K_{know}$ is the knowledge guidance module text, and $K_{cont}$ is the contrastive learning module text; during model training, the task is first defined and the loss function is set; then input vectorization is performed; then encoding is performed; and then decoder generation is performed.

4. The method for judicial text entity recognition based on ternary instruction fine-tuning and VCoT verification of claim 1, wherein, S3 uses a self-verification module to ensure the accuracy and consistency of entity recognition results, with the VCoT verification mechanism nested within the self-verification module.

5. The judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification according to claim 4, characterized in that the inference chain includes: an entity type consistency check: whether the entity type is a predefined entity type; a context consistency check: ensuring that the generated content is consistent with the original content without omissions, additions, or deviations; and a logical check: checking whether the contextual collocation of entities is logical.

6. The judicial text entity recognition method based on ternary instruction fine-tuning and VCoT verification according to claim 1, characterized in that the VCoT verification mechanism includes the design of prompts.