Judgment document-based bidirectional encoder representation model optimization method and device

A representation and encoder technology, applied in the field of bidirectional encoder representation model optimization based on judgment documents, which can solve problems such as low data quality, poor model effect, and unreasonable task selection, and achieves an improved application effect and good support for downstream tasks.

Pending Publication Date: 2021-02-09
平安直通咨询有限公司

AI-Extracted Technical Summary

Problems solved by technology

[0004] At present, the cost of pre-training the BERT model is relatively high. Most model users cannot re-pre-train the BERT model based on the characteristic data of their application knowle...

Abstract

The invention relates to artificial intelligence, and provides a judgment document-based bidirectional encoder representation model optimization method and device. The method comprises the following steps: determining an initial pre-training model corresponding to legal judgment document data according to an initial bidirectional encoder representation model; obtaining a preset number of cause categories determined according to the legal judgment document data, and adding corresponding category labels to the cause categories; extracting a corresponding training data set from the legal judgment document data based on the category labels, and performing data preprocessing on the training data set; and, based on the preprocessed training data set, carrying out optimization training on the determined specific hyperparameters of the initial pre-training model to obtain an optimized bidirectional encoder representation model. With this method, natural language representation of legal judgment documents is achieved by the optimized bidirectional encoder representation model, and the application effect of the bidirectional encoder representation model in the field of legal knowledge to which judgment documents belong is improved.



Example Embodiment

[0058]In order to make the purpose, technical solutions, and advantages of this application clearer, the following further describes this application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application, not to limit it.
[0059]The method for optimizing the bidirectional encoder representation model based on judgment documents provided in this application can be applied in the application environment shown in Figure 1, in which the terminal 102 communicates with the server 104 through the network. According to the initial bidirectional encoder representation model, the initial pre-training model corresponding to the legal judgment document data is determined. The legal judgment document data can be stored in the local storage of the terminal 102, or, when a corresponding model optimization instruction is detected, retrieved from the cloud storage of the server 104 and sent to the terminal 102. A preset number of cause categories determined according to the legal judgment document data is obtained, and a corresponding category label is added to each cause category; the corresponding training data set is then extracted from the legal judgment document data based on the category labels, and data preprocessing is performed on the training data set. Based on the preprocessed training data set, the determined specific hyperparameters of the initial pre-training model are optimized and trained to obtain the optimized bidirectional encoder representation model. The terminal 102 can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 can be implemented as an independent server or as a server cluster composed of multiple servers.
[0060]In one embodiment, as shown in Figure 2, a method for optimizing the bidirectional encoder representation model based on judgment documents is provided. Taking the application of the method to the terminal in Figure 1 as an example, it includes the following steps:
[0061]Step S202: Determine an initial pre-training model corresponding to the legal judgment document data according to the initial bidirectional encoder representation model.
[0062]Here, the initial bidirectional encoder representation model is built on the original neural network model, that is, the original Transformer model, which is stacked further to obtain a multilayer bidirectional Transformer encoder. The bidirectional encoder representation model requires a fixed sequence length, such as 128: shorter inputs are padded at the end, and longer inputs have the extra words truncated, ensuring that the input is a fixed-length word sequence. The first token is the special [CLS], which is used to encode the semantics of all other words in the sentence. Judgment documents have the characteristics of authority, standardized language, accurate wording, complete data, and high overall quality.
[0063]Specifically, BERT-Base, Chinese (a Chinese-character model trained on both simplified and traditional Chinese) is selected as the initial pre-training model from the existing pre-trained bidirectional encoder representation models. This pre-training model is trained on a large-scale Chinese corpus and provides a good Chinese language representation; optimization training of such a pre-training model realizes the parameter training and optimization of the initial bidirectional encoder representation model. The pre-trained bidirectional encoder representation model provides powerful, context-dependent sentence representations and can be used for a variety of natural language processing tasks, including intent recognition and word-slot filling.
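As a hedged sketch of this step: the patent does not name a toolkit, so the Hugging Face transformers library and the "bert-base-chinese" checkpoint name below are assumptions, not part of the original disclosure.

```python
# Sketch only: load the BERT-Base, Chinese checkpoint as the initial pre-training model.
# The library (Hugging Face transformers) and checkpoint name are assumed.
from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # character-level Chinese vocabulary
bert = TFBertModel.from_pretrained("bert-base-chinese")         # multilayer bidirectional Transformer encoder
```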
[0064]Step S204: Obtain a preset number of cause categories determined according to the legal judgment document data, and add a corresponding category label to each cause category.
[0065]Specifically, a preset number of legal judgment documents is selected from the database, and a preset number of cause categories is selected from the legal judgment document data, where the preset number corresponds to the model training scale and can be set to 10.
[0066]Further, the rule for selecting a cause of action is that the cause-of-action field is complete and the special field "ascertained by this court" is complete. Under this selection rule, the cause categories include private loan disputes, motor vehicle accident liability disputes, financial loan contract disputes, credit card disputes, house sale disputes, labor contract disputes, lease contract disputes, right of recourse disputes, copyright infringement disputes, and insurance disputes. A corresponding category label is added to each cause category: the category labels set for the preset number of cause categories comprise the first label, the second label, ... and the tenth label, and each category label identifies the cause category to which a given legal judgment document belongs.
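For illustration only, the ten cause categories and their first-to-tenth labels might be encoded as a simple mapping; the dictionary name and integer indices are hypothetical, not from the patent.

```python
# Hypothetical encoding of the ten cause-of-action categories as category labels 0..9.
CAUSE_LABELS = {
    "private loan dispute": 0,                       # first label
    "motor vehicle accident liability dispute": 1,   # second label
    "financial loan contract dispute": 2,
    "credit card dispute": 3,
    "house sale dispute": 4,
    "labor contract dispute": 5,
    "lease contract dispute": 6,
    "right of recourse dispute": 7,
    "copyright infringement dispute": 8,
    "insurance dispute": 9,                          # tenth label
}
```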
[0067]Step S206: Extract a corresponding training data set from the legal judgment document data based on the category label, and perform data preprocessing on the training data set.
[0068]Specifically, the legal judgment document data is classified according to the determined category labels, and the legal judgment documents falling under the cause category corresponding to each label are obtained. The documents corresponding to the different category labels together form an initial data set, the corresponding training data set is obtained by performing data preprocessing on this initial data set, and the training data set is stored in a character-separated value file.
[0069]The character-separated value file is provided with special fields and the case data corresponding to those fields. Data preprocessing of the initial data set includes data cleaning: deleting the HTML tags, special characters, and garbled characters in the legal judgment document data.
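A minimal sketch of this cleaning step, assuming regular expressions are an acceptable way to remove HTML tags and control/garbled characters; the function name and patterns are illustrative.

```python
import re

def clean_judgment_text(text: str) -> str:
    """Data cleaning: delete HTML tags, special characters, and garbled characters (sketch)."""
    text = re.sub(r"<[^>]+>", "", text)                         # delete HTML tags
    text = re.sub(r"[\x00-\x08\x0b-\x1f\x7f\ufffd]", "", text)  # control chars and replacement char
    return re.sub(r"\s+", " ", text).strip()                    # normalize whitespace
```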
[0070]In an embodiment, performing data preprocessing on the initial data set further includes:
[0071]According to a preset ratio, the legal judgment document data is divided into a training data set, a verification data set, and a test data set. The training data set is used to train the initial pre-training model; the verification data set is used to verify the generalization ability of the initial pre-training model during training and to determine whether under-fitting or over-fitting occurs; the test data set is used for index testing of the optimized bidirectional encoder representation model.
[0072]The preset ratio is training set : verification set : test set = 7:2:1; that is, the legal judgment document data is divided into the training data set, verification data set, and test data set according to the ratio 7:2:1.
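A sketch of the 7:2:1 split, assuming the documents are shuffled first; the helper name and seed are illustrative.

```python
import random

def split_7_2_1(documents, seed=0):
    """Divide the legal judgment document data into train/validation/test sets at 7:2:1."""
    docs = list(documents)
    random.Random(seed).shuffle(docs)
    n = len(docs)
    n_train, n_val = int(n * 0.7), int(n * 0.2)
    return docs[:n_train], docs[n_train:n_train + n_val], docs[n_train + n_val:]
```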
[0073]In step S208, based on the preprocessed training data set, optimization training is performed on the determined specific hyperparameters of the initial pre-training model to obtain the optimized bidirectional encoder representation model.
[0074]Specifically, according to the preprocessed training data set, the determined specific hyperparameters of the initial pre-training model are optimized during training, where the specific hyperparameters include: batch: 64, that is, the batch size is set to 64; max_len (maximum input sequence length): 256, that is, the maximum input sequence length is set to 256; and epoch (number of training iterations): 5, that is, 5 training iterations are performed.
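Collected as a configuration object, the specific hyperparameters named above would look like this; the variable name is illustrative.

```python
# The specific hyperparameters stated in the text.
HYPERPARAMS = {
    "batch_size": 64,  # batch: 64
    "max_len": 256,    # maximum input sequence length: 256
    "epochs": 5,       # number of training iterations: 5
}
```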
[0075]Further, the determined specific hyperparameters of the initial pre-training model are optimized and trained to obtain the optimized bidirectional encoder representation model, based on the architecture of the optimized bidirectional encoder representation model shown in Figure 3. Referring to Figure 3, the training process includes:
[0076]All parameters are trained using model_2 (the representation training layer), bidirectional_1 (lstm) (the sequence classification layer), and the dense_1 layer (the output layer) in the model architecture, including:
[0077]1) Train the representation of the input layer according to the representation training layer;
[0078]2) Classify the output sequence of the upper layer according to the sequence classification layer;
[0079]3) Obtain the final output result from the output layer; the output result is the probability distribution over the 10 cause categories.
[0080]Here, input_1 (the first input layer) and input_2 (the second input layer) correspond to the embeddings layer and the segments (sentence) layer of the BERT model respectively, and all sublayers in model_2 (the representation training layer) are set to trainable. The bidirectional_1 (lstm_1) layer (the sequence classification layer) is a 128-unit bidirectional long short-term memory network layer, which performs the downstream task of the initial bidirectional encoder representation model, namely the classification task. The dense_1 layer (the output layer) is a fully connected layer with a multi-class activation, and its output is the probability distribution over the 10 cause categories.
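The layer names above (input_1, input_2, model_2, bidirectional_1, dense_1) follow Keras conventions, so the architecture can be sketched in TensorFlow/Keras as below. The optimizer, learning rate, and loss are assumptions not stated in the patent; only the layer structure and the batch/epoch settings come from the text.

```python
import tensorflow as tf
from transformers import TFBertModel

MAX_LEN, NUM_CLASSES = 256, 10

input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_1")    # token ids (embeddings layer)
segment_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_2")  # sentence ids (segments layer)

bert = TFBertModel.from_pretrained("bert-base-chinese")  # model_2: representation training layer
bert.trainable = True                                    # all sublayers set to trainable
sequence_output = bert(input_ids, token_type_ids=segment_ids)[0]  # per-token representations

# bidirectional_1: 128-unit bidirectional LSTM performing the downstream classification task
features = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128))(sequence_output)
# dense_1: fully connected output layer, probability distribution over the 10 cause categories
probs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)

model = tf.keras.Model(inputs=[input_ids, segment_ids], outputs=probs)
model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),  # learning rate is an assumption
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(..., batch_size=64, epochs=5) with the hyperparameters listed above
```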
[0081]In an embodiment, after the optimized bidirectional encoder representation model is obtained, the method further includes:
[0082]According to the optimized bidirectional encoder representation model, the existing legal judgment document data is classified, and the distribution probability of each legal judgment document over the predetermined number of cause categories is obtained.
[0083]Specifically, the optimized bidirectional encoder representation model is applied: the legal judgment document data in the existing database is taken as the application data set and input into the optimized bidirectional encoder representation model, realizing the classification of the legal judgment documents in the existing database and yielding, for every legal judgment document in the database, its distribution probability over the determined preset number (for example, 10) of cause categories.
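A hedged sketch of this application step, reusing the tokenizer and model objects from the earlier sketches; the function name is illustrative.

```python
def classify_database_documents(model, tokenizer, documents, max_len=256):
    """Return, per judgment document, its probability distribution over the 10 cause categories."""
    enc = tokenizer(documents, padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="tf")
    return model.predict([enc["input_ids"], enc["token_type_ids"]])
```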
[0084]In the above method for optimizing the bidirectional encoder representation model based on judgment documents, the initial pre-training model corresponding to the legal judgment document data is determined according to the initial bidirectional encoder representation model; a preset number of cause categories determined according to the legal judgment document data is obtained, and a corresponding category label is added to each cause category; the corresponding training data set is extracted from the legal judgment document data based on the category labels and preprocessed; and, based on the preprocessed training data set, the determined specific hyperparameters of the initial pre-training model are optimized to obtain the optimized bidirectional encoder representation model. The method thus realizes optimization of the bidirectional encoder representation model, so that the optimized model better represents the natural language of judgment documents in the legal field, provides good support for downstream classification models in the legal field, and improves the application effect of the bidirectional encoder representation model in the field of legal knowledge to which judgment documents belong.
[0085]In one embodiment, as shown in Figure 4, a method for optimizing the bidirectional encoder representation model based on judgment documents is provided, which specifically includes the following steps:
[0086]Step S402: Obtain a preset character sequence to be input.
[0087]Specifically, as shown in Figure 5, which provides the construction process of the initial bidirectional encoder representation model: E1, E2, ..., En represent the outputs of the embedding layer; trm represents the multilayer original neural network model, that is, the Transformer model; and T1, T2, ..., Tn indicate the fine-tuned output data of the different embedding layers.
[0088]In step S404, the character sequence to be input is converted into a number sequence corresponding to the character sequence to be input via the vocabulary.
[0089]Specifically, a preset vocabulary is obtained, which includes the mapping relationship between characters and numbers; by passing the character sequence to be input through the vocabulary, it is converted into the number sequence corresponding to the character sequence to be input.
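A minimal sketch of this vocabulary lookup; the excerpt of the mapping and the unknown-character id are illustrative, not taken from any real vocabulary file.

```python
# Illustrative excerpt of a character-to-number vocabulary.
VOCAB = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102, "原": 1, "告": 2}

def chars_to_numbers(chars, vocab=VOCAB):
    """Convert a character sequence into its corresponding number sequence."""
    return [vocab.get(ch, vocab["[UNK]"]) for ch in chars]
```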
[0090]Step S406: Obtain multiple embedding layers connected to the original neural network model.
[0091]Specifically, as shown in Figure 6, which provides the input representation of the initial bidirectional encoder representation model: given two sentences of different structure as the text sequence to be input, the special token [CLS] is added at the beginning of the first input sentence, and the special token [SEP] is added after the last word of that sentence to indicate the end of the first sentence. Similarly, a special token [SEP] is added after the last word of the other sentence to indicate the end of the second sentence.
[0092]Further, the embedding layers connected with the original neural network model include a token (word) embedding layer, a position embedding layer, and a segment (sentence) embedding layer. An embedding maps a thing to a point in a multidimensional space, that is, to a vector: the word embedding layer maps a word to a word vector, the position embedding layer maps position information to a point in the position space, and, in the same way, the sentence embedding layer maps sentence information to a sentence vector. The sentence information indicates which sentence a selected word belongs to, and different sentences are separated by [SEP]. For example, position embedding is similar to word embedding in that it maps a position to a low-dimensional dense vector. The sentence embedding takes only two values, the first sentence or the second sentence, and each sentence corresponds to one embedding vector.
[0093]Step S408: Input the number sequence into each embedding layer to obtain the output data of each embedding layer.
[0094]Specifically, the number sequence is input into the word embedding layer, the position embedding layer, and the sentence embedding layer to obtain the output data of the different embedding layers. The word embedding layer maps each number in the number sequence to a corresponding vector; the position embedding layer maps the position information to a point, also a vector, in the position space; and the sentence embedding layer maps the sentence information to a sentence vector, determining which sentence the input number sequence belongs to.
[0095]Step S410: Sum the output data of each embedding layer to obtain an output data sequence.
[0096]Specifically, by summing the output data of the word embedding layer, the position embedding layer, and the sentence embedding layer, the output data sequence corresponding to the number sequence is obtained.
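A sketch of steps S408-S410, assuming Keras embedding layers; the dimensions shown (vocabulary size, hidden size) follow common BERT-Base Chinese conventions and are assumptions.

```python
import tensorflow as tf

VOCAB_SIZE, MAX_LEN, HIDDEN = 21128, 128, 768             # BERT-Base Chinese conventions (assumed)
word_emb = tf.keras.layers.Embedding(VOCAB_SIZE, HIDDEN)  # word embedding layer
pos_emb = tf.keras.layers.Embedding(MAX_LEN, HIDDEN)      # position embedding layer
sent_emb = tf.keras.layers.Embedding(2, HIDDEN)           # sentence (segment) embedding layer

def output_data_sequence(number_seq, sentence_ids):
    """Sum the output data of the three embedding layers (step S410); inputs are integer tensors."""
    positions = tf.range(tf.shape(number_seq)[-1])        # 0, 1, ..., sequence length - 1
    return word_emb(number_seq) + pos_emb(positions) + sent_emb(sentence_ids)
```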
[0097]Step S412: Train the multilayer original neural network model according to the output data sequence, and construct the initial bidirectional encoder representation model.
[0098]Specifically, the output data sequence corresponding to the number sequence is obtained and used as the training data of the multilayer original neural network model, and the model is trained according to this training data to obtain the initial bidirectional encoder representation model.
[0099]In the above method for optimizing the bidirectional encoder representation model based on judgment documents, the obtained character sequence to be input is converted, via the vocabulary, into the corresponding number sequence. Multiple embedding layers connected with the original neural network model are obtained, the number sequence is input into each embedding layer to obtain the output data of each embedding layer, and the output data of the embedding layers are summed to obtain the output data sequence. According to the output data sequence, the multilayer original neural network model is trained to construct the initial bidirectional encoder representation model. This realizes training of the original multilayer neural network model from the input text sequence and yields the initial bidirectional encoder representation model that can be used to determine the pre-training model, in turn enabling optimization of the model according to the determined pre-training model and improving the application effect of the model in the field of legal knowledge.
[0100]In one embodiment, as shown in Figure 7, the steps of generating the training data set specifically include:
[0101]Step S702: Obtain a data length threshold preset for the initial data set.
[0102]Specifically, the data length threshold preset for the initial data set is obtained, where the preset data length threshold may be 256, that is, the data length included in the initial data set is less than or equal to 256.
[0103]Step S704: Perform length alignment on the initial data set according to the data length threshold to obtain an initial data set with the same length.
[0104]Specifically, according to the acquired data length threshold, a length alignment operation is performed on each item in the initial data set: if an item's length is less than the data length threshold of 256, the data is padded at the end with the constant value 0.
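A minimal sketch of the length-alignment operation, assuming the number sequences are Python lists; the function name is illustrative.

```python
def align_length(number_seq, threshold=256, pad_value=0):
    """Pad with the constant 0 after the data (or truncate) to the length threshold."""
    seq = number_seq[:threshold]
    return seq + [pad_value] * (threshold - len(seq))
```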
[0105]Step S706: Perform vectorization processing on the category labels corresponding to each case category in the initial data set to obtain label vectors corresponding to different category labels.
[0106]Specifically, the category label corresponding to each cause category in the initial data set is vectorized: each category label is converted into a one-hot vector, that is, a multi-class label vector, yielding the label vectors corresponding to the different category labels. A one-hot vector is a feature vector of an attribute in which only one position is activated (non-zero) at a time; the vector has exactly one non-zero feature, and all other entries are 0.
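A sketch of this vectorization, reusing the hypothetical CAUSE_LABELS mapping from the earlier example.

```python
def label_to_one_hot(label_index, num_classes=10):
    """Convert a category label into its one-hot label vector (exactly one non-zero entry)."""
    vec = [0] * num_classes
    vec[label_index] = 1
    return vec

# e.g. label_to_one_hot(CAUSE_LABELS["credit card dispute"]) -> [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```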
[0107]Step S708: Obtain the legal judgment document data corresponding to each label vector, and perform data cleaning on it, deleting the special characters, garbled characters, and hypertext markup language (HTML) tags in the legal judgment document data to obtain the corresponding training data set.
[0108]Specifically, the legal judgment document data corresponding to each label vector is obtained, where the label vectors comprise the 10 label vectors corresponding to the 10 category labels, and data cleaning is performed on the documents corresponding to each label vector, including deleting the special characters, garbled characters, and HTML tags, to obtain the corresponding training data set.
[0109]In this embodiment, the preset data length threshold for the initial data set is obtained, and the initial data set is length-aligned according to the threshold to obtain an initial data set of uniform length; the category labels corresponding to the cause categories in the initial data set are vectorized to obtain the label vectors corresponding to the different category labels; and the legal judgment document data corresponding to each label vector is obtained and cleaned, deleting special characters, garbled characters, and HTML tags, to obtain the corresponding training data set. This realizes preprocessing of the initial data set and avoids invalid or blank data interrupting the subsequent model optimization training, thereby improving the efficiency of model optimization training.
[0110]It should be understood that although the steps in the flowcharts of Figure 2, Figure 4, and Figure 7 are displayed in the sequence indicated by the arrows, these steps are not necessarily executed in that order. Unless specifically stated herein, the execution order of these steps is not strictly restricted, and they can be executed in other orders. Moreover, at least some of the steps in Figure 2, Figure 4, and Figure 7 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time but can be executed at different times, and they are not necessarily performed sequentially: they may be performed in turn or alternately with at least part of the other steps, or of the sub-steps or stages of other steps.
[0111]In one embodiment, as shown in Figure 8, a device for optimizing the bidirectional encoder representation model based on judgment documents is provided, including: an initial pre-training model determination module 802, a category label adding module 804, a training data set determination module 806, and a bidirectional encoder representation model optimization module 808, where:
[0112]The initial pre-training model determination module 802 is used to determine the initial pre-training model corresponding to the legal judgment document data according to the initial bidirectional encoder representation model;
[0113]The category label adding module 804 is used to obtain a preset number of cause categories determined according to the legal judgment document data, and to add a corresponding category label to each cause category;
[0114]The training data set determination module 806 is used to extract the corresponding training data set from the legal judgment document data based on the category labels, and to perform data preprocessing on the training data set;
[0115]The bidirectional encoder representation model optimization module 808 is used to optimize and train the determined specific hyperparameters of the initial pre-training model based on the preprocessed training data set, to obtain the optimized bidirectional encoder representation model.
[0116]In the above device for optimizing the bidirectional encoder representation model based on judgment documents, the initial pre-training model corresponding to the legal judgment document data is determined according to the initial bidirectional encoder representation model; a preset number of cause categories determined according to the legal judgment document data is obtained, and a corresponding category label is added to each cause category; the corresponding training data set is extracted from the legal judgment document data based on the category labels and preprocessed; and, based on the preprocessed training data set, the determined specific hyperparameters of the initial pre-training model are optimized to obtain the optimized bidirectional encoder representation model. The device thus realizes optimization of the bidirectional encoder representation model, so that the optimized model better represents the natural language of judgment documents in the legal field, provides good support for downstream classification models in the legal field, and improves the application effect of the bidirectional encoder representation model in the field of legal knowledge to which judgment documents belong.
[0117]In one embodiment, a device for optimizing the bidirectional encoder representation model based on judgment documents is provided, which further includes: a text sequence acquisition module, a number sequence generation module, an embedding layer acquisition module, an output data generation module, an output data sequence generation module, and an initial bidirectional encoder representation model building module, where:
[0118]The text sequence acquisition module is used to obtain a preset text sequence to be input.
[0119]The number sequence generation module is used to convert the character sequence to be input, via the vocabulary, into the corresponding number sequence.
[0120]The embedding layer acquisition module is used to acquire the multiple embedding layers connected to the original neural network model.
[0121]The output data generation module is used to input the number sequence into each embedding layer to obtain the output data of each embedding layer.
[0122]The output data sequence generation module is used to sum the output data of the embedding layers to obtain the output data sequence.
[0123]The initial bidirectional encoder representation model building module is used to train the multilayer original neural network model according to the output data sequence and to construct the initial bidirectional encoder representation model.
[0124]In the above device for optimizing the bidirectional encoder representation model based on judgment documents, the acquired character sequence to be input is converted, via the vocabulary, into the corresponding number sequence. The multiple embedding layers connected with the original neural network model are obtained, the number sequence is input into each embedding layer to obtain the output data of each embedding layer, and the output data of the embedding layers are summed to obtain the output data sequence. According to the output data sequence, the multilayer original neural network model is trained to construct the initial bidirectional encoder representation model. This realizes training of the original multilayer neural network model from the input text sequence and yields the initial bidirectional encoder representation model that can be used to determine the pre-training model, in turn enabling optimization of the model according to the determined pre-training model and improving the application effect of the model in the field of legal knowledge.
[0125]In an embodiment, the training data set determination module is also used to:
[0126]Obtain the preset data length threshold for the initial data set; perform length alignment on the initial data set according to the data length threshold to obtain an initial data set of uniform length; vectorize the category labels corresponding to the cause categories in the initial data set to obtain the label vectors corresponding to the different category labels; and obtain the legal judgment document data corresponding to each label vector and perform data cleaning on it, deleting the special characters, garbled characters, and hypertext markup language tags in the legal judgment document data, to obtain the corresponding training data set.
[0127]In this embodiment, the preset data length threshold for the initial data set is obtained, and the initial data set is length-aligned according to the threshold to obtain an initial data set of uniform length; the category labels corresponding to the cause categories in the initial data set are vectorized to obtain the label vectors corresponding to the different category labels; and the legal judgment document data corresponding to each label vector is obtained and cleaned, deleting special characters, garbled characters, and hypertext markup language tags, to obtain the corresponding training data set. This realizes preprocessing of the initial data set and avoids invalid or blank data interrupting the subsequent model optimization training, thereby improving the efficiency of model optimization training.
[0128]In an embodiment, the device for optimizing the bidirectional encoder representation model based on judgment documents further includes:
[0129]The distribution probability determination module is used to classify the existing legal judgment document data according to the optimized bidirectional encoder representation model, to obtain the distribution probability of each legal judgment document over the preset number of cause categories.
[0130]In an embodiment, the training data set determination module is also used to:
[0131]Classify the legal judgment document data based on the category labels, obtain the legal judgment documents under the cause categories corresponding to the different category labels, and obtain the initial data set composed of the legal judgment documents corresponding to the different category labels; perform data preprocessing on the initial data set to obtain the corresponding training data set; and store the training data set in a character-separated value file, where the character-separated value file is provided with special fields and the case data corresponding to those fields.
[0132]In an embodiment, the training data set determination module is also used to:
[0133]According to the preset ratio, the legal judgment document data is divided into a training data set, a verification data set, and a test data set. The training data set is used to train the initial pre-training model; the verification data set is used to verify the generalization ability of the initial pre-training model during training and to determine whether under-fitting or over-fitting occurs; the test data set is used for index testing of the optimized bidirectional encoder representation model.
[0134]For the specific limitations of the device for optimizing the bidirectional encoder representation model based on judgment documents, refer to the above limitations on the corresponding method, which will not be repeated here. Each module in the above device can be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, the processor of a computer device, or may be stored in software form in the memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.
[0135]In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure diagram may be as shown in Figure 9. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a storage medium and an internal memory; the storage medium stores an operating system and a computer program, and the internal memory provides an environment for the operation of the operating system and the computer program in the storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, realizes a method for optimizing the bidirectional encoder representation model based on judgment documents. The display screen of the computer device can be a liquid crystal display or an electronic ink display, and the input device can be a touch layer covering the display screen, a button, trackball, or touchpad set on the housing of the computer device, or an external keyboard, touchpad, or mouse.
[0136]Those skilled in the art can understand that the structure shown in Figure 9 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied. A specific computer device may include more or fewer parts, combine some parts, or have a different arrangement of parts.
[0137]In one embodiment, a computer device is provided, including a memory and a processor, where the memory stores a computer program and the processor implements the following steps when executing it:
[0138]Determine the initial pre-training model corresponding to the legal judgment document data according to the initial bidirectional encoder representation model;
[0139]Obtain a preset number of cause categories determined based on the legal judgment document data, and add a corresponding category label to each cause category;
[0140]Extract the corresponding training data set from the legal judgment document data based on the category labels, and perform data preprocessing on the training data set;
[0141]Based on the preprocessed training data set, optimize and train the determined specific hyperparameters of the initial pre-training model to obtain the optimized bidirectional encoder representation model.
[0142]In an embodiment, the processor further implements the following steps when executing the computer program:
[0143]According to the optimized bidirectional encoder representation model, the existing legal judgment document data is classified, and the distribution probability of each legal judgment document over the predetermined number of cause categories is obtained.
[0144]In an embodiment, the processor further implements the following steps when executing the computer program:
[0145]Obtain the preset text sequence to be input;
[0146]Convert the character sequence to be input, via the vocabulary, into the corresponding number sequence;
[0147]Obtain the multiple embedding layers connected to the original neural network model;
[0148]Input the number sequence into each embedding layer to obtain the output data of each embedding layer;
[0149]Sum the output data of the embedding layers to obtain the output data sequence;
[0150]Train the multilayer original neural network model according to the output data sequence to construct the initial bidirectional encoder representation model.
[0151]In an embodiment, the processor further implements the following steps when executing the computer program:
[0152]Classify the legal judgment document data based on the category labels, obtain the legal judgment documents under the cause categories corresponding to the different category labels, and obtain the initial data set composed of the legal judgment documents corresponding to the different category labels;
[0153]Perform data preprocessing on the initial data set to obtain the corresponding training data set;
[0154]Store the training data set in a character-separated value file, where the character-separated value file is provided with special fields and the case data corresponding to those fields.
[0155]In an embodiment, the processor further implements the following steps when executing the computer program:
[0156]Obtain the preset data length threshold for the initial data set;
[0157]Perform length alignment on the initial data set according to the data length threshold to obtain an initial data set of uniform length;
[0158]Vectorize the category labels corresponding to the cause categories in the initial data set to obtain the label vectors corresponding to the different category labels;
[0159]Obtain the legal judgment document data corresponding to each label vector, and perform data cleaning on it, deleting the special characters, garbled characters, and hypertext markup language tags in the legal judgment document data, to obtain the corresponding training data set.
[0160]In an embodiment, the processor further implements the following steps when executing the computer program:
[0161]According to the preset ratio, the legal judgment document data is divided into a training data set, a verification data set, and a test data set. The training data set is used to train the initial pre-training model; the verification data set is used to verify the generalization ability of the initial pre-training model during training and to determine whether under-fitting or over-fitting occurs; the test data set is used for index testing of the optimized bidirectional encoder representation model.
[0162]In one embodiment, a computer storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
[0163]Determine the initial pre-training model corresponding to the legal judgment document data according to the initial bidirectional encoder representation model;
[0164]Obtain a preset number of cause categories determined based on the legal judgment document data, and add a corresponding category label to each cause category;
[0165]Extract the corresponding training data set from the legal judgment document data based on the category labels, and perform data preprocessing on the training data set;
[0166]Based on the preprocessed training data set, optimize and train the determined specific hyperparameters of the initial pre-training model to obtain the optimized bidirectional encoder representation model.
[0167]In an embodiment, the computer program further implements the following steps when being executed by the processor:
[0168]According to the optimized bidirectional encoder representation model, the existing legal judgment document data is classified, and the distribution probability of each legal judgment document over the predetermined number of cause categories is obtained.
[0169]In an embodiment, the computer program further implements the following steps when being executed by the processor:
[0170]Obtain the preset text sequence to be input;
[0171]Convert the character sequence to be input, via the vocabulary, into the corresponding number sequence;
[0172]Obtain the multiple embedding layers connected to the original neural network model;
[0173]Input the number sequence into each embedding layer to obtain the output data of each embedding layer;
[0174]Sum the output data of the embedding layers to obtain the output data sequence;
[0175]Train the multilayer original neural network model according to the output data sequence to construct the initial bidirectional encoder representation model.
[0176]In an embodiment, the computer program further implements the following steps when being executed by the processor:
[0177]Classify the legal judgment document data based on the category labels, obtain the legal judgment documents under the cause categories corresponding to the different category labels, and obtain the initial data set composed of the legal judgment documents corresponding to the different category labels;
[0178]Perform data preprocessing on the initial data set to obtain the corresponding training data set;
[0179]Store the training data set in a character-separated value file, where the character-separated value file is provided with special fields and the case data corresponding to those fields.
[0180]In an embodiment, the computer program further implements the following steps when being executed by the processor:
[0181]Obtain the preset data length threshold for the initial data set;
[0182]Perform length alignment on the initial data set according to the data length threshold to obtain an initial data set of uniform length;
[0183]Vectorize the category labels corresponding to the cause categories in the initial data set to obtain the label vectors corresponding to the different category labels;
[0184]Obtain the legal judgment document data corresponding to each label vector, and perform data cleaning on it, deleting the special characters, garbled characters, and hypertext markup language tags in the legal judgment document data, to obtain the corresponding training data set.
[0185]In an embodiment, the computer program further implements the following steps when being executed by the processor:
[0186]According to the preset ratio, the legal judgment document data is divided into a training data set, a verification data set, and a test data set. The training data set is used to train the initial pre-training model; the verification data set is used to verify the generalization ability of the initial pre-training model during training and to determine whether under-fitting or over-fitting occurs; the test data set is used for index testing of the optimized bidirectional encoder representation model.
[0187]A person of ordinary skill in the art can understand that all or part of the processes in the above method embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it may include the procedures of the above method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
[0188]The technical features of the above embodiments can be combined arbitrarily. For conciseness, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope described in this specification.
[0189]The above embodiments express only several implementation manners of the present application, and their description is relatively specific and detailed, but they should not be understood as limiting the scope of the invention patent. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within its protection scope. Therefore, the protection scope of this application's patent shall be subject to the appended claims.