Language model dynamic updating method and device, electronic equipment and storage medium

By combining local small models with pre-trained large models, and using word segmentation, vector matching, and integration to generate prompt data, the method addresses the inability of AIGC models to adapt to changes in dynamic datasets. This enables low-cost and efficient dynamic updates of the language model, improving user experience and model adaptability.

CN116992283B · Active · Publication Date: 2026-06-23 · SHENZHEN FULIN TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SHENZHEN FULIN TECH CO LTD
Filing Date: 2023-07-21
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Traditional AIGC model pre-training methods are based on static datasets, which cannot adapt to changes in dynamic datasets, resulting in limited model performance and high costs and resource consumption for large-scale training.

Method used

By combining local small models with pre-trained large models, prompt data is generated through word segmentation, vector matching, and integration, and the local optimized model is dynamically updated to achieve low-cost dynamic updating of the language model.

Benefits of technology

It enables real-time updates of the language model, reduces costs and resource consumption, improves the model's adaptability and matching accuracy, and enhances user experience and satisfaction.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a language model dynamic updating method and device, electronic equipment and storage medium, and belongs to the technical field of data processing. The method comprises the following steps: collecting a user question, performing word segmentation on the user question to obtain a plurality of keyword vectors; obtaining a basic word vector in a vector library, matching the basic word vector with the keyword vectors to obtain a plurality of matching word vectors; screening the plurality of matching word vectors based on a local initial model to obtain a plurality of preliminary screening vectors; integrating the plurality of preliminary screening vectors based on a pre-training model to generate a plurality of prompt data; updating the local initial model based on the prompt data to obtain a local optimization model, and obtaining a target language processing system based on the local optimization model and the pre-training model. The application can realize low-cost and efficient dynamic updating of the language model, better meet the needs and application scenarios of users, and improve the accuracy and efficiency of matching.

Description

Technical Field

[0001] This invention relates to the field of data processing technology, and more specifically, to a method, apparatus, electronic device, and storage medium for dynamically updating a language model.

Background Technology

[0002] In natural language processing, AIGC (Artificial Intelligence Generated Content) models require pre-training before deployment. Traditional pre-training methods can only be used with static datasets. However, in practical applications, static datasets cannot be updated as training data increases or application scenarios expand, which significantly limits model performance. Furthermore, AIGC models are large; periodically retraining them on refreshed static datasets would incur substantial costs and consume substantial resources.

Summary of the Invention

[0003] To address the aforementioned problems, this invention provides a method, apparatus, electronic device, and storage medium for dynamically updating a language model, which enables dynamic updates of the language model with good real-time performance and low cost.

[0004] In a first aspect, embodiments of this application provide a method for dynamically updating a language model, the method comprising:

[0005] Collect user questions, segment the user questions into words, and obtain multiple keyword vectors;

[0006] Obtain basic word vectors from the vector library, and match the basic word vectors with the keyword vectors to obtain multiple matching word vectors;

[0007] Based on the local initial model, multiple matching word vectors are filtered to obtain multiple preliminary filtered vectors;

[0008] Based on the pre-trained model, multiple preliminary screening vectors are integrated to generate multiple prompt data.

[0009] The local initial model is updated based on the prompt data to obtain a local optimized model, and the target language processing system is obtained based on the local optimized model and the pre-trained model.

[0010] In one embodiment, the step of segmenting the user question to obtain multiple keyword vectors includes:

[0011] The user question is segmented using a word segmentation tool to obtain multiple word segmentation results;

[0012] The word segmentation results are encoded using a word vector model to obtain multiple keyword vectors.

[0013] In one embodiment, matching the base word vectors with the keyword vectors to obtain multiple matching word vectors includes:

[0014] Calculate the similarity between the basic word vectors and the keyword vectors, and sort the basic word vectors in descending order of similarity to obtain a sequential queue;

[0015] The first N basic word vectors of the sequential queue are determined as the matching word vectors.

[0016] In one embodiment, the method further includes:

[0017] Return the index of each matched word vector in the vector library.

[0018] In one embodiment, the integration of multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data includes:

[0019] Multiple preliminary selection vectors are input into the pre-trained model to generate multiple prompt data.

[0020] In one embodiment, updating the local initial model based on the prompt data includes:

[0021] An updated training set is determined based on the prompt data;

[0022] The local initial model is retrained based on the updated training set.

[0023] In one embodiment, determining the updated training set based on the prompt data includes:

[0024] Based on the prompt data, obtain updated annotation data;

[0025] The updated labeled data is added to the initial training set to obtain the updated training set.

[0026] In a second aspect, embodiments of this application provide a language model dynamic update device, the language model dynamic update device comprising:

[0027] The word segmentation module is used to collect user questions, segment the user questions into words, and obtain multiple keyword vectors.

[0028] The matching module is used to obtain basic word vectors from the vector library, match the basic word vectors with the keyword vectors, and obtain multiple matching word vectors.

[0029] The filtering module is used to filter multiple matching word vectors based on the local initial model to obtain multiple preliminary filtered vectors;

[0030] The integration module is used to integrate multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data.

[0031] The determination module is used to update the local initial model based on the prompt data to obtain a local optimized model, and to obtain the target language processing system based on the local optimized model and the pre-trained model.

[0032] In a third aspect, embodiments of this application provide an electronic device, including a memory and a processor, wherein the memory is used to store a computer program, and the computer program, when run on the processor, executes the language model dynamic update method provided in the first aspect.

[0033] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing a computer program that, when run on a processor, executes the language model dynamic update method provided in the first aspect.

[0034] The beneficial effects of this application are:

[0035] The language model dynamic update method provided in this application has advantages and innovations such as knowledge extraction and language generation models, fragmented storage methods, the combination of multiple natural language processing technologies and models, customized solutions, and miniaturization and efficiency. It can better meet user needs and application scenarios, improve matching accuracy and efficiency, and also enhance user experience and satisfaction.

Description of the Drawings

[0036] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of the present invention and should not be regarded as a limitation on the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0037] Figure 1 shows a flowchart of the language model dynamic update method provided in an embodiment of this application;

[0038] Figure 2 shows a schematic diagram of the framework of the target language processing system provided in an embodiment of this application;

[0039] Figure 3 shows a schematic diagram of the structure of the language model dynamic update device provided in an embodiment of this application;

[0040] Figure 4 shows a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0041] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.

[0042] Therefore, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.

[0043] It should be noted that similar labels and letters in the following figures indicate similar items. Therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.

[0044] In the description of this invention, it should be noted that if terms such as "upper," "lower," "inner," or "outer" are used to indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, or the orientation or positional relationship in which the product of this invention is usually placed, they are only for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of this invention.

[0045] Furthermore, the terms "first" and "second" are used only to distinguish descriptions and should not be interpreted as indicating or implying relative importance.

[0046] It should be noted that, where there is no conflict, the features in the embodiments of the present invention can be combined with each other.

[0047] Embodiment 1

[0048] Natural Language Processing (NLP) is an important field within computer science and artificial intelligence. AIGC (Artificial Intelligence Generated Content) models are frequently used in NLP tasks. As training data increases and application scenarios expand, the need for dynamic training will continue to grow.

[0049] Traditional pre-trained models can only be pre-trained using static datasets, making them unable to adapt to changes in dynamic datasets. Furthermore, they can only be pre-trained for a single task, failing to handle diverse and evolving requirements. This limits the generalization ability and adaptability of pre-trained models. In addition, small businesses cannot afford the cost and resources of training large-scale models.

[0050] Based on this, embodiments of this application provide a low-cost method for dynamically updating a language model.

[0051] Specifically, referring to Figure 1, the method for dynamically updating a language model includes:

[0052] Step S110: Collect user questions, segment the user questions into words, and obtain multiple keyword vectors;

[0053] This embodiment can be applied to dialogue systems such as chatbots. During the training of large models like GPT, it is necessary to collect a large amount of human-written dialogue data from various channels, such as social media, chat logs, and dialogue corpora. Within this dialogue data, a large number of question-answer pairs, i.e., human-to-human conversations, need to be identified. This data is then organized into a format usable by machine learning algorithms. Machine learning algorithms are then used to train the model on the organized data, continuously optimizing it to achieve better question-answering results.

[0054] However, this training process is costly and resource-intensive, requiring significant server resources and manpower. To achieve low-cost, miniaturized dynamic updates of the language model, this application employs a combination of two language models as the target language processing system for practical applications: a local small model and a pre-trained large model. In the implementation of this application's embodiments, no modifications to the large pre-trained model are required; only the local small model is optimized. The training cost of the small model is far lower than that of the large model, thus enabling low-cost dynamic updates.

[0055] To achieve dynamic updates, user questions collected by the dialogue system can be obtained in real time, and new training sets can be obtained based on these user questions to optimize the local small model in real time.

[0056] In one embodiment, segmenting the user question to obtain multiple keyword vectors includes: performing word segmentation on the user question using a word segmentation tool to obtain multiple word segmentation results; and encoding each of the word segmentation results using a word vector model to obtain the multiple keyword vectors.

[0057] Specifically, word segmentation of the user question can be performed with mature natural language processing tools and libraries, such as jieba, NLTK, and spaCy. These tools can segment Chinese and English text and convert a text into a list of words, which is subsequently vectorized. Taking jieba in Python as an example: inputting "I want to listen to singer A's songs" automatically outputs the word segmentation result, that is, the same sentence split into its individual words.
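The segmentation step can be illustrated with a minimal sketch. The embodiment names jieba, NLTK, and spaCy; to stay dependency-free, the sketch below uses a regular expression as a stand-in tokenizer for English text. The helper name `segment_question` and the token pattern are assumptions made for illustration and do not come from the patent.

```python
import re

# Stand-in tokenizer (assumption): a real deployment would call a segmentation
# library such as jieba for Chinese text; a regex is enough to show the step.
_TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def segment_question(question: str) -> list[str]:
    """Split a user question into lowercase word tokens."""
    return [token.lower() for token in _TOKEN_PATTERN.findall(question)]


tokens = segment_question("I want to listen to singer A's songs")
print(tokens)
```

Swapping the regex for a library tokenizer only changes the body of `segment_question`; the rest of the pipeline sees the same list of words.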

[0058] Next, the words in the list produced by segmentation can be encoded as vectors. The encoding can use pre-trained word vector models, such as Word2Vec and GloVe. These models have been trained on large amounts of text and can convert each word into a fixed-length vector representation. The encoding can also adopt methods such as the bag-of-words model or TF-IDF.
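As a rough illustration of the encoding options mentioned here, the following sketch computes bag-of-words counts and TF-IDF weights in plain Python. The function names are hypothetical, and the smoothed IDF formula is one common variant chosen for the sketch; the patent does not specify a formula.

```python
import math
from collections import Counter


def build_vocabulary(documents: list[list[str]]) -> list[str]:
    """Sorted list of distinct tokens; a token's position is its vector dimension."""
    return sorted({token for doc in documents for token in doc})


def bag_of_words(tokens: list[str], vocabulary: list[str]) -> list[int]:
    """Count how often each vocabulary word occurs in the token list."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(tokens: list[str], documents: list[list[str]],
           vocabulary: list[str]) -> list[float]:
    """TF-IDF vector for `tokens`, with IDF estimated from `documents`."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty input
    n_docs = len(documents)
    vector = []
    for word in vocabulary:
        tf = counts[word] / total
        df = sum(1 for doc in documents if word in doc)
        # Smoothed IDF (assumption): log((1 + N) / (1 + df)) + 1
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        vector.append(tf * idf)
    return vector


docs = [["play", "songs"], ["weather", "today"]]
vocab = build_vocabulary(docs)
print(bag_of_words(["play", "songs", "songs"], vocab))
print(tf_idf(["play", "songs"], docs, vocab))
```

Pre-trained embeddings such as Word2Vec or GloVe would replace these sparse vectors with dense fixed-length ones, but the downstream matching step treats both the same way.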

[0059] After word segmentation and vectorization are completed, the keyword vectors are stored in an index library for subsequent matching and retrieval. This fragmented storage method can effectively reduce storage space usage while improving the efficiency of matching and retrieval.
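One possible reading of this fragmented storage is an index that stores each distinct word's vector once and hands back its position for later retrieval. The sketch below follows that reading; the class `VectorIndex` and its methods are hypothetical, and the patent does not describe the index library's actual layout.

```python
class VectorIndex:
    """In-memory index library: one stored vector per distinct word."""

    def __init__(self) -> None:
        self._positions: dict[str, int] = {}  # word -> position in the lists
        self._words: list[str] = []
        self._vectors: list[list[float]] = []

    def add(self, word: str, vector: list[float]) -> int:
        """Store the vector if the word is new; return the word's position."""
        if word in self._positions:
            return self._positions[word]  # already stored, no extra space used
        position = len(self._vectors)
        self._positions[word] = position
        self._words.append(word)
        self._vectors.append(list(vector))
        return position

    def lookup(self, position: int) -> tuple[str, list[float]]:
        """Return the (word, vector) pair stored at a position."""
        return self._words[position], self._vectors[position]

    def __len__(self) -> int:
        return len(self._vectors)


index = VectorIndex()
first = index.add("song", [0.1, 0.2])
second = index.add("singer", [0.3, 0.4])
repeat = index.add("song", [0.1, 0.2])  # repeated word reuses its slot
print(first, second, repeat, len(index))
```

Under this reading, the space saving comes from repeated words sharing one entry, and retrieval by position is a constant-time list access.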

[0060] Step S120: Obtain the basic word vectors in the vector library, match the basic word vectors with the keyword vectors, and obtain multiple matching word vectors;

[0061] Specifically, this process generally uses a metric method, such as Euclidean distance, cosine similarity, etc., to calculate the similarity scores between the keyword vectors and all the basic word vectors in the vector library, then selects the top N basic word vectors with the highest scores, and returns the indexes of them in the original data, that is, the vector library, to the caller.

[0062] In one embodiment, the matching of the basic word vectors with the keyword vectors to obtain multiple matching word vectors includes: calculating the similarity between the basic word vectors and the keyword vectors, and sorting the basic word vectors in descending order of similarity to obtain an ordered queue; and determining the first N basic word vectors in the ordered queue as the matching word vectors.

[0063] In one embodiment, the method further includes: returning the indexes of each of the matching word vectors in the vector library.
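The matching step described above (score every basic word vector, sort in descending order, keep the first N, return their indexes) can be sketched as follows. Cosine similarity is used because the text names it as one option. The function names are hypothetical, and the sketch handles a single keyword vector; how scores are combined across several keyword vectors is not specified in the source.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_top_n(keyword_vector: list[float],
                vector_library: list[list[float]], n: int) -> list[int]:
    """Return the library indexes of the n most similar basic word vectors."""
    scored = [
        (cosine_similarity(keyword_vector, base), index)
        for index, base in enumerate(vector_library)
    ]
    # Descending similarity gives the ordered queue; the sort is stable,
    # so ties keep their original library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _score, index in scored[:n]]


library = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
print(match_top_n([1.0, 0.1], library, n=2))
```

A full sort costs O(M log M) for a library of M vectors; for large libraries a heap-based selection or an approximate nearest-neighbour index would be the usual replacement, which is a design choice outside what the patent describes.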

[0064] Step S130: Based on the local initial model, filter the multiple matching word vectors to obtain multiple preliminary filtered vectors;

[0065] The local small model mentioned above can be the local initial model described in this step or a local optimized model. After obtaining the matching word vectors, the local small model can be used for language condensation, condensing the article or knowledge points into concise text content for better user understanding and reading. This process can utilize text summarization techniques, such as keyword-based summarization.
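Keyword-based summarization, mentioned here as one condensation technique, can be sketched as a simple extractive filter: score each sentence by how many keywords it contains and keep the best ones in their original order. This is a stand-in for the local model's screening, not the patent's actual model, and the function name and scoring rule are assumptions.

```python
import re


def summarize_by_keywords(text: str, keywords: list[str],
                          max_sentences: int = 2) -> str:
    """Keep up to max_sentences sentences that contain the most keywords."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    keyword_set = {k.lower() for k in keywords}
    scored = []
    for position, sentence in enumerate(sentences):
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        hits = sum(1 for word in words if word in keyword_set)
        scored.append((hits, position, sentence))
    # Highest keyword count first; earlier sentences win ties.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:max_sentences]
    # Drop sentences with no keyword at all, then restore document order.
    kept = sorted((item for item in best if item[0] > 0), key=lambda item: item[1])
    return " ".join(sentence for _hits, _position, sentence in kept)


article = (
    "Singer A released a new album. "
    "The weather is sunny today. "
    "The album has ten songs."
)
print(summarize_by_keywords(article, ["album", "songs"]))
```

A trained local model would replace the keyword count with a learned relevance score, but the shape of the step (many candidates in, a short screened subset out) stays the same.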

[0066] For example, the local initial model can be improved based on recurrent neural networks (RNNs). For instance, Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs) are commonly used RNN variants, which have advantages in handling long sequences and capturing long-term dependencies. Alternatively, it can be improved based on models such as Transformers, which have good applications in tasks such as machine translation, text generation, and semantic understanding. Transformer models can process the contextual information of the entire sentence simultaneously and encode semantic relationships at different levels, exhibiting good parallel computing capabilities.

[0067] Step S140: Based on the pre-trained model, integrate the multiple preliminary screening vectors to generate multiple prompt data;

[0068] Finally, this condensed text content is integrated into a concise and easy-to-understand prompt for better user comprehension and reading. For example, the pre-trained model could be a large language model such as GPT-3.5-turbo, which has excellent natural language processing capabilities, enabling language generation and optimization of text for a better user experience and dialogue system performance.

[0069] Because pre-trained models are typically large language models that are large in size and have good performance, but are also difficult to train locally, in this embodiment, language processing is only performed in collaboration with the local optimized model, without any training or modification of the pre-trained model itself.

[0070] A prompt can be understood as an AI-generated hint: information that the system provides to the user, or an instruction telling the user what action to take, so that the system can correctly execute the next step. For example, if we want a chatbot to answer a weather question, it can prompt the user with: "Which city's weather would you like to know?"

[0071] Generally speaking, in this embodiment, there is no need to adjust the pre-trained model. Of course, if the hardware allows, the model can be fine-tuned according to actual needs to generate concise and easy-to-understand prompts suitable for different application scenarios; no specific limitations are made here.

[0072] In one embodiment, the step of integrating multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data includes: inputting multiple preliminary screening vectors into the pre-trained model to generate multiple prompt data.
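The integration step can be sketched with the pre-trained model treated as an opaque callable, which matches the point that the large model is used but never retrained. The sketch assumes the preliminary screening vectors have already been mapped back to text snippets through their library indexes. `LargeModel`, `build_instruction`, `generate_prompt_data`, and `fake_large_model` are hypothetical names, and no real model API is called.

```python
from typing import Callable, Sequence

# Assumption: the pre-trained model is reachable as "text in, text out".
LargeModel = Callable[[str], str]


def build_instruction(question: str, snippets: Sequence[str]) -> str:
    """Assemble the text sent to the large model."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Integrate the following knowledge snippets into one concise, "
        "easy-to-understand prompt for the user's question.\n"
        f"Question: {question}\n"
        f"Snippets:\n{numbered}"
    )


def generate_prompt_data(question: str, snippets: Sequence[str],
                         large_model: LargeModel) -> str:
    """Ask the (unmodified) pre-trained model to integrate the snippets."""
    return large_model(build_instruction(question, snippets))


def fake_large_model(instruction: str) -> str:
    """Stand-in for a hosted large model; echoes the last snippet line."""
    return "PROMPT<" + instruction.splitlines()[-1] + ">"


print(generate_prompt_data(
    "What has singer A released?",
    ["Singer A released a new album.", "The album has ten songs."],
    fake_large_model,
))
```

Injecting the model as a parameter keeps the local code independent of any particular provider, so the stand-in can be swapped for a real client without touching the rest of the pipeline.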

[0073] Step S150: Update the local initial model based on the prompt data to obtain a local optimized model, and obtain the target language processing system based on the local optimized model and the pre-trained model.

[0074] The final target language processing system includes a local optimized model and a pre-trained model. In specific applications, such as dialogue systems, the local optimized model and the pre-trained model need to collaborate on natural language processing. For example, the local optimized model extracts features from the user's question, and then the pre-trained model generates a concise and easy-to-understand prompt.

[0075] Since user questions are constantly being updated, the training set of the local optimized model provided in this application will also be updated in real time, thus realizing the dynamic updating of the target language processing system.

[0076] After obtaining the prompt data, further annotation is needed to obtain the required training set. This can be done manually or automatically. Manual annotation involves using the generated prompt data and having it annotated by professionals in the relevant field or through crowdsourcing platforms. While manual annotation yields high-quality, standardized training data, it requires significant manpower and time.

[0077] Automatic annotation, using machine learning techniques and pre-generated prompts as training data, employs natural language processing tools and frameworks such as NER (Named Entity Recognition), syntactic parsers, and sentiment analyzers to automatically annotate relevant text or datasets. While automatic annotation can significantly reduce manual labor costs, it also faces challenges such as low annotation accuracy and the need for large amounts of data, requiring experimentation and optimization in practical applications.
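The trade-off between automatic and manual annotation can be illustrated with a deliberately simple rule-based labeler: prompts that match a keyword rule are labeled automatically, and everything else is set aside for manual review. This is a toy stand-in for the NER, parsing, and sentiment tools named above; the function name and the keyword rules are invented for illustration.

```python
def auto_annotate(prompt_texts: list[str],
                  label_keywords: dict[str, list[str]]):
    """Label each prompt by keyword overlap; unmatched prompts need review.

    label_keywords maps a label to lowercase keywords (hypothetical rules).
    Returns (labeled, needs_review), where labeled holds (text, label) pairs.
    """
    labeled, needs_review = [], []
    for text in prompt_texts:
        words = set(text.lower().split())
        hits = {
            label: len(words & set(keywords))
            for label, keywords in label_keywords.items()
        }
        best_label = max(hits, key=hits.get, default=None)
        if best_label is None or hits[best_label] == 0:
            needs_review.append(text)  # fall back to manual annotation
        else:
            labeled.append((text, best_label))
    return labeled, needs_review


rules = {"music": ["song", "singer"], "weather": ["weather", "rain"]}
prompts = ["which singer do you like", "will it rain tomorrow", "hello there"]
print(auto_annotate(prompts, rules))
```

Routing unmatched prompts to a review queue is one way to combine the two approaches: the cheap automatic pass handles the clear cases, and human effort is spent only where the rules are silent.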

[0078] In one embodiment, updating the local initial model based on the prompt data includes: determining an updated training set based on the prompt data; and retraining the local initial model based on the updated training set.

[0079] In one embodiment, determining the updated training set based on the prompt data includes: obtaining updated labeled data based on the prompt data; and adding the updated labeled data to the initial training set to obtain the updated training set.
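The update loop (add newly labeled data to the initial training set, then retrain the local model on the result) can be sketched as below. `LocalModel` is a toy word-count classifier standing in for the RNN- or Transformer-based local model discussed earlier; all names are hypothetical, and skipping exact duplicates is an assumption of the sketch.

```python
from collections import Counter, defaultdict


class LocalModel:
    """Toy local model: per-label word counts (stand-in for a neural model)."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)

    def fit(self, training_set):
        """Retrain from scratch on (text, label) pairs."""
        self.word_counts = defaultdict(Counter)
        for text, label in training_set:
            self.word_counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        """Return the label whose training words overlap the text most."""
        words = text.lower().split()
        scores = {
            label: sum(counts[word] for word in words)
            for label, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get) if scores else None


def update_training_set(initial_set, labeled_prompt_data):
    """Append newly labeled examples, skipping exact duplicates."""
    return list(initial_set) + [
        example for example in labeled_prompt_data if example not in initial_set
    ]


def dynamic_update(initial_set, labeled_prompt_data):
    """Build the updated training set and retrain the local model on it."""
    updated_set = update_training_set(initial_set, labeled_prompt_data)
    return LocalModel().fit(updated_set), updated_set


initial = [("play a song", "music")]
new_data = [("what is the weather today", "weather")]
model, updated = dynamic_update(initial, new_data)
print(model.predict("weather today"), len(updated))
```

Only the small local model is refit in this loop, which is where the cost saving claimed in the description comes from; the pre-trained model is never touched.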

[0080] In summary, as shown in the framework diagram of Figure 2, this embodiment of the application first obtains the user question, then segments and vectorizes the user question based on the local AI model to obtain multiple keyword vectors, then condenses them to obtain multiple sets of matching word vectors, then sends them to the pre-trained model (Big Model Server), and finally obtains the required prompt data to achieve dynamic updates of the language model.

[0081] The process of generating a prompt mainly includes word segmentation of the user's question, vector calculation and similarity calculation, matching and indexing, language condensation and generation, and prompt (AI-generated hints) summarization and generation. This process utilizes various natural language processing techniques and models, including text segmentation, vector calculation, similarity calculation, text summarization techniques, and large-scale pre-trained models. These techniques and models work together to effectively improve the accuracy and efficiency of matching, while also enhancing user experience and satisfaction.

[0082] The language model dynamic update method provided in this embodiment has the following advantages:

[0083] It can automatically extract relevant knowledge points from the knowledge base and integrate them into a concise and easy-to-understand prompt for better user comprehension and reading. The core technology of this model is a language model and information extraction technique based on NLP, enabling efficient knowledge extraction and language generation.

[0084] By employing text segmentation and fragmented associative storage, text data can be divided into individual words and converted into vector form for storage in an index. This fragmented storage method effectively reduces storage space usage while improving matching and retrieval efficiency.

[0085] It utilizes a variety of natural language processing techniques and models working together, such as text segmentation, vector computation, similarity calculation, text summarization, and large-scale pre-trained models, which can effectively improve the accuracy and efficiency of matching, while also enhancing user experience and satisfaction.

[0086] Designed for specific dialogue system applications, this solution uses collected data for local training to improve model adaptability and performance. This customized approach better meets user needs and application scenarios.

[0087] A small-scale solution based on dynamic data pre-training can achieve efficient problem matching and knowledge extraction even with limited hardware resources. This miniaturized and efficient solution can better meet user needs and application scenarios.

[0088] In summary, the language model dynamic update method provided in this embodiment has advantages and innovations such as knowledge extraction and language generation models, fragmented storage methods, the combination of multiple natural language processing technologies and models, customized solutions, and miniaturization and efficiency. It can better meet user needs and application scenarios, improve matching accuracy and efficiency, and also enhance user experience and satisfaction.

[0089] Embodiment 2

[0090] Furthermore, embodiments of this application provide a language model dynamic update device.

[0091] Specifically, as shown in Figure 3, the language model dynamic update device 300 includes:

[0092] The word segmentation module 310 is used to collect user questions, segment the user questions into words, and obtain multiple keyword vectors.

[0093] The matching module 320 is used to obtain basic word vectors from the vector library, and match the basic word vectors with the keyword vectors to obtain multiple matching word vectors;

[0094] The filtering module 330 is used to filter multiple matching word vectors based on the local initial model to obtain multiple preliminary filtered vectors;

[0095] Integration module 340 is used to integrate multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data.

[0096] The determination module 350 is used to update the local initial model based on the prompt data to obtain a local optimized model, and to obtain a target language processing system based on the local optimized model and the pre-trained model.
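The five-module structure listed above can be sketched as a container that wires the modules together in order. Each module is modeled as a plain callable, which is an assumption of the sketch; the class name `LanguageModelUpdateDevice` and the method `process` are hypothetical, and the lambdas in the usage example are trivial placeholders rather than real module implementations.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class LanguageModelUpdateDevice:
    """Mirrors device 300: one callable per module, run in sequence."""

    segment: Callable[[str], Any]    # word segmentation module 310
    match: Callable[[Any], Any]      # matching module 320
    screen: Callable[[Any], Any]     # filtering module 330
    integrate: Callable[[Any], Any]  # integration module 340
    update: Callable[[Any], Any]     # determination module 350

    def process(self, question: str) -> Any:
        """Run one user question through all five modules."""
        keyword_vectors = self.segment(question)
        matches = self.match(keyword_vectors)
        screened = self.screen(matches)
        prompt_data = self.integrate(screened)
        return self.update(prompt_data)


# Placeholder modules, only to show the wiring.
device = LanguageModelUpdateDevice(
    segment=str.split,
    match=lambda words: [word.upper() for word in words],
    screen=lambda matches: matches[:1],
    integrate=lambda screened: " ".join(screened),
    update=lambda prompt: {"trained_on": prompt},
)
print(device.process("play songs"))
```

Keeping each module behind a callable makes it straightforward to replace one stage, for example the matching module, without changing the others.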

[0097] In one embodiment, the word segmentation module 310 is further configured to:

[0098] The user question is segmented using a word segmentation tool to obtain multiple segmentation results; each segmentation result is encoded using a word vector model to obtain multiple keyword vectors.

[0099] In one embodiment, the matching module 320 is further configured to:

[0100] Calculate the similarity between the basic word vectors and the keyword vectors, and sort the basic word vectors in descending order of similarity to obtain a sequential queue; determine the first N basic word vectors in the sequential queue as the matching word vectors.

[0101] In one embodiment, the matching module 320 is further configured to:

[0102] Return the index of each matched word vector in the vector library.

[0103] In one embodiment, the integration module 340 is further configured to:

[0104] Multiple preliminary selection vectors are input into the pre-trained model to generate multiple prompt data.

[0105] In one embodiment, the determining module 350 is further configured to:

[0106] The updated training set is determined based on the prompt data; the local initial model is then retrained based on the updated training set.

[0107] In one embodiment, the determining module 350 is further configured to:

[0108] Based on the prompt data, obtain updated labeled data; add the updated labeled data to the initial training set to obtain the updated training set.

[0109] The language model dynamic update device 300 provided in this embodiment can implement the language model dynamic update method provided in Embodiment 1. To avoid repetition, it will not be described again here.

[0110] The language model dynamic update device provided in this embodiment has advantages and innovations such as knowledge extraction and language generation models, fragmented storage methods, the combination of multiple natural language processing technologies and models, customized solutions, and miniaturization and high efficiency. It can better meet user needs and application scenarios, improve matching accuracy and efficiency, and also enhance user experience and satisfaction.

[0111] Embodiment 3

[0112] Furthermore, this application provides an electronic device, including a memory and a processor. The memory stores a computer program, which executes the language model dynamic update method provided in Embodiment 1 when running on the processor.

[0113] Specifically, referring to Figure 4, the electronic device 400 includes: a transceiver 401, a bus interface, and a processor 402, wherein the processor 402 is used for:

[0114] Collect user questions, segment the user questions into words, and obtain multiple keyword vectors;

[0115] Obtain basic word vectors from the vector library, and match the basic word vectors with the keyword vectors to obtain multiple matching word vectors;

[0116] Based on the local initial model, multiple matching word vectors are filtered to obtain multiple preliminary filtered vectors;

[0117] Based on the pre-trained model, multiple preliminary screening vectors are integrated to generate multiple prompt data.

[0118] The local initial model is updated based on the prompt data to obtain a local optimized model, and the target language processing system is obtained based on the local optimized model and the pre-trained model.

[0119] In one embodiment, the processor 402 is further configured to:

[0120] The user question is segmented using a word segmentation tool to obtain multiple word segmentation results;

[0121] The word segmentation results are encoded using a word vector model to obtain multiple keyword vectors.

[0122] In one embodiment, the processor 402 is further configured to:

[0123] Calculate the similarity between the basic word vectors and the keyword vectors, and sort the basic word vectors in descending order of similarity to obtain a sequential queue;

[0124] The first N basic word vectors of the sequential queue are determined as the matching word vectors.

[0125] In one embodiment, the processor 402 is further configured to:

[0126] Return the index of each matched word vector in the vector library.

[0127] In one embodiment, the processor 402 is further configured to:

[0128] Multiple preliminary selection vectors are input into the pre-trained model to generate multiple prompt data.

[0129] In one embodiment, the processor 402 is further configured to:

[0130] An updated training set is determined based on the prompt data;

[0131] The local initial model is retrained based on the updated training set.

[0132] In one embodiment, the processor 402 is further configured to:

[0133] Obtain updated annotation data based on the prompt data;

[0134] Add the updated annotation data to the initial training set to obtain the updated training set.
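As a non-limiting illustration, the update of the training set and the retraining of the local model can be sketched as below. The application does not define the format of the prompt data or the interface of the local model, so `LocalModel`, `annotate`, `update_local_model`, and the `text` and `label` keys are all hypothetical. The "model" here only records a vocabulary so that the effect of retraining is observable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Example = Tuple[str, str]  # (text, label)


@dataclass
class LocalModel:
    """Hypothetical stand-in for the local initial model."""
    vocabulary: Set[str] = field(default_factory=set)
    trained_on: int = 0

    def fit(self, training_set: List[Example]) -> "LocalModel":
        # "Retraining" rebuilds the vocabulary from the whole training set.
        self.vocabulary = {word for text, _label in training_set for word in text.split()}
        self.trained_on = len(training_set)
        return self


def annotate(prompt_data: List[Dict[str, str]]) -> List[Example]:
    """Derive updated annotation data from prompt data (assumed format)."""
    return [(item["text"], item["label"]) for item in prompt_data]


def update_local_model(
    model: LocalModel,
    initial_training_set: List[Example],
    prompt_data: List[Dict[str, str]],
) -> Tuple[LocalModel, List[Example]]:
    """Append the annotation data to the initial set, then retrain."""
    updated_training_set = list(initial_training_set) + annotate(prompt_data)
    return model.fit(updated_training_set), updated_training_set


initial_set = [("reset password", "account")]
prompts = [{"text": "update billing address", "label": "billing"}]
model, updated_set = update_local_model(LocalModel(), initial_set, prompts)
print(model.trained_on, sorted(model.vocabulary))
```

The initial training set is copied rather than modified in place, so earlier versions of the training set remain available for comparison.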

[0135] In this embodiment of the invention, the electronic device 400 further includes a memory 403. In Figure 4, the bus architecture can include any number of interconnected buses and bridges, which link together various circuits, including one or more processors represented by the processor 402 and memory represented by the memory 403. The bus architecture can also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. The bus interface provides an interface. The transceiver 401 can comprise multiple elements, including a transmitter and a receiver, and provides a unit for communicating with various other devices over a transmission medium. The processor 402 is responsible for managing the bus architecture and general processing, and the memory 403 can store data used by the processor 402 during operation.

[0136] The electronic device 400 provided in this embodiment of the invention can implement the language model dynamic update method provided in Embodiment 1. To avoid repetition, it will not be described again here.

[0137] The electronic device provided in this embodiment has advantages and innovations such as knowledge extraction and language generation models, fragmented storage methods, a combination of multiple natural language processing technologies and models, customized solutions, and miniaturization and high efficiency. It can better meet user needs and application scenarios, improve matching accuracy and efficiency, and also enhance user experience and satisfaction.

[0138] Embodiment 4

[0139] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the language model dynamic update method provided in Embodiment 1.

[0140] In this embodiment, the computer-readable storage medium can be a volatile storage medium or a non-volatile storage medium, including read-only memory (ROM), random access memory (RAM), magnetic disk or optical disk, etc.

[0141] The computer-readable storage medium provided in this embodiment can implement the language model dynamic update method provided in Embodiment 1. To avoid repetition, it will not be described again here.

[0142] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal that includes that element.

[0143] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0144] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.

Claims

1. A method for dynamically updating a language model, characterized in that the method comprises:
collecting a user question, and segmenting the user question into words to obtain multiple keyword vectors;
obtaining basic word vectors from a vector library, and matching the basic word vectors with the keyword vectors to obtain multiple matching word vectors;
screening the multiple matching word vectors based on a local initial model to obtain multiple preliminary screening vectors;
integrating the multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data; and
updating the local initial model based on the prompt data to obtain a local optimized model, and obtaining a target language processing system based on the local optimized model and the pre-trained model;
wherein screening the multiple matching word vectors based on the local initial model to obtain the multiple preliminary screening vectors comprises: after the matching word vectors are obtained, using the local initial model or the local optimized model to perform language condensation, condensing the article or knowledge points into concise text content, the condensation using text summarization technology that includes summarization based on keyword extraction;
wherein the local initial model is obtained by improving a recurrent neural network, which includes a long short-term memory network and a gated recurrent unit converter model;
wherein updating the local initial model based on the prompt data comprises: updating a training set based on the prompt data; and retraining the local initial model based on the updated training set; and
wherein updating the training set based on the prompt data comprises: obtaining updated annotation data based on the prompt data; and adding the updated annotation data to an initial training set to obtain the updated training set.

2. The language model dynamic update method according to claim 1, characterized in that segmenting the user question into words to obtain multiple keyword vectors comprises: segmenting the user question using a word segmentation tool to obtain multiple word segmentation results; and encoding the word segmentation results using a word vector model to obtain the multiple keyword vectors.

3. The language model dynamic update method according to claim 1, characterized in that matching the basic word vectors with the keyword vectors to obtain multiple matching word vectors comprises: calculating the similarity between the basic word vectors and the keyword vectors, and sorting the basic word vectors in descending order of similarity to obtain a sequential queue; and determining the first N basic word vectors in the sequential queue as the matching word vectors.

4. The language model dynamic update method according to claim 3, characterized in that the method further comprises: returning the index of each matching word vector in the vector library.

5. The language model dynamic update method according to claim 1, characterized in that integrating the multiple preliminary screening vectors based on the pre-trained model to generate multiple prompt data comprises: inputting the multiple preliminary screening vectors into the pre-trained model to generate the multiple prompt data.

6. A language model dynamic update device, characterized in that the device comprises:
a word segmentation module, configured to collect a user question and segment the user question into words to obtain multiple keyword vectors;
a matching module, configured to obtain basic word vectors from a vector library and match the basic word vectors with the keyword vectors to obtain multiple matching word vectors;
a screening module, configured to screen the multiple matching word vectors based on a local initial model to obtain multiple preliminary screening vectors;
an integration module, configured to integrate the multiple preliminary screening vectors based on a pre-trained model to generate multiple prompt data; and
a determination module, configured to update the local initial model based on the prompt data to obtain a local optimized model, and to obtain a target language processing system based on the local optimized model and the pre-trained model;
wherein screening the multiple matching word vectors based on the local initial model to obtain the multiple preliminary screening vectors comprises: after the matching word vectors are obtained, using the local initial model or the local optimized model to perform language condensation, condensing the article or knowledge points into concise text content, the condensation using text summarization technology that includes summarization based on keyword extraction;
wherein the local initial model is obtained by improving a recurrent neural network, which includes a long short-term memory network and a gated recurrent unit converter model;
wherein updating the local initial model based on the prompt data comprises: updating a training set based on the prompt data; and retraining the local initial model based on the updated training set; and
wherein updating the training set based on the prompt data comprises: obtaining updated annotation data based on the prompt data; and adding the updated annotation data to an initial training set to obtain the updated training set.

7. An electronic device, characterized in that it comprises a memory and a processor, wherein the memory stores a computer program that, when run by the processor, executes the language model dynamic update method according to any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that it stores a computer program that, when run on a processor, executes the language model dynamic update method according to any one of claims 1 to 5.