An intelligent question answering method and device combining hybrid experts and weight decomposition low-rank adaptation
By introducing a hybrid expert (mixture-of-experts, MoE) mechanism, weight decomposition low-rank adaptation, and Stiefel manifold optimization into a multi-task question answering model, the invention addresses inter-task interference and training instability and improves multi-task question answering performance efficiently and stably.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- UNIV OF SCI & TECH BEIJING
- Filing Date
- 2026-04-02
- Publication Date
- 2026-06-26
AI Technical Summary
Existing multi-task question answering models are prone to inter-task interference, gradient conflicts, and negative transfer under a unified parameter sharing framework. Hybrid expert mechanisms suffer from routing instability and high parameter overhead. Low-rank adaptation methods are prone to update direction coupling and training instability in multi-task scenarios and lack effective geometric constraint optimization.
By combining a hybrid expert mechanism, weight decomposition low-rank adaptation, and Stiefel manifold optimization, dynamic expert division of labor is achieved through a gating network. A weight decomposition low-rank adaptation structure is introduced, and Stiefel manifold constraints are applied to optimize the low-rank update subspace, yielding stable task decoupling and efficient parameter updating.
It improves the task decoupling ability, training stability and overall performance of multi-task question answering models, reduces training overhead and deployment costs, and enhances the accuracy and generalization ability of models in complex application scenarios.
Figure CN122287902A_ABST
Description
Technical Field
[0001] This invention relates to the fields of artificial intelligence and natural language processing technology, and in particular to an intelligent question answering method and device combining hybrid experts with weight decomposition low-rank adaptation.
Background Technology
[0002] With the rapid development of natural language processing technology and LLM (Large Language Model), intelligent question answering has been widely applied in various scenarios such as intelligent customer service, knowledge retrieval, educational tutoring, and medical consultation. To improve the model's adaptability to different tasks and scenarios, researchers have gradually adopted multi-task learning methods to simultaneously handle multiple tasks such as knowledge-based question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-enhanced question answering, and domain-specific question answering within a unified model framework. This type of method can improve knowledge transfer efficiency through parameter sharing and reduce the training and maintenance costs associated with independent modeling of single tasks. However, different question answering tasks differ significantly in input format, semantic structure, inference path, and output target. Directly using shared parameters for joint training can easily lead to inter-task feature interference, gradient conflicts, and negative transfer phenomena, thus affecting the model's performance stability and generalization ability across sub-tasks.
[0003] To address the aforementioned issues, existing technologies have proposed methods such as hybrid expert mechanisms and parameter-efficient low-rank adaptation to enhance the model's task allocation capabilities and training efficiency. Hybrid expert mechanisms allocate different expert sub-modules to different tasks or samples through gated routing, thereby mitigating task conflicts in a unified parameter space. Weight decomposition low-rank adaptation methods update model weights through low-rank decomposition, reducing the number of trainable parameters while improving task adaptability. However, existing solutions generally suffer from the following shortcomings: first, expert routing and low-rank adaptation may still lead to representation coupling in multi-task scenarios, making it difficult to fundamentally suppress inter-task interference; second, the low-rank update matrix lacks effective geometric constraints during training, which easily leads to subspace degradation, increased redundant representations, and optimization instability. Therefore, there is an urgent need for an intelligent question answering method that jointly uses hybrid expert mechanisms, weight decomposition low-rank adaptation, and geometric constraint optimization to achieve more stable and efficient multi-task learning.
[0004] Existing question-answering systems often employ pre-trained language models or large language models as a unified backbone, using shared parameters to jointly model different question-answering tasks. This approach can leverage common knowledge across tasks to some extent, improving the overall efficiency of the model and reducing the deployment and maintenance costs associated with training multiple single-task models separately. However, in multi-task question-answering scenarios, different tasks typically exhibit significant differences in question representation, knowledge dependency, inference depth, and answer generation patterns. Directly using shared parameters for joint learning can easily lead to competition among tasks within the same representation space, resulting in inconsistent gradient directions, imbalanced knowledge coverage, and negative transfer. This can limit performance improvements on some tasks and even cause a decrease in accuracy.
[0005] To address task conflict issues in multi-task learning, some existing technologies have introduced hybrid expert mechanisms. These mechanisms dynamically allocate expert sub-modules to different task samples using gating networks, aiming to alleviate parameter coupling problems through a shared backbone and expert division of labor. Such methods can improve task separation capabilities to some extent, enabling the model to have stronger targeted representation capabilities for different tasks. However, existing hybrid expert methods still have some shortcomings in multi-task question answering scenarios. For example, expert routing results are easily affected by fluctuations in input distribution, leading to instability; different tasks may compete for the same experts, causing load imbalance; and the training and storage costs increase significantly when the number of expert parameters is large, affecting practicality and scalability.
[0006] Furthermore, to reduce the parameter update cost in multi-task training, existing technologies employ low-rank adaptation or efficient parameter fine-tuning methods, which adapt the pre-trained model to the task by updating only a subset of low-rank parameters. Among these, weight decomposition low-rank adaptation methods further decompose parameter changes into direction and magnitude components based on traditional low-rank updates, helping to enhance parameter expressiveness and reduce update redundancy. However, in multi-task scenarios, these methods typically still employ unconstrained optimization in Euclidean space, leading to low-rank update directions from different tasks easily converging or even overlapping, thus weakening task discrimination. Simultaneously, issues such as low-rank subspace degradation, increased parameter redundancy, and unstable optimization paths may arise during training, limiting further model improvement in complex multi-task question-answering scenarios.
[0007] In summary, while existing technologies have improved multi-task question answering models from the perspectives of expert division of labor and efficient parameter adaptation, they still lack a technical solution that organically combines hybrid expert mechanisms with weight decomposition low-rank adaptation and further maintains the stability and independence of each task adaptation subspace through geometric constraint optimization. In multi-task question answering systems in particular, if effective constraints are not imposed on the low-rank update direction, expert division of labor and low-rank adaptation may still couple at the latent space level, making it difficult to fundamentally solve the problems of task conflict and training instability. A new multi-task question answering system and method is therefore urgently needed that achieves dynamic expert division of labor and efficient parameter updating while introducing Stiefel manifold optimization to constrain the low-rank update subspace, thereby improving the model's task decoupling ability, training stability, and overall question answering performance.
[0008] To address the inter-task interference, gradient conflicts, and negative transfer that arise in existing multi-task question answering models under a unified parameter sharing framework, the present invention seeks to provide a model training system and method suitable for multi-task question answering scenarios, so that different question answering tasks can share basic knowledge while achieving more effective task division and representation decoupling, thereby improving the model's collaborative learning ability and overall performance across multiple question answering tasks.
[0009] To address the shortcomings of existing hybrid expert mechanisms in multi-task learning, such as unstable expert routing, uneven expert load, and high parameter overhead, the present invention further considers how to introduce a hybrid expert mechanism into a multi-task question answering model and combine task features with input semantic features to achieve more reasonable expert selection, so as to enhance the model's ability to differentiate between question answering tasks while reducing the mutual interference caused by sharing the same representation space.
[0010] To address the update direction coupling, subspace degradation, and training instability of existing low-rank adaptation methods in multi-task scenarios, the present invention further considers how to apply effective geometric constraints to the low-rank direction parameters during weight decomposition low-rank adaptation, so that the adaptation update directions for different tasks maintain good independence, orthogonality, and stability during training, thereby reducing update redundancy and improving the applicability and optimization effect of parameter-efficient fine-tuning in multi-task question answering scenarios.
[0011] Finally, the present invention considers how to organically integrate the hybrid expert mechanism, weight decomposition low-rank adaptation, and Stiefel manifold optimization in the same multi-task question answering system to form a unified technical solution that takes into account task decoupling, parameter efficiency, training stability, and performance improvement, so as to improve the accuracy, generalization ability, and engineering deployability of multi-task question answering models in complex application scenarios.
Summary of the Invention
[0012] To address the aforementioned technical problems in existing technologies, embodiments of the present invention provide an intelligent question answering method and device combining hybrid experts with weight decomposition low-rank adaptation. The technical solution is as follows: In one aspect, an intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation is provided. The method is implemented by an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, and includes: S1. Receive a question input by the user.
[0013] S2. The trained multi-task question answering model generates answers to questions. The training process of the multi-task question answering model includes: Obtain historical question-and-answer text data, add task identifiers and supervision labels to the historical question-and-answer text data, and perform preprocessing to obtain the sample dataset.
[0014] The multi-task question answering model obtains expert output results based on samples; the multi-task question answering model includes a shared backbone network module, an expert allocation module, and multiple expert sub-modules.
[0015] Introduce a weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules, apply Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure, and then train the multi-task question answering model on the sample dataset to obtain the trained multi-task question answering model.
[0016] S3. Output the generated answer to the user.
[0017] Optionally, historical question-and-answer text data includes: knowledge question-and-answer text data, reading comprehension question-and-answer text data, multi-turn dialogue question-and-answer text data, retrieval-enhanced question-and-answer text data, and domain-specific question-and-answer text data.
[0018] Optionally, expert output results are obtained based on the sample, including: S21. Select any sample from the sample dataset and input the selected sample into the shared backbone network module to obtain the semantic representation of the sample.
[0019] S22. Concatenate the semantic representation of the sample with the task identifier vector of the sample to obtain the joint feature representation.
[0020] S23. Input the joint feature representation into the expert allocation module to obtain the routing weight vector of each expert sub-module; wherein, the expert allocation module adopts a gated network.
[0021] S24. Select a subset of expert submodules from multiple expert submodules based on the routing weight vectors of each expert submodule.
[0022] S25. By performing mapping operations on the semantic representation of the sample through the selected expert sub-modules, the operation results of each expert sub-module are obtained. The operation results of each expert sub-module are weighted and summed according to the routing weight vector of each expert sub-module to obtain the expert output result of the sample.
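Steps S22 to S25 can be illustrated with a minimal NumPy sketch. The softmax gate, the top-2 selection rule, the linear experts, and all names (`route_and_mix`, `gate_W`, `top_k`) are assumptions made for illustration; the text above only specifies a gated network, a selected subset of experts, and a weighted sum.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def route_and_mix(h, task_vec, gate_W, experts, top_k=2):
    """Run S22-S25 for a single sample (illustrative only)."""
    z = np.concatenate([h, task_vec])        # S22: joint feature representation
    weights = softmax(gate_W @ z)            # S23: routing weight per expert (assumed softmax gate)
    chosen = np.argsort(weights)[-top_k:]    # S24: keep the top_k experts (assumed selection rule)
    # S25: weighted sum of the selected experts' outputs on the semantic representation
    out = sum(weights[i] * experts[i](h) for i in chosen)
    return out, weights, chosen

rng = np.random.default_rng(0)
d, n_tasks, n_experts = 8, 5, 4
h = rng.normal(size=d)                       # stand-in for the backbone's semantic representation
task_vec = np.eye(n_tasks)[2]                # one-hot task identifier vector (assumed encoding)
gate_W = rng.normal(size=(n_experts, d + n_tasks))
expert_mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, M=M: M @ x for M in expert_mats]  # hypothetical linear experts

out, weights, chosen = route_and_mix(h, task_vec, gate_W, experts)
```

The sketch uses the raw routing weights of the selected experts without renormalizing them; whether the weights are renormalized after selection is not stated in the text.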
[0023] Optionally, a weight decomposition low-rank adaptation structure is introduced into the shared backbone network module and multiple expert submodules, including: For the target layer in the shared backbone network module that undertakes the task adaptation function, an incremental matrix generated by the low-rank adaptation structure of weight decomposition is introduced into the original weight matrix of the target layer to obtain the adapted weight matrix; wherein, the incremental matrix includes magnitude parameters and direction parameters.
[0024] During the training of the multi-task question answering model, only the magnitude and direction parameters of the adapted weight matrix are updated.
[0025] Optionally, a weight decomposition low-rank adaptation structure is introduced into the shared backbone network module and multiple expert submodules, including: An incremental matrix generated by a weight decomposition low-rank adaptation structure is introduced into the parameters of selected expert submodules.
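The split into magnitude and direction parameters can be sketched as follows. This follows the commonly published weight-decomposed low-rank adaptation form, in which the adapted weight is a learnable per-column magnitude times the column-normalized sum of the frozen weight and a low-rank product; whether the patent uses exactly this form is an assumption, as are the names `adapted_weight`, `B`, `A`, and `m`.

```python
import numpy as np

def adapted_weight(W0, B, A, m):
    """Assumed form: W' = m * (W0 + B @ A) / ||W0 + B @ A||, norm taken per column."""
    V = W0 + B @ A                                   # direction before normalization
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)
    return m * (V / col_norm)                        # rescale each column by its magnitude

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2
W0 = rng.normal(size=(d_out, d_in))                  # frozen original weight of the target layer
B = np.zeros((d_out, r))                             # low-rank direction factor, zero at start
A = rng.normal(size=(r, d_in))                       # low-rank direction factor
m = np.linalg.norm(W0, axis=0, keepdims=True)        # magnitude, initialized to column norms of W0

W_init = adapted_weight(W0, B, A, m)                 # equals W0 before any training

B_trained = rng.normal(size=(d_out, r))              # stand-in for a trained factor
W_adapted = adapted_weight(W0, B_trained, A, m)
```

With this initialization the adapted layer starts out identical to the original layer, and only `m`, `B`, and `A` would be trained while `W0` stays frozen.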
[0026] Optionally, applying Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure includes: Define the Stiefel manifold and set the objective function for the training process of the multi-task question answering model.
[0027] Obtain the Euclidean gradient of the direction parameters of the incremental matrix from the objective function, and project this Euclidean gradient onto the tangent space of the Stiefel manifold to obtain a Riemannian gradient that satisfies the tangent space constraint.
[0028] Update the direction parameters of the incremental matrix using the Riemannian gradient that satisfies the tangent space constraint.
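The projection and update described above can be sketched in NumPy for a direction parameter with orthonormal columns. The tangent-space projection shown is the standard one for the Stiefel manifold under the embedded Euclidean metric; the QR-based retraction that maps the updated point back onto the manifold, the step size, and the function names are assumptions, since the text does not say how the update is returned to the manifold.

```python
import numpy as np

def sym(M):
    # Symmetric part of a square matrix.
    return 0.5 * (M + M.T)

def project_to_tangent(U, G):
    """Project a Euclidean gradient G onto the tangent space at U, where U.T @ U = I."""
    return G - U @ sym(U.T @ G)

def qr_retract(X):
    """Map a full-column-rank matrix back to orthonormal columns (assumed retraction)."""
    Q, R = np.linalg.qr(X)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs                      # fix column signs so the result is unique

def riemannian_sgd_step(U, G, lr=0.1):
    xi = project_to_tangent(U, G)         # Riemannian gradient in the tangent space
    return qr_retract(U - lr * xi)        # step, then return to the manifold

rng = np.random.default_rng(0)
n, r = 8, 3
U = qr_retract(rng.normal(size=(n, r)))   # direction parameter with orthonormal columns
G = rng.normal(size=(n, r))               # stand-in for the Euclidean gradient of the loss
xi = project_to_tangent(U, G)
U_next = riemannian_sgd_step(U, G)
```

After the step the columns of `U_next` remain orthonormal, which is the property the manifold constraint is meant to preserve throughout training.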
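[0029] Optionally, the objective function for training the multi-task question answering model is shown in equation (1) below: $\mathcal{L} = \mathcal{L}_{\mathrm{AR}} + \lambda_1 \mathcal{L}_{\mathrm{route}} + \lambda_2 \mathcal{L}_{\mathrm{balance}}$ (1) In the formula, $\mathcal{L}$ denotes the objective function of the training process of the multi-task question answering model, $\mathcal{L}_{\mathrm{AR}}$ denotes the multi-task autoregressive loss function, $\lambda_1$ and $\lambda_2$ denote the weighting coefficients, $\mathcal{L}_{\mathrm{route}}$ denotes the expert routing loss, and $\mathcal{L}_{\mathrm{balance}}$ denotes the expert load balancing loss.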
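As a rough illustration of how such an objective could be assembled, the sketch below adds a weighted routing term and a weighted load-balancing term to an autoregressive loss. The concrete definitions are assumptions, not taken from the text: the routing loss is modeled as cross-entropy against hypothetical routing prior labels, the load-balancing loss as the squared coefficient of variation of the mean expert load, and the coefficient values are arbitrary.

```python
import numpy as np

def routing_loss(gate_weights, prior_expert):
    """Assumed form: cross-entropy between routing weights and a prior expert label per sample."""
    picked = gate_weights[np.arange(len(prior_expert)), prior_expert]
    return float(-np.log(picked + 1e-12).mean())

def load_balance_loss(gate_weights):
    """Assumed form: squared coefficient of variation of the average load per expert."""
    load = gate_weights.mean(axis=0)
    return float(load.var() / (load.mean() ** 2))

def total_loss(l_ar, gate_weights, prior_expert, lam1=0.1, lam2=0.01):
    # Autoregressive loss plus the two weighted auxiliary terms.
    return (l_ar
            + lam1 * routing_loss(gate_weights, prior_expert)
            + lam2 * load_balance_loss(gate_weights))

# Routing weights for 3 samples over 3 experts (rows sum to 1).
gate_weights = np.array([[0.7, 0.2, 0.1],
                         [0.1, 0.8, 0.1],
                         [0.2, 0.2, 0.6]])
prior_expert = np.array([0, 1, 2])        # hypothetical routing prior labels
loss = total_loss(2.0, gate_weights, prior_expert)

uniform = np.full((3, 3), 1.0 / 3.0)      # perfectly balanced routing
balanced = load_balance_loss(uniform)
```

Under these assumed definitions the load-balancing term vanishes when every expert receives the same average routing weight and grows as the load becomes uneven.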
[0030] In another aspect, an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation is provided. The device is used to carry out the intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation, and includes: a receiving module, used to receive a question input by the user.
[0031] The generation module is used to generate answers to questions by the trained multi-task question answering model. The training process of the multi-task question answering model includes: Obtain historical question-and-answer text data, add task identifiers and supervision labels to the historical question-and-answer text data, and perform preprocessing to obtain the sample dataset.
[0032] The multi-task question answering model obtains expert output results based on samples; the multi-task question answering model includes a shared backbone network module, an expert allocation module, and multiple expert sub-modules.
[0033] Introduce a weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules, apply Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure, and then train the multi-task question answering model on the sample dataset to obtain the trained multi-task question answering model.
[0034] The output module is used to output the generated answer to the user.
[0035] Optionally, historical question-and-answer text data includes: knowledge question-and-answer text data, reading comprehension question-and-answer text data, multi-turn dialogue question-and-answer text data, retrieval-enhanced question-and-answer text data, and domain-specific question-and-answer text data.
[0036] Optionally, the model building module is further used for: S21. Select any sample from the sample dataset and input the selected sample into the shared backbone network module to obtain the semantic representation of the sample.
[0037] S22. Concatenate the semantic representation of the sample with the task identifier vector of the sample to obtain the joint feature representation.
[0038] S23. Input the joint feature representation into the expert allocation module to obtain the routing weight vector of each expert sub-module; wherein, the expert allocation module adopts a gated network.
[0039] S24. Select a subset of expert submodules from multiple expert submodules based on the routing weight vectors of each expert submodule.
[0040] S25. By performing mapping operations on the semantic representation of the sample through the selected expert sub-modules, the operation results of each expert sub-module are obtained. The operation results of each expert sub-module are weighted and summed according to the routing weight vector of each expert sub-module to obtain the expert output result of the sample.
[0041] Optionally, the training module is further used for: For the target layer in the shared backbone network module that undertakes the task adaptation function, an incremental matrix generated by the low-rank adaptation structure of weight decomposition is introduced into the original weight matrix of the target layer to obtain the adapted weight matrix; wherein, the incremental matrix includes magnitude parameters and direction parameters.
[0042] During the training of the multi-task question answering model, only the magnitude and direction parameters of the adapted weight matrix are updated.
[0043] Optionally, the training module is further used for: An incremental matrix generated by a weight decomposition low-rank adaptation structure is introduced into the parameters of selected expert submodules.
[0044] Optionally, the training module is further used for: Define the Stiefel manifold and set the objective function for the training process of the multi-task question answering model.
[0045] Obtain the Euclidean gradient of the direction parameters of the incremental matrix from the objective function, and project this Euclidean gradient onto the tangent space of the Stiefel manifold to obtain a Riemannian gradient that satisfies the tangent space constraint.
[0046] Update the direction parameters of the incremental matrix using the Riemannian gradient that satisfies the tangent space constraint.
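[0047] Optionally, the objective function for training the multi-task question answering model is shown in equation (1) below: $\mathcal{L} = \mathcal{L}_{\mathrm{AR}} + \lambda_1 \mathcal{L}_{\mathrm{route}} + \lambda_2 \mathcal{L}_{\mathrm{balance}}$ (1) In the formula, $\mathcal{L}$ denotes the objective function of the training process of the multi-task question answering model, $\mathcal{L}_{\mathrm{AR}}$ denotes the multi-task autoregressive loss function, $\lambda_1$ and $\lambda_2$ denote the weighting coefficients, $\mathcal{L}_{\mathrm{route}}$ denotes the expert routing loss, and $\mathcal{L}_{\mathrm{balance}}$ denotes the expert load balancing loss.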
[0048] In another aspect, an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation is provided, the device comprising: a processor; and a memory storing computer-readable instructions which, when executed by the processor, implement any of the above intelligent question answering methods combining hybrid experts with weight decomposition low-rank adaptation.
[0049] In another aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, the at least one instruction being loaded and executed by a processor to implement any of the above intelligent question answering methods combining hybrid experts with weight decomposition low-rank adaptation.
[0050] The beneficial effects of the technical solutions provided in the embodiments of the present invention include at least the following: This invention combines a hybrid expert mechanism, weight decomposition low-rank adaptation, and Stiefel manifold optimization into a unified technical solution that balances task decoupling, parameter efficiency, and optimization stability. This improves the overall performance of intelligent question answering and makes it efficient and reliable. The solution also has engineering application value and task scalability, supporting efficient training, rapid adaptation, and stable deployment for various question answering tasks within a unified question answering model framework. Specifically: (1) By introducing a hybrid expert mechanism, this invention realizes a dynamic division of labor for multi-task question answering samples, which can effectively reduce the mutual interference between different question answering tasks and improve the multi-task collaborative modeling capability.
[0051] (2) This invention adopts a weight decomposition low-rank adaptation method to achieve efficient parameter updates while keeping the main structure of the model unchanged, thereby reducing the training overhead and deployment cost of the multi-task question answering model.
[0052] (3) By applying Stiefel manifold constraints to the low-rank direction parameters and updating them with Riemannian optimization, this invention can improve the stability and independence of the low-rank adaptation subspace and enhance the convergence and robustness of the training process.
Description of the Drawings
[0053] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0054] Figure 1 is a flowchart of an intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation, provided in an embodiment of the present invention.
Figure 2 is a schematic diagram of an intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation, provided in an embodiment of the present invention.
Figure 3 is a block diagram of an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, provided in an embodiment of the present invention.
Figure 4 is a schematic structural diagram of an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, provided in an embodiment of the present invention.
Detailed Implementation
[0055] The technical solution of the present invention will now be described with reference to the accompanying drawings.
[0056] In embodiments of the present invention, words such as "exemplarily," "for example," etc., are used to indicate that something is an example, illustration, or description. Any embodiment or design described as "exemplary" in the present invention should not be construed as being more preferred or advantageous than other embodiments or designs. Specifically, the use of the word "exemplary" is intended to present the concept in a concrete manner. Furthermore, in embodiments of the present invention, the meaning expressed by "and / or" can be both, or either one.
[0057] In the embodiments of this invention, the terms "image" and "picture" may sometimes be used interchangeably. When the distinction between them is not emphasized, they convey the same meaning. Similarly, the terms "of," "relevant," and "corresponding" may sometimes be used interchangeably. When the distinction between them is not emphasized, they convey the same meaning.
[0058] In the embodiments of this invention, a subscripted form such as $W_1$ may sometimes be written in a non-subscript form such as W1. When the difference is not emphasized, they express the same meaning.
[0059] To make the technical problems, technical solutions and advantages of the present invention clearer, a detailed description will be given below in conjunction with the accompanying drawings and specific embodiments.
[0060] This invention provides an intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation. The method can be implemented by an intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, which can be a terminal or a server. Figure 1 shows the flowchart of the method. The processing flow of the method may include the following steps: S1. Receive a question input by the user.
[0061] S2. The trained multi-task question answering model generates answers to questions. The training process of the multi-task question answering model includes: Obtain historical question-and-answer text data, add task identifiers and supervision labels to the historical question-and-answer text data, and perform preprocessing to obtain the sample dataset.
[0062] In one feasible implementation, the raw data required by the multi-task question-answering model of the present invention mainly includes question-answering text data, task identification data, and related supervision label data.
[0063] Specifically, the question-answering text data uses publicly available question-answering datasets or industry-specific question-answering data as training corpora. This data can include, but is not limited to, knowledge-based question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-enhanced question answering, and domain-specific question answering data. Each sample must contain at least the question text and may further include contextual text, candidate knowledge fragments, question options, etc. The data can originate from commonly used datasets in the field of natural language processing, covering various task formats such as knowledge-based question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-enhanced question answering, and domain-specific question answering.
[0064] Task identification data: indicates the task type to which each sample belongs, namely knowledge-based question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-enhanced question answering, or domain-specific question answering.
[0065] Label generation: Based on the training requirements of different question-answering tasks, corresponding supervision information is added to the input samples. For extractive question-answering tasks, labels may include the start and end positions of the answer, the answer text, and the question category information; for generative question-answering tasks, labels may include the standard answer text, the set of reference answers, and the answer length information; for multi-task unified modeling scenarios, task type labels, domain labels, difficulty labels, or routing prior labels can also be added to each sample to guide the hybrid expert module in task-aware routing and differential representation learning.
[0066] Further, data preprocessing: During preprocessing, operations such as uniform format conversion, text cleaning, sample alignment, and tag encoding are performed on the raw question-answer data. Preprocessing may include removing invalid characters, standardizing punctuation, truncating excessively long text, constructing question-context-answer triples, and adding special task prefixes or task prompt templates to adapt to the input format requirements of the unified question-answering backbone model. For multi-task samples, a unified prompt template can also be constructed according to the task type, mapping different tasks to a unified natural language input representation, thereby improving compatibility and transferability during multi-task joint training.
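The preprocessing operations listed above (cleaning, punctuation standardization, truncation, and task prefixes) might look like the following sketch. The prefix strings, the field names, the character-based truncation limit, and the use of Unicode NFKC normalization for punctuation are all illustrative assumptions rather than details given in the text.

```python
import re
import unicodedata

# Hypothetical task prefixes; the text does not specify the actual templates.
TASK_PREFIX = {
    "knowledge_qa": "[knowledge_qa]",
    "reading_comprehension": "[reading_comprehension]",
    "multi_turn_dialogue": "[multi_turn_dialogue]",
    "retrieval_enhanced": "[retrieval_enhanced]",
    "domain_specific": "[domain_specific]",
}

def clean_text(text):
    text = unicodedata.normalize("NFKC", text)                 # fold full-width punctuation
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)   # drop invalid control characters
    return re.sub(r"\s+", " ", text).strip()                   # collapse runs of whitespace

def build_sample(task, question, context="", answer="", max_chars=2000):
    """Build one unified record from a question, optional context, and answer."""
    question = clean_text(question)
    context = clean_text(context)[:max_chars]                  # truncate overly long context
    prompt = f"{TASK_PREFIX[task]} question: {question}"
    if context:
        prompt += f" context: {context}"
    return {"task": task, "input": prompt, "answer": clean_text(answer)}

sample = build_sample(
    "reading_comprehension",
    "Who  wrote\tit？",
    context="a" * 5000,
    answer=" Alice ",
)
```

Mapping every task to the same `input` and `answer` fields is one way to realize the unified natural language input representation mentioned above; a production pipeline would more likely truncate by tokens than by characters.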
[0067] Furthermore, data storage: 1) Storage format: Preprocessed question-and-answer text data is stored in JSON format. Task identification information, routing assistance information, and label data are stored in JSON or another structured text format in one-to-one correspondence with the samples. Expert routing logs generated during model training are also stored. Low-rank adaptation parameters and intermediate state information can be saved in the safetensors tensor parameter format.
[0068] 2) Hierarchical storage: Based on the data purpose and task type, the data is divided into a training set, a validation set, and a test set in a preset ratio, such as 8:1:1, and stored in a structured directory to facilitate subsequent access, reading, and experimental reproduction.
[0069] 3) Training set: mainly includes various question-answering tasks, various question formats and various domain texts, used to train the unified backbone model, hybrid expert module and low-rank adaptation parameters.
[0070] 4) Validation set: Contains some task samples, problem forms or domain scenarios that were not used in training. It is used to evaluate the generalization performance of the model during training and to adjust hyperparameters such as the number of experts, low rank value, and manifold constraint strength.
[0071] 5) Test set: Independent of the training and validation sets, used to finally evaluate the accuracy, robustness, parameter efficiency and cross-task transferability of the method of the present invention in multi-task question answering scenarios.
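The storage scheme in items 1) to 5) can be illustrated with a short sketch. It assumes a seeded random shuffle, the 8:1:1 ratio mentioned as an example, and one JSON file per split; the file names and function names are hypothetical, and saving adapter parameters in safetensors format is not shown.

```python
import json
import random
from pathlib import Path


def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle deterministically and split into train/validation/test by ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # any remainder goes to the test set
    return train, val, test


def save_splits(samples, out_dir, ratios=(8, 1, 1), seed=0):
    """Write each split as a JSON file inside a structured directory."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = ("train", "validation", "test")
    splits = split_dataset(samples, ratios, seed)
    for name, split in zip(names, splits):
        with open(out / f"{name}.json", "w", encoding="utf-8") as f:
            json.dump(split, f, ensure_ascii=False, indent=2)
    return {name: len(split) for name, split in zip(names, splits)}
```

Fixing the seed keeps the three sets disjoint and reproducible across runs, which supports the experimental reproduction mentioned above.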
[0072] The multi-task question answering model obtains expert output results based on samples; the multi-task question answering model includes a shared backbone network module, an expert allocation module, and multiple expert sub-modules.
[0073] In one feasible implementation, the present invention sets up an MoE (Mixture of Experts) structure in the multi-task question answering model and introduces a single gating network after the shared backbone network to serve as the expert allocation module.
[0074] Optionally, obtaining the expert output result based on the sample may include the following steps S21-S25: S21. Select any sample from the sample dataset and input the selected sample into the shared backbone network module to obtain the semantic representation of the sample.
[0075] In one feasible implementation, for an input sample $x$, its semantic representation $h$ is first extracted by the shared encoding backbone: $h = f_{\theta_0}(x)$.
[0076] S22. Concatenate the semantic representation of the sample with the task identifier vector of the sample to obtain the joint feature representation.
[0077] In one feasible implementation, the task identifier vector $t$ is concatenated with the semantic representation $h$ to obtain the joint feature representation $z = [h; t]$.
[0078] S23. Input the joint feature representation into the expert allocation module to obtain the routing weight vector over the expert sub-modules; wherein the expert allocation module adopts a gating network.
[0079] In one feasible implementation, the gating network takes the joint feature representation $z$ as its sole input and calculates the routing weight vector over the experts. The calculation is expressed as $g = \mathrm{softmax}(G(z; \theta_g))$, where $\theta_g$ denotes the parameters of the gating network and $g_i$ denotes the weight assigned to the $i$-th expert.
[0080] S24. Select a subset of expert sub-modules from the multiple expert sub-modules based on the routing weight vector.
[0081] In one feasible implementation, the system selects the top-k experts from the expert set $\{E_1, \dots, E_N\}$ according to the routing weights to participate in the calculation of the current sample, where $N$ denotes the number of expert sub-modules and the parameters of the $i$-th expert are denoted $\phi_i$.
[0082] S25. Perform mapping operations on the semantic representation of the sample through the selected expert sub-modules to obtain the operation result of each selected expert sub-module, and compute the weighted sum of these results according to the routing weight vector to obtain the expert output result of the sample.
[0083] In one feasible implementation, the selected experts each perform a mapping operation on the input representation $h$, and the results are then weighted and summed according to the corresponding routing weights to obtain the expert output result $y = \sum_{i \in \mathcal{S}} g_i \, E_i(h; \phi_i)$, where $\mathcal{S}$ denotes the set of selected top-k experts.
[0084] Through the aforementioned hybrid expert division-of-labor mechanism, different question-answering tasks are assigned to different expert subspaces within the model for feature modeling and answer generation, so that the shared parameters undertake only general semantic modeling while the differentiated adaptation for each task is completed by the corresponding experts. This reduces inter-task interference, representation coupling, and negative transfer in multi-task joint training, and improves task division capability and modeling accuracy in multi-task question answering scenarios. At the same time, the hybrid expert structure provides a clear attachment point for the subsequent weight decomposition low-rank adaptation, enabling low-rank parameter updates to be performed separately within different experts, and provides a stable structural foundation for the subsequent Stiefel manifold optimization.
[0085] This invention introduces the MoE structure into a multi-task question answering model. By using a gating network to dynamically route multiple experts based on the joint feature representation of the input samples, different question answering tasks can be processed in different expert subspaces on the basis of a shared backbone network, thereby achieving clear division of labor and differentiated modeling in multi-task scenarios.
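Steps S22-S25 can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation. It assumes, beyond what the source states, that the gate is a bias-free linear layer followed by softmax and that the routing weights of the selected experts are not renormalized; the names `route`, `gate_weights`, and `experts` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def route(h, task_vec, gate_weights, experts, k=2):
    """Concatenate, gate, select top-k experts, and form the weighted sum.

    h            : semantic representation (list of floats)
    task_vec     : task identifier vector (list of floats)
    gate_weights : one weight row per expert, of length len(h) + len(task_vec)
    experts      : list of callables mapping h to an output vector
    """
    z = h + task_vec  # S22: joint feature representation (list concatenation)
    logits = [sum(w * x for w, x in zip(row, z)) for row in gate_weights]
    g = softmax(logits)  # S23: routing weight vector
    top = sorted(range(len(g)), key=lambda i: g[i], reverse=True)[:k]  # S24
    outputs = {i: experts[i](h) for i in top}  # S25: only selected experts run
    dim = len(next(iter(outputs.values())))
    y = [sum(g[i] * outputs[i][j] for i in top) for j in range(dim)]
    return y, top, g
```

Because only the k selected experts are evaluated, the per-sample compute grows with k rather than with the total number of experts.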
[0086] A weight decomposition low-rank adaptation structure is introduced into the shared backbone network module and the multiple expert sub-modules, Stiefel manifold constraints and Riemannian optimization are applied to the direction parameters in the weight decomposition low-rank adaptation structure, and the multi-task question answering model is trained based on the sample dataset to obtain the trained multi-task question answering model.
[0087] In one feasible implementation, the present invention introduces a weight decomposition low-rank adaptation structure into the shared backbone network and various expert sub-modules of the multi-task question answering model to replace the traditional full parameter update method, thereby achieving efficient parameter training of the multi-task question answering model.
[0088] Optionally, a weight decomposition low-rank adaptation structure is introduced into the shared backbone network module, including: For the target layer in the shared backbone network module that undertakes the task adaptation function, an incremental matrix generated by the low-rank adaptation structure of weight decomposition is introduced into the original weight matrix of the target layer to obtain the adapted weight matrix; wherein, the incremental matrix includes magnitude parameters and direction parameters.
[0089] During the training of the multi-task question answering model, only the magnitude and direction parameters of the adapted weight matrix are updated.
[0090] In one feasible implementation, the basic parameters of the shared backbone network are denoted $\theta_0$. During training, this invention keeps the basic parameters $\theta_0$ out of the full update; instead, low-rank adaptation parameters are introduced into the target layer that undertakes the task adaptation function in the shared backbone network, and the update to the original weights is modeled in decomposed form.
[0091] Specifically, for the original weight matrix $W_0$ in the target layer, the adapted weight is expressed as $W' = W_0 + \Delta W$, where $\Delta W$ denotes the incremental matrix generated by the weight decomposition low-rank adaptation structure.
[0092] The incremental matrix $\Delta W$ is constructed using a "magnitude parameter + direction parameter" decomposition. Let the direction parameter matrices be denoted $U$ and $V$ and the magnitude parameter be denoted $m$; the incremental matrix is then expressed as $\Delta W = m \, U V^{\top}$, where $U \in \mathbb{R}^{d_{out} \times r}$, $V \in \mathbb{R}^{d_{in} \times r}$, $r$ denotes the low-rank dimension, $d_{in}$ and $d_{out}$ denote the input and output dimensions of the target layer, respectively, $(\cdot)^{\top}$ denotes the matrix transpose, and $\mathbb{R}$ denotes the set of real numbers.
[0093] With this decomposition, model parameter updates no longer act directly on the complete weight matrix. Instead, they act on the low-dimensional direction matrices $U$ and $V$ and the corresponding magnitude parameter $m$. This significantly reduces the number of parameters to be trained, thereby improving the training and deployment efficiency of the model in multi-task scenarios.
[0094] Optionally, a weight decomposition low-rank adaptation structure is introduced in several expert submodules, including: An incremental matrix generated by a weight decomposition low-rank adaptation structure is introduced into the parameters of selected expert submodules.
[0095] In one feasible implementation, the weight decomposition low-rank adaptation structure is deployed not only in the target layer of the shared backbone network that performs the task adaptation function, but also in the internal target layers of the experts selected by the gating network. For the $i$-th expert submodule, its parameters are denoted $\phi_i$. For any input sample, the system first determines the top-k experts that participate in the calculation of the current sample based on the hybrid expert division-of-labor mechanism of the previous technical point; then, on the basis of the shared backbone parameters $\theta_0$, a low-rank increment is applied to the corresponding target layer, and a corresponding low-rank incremental update is applied inside the selected expert parameters $\phi_i$. In this way, the basic parameters $\theta_0$ are responsible for maintaining the model's general semantic representation capability for multi-task question answering scenarios, while the low-rank adaptation parameters are responsible for fine-tuning the local features and answer generation patterns required by different tasks, thereby achieving hierarchical modeling of shared knowledge and task-specific knowledge.
[0096] Compared with traditional full-parameter fine-tuning, the weight decomposition low-rank adaptation technique used in this invention does not change the main structure of the original model parameters and does not require a full update of the original model parameters $\theta_0$; instead, only a small number of low-rank direction parameters $U$, $V$ and magnitude parameters $m$ are trained to complete the adaptation of the multi-task question answering model, thereby reducing training overhead, GPU memory usage, and parameter storage costs. At the same time, because the low-rank update structure is explicitly embedded within the shared backbone network and the expert submodules, the model can form more targeted parameter adjustment paths for different tasks while maintaining a unified semantic foundation. This improves adaptation accuracy and generalization ability in multi-task question answering scenarios and provides a clear optimization target, namely the low-rank direction matrix, for the subsequent Stiefel manifold optimization.
[0097] This invention introduces the weight decomposition low-rank adaptation structure into the shared basic parameters $\theta_0$ and the selected expert parameters $\phi_i$, and updates the target layer parameters through the low-rank increment matrix $\Delta W$, so that the model can complete multi-task question answering adaptation without full-parameter fine-tuning, thereby reducing training costs and improving parameter utilization efficiency.
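The "magnitude parameter + direction parameter" construction can be illustrated with a small dependency-free sketch. It assumes a scalar magnitude `m`, two direction matrices `u` (d_out by r) and `v` (d_in by r), and an increment of the form m·U·Vᵀ added to the frozen weight; the exact form used in the invention is not recoverable from the source, and all function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def adapted_weight(w0, u, v, m):
    """Return W' = W0 + m * U V^T as a new matrix; W0 is left untouched (frozen)."""
    delta = matmul(u, transpose(v))
    return [[w + m * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


def trainable_params(d_out, d_in, r):
    """Return (adapter, full) parameter counts: U, V and scalar m versus all of W0."""
    return d_out * r + d_in * r + 1, d_out * d_in
```

For example, under these assumptions a 4096 by 4096 target layer with r = 8 trains 65,537 adapter parameters instead of 16,777,216 full weights, which is the source of the reduced training and storage cost described above.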
[0098] Optionally, applying Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure includes: defining the Stiefel manifold and setting the objective function for the training process of the multi-task question answering model.
[0099] The Euclidean gradient of the direction parameter of the increment matrix is obtained from the objective function, and this Euclidean gradient is projected onto the tangent space of the Stiefel manifold to obtain the Riemannian gradient that satisfies the tangent space constraint.
[0100] The direction parameters of the increment matrix are updated using the Riemannian gradient that satisfies the tangent space constraint.
[0101] In one feasible implementation, the present invention applies a Stiefel manifold constraint to the low-rank direction parameters during the training of the weight decomposition low-rank adaptation structure, so as to improve the stability, independence, and separability of the low-rank adaptation subspace.
[0102] Specifically, for the low-rank increment matrix $\Delta W$ defined in the previous technical point, the present invention constrains the direction parameter matrix $U$ to the Stiefel manifold. The Stiefel manifold is defined as follows:
$$\mathrm{St}(d, r) = \{\, U \in \mathbb{R}^{d \times r} : U^{\top} U = I_r \,\}$$
where $d$ is the number of rows of $U$ and $I_r$ denotes the $r \times r$ identity matrix. This constraint means that the column vectors of $U$ are pairwise orthogonal and of unit norm, ensuring that the low-rank update direction always lies in a feasible subspace with orthogonal structure and avoiding redundant overlap and representation degradation between different directions.
[0103] In the specific training process, this invention keeps the shared basic parameters $\theta_0$ as fixed backbone parameters, trains only the low-rank adaptation parameters, and explicitly applies the Stiefel manifold constraint to the direction parameter matrices in the shared backbone target layer and in the target layers of the selected expert submodules. Let the current multi-task training objective function be denoted as:
$$\mathcal{L} = \mathcal{L}_{QA} + \lambda_1 \mathcal{L}_{route} + \lambda_2 \mathcal{L}_{bal} \quad (1)$$
In the formula, $\mathcal{L}$ denotes the objective function of the training process of the multi-task question answering model; $\mathcal{L}_{QA}$ denotes an autoregressive loss function that applies uniformly to multiple tasks such as knowledge-based question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-augmented question answering, and domain-specific question answering, and is consistent with the pre-training objective of the LLM; $\lambda_1$ and $\lambda_2$ denote weighting coefficients; $\mathcal{L}_{route}$ denotes the expert routing loss; and $\mathcal{L}_{bal}$ denotes the expert load-balancing loss.
[0104] For the direction parameter matrix $U$, its Euclidean gradient is denoted as:
$$G = \nabla_U \mathcal{L} \quad (2)$$
Because $U$ is constrained to the Stiefel manifold $\mathrm{St}(d, r)$, gradient descent in ordinary Euclidean space cannot be used directly; instead, the Euclidean gradient must be projected onto the tangent space of the manifold. The tangent space can be represented as:
$$T_U \mathrm{St}(d, r) = \{\, Z \in \mathbb{R}^{d \times r} : U^{\top} Z + Z^{\top} U = 0 \,\} \quad (3)$$
In the formula, $T_U \mathrm{St}(d, r)$ denotes the tangent space of the manifold at the point $U$, and $Z$ denotes a tangent vector in the tangent space, which is a $d \times r$ real matrix.
[0105] This invention uses the Riemannian gradient for optimization, calculated as follows:
$$\mathrm{grad}\,\mathcal{L}(U) = G - U\,\mathrm{sym}(U^{\top} G), \qquad \mathrm{sym}(A) = \tfrac{1}{2}(A + A^{\top}) \quad (4)$$
In the formula, $G$ denotes the gradient of the loss function. Through this projection, the original Euclidean gradient $G$ is transformed into a Riemannian gradient satisfying the tangent space constraint, ensuring that subsequent update directions remain consistent with the geometry of the Stiefel manifold. After obtaining the Riemannian gradient, this invention employs a tangent-space-based Riemannian optimization method to update the direction parameter matrix $U$. Let $\eta$ denote the learning rate. First, one step of gradient descent is performed within the tangent space:
$$\tilde{U} = U - \eta\,\mathrm{grad}\,\mathcal{L}(U) \quad (5)$$
Then $\tilde{U}$ is mapped back onto the Stiefel manifold by a retraction. The retraction takes an orthogonalization form and is represented as:
$$U^{+} = \mathrm{qf}(\tilde{U}) \quad (6)$$
One specific implementation of the retraction is based on QR decomposition, where $\mathrm{qf}(\cdot)$ denotes taking the orthogonal factor of the QR decomposition of its argument. After this update, the new direction parameter $U^{+}$ still satisfies $(U^{+})^{\top} U^{+} = I_r$. For the $i$-th selected expert, the internal direction parameter matrix is denoted $U_i$ and likewise satisfies $U_i^{\top} U_i = I_r$.
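The projection, tangent-space step, and QR-based retraction described above can be sketched without third-party libraries. This sketch assumes the projection form G − U·sym(UᵀG) for the Riemannian gradient and computes the orthogonal factor by modified Gram-Schmidt, which presumes the stepped matrix has full column rank; the function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def riemannian_grad(u, g):
    """Project the Euclidean gradient G onto the tangent space at U."""
    utg = matmul(transpose(u), g)
    r = len(utg)
    sym = [[0.5 * (utg[i][j] + utg[j][i]) for j in range(r)] for i in range(r)]
    corr = matmul(u, sym)
    return [[gv - cv for gv, cv in zip(g_row, c_row)]
            for g_row, c_row in zip(g, corr)]


def qf(a):
    """Orthogonal factor of a thin QR decomposition (modified Gram-Schmidt)."""
    q_cols = []
    for col in zip(*a):
        v = list(col)
        for q in q_cols:
            dot = sum(x * y for x, y in zip(q, v))
            v = [x - dot * y for x, y in zip(v, q)]
        norm = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / norm for x in v])
    return transpose(q_cols)


def stiefel_step(u, g, lr):
    """One update: descent step in the tangent space, then QR retraction."""
    rg = riemannian_grad(u, g)
    u_tilde = [[uv - lr * rv for uv, rv in zip(u_row, r_row)]
               for u_row, r_row in zip(u, rg)]
    return qf(u_tilde)
```

In a tensor framework the same retraction would normally use a library QR routine; the explicit Gram-Schmidt loop is used here only to keep the example self-contained.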
[0106] By introducing the aforementioned Stiefel manifold constraints and Riemannian optimization process, this invention transforms the learning of the low-rank direction parameters from unconstrained optimization in ordinary Euclidean space into geometrically constrained optimization with orthogonal structural restrictions. Thus, in multi-task question-answering scenarios, after different task samples are assigned to different experts by the gating network, the low-rank update directions within each expert can maintain greater independence, reducing mutual interference between tasks caused by overlapping low-rank subspaces. At the same time, the low-rank direction parameters on the shared backbone can maintain a stable representation basis during general knowledge modeling, improving convergence stability and parameter update efficiency during model training. With this technique, weight decomposition low-rank adaptation no longer relies solely on reducing the parameter count for efficient training, but further ensures through geometric constraints that the low-rank adaptation subspace has controllable, stable, and clear structural features.
[0107] Compared with low-rank adaptation methods without manifold constraints, the stable optimization technique based on Stiefel manifold constraints employed in this invention can effectively suppress problems such as increased column vector correlation, subspace collapse, and accumulation of redundant representations in the low-rank direction matrices during training, so that the low-rank increment matrix maintains high representational quality while ensuring high parameter efficiency. In addition, this invention combines the aforementioned hybrid expert division-of-labor technique and the weight decomposition low-rank adaptation technique to form a complete technical chain from task structure division of labor and efficient parameter modeling to geometric stability optimization. This improves the training stability, representational ability, and final question-answering performance of multi-task question-answering models under complex task distributions.
[0108] This invention constrains the direction parameter matrix in the weight decomposition low-rank adaptation to the Stiefel manifold, which enhances the independence, stability, and separability of the low-rank adaptation subspace.
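As a short worked check of the projection step, the following derivation shows that a gradient of the projected form satisfies the tangent space condition of the Stiefel manifold. The symbols are defined locally for this illustration: $U$ is a direction matrix with $U^{\top}U = I_r$, $G$ is its Euclidean gradient, and $\xi$ is the projected gradient, assuming the projection form $G - U\,\mathrm{sym}(U^{\top}G)$.

```latex
\begin{aligned}
A &:= U^{\top} G, \qquad \mathrm{sym}(A) = \tfrac{1}{2}\,(A + A^{\top}), \qquad
\xi := G - U\,\mathrm{sym}(A), \\
U^{\top}\xi &= U^{\top}G - (U^{\top}U)\,\mathrm{sym}(A)
             = A - \tfrac{1}{2}\,(A + A^{\top})
             = \tfrac{1}{2}\,(A - A^{\top}), \\
\xi^{\top}U &= (U^{\top}\xi)^{\top} = -\tfrac{1}{2}\,(A - A^{\top}), \\
U^{\top}\xi + \xi^{\top}U &= 0 .
\end{aligned}
```

Since $U^{\top}\xi$ is skew-symmetric, $\xi$ lies in the tangent space at $U$, so a small step along $-\xi$ leaves the orthonormality constraint unchanged to first order, and the retraction restores it exactly.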
[0109] The core of this invention does not lie in using MoE, weight decomposition low-rank adaptation, or Stiefel manifold optimization alone, but in organically combining the three under a unified multi-task question answering framework to form a complete technical chain from task division and efficient parameter adaptation to geometric constraint optimization.
[0110] S3. Output the generated answer to the user.
[0111] As shown in Figure 2, this invention introduces a hybrid expert mechanism into the intelligent question-answering scenario and uses a gating network to achieve dynamic division-of-labor modeling of input samples in different expert subspaces. On this basis, a weight decomposition low-rank adaptation structure is further introduced into the shared backbone parameters and the expert parameters to achieve efficient parameter training. At the same time, a Stiefel manifold constraint is applied to the low-rank direction matrix to ensure the orthogonality, stability, and independence of the low-rank adaptation subspace. Thus, the three mechanisms of hybrid expert dynamic division of labor, weight decomposition low-rank adaptation, and Stiefel manifold constraint are coupled in a unified manner and applied to the overall technical solution of intelligent question answering.
[0112] This invention proposes a training method for an intelligent question-answering model that integrates a hybrid expert mechanism and weight decomposition low-rank adaptation, together with its Stiefel manifold optimization method. By introducing a hybrid expert routing mechanism, a weight decomposition low-rank adaptation module, and a Stiefel manifold constraint optimization strategy into a unified question-answering backbone model, it achieves collaborative modeling and efficient parameter training for multiple question-answering tasks. This alleviates problems such as severe task conflicts, redundant parameter updates, unstable training, and insufficient generalization ability in existing multi-task question-answering systems. It is particularly suitable for intelligent question-answering scenarios involving unified learning of multiple tasks, such as knowledge question answering, reading comprehension question answering, multi-turn dialogue question answering, retrieval-augmented question answering, and domain question answering.
[0113] Figure 3 is a block diagram of an intelligent question-answering device that combines hybrid experts with weight decomposition low-rank adaptation, according to an exemplary embodiment. The device is used to implement the intelligent question-answering method that combines hybrid experts with weight decomposition low-rank adaptation, and includes: a receiving module 310, used to receive user input.
[0114] The generation module 320 is used to generate an answer to the question using the trained multi-task question answering model. The training process of the multi-task question answering model includes: obtaining historical question-and-answer text data, adding task identifiers and supervision labels to the historical question-and-answer text data, and performing preprocessing to obtain the sample dataset.
[0115] The multi-task question answering model obtains expert output results based on samples; the multi-task question answering model includes a shared backbone network module, an expert allocation module, and multiple expert sub-modules.
[0116] A weight decomposition low-rank adaptation structure is introduced into the shared backbone network module and the multiple expert sub-modules, Stiefel manifold constraints and Riemannian optimization are applied to the direction parameters in the weight decomposition low-rank adaptation structure, and the multi-task question answering model is trained based on the sample dataset to obtain the trained multi-task question answering model.
[0117] Output module 330 is used to output the generated answer to the user.
[0118] Figure 4 is a schematic structural diagram of an intelligent question-answering device that combines hybrid experts with weight decomposition low-rank adaptation, as provided in an embodiment of the present invention. As shown in Figure 4, the intelligent question-answering device 410 may include the device illustrated in Figure 3 above. Optionally, the intelligent question-answering device 410 may include a first processor 2001.
[0119] Optionally, the intelligent question-answering device 410, which combines experts with weighted decomposition and low-rank adaptation, may also include a memory 2002 and a transceiver 2003.
[0120] The first processor 2001, memory 2002, and transceiver 2003 can be connected via a communication bus.
[0121] The components of the intelligent question-answering device 410, which combines hybrid experts with weight decomposition low-rank adaptation, are described in detail below with reference to Figure 4: The first processor 2001 is the control center of the intelligent question-answering device 410. It can be a single processor or a collective term for multiple processing elements. For example, the first processor 2001 can be one or more central processing units (CPUs), application-specific integrated circuits (ASICs), or one or more integrated circuits configured to implement embodiments of the present invention, such as one or more digital signal processors (DSPs) or one or more field-programmable gate arrays (FPGAs).
[0122] Optionally, the first processor 2001 can perform various functions of the intelligent question-answering device 410 with hybrid expert and weight decomposition low-rank adaptation by running or executing software programs stored in the memory 2002 and calling data stored in the memory 2002.
[0123] In a specific implementation, as one example, the first processor 2001 may include one or more CPUs, for example CPU0 and CPU1 shown in Figure 4.
[0124] In a specific implementation, as one example, the intelligent question-answering device 410 that combines hybrid experts with weight decomposition low-rank adaptation may also include multiple processors, for example the first processor 2001 and the second processor 2004 shown in Figure 4. Each of these processors can be a single-core processor or a multi-core processor. Here, a processor can refer to one or more devices, circuits, and / or processing cores used to process data (such as computer program instructions).
[0125] The memory 2002 is used to store the software program that executes the present invention, and is controlled by the first processor 2001 to execute it. The specific implementation method can be referred to the above method embodiment, and will not be repeated here.
[0126] Optionally, the memory 2002 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. The memory 2002 may be integrated with the first processor 2001 or may exist independently and be coupled to the first processor 2001 through the interface circuit (not shown in Figure 4) of the intelligent question-answering device 410; this embodiment of the invention does not specifically limit this.
[0127] The transceiver 2003 is used to communicate with network devices or with terminal devices.
[0128] Optionally, the transceiver 2003 may include a receiver and a transmitter (not shown separately in Figure 4). The receiver is used to implement the receiving function, and the transmitter is used to implement the transmitting function.
[0129] Optionally, the transceiver 2003 can be integrated with the first processor 2001 or exist independently and be coupled to the first processor 2001 through the interface circuit (not shown in Figure 4) of the intelligent question-answering device 410; this embodiment of the invention does not specifically limit this.
[0130] It should be noted that the structure of the intelligent question-answering device 410 shown in Figure 4 does not constitute a limitation on the device. An actual intelligent question-answering device may include more or fewer components than shown, combine certain components, or have a different component arrangement.
[0131] Furthermore, the technical effects of the intelligent question-answering device 410 with hybrid expert and weight decomposition low-rank adaptation can be referred to the technical effects of the intelligent question-answering method with hybrid expert and weight decomposition low-rank adaptation described in the above method embodiments, and will not be repeated here.
[0132] It should be understood that the first processor 2001 in the embodiments of the present invention may be a central processing unit (CPU), or it may be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor, etc.
[0133] It should also be understood that the memory in the embodiments of the present invention can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of random access memory (RAM) are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced synchronous DRAM (ESDRAM), synchronous linked DRAM (SLDRAM), and direct rambus RAM (DR RAM).
[0134] The above embodiments can be implemented, in whole or in part, by software, hardware (such as circuits), firmware, or any combination thereof. When implemented using software, the above embodiments can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired means or wireless means (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that includes one or more sets of available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. A semiconductor medium can be a solid-state drive.
[0135] It should be understood that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. A and B can be singular or plural. Additionally, the character " / " in this article generally indicates an "or" relationship between the preceding and following related objects, but it can also represent an "and / or" relationship. Please refer to the context for a more accurate understanding.
[0136] In this invention, "at least one" means one or more, and "a plurality" means two or more. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c can represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c can each be a single item or multiple items.
[0137] It should be understood that, in various embodiments of the present invention, the order of the above-mentioned process numbers does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0138] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.
[0139] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the devices, apparatuses, and units described above can be understood by reference to the corresponding processes in the foregoing method embodiments, and are not repeated here.
[0140] In the several embodiments provided by this invention, it should be understood that the disclosed devices, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of units is only a logical functional division, and other divisions are possible in actual implementation; multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
[0141] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0142] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0143] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0144] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed in the present invention should be included within the scope of protection of the present invention. Therefore, the scope of protection of the present invention should be determined by the scope of the claims.
Claims
1. An intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation, characterized in that the method comprises: S1. receiving a question input by a user; S2. generating an answer to the question using a trained multi-task question answering model, wherein the training process of the multi-task question answering model comprises: obtaining historical question-and-answer text data, adding task identifiers and supervision labels to the historical question-and-answer text data, and performing preprocessing to obtain a sample dataset; obtaining, by the multi-task question answering model, expert output results based on the samples, wherein the multi-task question answering model comprises a shared backbone network module, an expert allocation module, and multiple expert sub-modules; introducing a weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules; applying Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure; and training the multi-task question answering model based on the sample dataset to obtain the trained multi-task question answering model; and S3. outputting the generated answer to the user.
2. The method of claim 1, wherein the historical question-and-answer text data includes: knowledge question-and-answer text data, reading comprehension question-and-answer text data, multi-turn dialogue question-and-answer text data, retrieval-augmented question-and-answer text data, and domain-specific question-and-answer text data.
3. The method of claim 1, wherein obtaining the expert output results based on the samples includes: S21. selecting any sample from the sample dataset, inputting the selected sample into the shared backbone network module, and obtaining the semantic representation of the sample; S22. concatenating the semantic representation of the sample with the task identifier vector of the sample to obtain a joint feature representation; S23. inputting the joint feature representation into the expert allocation module to obtain the routing weight vector over the expert sub-modules, wherein the expert allocation module adopts a gating network; S24. selecting a subset of expert sub-modules from the multiple expert sub-modules based on the routing weight vector; S25. performing mapping operations on the semantic representation of the sample through the selected expert sub-modules to obtain the operation result of each selected expert sub-module, and computing a weighted sum of these operation results according to the routing weight vector to obtain the expert output result of the sample.
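As a non-limiting illustration, steps S22 to S25 might be sketched in NumPy as follows. The gating matrix `gate_w`, the `top_k` selection rule, the renormalisation of the selected weights, and the toy experts are all assumptions made for this sketch; the claims do not fix any of them.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def route_and_combine(h, task_vec, gate_w, experts, top_k=2):
    """Illustrative version of steps S22 to S25 for one sample.

    h        : semantic representation from the shared backbone, shape (d,)
    task_vec : task identifier vector, shape (t,)
    gate_w   : hypothetical gating matrix, shape (num_experts, d + t)
    experts  : list of callables, each mapping (d,) -> (d,)
    """
    joint = np.concatenate([h, task_vec])      # S22: joint feature representation
    weights = softmax(gate_w @ joint)          # S23: routing weight vector
    chosen = np.argsort(weights)[-top_k:]      # S24: keep the top_k experts (assumed rule)
    # Assumption: renormalise the routing weights over the chosen subset.
    norm = weights[chosen] / weights[chosen].sum()
    # S25: weighted sum of the selected experts' outputs.
    out = sum(w * experts[i](h) for i, w in zip(chosen, norm))
    return out, chosen, norm
```

Only the selected experts are evaluated, which is where a sparse gating scheme of this kind would save computation relative to running every expert sub-module.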
4. The method of claim 1, wherein introducing the weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules includes: for a target layer in the shared backbone network module that undertakes the task adaptation function, introducing an incremental matrix generated by the weight decomposition low-rank adaptation structure into the original weight matrix of the target layer to obtain an adapted weight matrix, wherein the incremental matrix includes magnitude parameters and direction parameters; and, during the training of the multi-task question answering model, updating only the magnitude parameters and direction parameters of the adapted weight matrix.
5. The method of claim 1, wherein introducing the weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules includes: introducing an incremental matrix generated by the weight decomposition low-rank adaptation structure into the parameters of selected expert sub-modules.
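As a non-limiting illustration, the magnitude and direction decomposition of claim 4 might be sketched as follows. The sketch assumes the commonly published weight-decomposed low-rank adaptation form, in which the adapted weight is a per-column magnitude times the column-normalised sum of the frozen weight and a low-rank product; the names `w0`, `b`, `a`, and `m` and the column-wise norm are assumptions, not taken from the source.

```python
import numpy as np

def adapted_weight(w0, b, a, m):
    """Return m * (W0 + B A) / ||W0 + B A||, with the norm taken per column.

    w0 : frozen original weight matrix, shape (d_out, d_in)
    b  : low-rank factor, shape (d_out, r)
    a  : low-rank factor, shape (r, d_in)
    m  : trainable magnitude, shape (1, d_in), one entry per column
    """
    v = w0 + b @ a                                        # un-normalised direction
    col_norm = np.linalg.norm(v, axis=0, keepdims=True)   # per-column length
    return m * v / col_norm

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2
w0 = rng.normal(size=(d_out, d_in))
b = np.zeros((d_out, r))                         # zero init: training starts from W0
a = rng.normal(size=(r, d_in))
m = np.linalg.norm(w0, axis=0, keepdims=True)    # magnitude starts at W0's column norms
w = adapted_weight(w0, b, a, m)                  # equals w0 at initialisation
```

In this form only `b`, `a`, and `m` would be trained while `w0` stays frozen, which matches the claim's statement that only magnitude and direction parameters are updated.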
6. The method of claim 1, wherein applying Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure includes: defining the Stiefel manifold and setting the objective function for the training process of the multi-task question answering model; obtaining, from the objective function, the Euclidean gradient with respect to the direction parameters of the incremental matrix; projecting the Euclidean gradient onto the tangent space of the Stiefel manifold and performing Riemannian optimization to obtain a Riemannian gradient that satisfies the tangent space constraint; and updating the direction parameters of the incremental matrix using the Riemannian gradient that satisfies the tangent space constraint.
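As a non-limiting illustration, one update of the kind described in claim 6 might be sketched as follows. The sketch assumes the Stiefel manifold of matrices X with orthonormal columns (X^T X = I) under the embedded metric, and it uses a QR-based retraction to return to the manifold after the step; the retraction, the step size `lr`, and all function names are assumptions, since the claim does not specify them.

```python
import numpy as np

def sym(mat):
    # Symmetric part of a square matrix.
    return 0.5 * (mat + mat.T)

def riemannian_grad(x, g):
    # Project the Euclidean gradient g onto the tangent space at x,
    # where x has orthonormal columns (x.T @ x = I).
    return g - x @ sym(x.T @ g)

def qr_retract(y):
    # Map y back onto the manifold via its QR factor; flipping signs so that
    # diag(R) is positive makes the factorisation unique.
    q, r = np.linalg.qr(y)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs

def stiefel_step(x, g, lr=0.1):
    # One Riemannian gradient-descent step followed by retraction.
    return qr_retract(x - lr * riemannian_grad(x, g))
```

The projected gradient `xi` satisfies `x.T @ xi + xi.T @ x = 0`, which is the tangent space constraint, and the retracted point again has orthonormal columns.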
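7. The method of claim 6, wherein the objective function for the training process of the multi-task question answering model is shown in equation (1) below:
$$\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda_1 \mathcal{L}_{\text{route}} + \lambda_2 \mathcal{L}_{\text{balance}} \qquad (1)$$
In the formula, $\mathcal{L}$ denotes the objective function of the training process of the multi-task question answering model, $\mathcal{L}_{\text{task}}$ denotes the multi-task autoregressive loss function, $\lambda_1$ and $\lambda_2$ denote weight coefficients, $\mathcal{L}_{\text{route}}$ denotes the expert routing loss, and $\mathcal{L}_{\text{balance}}$ denotes the expert load balancing loss.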
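As a non-limiting illustration, an objective built from the three terms named in claim 7 might be combined as follows. The additive weighted form and the default coefficients are assumptions of this sketch. The source does not define the load balancing loss, so the sketch substitutes a widely used auxiliary loss (number of experts times the sum over experts of dispatch fraction times mean gate probability) purely as a stand-in.

```python
def load_balance_term(dispatch_fraction, mean_gate_prob):
    """Stand-in load balancing loss (assumption, not from the source).

    dispatch_fraction : fraction of samples routed to each expert
    mean_gate_prob    : mean routing weight assigned to each expert
    The value is 1.0 when both are uniform over the experts.
    """
    n = len(dispatch_fraction)
    return n * sum(f * p for f, p in zip(dispatch_fraction, mean_gate_prob))

def total_objective(task_loss, route_loss, balance_loss,
                    lam_route=0.01, lam_balance=0.01):
    # Weighted sum of the task loss, routing loss, and load balancing loss;
    # the two coefficients are hypothetical defaults.
    return task_loss + lam_route * route_loss + lam_balance * balance_loss
```

Keeping the two weight coefficients small would let the task loss dominate while the auxiliary terms discourage routing collapse onto a few experts.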
8. An intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, the device being used to implement the intelligent question answering method combining hybrid experts with weight decomposition low-rank adaptation as described in any one of claims 1 to 7, characterized in that the device comprises: a receiving module, used to receive a question input by a user; a generation module, used to generate an answer to the question using a trained multi-task question answering model, wherein the training process of the multi-task question answering model comprises: obtaining historical question-and-answer text data, adding task identifiers and supervision labels to the historical question-and-answer text data, and performing preprocessing to obtain a sample dataset; obtaining, by the multi-task question answering model, expert output results based on the samples, wherein the multi-task question answering model comprises a shared backbone network module, an expert allocation module, and multiple expert sub-modules; introducing a weight decomposition low-rank adaptation structure into the shared backbone network module and the multiple expert sub-modules; applying Stiefel manifold constraints and Riemannian optimization to the direction parameters in the weight decomposition low-rank adaptation structure; and training the multi-task question answering model based on the sample dataset to obtain the trained multi-task question answering model; and an output module, used to output the generated answer to the user.
9. An intelligent question answering device combining hybrid experts with weight decomposition low-rank adaptation, characterized in that the device comprises: a processor; and a memory storing computer-readable instructions that, when executed by the processor, implement the method as described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program code that can be invoked by a processor to execute the method as described in any one of claims 1 to 7.